Comment Re:This has been suggested before (Score 1) 120
I checked the supplied links. The point of the
Cilibrasi-Vitanyi method is not that it uses Google
page counts (like the supplied links many different
approaches have done so). The point is that a
particular distance (formula) based on an extensive
mathematical theory and a sequence of research papers spanning over a decade has been developed.
Experimental testing shows that it always works,
in diffrent settings, in a relatively precise way.
For example, in a massive randomized experiment
the mean agreement with the decade-effort WordNet database at Princeton University is 87.5% with
a standard deviation of 0.11. This means that
the automatic method is remarkably precise and almost equals the knowledge put in by experts with a PhD.
The comment above is like saying that "internet is
nothing new because the idea of connecting computers
has been suggested before". That depends on what you call "new", but the devel is in the underpinning theory and the experimental validation. If the approach of Dr E. Garcia, Mi Islita.com is better in practice, so be it. But the statement I react to says nothing of the sort.