More on Statistical Language Translation
DrLudicrous writes "The NYTimes is running an article about how statistical language translation schemes have come of age. Rather than compiling an extensive list of words and their literal translations via bilingual human programmers, statistical translation works by comparing texts in both English and another language and 'learning' the other language via statistical methods applied to units called 'N-grams': e.g., if 'hombre alto' means tall man and 'hombre grande' means big man, then hombre=man, alto=tall, and grande=big." See our previous story for more info.
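To make the idea concrete, here is a toy sketch of the co-occurrence reasoning described in the summary, using a hypothetical two-sentence parallel corpus and a simple Dice score. It is illustrative only; the systems in the article estimate alignment probabilities with much more sophisticated statistical models over millions of sentence pairs.

```python
# Toy illustration of the co-occurrence idea in the summary: a hypothetical
# two-sentence parallel corpus, scored with the Dice coefficient.  Real
# statistical MT systems estimate alignment probabilities (e.g. with EM)
# over huge parallel corpora rather than raw counts like this.
from collections import defaultdict

parallel_corpus = [
    ("hombre alto", "tall man"),
    ("hombre grande", "big man"),
]

cooccur = defaultdict(int)    # (source word, target word) -> joint count
src_count = defaultdict(int)  # source word -> count
tgt_count = defaultdict(int)  # target word -> count

for src, tgt in parallel_corpus:
    src_words, tgt_words = src.split(), tgt.split()
    for s in src_words:
        src_count[s] += 1
        for t in tgt_words:
            cooccur[(s, t)] += 1
    for t in tgt_words:
        tgt_count[t] += 1

def dice(s, t):
    """Association strength between a source word and a target word."""
    return 2 * cooccur[(s, t)] / (src_count[s] + tgt_count[t])

for s in src_count:
    best = max(tgt_count, key=lambda t: dice(s, t))
    print(f"{s} -> {best}")
# Prints: hombre -> man, alto -> tall, grande -> big
```

Even this crude scoring recovers hombre=man, alto=tall, grande=big from the two example phrases, which is the intuition behind the summary.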
Not just matching phrases (Score:5, Interesting)
However, this requires a stage where the sample texts are used to extract grammatical information on the second language. Of course, it helps a lot if you are familiar with one of the two languages.
Same words, different meanings (Score:5, Interesting)
drunk?
angry?
urinated?
Re:Same words, different meanings (Score:1)
However, if this sentence appears in some context, and the sample texts are extensive enough to include the idiom "get pissed" in a similar context, it may be enough to let the translator prefer one translation over the other.
If this project got this far I would be impressed.
Re:Same words, different meanings (Score:1)
Re:Same words, different meanings (Score:5, Funny)
The US Gov't was funding an early computer group to translate documents from Russian-to-English and back. The hope, obviously, was to eliminate the need for human translators. A particular sentence was fed to the computer, which translated it into Russian. The computer was then fed the Russian, and it translated it back to English.
The original sentence was "The spirit is strong, but the flesh is weak".
The resulting sentence? "The vodka is good, but the meat is rotten".
The computer didn't know which of the many possible words to use when translating spirit, so it used "vodka". Likewise, it tried to put the word "strong" into context, and since strong vodka is prized in Russia, it decided that the vodka was good. Likewise, flesh got translated to meat, and weak flesh became bad meat.
Re:Same words, different meanings (Score:2, Funny)
example:
Original English Text:
I am a lame anonymous coward
Translated to French:
Je suis un lâche anonyme boiteux
Translated back to English:
I am a lame anonymous coward
Translated to German:
Ich bin ein lahmer anonymer Feigling
Translated back to English:
I am a lame anonymous coward
Translated to Italian:
Sono un vigliacco anonimo zoppo
Translated back to English:
They are vigliacco an a
Re:Same words, different meanings (Score:2)
Re:Same words, different meanings (Score:2, Insightful)
For example, one country's "Weapons of Mass Destruction" is another country's "Strategic Deterrent". Both phrases mean the same thing, but the tone is very different. Same thing with "terrorists" and "freedom fighters". You can use either phrase to describe the same people and imply very different meanings.
It will be a long time before an automated system will be able to make an acceptable translation.
Two more classic machine mistranslations (Score:4, Interesting)
An engineer was confused when a translated spec included water goats. "Water goats"?! Hydraulic rams, actually.
And perhaps most famous of all, "out of sight, out of mind" supposedly came back as "blind idiot".
Language is a curious thing. I can't help thinking there's some deeper meaning to the fact that misapplication of it can so easily be funny to us.
Re:Same words, different meanings (Score:2, Interesting)
Does anyone know if for example babel is context/locale sensitive in this sense:
If I write "theatre" or some other word with british spelling, does it then understand that any other words with different meanings in en-US and en
Re:Same words, different meanings (Score:2, Funny)
Of course, in British English... (Score:4, Interesting)
I always said you Yanks couldn't even use your own language properly... [fx: ducks]
Re:Of course, in British English... (Score:2)
To Wit:
In the UK, "get pissed," means "become inebriated."
In the USA "get pissed," does not mean "become inebriated." In fact, only people familiar with UK culture and slang know that it does mean that on the other side of the Pond.
In the USA, "get pissed," is a commonly used shorthand for "get pissed off," as in, "I really got pissed when when they told me I had to work late."
So, yes, the original model sentence is ambiguous.
Re:Of course, in British English... (Score:2, Insightful)
Not "I urinated", but "I got urinated" - how could it tell?
Also I sometimes say "I'm pissed" (no 'off') when I'm angry, and I'm British. Although as I just pointed out, that could mean "I'm urinated"
Re: Of course, in British English... (Score:2)
To be precise (goodness knows why...), just as you'd never say "I got urinated" without a qualifier, such as "I got urinated on", the same applies to 'pissed' too. You could get pissed on, which would refer unambiguously to urination (literally or metaphorically), but if you just "got pissed", with no qualifier, it would almost certainly refer to inebriation. (Unless you were resorting to US slang -- but IME that usage is still very rare here.)
Re: Of course, in British English... (Score:2)
1) get drunk (LITERALLY)
2) go through the digestive system
3) get pissed (LITERALLY)
Like I say - very few people would mean it that way, but seeing as the most common use of "drunk" is the past tense of drink (i.e., to drink a liquid), the computer would learn that meaning and take it literally, even when applied to a person:
a) The lemonade got drunk.
b) My friend got drunk.
Grammatically speaking, what's the difference?
Re: Of course, in British English... (Score:2)
You're still being too sensible
Story of my life. :)
Oh, and ObQuote:
Re: Of course, in British English... (Score:3, Interesting)
Grammatically, there is none. However, a statistical translation system could cope with this. If it had two matched texts:
"The liquid was pissed some time later" translated into Language X as "The liquid was urinated some time later"
"John was pissed some time later" translated to Language X as "John was inebriated some time later"
It would assimilate this into its linguistic map as something like:
pissed =
Damn, it ate my < and > (Score:2)
<person> pissed = <person> inebriated
liquid pissed = liquid urinated
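As a purely illustrative sketch, a hand-written version of that context-conditioned mapping might look like the following. The word classes and lookup tables here are invented; a real statistical system would learn them from the matched texts rather than from hard-coded dictionaries.

```python
# Purely illustrative: a hand-written stand-in for the context-conditioned
# mapping described above.  A statistical system would learn the word
# classes and the mapping from parallel texts instead of using these
# invented lookup tables.
subject_class = {"john": "person", "he": "person", "she": "person",
                 "liquid": "substance", "beer": "substance"}

translation_of_pissed = {"person": "inebriated", "substance": "urinated"}

def translate_pissed(sentence):
    words = [w.strip(".,").lower() for w in sentence.split()]
    # Use the first word we recognise as the subject; default to "person".
    cls = next((subject_class[w] for w in words if w in subject_class), "person")
    return sentence.replace("pissed", translation_of_pissed[cls])

print(translate_pissed("John was pissed some time later"))
print(translate_pissed("The liquid was pissed some time later"))
```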
Re: No, British English is whacked (Score:3)
Actually, we do sometimes use 'fries', to distinguish them from 'chips' which are usually more than three millimetres thick and have actually been near a potato! We also use both 'cookie' and 'biscuit'; the former for larger, thicker things, often with chocolate drops, nuts, or whatever. What do you mean by 'biscuit'?
And I've no idea what 'podger' is - I've never heard it, and neither dictionaries nor Google can come up with anything more relevant than its use
Re: No, British English is whacked (Score:2)
(Just don't ask how to pronounce them...)
Re:Same words, different meanings (Score:3, Interesting)
This was proved impossible about fifty years ago (Score:2, Interesting)
However, this method does not work, as the silly examples elsewhere in the discussion show. You can only understand or translate if you "know" what is meant.
There is no way of figuring it out. There isn't enough information supplied in the texts themselves. You have to be born with the inherent ability to understand.
This approach is limited (Score:3, Insightful)
Re:This approach is limited (Score:2)
Translator (Score:3, Informative)
Can anyone try this on the new (or some other recent) algorithm?
BTW here's Doc Och's most recent website:
Franz Josef Och [isi.edu]
--
Esteem isn't a zero sum game
Re:Translator (Score:3, Insightful)
For statistical translations to work, you would need a substantial set of data, already translated, from which you could do the comparisons and create your database of phrases and words.
In the example you've given you would need to have pre-populated this database in advance for the statistical engine to understand how to do the translation.
What you've got to do is stop thinking that this is actually performing a translation... i
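For illustration, a toy version of that "database of phrases and words" might be built like this. The corpus and the relative-frequency scoring below are invented; real phrase tables are extracted from word alignments over very large parallel corpora.

```python
# Minimal sketch of the "database of phrases" idea: count which target
# phrases (here, unigrams and bigrams) co-occur with which source phrases
# across an already-translated corpus, and keep relative frequencies.
from collections import defaultdict

def phrases(sentence, max_len=2):
    words = sentence.split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]

corpus = [
    ("hombre alto", "tall man"),
    ("hombre grande", "big man"),
    ("el hombre alto", "the tall man"),
]

counts = defaultdict(lambda: defaultdict(int))
for src, tgt in corpus:
    for sp in phrases(src):
        for tp in phrases(tgt):
            counts[sp][tp] += 1

# Turn raw co-occurrence counts into a crude translation probability table.
phrase_table = {
    sp: {tp: c / sum(tps.values()) for tp, c in tps.items()}
    for sp, tps in counts.items()
}

# Most likely target phrases for the source phrase "hombre alto".
print(sorted(phrase_table["hombre alto"].items(), key=lambda kv: -kv[1])[:3])
```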
Re:We used to do this for fun at my last job (Score:1, Funny)
Babelfish [ altavista.com ] with something type and translate of English > to German > French > English. If you are creative, you receive indeed some-of the merriest translations. If you can use words of jargon, all in general loses general context in the translation. Consequently, to pay free attention spoken foreign about films and hearing French and to read English subtitles so much outside. Simple to make against is not to You directly with English translateab
IBM research 10 years ago (Score:5, Interesting)
Speaking of which -- speech recognition, AI, translation learning algorithms -- sounds like we have the seeds for the Universal Translator.
Re:IBM research 10 years ago (Score:3, Interesting)
With exceptions in tons of languages, is this even feasible in the near future? Sure, we can understand a poorly translated sentence, but can it translate it so that we don't have to?
Re:IBM research 10 years ago (Score:2)
I completely disagree with this. People thought they would need artificial intelligence to beat a chess grandmaster.
No Universal Translator any time soon (Score:3, Insightful)
Re:No Universal Translator any time soon (Score:2)
Re:IBM research 10 years ago (Score:5, Informative)
http://www-2.cs.cmu.edu/~aberger/mt.html
Re: Good quote (Score:1)
Re:IBM research 10 years ago (Score:2)
Y'all.
but in many other languages there are such words, so when the source text says "You should vote for the Republicans."
as in "Y'all should vote for the Republicans." (Errmm...no, you actually shouldn't just randomly vote
So statistically... (Score:5, Funny)
France = "Cheese Eating Surrender Monkey"
George Bush = "Neo-Imperialist Moron"
Tony Blair = "Lap Dog"
WMD = "No where to be found"
and of course
Dossier = Creative Story Telling
Re:So statistically... (Score:5, Insightful)
If this happens, I suspect this technology will be illegal...
Re:So statistically... (Score:2, Funny)
Not illegal, just when you try to run it in Windows it will mysteriously crash. Microsoft won't want there to be a program that will translate their EULAs into "w3 0wnz0r j00 50ul!!!!!111"
I'm still holding out for one that will translate CS-speak into English. God I'm sick of having to translate "3y3 g0t m4d d34gl3 l0lz!!!1"
Re:So statistically... (Score:1)
I like the first two points you made; translating jargon would be extremely useful (though I'm more interested in the translation between different languages).
But how would it translate an article from one political bias to another? If you change the political bias, you change the underlying tone and meaning of the article.
Re:So statistically... (Score:2)
If you have an article which contains actual information, this would, of course, be impossible. The tone, on the other hand, can be seen as a language, a way of expressing things. Saying 'Coalition forces announced collateral losses' or 'The occupying army killed innocent people' contains the same semantic information. The language is sim
Re:So statistically... (Score:3, Funny)
Re:So statistically... (Score:1)
Machine Translation as a whole does theoretically allow what you suggest, but example-based technologies don't understand the text.
Re:So statistically... (Score:2)
"All Your Base Are Belong to Us!"
Re:So statistically... (Score:2)
Hi George.
Works it does! (Score:5, Funny)
Yoda? (Score:5, Funny)
Yoda, is that you?
Re: Yoda? (Score:1)
> Yoda, is that you?
Shouldn't you ask, "Yoda, that you is?"
Re:Yoda? (Score:2)
Re:Works it does! (Score:2)
Older languages not supported? (Score:5, Interesting)
malo: I had rather be
malo: in an apple tree
malo: than a naughty boy
malo: in adversity
based on four very distinct meanings of malo, in which the word endings put the stem of the word in context, but unfortunately the same word endings are used for different things.
Not that I'm trying to rubbish the work, because I actually think that statistical methods are close to the fuzzy way that we actually try and make out foreign languages. I just wonder what the limits are.
Missed the idea (Score:2, Interesting)
As for inflected (read most) languages, learning to separate a word into its stem and inflections is the first step, even if you have a number of such possible break-ups.
Re:Older languages not supported? (Score:2, Interesting)
Get this idea out of your head. There is no continuum of inflectedness upon which modern languages align to the uninflected.
Re:Older languages not supported? (Score:2, Informative)
Japanese doesn't use inflection for any meaning at all. You can speak Japanese without using any inflection; you would just sound like a robot.
Sometimes it's easier to understand two words that sound similar with inflection, but the way they are written or even spoken is different without any inflection.
Re:Older languages not supported? (Score:2)
Re:Older languages not supported? (Score:2)
Re:Older languages not supported? (Score:2)
Re:Older languages not supported? (Score:2, Informative)
Re:Older languages not supported? (Score:2, Interesting)
(Offtopic, but indulge me.)
For anyone who doesn't know Latin, or for anyone who isn't familiar with inflected languages in general, here's a detailed morphological breakdown of this poem.
First-person, present indicative active form of the irregular verb malle, "to prefer, wish". It takes an infinitive (most likely esse, "to be"), which is often, as here, dropped.
The locative form of malus, -i (feminine noun, "apple tree").
Re:Older languages not supported? (Score:2, Informative)
Why the change and Internationalization (Score:5, Interesting)
Spanish is easy and led me to believe that the article had relatively little weight (it is lightweight and a topical PHB read anyway). I do a lot of data mining in text streams and have found it to be fairly easy work. Getting cursors to play in ideograms/unicode and reversing the data is something I haven't tried yet, and the article barely covers it. When I saw that they were covering language sets that are extremely dissimilar to English, my interest in multi-language applications was piqued again. All of my databases are unicode and I want to learn more about having truly international systems that are automated and then hand-tweaked to avoid the engrish.com [engrish.com] type mistakes. Any help here?
-B
Script vs. language (Score:1)
Engrams? (Score:2)
Wow, these guys are just begging for a lawsuit from you-know-who.
Why not machine language to compiler language (Score:1, Interesting)
who wants to help me build a tower to heaven? (Score:2, Funny)
Now, all we need is to pinpoint Kolob and we'll be set!
Re: who wants to help me build a tower to heaven? (Score:1)
> FINALLY! After all these years of scrambled languages, we can finally get together and plan that tower of Babel!
Vebwe? Kootchka qwim?
the real problems lie in understanding... (Score:5, Interesting)
But that's an old story. Even the translation of complete sentences is fairly feasible in terms of syntactic structure.
Harder to translate are things like discourse markers ("then", "because") because they are highly ambiguous and you would have to understand the text in a way. I have tried to guess these discourse markers with a machine learning model in my thesis [reitter-it-media.de] about rhetorical analysis with support vector machines (shameless self-promotion), and I got around 62 percent accuracy. While that's probably better than or similar to competing approaches, it's still not good enough for a reliable translation.
And that's just one example of the hurdles in the field. The need for understanding of the text has kept the field from succeeding commercially. Machine Translation these days is a good tool for translators, for example in Localization [csis.ul.ie].
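A generic sketch of that kind of classifier setup, assuming scikit-learn is available; the labels, features, and training examples below are invented for illustration and are not the thesis's actual data or feature set.

```python
# Generic sketch of classifying discourse-marker function from local context
# with an SVM.  The relation labels and tiny training set are invented for
# illustration; the linked thesis uses its own features and corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# (context window around the marker, rhetorical relation label)
train = [
    ("he failed the exam because he never studied", "cause"),
    ("because of the rain the match was cancelled", "cause"),
    ("first mix the flour then add the eggs", "sequence"),
    ("she read the report then wrote a summary", "sequence"),
]
texts, labels = zip(*train)

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)

print(clf.predict(["stir the sauce then serve it"]))  # expected: ['sequence']
```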
Re:the real problems lie in understanding... (Score:3, Interesting)
or bank = hardware bus, as in a bank of memory
or banking = betting, as in I'm banking on that...
These statistical language solutions are interesting, in that they can analyze sentence structures and deduce the grammar of a language; however, I would think that they fail at generating the actual definitions of words. You almost need to generate a list of "concepts", then link each concept to a word, by language. Not my field, thank goodness; I wouldn't have the patience.
Re:the real problems lie in understanding... (Score:2)
Re defining: sometimes it's not bad to define a term using several samples of its context. You can use Google for that -- just enter a complicated term and you'll find out how it is used and who uses it.
I do that quite often when I am looking for the correct usage of a word or a phrase in a foreign language...
Re:the real problems lie in understanding... (Score:3, Informative)
You start with a few words that occur with each sense; you now can disambiguate a few example occurrences in the
Re:the real problems lie in understanding... (Score:2, Insightful)
Extralinguistic knowledge (Score:2)
Furthermore, in the end language is only a carrier of meaning and meaning ultimately refers to non-linguistic objects. Therefore, you can't understand language (fully) without understanding reality (at least partially).
And while machine translation is a relatively hard job, there are examples that suggest that even automated insertion of hyphens ultimately needs extralinguistic knowledge!
Re:the real problems lie in understanding... (Score:3, Informative)
KBMT can be done. We demonstrated that pretty definitively. It's labor-intensive. Yes, we DID create concept maps (ontologies) for the domains of human endeavor relati
I'll believe it when I see it (Score:5, Interesting)
There are a number of problems with the model here that point very clearly to the fact that it has the same shortcomings as other machine translation models.
For example, so long as we're working with cognates or 1:1 equivalencies (tall, man, etc.) it's fine. If we go to words for which there is no 1:1 lexical item, what's it do then? Consider especially words that signify complex concepts that are culture-bound. There would be, by definition, no reason for language #2 to have such a concept, if the culture isn't similar. The other problem arises from statistical sampling. Lexical items that are used exceedingly rarely and have no 1:1 or cognate would be unlikely to make the reference database.
Another similar problem arises with novel coinages and idioms. The example of "The spirit is willing..." is rightly cited. Consider the Russian saying "Ni pukha, ni pera," which translates as "Neither down nor feathers" but doesn't mean anything of the sort.
Real machine translation has been the golden fleece of computational linguistics for a long time. I'll believe it when I see it.
Re:I'll believe it when I see it (Score:5, Interesting)
Lee's toy project, SPHINX, won the DARPA competition that year. The highest scoring rule-based system came in fifth. What the linguists "knew" was wrong.
The example you gave is another example of the linguists not knowing as much about statistics as they think. The corpora used for statistical translation include examples of idiomatic usages. Idiomatic usage is highly stereotypical, so the Viterbi path through an N-gram analysis captures such highly linked phrases with high accuracy.
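To illustrate the Viterbi point: given per-word translation candidates and a (made-up) bigram model that has seen the idiomatic phrasing, the highest-scoring path picks the idiomatic reading. This is only a sketch; real decoders work over phrase tables and far larger language models.

```python
# Sketch of Viterbi decoding over an n-gram model: given per-word
# translation candidates, pick the target sequence with the best bigram
# score.  The candidate sets and bigram log-probabilities are invented for
# illustration.
# Candidate translations for each source position (toy lattice).
candidates = [["the"], ["spirit", "vodka"], ["is"], ["willing", "good"]]

# Hypothetical bigram log-probabilities; unseen bigrams get a harsh penalty.
bigram_logp = {
    ("<s>", "the"): -0.1, ("the", "spirit"): -1.0, ("the", "vodka"): -3.0,
    ("spirit", "is"): -0.5, ("vodka", "is"): -0.5,
    ("is", "willing"): -1.0, ("is", "good"): -1.5,
}
UNSEEN = -10.0

def viterbi(candidates):
    # paths maps each possible previous word to (best score, best sequence)
    paths = {"<s>": (0.0, [])}
    for options in candidates:
        new_paths = {}
        for word in options:
            score, prev = max(
                (s + bigram_logp.get((p, word), UNSEEN), seq)
                for p, (s, seq) in paths.items()
            )
            new_paths[word] = (score, prev + [word])
        paths = new_paths
    return max(paths.values())[1]

print(" ".join(viterbi(candidates)))  # the spirit is willing
```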
Grammatical Differences (Score:5, Interesting)
Although these methods work better than literal word-for-word translation, they're still not going to be perfect without some sort of human intervention. Dutch, for instance, has a completely different sentence structure than does English. For instance, the sentence "The cow is going to jump over the moon." becomes "De koe gaat over de maan springen" or, literally, "The cow goes over the moon to jump".
Don't laugh at this structure or its less-than-obvious usefulness. I've had discussions with people regarding the grammatical structure of a language and the society around it. Indeed, a specific example I have comes from a TV show, "Kop Spijkers", which is a show focused mainly on poking fun at political activity and news events. At times, they have people dressed as popular media and political figures and have comical debates.
In one show, a person acting as Peter R. de Vries (roughly the Dutch equivalent of William Shatner on America's Most Wanted) stated the following joke (JS stands for Jack Spijkerman, the host of the program):
PRdV:
Translated into English, we would not find the humor in this transaction:
PRdV:
Sure you can crack a smile about it, but it's much funnier when the punchline comes at a climax. And in English, it is not possible to state "Well, I smoke 2 packs per day... NOT" (without sounding like a retard who's watched too much Wayne's World).
Getting back on topic, I believe there will be major issues with any translation algorithm to come. This is, of course, to be expected; I hope, however, that more advances will soon follow.
I'll be convinced... (Score:2, Insightful)
"Shaka, when the walls fell!"
Re:I'll be convinced... (Score:2)
Do put me out of work. Please! (Score:5, Interesting)
Would a program know how to break up a monster like that?
Or, seriously, I ended up rewriting most of the letter to convey its contents in a tone that hopefully won't insult the recipient because of differing cultural expectations.
Finns often consider politeness a waste of time. Now explain that to a statistical translator program: "Leave out/add in some polite blablablah"?
We won't. (Score:2, Insightful)
In other words, it works best for technical manuals ;).
Knowledge-based approaches have the same limit (Score:2)
Her last big project was automatic translation of (you guessed it) technical manuals.
godot42a is spot on. The English originals of the technical manuals had to be written in a subset of English which restricted the range of grammatical expressions. Tech writers had to run a program to check their work for compliance.
In summary, even if you build a trans
wow (Score:2, Informative)
Wow. You could not provide a more wrong description of what's going on here. I don't know where to start. The statistical methods are explicitly free of meaning. There's no symbol-grounding going on here. Thus the statistical method does not say that hombre = man and alto = tall. All it says is that often when "hombre" showed up in text A, "man" showed up in text B, regardless of what either word means.
I do this stuff for a living... (Score:2, Insightful)
WHoa ho ho.... (Score:2)
Limited value? (Score:3, Interesting)
Raw dictionary work is pretty much the least interesting, most mechanical part of an MT system.
Grammar (source parsing, transformation and target generation) takes a lot more work and careful thinking.
The more accurate you want your MT system to be, the more extra information you want to attach to your dictionary entries (the more the system knows about all the words, the more disambiguation using real-world knowledge it can do.) "I have a ball" vs "I have an idea" translate to some languages quite differently; you need to know that you don't (usually) physically hold "an idea" in your hand. The most common words ("is", "have") are often the worst in this respect.
(I have worked coding an MT system.)
Re:Limited value? (Score:5, Informative)
Strictly separating raw dictionary work and grammar seems rather old-fashioned to me. Of course, it can work to some degree, but there are so many different types of collocational preferences that just providing each lexeme with a 'grammatical category' from a relatively small list and basing the grammar on these grammatical categories is hardly enough.
It is true that automatic systems' lack of world knowledge is a big problem, but the examples you provide aren't really a good demonstration of this fact. As you write, 'have' is translated differently into some languages depending on whether the object is abstract. So, given a translation system that recognizes the verb and its object and a bilingual parallel corpus, a statistical system can find out about that.
I heard of people who write dictionaries that can be used for automatic processing; for every lexeme they need between half an hour and an hour (consulting dictionaries and corpora, checking whether the application of rules gives correct sentences). This can only work if the aim of the MT system is either only a very limited domain (e.g. weather forecasts, for which there are working rule-based translation systems) or very low quality. It could never be affordable to have trained people provide all relevant characteristics for the millions of words that would be needed for a good MT system with wide coverage.
Differentiating between concrete and abstract entities is something that seems quite natural to us, but there are many other relevant characteristics of lexical items that don't come to linguists' minds so easily; statistical analyses can be better at discovering them.
N-grams? (Score:2)
It's a CoS [demon.co.uk] trick to enslave us all!
unfortunately doomed (Score:5, Interesting)
'As a punishment, he was given a longer sentence'. Obviously, we're talking prison, right? Well, what if the preceding sentence was:
'The teacher had grown weary of his poor attempts at translation'?
A statistical system, even working with the entire phrase, won't be able to figure out which meaning of the word 'sentence' is intended there.
how about:
'The box was heavy. We had to put it down'
'The dog was ill. We had to put it down'
You need semantic understanding to be able to perform translation.
Re:unfortunately doomed (Score:4, Insightful)
Artificial neural nets are one way to do this, but statistical methods are more or less analogous and have the advantage of being highly optimizable. Personally I don't understand the details, but Very Smart Mathematicians have found ways to optimize models like Singular Value Decompositions (SVDs) [davidson.edu] so that they can be calculated orders of magnitude faster than models that cannot be expressed as formally in mathematical terms.
The bottom line is that statistical methods are probably the way that we will end up producing brain-like behavior on computers, and the fact that there are promising results already is heartening. Yes, for truly intelligent behavior a lot of domain knowledge will also be needed, as you point out. But I don't see any reason why the extraction and mapping of this knowledge couldn't also be achieved with large training corpora and statistical methods, rather than hand-crafting.
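As a rough illustration of the SVD idea, in the spirit of latent semantic analysis and with invented toy "documents", reduced-rank word vectors already group words by the contexts they share:

```python
# Sketch of the SVD idea mentioned above: factor a small word-by-context
# count matrix and compare words in the reduced space.  The toy "documents"
# are invented; real systems use large corpora and sparse solvers.
import numpy as np

docs = [
    "judge gave the prisoner a long sentence in jail",
    "the prisoner served his sentence in jail",
    "the teacher corrected the grammar of the sentence",
    "the student wrote a sentence with poor grammar",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix.
M = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        M[index[w], j] += 1

U, S, Vt = np.linalg.svd(M, full_matrices=False)
word_vecs = U[:, :2] * S[:2]          # rank-2 word representations

def cos(a, b):
    va, vb = word_vecs[index[a]], word_vecs[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# On this toy data, "prisoner" should land much closer to "jail"
# than to "grammar" in the reduced space.
print(cos("prisoner", "jail"), cos("prisoner", "grammar"))
```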
Re:unfortunately doomed (Score:4, Insightful)
Clearly, it would need to learn from a tremendous amount of input data before it could begin to approach the experience of a human, and hence make guesses of similar quality to a human translator. However, the amount of available source material is increasing so rapidly that it may be possible for a translator to get pretty darn smart this way.
William Orbit (Score:2)
Arabic Grammar Nazi (Score:5, Informative)
Not to be overly anal (but hopefully to raise an important point): "rajl kabir" actually means "old man," not "big man." The Arabs will definitely laugh at you if you mix these up. You'd use the word "tawil" for a tall or generally large man. The word "sameen" refers to a fat or husky guy. In a different context (referring to an inanimate object), "kabir" does in fact mean big.
I wonder how good these statistical systems really are at learning the various grammatical nuances of a language like Arabic. For example, in Arabic, non-human plurals behave like feminine singulars, whereas human plurals behave like plurals.
It's really incredibly cool that these machines can learn language mechanics and definitions on their own. But as previous posters have already noted, the machine still has to know the meanings of words in order to do a good translation.
For example, to translate "big box" and "big man" into Arabic, you'd actually use different words for big, since the box is inanimate, but the man is animate.
Re:Arabic Grammar Nazi (Score:3, Informative)
For example, to translate "big box" and "big man" into Arabic, you'd actually use different words for big, since the box is inanimate, but the man is animate.
I think that one of the major points of the statistical technique is to deal with precisely this sort of thing.
It doesn't have to know the "meaning" of words like "box" or "man," it just has to have seen them in a part
A paper on this (Score:4, Informative)
You can find the paper here (PDF) [metlin.org] and the presentation here [metlin.org].
at least you got the n-gram definition right (Score:2, Interesting)
[for e][or ex][r exa][ exam][examp] and so on.
Using n-grams this way helps with things like misspellings. Mr. Metlin (parent of this) used the character definition in his paper. N-grams are widely used in Information Retrieval Research [umbc.edu]
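A minimal sketch of that character-n-gram matching, using the same 5-gram definition as the example above:

```python
# Small sketch of character-n-gram matching: overlapping 5-gram sets still
# flag a misspelling as a near match, while unrelated strings share nothing.
def char_ngrams(text, n=5):
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def dice(a, b, n=5):
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

print(dice("for example", "for exmaple"))  # misspelling: still shares several 5-grams
print(dice("for example", "orthogonal"))   # unrelated string: no shared 5-grams
```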
EGYPT translation toolkit is GPL'ed. (Score:3, Informative)
I can imagine some distributions of this translation system that take this code - with improvements - and precook large corpora to create translators. Anyone want to write the Mozilla and OpenOffice plug-ins for the new menu item "Edit/Translate Language"?
Language Applicability (Score:3, Interesting)
That's not really the case. Klingon was created through conscious effort and hasn't evolved many (any?) warts over time. Its structure is akin to well-understood human languages.
Now take Turkish, which has concatenative grammar. Adjectives are applied by tacking suffixes on to the word, sometimes changing spelling of previous chunks. Thus, a 20-word English phrase may correspond to a single Turkish word and extremely long words may be reasonably assumed to be unique. Statistical techniques can work with Turkish, but it requires some work up front to extract tokens. \b\B+\b doesn't help much. German (and, I think, Greek) are like this to a lesser extent.
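As a sketch of the up-front token extraction described above, one could peel suffixes off greedily with a hand-picked suffix list; the list below is invented and incomplete, and real systems learn segmentations from data rather than relying on a fixed inventory.

```python
# Illustrative only: peel known suffixes off the end of a Turkish word so a
# statistical system sees reusable sub-word tokens instead of one rare word.
# The suffix list is a tiny hand-picked sample, not a real morphology.
SUFFIXES = ["den", "dan", "imiz", "ımız", "ler", "lar"]

def segment(word):
    pieces = []
    while True:
        for suf in SUFFIXES:
            if word.endswith(suf) and len(word) > len(suf) + 1:
                pieces.insert(0, suf)
                word = word[: -len(suf)]
                break
        else:
            break
    return [word] + pieces

print(segment("evlerimizden"))  # ['ev', 'ler', 'imiz', 'den'] ~ "from our houses"
```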
Statistical approaches are often quite effective in language processing, much to the surprise and disheartening of linguists. They're far from perfect, but often the best thing so far.
hard sentence (Score:2)
That will be a fun one to give a translation program. (Or a speech recognition program, for that matter).
Re:this doesn't work well (Score:4, Interesting)
Fascinating stuff for sure, but hardly new unless they have come up with some new development. I haven't read the article.
Re:this system is only as good as the.... (Score:2)
Makes me think the explosion of interest in this type of MT has less to do with its advantages over other systems than with the limitations of current dictionaries. Who wants to spend years defining 300,000 words and phrases just on the off-chance it might be useful, right?