Comment No (Score 3, Insightful) 147
For computational linguistics (translation, analysis, etc), machine learning is not a net gain. What ML proponents forget to factor in is the vast time spent on gathering and hand-annotating large quantities of text (gold corpora).
Even worse, for many many languages, these gold corpora simply do not exist and there are no plans on making them, or they are too small to be used for ML.
And even when the gold corpora do exist, models trained on them become tightly coupled with the data. They become domain specific. In order to escape domains, you need an order of magnitude more data.
Instead, one can make a domain-independent rule-based system in a fraction of the total time spent on machine-learning models. But rule-based has become this weird anathema - people will even write papers that use rule-based methods, while hiding it behind machine-learning terms.
I'm sure this also holds for other fields.