Therefore the only task for those who write essay-grading software is to ensure that the variation of the machine is no worse than the variation among human graders. There has been some success in this. edX has a module that will grade essays; as far as I know, the value is quicker and more uniform feedback on practice essays.
Well, I'm a humanities guy and I know enough about the scientific method to understand that you don't know whether you have "success" until you test your bright idea in the real world and find out whether it actually works. And that's what MIT professor Les Perelman said in the article you're citing:
“My first and greatest objection to the research is that they did not have any valid statistical test comparing the software directly to human graders,” said Perelman, a retired director of writing and a current researcher at MIT.
As Perelman said, some computer science students wrote a program that churns out gibberish which the main robo-grading program consistently scores above the 90th percentile.
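If you're wondering how that works, it isn't deep. Here's a toy sketch in Python (my own illustration, not the students' actual program; the word banks and templates here are guesses at the general approach): stitch impressive-sounding vocabulary into grammatical sentence templates, and every surface feature the grader rewards is present while the meaning is absent.

    import random
    import re

    # Hypothetical word banks -- a real generator would presumably use
    # much larger lists of SAT-style vocabulary.
    BANKS = {
        "noun": ["paradigm", "epistemology", "dichotomy", "postulate", "assertion"],
        "adj": ["quintessential", "multifaceted", "egregious", "veritable"],
        "verb": ["elucidates", "substantiates", "repudiates", "exemplifies"],
    }

    TEMPLATES = [
        "The {adj} {noun} {verb} the {noun} of our {adj} society.",
        "Any {adj} scholar concedes that the {noun} inexorably {verb} the {noun}.",
    ]

    def fill(template):
        # Replace each {slot} with a random word from the matching bank.
        return re.sub(r"\{(\w+)\}", lambda m: random.choice(BANKS[m.group(1)]), template)

    def gibberish_essay(sentences=8):
        # Grammatical, polysyllabic, and utterly meaningless.
        return " ".join(fill(random.choice(TEMPLATES)) for _ in range(sentences))

    print(gibberish_essay())

Long sentences, long words, correct syntax: the grader sees everything it knows how to measure.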
The article you're citing was not written by a journalist, but by a retired MIT writing professor.
So you've gotten it wrong on both the science and the reading comprehension. No mod points for you.
This is not to say that computer-graded essays are going to be as good an assessment as human-graded essays. However, they may be good enough, and better than other objective measures, such as fill-in-the-bubble tests. In fact, anything that minimizes the cost of open-ended free-response assessment benefits everyone. Securing multiple-guess tests is very expensive, and their value is highly questionable: they tend to overestimate students who have vague, passive knowledge, and underestimate those who can actively apply what they know.
I am deducting another point for bad grammar.
Computer-graded essays can check whether an essay complies with an algorithm, and they can take care of anything you can reduce to an algorithm. The great success of computerized writing was the spell-checker. There is also a grammar-checker, which I never use because it doesn't work well enough for me. And there are algorithms to check the format of literature citations, which are useful.
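Those checks are tractable precisely because format can be verified without understanding. A toy illustration (my own, checking one simplified citation shape; real citation checkers handle dozens of styles and far more edge cases):

    import re

    # One simplified citation shape:
    #   Surname, A. B. (2020). Title of the work. Publisher.
    CITATION_RE = re.compile(
        r"^[A-Z][a-z]+, "        # surname,
        r"(?:[A-Z]\.\s?)+"       # initials
        r"\(\d{4}\)\. "          # (year).
        r".+?\. "                # title.
        r".+\.$"                 # publisher.
    )

    def looks_like_citation(line):
        return bool(CITATION_RE.match(line))

    print(looks_like_citation("Smith, J. (2014). On essay grading. Example Press."))  # True
    print(looks_like_citation("my buddy wrote a thing once"))                         # False

Note what the pattern can't do: it will happily accept a perfectly formatted citation to a book that doesn't exist.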
But (speaking as somebody who writes for a living) the most important features of writing depend on an understanding of the content. Most important: Is it correct? As Perelman says, the robo-graders ignore whether what you say is true (or whether it even makes sense). The next thing I look at: If the author takes a controversial position, does he give both sides of the argument? This is what you may know as Neutral Point of View from Wikipedia (although writers have known about it since the ancient Greeks). Wikipedia actually has a pretty good structure for this.
Let's remember the purpose of writing: a person communicating an idea to somebody else. When I read something, I'm looking for a good idea, clearly communicated. If the algorithm can't identify a good idea (and as Perelman showed, it can't), then it can't tell me whether the writing is any good. Algorithms have surprised me before, but I can't imagine how an algorithm could tell me whether an idea is good.