Since we already train students according to teachers' biases about what makes a 'good' human-graded paper, it seems only fair to publish the biases that will be used to define a 'good' electronically graded paper.
I see two ways electronic grading can fail.
(1) Students submit poor papers that still score highly. If the algorithm is complicated enough that gaming it requires real cleverness, perhaps that's not a bad thing... And if the algorithm is easy to game, everyone will score highly and it will be obvious that the technology wasn't ready and this was a bad idea.
(2) Students submit good papers that score poorly. Resolving this probably requires a public appeal-to-a-human-teacher process. If a large number of papers are appealed and found to be of genuine quality, it will be obvious that the technology wasn't ready and this was a bad idea.
If, after the trial, the rate of overturned appeals is low and the distribution of scores looks reasonable, then mankind will have found a way to automate another (I believe) tedious task and free up human capital and resources for more challenging and valuable pursuits, which sounds like a big win. Seems like we ought to try it and learn something.
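To make that post-trial check concrete, here is a minimal sketch of how the two failure modes above could be turned into a simple pass/fail test. Everything here is hypothetical: the function name, the thresholds, and the sample data are placeholders for whatever a real trial would agree on in advance, not part of any actual grading system.

```python
def trial_looks_ready(scores, appeals_filed, appeals_overturned,
                      max_score=100, overturn_threshold=0.05,
                      gaming_threshold=0.90):
    """Return True if hypothetical trial results suggest the grader is usable.

    scores: machine-assigned scores for all papers in the trial.
    appeals_filed / appeals_overturned: counts from the human-appeal process.
    The thresholds are arbitrary illustrative values.
    """
    # Failure mode (2): many good papers scored poorly, so many appeals succeed.
    overturn_rate = (appeals_overturned / appeals_filed) if appeals_filed else 0.0

    # Failure mode (1): the algorithm is easy to game, so nearly every paper
    # clusters at the top of the scale.
    near_max_fraction = sum(1 for s in scores if s >= 0.95 * max_score) / len(scores)

    return overturn_rate <= overturn_threshold and near_max_fraction <= gaming_threshold


if __name__ == "__main__":
    # Made-up trial data purely for illustration.
    sample_scores = [72, 85, 91, 64, 88, 79, 95, 83, 70, 90]
    print(trial_looks_ready(sample_scores, appeals_filed=40, appeals_overturned=1))
```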