Would each closed-captioned syllable or word need to be manually synchronized with the video first? Or can the training be done without it?
Getting half the words correct, then feeding that into a grammar / context engine should yield very close to 100% accuracy.
But this AI is already using context to some degree. The article gives the example of "Prime Minister", where the AI knows that if it reads the word "Prime" on a speaker's lips, the word "Minister" will probably follow. Also, the AI has been trained in one context alone (news broadcasts), which means that context is already baked into its results. For instance, if the same anchorman were to order his favorite frappuccino at Starbucks, I really don't think the AI would do as good a job.
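To make the "Prime Minister" point concrete, here is a minimal sketch of how a context model can override what the lips alone suggest. Everything in it is an assumption for illustration: the candidate words, the probabilities, and the names `visual_candidates`, `bigram`, `rescore`, and `greedy` are made up, and this is not how the system in the article is actually built. It just combines a per-word visual score with a toy bigram (word-pair) probability and picks the best overall sequence.

```python
import math

# Hypothetical toy numbers, invented for illustration only.
# Per-position guesses from a pretend visual model: word -> probability.
visual_candidates = [
    {"prime": 0.6, "brine": 0.4},
    {"minister": 0.4, "mister": 0.6},
]

# Toy bigram probabilities P(next | previous); unseen pairs get a small floor.
bigram = {
    ("prime", "minister"): 0.9,
    ("prime", "mister"): 0.01,
    ("brine", "minister"): 0.01,
    ("brine", "mister"): 0.01,
}
FLOOR = 1e-4


def greedy(candidates):
    """Baseline: take the visually most likely word at each position."""
    return [max(position, key=position.get) for position in candidates]


def rescore(candidates, bigram, floor=FLOOR):
    """Viterbi search for the sequence maximizing visual score times bigram score."""
    # best[word] = (log score of the best path ending in word, that path)
    best = {w: (math.log(p), [w]) for w, p in candidates[0].items()}
    for position in candidates[1:]:
        new_best = {}
        for word, p_visual in position.items():
            options = []
            for prev, (score, path) in best.items():
                p_context = bigram.get((prev, word), floor)
                total = score + math.log(p_context) + math.log(p_visual)
                options.append((total, path + [word]))
            new_best[word] = max(options, key=lambda o: o[0])
        best = new_best
    return max(best.values(), key=lambda o: o[0])[1]


print(greedy(visual_candidates))           # lips only
print(rescore(visual_candidates, bigram))  # lips plus context
```

With these made-up numbers, the lips-only baseline prefers "mister" for the second word, while the context-aware version settles on "prime minister". The flip side is the Starbucks problem: a bigram table built from news broadcasts would have nothing useful to say about a coffee order.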
Also, they say they used thousands of hours of video, but it could be that they trained the AI on just three or four news anchors, which would make the job easier for the AI. After all, I would expect a professional lip reader to do a lot better reading the lips of his own family members, simply because he is so accustomed to their style of talking.
And lastly, I always doubt the self-reporting of scientific results to the mainstream press. An AI developer/researcher has every incentive to exaggerate the success rate of his own work. Also, any professional lip reader hired for the comparison probably received compensation for the work and probably signed an NDA with the researcher. So it could be very easy for the researcher to claim whatever he wanted, and no one would be there to contradict his story.
After all, we're talking about big money here for the right sleight of hand (whether it's exaggerating, lying, or doing something else completely unethical). For instance, the guy who sold his self-driving car company to Uber got 600 million dollars for it only four months after starting it. Can you imagine 600 million dollars after only four months? Many people, including researchers, would be willing to lie, cheat, or even kill for a tiny fraction of that amount, and some others would even be willing to do it for free for the ego boost alone.