Re:He's wrong (Score 3, Insightful)
At the scale they seem to need, it is not clear that hardware/software efficiency gains are the whole solution. Don't get me wrong, we should pursue them. But our current ML stacks already use the hardware decently well; we're at the point where I don't expect we can gain much more than a factor of 100.
And based on the kind of usage they believe we'll see, we more likely need energy-efficiency gains on the order of a factor of a million.
And training is not irrelevant. If you look only at the cost of training the final model that gets deployed, yes, that's pretty negligible. But for an honest metric: to arrive at that one model, you typically tried thousands, maybe millions, of hyperparameter variants. And that definitely takes time and power.
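A quick back-of-envelope sketch of that amortization argument (all numbers here are made-up illustrative assumptions, not measurements from any real deployment):

```python
# Sketch: the deployed model's final training run vs. the full
# hyperparameter search that produced it. All values are assumptions.

final_run_kwh = 1_000_000       # energy for the one final training run (assumed)
variants_tried = 5_000          # hyperparameter variants tried first (assumed)
avg_variant_fraction = 0.2      # most variants get aborted early (assumed)

# Full training bill = final run + all the partial runs before it
full_training_kwh = final_run_kwh * (1 + variants_tried * avg_variant_fraction)

inference_kwh = 0.001           # energy per inference request (assumed)
requests = 1_000_000_000        # lifetime requests the model serves (assumed)

print(f"final run only:  {final_run_kwh:.2e} kWh")
print(f"full search:     {full_training_kwh:.2e} kWh")
print(f"inference total: {inference_kwh * requests:.2e} kWh")
```

With these toy numbers the search multiplies the training bill by roughly 1000x, which is the point: the "negligible" final-run figure badly understates what training actually cost.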
That said, I agree that eventually, to be commercially viable, the cost of running inference will have to be much smaller than the (full) cost of training.