What large commodity clusters provide is a price per cycle low enough that the owner doesn't have to worry about efficiency. For example, Google's Dean and Ghemawat ("MapReduce: Simplified Data Processing on Large Clusters") sorted 10^10 100-byte records (about 1 TB) in 891 seconds. That is roughly 1.1 GB/s across the whole cluster, but with the approximately 1800 machines the paper reports, it works out to only about 0.6 MB sorted per machine per second. Very fast overall, but hardly an efficient use of modern hardware. There's an important place for the new big-dataset systems, but the argument for them is cost, not efficiency.
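A quick back-of-the-envelope check of that sort throughput, as a rough sketch only. The record count, record size, and elapsed time are the figures quoted above; the machine count of 1800 is an assumption taken from the cluster size reported in the MapReduce paper, and `per_worker_throughput` is just a hypothetical helper name.

```python
def per_worker_throughput(records, record_bytes, seconds, workers):
    """Return bytes sorted per worker per second."""
    total_bytes = records * record_bytes
    return total_bytes / seconds / workers


# Figures quoted in the post: 10^10 records of 100 bytes, sorted in 891 seconds.
RECORDS = 10**10
RECORD_BYTES = 100
SECONDS = 891

# ASSUMPTION: roughly 1800 machines in the cluster (not stated in the post).
ASSUMED_MACHINES = 1800

# Treat the whole cluster as a single worker to get the aggregate rate.
aggregate = per_worker_throughput(RECORDS, RECORD_BYTES, SECONDS, 1)
per_machine = per_worker_throughput(RECORDS, RECORD_BYTES, SECONDS, ASSUMED_MACHINES)

print(f"aggregate:   {aggregate / 1e9:.2f} GB/s")
print(f"per machine: {per_machine / 1e6:.2f} MB/s")
```

Plugging in a different worker count (for instance, counting individual processors rather than machines) only changes the divisor; the aggregate rate stays the same.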