Comment Re:Big Data Need (Score 2, Informative) 78
Mainframes and large multiprocessor machines have been handling multi-billion-row data sets on RDBMS systems for a very long time. Data warehouses commonly run into the billions of rows. What commodity clusters provide is not efficiency--they often make poorer use of available cycles and duplicate work to reach the same result.
What large commodity clusters provide is a price per cycle low enough that the owner doesn't have to worry about efficiency. For example, Google's Dean and Ghemawat ("MapReduce: Simplified Data Processing on Large Clusters") sorted 10^10 100-byte records in 891 seconds on a cluster of roughly 1,800 machines--which works out to well under 1MB sorted per machine per second. Very fast in aggregate, but hardly an efficient use of modern hardware. There's an important place for these new big-dataset systems, but the argument for them is cost, not efficiency.
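The arithmetic is easy to check. A quick sketch, assuming the ~1,800-machine cluster size reported in the MapReduce paper:

```python
# Back-of-the-envelope throughput for the MapReduce sort benchmark
# (Dean & Ghemawat). Cluster size of ~1800 machines is an assumption
# taken from that paper's description of the sort experiment.
records = 10**10
record_bytes = 100
seconds = 891
machines = 1800

total_bytes = records * record_bytes           # 10^12 bytes = 1 TB of input
aggregate = total_bytes / seconds              # bytes sorted per second, cluster-wide
per_machine = aggregate / machines             # bytes per second per machine

print(f"aggregate:   {aggregate / 1e6:.0f} MB/s")    # ~1122 MB/s
print(f"per machine: {per_machine / 1e6:.2f} MB/s")  # ~0.62 MB/s
```

About a gigabyte per second across the whole cluster, but only a fraction of a megabyte per second per machine.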