rockmuelle writes: I work in a 'Big Data' space (genome sequencing) and routinely operate on tera-scale data sets in a high-performance computing environment (high-memory nodes with 64-200GB, 10 GigE/IB networks, peta-scale high-performance storage systems). However, the more people I chat with professionally on the topic, the more I realize everyone has a different definition of what constitutes big data and what the best solutions for working with large data are. If you term yourself a 'big data' user, what do you consider 'big data'? Do you measure data in mega-, giga-, tera-, or petabytes? What is a typical data set you work with? What are the main algorithms you use for analysis? What turn-around times are typical for analyses? What infrastructure software do you use? What system architectures work best for your problem, and which have you tried that don't work well?