rockmuelle writes: I work in a 'Big Data' space (genome sequencing) and routinely operate on tera-scale data sets in a high-performance computing environment: high-memory (64-200 GB) nodes, 10 GigE/IB networks, and peta-scale high-performance storage systems. However, the more people I chat with professionally on the topic, the more I realize that everyone has a different definition of what constitutes big data and what the best solutions are for working with large data. If you consider yourself a 'big data' user, what do you consider 'big data'? Do you measure your data in mega-, giga-, tera-, or peta-bytes? What is a typical data set you work with? What are the main algorithms you use for analysis? What turn-around times are typical for your analyses? What infrastructure software do you use? Which system architectures work best for your problem (and which have you tried that don't work well)?