It does work, but it requires judgement. A lot of people seem to think that you just shove the data into a statistical test, out comes a p-value, and if it's small enough you win. In reality, interpreting and validating the initial hit is where 90% of the real work is, and it requires the careful application of prior knowledge and follow-up experiments. I work with a guy who's probably one of the best statisticians in the world, and he often asks me, "Well, does the result make sense?" His judgement was developed over decades of looking at real data. If you just shove your data into an algorithm and take the top-scoring hits, you'll probably spend most of your time chasing bogus predictions. Algorithms are good for automating specific tasks that are essentially repeatable. Data mining, by contrast, requires an in-depth understanding of the specific problem you're trying to solve; you usually need to tailor your statistics so that they make sense for the problem. That's why selling someone a suite of fancy data mining software on its own is probably useless; you need to sell them the statistician too.
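
To make the "bogus predictions" point concrete, here's a rough sketch (my own toy setup, not anything from a real dataset): 1000 features of pure noise, two groups of 20 samples each, and a simple two-sided z-test per feature. The function name, the sample sizes, and the use of a Bonferroni cutoff for comparison are all assumptions chosen for illustration.

```python
import math
import random


def null_p_values(n_features=1000, n_per_group=20, seed=0):
    """Two-sided z-test p-values for features that are pure noise.

    Both groups are drawn from the same N(0, 1) distribution, so every
    "hit" below is a false positive by construction.
    """
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_features):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = sum(a) / n_per_group - sum(b) / n_per_group
        # Known variance of 1 per sample, so the difference of means
        # has standard error sqrt(2 / n).
        z = diff / math.sqrt(2.0 / n_per_group)
        # Two-sided normal tail probability.
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values


p_values = null_p_values()
naive_hits = sum(p < 0.05 for p in p_values)
# Bonferroni is just one (conservative) correction, used here for contrast.
bonferroni_hits = sum(p < 0.05 / len(p_values) for p in p_values)
print(f"naive hits: {naive_hits}, after Bonferroni: {bonferroni_hits}")
```

With nothing real in the data you'd still expect around 50 features to clear p < 0.05, and the "top-scoring" ones will look quite convincing. A multiple-testing correction trims that list, but it can't tell you whether a surviving hit makes sense; that part is still the statistician's job.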