Submission: Test shows big data text analysis inconsistent, inaccurate (computerworld.com)
DillyTonto writes: The 'state of the art' in big-data analysis of unstructured text turns out to rely on a method of categorizing words and documents that, when tested, gave different results for the same data 20% of the time and was flat wrong another 10% of the time, according to an analysis by researchers at Northwestern. The researchers offered a more accurate method, but only as an example of how community detection algorithms can improve on the leading method, Latent Dirichlet Allocation (LDA). Meanwhile, a certain percentage of answers from all those big data installations will continue to be flat wrong until they're re-run, which will make them wrong in a different way.
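
For anyone wondering how "different results for the same data" might be put into numbers, here is a minimal sketch. It is not the Northwestern researchers' procedure; it assumes each run hands back one topic label per document and scores two runs with the Rand index, i.e. the fraction of document pairs both runs treat the same way (grouped together or kept apart). The function name and the sample label lists are made up for illustration.

```python
from itertools import combinations


def pairwise_agreement(run_a, run_b):
    """Rand index: fraction of document pairs that two runs treat the same way.

    Comparing pairs rather than raw labels means that simply renumbering
    the topics between runs does not count as a disagreement.
    """
    if len(run_a) != len(run_b):
        raise ValueError("both runs must label the same documents")
    pairs = list(combinations(range(len(run_a)), 2))
    if not pairs:
        return 1.0  # zero or one document: nothing to disagree about
    agree = sum(
        (run_a[i] == run_a[j]) == (run_b[i] == run_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical topic assignments for six documents, same data, three runs.
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [2, 2, 0, 0, 1, 1]  # same grouping, topic IDs merely renumbered
run_3 = [0, 0, 1, 2, 2, 2]  # one document landed in a different topic

print(pairwise_agreement(run_1, run_2))
print(pairwise_agreement(run_1, run_3))
```

A score of 1.0 means the two runs grouped the documents identically; anything lower means re-running the analysis on the same data moved some documents around.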