You might be surprised. Check out pseudosci.org. I still get threats about that web site from time to time.
You're right, I am surprised. That is somewhat hilarious.
But I do believe in the old adage of "When all is said and done on the Internet, far more is said than done."
I agree... but it has to be said that the same is true of academia. I was at a conference session just the other month on the subject of text analysis, in which most of the attendees were managers with no relevant background or experience. It is currently the flavour of the month. In two or three years' time they will be after something else, without having solved this one - not that they will admit to this. Academic funding agencies have ADHD, and therefore so does academia.
It is possible that the public at large will not benefit directly from games played with JSTOR, as JSTOR itself is a somewhat specialist resource. Even if the result is just a few people learning a little about available tools, theory, etc., that in itself beats a slap in the teeth with a wet kipper.
My own years of experience have taught me that strong collaborative teams are far, far more likely to do great things than some brilliant lone wolf in seclusion. And if that lone wolf does do something great, he's far more likely to use it to become rich than to donate it for the good of mankind.
My experience has been rather mixed. What works for software development is not always what works for innovative but relatively theoretically routine applications. There is a lot of money in biomedical text mining, so that area attracts big dev. teams. However, there's been something of a time lag between profitable specialised applications of text analysis, which have in some cases attracted a lot of funding, and the idea that text analysis is another tool in the cross-disciplinary toolkit. Text analysis in the humanities is great fun but you can't cure cancer with a well-aimed Socratic dialogue, so in most cases that level of cash just isn't there (a lot of text mining already occurs in the humanities, but there are many more subjects/applications waiting in the wings).
Thanks for the link to the OTMI, by the way. It looks like an interesting concept, but given that it seems to have been abandoned since 2009, I'm not persuaded that a huge demand exists to data-mine journal papers in this manner.
Certainly not with OTMI, which went down like a lead balloon. It effectively shreds the paper and hands you the remnants to play statistics with. Better (slightly) than nothing, but not by much - and with the paywall in the way and no guarantee of long-term interface availability, why waste resources on it when you could play with openly available free stuff instead?