See, what you're saying is both sensible and unsurprising, but here's what bothers me: TFA doesn't acknowledge any of what you are saying. Instead, it suggests this is a novel activity, which seems ridiculous but happens for political reasons.
Without wishing to offend it, the BL is a monolithic organisation that doesn't always play well with others. Part of that is because funding doesn't always work that way. You can get money for claiming that you are going to do the very first über-awesome UK archive, but your chances of receiving the funding become rather lower if in the very first breath you point out that somebody else has been doing pretty much this for a decade. Another part of it is: most politicians would likely want the national heritage, such as it is (jubilee celebration tweets - please...) to be held by that nation's own national library.
I would imagine the BL have referenced archive.org work extensively, but differentiate this project with what tits in suits like to call "a compelling USP." To put it in plain English, they'll have a neat explanation that suggests that they are totally aware of previous work in the domain whilst making sure that this project looks a) different, b) excitingly new and c) contextually, better.
There's some regional variation in PhDs, at least in CS/info science. In my experience, US PhD progs often (but not always) seem to involve a lot of structured learning, like compulsory classes, etc. In the UK and many European countries, PhDs seem to have a slightly higher tendency to appear a lot like a regular job, plus added dissertation. Newly qualified PhDs therefore vary a lot in workplace skills/experience...
Umm, the majority of the UK has not had free tuition in many years. It currently costs £9000 per year to go to university in England.
The record is apparently 216 hours for the Rutan Voyager, that is, nine days.
Okay, if survival times for cloned species scale up linearly with flight endurance records, it still isn't great news for the ibex...
It is often forgotten that this effect goes both ways. It is not restricted to sociology.
Example: The Bogdanov affair.
You need to read this:
I second this. Portal 2 is insanely attractive to non-gamers. That said, it's not that much of a gateway drug in my experience, leading if anything to an interest in puzzle games. It seems easier to go from Portal 2 to Osmos than from Portal 2 to Left 4 Dead... much to my disappointment.
You might be surprised. Check out pseudosci.org. I still get threats about that web site from time to time.
You're right, I am surprised. That is somewhat hilarious.
But I do believe in the old adage of "When all is said and done on the Internet, far more is said than done."
I agree... but it has to be said that the same is true of academia. I was at a conference session just the other month on the subject of text analysis, in which most of the attendees were managers with no relevant background or experience. Text analysis is currently flavour of the month. In two or three years' time they will be after something else, without having solved this one - not that they will admit to this. Academic funding agencies have ADHD, and therefore so does academia.
It is possible that the public at large will not benefit directly from games played with JSTOR, as JSTOR itself is a somewhat specialist resource. Even if the result is just a few people learning a little about available tools, theory etc, that in itself beats a slap in the teeth with a wet kipper.
My own years of experience have taught me strong collaborative teams are far, far more likely to do great things than some brilliant lone wolf in seclusion. And if that lone wolf does do something great, he's far more likely to use it to become rich than donate it for the good of mankind.
My experience has been rather mixed. What works for software development is not always what works for applications that are innovative but, in theoretical terms, relatively routine. There is a lot of money in biomedical text mining, so that area attracts big dev. teams. However, there's been something of a time lag between profitable specialised applications of text analysis, which have in some cases attracted a lot of funding, and the idea that text analysis is another tool in the cross-disciplinary toolkit. Text analysis in the humanities is great fun but you can't cure cancer with a well-aimed Socratic dialogue, so in most cases that level of cash just isn't there (a lot of text mining already occurs in the humanities, but there are many more subjects/applications waiting in the wings).
Thanks for the link to the OTMI, by the way. It looks like an interesting concept, but given that it seems to have been abandoned since 2009, I'm not persuaded that a huge demand exists to data-mine journal papers in this manner.
Certainly not with OTMI, which went down like a lead balloon. It effectively shreds the paper and hands you the remnants to play statistics with. Better (slightly) than nothing, but not by much - and with the paywall in the way and no guarantee of long-term interface availability, why waste resources on it when you could play with openly available free stuff instead?
By the way: the interface I was thinking of is the Open Text Mining Interface (OTMI), abandoned by Nature. Nice idea, kept the publishers relatively happy, but it didn't catch on (see brief critique in comments section).
Pseudoscientific rant? Bless you. What a shame that the time cube guy isn't around to demonstrate to you what pseudoscience really looks like.
Seriously, you react as though text mining needs a supercomputer and years of effort, which simply isn't the case any more. Perhaps a kid with a desktop might need to do some thoughtful triage to reduce his history-of-dinosaur-research project to manageable levels, but there's no real reason why your basic hobbyist can't do interesting stuff with a few lines of code and a few bits of JSTOR. Perhaps none of those people would ever produce 'meaningful' work, perhaps they would - depends on your definition of 'meaningful' - but I've certainly met domain specialists who've done interesting if idiosyncratic stuff on a shoestring with freebie resources before now, so I am just not as ready to write off the hobbyist as it seems you are.
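To give a rough sense of what "a few lines of code" can mean here, this is a minimal sketch in Python using only the standard library. The sample texts and the stopword list are made up for illustration (they are not JSTOR data), and a real project would want better tokenisation than a single regular expression.

```python
import re
from collections import Counter

# Hypothetical stand-ins for a handful of article abstracts (not real JSTOR data).
documents = [
    "The dinosaur fossil was found in the quarry.",
    "A new dinosaur species was described from fossil bones.",
    "The quarry yielded fossil plants and one dinosaur tooth.",
]

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "was", "in", "from", "and", "one", "new"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs):
    """Count how often each non-stopword term occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(t for t in tokenize(doc) if t not in STOPWORDS)
    return counts


counts = term_counts(documents)
top_terms = counts.most_common(3)  # the three most frequent terms
print(top_terms)
```

Nothing here needs more than a desktop; the hard part for the hobbyist is getting at the text in the first place, not processing it.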
I know I'm not going to force JSTOR to open up its database. I wouldn't ever have gone within a mile of it myself; it's commercial, I don't need the hassle. Unless someone hired me with a JSTOR-related project in mind I wouldn't volunteer for it. Equally, 'creating interfaces that enable contextual data mining' has been tried before and was either excessively restrictive, too much hassle or plain expensive. That said, it is asinine to scoff at the idea of permitting the great unwashed to get their hands on old journal data, either on the basis that they haven't the resources to do anything interesting with it or under the assumption that nothing they will do will be 'meaningful'. Even if all they do with the stuff is make gigantic, useless word clouds, I can't see the harm in it. If they do better (and someone would), so much the better.
In the end I don't think Aaron Swartz would've been able to open up JSTOR; he didn't have the influence and neither did his mates. If he'd wanted to make a positive difference he probably shouldn't have messed with JSTOR at all. But that isn't because JSTOR is technically too tough for the 'non-legitimate' researcher to handle; it's because all commercially-sustainable-library crap is invariably a can of worms.
I love the way that you assume that JSTOR is run by librarians. None of these services are run by librarians. They may be staffed by librarians but they're almost inevitably run by a guy/gal in a sharp suit who's very much aware of the potential for profit... sorry. As for one researcher in a thousand, yeah, until the rest of them figure out what this sort of access can do for them, that might be true. But so what? Most researchers don't give a toss about most things - that's the nature of specialism - but it doesn't mean that we should fail to support the ones that do, eh?
I'm not sure what you consider a 'legitimate researcher'. Indeed, I find that a pretty disturbing construction. We live in a world in which any muppet with a copy of NLTK and a lot of time on their hands can do great things with data. Also dumbass shit that doesn't work, but so what? I wouldn't be particularly inclined to consider that muppet any more illegitimate, whatever that might mean, than any other researcher. If he or she has a lot of spare time on his/her hands and/or insatiable curiosity and/or an unusual approach, we shouldn't really be judging him/her on the basis of whether he/she has received sufficient grant funding to be blessed by JSTOR or some guy called timholman as Worthy.
Real research (how judgmental!) does not always take time and effort and manpower and money. Time and effort and manpower and money are usually the things that inadequate people use to compensate for having no bloody imagination and no real vision. Time and effort and manpower and money and, above and beyond all else, privileged access are the tools that the entrenched use to keep those naughty illegitimate researchers away from the blessed ivory tower.
Yeah, Aaron could've done all sorts of things, but he did what he did and I'm not going to judge him for it other than to say that he had far greater vision than I do.
I don't discount the possibility that you have a better understanding of this than you have exhibited in this post, but you come across as though you have no idea about text analysis.
JSTOR indexes these papers and provides a search engine, yes, but that's not all that much use for somebody looking to extract a large body of information very rapidly from a large corpus of data. JSTOR's search engine is fundamentally intended to facilitate a single task - finding papers of relevance to a keyword/keyword set and reading them manually, one at a time. There's nothing wrong with that use case, but you have to realise that sometimes people are looking to solve different problems using different methods, and for them, JSTOR's indexing efforts are practically worthless. For those people, unless someone goes to the effort of opening JSTOR so they can apply their own toolset, JSTOR is essentially useless.
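As an illustration of a "different problem" that a keyword search box doesn't answer, here is a minimal Python sketch that tracks how often a term turns up per decade across a whole corpus. It assumes you already hold the texts locally as (year, text) pairs; the records below are invented for the example, and relative_frequency_by_decade is just a name I made up.

```python
import re
from collections import Counter

# Hypothetical (year, text) records standing in for a locally held corpus.
corpus = [
    (1905, "The fossil reptile was described from a single bone."),
    (1908, "Fossil collecting in the quarry continued."),
    (1972, "Dinosaur metabolism remains debated."),
    (1975, "The dinosaur renaissance changed views of dinosaur behaviour."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relative_frequency_by_decade(records, term):
    """Return {decade: occurrences of term / total tokens in that decade}."""
    hits = Counter()
    totals = Counter()
    for year, text in records:
        decade = (year // 10) * 10
        tokens = tokenize(text)
        totals[decade] += len(tokens)
        hits[decade] += tokens.count(term)
    return {decade: hits[decade] / totals[decade] for decade in sorted(totals)}


trend = relative_frequency_by_decade(corpus, "dinosaur")
print(trend)
```

The point is that this sort of question needs every document at once, which is exactly what a read-one-paper-at-a-time interface is not built to provide.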
Then post something online about your grandma. I'm serious: why not? People may appreciate a chance to contribute their memories.
As regards the Patrick Moore story, you assume people neither met nor knew him, but in this case you may be wrong. I've come across the guy now and then due to my own interest in astronomy, and I'm nothing more than a rank amateur. In his younger and sprightlier days the gentleman in question would've been moderately hard for a keen amateur astronomer not to encounter at some time or another. It is sometimes forgotten that people on TV also exist IRL.
I bought a DX from amazon.com (delivered to the UK). It has the same setup that the international version of the standard Kindle had before Amazon UK started selling those: Whispernet from anywhere. I think that once Kindles went on sale at Amazon UK, owners were given the opportunity to move to the UK service.
I found the DX to be great for reading A4 PDFs, even the ACM-style two-column layouts. For reading novels and so forth it is merely acceptable; comically oversized, really, like the iPad.
You're pretty much right about PDFs, pragmatically at least, although some reflow more easily than others. Authors do have the option to tag PDFs, indicating what can be reflowed and in what order - it's an accessibility feature. However, since very few people have any idea that this option even exists and most PDF creation workflows don't really provide the option, the feature isn't, practically, much of a game changer.