
Comment Re:Information paradox? (Score 3, Interesting) 193

I think information is used in its most abstract sense. Any particle or wave signals that approach the black hole get consumed, i.e. when we look at it, we see nothing because light is absorbed. I'm probably wrong, though, and someone who studies the topic would be better placed to provide an explanation. Personally, I wonder what this means in terms of the second law of thermodynamics. When a black hole consumes energy and releases a Planck star, does either event reduce the entropy of the system?

Comment Re:There's no need for a new bill ... (Score 1) 535

True, but passing a token bill probably isn't the most appropriate solution. The current situation at least makes the inequity visible, i.e. that the FCC has abdicated its authority and that ISPs have unlimited freedom to shape their traffic. At least this opens the door for more activism.

Comment Re:It would be nicer if... (Score 1) 52

There are a few issues with the output of pdftoxml that make it difficult to parse (mostly Adobe's fault). For 2-column articles, the columns are interleaved. That means you'll get a little bit of text from column A followed by a little bit of text from column B. The XML tags contain the x/y coordinates, so you can develop some heuristics to cleave out segments of text for one journal. This is not particularly suitable when you want to analyze text across different journal formats, as you'll have to develop a one-off solution for each journal.
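To make the coordinate heuristic concrete, here is a minimal sketch in Python. It assumes the fragments have already been pulled out of the XML into a simple record; the `Fragment` class, its field names, and the `page_width` parameter are hypothetical and do not reflect the actual pdftoxml schema. It also assumes a clean two-column page split at the midpoint, which is exactly the kind of per-journal assumption that tends to break.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    # Hypothetical record for one piece of text; not the real pdftoxml schema.
    x: float   # left edge of the fragment
    y: float   # top edge, increasing down the page
    text: str


def deinterleave(fragments, page_width):
    """Return fragment texts in reading order for a two-column page.

    Assumes the column boundary sits at the horizontal midpoint of the page.
    """
    midpoint = page_width / 2
    left = [f for f in fragments if f.x < midpoint]
    right = [f for f in fragments if f.x >= midpoint]
    # Read the whole left column top to bottom, then the right column.
    ordered = sorted(left, key=lambda f: f.y) + sorted(right, key=lambda f: f.y)
    return [f.text for f in ordered]


# Interleaved input: a bit of column A, then a bit of column B, and so on.
fragments = [
    Fragment(50, 100, "A1"),
    Fragment(320, 100, "B1"),
    Fragment(50, 115, "A2"),
    Fragment(320, 115, "B2"),
]
print(deinterleave(fragments, page_width=600))
```

Full-width elements such as titles or wide figures captions would be misassigned by the midpoint test, so a real version would need extra rules for them.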

It would also be useful to have clearly demarcated sections for the abstract, results, references, etc. Again, you could set BIO (Begin-Inside-Outside) tags based on the section title and formatting style, but you may run into a few false positives if those words are used elsewhere in the text, and the two-column issue mentioned earlier may dump in text from other sections. Finally, there's little distinction between the body of the manuscript and the header/footer information.
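As a rough illustration of the BIO idea, the sketch below tags each line of already-extracted text. The `SECTION_TITLES` set and the `bio_tag` function are made up for this example. It only treats a line as a section start when the whole line matches a known title, which avoids some of the false positives mentioned above but would miss titles with numbering or unusual formatting.

```python
# Hypothetical set of section titles to look for; a real journal would need more.
SECTION_TITLES = {"abstract", "results", "references"}


def bio_tag(lines):
    """Tag each line as B-<SECTION>, I-<SECTION>, or O (outside any section)."""
    tagged = []
    current = None  # name of the section we are currently inside, if any
    for line in lines:
        title = line.strip().lower()
        if title in SECTION_TITLES:
            # A line consisting only of a known title begins a new section.
            current = title.upper()
            tagged.append((line, "B-" + current))
        elif current is None:
            # Text before the first recognised title, e.g. header information.
            tagged.append((line, "O"))
        else:
            tagged.append((line, "I-" + current))
    return tagged


lines = [
    "Journal of Examples",
    "Abstract",
    "We study the results of text mining.",
    "Results",
    "Parsing was messy.",
]
for line, tag in bio_tag(lines):
    print(tag, line)
```

Note that the third line contains the word "results" but is not tagged as a section start, because only whole-line matches count. Header and footer lines that repeat on every page would still be swept into whichever section is open.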

Overall, the text is a bit messy. If you're just looking for keywords, then it's not a big deal. If you are trying to extract more complicated syntactic structures within the document, then it becomes a problem.

Comment It would be nicer if... (Score 3) 52

... publishers removed the paywall to publicly funded literature, or at least made the prices more sane.

Also, while we're on the topic of text mining, would it be possible to get text-only or XML-based articles, with figures attached and cross-references as needed? It's quite annoying to manually convert a PDF when trying to set up an automated analysis over several documents. I know one could set up a shell script to dump it out using the pdftoxml converter, but the output is a bit messy to parse.

Comment Re:I find it amusing. . . (Score 4, Insightful) 154

You're joking. The PC was an attempt to retain control, quickly churned out by IBM. It was just there to keep down the new micros that were starting to look popular, and the design was never intended to last.

It worked too - IBM retained control over the business market for quite a while, and didn't realise until OS/2 and microchannel that it had actually lost the control it thought it had kept.

Cheers,
Ian

Comment Re:MS Word (Score 1) 154

Word 4.2 (I think it was .2) combined with my Mac LC and a Stylewriter was and remains my favourite word processor setup of all time - it got me through the last two years of university (first year I started with an ST, using First Word Plus). Loved 4.2 - perfect mix of simple but powerful.

5.x brought in envelopes and a bunch of stuff I don't recall and didn't use, but started to get slow. 6.x is where the rot set in for me and I've never really liked any version since, whether PC or Mac.

Cheers,
Ian

Comment Re:Story time! Perspective: (Score 1) 154

I had the same experience the first time I saw a GUI machine - an Atari ST in a shop. Although I'd read magazines (anyone remember Input magazine in the UK?) about graphical interfaces, I hadn't ever actually used one or seen one for real.

My first thought on seeing it was "how do I get out of this and where's the computer?". I was essentially looking to type load "" somewhere and was baffled that I couldn't do it.

Cheers,
Ian

Comment Re:Yeah, but WE'll have the last laugh! (Score -1, Troll) 287

Evangelicals, man. Such hypocrites. I mean, if you're going to be a Christian, they could at least follow the New Testament which clearly states that the Pharisees (Orothodox Judaism) is not to be trusted.

What will happen is the opposite. Israel intends to fake the return of Jesus Christ when they rebuild the Third Temple. They'll parade out their "Messiah," the King of the Jews, and they will also claim it is incarnation of Jesus Christ. In reality, it'll just be a member of the Rothschild family. They will then have their Messiah demand that everyone discard the Christian Bible and adopt the Noahide laws from the Talmud, becoming servant slaves of the Jews for eternity. The sad part is that 95% of the Christians will believe it, and will drag the rest of us non-Jews with them.

Comment They send US citizen's text messages to Israel (Score 0, Flamebait) 287

NSA shares raw intelligence including Americans' data with Israel

http://www.theguardian.com/world/2013/sep/11/nsa-americans-personal-data-israel-documents

America is a vassal state of Israel. Israel gets to decide when and where America goes to war in the Nuclear Weapon Free Iran Act of 2013.

https://www.govtrack.us/congress/bills/113/s1881/text
