Fake Scientific Paper Detector

moon_monkey writes "Ever wondered whether a scientific paper was actually written by a robot? A new program developed by researchers at Indiana University promises to tell you one way or the other. It was actually developed in response to a prank by MIT researchers who generated a paper from random bits of text and got it accepted for a conference."
  • Testing... (Score:2, Interesting)

    by OakDragon ( 885217 ) on Tuesday April 25, 2006 @03:25PM (#15199834) Journal
    "We believe that there are subtle, short- and long-range word or even word string repetitions that exist in human texts, but not in many classes of computer-generated texts that can be used to discriminate based on meaning."


    Yep, it works!
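
The "word string repetition" signal described in that quote can be sketched roughly. The Indiana detector's actual model isn't public, so the n-gram size, the function name, and the sample text below are purely illustrative:

```python
from collections import Counter

def ngram_repetition_rate(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that occur more than once.

    A crude proxy for the 'word string repetition' signal the
    detector is said to use; the real classifier is not public,
    so this threshold-free score is illustrative only.
    """
    words = text.lower().split()
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    # Sum the occurrence counts of every n-gram that repeats at all.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(grams)

sample = ("the cat sat on the mat and the cat sat on the rug "
          "while the dog slept on the mat")
print(round(ngram_repetition_rate(sample, 3), 3))  # → 0.444
```

A real classifier would compare such scores against distributions measured on known-human and known-generated corpora rather than using the raw rate directly.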

  • by cbelt3 ( 741637 ) on Tuesday April 25, 2006 @03:29PM (#15199872) Journal
    I've taken a long posting that I wrote on my blog and dropped it into the site. And I am Inauthentic. Now I understand the "Blade Runner moment" comment in the article. I shall begin to surround myself with oddly colored Polaroids and snapshots of theoretically implanted ancestors.

    The nice thing is that we've finally settled the argument over whether machines can be made to drink beer and like it!
  • by im_thatoneguy ( 819432 ) on Tuesday April 25, 2006 @03:35PM (#15199933)
    Apparently I'm on average 49% artificial, based on school papers I wrote. I dub thee, program: a failure.
  • Re:Self defeating? (Score:5, Interesting)

    by cp.tar ( 871488 ) on Tuesday April 25, 2006 @03:39PM (#15199978) Journal

    I recently had to check out an essay-grading robot for my Introduction to Natural Language Processing class.

    I fed it the introduction of a randomly generated essay. It got a 4/5 on all counts.

    I figure, if teachers are going to use robots to grade essays, we should use robots to create them in the first place.

  • Re:Yes! (Score:1, Interesting)

    by Anonymous Coward on Tuesday April 25, 2006 @03:45PM (#15200044)
    ...or we could have a human just read the damn thing.

    Novel idea.
  • by currivan ( 654314 ) on Tuesday April 25, 2006 @04:25PM (#15200390)
    Duplicating the first half of the sample fake paper after the end of the footnotes makes it go from inauthentic (17%) all the way up to 91% authentic. It seems to be looking for long-range n-gram repetition, but it doesn't have a ceiling on the frequency or length of the repeated text.

    It shouldn't be hard to compare the distribution of n-gram recurrence rates (or distances between recurrences) to the observed distribution for actual papers. Something like a KL divergence would capture deviations in either direction.
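
The commenter's suggestion can be sketched as follows; the helper names, bin width, and smoothing constant are assumptions, not anything from the actual detector. The idea is to histogram the distances between successive recurrences of each n-gram, then compare a candidate paper's histogram against a reference built from genuine papers using a (smoothed) KL divergence:

```python
import math
from collections import defaultdict

def recurrence_distances(words, n=3):
    """Distances (in tokens) between successive occurrences of each n-gram."""
    last_seen = {}
    dists = []
    for i in range(len(words) - n + 1):
        g = tuple(words[i:i + n])
        if g in last_seen:
            dists.append(i - last_seen[g])
        last_seen[g] = i
    return dists

def histogram(values, bin_width=10):
    """Bin distances so sparse counts become a comparable distribution."""
    h = defaultdict(int)
    for v in values:
        h[v // bin_width] += 1
    return h

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of supports, with additive smoothing
    so mass present in p but absent in q is still penalized."""
    keys = set(p) | set(q)
    ps, qs = sum(p.values()), sum(q.values())
    total = 0.0
    for k in keys:
        pk = (p.get(k, 0) + eps) / (ps + eps * len(keys))
        qk = (q.get(k, 0) + eps) / (qs + eps * len(keys))
        total += pk * math.log(pk / qk)
    return total
```

In practice one would symmetrize the comparison (e.g. Jensen-Shannon) to catch deviations in either direction, and pool the reference histogram from known-genuine papers. The duplication trick described above would then show up as an implausible spike of recurrences at exactly the offset of the copied block.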
  • Re:It Caught Mine (Score:3, Interesting)

    This raises a question: how do Wikipedia articles fare? I'd guess that they should be at least *somewhat* scientific...
  • Re:Self defeating? (Score:2, Interesting)

    by Frumious Wombat ( 845680 ) on Tuesday April 25, 2006 @04:33PM (#15200445)
    Personally, I'd be more interested in modifying this for Fraud Detection. The robot looks over your data and text, and decides, "Sorry Dave, a leap of faith has occurred here." Presumably, at that point the robot locks you out of your lab.

    This could lead to a whole series of literary robots: The Too Many Coincidences in Fiction Detector, The Humanities Thesis Verbiage Reducer, The This Movie Is Going to Suck No Matter Who Acts In/Directs It Detector, and so forth.
  • False positives (Score:3, Interesting)

    by macklin01 ( 760841 ) on Tuesday April 25, 2006 @07:05PM (#15201439) Homepage

    Hmmm, it's an interesting idea, but it seems to give a lot of false positives. (So naturally, it will detect fake papers, if it thinks every paper is fake.)

    First thing I tried was some pages on my computational oncology website, in particular my cancer primer, which took me no small amount of time to write. Everything I fed it was determined to be inauthentic. Perhaps I just write like a robot. :-) I figured the detector was probably tuned for real papers, so it wasn't too big of a deal.

    So next I tried my most recent research paper, and it, too, was determined to be inauthentic, in fact with less authenticity than my website. So much for the theory that it's tuned for scientific papers only. This thing is starting to look pretty bogus to me ... but an interesting idea, nonetheless. -- Paul

  • Re:Yes! (Score:3, Interesting)

    by Ruff_ilb ( 769396 ) on Tuesday April 25, 2006 @07:50PM (#15201658) Homepage
    They did; the board that accepted the MIT paper, not consisting of specialists in the field, was likely confused by the pseudo-scientific gibberish they encountered. By mastering the methodology for the typical unification of access points and redundancy, the MIT students were able to effectively enter the scientific conference.
  • by Animats ( 122034 ) on Tuesday April 25, 2006 @09:56PM (#15202124) Homepage
    I've been trying my own papers and articles from Wikipedia. My own papers all score around 90%. Wikipedia articles that I consider good ones seem to score in the 80% range. Badly written fancruft scores very low.

    Some variant on this thing might be useful as a new article filter in Wikipedia. We need more automation over there to stem the flow of incoming dreck.
