Fake Scientific Paper Detector


Posted by ScuttleMonkey on Tuesday April 25, 2006 @04:22PM from the paper-unnoticed-amidst-conference-white-noise dept.

moon_monkey writes "Ever wondered whether a scientific paper was actually written by a robot? A new program developed by researchers at Indiana University promises to tell you one way or the other. It was actually developed in response to a prank by MIT researchers who generated a paper from random bits of text and got it accepted for a conference."

This discussion has been archived. No new comments can be posted.


Yes! (Score:5, Funny)

by stupidfoo ( 836212 ) writes: on Tuesday April 25, 2006 @04:24PM (#15199809)

I am always wondering what those damn robots are up to!

- Re:Yes! (Score:2, Funny)
  
  by Krakhan ( 784021 ) writes:
  
  ROBOT HOUSE!!!
- Re:Yes! (Score:3, Funny)
  
  by Schemat1c ( 464768 ) writes:
  
  I am always wondering what those damn robots are up to!
  
  They use old people's medicine for fuel.
- - - - Re:Yes! (Score:2)
        
        by portforward ( 313061 ) writes:
        
        I turned in a rough draft of a final paper for a class and it came back with a 30% chance that the paper was authentic. I thought, "oh no, what if my professor uses this tool and thinks that I did something bad". Then I remembered I had the text from my great, great, great grandfather's obituary written in the late 1800s, and it came back with a score of 26%. I don't think that the tool is very accurate for non-technical papers.
        
        Re:Yes! (Score:2)
        
        by Ariane 6 ( 248505 ) writes:
        
        My LPSC abstract came back at 45.6%. It's quite technical. Weird.
- - Re:Yes! (Score:3, Interesting)
    
    by Ruff_ilb ( 769396 ) writes:
    
    They did; the board that accepted the MIT paper, not consisting of specialists in the field, was likely confused by the pseudo-scientific gibberish they encountered. By mastering the methodology for the typical unification of access points and redundancy, the MIT students were able to effectively enter the scientific conference.
- - Re:Yes! (Score:2, Funny)
    
    by MarkChovain ( 952233 ) writes:
    
    I for one, welcome our paper writing robot overlords.
    
    I for one am a paper writing robot overlord, you insensitive clod! I for one welcome our new video game consoles. They are called "hands". Shouldn't it be something like this will ever happen then you will see that they bring things out in managable increments. Sure it is a biggish program, but many lone hackers have written one in under one person/year.
That's good and all (Score:5, Funny)

by XxtraLarGe ( 551297 ) writes: on Tuesday April 25, 2006 @04:25PM (#15199830) Journal

but I wonder if it can tell if a paper was written by a million monkeys pounding on typewriters?

- Re:That's good and all (Score:4, Funny)
  
  by denverradiosucks ( 653647 ) writes: on Tuesday April 25, 2006 @04:31PM (#15199893) Homepage
  
  Obligatory Simpsons quote
  
  Monkeys typing on typewriters as Mr. Burns works on the next great American novel:
  
  Burns: This is a thousand monkeys working at a thousand typewriters. Soon they'll have written the greatest novel known to man.
  (monkey smoking cigar typing on a typewriter)
  Burns: Let's see. It was the best of times, it was the BLURST of times! You stupid monkey! (Smacks monkey upside his head)
  
- Re:That's good and all (Score:5, Informative)
  
  by visgoth ( 613861 ) writes: on Tuesday April 25, 2006 @04:34PM (#15199931)
  
  Oh, I'm sure the work of monkeys is quite easily identifiable [vivaria.net].
  
  - Re:That's good and all (Score:2)
    
    by demonbug ( 309515 ) writes:
    
    Yup, it's inauthentic [whitehouse.gov].
  - - Re:That's good and all (Score:2)
      
      by 1u3hr ( 530656 ) writes:
      
      The monkeys and Shakespeare meme has an obscure history. This page [angelfire.com] tracks it back to statistical mechanics, ca. 1913.
      [Babelfish translation]
      Let us conceive that one drew up a million monkeys randomly to be struck the keys of a typewriter and that, under the monitoring of illiterate foremen, these monkeys typists work with heat ten hours per day with a million typewriters varied types. The illiterate foremen would gather the blackened sheets and would connect them in volumes. And at the end of one year, th
- Re:That's good and all (Score:3, Funny)
  
  by Rakshasa Taisab ( 244699 ) writes:
  
  I kinda enjoy getting mod points, it would be sad if they replaced that feature.
- Re:That's good and all (Score:2)
  
  by drinkypoo ( 153816 ) writes:
  
  There's already a program that determines the likelihood that two articles are written by the same author. All that is needed is to combine it with a http query to slashdot...
- Re:That's good and all (Score:3, Funny)
  
  by iNetRunner ( 613289 ) writes:
  
  Seems like it would be easier to develop a program that automatically detects /. dupes.. but no.
  
  *At least the million /. pounding monkeys detect it..*
- - Re:That's EASY! (Score:5, Funny)
    
    by Hal_Porter ( 817932 ) writes: on Tuesday April 25, 2006 @07:55PM (#15201365)
    
    I, for one, peruse the blogosphere. On my Powerbook, wearing a black turtle neck and beret. Stroking my goatee thoughtfully. Sipping a latté in a café
    
    If I could just find a way to recharge my PowerBook from your hatred, I could stop carrying this ugly power adaptor.
    
    - Re:That's EASY! (Score:3, Funny)
      
      by Unski ( 821437 ) writes:
      
      Sir I regret to inform you that you are a ruffian. I for one sit not in a place so vile and common as a 'café', examining the flawed writings of others, but in a temple constructed purely out of my supercilious transcendent superiority. I consume nothing so plebeian as 'The Internet' but rather a rasterized, marked-up and projected form of my own rigourous, peerless stream of consciousness (with blue aqueous scroll-bars). I have no need for facial hair or indeed any of your corporeal trappings and henc
Turing test? (Score:5, Insightful)

by Nesetril ( 969734 ) writes: on Tuesday April 25, 2006 @04:25PM (#15199833)

so can a robot write a paper and then decide whether the paper was written by a robot (itself)?

- Self defeating? (Score:5, Funny)
  
  by benhocking ( 724439 ) writes: <benjaminhocking@nOsPAm.yahoo.com> on Tuesday April 25, 2006 @04:32PM (#15199911) Homepage Journal
  
  It seems like it wouldn't be too difficult to modify the MIT program to use this new anti-robot robot to write papers that this anti-robot robot would not be able to detect. Ideally, this would be done with a learning algorithm (so that it could easily be extended to other anti-robot robot programs), but reverse-engineering the anti-robot robot (by humans) should also provide a solution.
  
  Now that Indiana U has thrown down the gauntlet, I wouldn't be surprised if MIT responds. Hopefully it will result in an even better paper-writing robot. Ideally, it will lead to dissertation-writing robots. :)
  
  - Re:Self defeating? (Score:5, Interesting)
    
    by cp.tar ( 871488 ) writes: <cp.tar.bz2@gmail.com> on Tuesday April 25, 2006 @04:39PM (#15199978) Journal
    
    I recently had to check out an essay-grading robot for my Introduction to Natural Language Processing class.
    
    I'd fed it the introduction of a randomly generated essay. It got a 4/5 on all counts.
    
    I figure, if teachers are going to use robots to grade essays, we should use robots to create them in the first place.
    
    - Re:Self defeating? (Score:2)
      
      by kabz ( 770151 ) writes:
      
      That is pure fantasy. Everyone knows that the true *academic* way to grade papers is to toss them down a nearby stairwell...
      
      Hence my *lead-weighted* document folders. Bwahahahah.
  - Re:Self defeating? (Score:2)
    
    by bokmann ( 323771 ) writes:
    
    Douglas Hofstadter would be proud of you.
  - Re:Self defeating? (Score:5, Funny)
    
    by mctk ( 840035 ) writes: on Tuesday April 25, 2006 @04:53PM (#15200134) Homepage
    
    Eventually my students won't have to write papers and I won't have to grade them! Think of the potential application of this technology towards education!
    
    - Re:Self defeating? (Score:2)
      
      by frdmfghtr ( 603968 ) writes:
      
      Eventually my students won't have to write papers and I won't have to grade them! Think of the potential application of this technology towards education!
      This reminds me of a movie where a few students started sending tape-recorders to class instead of themselves. Gradually the scene had the professor lecturing to a room full of tape recorders. The last step in this scenario was a tape of the lecture being played to a room full of machines taping it.
      
      (Dammit if I can't recall which movie that is though.)
    - - Re:Self defeating? (Score:2, Funny)
        
        by mctk ( 840035 ) writes:
        
        Only if their re-writing robots are designed intelligently...
        Okay, actually I just wanted to comment that I love the sig.
  - Re:Self defeating? (Score:2)
    
    by quokkapox ( 847798 ) writes:
    
    The day is coming when you'll have to submit an authentic scientific paper in order to comment on a slashdot story.
    On that day, I'll be long dead and so will my Moravec-inspired uploaded mind-children.
  - Re:Self defeating? (Score:3, Insightful)
    
    by BraksDad ( 963908 ) writes:
    
    Maybe after a string of anti robot robots, MIT would come up with a robot that would generate a real scientific paper!
    
    next comes your anti robot robot
    then the anti anti robot robot robot
    and of course the anti anti anti robot robot robot robot
    and the anti anti anti anti robot robot robot robot robot
    ...
    I could go on since cut and paste is so easy ;-)
    
    Perhaps it would be a million anti's followed by a million and one robots before something useful came out of such an exercise, but wouldn't it be cool t
  - Re:Self defeating? (Score:2, Interesting)
    
    by Frumious Wombat ( 845680 ) writes:
    
    Personally, I'd be more interested in modifying this for Fraud Detection. The robot looks over your data and text, and decides, "Sorry Dave, a leap of faith has occurred here." Presumably, at that point the robot locks you out of your lab.
    
    This could lead to a whole series of literary robots: The Too Many Coincidences in Fiction Detector, The Humanities Thesis Verbiage Reducer, The This Movie Is Going to Suck No Matter Who Acts In/Directs It Detector, and so forth.
  - Re:Self defeating? (Score:2)
    
    by pclminion ( 145572 ) writes:
    
    Clearly, what we need here is an anti- anti-robot robot robot.
  - Re:Self defeating? (Score:2)
    
    by jdgeorge ( 18767 ) writes:
    
    Now that Indiana U has thrown down the gauntlet, I wouldn't be surprised if MIT responds. Hopefully it will result in an even better paper-writing robot. Ideally, it will lead to dissertation-writing robots. :)
    
    Hmmm.... Have you ever read a dissertation? You'd have a hard time convincing me that such a robot hasn't been in common use for quite a while.
    - Reading dissertations (Score:2)
      
      by benhocking ( 724439 ) writes:
      
      I've read several. Hopefully, I'll have written one soon. Since my research is in neural networks, I figure if I can create a neural network that writes my dissertation for me, that's not really cheating. (That's not really what I'm trying to do - my actual research topic is on the cognitive effects of gamma and theta oscillations on a neural network model of the hippocampus.)
  - Re:Self defeating? (Score:2)
    
    by marcosdumay ( 620877 ) writes:
    
    You are assuming that P == NP here. Or that the bot that creates the paper has infinite time to run.
    
    In this specific situation, it may be useful with a learning algorithm, but not in the general case.
- Re:Turing test? (Score:2, Informative)
  
  by ironring2006 ( 968941 ) writes:
  
  Speaking of Turing, this showed up in the references for the automatic paper that I generated:
  
  Turing, A., Wilkes, M. V., Nehru, B., Wang, F. Z., Subramanian, L., Zhao, W., Beaman, N. A., Turcotte, B. A., and Wu, V. Refining consistent hashing and 16 bit architectures with SandyEos. Journal of Efficient, Highly-Available Communication 1 (Apr. 2002), 50-62.
  Glad to see he's still contributing to the field from the grave!
Testing... (Score:2, Interesting)

by OakDragon ( 885217 ) writes:

"We believe that there are subtle, short- and long-range word or even word string repetitions that exist in human texts, but not in many classes of computer-generated texts that can be used to discriminate based on meaning."

RESULTS: FAKE

Yep, it works!
A USEFUL application... (Score:2, Funny)

by Flimzy ( 657419 ) writes:

When will MIT modify this technology to filter all the spam from my mailbox?
- Re:A USEFUL application... (Score:2)
  
  by Khashishi ( 775369 ) writes:
  
  the day when I start using robots to churn out spam. darn, I guess I'll have to come up with some other scheme.
Discrimination (Score:5, Funny)

by hsmith ( 818216 ) writes: on Tuesday April 25, 2006 @04:26PM (#15199843)

I hope the ACLU will ensure that discrimination against metal people will not be allowed to continue.

- - Re:Discrimination (Score:3, Funny)
    
    by Iron Condor ( 964856 ) writes:
    
    That is people of metal, you biologist
    I think the preferred term is "Ferro-Americans".
An interesting experiment (Score:5, Funny)

by fm6 ( 162816 ) writes: on Tuesday April 25, 2006 @04:27PM (#15199847) Homepage Journal

Has anybody fed Dvorak's latest column [slashdot.org] to this program? I've often wondered if he actually writes his columns, or just generates verbiage at random.

- Re:An interesting experiment (Score:5, Funny)
  
  by irregular_hero ( 444800 ) writes: on Tuesday April 25, 2006 @04:32PM (#15199898)
  
  "This text had been classified as
  INAUTHENTIC
  with a 24.9% chance of being authentic text"
  
  No kidding.
  
  - Re:An interesting experiment (Score:3, Funny)
    
    by Anonymous Coward writes:
    
    Yep, I tried that too.
    I also tried another article from ABC News about meat eaters contributing to global warming (http://abcnews.go.com/Technology/story?id=1856817&page=1 [go.com]). It was inauthentic/28.8%.
    
    Looks like they have a crafty team of robots there at abc :)
  - Re:An interesting experiment (Score:2)
    
    by Rimbo ( 139781 ) writes:
    
    Holy crap. I just verified that, except I got a 25.7% chance of being authentic.
    
    Wow!
  - Re:An interesting experiment (Score:2)
    
    by syousef ( 465911 ) writes:
    
    It appears to be very dependent on the length of the article as well with short ones being deemed inauthentic. I passed my astronomy masters papers through it, and it did well predicting they were genuine (scores of 80%-96%), except for a short 4 page proposal which it said was fake (scored around 46%). However for the papers that passed if I just put the abstract in that would fail.
- Re:An interesting experiment (Score:2, Informative)
  
  by Ontain ( 931201 ) writes:
  
  that's not surprising. i did a few articles and they come up in the 20ish percent range. this detector isn't very good.
  - Re:An interesting experiment (Score:2, Informative)
    
    by MindStalker ( 22827 ) writes:
    
    It was intended to classify scientific studies. Not articles.
    - Re:An interesting experiment (Score:3, Funny)
      
      by FhnuZoag ( 875558 ) writes:
      
      I liked the vast global robot conspiracy explanation better.
- Re:An interesting experiment (Score:2, Funny)
  
  by jacquems ( 610184 ) writes:
  
  I tested it on the text from the Time Cube index page, and it was rated as AUTHENTIC with a 95.3% chance of being an authentic paper.
Typos (Score:2)

by Nom du Keyboard ( 633989 ) writes:

"We believe that there are subtle, short- and long-range word or even word string repetitions that exist in human texts, but not in many classes of computer-generated texts that can be used to discriminate based on meaning."
Do robots make typos? Do they make the same typos each time, or different ones?
Therein lies the true heart of a proper detector.
- Re:Typos (Score:2)
  
  by DanTheLewis ( 742271 ) writes:
  
  Do robots make typos? Do they make the same typos each time, or different ones? Therein lies the true heart of a proper detector.
  
  I don't make typos, but that doesnt mean I'm not a robot.
- Re:Typos (Score:3, Informative)
  
  by dlakelan ( 43245 ) writes:
  
  Do robots make typos? Do they make the same typos each time, or different ones?
  
  Based on the slashdot articles that get posted, I would say YES.
  
  Actually it's pretty easy to add random convincing misspellings to text, you could use a database from something like usenet, and a spell checker to map misspelled words to their real counterparts, and then have a straightforward algorithm for replacing some set of words with misspellings, and you could tune that for consistency. It would be easier than many other as
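The approach sketched in this comment (map correct words to known misspellings, then swap some of them in) might look roughly like the following. This is only an illustration: the `MISSPELLINGS` table, the `inject_typos` name, and the `rate` and `seed` parameters are all hypothetical, and a real version would build the table from a corpus and a spell checker as the comment suggests.

```python
import random

# Hypothetical lookup table: correct word -> misspellings seen in the wild.
# The comment proposes mining this from usenet with a spell checker.
MISSPELLINGS = {
    "receive": ["recieve"],
    "definitely": ["definately"],
    "separate": ["seperate"],
}

def inject_typos(text, rate=0.3, seed=0, table=MISSPELLINGS):
    """Replace a fraction of known words with a misspelled variant."""
    rng = random.Random(seed)  # fixed seed: the same typos on every run
    out = []
    for word in text.split():
        variants = table.get(word.lower())
        if variants and rng.random() < rate:
            out.append(rng.choice(variants))
        else:
            out.append(word)
    return " ".join(out)
```

Fixing the seed is one simple way to get the "consistency" the comment mentions, since the same author would then make the same mistakes each time.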
- Re:Typos (Score:5, Funny)
  
  by brian0918 ( 638904 ) writes: <brian0918.gmail@com> on Tuesday April 25, 2006 @04:54PM (#15200135)
  
  E-mail spambots have been making typos for years.
  
Sadly, It appears that I am a robot. (Score:4, Interesting)

by cbelt3 ( 741637 ) writes: <cbelt&yahoo,com> on Tuesday April 25, 2006 @04:29PM (#15199872) Journal

I've taken a long posting that I wrote on my blog and dropped it into the site. And I am Inauthentic. Now I understand the "Bladerunner Moment" comment in the article. I shall begin to surround myself with oddly colored polaroids and snapshots of theoretically implanted ancestors.

The nice thing is that we've finally settled the argument if machines can be made to drink beer and like it !

See what it says about slashdot (Score:3, Funny)

by Locke2005 ( 849178 ) writes: on Tuesday April 25, 2006 @04:30PM (#15199880)

According to the program, the comments to this article are rated as follows:
This text had been classified as INAUTHENTIC with a 32.2% chance of being authentic text
Bearing in mind that text over 50% chance will be classified as authentic, this adds credence to the theory that slashdot comments are generated by monkeys randomly typing on keyboards.

- What it says about anything (Score:3, Informative)
  
  by Pi_0's don't shower ( 741216 ) writes:
  
  I just finished writing a scientific paper for publication. Apparently, this filter is very reliant on using long-term pattern recognition. When I fed this application my introduction only, it told me my work was INAUTHENTIC with a 35% chance of authenticity. When I fed it the first two sections, it said it was AUTHENTIC with a 66% chance of authenticity. And finally, when I fed it the entire paper, it said it was AUTHENTIC at the 87% level.
  
  So apparently, all you need to do to beat this filter is insert
Sounds like a major innovation in input screening (Score:2)

by mi ( 197448 ) writes:

From e-mail spam, to Slashdot submissions, to "letters to editor", to political petitions.
Or is this just another application of Bayesian filters again?
The program is a failure. (Score:3, Interesting)

by im_thatoneguy ( 819432 ) writes: on Tuesday April 25, 2006 @04:35PM (#15199933)

Apparently I'm on average 49% artificial, based on school papers I wrote. I dub thee program: a failure.

- Re:The program is a failure. (Score:2)
  
  by tomstdenis ( 446163 ) writes:
  
  Not so. A single round of the Miller-Rabin primality test can let a composite number through up to 25% of the time.
  
  The trick to reading the results is when it says "definitely fake" it's fake. Otherwise you ignore the result as either "not-fake" or inconclusive.
  
  Tom
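For readers unfamiliar with the comparison, here is a minimal sketch of one Miller-Rabin round, written only to illustrate the one-sided reading of results described in this comment. The function name `passes_round` is made up for this example, and the base `a` is assumed to lie between 2 and n-2.

```python
def passes_round(n, a):
    """One Miller-Rabin round with base a.

    False means n is definitely composite. True means only
    "not shown to be composite", which is inconclusive.
    """
    if n < 3 or n % 2 == 0:
        return n == 2
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False
```

The composite 2047 = 23 * 89 passes with base 2 but fails with base 3, which is why a "pass" is treated as inconclusive and only a "fail" is trusted.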
  - Re:The program is a failure. (Score:2)
    
    by im_thatoneguy ( 819432 ) writes:
    
    Well... actually the 49% was based on a very small sample, when I enlarged it to my entire class (just for fun) it moved down to about 20%... pretty conclusively inaccurate.
    - Re:The program is a failure. (Score:2)
      
      by tomstdenis ( 446163 ) writes:
      
      Yeah, my point though is the result is not aggregate it's binary. So either you get "yes this is fake" [say above 80% probability] or it isn't [below 80%].
      
      I don't know what the threshold for this test is but it's likely not around 50%.
      
      Tom
- - Re:The program is a failure. (Score:2)
    
    by im_thatoneguy ( 819432 ) writes:
    
    Oh most definitely. But still real. :)
Only works for scientific papers (Score:5, Informative)

by gurps_npc ( 621217 ) writes: on Tuesday April 25, 2006 @04:35PM (#15199936) Homepage

If you try to use it on any human-written NON-scientific text, such as Lincoln's Gettysburg Address, it almost always considers it false.
I suspect that it is looking for the conventional thinking with conventional word structure. As such, it is NOT a good idea i

- Re:Only works for scientific papers (Score:5, Informative)
  
  by nasor ( 690345 ) writes: on Tuesday April 25, 2006 @05:07PM (#15200255)
  
  No, it doesn't even seem to work on scientific papers. I submitted four papers from the latest issue of Inorganic Chemistry and it thought 2 out of 4 were false:
  
  Inauthentic: Assembly of a Heterobinuclear 2-D Network: A Rare Example of Endo- and Exocyclic Coordination of PdII/AgI in a Single Macrocycle.
  
  Inauthentic: Pyrazolate-Bridging Dinucleating Ligands Containing Hydrogen-Bond Donors: Synthesis and Structure of Their Cobalt Analogues
  
  Authentic: Manganese Complexes of 1,3,5-Triaza-7-phosphaadamantane (PTA): The First Nitrogen-Bound Transition-Metal Complex of PTA
  
  Authentic: Structure, Luminescence, and Adsorption Properties of Two Chiral Microporous Metal-Organic Frameworks
  
  Based on this (small) sampling, the program doesn't appear to do any better than if it were to guess randomly. I wonder if this thing is even supposed to work, or if it just returns a random result based on a hash of the paper or something?
  
  - Read the Paper - Looks at Repetition (Score:3, Informative)
    
    by Constantine Evans ( 969815 ) writes:
    
    Read the paper listed in the menu of the website. The system essentially compresses the text with different window sizes, and then looks at the compression factors. In other words, it is only looking for repetition of strings. This is absurdly easy to fool, and the MIT generator could be easily fixed to pass this filter. For example, try entering a random text once (your post, for example). Note that it fails. Then append a few copies of the same text, and run that through. Your post, when run once, is too
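A rough sketch of the idea as this comment describes it: compress the text with several window sizes and compare the compression factors. It uses zlib as a stand-in compressor, which is an assumption; the function name `compression_factors` and the chosen window sizes are illustrative and not taken from the Indiana paper.

```python
import zlib

def compression_factors(text, window_bits=(9, 11, 13, 15)):
    """Return {window_bits: compressed_size / original_size}.

    Lower values mean more repetition was found within a
    window of 2**window_bits bytes.
    """
    data = text.encode("utf-8")
    if not data:
        raise ValueError("text must not be empty")
    factors = {}
    for wbits in window_bits:
        comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
        packed = comp.compress(data) + comp.flush()
        factors[wbits] = len(packed) / len(data)
    return factors
```

Appending copies of the same text, as suggested above, drives the factor for the large windows down sharply, which is consistent with the claim that pure repetition is enough to change the score.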
- Re:Only works for scientific papers (Score:2)
  
  by zurmikopa ( 460568 ) writes:
  
  It seems to think that my blog has a 94% chance of being "a human-written authentic scientific document" ...
In case anyone was wondering... (Score:2)

by GillBates0 ( 664202 ) writes:

...I just extracted the text from the PDF version of their paper [indiana.edu] on the subject (titled "Using Compression to Identify Classes of Inauthentic Texts") and ran it through the detector.
It passed with a "90.1% chance of being an authentic paper".
surely already done? (Score:2)

by user24 ( 854467 ) writes:

plenty of plagiarism detection software out there; if the prank was really just random bits of (I assume pre-existing and public) text, then all the program need do is search google for a few random snippets, no?
Ah.... (Score:2, Funny)

by BaronSprite ( 651436 ) writes:

Maybe slashdot can start running it on their links for "cold fusion in 1 year!".......
FYI: it wasn't really a conference (Score:2)

by jxyama ( 821091 ) writes:

FYI, while the "conference" the prank paper was accepted for is arguably a real conference, it's certainly not a reputable one. The "conference" ("World Multi-Conference on Systemics, Cybernetics and Informatics") is famous for spamming everyone in just about every semi-related subject to submit, and has a famously low bar for acceptance. See http://en.wikipedia.org/wiki/WMSCI [wikipedia.org]
As I espected (Score:2)

by voice_of_all_reason ( 926702 ) writes:

JAR JAR Oyi, mooie-mooie! I luv yous! The frog-like creature kisses the JEDI.
QUI-GON Are you brainless? You almost got us killed!
JAR JAR I spake.
QUI-GON The ability to speak does not make you intelligent. Now get outta here!

This text had been classified as INAUTHENTIC with a 46.0% chance of being authentic text
What does it think of my paper? (Score:2)

by onco_p53 ( 231322 ) writes:

Results from one of my papers: http://aem.asm.org/cgi/content/full/70/10/5980 [asm.org]

This text had been classified as
AUTHENTIC
with a 95.2% chance of being an authentic paper

Whew!!, cool maybe I'll pass the Turing test too.
There're a lot of "my stuff was inauthentic" posts (Score:2)

by The_REAL_DZA ( 731082 ) writes:

from people who have fed it (and no, I haven't R'd TFA -- this is still SlashDot, isn't it?!?!) their own (genuine) papers or something they feel is "authentic", and I wonder if the reason is less the fault of the software and more the fault of (genuine/human) authors writing (intentionally or unintentionally) in such a style because it's perceived to be the way they're "supposed" to write. Maybe software like this will cause authors to put a little more thought into their craft and not allow themselves to
Folks, I am a robot plagiarist (Score:2)

by Flying pig ( 925874 ) writes:

I took one of my own postings and got a score of 11%. And it was something I had actually written myself, a piece of reasonable length about a subject on which I have first hand experience.
I then tried an article from Scientific American and it scored 24% - sorry, guys, time for me to cancel the subscription, you are full of it. Alternatively, of course, it is the University of Indiana School of Informatics that's full of it and the air is thick with over-hype. It would be interesting for someone with the t
I am in awe (Score:5, Informative)

by DingerX ( 847589 ) writes: on Tuesday April 25, 2006 @05:02PM (#15200211) Journal

So I go there, and I start shoving it text from my hard drive. I try:

A) Text of an article (Philosophy) I (native English speaker) wrote in Italian: 98.5 Authentic.
B) Text of an article I wrote in English (History): 87.8
C) Text of an article (History) written in French by a native French speaker and translated into English: 93.2
D) Critical edition of a 14th-century Latin text (Theology): 97.7 Authentic.
E) Documentation to a Field Artillery Simulation: 95.3
F) A completely bogus narrative for a monastic order that doesn't exist, written in a style that mimics A)-C): 16.8% Inauthentic

So in this case, we have a human written document that has superficial meaning, but is written as a "fake scientific paper", and registering as such.

And yes, I did read the "purpose" of the page; I know it's not supposed to detect it.

And yet it does, decisively.

- Re:I am in awe (Score:2)
  
  by yali ( 209015 ) writes:
  
  Interesting... Just for the heck of it, I ran Alan Sokal's [nyu.edu] paper Transgressing the Boundaries [nyu.edu] through the detector. It came back with a 93.8% chance of being authentic.
  
  For those of you who don't remember the story, Sokal, a physicist, wrote a paper full of postmodern-sounding gobbledygook, asserting among other things that gravity is a social construction (the paper was subtitled, "Towards a Transformative Hermeneutics of Quantum Gravity"). The paper was accepted at a peer-reviewed humanities journal. Sokal
  - Re:I am in awe (Score:2)
    
    by yali ( 209015 ) writes:
    
    Correction: The humanities journal in question was not peer reviewed; see the editors' account [nyu.edu] of why they published the hoax paper.
- President Bush's Biography (Score:2, Funny)
  
  by b0wl0fud0n ( 887462 ) writes:
  
  I'm in awe too. I put in George Bush's biography from the whitehouse.gov website and got
  
  This text had been classified as
  INAUTHENTIC
  with a 27.3% chance of being authentic text
  I'm amazed too! It works!
  - Re:President Bush's Biography (Score:2)
    
    by Ariane 6 ( 248505 ) writes:
    
    OTOH, the State of the Union came back as Authentic. ...I'll have to find something the man wrote himself to prove that he really IS a robot...
Well, that's a relief (Score:2)

by Reality Master 101 ( 179095 ) writes:

The Special Theory of Relativity [gutenberg.org] got a 91.9% chance of being authentic. I'm sure if Einstein were alive, he'd be relieved.
The Sokal Affair (Score:2)

by Morganth ( 137341 ) writes:

All this talk without a single mention of the Sokal Affair [wikipedia.org]? It's pretty relevant. Also be sure to check out Paul Boghossian's article, "What the Sokal Hoax Ought to Teach Us." [nyu.edu] Great reading.
Can fool it by duplicating first page (Score:2, Interesting)

by currivan ( 654314 ) writes:

Duplicating the first half of the sample fake paper after the end of the footnotes makes it go from inauthentic (17%) all the way up to 91% authentic. It seems to be looking for long-range n-gram repetition, but it doesn't have a ceiling on frequency or length of the repeated text.

It shouldn't be hard to compare the distribution of n-gram recurrence rates (or distances between recurrences) to the observed distribution for actual papers. Something like a KL divergence would capture deviations in either dir
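One way the suggestion in this comment could be sketched, under several assumptions: distances between recurrences of each word n-gram are bucketed on a log scale, smoothed, and compared with a reference histogram using KL divergence. The function names, the bucket count, and the add-one smoothing are all choices made for this illustration.

```python
import math
from collections import Counter

def recurrence_gaps(words, n=2):
    """Distances between consecutive occurrences of each n-gram."""
    last_seen = {}
    gaps = []
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        if gram in last_seen:
            gaps.append(i - last_seen[gram])
        last_seen[gram] = i
    return gaps

def gap_histogram(gaps, buckets=12):
    """Log2-bucketed gap distribution with add-one smoothing."""
    counts = Counter(min(int(math.log2(g)), buckets - 1) for g in gaps)
    total = len(gaps) + buckets
    return [(counts[b] + 1) / total for b in range(buckets)]

def kl_divergence(p, q):
    """KL(p || q) for two smoothed histograms of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

A text would then be scored by `kl_divergence(gap_histogram(its_gaps), reference_histogram)`, where the reference comes from real papers. Unlike a bare compression factor, this penalizes too much repetition as well as too little.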
Heuristic Bayesian Filtering Success! (Score:2)

by smellsofbikes ( 890263 ) writes:

We applaud development of heuristic filter success. Many sophisticated algorithms go into recursive development of low-latency, high-bandwidth sieving systems. Ongoing procedural optimization with commensalism yields best signal/noise ratio. Additional funding needed!
Fake mission statement detector? (Score:2)

by geobeck ( 924637 ) writes:

I wonder if this program, with a different set of algorithms, would be able to detect whether a corporate mission statement was created using the Dilbert Mission Statement Generator [dilbert.com]. (Beware; Dilbert.com is pop-up hell.)
We need a Sarfatti detector (Score:2)

by azaris ( 699901 ) writes:

As a litmus test, any such device should be fed the writings of Jack Sarfatti, PhD (http://en.wikipedia.org/wiki/Jack_Sarfatti [wikipedia.org]). It is perfectly possible that a paper produced 100% by a human still consists of random bullshit (See: "Waldyr A. Rodrigues Jr: A Comment on Emergent Gravity" at http://arxiv.org/abs/gr-qc/0602111 [arxiv.org]).
It's just another prank. (Score:2)

by kalirion ( 728907 ) writes:

The program only pretends to use computer algorithms. In reality, it emails the submitted document to the Indiana University speed-reader champion trained to recognize fake submissions. The prof skims it, and emails back the response.
But can it write a paper that will be rejected (Score:2)

by Tired and Emotional ( 750842 ) writes:

Looks like this might be much harder
Fake Scientific Paper Detector (Score:2)

by RNLockwood ( 224353 ) writes:

Oh, sorry, I thought that the Scientific Paper Detector was a fake.
Sheesh (Score:2)

by suv4x4 ( 956391 ) writes:

A new program developed by researchers at Indiana University promises to tell you one way or the other.

You would think that this embarrassment would cause the paper reviewers to look more closely at what the heck they are accepting, but instead we get a program that does that job better.

Just anything, ANYTHING to keep those reviewers from actually getting their work done is well accepted.
Apparently slashdot is also written by robots (Score:2)

by jbf ( 30261 ) writes:

The following text from the slashdot homepage classified as inauthentic:

Neopallium writes to tell us that in a recent announcement at the Desktop Linux Summit the Free Standards Group reports fourteen of the leading Linux vendors have pledged support for the newest release of the Linux Standards Base. From the article: "'The Release of LSB 3.1 is another milestone achieved by the industry and the Open Source Community that delivers ever increasing value to customers,' said Reza Rooholamini, director
False positives (Score:3, Interesting)

by macklin01 ( 760841 ) writes: on Tuesday April 25, 2006 @08:05PM (#15201439) Homepage

Hmmm, it's an interesting idea, but it seems to give a lot of false positives. (So naturally, it will detect fake papers, if it thinks every paper is fake.)

First thing I tried was some pages on computational oncology website [uci.edu], in particular, my cancer primer [uci.edu], which I wrote in not a short time. Everything I fed was determined to be inauthentic. Perhaps I just write like a robot. :-) I figured that perhaps the detector was more primed for real papers, so I figured it wasn't too big of a deal.

So, next I tried my most recent research paper [sciencedirect.com], and it, too, was determined to be inauthentic, and in fact with less authenticity than my website. So much for the theory of being primed for scientific papers only. This thing is starting to look pretty bogus to me ... but an interesting idea, nonetheless. -- Paul

Discarded Theories (Score:2)

by Sir Holo ( 531007 ) * writes:

I've always wanted to submit a paper to one of these vanity conference "peer reviewed journals" [cough cough], the ones where no paper is ever rejected, describing some work on long-discarded theories (>50 years). Just to be cheeky.

How does "N-ray studies of the Phlogiston Content of Polywater" sound?

Should probably wait until after tenure...
Trying Wikipedia articles (Score:5, Interesting)

by Animats ( 122034 ) writes: on Tuesday April 25, 2006 @10:56PM (#15202124) Homepage

I've been trying my own papers and articles from Wikipedia. My own papers all score around 90%. Wikipedia articles that I consider good ones seem to score in the 80% range. Badly written fancruft scores very low.
Some variant on this thing might be useful as a new article filter in Wikipedia. We need more automation over there to stem the flow of incoming dreck.

- Re:too bad this technology... (Score:2)
  
  by TheRealMindChild ( 743925 ) writes:
  
  Way to plug your political agenda.
  
  What we really need is a fake Ecstasy detector. The world would be a better place.
- Some fields don't have those (Score:2)
  
  by spun ( 1352 ) writes:
  
  Literary criticism, for instance. Lit. Crit. papers never make sense so only some form of advanced computer algorithm would be able to tell if a paper was written by a human.
- Re:Great... (Score:2)
  
  by apt142 ( 574425 ) writes:
  
  You mean these guys? [slashdot.org]
- Re:How about . . . (Score:2)
  
  by hal2814 ( 725639 ) writes:
  
  Hey, if you don't like 1-ply you can always fold it in half.
  - Re:How about . . . (Score:3, Funny)
    
    by mypalmike ( 454265 ) writes:
    
    Hey, if you don't like 1-ply you can always fold it in half.
    
    And if you don't like 2-ply, you can separate the sheets. Keep in mind that this works best before you wipe.
- Re:What a Downer (Score:2)
  
  by zippthorne ( 748122 ) writes:
  
  Don't feel too bad; at many conferences, the "Pros" make up the presentation on the spot while the preceding presenter is presenting his research and insight.
- From the paper. (Score:2)
  
  by The Real Nem ( 793299 ) writes:
  
  One must understand our network configuration to grasp the genesis of our results. We ran a deployment on the NSA's planetary-scale overlay network to disprove the mutually largescale behavior of exhaustive archetypes. First, we halved the effective optical drive space of our mobile telephones to better understand the median latency of our desktop machines. This step flies in the face of conventional wisdom, but is instrumental to our results. We halved the signal-to-noise ratio of our mobile telephones. W
- Re:It Caught Mine (Score:3, Interesting)
  
  by Em Adespoton ( 792954 ) writes:
  
  This raises a question... how do Wikipedia articles fare? --I'd guess that they should be at least *somewhat* scientific....
- Re:Spam? (Score:2)
  
  by ClickOnThis ( 137803 ) writes:
  
  Could I run this spam email through the inauthentic paper detector and have it come out as authentic?
  
  This touches on the one issue that has yet to be discussed here: what, if anything, could this program do to help identify spam? My first thought is "probably not much", only because papers tend to be a lot longer than e-mails, thus giving the program a chance to generate better statistics for a Bayesian filter to make a decision. Even so, when I look at some of the surreal gibberish embedded in spam tha

