AI Beats Human Sleuth at Finding Problematic Images in Research Papers (nature.com)
An algorithm that takes just seconds to scan a paper for duplicated images flags more suspicious images than a person does. Nature: Scientific-image sleuth Sholto David blogs about image manipulation in research papers, a pastime that has exposed him to many accounts of scientific fraud. But other scientists "are still a little bit in the dark about the extent of the problem," David says. He decided he needed some data. The independent biologist in Pontypridd, UK, spent the best part of several months poring over hundreds of papers in one journal, looking for any with duplicated images. Then he ran the same papers through an artificial-intelligence (AI) tool. Working at two to three times David's speed, the software found almost all of the 63 suspect papers that he had identified -- and 41 that he'd missed. David described the exercise last month in a preprint, one of the first published comparisons of human versus machine for finding doctored images.
The findings come as academic publishers reckon with the problem of image manipulation in scientific papers. In a 2016 study, renowned image-forensics specialist Elisabeth Bik, based in San Francisco, California, and her colleagues reported that almost 4% of papers she had visually scanned in 40 biomedical-science journals contained inappropriately duplicated images. Not all image manipulation is done with nefarious intent. Authors might tinker with images by accident, for aesthetic reasons or to make a figure more understandable. But journals and others would like to catch images with alterations that cross the line, whatever the authors' motivation. And now they are turning to AI for help.
Some 200 universities, publishers and scientific societies already rely on Imagetwin, the tool that David used for his study. The software compares images in a paper with more than 25 million images from other publications -- the largest such database in the image-integrity world, according to Imagetwin's developers. Bik has been using Imagetwin regularly to supplement her own skills and calls it her "standard tool," although she emphasizes that the AI has weaknesses as well as strengths -- for instance, it can miss duplications in images with low contrast.
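Imagetwin's matching method is proprietary, so the following is only a generic illustration of how large-scale duplicate lookup can work in principle: a perceptual "difference hash" turns each image into a compact bit fingerprint, and near-duplicates (even after resizing or recompression) produce hashes that differ in only a few bits. A minimal Python sketch, assuming Pillow is installed; the file names are hypothetical.

from PIL import Image

def dhash(path, hash_size=8):
    # Difference hash: shrink to (hash_size+1) x hash_size, grayscale,
    # then record whether each pixel is brighter than its right neighbor.
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size))
    pixels = list(img.getdata())
    bits = []
    for row in range(hash_size):
        for col in range(hash_size):
            left = pixels[row * (hash_size + 1) + col]
            right = pixels[row * (hash_size + 1) + col + 1]
            bits.append(1 if left > right else 0)
    return bits

def hamming(a, b):
    # Number of differing bits between two fingerprints.
    return sum(x != y for x, y in zip(a, b))

# Flag two figure panels as possible duplicates if their hashes are close.
# "panel_a.png" and "panel_b.png" are hypothetical file names.
if hamming(dhash("panel_a.png"), dhash("panel_b.png")) <= 5:
    print("possible duplicate")

A real system would index millions of such fingerprints for fast lookup, and every hit would still need human review before anyone alleges misconduct.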
Simple formula for AI beats human: (Score:2)
Increase the volume until the human has to reduce the quality of his work to keep up. Indeed, this is the basis of the expected economic importance of AI: scalability. AI is a way of getting cheap mediocrity. Mediocrity is by definition "not bad," so that's an economic win.
But in this case, it's not an impressive "win" for AI. Does it really surprise anyone that software can scan a huge body of data and be better at picking up duplications than a human would be?
Re: (Score:2)
It can be both, but in this case the nature of the algorithm doesn't seem that important.
Re: (Score:2)
Does it really surprise anyone that software can scan a huge body of data and be better at picking up duplications than a human would be?
Well, image duplication is only one of many types of fraudulent image use in scientific papers; it would be nice if the AI could also find other types of manipulation. But, yes, I'd think that finding duplications is work that a computer would be particularly well suited for, and if a computer can do that work, well, it should.
And re-use of an image isn't in itself fraud. It's using the same image but labeling it as different things that's fraudulent. If it's "diagram of the laser probe station used for tr
Nothing Intelligent about it (Score:2)
This is not AI, not even close. This is what computers were designed to do. Wake me when it finds flaws in a research paper's conclusions and suggests a better one.
I wonder.... (Score:2)
What would happen if you told an AI to create photographs and images that even an AI couldn't detect as fake?
I suspect we'd really be doomed as we'd never know if it's truth or fiction.
Sorta like when they had the ship's computer create an opponent that could outsmart Data on ST:TNG.
Oh FFS this is not AI. (Score:2)
Re: (Score:1)
Agree. But with classical algorithms you would need to register the images to account for magnification and cropping. Probably also other things, like color vs. grayscale.
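For illustration only, a minimal Python sketch (assuming OpenCV is installed; the file names are hypothetical) of the kind of local-feature matching that tolerates magnification and cropping without explicit registration: ORB keypoints are matched between two panels, and a high number of strong matches suggests they share content.

import cv2

# Load two figure panels as grayscale; the file names are hypothetical.
img1 = cv2.imread("blot_fig2.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("blot_fig5.png", cv2.IMREAD_GRAYSCALE)

# ORB keypoints and binary descriptors are invariant to modest scaling,
# rotation, and cropping, so no prior image registration is needed.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors with Hamming distance and apply Lowe's ratio test
# to keep only distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

print(f"{len(good)} strong matches; a high count suggests the panels share content")

This only flags candidates; deciding whether a reuse is mislabeled (and therefore fraudulent) still takes a human looking at the figures and the captions.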
Anyway, detecting academic fraud is really cool. Please see the work of the researchers who took down Harvard Business School professor Francesca Gino. Read post 109. You'll never look at Excel the same way.
https://datacolada.org/ [datacolada.org]