News for nerds, stuff that matters

Automatic Image Tagging

Posted by CowboyNeal on Thu Nov 02, 2006 08:07 PM
from the on-the-horizon dept.
bignickel writes "Researchers at Penn State have applied for a patent on software that automatically recognizes objects in photos and tags them accordingly. The 'Automatic Linguistic Indexing of Pictures Real-Time' software (catchy name) trained a database using tens of thousands of images, and new images have 15 tags suggested based on comparisons with objects or concepts in the database. Not sure how you identify a 'concept,' and they're only talking about having one correct tag in the top 15, but still cool."
This discussion has been archived. No new comments can be posted.
  • Not shockingly... (Score:5, Funny)

    by goldmeer (65554) on Thursday November 02 2006, @08:11PM (#16698175)
    The vast majority of the images on the internets, including The Google, include "Pornography" in their top 15 suggested tags. The accuracy rate is surprisingly high.
  • That Sucks (Score:2, Funny)

    by JerkyBoy (455854) * on Thursday November 02 2006, @08:15PM (#16698221)
    (http://www.behti.com/ | Last Journal: Monday July 25 2005, @03:30AM)
    Researchers at a publicly funded institution are using their research results for personal (financial) gain. Pennsylvania's tax dollars at work? How is this legal?
    • Re:That Sucks by Nybarius (Score:2) Thursday November 02 2006, @08:24PM
      • Re:That Sucks by KillerDeathRobot (Score:2) Friday November 03 2006, @02:31AM
    • Re:That Sucks by Manchot (Score:2) Thursday November 02 2006, @09:00PM
    • Re:That Sucks by megaditto (Score:2) Thursday November 02 2006, @09:11PM
      • Re:That Sucks by sprocketbox (Score:1) Thursday November 02 2006, @09:42PM
        • Re:That Sucks by megaditto (Score:1) Thursday November 02 2006, @10:14PM
    • Re:That Sucks by pantalanaga11 (Score:1) Friday November 03 2006, @09:13AM
    • Re:That Sucks by Gemini_25_RB (Score:1) Thursday November 02 2006, @08:58PM
    • Re:That Sucks by tomstdenis (Score:2) Thursday November 02 2006, @10:04PM
      • Re:That Sucks by grimsweep (Score:2) Friday November 03 2006, @01:38AM
        • Re:That Sucks by tomstdenis (Score:2) Friday November 03 2006, @07:03AM
  • The other 50% is the problem (Score:3, Informative)

    by Heir Of The Mess (939658) on Thursday November 02 2006, @08:21PM (#16698273)
    (http://johnstewien.spaces.live.com/)

    I've seen lots of systems like this. The problem is in the 50% of the images that don't work, so basically you have to manually tag 50% of your images.

    I saw an interesting one about 10 years ago. It took an X-ray image, did an edge detection, converted all the edges to a slope vs distance 2D plot, and converted edge curves to a radius and distance plot, then used a kind of statistical correlation algorithm to pick which part of the body the image was from. I could imagine that you could apply something similar to the luminance of an image to pick out objects, and then maybe do some color transforms and stuff to improve results. The article says they do it in 1.4 seconds per image though, which is impressive.
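
The pipeline described above (edge detection, then a compact description of the edges, then a statistical correlation) can be sketched in a much simplified form. This is not the X-ray system from the comment: the orientation histogram, the `threshold` value, and the toy 5x5 images are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale image as rows of luminance values


def edge_orientation_histogram(image: Image, bins: int = 4,
                               threshold: float = 0.5) -> List[float]:
    """Normalized histogram of gradient directions at strong-edge pixels."""
    counts = [0] * bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the luminance gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            if math.hypot(gx, gy) < threshold:
                continue  # not an edge pixel
            # Fold the direction into [0, pi) so opposite gradients match.
            angle = math.atan2(gy, gx) % math.pi
            counts[min(int(angle / math.pi * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Made-up test images: two with a vertical edge, one with a horizontal edge.
vertical_a = [[0, 0, 1, 1, 1] for _ in range(5)]
vertical_b = [[0, 0, 0, 1, 1] for _ in range(5)]
horizontal = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]

sig_a = edge_orientation_histogram(vertical_a)
sig_b = edge_orientation_histogram(vertical_b)
sig_h = edge_orientation_histogram(horizontal)

print(pearson(sig_a, sig_b))  # similar edge structure
print(pearson(sig_a, sig_h))  # different edge structure
```

A real classifier would compare a new image's signature against stored signatures for each class and pick the best-correlated one.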

  • Prior art? (Score:2)

    by Sirch (82595) on Thursday November 02 2006, @08:23PM (#16698285)
    (http://www.pureinnovation.com/ | Last Journal: Friday September 19 2003, @04:30PM)
    Not RTFA to be honest, but can I claim prior art?

    http://www.relle.co.uk/papers/2003-Content_Based_Image_Retrieval__A_Words_and_Pictures_Approach.pdf [relle.co.uk]

    We didn't have enough time to train the system properly, but it started off well...

    • Re:Prior art? by nietpiet (Score:1) Friday November 03 2006, @07:14AM
    • Re:Prior art? by SausageOfDoom (Score:1) Friday November 03 2006, @03:18AM
  • LIPS (Score:1)

    by ch-chuck (9622) on Thursday November 02 2006, @08:26PM (#16698323)
    (http://slashdot.org/)
    You've got to hand it to those cunning linguists at Penn State.

    • Re:LIPS by mattwarden (Score:2) Thursday November 02 2006, @09:10PM
      • Re:LIPS by Original Replica (Score:3) Thursday November 02 2006, @09:31PM
    • Re:LIPS by shashark (Score:3) Friday November 03 2006, @02:08AM
  • by EMIce (30092) on Thursday November 02 2006, @08:28PM (#16698339)
    (http://www.golden-dumpling.org/)
    I'm sure a lot of research is being done in this area; in fact, there is lots of interest in implementing this sort of thing in DSP for robot vision. How much of what this patent covers overlaps with what the others are working on? Is this something completely out of left field, or does it fit the trend of where this area of research was headed anyway?
  • A video on the subject (Score:2, Informative)

    by damgx (132688) on Thursday November 02 2006, @08:29PM (#16698347)
    Luis von Ahn did something almost the same; his idea, though, is to use humans as well.

    View the video on Human Computation [google.com]
  • retrievr (Score:1)

    by iceph03nix (1005545) on Thursday November 02 2006, @08:30PM (#16698353)
    For those who missed retrievr... http://labs.systemone.at/retrievr/ [systemone.at] While this is good for hours of entertainment, I hope what they're promoting is better.
    • Re:retrievr by ExFCER (Score:1) Thursday November 02 2006, @09:17PM
  • by indigest (974861) on Thursday November 02 2006, @08:31PM (#16698357)
    FTFA:
    The analysis takes about 1.4 seconds per image and in 98 per cent of tests suggests at least one correct tag in the top 15.
    I suspect you could generate a list of 15 sufficiently vague words that would cover 98% of all images. Here's a start: people, sport, animal, trees...
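
The metric quoted from the article ("at least one correct tag in the top 15") is easy to compute, and the suspicion above can be tested against it. The sketch below is illustrative only: `top_k_hit_rate` is a hypothetical helper, and the ground-truth tags are made-up data, not anything from the ALIPR evaluation.

```python
from typing import Iterable, Sequence, Set


def top_k_hit_rate(suggestions: Iterable[Sequence[str]],
                   truths: Iterable[Set[str]], k: int = 15) -> float:
    """Fraction of images where at least one of the first k suggested tags is correct."""
    hits = 0
    total = 0
    for suggested, truth in zip(suggestions, truths):
        total += 1
        if any(tag in truth for tag in suggested[:k]):
            hits += 1
    return hits / total if total else 0.0


# Hypothetical ground-truth tags for four images (invented for this example).
ground_truth = [
    {"people", "beach"},
    {"dog", "animal"},
    {"circuit board"},
    {"trees", "landscape"},
]

# A "tagger" that never looks at the image and always emits the same vague words.
VAGUE_TAGS = ["people", "sport", "animal", "trees"]
constant_suggestions = [VAGUE_TAGS for _ in ground_truth]

rate = top_k_hit_rate(constant_suggestions, ground_truth)
print(f"constant baseline hit rate: {rate:.2f}")
```

Any reported top-15 figure is only meaningful next to the score a constant baseline like this gets on the same image set.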
  • by packetmon (977047) on Thursday November 02 2006, @08:32PM (#16698365)
    (http://www.infiltrated.net/)
    at least that's what I told the psychologist. Then on the second look, it looked like splotched ink on a paper that was then folded in half... I hope this software doesn't think like me cause at the end of it all, I saw a segfaulted X server on fvwm
  • API (Score:1)

    by dampjam (779525) on Thursday November 02 2006, @08:37PM (#16698423)
    I currently work for the group doing this - a very cool new feature will be launched in the next week that I am writing (stay tuned). Yes - this project has been done many times before by many people (to lesser degrees of success than this), but the thing to keep in mind is that this is realtime. It takes less than a second for the tags to be generated. All previous systems required a much larger amount of processing time. Check out www.alipr.com to try it yourself!
    • Re:API by OzPhIsH (Score:2) Friday November 03 2006, @12:42AM
      • Re:API by dampjam (Score:1) Friday November 03 2006, @03:38AM
  • by Gothmolly (148874) on Thursday November 02 2006, @08:37PM (#16698429)
    How do they get less than a 50% average that you'd get by just guessing?

    (yes, assuming a normal distribution of 'concepts' in the pictures, etc)
  • w00t!!! (Score:3, Funny)

    by rts008 (812749) <rts008@ h o t mail.com> on Thursday November 02 2006, @08:38PM (#16698433)
    (http://www.redorbit.com/ | Last Journal: Sunday October 07, @03:44AM)
    Now almost 7% of my pr0n will get tagged correctly!
    That's cool, the rest of it will be like opening xmas presents!

    *file: 123456.jpeg>open>Aghh! Goatse!*

    Hmmm...This may be neat when it gets a LITTLE more accurate, but a cool start none the less.
    Kudos to the gang for getting a grip on a hard problem...erm..nevermind.
    • Re:w00t!!! by CCFreak2K (Score:2) Thursday November 02 2006, @08:56PM
      • Re:w00t!!! by rts008 (Score:2) Thursday November 02 2006, @09:29PM
        • Re:w00t!!! by nsillik (Score:1) Thursday November 02 2006, @10:57PM
          • Re:w00t!!! by rts008 (Score:2) Thursday November 02 2006, @11:19PM
            • Re:w00t!!! by nsillik (Score:1) Friday November 03 2006, @12:18AM
  • In future news... (Score:1)

    by Parallax Blue (836836) on Thursday November 02 2006, @08:41PM (#16698455)
    Image recognition software is making it even easier for your kids to find porn! More at 6...

  • Not sure how far they got, but remember reading that IBM was working on this and had some reasonable success at object recognition in images. I'd love to be able to classify the 10k digital images I've got around. Especially if it can recognize individuals (not that it would know their names initially, but would be trainable).
  • Reportedly (Score:3, Funny)

    by stunt_penguin (906223) on Thursday November 02 2006, @08:59PM (#16698621)
    Reportedly the researchers showed the system a picture of a Death Star, and it correctly tagged the image with 'thatsnomoon'.

    The system has clearly been let crawl the web for far too long.
  • by Duggeek (1015705) on Thursday November 02 2006, @09:25PM (#16698789)
    (http://dehweb.home.comcast.net/ | Last Journal: Wednesday December 06 2006, @12:37PM)

    Unless Jupiter Media [jupitermedia.com] gets to it first.

    Someone like myself would understand the hours of data-entry and database development that goes into indexing imagery. I research photo copyrights for a living.

    The fact that there is a feasible, automated system that can do the work will certainly cut down the man-hours for that sort of work; at least by half.

    Pity, though. I heard that Google and others had a telecommuting thing that paid people to recognize what's in a photo. Sorry to hear they'll be out of a job soon.

  • Workarounds....... (Score:1)

    by Anachragnome (1008495) on Thursday November 02 2006, @09:28PM (#16698813)
    .........SexSurfer logs in to begin his daily search of the web to find more images to rip in an effort to increase his database of porn images, utilizing this technology, only to find that most of the images consist of naked women with political statements printed on their asses......

    Seriously now, I am sure there are people out there that have already got ideas rolling around in their heads about how they can use this technology to hijack images to their advantage. Once somebody understands how the technology works it is only a matter of time before it is used for nefarious purposes, by means of "tricking" the technology. And in the process, invalidating any possible means by which the developers can realize a return on their investments.

    Personally, I'd love to use such a technology(if it actually works) to sift through the plethora of "crap" images I have to search through on the web. It can be really frustrating to do a search only to find that a vast amount of the results are TOTALLY out of context simply because of the title tag attached.
  • by theeddie55 (982783) on Thursday November 02 2006, @09:39PM (#16698891)
    well that really had to happen, i've just tried it and you can't really go wrong if one of the top 15 tags is 'photo' and another of the tags is 'thing'.
  • Now all they need to do is come up with a way to recognize spam words in image text without the overhead of OCR and they can make a fortune on that alone.
  • Uni assignment (Score:1)

    by wayneo13 (950853) on Thursday November 02 2006, @09:53PM (#16698969)
    This reminds me of a uni assignment that i did where we matched images based on colour.
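
Matching images on colour is commonly done by comparing colour histograms. The sketch below is one standard way to do it and is not the poster's assignment: the bin count, the histogram-intersection score, and the toy pixel lists are all assumptions.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def colour_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins**3 buckets."""
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        # value * bins // 256 maps 0..255 onto bucket indices 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        counts[ri * bins * bins + gi * bins + bi] += 1
    total = len(pixels)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical colour distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Made-up "images" as flat pixel lists.
sky = [(30, 120, 220)] * 3 + [(250, 250, 250)]   # blue with a white cloud
sea = [(20, 100, 200)] * 4                       # blue
sunset = [(230, 90, 20)] * 4                     # orange

sky_h, sea_h, sunset_h = map(colour_histogram, (sky, sea, sunset))
print(histogram_intersection(sky_h, sea_h))
print(histogram_intersection(sky_h, sunset_h))
```

The score ignores where colours appear in the frame, which is why colour matching alone confuses, say, a blue car with the sky.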
  • I'm not... (Score:2)

    by Morphine007 (207082) on Thursday November 02 2006, @10:03PM (#16699045)

    ... usually a pedant... but you don't train a database. It was likely a neural net, but TFA is rather thin on details. Anyone got a link to their paper?

  • by Anonymous Coward on Thursday November 02 2006, @11:22PM (#16699497)
    Makes it easier to process all that data generated by all those security cams.

    Is there a "Big Brother" category on Slashdot, yet?
  • Wrong approach? (Score:2)

    by havardi (122062) on Friday November 03 2006, @12:08AM (#16699749)
    This would be cool and all, but why not focus more on letting humans do the hard work-- like if I could take a picture of a tree and then press a button and say aloud; "redwood tree", and have that tag the file.
  • by presidenteloco (659168) on Friday November 03 2006, @12:14AM (#16699789)
    When I was studying textbooks on how to do this in undergrad comp sci.
  • So they say... (Score:2)

    by Ninwa (583633) * <jbleau@gmail.com> on Friday November 03 2006, @12:18AM (#16699803)
    (http://www.ninwa.net/ | Last Journal: Thursday July 27 2006, @06:55PM)
    That's all fine and great that they can tell us, but why the heck couldn't they make a web-interface for it so I could try it out?
  • Bullshit Patents (Score:2)

    by OzPhIsH (560038) on Friday November 03 2006, @12:19AM (#16699811)
    (Last Journal: Saturday November 30 2002, @01:53AM)
    This is just your standard data mining classification system, simply applied to image data as input, with the tags being possible classifications. This is an obvious application to ANYONE in the field. Software patents suck.
  • confucius says... (Score:1)

    by recharged95 (782975) on Friday November 03 2006, @01:13AM (#16700093)
    (Last Journal: Friday September 17 2004, @04:10PM)
    Considering the adage that "a picture is worth a thousand words", they're going to have a lot more words to index--where the words may not follow a specific taxonomy.

    And that's one of the problems: does an image define the taxonomy, or does the taxonomy define the image [type]?

  • I quickly run out of imagination when it comes to assigning tags to my pictures: an automated mass tag finder will save hours of my precious time while uploading photos to Flickr.
  • Neural Nets (Score:2, Insightful)

    by gekoscan (1001678) on Friday November 03 2006, @02:25AM (#16700351)
    How can you take a neural network and train it, then patent that?
    That's like patenting training a dog to fetch a stick; it's completely ridiculous.

    You take software capable of generalizing a neural network algorithm by feeding it pictures and associating each picture with certain tags. It then creates a generalized algorithm model based on what you fed it initially. So that when you give new input it is capable of outputting tags most similar to what you initially trained it.

    So yes, this software can recognize boxes, shapes, other objects, maybe scenes etc and associate them with tags... but ask them how the algorithm works under the hood =) They have no idea... a neural network is like a black box after it has been trained. You feed it input and it gives you output based on its initial training. The inner workings are chaotic spaghetti values set on each neuron weighting and can't be deciphered.

    How can you patent software that is a black box inside?

    "Yes hello patent office? I have this box that manufactures microprocessors. I feed it all the materials and it outputs a shiny new processor. I am not sure of the manufacturing process internally but the output works great. I would like to patent this manufacturing process."

    "Okay, your patent number is 247286-'BLACK BOX'-9."

    The whole point of a neural network is that it generalizes from what you train it on and can then predict an output for any future input based on that.

    It's like having invented the first mirror, and every time someone put something different in front of it, that person called up the art gallery because they had a new painting that they wanted in their name (because depending on what was in front of it you get a different reflection).
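
The train-then-predict loop described in this comment can be shown with the smallest possible network, a single perceptron. Nothing in this thread establishes that the Penn State system is a neural network, so this is only a sketch of the commenter's mental model: the two features (fraction of blue and of green pixels), the labels, and the training data are all invented.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label 0 or 1)


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> int:
    """Return 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples: Sequence[Sample], epochs: int = 20,
                     lr: float = 1.0) -> Tuple[List[float], float]:
    """Classic perceptron rule: nudge the weights whenever a sample is misclassified."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = label - predict(weights, bias, features)
            if error:
                weights = [w + lr * error * x
                           for w, x in zip(weights, features)]
                bias += lr * error
    return weights, bias


# Hypothetical training set: label 1 = "sky", label 0 = "grass".
training = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.2, 0.8], 0),
    ([0.1, 0.9], 0),
]

weights, bias = train_perceptron(training)
print(weights, bias)                      # just numbers, no readable "rules"
print(predict(weights, bias, [0.7, 0.3]))  # mostly blue, never seen in training
print(predict(weights, bias, [0.3, 0.7]))  # mostly green, never seen in training
```

With two weights the result is still easy to interpret. The "black box" complaint applies once there are thousands of weights spread over several layers.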
  • Publications (Score:1)

    by $pearhead (1021201) on Friday November 03 2006, @03:00AM (#16700473)
    Instead of speculating, why not just read all about the algorithms?

    Main publications:
    http://infolab.stanford.edu/~wangz/project/imsearch/ALIP/ACMMM06/ [stanford.edu]
    http://www-db.stanford.edu/~wangz/project/imsearch/ALIP/PAMI03/ [stanford.edu]
    http://www-db.stanford.edu/~wangz/project/imsearch/SIMPLIcity/TPAMI/ [stanford.edu]
  • unimpressing (Score:2)

    by Eivind (15695) <eivindorama@gmail.com> on Friday November 03 2006, @03:46AM (#16700637)
    (http://ekj.vestdata.no/)
    If it worked, it'd be very useful. However, getting the top 1 tag correct 50% of the time (which is the only quantifiable claim in the article) is pretty straightforward. For most people's photo albums that can be done by the following AI program: "print 'people'"

    There are a few subjects that are so common that it's more or less a given they'll be in a large fraction of the photos. Outputting "people, buildings, nature, animals, plants, city" would probably give at least 1-2 "correct" tags for 90% of what's in people's photo albums.

    I had a class on neural networks and their (weak) sort of "ai"; one task was to build a program to separate male from female names. The best programs could manage 80% or so, which is sorta decent. Until you realize that checking against static lists of the top 100 male/female names, guessing female if a name is not in the lists but ends in 'a', and otherwise guessing randomly will get you approximately 95%. Furthermore, the latter program runs an order of magnitude faster, is more easily debuggable, can be understood by anyone, and can trivially be "extended" to reach 99% or more, simply by extending the lists of known male/female names.
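
The list-plus-rule baseline from the last paragraph fits in a few lines. This is a minimal sketch under stated assumptions: the two name sets are tiny hypothetical stand-ins for the "top 100" lists, and `guess_gender` is an illustrative name, not code from the class.

```python
import random
from typing import Optional

# Hypothetical, deliberately tiny stand-ins for the "top 100" name lists.
KNOWN_MALE = {"james", "john", "robert", "michael", "william"}
KNOWN_FEMALE = {"mary", "patricia", "jennifer", "linda", "elizabeth"}


def guess_gender(name: str, rng: Optional[random.Random] = None) -> str:
    """Lookup first, then the ends-in-'a' rule, then a coin flip."""
    key = name.strip().lower()
    if key in KNOWN_MALE:
        return "male"
    if key in KNOWN_FEMALE:
        return "female"
    if key.endswith("a"):
        return "female"
    chooser = rng if rng is not None else random
    return chooser.choice(["male", "female"])


for candidate in ["John", "Mary", "Olivia", "Taylor"]:
    print(candidate, guess_gender(candidate, random.Random(0)))
```

Every wrong answer can be traced to a specific rule and fixed by editing a list: a name such as "Joshua" trips the ends-in-'a' rule until it is added to `KNOWN_MALE`.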

  • link? (Score:2)

    by Tom (822) on Friday November 03 2006, @04:16AM (#16700765)
    (http://web.lemuria.org/)
    So where's the download link? How can software matter if I can't get it? ;-)
  • Pictionary! (Score:1)

    by f0rtytw0 (446153) on Friday November 03 2006, @08:01AM (#16701613)
    (Last Journal: Monday December 12 2005, @01:08PM)
    I now have someone to play pictionary with.
  • Big deal (Score:2)

    by Oligonicella (659917) on Friday November 03 2006, @08:22AM (#16701741)
    So what were they doing, throwing a dart at a damn board? That success rate is no better than randomly applying vague words.
     
    Move along to real research.
  • by Corporate Gadfly (227676) on Friday November 03 2006, @09:11AM (#16702165)
    I wonder how this compares with Riya [riya.com]. At some point, there were plenty of rumours [google.com] of a possible Google purchase of Riya. Then again they were rumours.

    I haven't RTFA and I don't have any experience with Riya either, so consider the above posting a waste of time (if you must).
  • Very important... (Score:1)

    by Not-a-Neg (743469) on Friday November 03 2006, @01:42PM (#16706509)
    Does it detect breast size?
  • Wouldn't it be easier to pay starving children in [???] $0.01 / hour to tag images?
  • Re:Tag for /. (Score:2)

    by Amouth (879122) on Thursday November 02 2006, @08:59PM (#16698625)
    "itsatrap" is the best... but use sparingly
  • Re:balls (Score:1)

    by ExFCER (1001188) on Thursday November 02 2006, @09:22PM (#16698781)
    Please see my other post in this thread. THX.
  • by James McGuigan (852772) on Friday November 03 2006, @05:20AM (#16700977)
    I think I could categorise most things using less than 15 (admittedly very broad) tags. Animal, person, plant, machine, sports, vehicle, furniture, book, etc.
    So which of these categories does a mushroom fit into?