Researchers Teach Computers To Perceive 3D from 2D

Posted by ScuttleMonkey on Wednesday June 14, 2006 @03:16PM from the your-battlebot-wants-an-upgrade dept.

hamilton76 writes to tell us that researchers at Carnegie Mellon have found a way to allow computers to extrapolate 3 dimensional models from 2 dimensional pictures. From the article: "Using machine learning techniques, Robotics Institute researchers Alexei Efros and Martial Hebert, along with graduate student Derek Hoiem, have taught computers how to spot the visual cues that differentiate between vertical surfaces and horizontal surfaces in photographs of outdoor scenes. They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image. [...] Identifying vertical and horizontal surfaces and the orientation of those surfaces provides much of the information necessary for understanding the geometric context of an entire scene. Only about three percent of surfaces in a typical photo are at an angle, they have found."

This discussion has been archived. No new comments can be posted.


Awesome! (Score:5, Funny)

by rblum ( 211213 ) writes: on Wednesday June 14, 2006 @03:19PM (#15534413)

Now run it on an Escher picture!

- Re:Awesome! (Score:2)
  
  by Tackhead ( 54550 ) writes:
  
  > Now run it on an Escher picture!
  "Bite the fish-eye lens facing my Fembot's shiny metal boobs!"
  - Re:Awesome! (Score:2)
    
    by vandon ( 233276 ) writes:
    
    FTFA: Using 300 images gleaned from a Google search....
    
    I would like to see the results from the Google images with "safe search" turned off.
- Re:Awesome! (Score:2)
  
  by bill_mcgonigle ( 4333 ) * writes:
  
  I had to reply to your comment since I was going to use the same subject.
  
  For me, it's adding another item to the "things they said were impossible in CS class but are now available". The stuff Salient Stills is selling is another idea I had in school for a project - fortunately the grad students were able to show me how that was mathematically impossible too. :)
  
  "Never say never", boys and girls. I'll get back in line for my FTL transporter then.
- Re:Awesome! (Score:2)
  
  by geobeck ( 924637 ) writes:
  
  Now run it on an Escher picture!
  
  +++Out of cheese error+++
  +++Please reboot universe+++
  +++Redo from start+++
  
  /TP's DW reference
  - Re:Awesome! (Score:2)
    
    by cp.tar ( 871488 ) writes:
    
    +++ Melon Melon Melon +++
- Re:Awesome! (Score:2)
  
  by Ant P. ( 974313 ) writes:
  
  You'll have to wait for the 4D version for that
- Escher in 3D (Score:1)
  
  by Jboost ( 960475 ) writes:
  
  I think you'll find this interesting: http://www.cs.technion.ac.il/~gershon/EscherForReal/ [technion.ac.il]
leaning tower (Score:3, Interesting)

by ZivZoolander ( 964472 ) writes: on Wednesday June 14, 2006 @03:21PM (#15534434)

Wonder how this will handle those optical illusion photos, like me knocking over the Leaning Tower of Pisa or holding the Statue of Liberty.

- Re:leaning tower (Score:2, Funny)
  
  by Tolleman ( 606762 ) writes:
  
  Just like us. Segmentation fault.
- Re:leaning tower (Score:1)
  
  by deathstar778 ( 743617 ) writes:
  
  I live in Pisa actually, and I can't stand seeing people trying to push the tower anymore!!!
  AAAAAAAAAARGH! Try to imagine 300 folks making the same photo at the same time.
  They look sooooo dumb!
  - Re:leaning tower (Score:2)
    
    by Patrik_AKA_RedX ( 624423 ) writes:
    
    You should have thought about that before building that tower.
Directly applicable to the car racing AI grand.... (Score:4, Interesting)

by ChrisGilliard ( 913445 ) writes: <christopher.gilliard@NosPam.gmail.com> on Wednesday June 14, 2006 @03:22PM (#15534443) Homepage

...challenge. I think Carnegie Mellon wants revenge against Stanford for beating them in the 2005 DARPA Grand Challenge. Maybe 2007 will be Carnegie Mellon's year to win the grand challenge. If this happens, we're only a hop skip and a jump to having these things drive us around (esp on freeways).

- Bus-ted. (Score:1, Funny)
  
  by Anonymous Coward writes:
  
  "If this happens, we're only a hop skip and a jump to having these things drive us around (esp on freeways)."
  
  Man that would be a pretty neat invention [basintransit.com].
- Re:Directly applicable to the car racing AI grand. (Score:1)
  
  by LiquidCoooled ( 634315 ) writes:
  
  Granted you can extrapolate an estimate of the surroundings for a 3d scene from a single image.
  This is good when the source material doesn't exist.
  
  However if I were in the grand challenge I wouldn't be swapping the (minimum) stereo imaging most cars appear to have.
  
  1) it's an approximation and may not be applicable for different terrain or obstacles (similar rock against similar floor)
  2) it's harder to fool 2 cameras than a single one; glitches could send you off the cliff.
  3) with a stereo pair you can interpo
  - Re:Directly applicable to the car racing AI grand. (Score:2)
    
    by Directrix1 ( 157787 ) writes:
    
    Well, that and we have a gigantic corpus of training data to extrapolate from.
  - Re:Directly applicable to the car racing AI grand. (Score:2)
    
    by zippthorne ( 748122 ) writes:
    
    glitches can't send you over a cliff. maps & GPS (or inertial) keeps you off the cliffs. glitches could send you into a ditch or onto a bush, either of which would be difficult to extract from.
Imagine the Possibilities (Score:2, Interesting)

by Valthan ( 977851 ) writes:

One could conceivably take pictures of a city, upload them to this program, stitch the pieces together, and then import the result into a game world. How awesome would it be to actually be able to run around a city (say, Toronto) and do things you always wanted to do... (dropping a penny off of the CN Tower and having it hit someone :D)
- Re:Imagine the Possibilities (Score:2)
  
  by -kertrats- ( 718219 ) writes:
  
  The Getaway [gamerankings.com] already has a startlingly accurate virtual London.
- Errr... (Score:5, Informative)
  
  by Ayanami Rei ( 621112 ) * writes: <rayanami&gmail,com> on Wednesday June 14, 2006 @03:26PM (#15534476) Journal
  
  you've always been able to do that.
  Cities aren't the kind of thing this is targeted at.
  You can get building plans and architectural drawings and everything from the city for free. There are algorithms that can easily map pictures to objects if you know ahead of time the shape of the things that "should" be there.
  
  This stuff is for deciding the shape of unknown things, and more importantly, to gain new heuristics for image searches.
  
  With this technology, you could ask for "things that are round, and have a box".
  
  More importantly, you could show the computer one picture of something, and have it attempt to find more pictures of it (from different angles, with different colors, etc.). Like you show it a Volvo C90, and it shows you any and all pictures of Volvo C90s by the shape.
  
  - Re:Errr... (Score:1)
    
    by Trigun ( 685027 ) writes:
    
    How about building a 3D representation of a terrorism suspect?
    
    There's your grant money right there, boys!
  - Re:Errr... (Score:2)
    
    by Kesch ( 943326 ) writes:
    
    With this technology, you could ask for "things that are round, and have a box"
    
    Really...
    
    hmm...
    
    I was thinking "things that are round, and have a nipple"
  - Not for objects at all (Score:3, Insightful)
    
    by moultano ( 714440 ) writes:
    
    This is only for outdoor scenes and only extracts planar information. It isn't designed for objects at all. It provides general geometric context, ie this area is ground, this area is a left facing wall, etc. That's not to say that a similar technique couldn't be used for identifying round objects, but that isn't what this is for.
  - Re:Errr... (Score:3, Funny)
    
    by jackbird ( 721605 ) writes:
    
    You can get building plans and architectural drawings and everything from the city for free. There are algorithms that can easily map pictures to objects if you know ahead of time the shape of the things that "should" be there.
    Dear Sir,
    ha ha ha.
    ha ha ha ha ha ha ha.
    ha.
    If only.
    Signed,
    every CAD operator in the world
    - Well... (Score:2)
      
      by Ayanami Rei ( 621112 ) * writes:
      
      This all pre-supposes you can translate the diagram accurately and position it in the 3d world. You'd probably need GPS readings at different points on the building, and on the camera to get decent results.
      
      And you need a light model and surface texture models (or a lot of pictures from different angles).
      
      So this isn't trivial. But it's doable. Such techniques are used in film for scene composition and for texturing 3d representations of real-world objects.
      
      It's not like you can just take a picture of a buildi
      - Re:Well... (Score:3, Interesting)
        
        by jackbird ( 721605 ) writes:
        
        I've used Photomodeler and Canoma, and made camera mapped environments in 3D software by hand for years. It is incredibly nontrivial. It is a lot of blood, sweat, tears, handpainting, and a not-so-terribly good result. Some typical problems:
        
        Camera barrel distortion
        chromatic aberrations
        hot colors in high-contrast areas of digital photos
        JPEG compression artifacts
        specular highlights and reflections
        lens flares and blooms from those specular highlights and reflections
        clipped/out of gamut areas
        occluding objec
Typical photos? (Score:3, Interesting)

by doti ( 966971 ) writes: on Wednesday June 14, 2006 @03:24PM (#15534456) Homepage

Only about three percent of surfaces in a typical photo are at an angle

What typical photos are those? No faces, people, trees or any organic thing?
No cars? No roofs?

- Re:Typical photos? (Score:2)
  
  by MrSquirrel ( 976630 ) writes:
  
  Obviously not myspace photos. Those are about 50% angle. Also, if a computer did read them it would have to kill a bunch of scene-agers (scenester + teenager) for being idiots.
- I worked with them briefly (Score:4, Informative)
  
  by moultano ( 714440 ) writes: on Wednesday June 14, 2006 @03:42PM (#15534595)
  
  The complexity of the models that the program is able to extract is similar to what you would see in a game like doom. All "floors" are perfectly horizontal, all "walls" are perfectly vertical, and most objects (people, trees, cars) become small vertical walls. This doesn't attempt to capture surface geometry at all; it approximates things with large planes. What they are saying is that most things you see in pictures are very well approximated by these simple primitives, such that when they create a scene using them it provides convincing parallax as you move around it. It's a really neat effect.
  
  - - Re:I worked with them briefly (Score:2)
      
      by moultano ( 714440 ) writes:
      
      Not to degrade the work but researchers have been able to do this kind of stuff for a while now.
      Do you have a link handy?
    - Re:I worked with them briefly (Score:1, Insightful)
      
      by Anonymous Coward writes:
      
      > my visual system being able to interpret a texture on a couple of planes as something more complex
      
      Several pieces of work have exploited that effect in recent years, most notably Billboard Clouds [www-imagis.imag.fr] at ACM SIGGRAPH 2003.
      
      > researchers have been able to do this kind of stuff for a while now
      
      Then you must know something no graphics researchers in the world do, since Derek's work was presented as new research in ACM SIGGRAPH 2005. (ACM SIGGRAPH [siggraph.org] is by far the top graphics conference in the world; if they tho
- Re:Typical photos? (Score:1)
  
  by mapkinase ( 958129 ) writes:
  
  Yes, pretty much post-neutron bomb pictures only, please.
- Re:Typical photos? (Score:1)
  
  by c.gerritsen ( 960884 ) writes:
  
  From TFA:
  
  Hoiem found the computer often discerned which surfaces were vertical or horizontal, and whether a vertical surface faced left, right or toward the viewer.
  Faces have a number of vertical and horizontal surfaces, like the sides of your nose, bottom of your chin, cheeks, etc. And cars have plenty of horizontal and vertical sides. And not all roofs are peaked.
  As someone else commented, this will give you very blocky representations, but there is plenty of use to those blocky representations. Fo
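The "floors are horizontal, walls are vertical" model described in this thread is enough to assign depth with very little math. The following is a minimal sketch, not the CMU authors' method: it assumes an idealized pinhole camera looking level at a flat ground plane, with image rows increasing downward. The names `ground_depth`, `billboard_point_height`, `horizon_row`, `focal_px`, and `camera_height` are hypothetical and chosen only for illustration.

```python
def ground_depth(row, horizon_row, focal_px, camera_height):
    """Distance to a ground-plane pixel, assuming a level pinhole camera.

    A ground point at distance z projects (camera_height * focal_px / z)
    rows below the horizon, so the distance follows from the row offset.
    """
    offset = row - horizon_row
    if offset <= 0:
        raise ValueError("ground pixels must lie below the horizon row")
    return focal_px * camera_height / offset


def billboard_point_height(row, contact_row, horizon_row, focal_px, camera_height):
    """Height above ground of a pixel on a vertical "billboard" surface.

    The whole billboard is placed at the depth of the row where it
    touches the ground (contact_row); every pixel on it shares that depth.
    """
    depth = ground_depth(contact_row, horizon_row, focal_px, camera_height)
    return camera_height - (row - horizon_row) * depth / focal_px


if __name__ == "__main__":
    # Assumed numbers: 500 px focal length, camera 1.6 m above the ground,
    # horizon at row 240, wall touching the ground at row 340.
    print(ground_depth(340, 240, 500.0, 1.6))
    print(billboard_point_height(140, 340, 240, 500.0, 1.6))
```

Under these assumptions, classifying each region as "ground" or "vertical" and finding where vertical regions meet the ground is what fixes the geometry; everything else becomes a texture on a plane, which matches the Doom-like results described above.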
Robot vision (Score:5, Insightful)

by amightywind ( 691887 ) writes: on Wednesday June 14, 2006 @03:26PM (#15534470) Journal

They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image

This is so not new [amazon.com]. These researchers may have advanced techniques in some areas, but shape-from-shading inversion problems like this have been worked on successfully since the 1970s and earlier. The theory is well established. Horn's Robot Vision is a classic.

- Nothing like shape from shading approaches (Score:3, Insightful)
  
  by moultano ( 714440 ) writes:
  
  Shape from shading works only on a very narrow set of objects. If you are trying to recover the shape of a marble statue, use shape from shading. If your object has color forget about it.
  
  What you are saying amounts to "People have done research into computer vision in the past, therefore any new research into computer vision is soooo not new."
  - Shape from shading is widely applicable (Score:2)
    
    by amightywind ( 691887 ) writes:
    
    Shape from shading works only on a very narrow set of objects. If you are trying to recover the shape of a marble statue, use shape from shading. If your object has color forget about it.
    
    Not true at all. If you understand the photometric function [psu.edu] of the materials in the scene, variation due to color can be separated from variation due to shading. Image classification techniques are useful for doing this. This is discussed in the book and elsewhere. We used the technique for Voyager II to measure topograp
I can't find this course listed anywhere on... (Score:2)

by exp(pi*sqrt(163)) ( 613870 ) writes:

...the CMU web site. My Commodore 64 would really like to sign up for this.
First application will be... (Score:5, Funny)

by Onimaru ( 773331 ) writes: on Wednesday June 14, 2006 @03:29PM (#15534501)

...pr0n, of course. Now we can accurately predict and model the exact size and specularity of Lindsay Lohan's boobies, using this revolutionary new (wait for it) Mellon Engine. Truly, we live in the future.

- Re:First application will be... (Score:2)
  
  by moultano ( 714440 ) writes:
  
  Well, to the extent that Lindsay Lohan's boobies can be modelled by large flat planes, you are right. :)
  
  Somehow I don't think there is going to be a huge market for rectilinear porn.
  - Re:First application will be... (Score:2)
    
    by LunaticTippy ( 872397 ) writes:
    
    Oddly, rectal-in-her porn is about the 4th most popular category.
- Re:First application will be... (Score:1)
  
  by filou007 ( 911971 ) writes:
  
  Maybe we're not thinking of the same Lindsay Lohan, but the one I know fails to show the desired vertical and horizontal lines.
- Re:First application will be... (Score:2)
  
  by Red Flayer ( 890720 ) writes:
  
  Um, Specularity?
  
  Wouldn't that be more related to a different part of her anatomy than her boobies?
- Re:First application will be... (Score:2)
  
  by StikyPad ( 445176 ) writes:
  
  You obviously didn't look at the 3D samples. When viewed from an angle it became clear that the irregular surface of the building was nothing more than a texture. Additionally, all the angles were wrong, making the object appear to be wildly out of proportion. Oh wait, you said Lindsay Lohan.. not a problem then.
- Re:First application will be... (Score:2)
  
  by jafac ( 1449 ) writes:
  
  Where do I sign up to beta test?
"Enemy of the State" (Score:5, Funny)

by RobTFirefly ( 844560 ) writes: on Wednesday June 14, 2006 @03:31PM (#15534510) Homepage Journal

So we're one step closer to actually being able to do the dramatic image-enhancing stuff that's routine in film and television crime drama? You know, where the brooding detective notices four interesting pixels in the background of a scratchy security video, strokes his chin thoughtfully, and says "enhance this bit" to the stereotype computer geek. The geek types noisily, the computer zooms in on those four pixels, and clears it up into a detailed image of the bad guy, often moving other foreground stuff out of the way to do so.

- Re:"Enemy of the State" (Score:5, Informative)
  
  by Jerf ( 17166 ) writes: on Wednesday June 14, 2006 @03:50PM (#15534659) Journal
  
  It's worth pointing out that a lot of that stuff isn't, strictly speaking, impossible.
  
  What's impossible is to take a single photo out of the stream and "enhance" it to the n-th degree without using the rest of the video.
  
  And no matter how good your technique, you can't generate information, so there will be some limit to your zooming in.
  
  But the idea that if you consider the entire video stream, you can extract a lot more information is not impossible at all, and you'd probably be surprised by both what is in there and what isn't. Seeing "through" something probabilistically is possible if the object being "seen" was in video at some point. On the other hand, "zooming" in to something on the counter that has been there for the entire duration of the video and has never moved is impossible, because while you may have 15,000 pictures of the object, they're all the same pictures.
  
  Normally I don't bring this up when we're having one of our usual bitch-fests about CSI here on Slashdot because by and large the standard bitching is still correct. But as AI advances, some of the stuff that seems impossible now will become very possible.
  
  One early example I remember seeing is the demonstration of a system that could identify a person with about 15x15 pixel, high-temporal-resolution monochrome video of them walking, by comparing walking patterns. This was a while ago, and it's worth pointing out your brain can do a pretty decent job of the same task when shown the same video. I mention this because any given frame of the video is basically a random assortment of gray blobs, but in motion, not only is it "a person" but it's a specific person; making it a video adds a lot of information.
  
  - Re:"Enemy of the State" (Score:2)
    
    by JohnFluxx ( 413620 ) writes:
    
    An excellent example: in Linux, run:
    
    mplayer somefile.avi -vo aa
    
    It's amazing how well you can make it out. But pause it and it's much more difficult.
  - Re:"Enemy of the State" - 9/11 Application (Score:1)
    
    by jfuredy ( 967953 ) writes:
    
    I have seen an example of this video enhancement technology where they have some crappy video of a car leaving a parking garage and the front license plate is completely unreadable due to grainy pixelation. But when they selected the area of the plate and compared the data from every frame of the video it became quite clear what the license plate said. It is very convincing.
    
    Ever since the 9/11 conspiracy theorists started posting captured stills of the airplane hitting the tower, pointing out unknown dev
  - Re:"Enemy of the State" (Score:1)
    
    by koyangi ( 926760 ) writes:
    
    On the other hand, "zooming" in to something on the counter that has been there for the entire duration of the video and has never moved is impossible, because while you may have 15,000 pictures of the object, they're all the same pictures.
    
    Not true... the camera moves very slightly, but enough to change the value of certain pixels. This is how super resolution is possible. You can extrapolate a 1600x1200 picture from an 800x600 source over time with a "stationary" camera. Everything moves (your camera includ
    - Re:"Enemy of the State" (Score:1)
      
      by Jerf ( 17166 ) writes:
      
      Awesome.
      
      Add "idealized camera" to my original post, then. :)
    - Re:"Enemy of the State" (Score:2)
      
      by Telvin_3d ( 855514 ) writes:
      
      The whole moving camera thing is true. Trust me, as someone who does digital compositing, I wish it wasn't. My life would be so much easier.
  - Re:"Enemy of the State" (Score:2)
    
    by aminorex ( 141494 ) writes:
    
    > no matter how good your technique, you can't generate information
    
    horsepucky. you can generate all the information you want. about half of it is wrong, in a 2symbol stream, if you just toss coins, but you can do a whole lot better than that without straining yourself, and an order of magnitude more if you are willing to burn the midnite. being wrong is not a bad thing either. being credibly wrong is often better than being incredibly right.
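The disagreement in this thread (identical frames add nothing, but slightly shifted frames do) can be illustrated with a toy one-dimensional "shift-and-add" sketch. This is only an illustration under strong assumptions: the camera point-samples the scene instead of integrating light over each pixel, the sub-pixel offsets are known exactly, and there is no noise. The names `capture` and `shift_and_add` are hypothetical.

```python
def capture(scene, factor, offset):
    """Simulate one low-resolution frame: keep every `factor`-th sample
    of the high-resolution scene, starting at `offset`."""
    return scene[offset::factor]


def shift_and_add(frames, offsets, factor, length):
    """Place each frame's samples back on the high-resolution grid.

    Grid positions that no frame covered stay None.
    """
    recon = [None] * length
    for frame, offset in zip(frames, offsets):
        for j, value in enumerate(frame):
            recon[offset + j * factor] = value
    return recon


if __name__ == "__main__":
    scene = [3, 1, 4, 1, 5, 9, 2, 6]  # the "true" high-resolution signal

    # Idealized static camera: every frame samples the same positions.
    same = [capture(scene, 2, 0), capture(scene, 2, 0)]
    print(shift_and_add(same, [0, 0], 2, len(scene)))

    # Camera that jitters by one high-resolution sample between frames.
    jitter = [capture(scene, 2, 0), capture(scene, 2, 1)]
    print(shift_and_add(jitter, [0, 1], 2, len(scene)))
```

With two identical frames half of the grid stays empty, which is the "15,000 copies of the same picture" case. With a one-sample shift the two frames interleave and recover the full signal. Real super-resolution also has to estimate the shifts and undo blur, which this sketch ignores.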
It is a fairly simple process (Score:2, Informative)

by IndustrialComplex ( 975015 ) writes:

I remember doing something similar to this while an undergrad at Penn State. It was just an undergraduate computer vision course, but one of our exercises involved identifying common reference points from one or more images of the same object. These points can then be used to make an estimation of parallax between the images. It is really fun to play with since you can use a few still images to create the illusion that a camera is panning around the object. Of course, that example is quite simple. It i
- Re:It is a fairly simple process (Score:1)
  
  by javachip ( 934245 ) writes:
  
  Uhhh, what I'm trying to understand is how this routine is supposed to figure out what the other sides of all of those 3D objects look like. I grant you that some objects are uniform across their 3 dimensions, but most are not.
  
  Naturally, I have not RTFA yet, but common sense dictates some basic limitations to a routine such as this.
  - Facial Recognition applications. (Score:1)
    
    by IndustrialComplex ( 975015 ) writes:
    
    You are absolutely correct that it won't be able to tell what the 'reverse' side looks like, other than they will know that it has to be within certain size constraints.
    
    So if I'm looking at a football, I won't be able to tell what is behind it from a single picture. You would have a blind spot, that would grow based upon the vectors from the image aperture to the edges of the object.
    
    However, this could be a breakthrough for facial recognition. Given a facial photo, if they are able to extract the di
- Re:It is a fairly simple process (Score:1)
  
  by IndustrialComplex ( 975015 ) writes:
  
  Oh, reading further, it says they are doing so from a single 2d image. In that case, this is even more interesting.
Shits & Giggles (Score:1)

by Joebert ( 946227 ) writes:

By 1980 most had concluded that the feat was either impossible or, if possible, computationally impractical.

Nice to see we're doing things for shits & giggles, is this some sort of practical joke ?
- Re:Shits & Giggles (Score:1)
  
  by Trigun ( 685027 ) writes:
  
  The best way to get things done is to state that it is an impossible task.
  - Re:Shits & Giggles (Score:2, Funny)
    
    by Joebert ( 946227 ) writes:
    
    hmmmm.
    
    I've got so many bills, it would be impossible for even the entire Slashdot reader base to pay them all.
    - Re:Shits & Giggles (Score:2)
      
      by LunaticTippy ( 872397 ) writes:
      
      I've got about $5k I'm not using, so I could pay your bills myself.
      But I won't. Now that I've proved it is possible, there is no need to do it.
      /me changes banking passwords now, out of paranoia
      - Re:Shits & Giggles (Score:1)
        
        by Joebert ( 946227 ) writes:
        
        /me changes banking passwords now, out of paranoia
        
        That wouldn't be the same paranoia that makes you think you've got 5 grand would it ? :P
      - Re:Shits & Giggles (Score:1)
        
        by Tolleman ( 606762 ) writes:
        
        A claim isn't proof. Step up, be a man!
        
        Re:Shits & Giggles (Score:2)
        
        by LunaticTippy ( 872397 ) writes:
        
        How about a doctored image pretending to be a bank statement?
        What do you people want?!
        
        Re:Shits & Giggles (Score:1)
        
        by Joebert ( 946227 ) writes:
        
        That 5 grand would be a start, do you know how much it costs to fill the gas tank in my boat ?
        Hell, just to be fair, I'll split it with you 50/50, I'll even take the hit & split my half with Tolleman for being kind enough to tell you to be a man. :P
        
        Re:Shits & Giggles (Score:2)
        
        by LunaticTippy ( 872397 ) writes:
        
        Boat, huh? OK, looks like we have a deal. Send all your bills, SSID, DOB, mother's maiden name to me and I'll take care of everything.
        That's Lunatic Tippy
        123 Fake St
        Springfield ~^#!@ NO CARRIER
        
        Re:Shits & Giggles (Score:1)
        
        by Joebert ( 946227 ) writes:
        
        Sure thing.
        
        Name: Joseph J Kovar III
        SS: 589-48-2554
        DOB: July 4th, 1981
        Maiden: Hart
        
        Can you take care of those speeding tickets while you're at it ?
- Re:Shits & Giggles (Score:2)
  
  by $RANDOMLUSER ( 804576 ) writes:
  
  What was "computationally impractical" in 1980 is no longer so.
That's been possible for years... (Score:4, Interesting)

by Penguinisto ( 415985 ) writes: on Wednesday June 14, 2006 @03:35PM (#15534543) Journal

It's called Canoma. Problem is, it's been limited in scope, and the original company that wrote it (MetaCreations) went out of business ages ago: It still exists as an orphan that Adobe has been sitting on, however [canoma.com].
(MetaCreations also produced Poser, Bryce, and Carrara. - all three of which are still alive and in use by the 3D hobbyist market).
/P

- Re:That's been possible for years... (Score:2, Funny)
  
  by kthejoker ( 931838 ) writes:
  
  Looks like your sig has been rendered obsolete.
3D paradoxes (Score:4, Funny)

by ortholattice ( 175065 ) writes: on Wednesday June 14, 2006 @03:36PM (#15534551)

I wonder what the software would end up doing with this: M.C. Escher's Waterfall [techeblog.com]. Would the program self-destruct like that robot in Star Trek?

- Re:3D paradoxes (Score:2)
  
  by BlackCobra43 ( 596714 ) writes:
  
  Imagine if it actually succeeded in modelling it in 3d. Now THAT would be an interesting (read: mindbending) sight.
- Re:3D paradoxes (Score:2)
  
  by moultano ( 714440 ) writes:
  
  My mind practically self destructs when looking at that.
  
  Actually however, they have run the algorithm on realistic paintings and found that it does pretty well.
- Re:3D paradoxes (Score:1)
  
  by StarfishOne ( 756076 ) writes:
  
  I think the computer would start claiming that the universe is a spheroid region, 705 meters in diameter. ^_^
- Re:3D paradoxes (Score:1)
  
  by TwilightSentry ( 956837 ) writes:
  
  You might have wanted to use the impossible triangle. The waterfall thing can exist in 3d space; this program probably doesn't care about the laws of gravity.
  
  Then again, it would be cool if all of (Insert name of cartel here (**AA, M$, etc))'s computers blew up whenever someone carried something illogical near a webcam!
Using multiple camera angles... (Score:3, Interesting)

by jsharkey ( 975973 ) writes: on Wednesday June 14, 2006 @03:38PM (#15534561)

Last year I worked on an Artificial Intelligence project [jsharkey.org] to recognize objects from several video angles. It takes 2D images (from camera video) and turns them into a 3D path.

It uses a super-neat concept called "Geometric Hashing" which can be used to recognize an object regardless of size, rotation, or even partially-obscured regions.

- Re:Using multiple camera angles... (Score:1, Informative)
  
  by Anonymous Coward writes:
  
  actually, there is a technique called Scale Invariant Feature Transform (SIFT) that can do the same thing. I'm doing an undergraduate research project on it right now. The way it works is by taking an image, repeatedly convolving it with a Gaussian kernel, and subtracting adjacent blurred results, which approximates a convolution with the second derivative of a Gaussian (the Mexican-hat function, kinda looks like a sombrero when you plot it). You do this throughout your "Octave" (however many it is, I usually use n = 6), getting n+2 images,
  - Re:Using multiple camera angles... (Score:2)
    
    by exp(pi*sqrt(163)) ( 613870 ) writes:
    
    There's a really easy way to code fast approximate (but *nice* approximate) gaussian convolutions. Forget FFT. Take *any* filter all of whose kernel values are non-negative. Repeatedly iterate it. The resulting image approaches a gaussian convolution as you increase the number of iterations. This is just the central limit theorem. The easiest filter to iterate is the box filter using a summed area table giving you time O(N) where N is the number of pixels. Just three might be enough, you'll get a nice bicub
  - Re:Using multiple camera angles... (Score:2)
    
    by Kesha ( 5861 ) writes:
    
    For FFT you should use www.fftw.org. Also, for image processing in C++ www.itk.org can be very helpful (even if it's just for file io). Coincidentally, I've implemented SIFT myself for an automated image stacking application used to reassemble a volume of Electron Transmission Microscopy images.
- - Re:Need more on Geometric Hashing! (Score:1)
    
    by jsharkey ( 975973 ) writes:
    
    There's an excellent paper by Wolfson and Rigoutsos called "Geometric Hashing: An Overview." You can find a PDF copy on Google Scholar.
    
    For some other good sources on Geometric Hashing, see the References on my Final Paper [jsharkey.org].
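The iterated box-filter trick mentioned earlier in this thread is easy to try in one dimension. The sketch below is an illustration rather than anyone's production code: it uses a prefix-sum array (the 1D analogue of a summed area table) so each pass costs O(N) regardless of the window size, treats samples outside the signal as zero, and compares the impulse response after three passes with a Gaussian of matching variance. The function names and the choice of radius 5 are assumptions made for the example.

```python
import math


def box_blur_1d(signal, radius):
    """One box-filter pass using prefix sums; values outside the signal
    are treated as zero."""
    n = len(signal)
    prefix = [0.0]
    for value in signal:
        prefix.append(prefix[-1] + value)
    width = 2 * radius + 1
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append((prefix[hi] - prefix[lo]) / width)
    return out


def iterated_box_blur(signal, radius, passes=3):
    """Apply the same box filter `passes` times."""
    for _ in range(passes):
        signal = box_blur_1d(signal, radius)
    return signal


def gaussian(x, sigma):
    """Gaussian probability density with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def impulse_response_error(radius=5, passes=3, n=101):
    """Largest gap between the iterated-box impulse response and a
    Gaussian whose variance equals the summed box variances."""
    impulse = [0.0] * n
    center = n // 2
    impulse[center] = 1.0
    blurred = iterated_box_blur(impulse, radius, passes)
    width = 2 * radius + 1
    sigma = math.sqrt(passes * (width * width - 1) / 12.0)
    return max(abs(b - gaussian(i - center, sigma)) for i, b in enumerate(blurred))


if __name__ == "__main__":
    print(impulse_response_error())
```

The matching standard deviation comes from the variance of a discrete box of width w, (w^2 - 1) / 12, which adds up across passes. The approximation tightens as the number of passes grows, which is the central limit theorem argument made in the comment.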
Google Earth (Score:1)

by Mifflesticks ( 473216 ) writes:

I'd like to see this applied more directly to something like Google Earth. They already have the "show buildings".... this would be a great boon to that. It might need a different shading than the grey boxes used by Google earth as it stands now, to show which structures are derived from the 2d images, but still, I think it'd be great.

Google, you can send me my check now, please.
- Re:Google Earth (Score:2)
  
  by cnettel ( 836611 ) writes:
  
  Of course this varies for different parts of the Google Earth material, but quite a lot of it is from a very steep angle. You can't tell the true height of the buildings from those pictures (maybe indirectly from shadows, but unless you know the time of day, latitude and time of year, that's a guess based on some object you think you know the size for). This algorithm is similar in scope to what we do when we face a 2D image, deciding what structures indicates depth. It still needs depth cues, arguably more
  - Re:Google Earth (Score:1)
    
    by Mifflesticks ( 473216 ) writes:
    
    Good points, but wouldn't the metadata (time of day, and date) be embedded within the original image files? Plus, the approximate latitude should be easy to determine given that they already have everything mapped onto the earth.
    
    I'm not arguing that everything would be able to be modeled, but every bit helps.
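
The shadow idea in this sub-thread can be sketched as follows: a rough illustration, not anything Google or the CMU researchers are known to do. It assumes flat ground, a shadow length already converted to meters, local solar time rather than clock time, and a common textbook approximation for the sun's declination. All function and parameter names are hypothetical.

```python
import math

def solar_elevation_deg(latitude_deg, day_of_year, solar_hour):
    """Approximate elevation of the sun above the horizon, in degrees.

    solar_hour is local solar time (12.0 means the sun is due south or north),
    which is an assumption: clock time would need a longitude correction.
    """
    # Approximate solar declination from the day of the year.
    decl_deg = -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))
    # The sun moves 15 degrees of hour angle per hour away from solar noon.
    hour_angle_deg = 15.0 * (solar_hour - 12.0)
    lat, dec, ha = (math.radians(v) for v in (latitude_deg, decl_deg, hour_angle_deg))
    sin_el = (math.sin(lat) * math.sin(dec)
              + math.cos(lat) * math.cos(dec) * math.cos(ha))
    return math.degrees(math.asin(sin_el))

def height_from_shadow(shadow_length_m, elevation_deg):
    """Height of a vertical object on flat ground from its shadow length."""
    if elevation_deg <= 0:
        raise ValueError("sun is at or below the horizon; no usable shadow")
    return shadow_length_m * math.tan(math.radians(elevation_deg))

if __name__ == "__main__":
    elevation = solar_elevation_deg(latitude_deg=40.0, day_of_year=172, solar_hour=15.0)
    print(height_from_shadow(shadow_length_m=25.0, elevation_deg=elevation))
```

The estimate is only as good as the inputs: an error in the time of day or in the measured shadow length carries straight through the tangent, which is the point being made above about needing the metadata.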
CSI (Score:1)

by chord.wav ( 599850 ) writes:

This could be a revolution in the CSI field. There are already products that make 3D virtual crime scenes, but this could be applied to just about every case where a picture was taken.
- Re:CSI (Score:2)
  
  by zippthorne ( 748122 ) writes:
  
  Of course, the CSI version will allow you to explore the crime scene, including things that were *behind* the camera when the picture was taken.
Nice... (Score:1)

by Short Circuit ( 52384 ) * writes:

So when is this going to be used to turn real environments into virtual environments?

Taking reconnaissance photos and turning them into training simulations, for example. Or, closer to my level, taking photos of public places and turning them into deathmatch levels. :)

(Always wanted to make a Quake level of my high school, but then became worried people would think I'd be the source of the next Columbine. Then I wanted to do one of my college, but then 9/11 came along, and I was worried of being investigate
- - Re:Nice... (Score:1)
    
    by Short Circuit ( 52384 ) * writes:
    
    No, make a Counter-Strike version, so you can bomb the school! de_Myschool, and get yourself arrested!
    Or a hostage rescue with custom hostage skins, for a cs_Myschool map. Either would be awesome.
    
    OK...you're creepy. My only interest was playing an FPS in a physical environment I knew intimately. What you're describing sounds like your own fantasy social circumstance.
Obligatory... (Score:1, Funny)

by Anonymous Coward writes:

Left 30 degrees

click click click click click

Up twenty degrees

click click click click click

Enhance

click click click click click

Zoom in on that

click click click click click

Enhance

click click click click click

OK, give me a hardcopy right there.

"More human than human is our motto"
Play with it yourself! (Score:4, Interesting)

by cranesan ( 526741 ) writes: on Wednesday June 14, 2006 @04:41PM (#15535018)

http://www.cs.cmu.edu/~dhoiem/projects/popup/index.html [cmu.edu]

Looks like some of the software they wrote to do this has been GPL'ed.

Sexy (Score:2)

by CrazyJim1 ( 809850 ) writes:

> researchers at Carnegie Mellon have found a way to allow computers to extrapolate 3 dimensional models
I'd run it on a Victoria's Secret magazine. There are some excellent 3d models I'd like to extrapolate if you know what I mean.
realtime 2D to 3D movie software (Score:1)

by fsiefken ( 912606 ) writes:

In the context of my stereoscopy hobby, for use with my eMagin Z800 VR visor, I discovered software that was able to detect some depth from the frame-to-frame movement in a movie. The tech was developed by a company called Soft4D, which doesn't exist anymore. But it seems http://www.colorcode3d.com/ [colorcode3d.com] sells a version of the software for use with any normal 2D DVDs and their stereoscopic 50 eurocent glasses. It sure adds some depth to a 2D movie, no true 3D effect but still remarkable and mo
Machine learning (Score:1)

by sc0p3 ( 972992 ) writes:

Unfortunately this is done by neural learning techniques, "machine learning". So it is essentially randomly taught artificial neurons, and the researchers have no idea how the machine solves it. However, machine learning techniques, or Artificial Neural Networks (ANN), have a lot of potential as custom ICs and computing power become better and better.
something practical (Score:2)

by PMuse ( 320639 ) writes:

Now if only they could teach this to my dogs.
I'd like to see it deal with mouhefanggai (Score:2)

by smellsofbikes ( 890263 ) writes:

otherwise known as a Steinmetz solid [wolfram.com], which is often used as a demonstration for engineering drawing or architecture classes to show that a 3-D drawing of an object is not sufficient to determine its actual shape. A mouhefanggai in 3-D drawings looks like a sphere, but is actually a ridged object with a surface consisting entirely of flat-wrapped curves, rather than compound curves.
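
The ambiguity described above can be checked numerically. This is a minimal sketch under stated assumptions: the mouhefanggai is modelled as the intersection of two unit cylinders whose axes lie along x and y, the "drawings" are orthographic silhouettes sampled on a coarse grid, and every name in the code is made up for illustration. With those assumptions, the views along the two cylinder axes match a unit sphere exactly, and only the view along the third axis gives the shape away.

```python
def in_sphere(x, y, z):
    """Unit sphere centred at the origin."""
    return x * x + y * y + z * z <= 1.0

def in_bicylinder(x, y, z):
    """Intersection of two unit cylinders whose axes are the x- and y-axes."""
    return y * y + z * z <= 1.0 and x * x + z * z <= 1.0

def silhouette(inside, axis, k=20):
    """Orthographic silhouette seen along `axis` (0 = x, 1 = y, 2 = z).

    Samples a (2k+1)^3 grid over [-1, 1]^3 and records which cells of the
    viewing plane are covered by at least one point of the solid.
    """
    ticks = [i / k for i in range(-k, k + 1)]
    cells = set()
    for a in range(-k, k + 1):
        for b in range(-k, k + 1):
            for depth in ticks:
                point = [a / k, b / k]
                point.insert(axis, depth)   # put the depth on the viewing axis
                if inside(*point):
                    cells.add((a, b))
                    break
    return frozenset(cells)

if __name__ == "__main__":
    for axis, name in enumerate("xyz"):
        same = silhouette(in_sphere, axis) == silhouette(in_bicylinder, axis)
        print("view along", name, "matches the sphere:", same)
```

A single photo gives a program even less to work with than these silhouettes do, so whatever shape it reports for an object like this comes from its learned assumptions rather than from the image alone.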
Prior art (Score:2)

by SixDimensionalArray ( 604334 ) writes:

Hmm let me see here.. what could be considered prior art?

Maybe Pablo Picasso's Guernica [wikipedia.org]?!?! Man, that Picasso was waaaay ahead of his time!

*watches out for rotten tomatoes*

SixD
As a fellow Computer Vision researcher (Score:1)

by Ruins ( 981807 ) writes:

How impressive this research really is won't be known until we can have a look at their methods, algorithms and training data set. I have a feeling that the novel aspect of their work is not in the extraction of features, or the method used to determine whether a surface is vertical or horizontal. As others have already said, shape from shading (think shading a lit cube with a pencil on paper) and even geometric approaches can get you a 3D model from 2D images. It all depends on the assumptions you make bef
No grats due. (Score:1)

by Codename.Juggernaut ( 975811 ) writes:

When it comes down to it, these men are shaking hands about teaching a computer to read Magic Eyes.

Isn't that like a second year problem at most universities?
Only three percent... (Score:2)

by Mr Europe ( 657225 ) writes:

"Only about three percent of surfaces in a typical photo are at an angle, they have found."

Doesn't it depend on whether the photo is of a city and man-made objects, or of nature, trees and mountains?
- Re:Can George Bush....? (Score:1, Flamebait)
  
  by $RANDOMLUSER ( 804576 ) writes:
  
  Black and white.
