Researchers Teach Computers To Perceive 3D from 2D
hamilton76 writes to tell us that researchers at Carnegie Mellon have found a way to allow computers to extrapolate 3-dimensional models from 2-dimensional pictures. From the article: "Using machine learning techniques, Robotics Institute researchers Alexei Efros and Martial Hebert, along with graduate student Derek Hoiem, have taught computers how to spot the visual cues that differentiate between vertical surfaces and horizontal surfaces in photographs of outdoor scenes. They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image. [...] Identifying vertical and horizontal surfaces and the orientation of those surfaces provides much of the information necessary for understanding the geometric context of an entire scene. Only about three percent of surfaces in a typical photo are at an angle, they have found."
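To see why knowing which pixels are ground and which are vertical goes a long way, here is a minimal sketch. It is not the researchers' code; it assumes an idealised pinhole camera looking straight ahead over flat ground, and the function names, focal length, and camera height are all made up for illustration. Under those assumptions, the image row where a vertical surface meets the ground fixes its distance, and the row of its top edge then fixes its height.

```python
def ground_depth(row, horizon_row, focal_px, camera_height_m):
    """Distance along flat ground to the point imaged at `row`.

    Rows are counted downward, so ground pixels lie below the horizon
    (row > horizon_row). Assumes a level pinhole camera (hypothetical setup).
    """
    offset = row - horizon_row
    if offset <= 0:
        raise ValueError("row must lie below the horizon")
    return focal_px * camera_height_m / offset


def vertical_height(base_row, top_row, horizon_row, focal_px, camera_height_m):
    """Height of a vertical surface whose base touches the ground at `base_row`."""
    depth = ground_depth(base_row, horizon_row, focal_px, camera_height_m)
    # A point at height H and distance Z projects to
    # row = horizon_row + focal_px * (camera_height_m - H) / Z.
    return camera_height_m - (top_row - horizon_row) * depth / focal_px


if __name__ == "__main__":
    # Illustrative numbers only: 800 px focal length, camera 1.6 m above ground.
    print(ground_depth(400, 240, 800, 1.6))
    print(vertical_height(400, 140, 240, 800, 1.6))
```

With depth along the ground and height of each vertical region, a scene can be folded up into the kind of blocky model the summary describes, which is also why sloped surfaces are the hard case.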
Awesome! (Score:5, Funny)
Re:Awesome! (Score:2)
"Bite the fish-eye lens facing my Fembot's shiny metal boobs!"
Re:Awesome! (Score:2)
I would like to see the results from the Google images with "safe search" turned off.
Re:Awesome! (Score:2)
For me, it's adding another item to the "things they said were impossible in CS class but are now available". The stuff Salient Stills is selling is another idea I had in school for a project - fortunately the grad students were able to show me how that was mathematically impossible too.
"Never say never", boys and girls. I'll get back in line for my FTL transporter then.
Re:Awesome! (Score:2)
+++Out of cheese error+++
+++Please reboot universe+++
+++Redo from start+++
/TP's DW reference
Re:Awesome! (Score:2)
+++ Melon Melon Melon +++
Re:Awesome! (Score:2)
Escher in 3D (Score:1)
leaning tower (Score:3, Interesting)
Re:leaning tower (Score:2, Funny)
Re:leaning tower (Score:1)
AAAAAAAAAARGH! Try to imagine 300 folks making the same photo at the same time.
They look sooooo dumb!
Re:leaning tower (Score:2)
Directly applicable to the car racing AI grand.... (Score:4, Interesting)
Bus-ted. (Score:1, Funny)
Man that would be a pretty neat invention [basintransit.com].
Re:Directly applicable to the car racing AI grand. (Score:1)
This is good when the source material doesn't exist.
However if I were in the grand challenge I wouldn't be swapping the (minimum) stereo imaging most cars appear to have.
1) it's an approximation and may not be applicable for different terrain or obstacles (similar rock against similar floor)
2) it's harder to fool 2 cameras than a single one, glitches could send you off the cliff.
3) with a stereo pair you can interpo
Re:Directly applicable to the car racing AI grand. (Score:2)
Re:Directly applicable to the car racing AI grand. (Score:2)
Imagine the Possibilities (Score:2, Interesting)
Re:Imagine the Possibilities (Score:2)
Errr... (Score:5, Informative)
Cities aren't the kind of thing this is targeted at.
You can get building plans and architectural drawings and everything from the city for free. There are algorithms that can easily map pictures to objects if you know ahead of time the shape of the things that "should" be there.
This stuff is for deciding the shape of unknown things, and more importantly, to gain new heuristics for image searches.
With this technology, you could ask for "things that are round, and have a box".
More importantly, you could show the computer one picture of something, and have it attempt to find more pictures of it (from different angles, with different colors, etc.). Like you show it a Volvo C90, and it shows you any and all pictures of Volvo C90s by the shape.
Re:Errr... (Score:1)
There's your grant money right there, boys!
Re:Errr... (Score:2)
Really...
hmm...
I was thinking "things that are round, and have a nipple"
Not for objects at all (Score:3, Insightful)
Re:Errr... (Score:3, Funny)
Dear Sir,
ha ha ha.
ha ha ha ha ha ha ha.
ha.
If only.
Signed,
every CAD operator in the world
Well... (Score:2)
And you need a light model and surface texture models (or a lot of pictures from different angles).
So this isn't trivial. But it's doable. Such techniques are used in film for scene composition and for texturing 3d representations of real-world objects.
It's not like you can just take a picture of a buildi
Re:Well... (Score:3, Interesting)
X-Files (Score:1)
"Your scientists have yet to discover how neural networks create self-consciousness, let alone how the human brain processes two-dimensional retinal images into the three-dimensional phenomenon known as perception. Yet you somehow brazenly declare seeing is believing?"
-- Jesse "The Body" Ventura as a Man In Black
Typical photos? (Score:3, Interesting)
What typical photos are those? No faces, people, trees or any organic thing?
No cars? No roofs?
Re:Typical photos? (Score:2)
I worked with them briefly (Score:4, Informative)
Re:I worked with them briefly (Score:2)
Re:I worked with them briefly (Score:1, Insightful)
Several pieces of work have exploited that effect in recent years, most notably Billboard Clouds [www-imagis.imag.fr] at ACM SIGGRAPH 2003.
> researchers have been able to do this kind of stuff for a while now
Then you must know something no graphics researchers in the world do, since Derek's work was presented as new research in ACM SIGGRAPH 2005. (ACM SIGGRAPH [siggraph.org] is by far the top graphics conference in the world; if they tho
Re:Typical photos? (Score:1)
Re:Typical photos? (Score:1)
From TFA:
Faces have a number of vertical and horizontal surfaces, like the sides of your nose, bottom of your chin, cheeks, etc. And cars have plenty of horizontal and vertical sides. And not all roofs are peaked.
As someone else commented, this will give you very blocky representations, but there is plenty of use to those blocky representations. Fo
Robot vision (Score:5, Insightful)
They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image
This is so not new [amazon.com]. These researchers may have advanced techniques in some areas, but shape from shading inversion problems like this have been worked successfully since the 1970's and earlier. The theory is well established. Horn's Robot Vision is a classic.
Nothing like shape from shading approaches (Score:3, Insightful)
What you are saying amounts to "People have done research into computer vision in the past, therefore any new research into computer vision is soooo not new."
Shape from shading is widely applicable (Score:2)
Shape from shading works only on a very narrow set of objects. If you are trying to recover the shape of a marble statue, use shape from shading. If your object has color forget about it.
Not true at all. If you understand the photometric function [psu.edu] of the materials in the scene, variation due to color can be separated from variation due to shading. Image classification techniques are useful for doing this. This is discussed in the book and elsewhere. We used the technique for Voyager II to measure topograp
I can't find this course listed anywhere on... (Score:2)
First application will be... (Score:5, Funny)
Re:First application will be... (Score:2)
Somehow I don't think there is going to be a huge market for rectilinear porn.
Re:First application will be... (Score:2)
Re:First application will be... (Score:1)
Re:First application will be... (Score:2)
Wouldn't that be more related to a different part of her anatomy than her boobies?
Re:First application will be... (Score:2)
Re:First application will be... (Score:2)
"Enemy of the State" (Score:5, Funny)
Re:"Enemy of the State" (Score:5, Informative)
What's impossible is to take a single photo out of the stream and "enhance" it to the n-th degree without using the rest of the video.
And no matter how good your technique, you can't generate information, so there will be some limit to your zooming in.
But the idea that if you consider the entire video stream, you can extract a lot more information is not impossible at all, and you'd probably be surprised by both what is in there and what isn't. Seeing "through" something probabilistically is possible if the object being "seen" was in video at some point. On the other hand, "zooming" in to something on the counter that has been there for the entire duration of the video and has never moved is impossible, because while you may have 15,000 pictures of the object, they're all the same pictures.
Normally I don't bring this up when we're having one of our usual bitch-fests about CSI here on Slashdot because by and large the standard bitching is still correct. But as AI advances, some of the stuff that seems impossible now will become very possible.
One early example I remember seeing is the demonstration of a system that could identify a person with about 15x15 pixel, high-temporal-resolution monochrome video of them walking, by comparing walking patterns. This was a while ago, and it's worth pointing out your brain can do a pretty decent job of the same task when shown the same video. I mention this because any given frame of the video is basically a random assortment of gray blobs, but in motion, not only is it "a person" but it's a specific person; making it a video adds a lot of information.
Re:"Enemy of the State" (Score:2)
mplayer somefile.avi -vo aa
It's amazing how well you can make it out. But pause it and it's much more difficult.
Re:"Enemy of the State" - 9/11 Application (Score:1)
Ever since the 9/11 conspiracy theorists started posting captured stills of the airplane hitting the tower, pointing out unknown dev
Re: (Score:2)
Re:"Enemy of the State" (Score:1)
Not true... the camera moves very slightly, but enough to change the value of certain pixels. This is how super resolution is possible. You can extrapolate a 1600x1200 picture from an 800x600 source time with a "stationary" camera. Everything moves (your camera includ
Re:"Enemy of the State" (Score:1)
Add "idealized camera" to my original post, then.
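A rough sketch of the sub-pixel argument in this exchange, under loud assumptions: a 1-D "scene", point sampling rather than a real sensor that integrates over each pixel, and shifts that are known exactly instead of estimated. The names `capture` and `shift_and_add` are invented for illustration. With two frames offset by half a low-res pixel, the full-resolution signal comes back; with an idealized camera that never moves, the extra frames add nothing.

```python
def capture(scene, shift, factor):
    """Simulate one low-res frame: keep every `factor`-th sample, starting at `shift`."""
    return scene[shift::factor]


def shift_and_add(frames, shifts, factor, length):
    """Place each low-res sample back on the high-res grid and average overlaps.

    Grid positions that no frame ever observed are returned as None.
    """
    total = [0.0] * length
    count = [0] * length
    for frame, shift in zip(frames, shifts):
        for i, value in enumerate(frame):
            pos = shift + i * factor
            total[pos] += value
            count[pos] += 1
    return [t / c if c else None for t, c in zip(total, count)]


if __name__ == "__main__":
    scene = [0, 1, 4, 9, 16, 25, 36, 49]

    # Camera jitters by one high-res sample between frames.
    moving = [capture(scene, 0, 2), capture(scene, 1, 2)]
    print(shift_and_add(moving, [0, 1], 2, len(scene)))

    # Idealized stationary camera: both frames are identical.
    still = [capture(scene, 0, 2), capture(scene, 0, 2)]
    print(shift_and_add(still, [0, 0], 2, len(scene)))
```

In the stationary case every odd position stays unobserved no matter how many frames are stacked, which is the "15,000 pictures that are all the same picture" situation.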
Re:"Enemy of the State" (Score:2)
Re:"Enemy of the State" (Score:2)
horsepucky. you can generate all the information you want. about half of it is wrong, in a 2symbol stream, if you just toss coins, but you can do a whole lot better than that without straining yourself, and an order of magnitude more if you are willing to burn the midnite. being wrong is not a bad thing either. being credibly wrong is often better than being incredibly right.
It is a fairly simple process (Score:2, Informative)
Re:It is a fairly simple process (Score:1)
Naturally, I have not RTFA yet, but common sense dictates some basic limitations to a routine such as this.
Facial Recognition applications. (Score:1)
So if I'm looking at a football, I won't be able to tell what is behind it from a single picture. You would have a blind spot, that would grow based upon the vectors from the image aperture to the edges of the object.
However, this could be a breakthrough for facial recognition. Given a facial photo, if they are able to extract the di
Re:It is a fairly simple process (Score:1)
Shits & Giggles (Score:1)
Nice to see we're doing things for shits & giggles, is this some sort of practical joke ?
Re:Shits & Giggles (Score:1)
Re:Shits & Giggles (Score:2, Funny)
I've got so many bills, it would be impossible for even the entire Slashdot reader base to pay them all.
Re:Shits & Giggles (Score:2)
But I won't. Now that I've proved it is possible, there is no need to do it.
/me changes banking passwords now, out of paranoia
Re:Shits & Giggles (Score:1)
That wouldn't be the same paranoia that makes you think you've got 5 grand would it ?
Re:Shits & Giggles (Score:1)
Re:Shits & Giggles (Score:2)
What do you people want?!
Re:Shits & Giggles (Score:1)
Hell, just to be fair, I'll split it with you 50/50, I'll even take the hit & split my half with Tolleman for being kind enough to tell you to be a man.
Re:Shits & Giggles (Score:2)
That's Lunatic Tippy
123 Fake St
Springfield ~^#!@ NO CARRIER
Re:Shits & Giggles (Score:1)
Name: Joseph J Kovar III
SS: 589-48-2554
DOB: July 4th, 1981
Maiden: Hart
Can you take care of those speeding tickets while you're at it?
Re:Shits & Giggles (Score:2)
That's been possible for years... (Score:4, Interesting)
(MetaCreations also produced Poser, Bryce, and Carrara. - all three of which are still alive and in use by the 3D hobbyist market).
Re:That's been possible for years... (Score:2, Funny)
3D paradoxes (Score:4, Funny)
Re:3D paradoxes (Score:2)
Re:3D paradoxes (Score:2)
Actually however, they have run the algorithm on realistic paintings and found that it does pretty well.
Re:3D paradoxes (Score:1)
Re:3D paradoxes (Score:1)
Then again, it would be cool if all of (Insert name of cartel here (**AA, M$, etc))'s computers blew up whenever someone carried something illogical near a webcam!
Using multiple camera angles... (Score:3, Interesting)
It uses a super-neat concept called "Geometric Hashing" which can be used to recognize an object regardless of size, rotation, or even partially-obscured regions.
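For anyone curious, a minimal sketch of the idea, not the code from the project mentioned above. It assumes 2-D feature points stored as complex numbers, all distinct, and handles translation, rotation, and uniform scale only; every function name and the bin size `step` are made up for illustration. Each pair of points serves as a basis, the remaining points are expressed relative to that basis (which cancels out the transform), and the quantised coordinates are used as hash keys that vote for a model.

```python
import cmath
from collections import defaultdict
from itertools import permutations


def invariant_keys(points, i, j, step=0.25):
    """Quantised coordinates of all other points in the basis (points[i], points[j])."""
    origin, axis = points[i], points[j] - points[i]
    keys = []
    for k, p in enumerate(points):
        if k in (i, j):
            continue
        z = (p - origin) / axis  # unchanged by translation, rotation, scaling
        keys.append((round(z.real / step), round(z.imag / step)))
    return keys


def build_table(models, step=0.25):
    """Offline stage: hash every model under every ordered basis pair."""
    table = defaultdict(set)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for key in invariant_keys(pts, i, j, step):
                table[key].add((name, i, j))
    return table


def recognise(table, scene, step=0.25):
    """Online stage: try scene bases, let hash hits vote, return (votes, name)."""
    best = (0, None)
    for i, j in permutations(range(len(scene)), 2):
        votes = defaultdict(int)
        for key in invariant_keys(scene, i, j, step):
            for entry in table.get(key, ()):
                votes[entry] += 1
        for (name, _, _), n in votes.items():
            if n > best[0]:
                best = (n, name)
    return best


def transform(points, scale, angle, shift):
    """Apply a similarity transform, to fake a scene for the demo."""
    factor = scale * cmath.exp(1j * angle)
    return [factor * p + shift for p in points]


if __name__ == "__main__":
    models = {
        "arrow": [0j, 4 + 0j, 4 + 3j, 1 + 2j, 2 + 5j],
        "bar": [0j, 1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j],
    }
    table = build_table(models)
    scene = transform(models["arrow"], scale=2.0, angle=0.7, shift=10 - 3j)
    print(recognise(table, scene))
```

Because votes are counted per basis, a partially hidden object still collects votes from whichever points remain visible; a real system would also verify the winning match and cope with noisy feature positions.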
Re:Using multiple camera angles... (Score:1, Informative)
Re:Using multiple camera angles... (Score:2)
Re:Using multiple camera angles... (Score:2)
Re:Need more on Geometric Hashing! (Score:1)
For some other good sources on Geometric Hashing, see the References on my Final Paper [jsharkey.org].
Google Earth (Score:1)
Google, you can send me my check now, please.
Re:Google Earth (Score:2)
Re:Google Earth (Score:1)
I'm not arguing that everything would be able to be modeled, but every bit helps.
CSI (Score:1)
Re:CSI (Score:2)
Nice... (Score:1)
Taking reconnaissance photos and turning them into training simulations, for example. Or, closer to my level, taking photos of public places and turning them into deathmatch levels.
(Always wanted to make a Quake level of my high school, but then became worried people would think I'd be the source of the next Columbine. Then I wanted to do one of my college, but then 9/11 came along, and I was worried of being investigate
Re:Nice... (Score:1)
Or a hostage rescue with custom hostage skins, for a cs_Myschool map. Either would be awesome.
OK...you're creepy. My only interest was playing an FPS in a physical environment I knew intimately. What you're describing sounds like your own fantasy social circumstance.
Obligatory... (Score:1, Funny)
click click click click click
Up twenty degrees
click click click click click
Enhance
click click click click click
Zoom in on that
click click click click click
Enhance
click click click click click
OK, give me a hardcopy right there.
"More human than human is our motto"
Play with it yourself! (Score:4, Interesting)
Looks like some of the software they wrote to do this has been GPL'ed.
Sexy (Score:2)
realtime 2D to 3D movie software (Score:1)
Machine learning (Score:1)
something practical (Score:2)
I'd like to see it deal with mouhefanggai (Score:2)
Prior art (Score:2)
Hmm let me see here.. what could be considered prior art?
Maybe Pablo Picasso's Guernica? [wikipedia.org]?!?! Man, that Picasso was waaaay ahead of his time!
*watches out for rotten tomatoes*
SixD
As a fellow Computer Vision researcher (Score:1)
No grats due. (Score:1)
Isn't that like a second year problem at most universities?
Only three percent... (Score:2)
Doesn't it depend on whether the photo's of a city and man built objects or of nature, trees and mountains...
Re:Can George Bush....? (Score:1, Flamebait)