Intel Releases Open-Source Stereoscopic Software

Eslyjah writes "Intel has released a software library that allows computers to "see" in 3D. The library is available for Windows and Linux, under a BSDish license. Possible early applications include lipreading input. Check out the CNN Story, Intel Press Release, and project home page."
This discussion has been archived. No new comments can be posted.

  • by Axe ( 11122 )
    Another feeble attempt by Intel to drive up the demand for CPU power.. Remember speech recognition?

    Nobody needs it. Nobody. Those who do - will write their own stuff..

    • Ahh... but this is a FAR more useful attempt than simply encouraging website operators to include cpu intensive java applets on their webpages to create the illusion that older computers are too slow to handle "today's tough demanding applications" like web browsing.

    • (GASP!) Good GOD, man! You mean to tell me that they want computers to be able to do NEW things!? Things which might require more powerful processors!? New things so useful that people might be willing to buy these more powerful processors so that they might have computers that could do them!? OH! The shame!!
    • Umm, no.

      I've been using this library for the last couple of years, and it reduces CPU demand. I was doing pupil tracking, using it only for image capture, and I needed two PIII-860MHz with 400MHz RAMBUS memory. Using it for some of the basic image processing reduced demand so that it almost ran on a single Athlon. I think the memory was too slow on those boxen, but I left the project at that point. Fast enough was 30 fps, and it was difficult even with the library. (We needed the 3D position of the eyes reliably at video rates for a stereo display and didn't want the user to have to wear anything.)

      They have pretty decent MMX algorithms which work out of the box. I'd have to say that those who do write their own write it in C unless they absolutely can't avoid it. I was happy to squeeze an extra two cycles out of the inner loop of my MMX code, but I'd really rather have Intel do it since it's not the core of my work.

      Computers are slow and always will be. I'm a graphics person, so 6 hours to render a frame is really slow; to real vision people, processing an image for depth overnight is fast. We invent ways to use the CPU well ahead of Moore's law. In another 50 years we'll be able to emulate dog vision! Maybe... just run it overnight ;)

      (I'm speaking of the general library, this press release is just another algorithm added. They basically add common algorithms to the library, slowly.)
      • Yeah, yeah, I hear you man.. I wrote lotsa Monte Carlo code myself..

        It's just the way they market it that ticks me off.. My point - these are not "consumer" solutions..

        On an unrelated note - I had one job writing astronomical image processing code. First, I did it the usual way, using a library floating around in the community. Then I gave it some thought, and wrote my own code based on my old DWT code (that's for discrete wavelet transform, it is).. The new statistics turned out about 10 to 15 times faster to calculate, and more robust to boot. Yahoo. Use your brains - beat the Moore, man.. ;-))

    • I need it. Students build on it to code projects for our class on Digital Video Special Effects []. Without it, they'd spend the whole semester re-inventing old "stuff."

      There is actually a long history of trying to develop "the" computer-vision library, but this one actually has an impressive cross-platform user base []. To my knowledge, the 2nd most popular library is the Microsoft Vision SDK [].

  • an open source terrorist face recognition project.


    • an open source terrorist face recognition project.

      an open source smart bomb... we have the base requirements for a cruise missile. Or maybe an underwater enviro-terrorist robot that can locate and cripple the screws (props) on whale killing boats (or any other boat in the area, just in case it might hurt the whales).

  • I'm sure I'll be the first of many to say please fix the HREF in the story. How did that get past the editors?
  • Looks like only two uses for this:
    1. 3D Research
    2. Games
    I honestly cannot think of any other logical uses for this library. And the places that do either category probably already have this, but it's nice to see some free stuff though. Now if only they'd GPL it...
    • oops, it is GPLed, or something pretty close. ignore line in previous comment. New thought: Hey! free game companies can use it too. Maybe a Bot-Building game...? Are you listening, SourceForge community?
    • Re:Usefulness? (Score:4, Interesting)

      by mindstrm ( 20013 ) on Tuesday December 18, 2001 @07:45PM (#2723589)
      Very useful; especially being somewhat open.

      Sure, it might sound a bit far fetched.
      But.. a computer being able to actually construct something in 3 dimensions instead of simply a colorfield has *huge* implications with regards to image recognition.. it really does.

      Take a single image of, say, a user's hands, for gesture recognition. How do you recognize the hands from the background? Color.... heuristics.. and what not.

      But now.. it's simple. It's the stuff that's close to you! It's a completely different way to look at things.
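To make the "stuff that's close to you" idea concrete, here is a minimal sketch of depth-based segmentation. It assumes a per-pixel depth map in metres has already been produced by some stereo step; the function name `segment_near` and the toy data are hypothetical and are not part of Intel's library.

```python
def segment_near(depth_map, max_depth_m):
    """Return a mask that is True where a pixel is closer than max_depth_m.

    depth_map is a list of rows of depths in metres; None marks pixels
    for which no depth could be estimated.
    """
    return [
        [d is not None and d < max_depth_m for d in row]
        for row in depth_map
    ]


# Toy 3x4 depth map: a "hand" about 0.4 m away in front of a wall at 2 m.
depth = [
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 0.4, 0.4, 2.0],
    [2.0, 0.4, None, 2.0],
]
mask = segment_near(depth, max_depth_m=1.0)
```

A single threshold stands in here for the color heuristics a 2D approach would need; a real system would still have to cope with noisy or missing depth values.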
      • So, basically !

        Middle finger on its own is Ctrl-Alt-Del

        A fast pumping motion is WebPorn

        Small Finger and Index are for launching windows

        A close fist to Quake some arse...

        A shooting motion to "Kill" a process...

        Yeah, I can see how this will improve the pantomime in front of my computer... 8)
    • I haven't skimmed through the article yet, but this is probably why they put it out in the open, simply because they're allowing other people to find uses for it.
    • Re:Usefulness? (Score:2, Interesting)

      by texchanchan ( 471739 )
      How about better control for computerized surgery and other medical procedures?

      Mapping, and lots of different kinds of aerial/satellite photo analysis. You don't have to be looking directly at something to see a 3D view of it--you can look at stereo images--and so could a computer. Examples at Lunar and Planetary Institute: 3D Mars images [].

      Eventually, autopilots for cars.
    • Re:Usefulness? (Score:4, Interesting)

      by foobar104 ( 206452 ) on Tuesday December 18, 2001 @08:07PM (#2723707) Journal
      I honestly cannot think of any other logical uses for this library.

      One word: mensuration.

      Read that again. It's not a typo. Mensuration (which generally means the act or process of measuring) specifically means figuring out how tall buildings or features are from satellite photographs. Traditionally it involves calculating heights trigonometrically based on the time of day, latitude, and angle of lens inclination from which the photograph was taken, using shadows as points of reference.

      This is not a research project. It's a very practical process, with tons of applications in the commercial world.

      My point is that there are a lot of image analysis techniques that you've probably never heard of before. Don't mistake a lack of experience on your part for a lack of usefulness on theirs.
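As a rough illustration of the shadow-based calculation described above, here is a simplified sketch. It assumes flat ground, a shadow length already measured in metres from the image, and a known sun elevation angle; it ignores the lens inclination the comment mentions. The function name `height_from_shadow` is made up for this example.

```python
import math


def height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Estimate an object's height from the length of its shadow.

    Assumes flat ground and a vertical object, so that
    height = shadow_length * tan(sun_elevation).
    """
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


# Example: a 25 m shadow with the sun 35 degrees above the horizon.
estimated_height = height_from_shadow(25.0, 35.0)
```

In practice the sun elevation would itself be derived from the time of day and latitude, which is where the rest of the trigonometry in the comment comes in.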
      • Re:Usefulness? (Score:2, Informative)

        by t ( 8386 )
        I've heard this called Photogrammetry [].
        ... from the Third Edition of the Manual of Photogrammetry, is somewhat simpler in statement and that definition of photogrammetry is, the science or art of obtaining reliable measurements by means of photographs.
      • Mensuration (which generally means the act or process of measuring) specifically means figuring out how tall buildings or features are from satellite photographs.

        I'm being a pedantic bastard, but according to webster [] it doesn't specifically mean that. Where are you getting your definition from? Maybe I'll learn something..

        • which generally means the act or process of measuring)

          As the complete asshole foobar104 noted, it generally means the "act or process of measuring". You're looking at the online version of Merriam-Webster, which is VERY general.

          Go check the Oxford Dictionary of the English Language.
        • I'm being a pedantic bastard, but according to webster it doesn't specifically mean that. Where are you getting your definition from?

          From some remote sensing guys that I work with. I don't pretend to have all the jargon right, but I'm pretty sure about this one.
      • Good thing you asked me to read that again...

        First time around, it looked like something a lady would be doing at 'this time of the month.' -_-;
    • Actually, it sounds like the perfect targeting system for my robotic paintball sentry gun...
  • by javaaddikt ( 385701 ) on Tuesday December 18, 2001 @07:37PM (#2723539)
    Being able to see all those lusty ladies in 3d, it might use too much bandwidth downloading pr0n.
  • by chinton ( 151403 ) <{moc.liamg} {ta} {todhsals-100notnihc}> on Tuesday December 18, 2001 @07:37PM (#2723540) Journal
    When it is missing the leading "<"! Try this []
  • Bad link (Score:2, Informative)

    by Squigley ( 213068 )
    Try the CNN story here [].
  • by wickidpisa ( 41827 ) on Tuesday December 18, 2001 @07:40PM (#2723566) Homepage
    Possible early applications include lipreading input.

    Didn't we learn anything from 2001? You would think that people wouldn't be so eager to teach computers to read lips.
    • And vice versa. Have you ever turned off the sound to The Matrix and tried to read the agent's lips when he's questioning Morpheus?

      To mimic him, just let your lips set together and barely move them when you talk. Don't worry about being intelligible. (I'm pretty sure they voiced over in that scene.)
  • "Check out the A HREF='">CNN Story, Intel Press Release, and project home page.'"

    Dear editor, please check this [] out first...

    I'm sure chrisd wanted to have the l33t first (story)post..why can't we mod down /. staff for posting such things? ;)
  • I can never see the pictures, except sometimes when there's glass in front.

  • by adamy ( 78406 ) on Tuesday December 18, 2001 @07:53PM (#2723634) Homepage Journal
    This is not for rendering in 3d, but for allowing a machine to build a 3d model (internally) of the environment it uses. I assume it is based on the same sort of binocular mechanism as animal eyes, but the algorithms to build the internal structure are probably pretty advanced.

    A cool application (I haven't seen if they've done this yet) is rendering in OpenGL the internal view of what the robot eyes see. It would allow you to walk through a building, and then have a 3D model for various other uses, such as reverse engineering blueprints.

    This would be great technology to have on any Mars lander, or even just to analyze the data sent back.
    • A cool application (I haven't seen if they've done this yet) is rendering in OpenGL the internal view of what the robot eyes see. It would allow you to walk through a building, and then have a 3D model for various other uses.

      There are some commercial apps (I'm thinking of something from RealViz, but I can't remember the name) that do something like this-- generating 3D models out of a number of photos of an object-- but they require a user to set a number of control points before processing. The number of control points required ranges from the tiresome through the annoying all the way up to the silly.

      Doing it automatically (or at least semi-automatically) would be a pretty serious upgrade to that software.
      • On a related note, see this: [], a set of gpl'd tools for making, among other things, panoramas. The part that relates to this story is PTStereo & PTInterpolate. I got the programs and some example images (pictures of a bmw from different angles), set some control points, and PTInterpolate extracted a 3d model from it, applied the textures from the picture, and rotated the car from one angle to another. It was quite astounding once you see it. I can't really do it justice...if that kind of thing interests you then check it out.
    • What I would like to know is how they plan to deal with specular reflection. Even the human eye has to deal with it.
  • by shymog ( 531012 )
    Why would we want or need computers to 'see' in 3d?
    Sure, it might be of an advantage to people who need their computer to read their lips. All in all, though, it is of very little use.

    I can see it now, webcams that have stereoscopic input. A whole new breed of porn/camwhoring'll rear its useless, ugly head.

    Our computers don't need to see us. We don't need THEM telling us we're all ugly.
    • Why would we want or need computers to 'see' in 3d? Sure, it might be of an advantage to people who need their computer to read their lips. All in all, though, it is of very little use.

      Oh, you small-minded moron. See my post elsewhere in the thread about mensuration. There is more image processing going on in the scientific, commercial, and government worlds than you realize.
    • For instance, this could be used in a car to help the computer tell the difference between the SUV in front of me that is braking and the SUV on the ad in front of me...

      Could also help the car find its place on the road, because 3D would allow positioning...

      Could have REAL Biometrics

      Could have Real 3D Pron...

      Uh, I think my brain just took the wrong turn here 8)
  • ...will be a terminator device. Hook this library up to stepper controlled aiming device and your burglar alarm can now be lethal. Imagine the possibilities -- your nosey relatives walk into your workshop and four .30 cal machine gun replicas suddenly swing into position, targeting said relative... As they walk, they're tracked. Perhaps mount a camera and superimpose terminator-like vision crosshairs on them...
  • Application Idea (Score:3, Interesting)

    by Rick the Red ( 307103 ) <> on Tuesday December 18, 2001 @07:56PM (#2723654) Journal
    If anyone has the time to develop this, I'd love a program that takes the inputs from two cheap (affordable) webcams mounted on a board (say, 6" or so apart) and digitizes what it sees as 3D files.

    What's the use? Well, besides the obvious uses in architecture, etc., how about being able to play (insert favorite 1st person shooter game here) in your house?

    Geeze, now that I think about it, maybe this isn't such a good idea []. What would JonKatz [] think?
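For a two-webcam rig like the one proposed above, the core relationship is the standard pinhole stereo formula: depth = focal length × baseline / disparity. The sketch below assumes rectified, parallel cameras; the 700-pixel focal length and the 35-pixel disparity are invented example values, and only the 6-inch baseline comes from the comment.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by two parallel, rectified cameras.

    disparity_px is how far the point shifts horizontally between the
    left and right images; a larger shift means a closer point.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


BASELINE_M = 6 * 0.0254   # 6 inches between the webcams, in metres
FOCAL_LENGTH_PX = 700.0   # assumed focal length in pixels (needs calibration)

depth_m = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, disparity_px=35.0)
```

Because depth is inversely proportional to disparity, a short baseline like this resolves nearby objects far better than distant walls, which matters if the goal is digitizing a whole house.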

    • JonKatz wouldn't be interested - he's seen it all before.

      His friend Junis in Afghanistan has had this system running for years on his C64.
  • by eyefish ( 324893 ) on Tuesday December 18, 2001 @07:57PM (#2723660)
    Too bad /.'ers here (at least the first 18 posts) don't see the benefits of this.

    How about a way to have a PC recognize the position of your fingers and hands. You could use this to manipulate shapes in 3D in a 3D rendering and animation program WITHOUT SPECIAL GLOVES. You'd simply gesture into something like 3DStudioMax, Lightwave, or Caligari TrueSpace and create shapes by molding them with your fingers.

    Or wouldn't it be cool to develop a "hand gesture API" which you could use to say play a karate game??? Think about a 3D Bruce Lee in front of you kicking, and you moving your OWN hands in front of the monitor to block it (and if you wear those cool 3D shutter glasses now common on graphics cards you will essentially have a low-budget VR system).

    Or how about a driving game where you use no steering wheel but rather simply move your hands IN THE AIR. The game could be smart enough to recognize when you shift your hands away from an invisible steering wheel to grab an invisible gear stick on your side.

    Or how about a tool to allow people like Stephen Hawking to gesture expressions and small movements in the air and have the computer react to these actions (like moving a wheelchair around, turning lights on and off, calling on the phone, changing TV channels, etc).

    Think about the possibilities!!!

    • I know... how about a mouse... without the mouse? Just use a web cam (or two) mounted to look down at a 'mouse pad' (some color that contrasts with the color of your hands), and then have it track the motion of your hands.

      What about mouse clicks???

      Tap of the finger?

      Hmm, interesting concept anyway.

      I'd like to use this to analyze pictures taken from a model airplane to create 3d plots of the ground contour!
    • Given that Intel wants this to be available at a consumer level in the near future, why couldn't people create robots with the ability to map terrain in front of them? Hmm...
    • > You could use this to manipulate shapes in > 3D in a 3D rendering and animation program > WITHOUT SPECIAL GLOVES. You'd simply gesture > into something like 3DStudioMax, Lightwave, > or Caligari TrueSpace and create shapes > by molding them with your fingers. That's already been invented. You don't need 3D imaging, you don't need 3DS MAX and you don't even need a PC. You just need a piece of clay.
    • (sorry for the formatting, here it goes again)

      > You could use this to manipulate shapes in
      > 3D in a 3D rendering and animation program
      > WITHOUT SPECIAL GLOVES. You'd simply gesture
      > into something like 3DStudioMax, Lightwave,
      > or Caligari TrueSpace and create shapes
      > by molding them with your fingers.

      That's already been invented. You don't need 3D imaging, you don't need 3DS MAX and you don't even need a PC. You just need a piece of clay.
  • Sounds like CMU (Score:2, Insightful)

    by quinto2000 ( 211211 )
    Sounds like a project being worked on by Carnegie Mellon University researchers. I know CMU has a close relationship with Intel. Anyone know any more about the connection to this research?
  • Uh... what happened to the Slashdot policy of "check them dar.. links" before posting, and what happened to the editors checking and reading the links before passing a post on to the masses?
  • So now when I flip my computer the bird, it can delete my porn, right?

  • This is pretty neat, but reminds me of something from a few years ago.

    I interviewed with Intel and during the interview they said in no uncertain terms that they were actively trying to keep people upgrading their systems, and hence keep the dollars rolling in. At the time, the interviewer said that the technique was largely by helping Micro$oft to keep new OSs coming that required more and more horsepower to run properly.

    This is very cool in its own right (or could be, I haven't looked at it completely), but strikes me as another way they can push that curve...
  • This is great!

    Now the field of Stereoscopic software (I know, I never even thought it existed) could open up thanks to the pushing of a sorta-powerful company.

    Hey, at least we'll see an open standard file format.
  • whoever thinks this is useless is an absolute idiot. you could improve security by having computers recognize the users' face, or how about making a model airplane that can fly itself (or at least be smart enough to know when it's about to hit the ground), or perhaps combine this with a servo-controlled gun? or perhaps create instant 3D models of whatever you want (substitute lara croft with your girlfriend? have her running around your apartment complex?) or combine this with some robotic arms and have it make sculptures out of something you modeled on the computer? the possibilities are endless, and this kind of technology -interfacing computers with the real world- will be the driving force of our next big economic boom. -william
  • In my windowmanager I want to switch from click to focus to 'focus follows eyeballs'. I don't even think you need stereoscopic code to handle it. It seems possible, I just don't know how to do it.
  • WARNING: When using this software, do NOT talk about disconnecting your computer in front of it if you plan to operate heavy machinery in outer space afterwards.
  • on intel's part. There aren't many applications these days that require a lot of cpu power, but I bet this library does. So intel just gives app writers a little push so that the demand for faster intel processors does not slow down. Heck, it's a win-win for the consumer (rare these days).
  • how about using this to improve video compression quality? make a codec that correctly identifies the more important objects in a scene (ones closer to the camera, ones that are moving) and use more of the bandwidth for those objects, less on things like the background, or plants, or whatever. or create 3D animations from reality that rival the quality (but not the size) of raster-based video.
  • by Mike_L ( 4266 )
    Voice recognition is the next major advancement in computer user interfaces. Lip reading will increase the accuracy of voice recognition software. It is exciting that Intel is furthering the field of cybernetics.

    I look forward to the day when I can dictate to my PC by just mouthing the words. Voice recognition and touchscreens will save the office worker from Repetitive Stress Injury and Carpal Tunnel Syndrome. Lipreading will make voice recognition practical for large offices and many other areas.

  • You see, I'm a mumbler. That's right, I talk to my machines all day long...coaxing, cajoling, cursing, crowing...I do it all.

    I'd hate for my corporate desktop to be recording everything I'd say for the posterity of HR - like my emails are today.

    Even worse, imagine if my machine decided to take all of that personally...*shudder*
    • I'd hate for my corporate desktop to be recording everything I'd say for the posterity of HR

      Chances are pretty good that when you need to swear profusely, your computer would be rebooting after a crash (and therefore would not be running lipreading software).

      On the other hand, recording for posterity everything HR says can be very entertaining.

  • Combine this with Magic Lantern and the FBI can "read" your conversations too. :)
  • Good, but not new. (Score:4, Insightful)

    by cosyne ( 324176 ) on Tuesday December 18, 2001 @10:38PM (#2724326) Homepage
    While I have to say that Intel's OpenCV library rocks (for a number of reasons), stereoscopic vision is nothing new. The CNN article is more or less crap ("Until today, computer vision applications has been restricted to two dimensions"? nice try...) It's a mishmash of reporter hype and stock text which describes computer vision in general ("Over the next 5 to 10 years, Intel Corp. expects computer vision to play a significant role in simplifying the interaction between users and computers"). The Sussex Computer Vision Teach Files [] page has a reasonable description of stereoscopic vision [] from 1994. Lip reading is not really a 3D problem, so stereoscopic capabilities aren't going to help much. Many of the other uses (3D environment modeling, object modeling and recognition, etc.) are being worked on (again, the algorithms aren't new, this is just a new open source implementation), but they're not easy.

    I don't mean to sound pessimistic, though. OpenCV is really cool, both as a corporate contribution to open source, and as a programming library even if you never look at the code. And the Matlab interface means fewer MSVC++ sessions which end with me feeling homicidal ;-) The inclusion of stereo vision will be cool for people trying to write vision applications, but it's not advancing the state of the art.
  • by Bowie J. Poag ( 16898 ) on Tuesday December 18, 2001 @10:38PM (#2724330) Homepage

    Here's how you make stereoscopic images with a digital camera:

    1) Take a picture like you normally would, but be mindful of the position and angle of your camera.

    2) Snap a picture.

    3) If the subject you're photographing is close to you, take a small step to the right. If the subject is far away, take a large step to the right.

    4) Aim your camera at the subject and photograph it again.

    5) Pull up both images in the photo editor of your choice.

    6) Arrange the photos side by side. The first image you took should be on the left, the second image you took should be on the right.

    7) Sit directly in front of your monitor, and blur your eyes. If you can't blur them, try crossing them slightly. Try to focus on "the picture in the middle". If you still can't do it, hold up a pencil (eraser-side up) exactly halfway between your eyeball and the screen. Focus on the eraser. The image on the screen should pop out at you in stereoscopic 3D.
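Steps 5 and 6 can also be done in code rather than in a photo editor. The sketch below is only illustrative: it treats each image as a plain list of pixel rows, and the helper name `side_by_side` and the `gap` and `fill` parameters are assumptions, not part of any particular imaging library.

```python
def side_by_side(first, second, gap=2, fill=0):
    """Join two equal-height images into one stereo pair.

    Each image is a list of rows of pixel values. The first shot goes
    on the left and the second on the right, separated by `gap`
    columns of `fill` pixels.
    """
    if len(first) != len(second):
        raise ValueError("images must have the same height")
    spacer = [fill] * gap
    return [row_a + spacer + row_b for row_a, row_b in zip(first, second)]


# Two tiny 2x2 "photos" joined with a one-pixel separator.
pair = side_by_side([[1, 2], [3, 4]], [[5, 6], [7, 8]], gap=1)
```

With a real imaging library the same idea applies: create a canvas wide enough for both frames and paste the first shot on the left and the second on the right.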

    For some good examples of stereoscopic images I took, go here. [] Try the picture of the steering wheel first... It's really easy. You'll also see a number of stereo photos of Tumacacori, an 18th century Spanish mission that got the shit beat out of it by Native Americans. You'll also find another picture that's rather interesting: it's a downward view of a deactivated nuclear missile still in the silo at the Titan Missile Museum outside of Tucson. The view extends about 20 floors below ground. If I were to have taken this photo in 1981 versus 2001, I would have been shot on sight. :)

    • I have been to San Xavier outside of Tucson, and your pics brought back memories - they are excellent. Now I need to try some myself (not that I didn't know about the technique, but I have never used it before)...
    • It's a downward view of a deactivated nuclear missile still in the silo at the Titan Missile Museum outside of Tucson. The view extends about 20 floors below ground. If I were to have taken this photo in 1981 versus 2001, I would have been shot on sight. :)

      "Most things in here don't react too well to bullets."
    • It reminds me of something I tried many years ago, before digital photography was feasible:

      While raytracing on my 386 25MHz(!) using POV and its (then) text interface, I tried doing exactly the same thing, and the results were fantastic - take your favorite scene, and re-render it with the camera moved to the right. I got some spectacular results - I don't have them still, but I somehow doubt they'd look so spectacular anymore.

      On a side note, your pictures are great, just a shame my monitor is large enough to make viewing them properly virtually impossible (going cross-eyed is a poor substitute) Guess I'll have to hike up the res...
    • A better way, and the way I learned to look at stereo images (divergence/parallel, not convergence/crossed), is to place your index fingertips together in front of your face, about 2 feet away, then look far away, ignoring your fingers. Quickly switch your attention back to your fingers without changing the position of your eyes, and you will see a wiener (hot-dog-looking) thing floating between your fingers as an illusion. Try to keep it that way as long as you can, and even try changing its size by moving your eyeballs. Eventually you will be able to do it just by thinking about it (it's like looking far away without focusing far away).
  • Point Grey Research [] has been offering something similar for a few years now. Even runs on Linux. It's not free, but it has a track record. There's a downloadable demo.

    Point Grey likes to use three-camera systems, with the cameras arranged in a triangle. This eliminates most ambiguities found with two-camera systems.

    Algorithms to do this have been around for years, but only in the last few years has it become possible to do it in real time on commodity processors. Hans Moravec was the first, almost 30 years ago, back when it took him 20 minutes of mainframe time to process a stereo image. Point Grey was selling a DSP-based solution a few years ago. Now you can do it on consumer hardware.

    Mobile robots should be getting much better shortly. Systems based on Polaroid sonars have the resolution of probing the world with the big end of a broom. Laser rangefinders cost way too much and have moving parts. Millimeter wave radar is complicated to use as an imager (although it opens most supermarket doors in the developed world.) Affordable, fast vision is finally here.

  • It sounds like it could have quite a few different uses. Can you imagine using this for a security system, perhaps when combined with face-recognition software? They would make a good combo.
  • Yeah, a couple of web cams for target acquisition and tracking, two torquey servos to aim the Nerf blaster, one more to pull the trigger, this software to track the perps, and whammo! I could even calculate trajectories for the Nerf darts (within their uncertainty, which is large), and give the target a few seconds to yell out the password in case they don't want to get shot. Very cool. Automatic cubicle defense systems - I could make a fortune here!
  • by robbo ( 4388 )
    Stereo vision algorithms have been around for years, and I suspect that OpenCV implements some of the more common published methods. We understand the image formation process pretty well now and working with a calibrated stereo head is easy. Taken one step further, improvements in automatic camera calibration and cpu speed have led to nearly real-time 3d reconstruction from a monocular video stream (where the camera is moving through the scene).

    Actually, IMHO, pure monocular vision is a more interesting (and challenging) problem-- it's pretty clear that human stereo vision is an exercise in redundancy, since we can do pretty well with one eye closed, not to mention the fact that we perceive all kinds of 3d structure in 2d contexts (like your favourite pr0n- umm, quake screenshot ;-). The fundamental question is how do we interpret 2d images into 3d models (or whatever representation we use in our heads)? This is a distinctly different (and more difficult) problem from building a 3d model from a motion sequence or stereo pair.
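For readers who have not seen one of the common published stereo methods mentioned above, here is a toy sketch of block matching along a single scanline using the sum of absolute differences (SAD). It assumes rectified images, so that a feature at column x in the left row appears at column x - d in the right row. It is not claimed to be what OpenCV implements, and all names are hypothetical.

```python
def best_disparity(left_row, right_row, x, half_window=1, max_disparity=4):
    """Return the shift d that best matches the window around x.

    Compares the window centred on left_row[x] with the window centred
    on right_row[x - d] for each candidate d, and keeps the d with the
    lowest sum of absolute differences.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        # Skip shifts whose window would fall off either end of the row.
        if x - d - half_window < 0 or x + half_window >= len(left_row):
            continue
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# The bright feature sits at column 5 in the left row and column 3 in the right.
left = [10, 10, 10, 10, 50, 90, 50, 10, 10, 10]
right = [10, 10, 50, 90, 50, 10, 10, 10, 10, 10]
disparity = best_disparity(left, right, x=5)
```

Running this for every pixel gives a disparity map, which a calibrated rig can convert to depth. Flat, textureless regions match equally well at many shifts, which is one reason real implementations are more involved.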

  • ...if only my Intel webcam was supported. Intel isn't supporting it. *!#$@!!
  • by grmoc ( 57943 )
    Stereoscopic vision is a very VERY useful thing for all things which perceive their environment through visual apparatus.

    Huh? What do you mean? Well, close one eye (or put an eyepatch on) and look at your flat world. How far away is that streetlight? Hmm. How tall is that man? Hmm..

    Of course, we as people are much MUCH MUCH better at perceiving (interpreting) our visual environment than computers are. Humans generally have little trouble correctly perceiving things even through partial occlusions, changes in scale, orientation, distortion (glasses might make you able to see, but straight lines become anything but..) and changes in intensity and color.

    Being able to get 3d information about objects aids greatly in interpreting what they are.
    An image (2d) of a hand is (almost always) full of occlusions. These occlusions are difficult to interpret in 2d because the edges in 2d have less differentiation than a depth-map would. (The distance metric is less ambiguous!)

    Wouldn't you like your computer to interact with you as if it was (a very obedient) human? This helps.
  • I remember watching an article on television about 12 years ago about the amount of effort it took to program/teach robots in factories how to operate within their environment. Back then imaging for computers was very primitive, and it has probably evolved a lot since then.

    As the presenter put it "take a pair of sunglasses, smear them with light machine oil, put on the sunglasses, put on thick gloves, and then use a pair of chopsticks (alone) to try and pick something up". It was not clear at the time if they were using anything stereoscopic or not. However it was shown that video inputs for the computers were from multiple angles.

    One of the robots (an arm) that was under development was made for use in a car factory to perform spot welding. One of these things could easily punch a hole through the side of a car, so distance measurement was important at that time.

    How much this has progressed since is unknown (to me at least). However, the technology featured was specialized and not necessarily within the price range of something you can just grab off the shelf.

    Hopefully this imaging technology will give someone who wants to make their own robot a head start (somewhere). However, I would rather stay away from it if it has any of the "capacity" of the robot in the television segment that I saw.

  • I've been working with this library now for about 6 months. I'm not even sure why Intel chose to make a press release about it now - they didn't even upload any new code! (Maybe they will soon). Hmm. Maybe there is a computer vision conference coming up?

    This is a super library for computer vision. The best I've seen. Highly optimized. Goes way beyond anything out there in the Open source domain.

    Sadly, it seems to me that many of the /.ers who read the report didn't go and look at the site or see what the library can do.
  • Intel did this in response to my post here (on April 26th of last year!), in which I outlined just this kind of program, but mentioned "All this is very processor-intensive, but so far it's very straight-forward." So, of course, Intel releases it openly. :)))
  • One of the big headaches in visual effects is integrating CGI objects into a plate with live actors. Only two methods are really used. The first is blue/green screening to get a matte. The second, for when a blue/green screen wasn't available or practical (or the on-set crew just *fucked up*: they lit it wrong, or put it in the wrong place while the on-set VFX supervisor was at the catering table eating doughnuts, or any number of other blunders), is rotoscoping, which is a fancy word for tracing around an actor or part of the set, frame by damn frame. Extra fun when you have to roto around an actor with wispy hair...

    The other big headache is tracking a camera move. You basically feed the footage into a camera tracking program and define tracking points in the image: features in the frame which the computer follows as they move. The software uses a lot of maths to work out where in 3D space these 2D points are and creates a CG camera in your 3D app to match the move. You usually then have to build a rough proxy version of the set in 3D to go with it (unless the production has the bucks to spend and they get a LIDAR scan of the set). THEN you finally get to start putting in your 3D elements, that is if you haven't shat yourself and run out of the building screaming after a week or so of staring at the same bloody footage 10-12 hours a day...

    Ahem... anyway - where this new system could come in useful is using depth perception to generate a z-buffer, which would allow the computer to isolate foreground and background objects. No need to blue screen; you can just point and click on an actor to get a matte. Tracking would be made easy (er, easier, anyway) as you have an actual 3D plate to work with. Feed it to one of those programs that can auto-model 3D geometry from photos and you get your proxy set for free too...

    Big blue screen shoots are tough on actors, just ask anyone that worked on one of the new Star Wars movies. They have to spend hours waiting for the screens to be set up and lit, and the choreography of shooting a scene with a digital character is painful to learn, not to mention the hassles and expense of shooting with a motion control camera. Not only would a system like this speed up production, but presumably with a real-time z-buffer being generated the cast and crew could interact with lo-res versions of the CG characters in real-time on monitors to get a better feel for what they are doing.

    In fact, as a wider application, once we all have depth-perceiving videophones you could matte in any image you want behind yourself - great for phoning in sick from the beach :)
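
The depth-keyed matte idea above can be sketched in a few lines. This is only a toy illustration, not anything taken from the Intel library: the names `depth_matte` and `composite`, the list-of-rows frame layout, and the near/far thresholds are all assumptions made up for the example, and a real system would have to cope with noisy depth and soft edges (wispy hair again).

```python
def depth_matte(depth, near, far):
    """Return a binary matte: 1 where near <= z <= far, else 0.

    `depth` is a list of rows of per-pixel distances (a hypothetical
    z-buffer); `near` and `far` bracket the actor's depth range.
    """
    return [[1 if near <= z <= far else 0 for z in row] for row in depth]


def composite(foreground, background, matte):
    """Keep foreground pixels where the matte is 1, background elsewhere."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, matte)
    ]


# Toy 2x3 frame: the "actor" stands about 2 units away, the wall about 5.
depth = [[5.0, 2.1, 5.0],
         [5.0, 1.9, 2.0]]
plate = [["w", "A", "w"],
         ["w", "A", "A"]]
beach = [["b", "b", "b"],
         ["b", "b", "b"]]

matte = depth_matte(depth, near=1.5, far=2.5)
result = composite(plate, beach, matte)
```

Here the wall pixels ("w") fall outside the chosen depth range, so the composite keeps only the actor pixels ("A") over the beach background, with no blue screen involved.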

  • The Intel computer vision library is not the only such resource available. The TINA machine vision system has been developed since 1986 and provides functionality for the machine vision researcher at both the infrastructure level (data structures and functions for an enormous range of mathematical, statistical and image processing tasks) as well as state-of-the-art solutions to many machine vision problems. These include low-level feature extraction, robust primitive fitting, object tracking, 2D object recognition and 3D object location. Indeed the stereoscopic subsystems in TINA (PMF, Stretch Correlation) have been viewed for many years as the standard for edge-based stereo. TINA is almost unique as a resource and living archive of over 70 man-years of research and over 200 peer-reviewed publications in machine vision and medical image analysis. Functionality in TINA has practical utility in several industrial contexts.
    For the past 5 years TINA has been provided as open source under an LGPL license and development is now based at the University of Manchester, UK.

    Whilst I am very pleased that Intel recognise the importance of machine vision research, and can only commend them on their open source approach, I have some reservations regarding the use of OpenCV by the research community at large. Certainly their motives are business orientated (and one cannot argue with this). As a result, however, the contents of their library are ultimately dictated by what Intel want, not necessarily by what the research community might need or indeed what is even possible (such as dense estimates of stereo).

    Open Source software is vital in research disciplines where there is a significant software component. What better way to disseminate your results than to encapsulate your entire experimental apparatus in a tar file! Why should others in the field waste time reimplementing your algorithms (probably incorrectly) in order to duplicate your results, a process which sits at the very heart of any scientific endeavour?

    TINA has recently received direct funding from the European Union for development as the open source environment for machine vision and medical image analysis research. For more details, visit the TINA website.

    Sorry to rant a bit but it is not often I read something on here that I know so much about!
  • I work in vision robotics as a daily task, using cameras looking down to determine an x-y plane for positioning parts to a tolerance of 0.0005 inch, then placing parts on that targeted subject. One of the only problems is being able to determine the height of said subject. Currently we use mechanical means of taking in variances of two thousandths of an inch. This makes me want to get my calculator out and determine the angle between the cameras that would be required to make the z height determination that would enable me to know the height to 0.0005 inch. Yes, there are applications for this!
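
For anyone else reaching for the calculator, here is a rough back-of-envelope sketch of that angle question. It rests on assumptions that are not in the post above: two cameras tilted symmetrically at plus and minus a half-angle from vertical, orthographic (telecentric-like) views, and a `lateral_res` equal to the smallest relative image shift between the two views that can be measured, expressed in inches on the part. Under those assumptions a height change dz shifts the image by dz * sin(half_angle) in each camera, in opposite directions. The function names are hypothetical.

```python
import math


def height_resolution(lateral_res, half_angle_deg):
    """Smallest height change two cameras at +/- half_angle can resolve.

    Assumes orthographic views: a height change dz shifts the image in
    each camera by dz * sin(half_angle), in opposite directions, so the
    relative shift between the views is 2 * dz * sin(half_angle).
    """
    return lateral_res / (2.0 * math.sin(math.radians(half_angle_deg)))


def required_half_angle_deg(lateral_res, height_res):
    """Half-angle (degrees from vertical) needed for a target height resolution."""
    ratio = lateral_res / (2.0 * height_res)
    if ratio > 1.0:
        raise ValueError("target is finer than this geometry can resolve")
    return math.degrees(math.asin(ratio))


# Target from the post: resolve height to the same 0.0005 inch as x-y.
angle = required_half_angle_deg(lateral_res=0.0005, height_res=0.0005)
```

With matching lateral and height targets this works out to a half-angle of about 30 degrees per camera, or roughly 60 degrees between the two. Perspective lenses, calibration error and sub-pixel matching would all change the real answer, so treat it as a starting estimate only.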
