But it works by having the phone do speech recognition while being held at arm's length. That way you can have multimodal communication: not simply speech replacing pointing, but the two working together, using each modality for what it's good for.
Here's a link to an article:
http://blog.wired.com/gadgets/2008/07/att-developing.html
Using the phone's accelerometer is a great idea. In AT&T's demo you need to "click to talk", which makes sense for their design, but the accelerometer trigger is pretty nifty if you just have speech responses. The display is still good for many things, though, e.g. maps and long lists. I'm thinking it could be a pain to have to hold the phone up to your ear over and over: hold it up to your ear, speak, look at the display, (speak again if something was misrecognized), (possibly click something), hold it up to your ear, speak, look at the display, etc.
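For what it's worth, the accelerometer trigger could be modeled as a tiny state machine: open the mic when the phone is raised to the ear, close it when it's lowered toward the display. This is just a hypothetical sketch; the sensor readings, thresholds, and names are all made up, not from any real phone API.

```python
def is_at_ear(pitch_deg, proximity_near):
    """Treat the phone as 'at ear' when it's tilted roughly upright and
    the proximity sensor reports something close (the user's head).
    Thresholds here are illustrative guesses."""
    return 45 <= pitch_deg <= 135 and proximity_near

class TalkGate:
    """Toy state machine: listening turns on when the phone reaches the
    ear and off when it's lowered to look at the display again."""
    def __init__(self):
        self.listening = False

    def update(self, pitch_deg, proximity_near):
        at_ear = is_at_ear(pitch_deg, proximity_near)
        if at_ear and not self.listening:
            self.listening = True    # raised to ear: start recognizing speech
        elif not at_ear and self.listening:
            self.listening = False   # lowered: stop and show results on screen
        return self.listening
```

So the raise/lower cycle in the interaction above would just toggle this gate on each sensor update, with no button press needed.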
--
Why procrastinate now when you could procrastinate tomorrow.