Vista Speech Recognition Goes Awry
An anonymous reader writes "It seems even MSNBC is willing to take a jab on those rare occasions when Microsoft products don't work. During a demo of Vista's speech recognition technology, Vista couldn't differentiate between mom and aunt, and all attempts to rectify the problem just made it worse. Wait until you see what it spat out, I think we have a new 'All your base.' Don't you just love Microsoft's live demonstrations?"
The Voice of Experience (Score:5, Insightful)
Experience is the human quality that enables you to recognize a mistake immediately when you make it again.
Dacap
Re:Awww...c'mon guys.... (Score:5, Insightful)
This was really a dreadful presentation. There was no ambient noise (as the commentators say later, and despite what Microsoft says), and there was no echo as the demonstrator claims during the actual test. It seems to have been done under really good test conditions, but still it failed miserably.
Re:Awww...c'mon guys.... (Score:5, Insightful)
Voice recognition requires some training regardless of who provides it. We're not Star Trek here... Prep work and rehearsal, people. If Mr. Sales Guy had tried the demo before the presentation he would have noticed it wasn't working and avoided the embarrassment.
This is why sales people are asshats. They're unprofessional, non-technical people who sap back the high life while the rest of us have to put up with the mess they create through their daily barrage of verbal diarrhea.
Tom
are u serious? (Score:1, Insightful)
As if MS is the only one who has problems with demonstrations. This is the problem with anti-MS guys... it's not that MS is perfect, it's just that the zealots are blinded by hate.
Re:are u serious? (Score:4, Insightful)
It's called modesty. If MSFT had any [and some humility] they wouldn't get laughed at so hard for this. I mean look at Linux. Find a bug in the kernel, fix it, post notices that it's fixed. You don't see anyone saying "Oh hahaha, Linus is at it again!" That's because you also don't see Linus on CNN mocking the rest of the world.
Microsoft deserves all the negative press and humiliation they get because they are shameless, deceitful, greedy monopolistic bastards.
Tom
Re:are u serious? (Score:2, Insightful)
The reason I find this eminently amusing is that Microsoft is a company built on marketing. At no particular point has Microsoft had "The Superior Technical Solution"; they have always had luck and better marketing. Since DOS 3.3 there have frequently been products that were more stable, faster, easier to use - you name it. And Microsoft's captains have beaten them in two ways: Marketing and Money.
So of course when a company that has built its foundation on marketing flubs it, it's more amusing than when a company that has built its foundation on performance of one kind or another flubs it. It's inescapable that the Big Dog gets more scrutiny than the Contender, anyway. And Microsoft apologists should understand that.
Re:Is SR ever going to be good enough? (Score:4, Insightful)
Re:Is SR ever going to be good enough? (Score:5, Insightful)
For example, how does the computer know that Picard wants to call Riker and isn't just talking about him? Oh and keep in mind the computer never misinterpreted something. In other examples, people would carry on intelligent conversations with the computer - all those holodeck scenes, Troi ordering chocolate, etc.
Star Trek-style of SR I think would be the holy grail and is probably always going to be out of reach. Barring some amazing breakthrough in AI algorithms, the computer power required just for the situations above would be incredible - and that's computer time that probably could be put to better use elsewhere, even if it was found to be possible.
I think the computer in the original Star Trek was more realistic - but even there the voice-recognition was far beyond what we're capable of today, as Microsoft has demonstrated so well. Plus all the blinkenlights that seemed to have no useful purpose were cool.
Re:Awww...c'mon guys.... (Score:5, Insightful)
For instance, the word "patent" is pronounced differently in the UK from North America. In the UK it is "pay-tent" and over here it's "pah-tent". That's just one example.
Point is [to paraphrase ballmer]:
Preparation (clap), preparation (clap), preparation (clap), preparation (clap), preparation (clap), [pitch of voice higher], preparation (clap), preparation (clap), [wheeze out of breath, pitch even higher], preparation (clap), preparation (clap), yeah!!!
Something tells me this sales guy will neither get punished nor lose his x-mas bonus. Some poor schmuck in engineering will take the fall for not making the demo "people ready".
Tom
removing ambient noise (Score:5, Insightful)
Why not just use two mics: one to record the ambient noise (positioned away from the voice mic), the other to record the voice (headset)? Then, as you have two signals, just subtract the ambient noise signal from the headset signal. Voila, clean headset mic audio.
Works for music too: you could control your music player by voice even when it's playing loud (at a party) by removing the music signal from the mic signal.
-AJS
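For anyone who wants to play with the two-mic idea above, here is a rough sketch in Python. It is not how Vista or any shipping product does it. It assumes the ambient mic picks up only the noise, that the noise reaches the headset as a scaled copy with no delay or echo, and that the voice is uncorrelated with the noise; real rooms break all three assumptions, which is why practical systems use adaptive filters instead of a plain subtraction. The function name `subtract_ambient` and the synthetic tones are made up for illustration.

```python
import math


def subtract_ambient(headset, ambient):
    """Remove the part of `headset` that is a scaled copy of `ambient`.

    The scale factor is the least-squares gain that best explains the
    headset signal in terms of the ambient signal.
    """
    if len(headset) != len(ambient):
        raise ValueError("signals must have the same length")
    energy = sum(a * a for a in ambient)
    if energy == 0.0:
        # A silent ambient mic gives nothing to subtract.
        return list(headset)
    gain = sum(h * a for h, a in zip(headset, ambient)) / energy
    return [h - gain * a for h, a in zip(headset, ambient)]


# Synthetic demo: a low tone stands in for the voice, a higher tone
# for the party music (both are assumptions, not real recordings).
N = 800
voice = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
music = [math.sin(2 * math.pi * 50 * n / N) for n in range(N)]

# The headset hears the voice plus the music at half volume.
headset = [v + 0.5 * m for v, m in zip(voice, music)]

cleaned = subtract_ambient(headset, music)
residual = max(abs(c - v) for c, v in zip(cleaned, voice))
print(f"largest leftover error: {residual:.2e}")
```

On this idealized input the leftover error is down at floating-point noise. Add a few samples of delay between the two mics and the single gain stops working, which is the practical catch with the simple version.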
Re:are u serious? (Score:5, Insightful)
Hmmm, no. Maybe it's the way they deal with failures. Remember Bill Gates trying hard to demonstrate the Media Center [google.com]? Some time after that, Steve Jobs was giving his regular Macworld keynote when his Mac stopped responding. He moved a monitor switch to continue the presentation on another Mac and said: "Well, that's why we have backup systems here."
Re:Is SR ever going to be good enough? (Score:3, Insightful)
As for ambient noise, there's often useful contextual information there too. Ambient noise can give information about where the speech is occurring and about what is happening at that location. In some rare cases the ambient noise might even be responsive to the speech itself. The audience laughing in the example was a clue that a) an audience is watching, and b) the system made a mistake. A human would have recognized that and used it to advantage. For a speech recognition system to work as well as a human, it will not only have to get better at separating speech from ambient noise, but it will need to be able to recognize the ambient noise.
Microsoft Innovation (Score:3, Insightful)
Re:Awww...c'mon guys.... (Score:5, Insightful)
Aw, c'mon; how many English dialects pronounce "mom" and "aunt" similarly?
Even to someone who's worked with voice recognition, that mistake simply isn't credible. If the software were anywhere near usable, it wouldn't confuse those words from anyone, especially not in a low-noise, no-echo demo.
This is a "No excuses" situation. That demo was simply a dismal failure due to some major bug(s).
Of course, the speech recognition field has a long history of staying in such a state forever. It's hard to find a product that, even with extensive training, doesn't produce howlers like this.
I did like the "killer" part
Re:Awww...c'mon guys.... (Score:5, Insightful)
Chances are he never even did a walk through of the presentation before the press was there.
Tom
This sounds so much like Microsoft (Score:4, Insightful)
Yes, bugs happen; yes, Vista is still in beta; but rather than just admit "Vista is still a buggy piece of crap software that can't even be used properly by its own engineers" they tell us to sit and wait because we can trust them to fix it.
To MS's credit, it is a strategy that works.
Re:Awww...c'mon guys.... (Score:4, Insightful)
I seriously doubt this presentation was rehearsed. At the very least, they should have tested it in that room with that mic, etc. But in all honesty, this is going to be used by millions of people in all sorts of rooms with all sorts of mics. That shouldn't matter anyways.
Anyways, I doubt he prepared at all, that is, other than snorting cocaine off a mirror in the back room before the show.
Tom
Re:Hee hee (Score:3, Insightful)
Re:Oh Please (Score:5, Insightful)
Point is, if the sales guy had tried the system out beforehand he would have noticed it not working.
That is, suppose the code is total shit [I know, big stretch for MSFT]. Then isn't it likely it would have failed during the preparation stage? If you are saying "mom" and it always comes back "aunt" you may want to cancel the presentation.
That's why I think he didn't do any prep work for the presentation.
Tom
Re:Is SR ever going to be good enough? (Score:3, Insightful)
The fleet's computers have "known" Picard since he entered the service. They should be pretty well trained.
The communicator badges in TNG could be transmitting supplementary biometric data and non-verbal commands, which by now have become almost automatic: "Watson. I need you!"
Re:Is SR ever going to be good enough? (Score:3, Insightful)
The badge also indicates the location of the person. So if Picard says "Will" (or "number one", which is simply an alias that Picard made for "Riker, William T.") and the computer sees that Will isn't in the same room as Picard (or isn't within normal hearing distance), it simply connects the two via a communication channel.
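The rule described above is simple enough to write down. This is only a toy sketch of that one rule: the `Badge` class, the `ALIASES` table, the room names and the `route_utterance` function are all invented for illustration, and it sidesteps the hard part (actually recognizing the name in speech).

```python
from dataclasses import dataclass


@dataclass
class Badge:
    name: str
    room: str  # last location reported by the badge


# Hypothetical alias table: spoken nicknames mapped to crew records.
ALIASES = {
    "will": "Riker, William T.",
    "number one": "Riker, William T.",
}


def route_utterance(speaker, spoken_name, badges):
    """Decide whether a spoken name should open a comm channel."""
    target_name = ALIASES.get(spoken_name.lower(), spoken_name)
    target = badges.get(target_name)
    if target is None:
        return "ignore"  # nobody by that name is wearing a badge
    if target.room == speaker.room:
        return "ignore"  # within earshot, so no channel is needed
    return f"open channel: {speaker.name} -> {target.name}"


badges = {
    "Picard, Jean-Luc": Badge("Picard, Jean-Luc", "Ready Room"),
    "Riker, William T.": Badge("Riker, William T.", "Bridge"),
}
picard = badges["Picard, Jean-Luc"]

print(route_utterance(picard, "Number One", badges))
# prints: open channel: Picard, Jean-Luc -> Riker, William T.
```

Note this still can't tell "call Riker" from "talking about Riker while he's elsewhere", which is the original objection.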
Re:Is SR ever going to be good enough? (Score:3, Insightful)
Depends on your own context... I deployed (admittedly, an older version of) Dragon NaturallySpeaking in an office full of mobility-impaired employees. They found it much easier to spend 10% of their writing time fixing errors than 100% of it trying to, for example, type with the onscreen keyboard. If you can't use a keyboard, even crappy voice recognition is a godsend.
Re:Oh Please (Score:3, Insightful)
Or, at the very least, don't say 'mom'! I've had plenty of times where a salesperson tests something the day before a demo (usually after a week of knowing he had a demo, but that's a different rant) and finds something. Our usual response with that short of notice is 'well, don't show that' since we don't have enough time to fix it. At best, we could send a version that had the error suppressed but not truly fixed.