
Vista Speech Recognition Goes Awry 418

An anonymous reader writes "It seems even MSNBC is willing to take a jab on those rare occasions when Microsoft products don't work. During a demo of Vista's speech recognition technology, Vista couldn't differentiate between mom and aunt, and all attempts to rectify the problem just made it worse. Wait until you see what it spat out, I think we have a new 'All your base.' Don't you just love Microsoft's live demonstrations?"
This discussion has been archived. No new comments can be posted.

  • by dacap ( 177314 ) on Saturday July 29, 2006 @09:36AM (#15805434) Homepage
    Yes, once again Microsoft S/W Engineers learn that the more public the demo or the more important the audience, the more likely something will go wrong. It's one of Murphy's laws. Been there. Did that. Barely survived.

    Experience is the human quality that enables you to recognize a mistake immediately when you make it again.

    Dacap
  • by kripkenstein ( 913150 ) on Saturday July 29, 2006 @09:37AM (#15805439) Homepage
    Nothing to worry about, I'm sure they'll get all the kinks out by the time Vista is released - sometime in 2008 or so, it seems, based on this video.

    This was really a dreadful presentation. There was no ambient noise (as the commentators say later, and despite what Microsoft says), and there was no echo as the demonstrator claims during the actual test. It seems to have been done under really good test conditions, but still it failed miserably.
  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @09:41AM (#15805458) Homepage
    Most likely the system was trained by an engineer and handed off to the ass in marketing. He was probably supposed to train it to his voice too but decided to hit the bar instead.

    Voice recognition requires some training regardless of who provides it. We're not Star Trek here....Prep work and rehearsal, people. If mr. sales guy had tried the demo before the presentation he would have noticed it wasn't working and avoided the embarrassment.

    This is why sales people are asshats. They're unprofessional non-technical people who sap back the high life while the rest of us have to put up with the mess they create through their daily barrage of verbal diarrhea.

    Tom
  • are u serious? (Score:1, Insightful)

    by CDPatten ( 907182 ) on Saturday July 29, 2006 @09:46AM (#15805475) Homepage
    "Don't you just love Microsoft's live demonstrations?"

    As if MS is the only one who has problems with demonstrations. This is the problem with anti-ms guys... it's not that MS is perfect, it's just that the zealots are blinded by hate.
  • Re:are u serious? (Score:4, Insightful)

    by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @10:00AM (#15805526) Homepage
    Microsoft routinely touts their excellence over everyone else including OSS. Hear them talk about Office w.r.t. OpenOffice. They talk down about it, mock it, dismiss it, etc...

    It's called modesty. If MSFT had any [and some humility] they wouldn't get laughed at so hard for this. I mean look at Linux. Find a bug in the kernel, fix it, post notices that it's fixed. You don't see anyone saying "Oh hahaha, Linus is at it again!" That's because you also don't see Linus on CNN mocking the rest of the world.

    Microsoft deserves all the negative press and humiliation they get because they are shameless, deceitful, greedy monopolistic bastards.

    Tom
  • Re:are u serious? (Score:2, Insightful)

    by NixLuver ( 693391 ) <stwhite&kcheretic,com> on Saturday July 29, 2006 @10:04AM (#15805538) Homepage Journal
    Nah, that's not it. I don't hate "Microsoft"; that's just a name on a door somewhere. I don't hate 'the corporation'; corporations are not individuals, no matter what the law would have us believe.

    The reason I find this eminently amusing is that Microsoft is a company built on marketing. At no particular point has Microsoft had "The Superior Technical Solution"; they have always had luck and better marketing. Since DOS 3.3 there have frequently been products that were more stable, faster, easier to use - you name it. And Microsoft's captains have beaten them in two ways: Marketing and Money.

    So of course when a company who has built their foundation on marketing flubs it, it's more amusing than when a company who has built their foundation on performance of one kind or another flubs it. It's inescapable that the Big Dog gets more scrutiny than the Contender, anyway. And Microsoft apologists should understand that.
  • by CastrTroy ( 595695 ) on Saturday July 29, 2006 @10:08AM (#15805553)
    Who cares if they ever get up to Star Trek level? The technology still sucks. It's much quicker and less annoying to the people around you to just type on your keyboard. Sure, it has some uses, such as for those who don't have full use of their hands. We shouldn't abandon all research on the subject, because it does have its uses, but I don't think it's something worth pushing on the general population, especially before the technology is actually ready. People already don't like their computers; pushing buggy technology like this out will just increase the problem.
  • by Skater ( 41976 ) on Saturday July 29, 2006 @10:08AM (#15805555) Homepage Journal
    The computer in Star Trek (at least in the Next Generation) was WAY too smart. For it to do what it supposedly did in the show, it would have to be sitting there, monitoring the conversation all the time, and be totally able to understand the context of what was being said to know what to do. Not only when people directly asked the computer a question, but also when people wanted to converse with someone.

    For example, how does the computer know that Picard wants to call Riker and isn't just talking about him? Oh and keep in mind the computer never misinterpreted something. In other examples, people would carry on intelligent conversations with the computer - all those holodeck scenes, Troi ordering chocolate, etc.

    Star Trek-style SR, I think, would be the holy grail and is probably always going to be out of reach. Barring some amazing breakthrough in AI algorithms, the computer power required just for the situations above would be incredible - and that's computer time that probably could be put to better use elsewhere, even if it was found to be possible.

    I think the computer in the original Star Trek was more realistic - but even there the voice-recognition was far beyond what we're capable of today, as Microsoft has demonstrated so well. Plus all the blinkenlights that seemed to have no useful purpose were cool. ;)
  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @10:14AM (#15805577) Homepage
    Generally, from what I've seen you need to train it a bit on the way you speak. There are thousands of distinct English accents and pronunciation variations.

    For instance, the word "patent" is pronounced differently in the UK from North America. In the UK it is "pay-tent" and over here it's "pah-tent". That's just one example.

    Point is [to paraphrase ballmer]:

    Preparation (clap), preparation (clap), preparation (clap), preparation (clap), preparation (clap), [pitch of voice higher], preparation (clap), preparation (clap), [wheeze out of breath, pitch even higher], preparation (clap), preparation (clap), yeah!!!

    Something tells me this sales guy will get neither punished nor lose their x-mas bonus. Some poor schmuck in engineering will take the fall for not making the demo "people ready".

    Tom
  • by sh0rtie ( 455432 ) on Saturday July 29, 2006 @10:23AM (#15805611)


    why not just use two mics, one to record the ambient noise (positioned away from the voice mic), the other to record the voice (headset), then as you have two signals just subtract the ambient noise signal from the headset signal, voila, clean headset mic audio

    works for music too, you could control your music player by voice even when its playing loud (at a party) by removing the music signal from the mic signal

    -AJS
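
The two-mic idea above can be sketched in a few lines. This is only a minimal illustration, assuming an idealized case where the ambient mic captures exactly the noise that reaches the headset (same gain, no delay). Real rooms break that assumption, which is why practical systems tend to use adaptive filtering rather than plain subtraction. The function name `subtract_ambient`, the sample rate, and the synthetic tones are all made up for the example.

```python
import math


def subtract_ambient(headset, ambient, gain=1.0):
    """Subtract a scaled ambient reference from the headset signal.

    Assumes both signals are sample-aligned lists of floats of equal length.
    """
    if len(headset) != len(ambient):
        raise ValueError("signals must have the same length")
    return [h - gain * a for h, a in zip(headset, ambient)]


# Synthetic data (hypothetical): a 440 Hz "voice" tone plus 60 Hz "room hum".
RATE = 8000
N = RATE // 10
voice = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(N)]
hum = [0.5 * math.sin(2 * math.pi * 60 * n / RATE) for n in range(N)]
headset = [v + h for v, h in zip(voice, hum)]

# In this idealized setup the ambient mic hears exactly the hum.
cleaned = subtract_ambient(headset, hum)
max_error = max(abs(c - v) for c, v in zip(cleaned, voice))
```

Here `max_error` comes out at floating-point rounding level only because the reference matches the noise exactly; any delay or gain mismatch between the two mics would leave residual noise behind.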

  • Re:are u serious? (Score:5, Insightful)

    by Udo Schmitz ( 738216 ) on Saturday July 29, 2006 @10:25AM (#15805616) Journal
    "As if MS is the only one who has problems with demonstrations."

    Hmmm, no. Maybe it's the way they deal with failures. Remember Bill Gates trying hard to demonstrate the Media Center [google.com]? Some time after that Steve Jobs gave his regular Macworld keynote when his Mac didn't respond anymore. He moved a monitor switch to continue the presentation on another Mac and said: "Well, that's why we have backup systems here."

  • by wkitchen ( 581276 ) on Saturday July 29, 2006 @10:36AM (#15805668)
    There is Dragon NaturallySpeaking 9, which apparently is pretty good, but will SR ever really be the Star Trek kind?
    Probably. But it will have to get much better at using context. They're already using grammar as a cue, but it's going to take much more than that. Humans draw on memories of previous conversations, knowledge about the interests and mannerisms of the person speaking, and knowledge of the situation at hand. Even just knowing what's big in the news can help.

    As for ambient noise, there's often useful contextual information there too. Ambient noise can give information about where the speech is occurring and about what is happening at that location. In some rare cases the ambient noise might even be responsive to the speech itself. The audience laughing in the example was a clue that a) an audience is watching, and b) the system made a mistake. A human would have recognized that and used it to advantage. For a speech recognition system to work as well as a human, it will not only have to get better at separating speech from ambient noise, but it will need to be able to recognize the ambient noise.
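
As a rough sketch of what "using context" could mean in practice, here is one simple way to combine what the recognizer heard with what the situation makes likely. Everything in it is an assumption for illustration: the function name `rescore`, the probabilities, and the idea that context arrives as a per-word prior. It is not how Vista or any particular product works.

```python
import math


def rescore(acoustic_scores, context_prior, floor=1e-6):
    """Pick the word maximizing log P(audio | word) + log P(word | context).

    Words missing from the context prior get a small floor probability.
    """
    best_word, best_score = None, -math.inf
    for word, acoustic in acoustic_scores.items():
        prior = context_prior.get(word, floor)
        score = math.log(acoustic) + math.log(prior)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical numbers: the audio alone slightly favors "aunt"...
acoustic = {"mom": 0.45, "aunt": 0.55}
# ...but the speaker has just started dictating a letter that begins "Dear".
letter_context = {"mom": 0.8, "aunt": 0.2}

with_context = rescore(acoustic, letter_context)
without_context = rescore(acoustic, {})
```

With these made-up numbers, `with_context` is `"mom"` while `without_context` is `"aunt"`, which is the kind of correction a listener makes without thinking about it.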
  • by SCHecklerX ( 229973 ) <greg@gksnetworks.com> on Saturday July 29, 2006 @10:40AM (#15805687) Homepage
    OS/2 Warp had speech recognition in 1994. Better yet, the OS/2 version of Netscape at the time was speech enabled (browse simply by speaking the link). Even cooler was that the Netscape developers actually listened to the OS/2 community with that version (I remember them implementing something that I had asked for...very cool). Keep in mind that the average system of that time was a Pentium 133 with 100MB of RAM. And here we are in 2006, with GHz processors and GBytes of RAM dirt cheap, and M$ is just now starting to experiment with this? By now this technology should be damned near perfectly integrated across the board! Thanks for abusing your monopoly power to destroy all of the competition and REAL innovation, Microsoft!
  • by jc42 ( 318812 ) on Saturday July 29, 2006 @10:45AM (#15805711) Homepage Journal
    There are thousands of distinct English accents and pronounciation variations.

    Aw, c'mon; how many English dialects pronounce "mom" and "aunt" similarly?

    Even to someone who's worked with voice recognition, that mistake simply isn't credible. If the software were anywhere near usable, it wouldn't confuse those words from anyone, especially not in a low-noise, no-echo demo.

    This is a "No excuses" situation. That demo was simply a dismal failure due to some major bug(s).

    Of course, the speech recognition field has a long history of staying in such a state forever. It's hard to find a product that, even with extensive training, doesn't produce howlers like this.

    I did like the "killer" part ...

  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @10:50AM (#15805739) Homepage
    I never said training was the only cause of the failure. I said it's likely that he didn't train it. Because most high powered sales people are just cocaine snorting asshats that make people's lives miserable.

    Chances are he never even did a walk through of the presentation before the press was there.

    Tom
  • by SmallFurryCreature ( 593017 ) on Saturday July 29, 2006 @10:53AM (#15805763) Journal
    How the fuck did this bug go unfixed for so long? To me it sounds too much like the old MS sales strategy of saying "Just wait please, do not buy X now, our product will be much better in the future."

    Yes bugs happen, yes Vista is still in beta, but rather than just admit "Vista is still a buggy piece of crap software that can't even be used properly by its own engineers" they tell us to sit and wait because we can trust them to fix it.

    To MS's credit, it is a strategy that works.

  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @10:55AM (#15805770) Homepage
    You clearly don't work for a large corporation. Sales people (who are not all bad) are typically the sort that don't really understand technology and are all juiced up to make sales. Around technology they often leap before actually finding out the facts, which lands them in a world of trouble.

    I seriously doubt this presentation was rehearsed. At the very least, they should have tested it in that room with that mic, etc. But in all honesty, this is going to be used by millions of people in all sorts of rooms with all sorts of mics. That shouldn't matter anyways.

    Anyways, I doubt he prepared at all, that is, other than snorting cocaine off a mirror in the back room before the show.

    Tom
  • Re:Hee hee (Score:3, Insightful)

    by marcello_dl ( 667940 ) on Saturday July 29, 2006 @11:13AM (#15805870) Homepage Journal
    Yes, we'll get good voice recognition one day. It'll be right after 99% of the world population have mastered mouse and keyboard interfaces.
  • Re:Oh Please (Score:5, Insightful)

    by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Saturday July 29, 2006 @11:15AM (#15805886) Homepage
    Who knows how the algorithm they implemented works. Chances are the computer scientists behind it are not total asshats and they assumed the sales guy would follow the same procedure they did [e.g. to train it].

    Point is, if the sales guy had tried the system out beforehand he would have noticed it not working.

    That is, suppose the code is total shit [I know, big stretch for MSFT]. Then isn't it likely it would have failed during the preparation stage? If you are saying "mom" and it always comes back "aunt" you may want to cancel the presentation.

    That's why I think he didn't do any prep work for the presentation.

    Tom
  • by westlake ( 615356 ) on Saturday July 29, 2006 @11:39AM (#15806010)
    For example, how does the computer know that Picard wants to call Riker and isn't just talking about him? Oh and keep in mind the computer never misinterpreted something. In other examples, people would carry on intelligent conversations with the computer - all those holodeck scenes, Troi ordering chocolate, etc.

    The fleet's computers have "known" Picard since he entered the service. They should be pretty well trained.

    The communicator badges in TNG could be transmitting supplementary biometric data and non-verbal commands, which by now have become almost automatic: "Watson. I need you!"

  • by Yvan256 ( 722131 ) on Saturday July 29, 2006 @11:51AM (#15806059) Homepage Journal
    The communicator badges in TNG could be transmitting supplementary biometric data and non-verbal commands, which by now have become almost automatic: "Watson. I need you!"


    The badge also indicates the location of the person. So if Picard says "Will" (or "number one", which is simply an alias that Picard made for "Riker, William T.") and the computer sees that Will isn't in the same room as Picard (or isn't within normal hearing distance), it simply connects the two via a communication channel.
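
Taken as a rule, the badge logic described above is simple enough to sketch. This is purely a toy illustration of that reasoning: the `Person` class, the `ALIASES` table, the room names, and the returned strings are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    room: str  # location as reported by the badge (hypothetical)


# Spoken aliases mapped to a crew member (made up for this sketch).
ALIASES = {"will": "Riker", "number one": "Riker"}


def route_utterance(speaker, spoken_name, crew):
    """Decide whether a spoken name should open a comm channel."""
    target_name = ALIASES.get(spoken_name.lower(), spoken_name)
    target = crew.get(target_name)
    if target is None:
        return "unknown name"
    if target.room == speaker.room:
        # Within normal hearing distance: nothing for the computer to do.
        return "no action"
    return f"open channel {speaker.name} -> {target.name}"


picard = Person("Picard", "ready room")
crew = {"Riker": Person("Riker", "bridge")}

remote = route_utterance(picard, "Number One", crew)
crew["Riker"].room = "ready room"
local = route_utterance(picard, "Will", crew)
```

The sketch still can't tell Picard calling Riker apart from Picard talking about Riker to someone else; it only covers the location check.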
  • by SleepyHappyDoc ( 813919 ) on Saturday July 29, 2006 @12:17PM (#15806154)
    90% accuracy is nowhere near enough for voice recognition in a dictation context

    Depends on your own context...I deployed (admittedly, an older version of) Dragon NaturallySpeaking in an office full of mobility-impaired employees. They found it much easier to spend 10% of their writing time fixing errors than 100% of it trying to, for example, type with the onscreen keyboard. If you can't use a keyboard, even crappy voice recognition is a godsend.
  • Re:Oh Please (Score:3, Insightful)

    by NaugaHunter ( 639364 ) on Saturday July 29, 2006 @01:41PM (#15806536)
    If you are saying "mom" and it always comes back "aunt" you may want to cancel the presentation.

    Or, at the very least, don't say 'mom'! I've had plenty of times where a salesperson tests something the day before a demo (usually after a week of knowing he had a demo, but that's a different rant) and finds something. Our usual response with that short of notice is 'well, don't show that' since we didn't have enough time to fix it. At best, we could send a version that had the error suppressed but not truly fixed.
