IBM Develops Technology To Talk To Web
ProgramErgoSum writes to tell us that IBM's India-based research arm is trying to bring a new dimension to web interaction through voice interaction on your mobile phone. Developing a new protocol, Hyperspeech Transfer Protocol (HSTP), the hope is to allow users to talk to the web and get a response. Without more explanation I'm hoping this goes about as far as the gopher web. "The spoken web is a network of voice sites or interconnected voice and the response the company got in some pilot projects in Andhra Pradesh and Gujarat and the kind of innovations that people came up with were just mind-boggling, Gupta said."
Interesting... (Score:2, Informative)
I just can't imagine an entirely new protocol being adopted when it is already very possible using existing technologies...
Re: (Score:3, Interesting)
The text to speech bit could do with some sort of markup though. Despite the Authors Guild's claim to the contrary, text to speech is very machine-like and monotonous; it could do with some tags like <scared> or <angry> to get some emotion going.
Re: (Score:2)
So what would happen if it hit a blink or marquee tag?
Re: (Score:3, Funny)
ah easy, blink would be said extremely quickly, whole sentence in one second. Marquee would be a loud street salesman sort of tone.
Re: (Score:3, Funny)
I believe the tag names, respectively, are going to be <enron> and <ballmer>
Re:Interesting... (Score:4, Informative)
Agreed. Especially since CSS has supported aural media (including multiple voices or generic speaker categories like "child", "male", "female" for different speakers in a story, for instance) for quite a while now.
Re: (Score:2)
Ah so it does exist.
Is this widely used? I will confess to not knowing about it.
Any example out there?
Re:Interesting... (Score:4, Informative)
There's a good (and recent) summary of the situation here:
http://lab.dotjay.co.uk/notes/css/aural-speech/ [dotjay.co.uk]
If you want an open source solution, you should probably look to the firevox (as opposed to firefox etc.) community. Otherwise, Opera is probably your best bet. As far as usage goes: I think it's still pretty limited, but definitely worth considering for future projects that need (or can benefit from) such features, rather than some proprietary solution. Especially since it's a relatively small amount of extra work that can be overlaid onto existing web pages.
Re: (Score:2)
Granted, "as different from" might be technically better. Personally I quite like the visual image that "as opposed to" creates, and I don't see why you can't set up one browser in opposition to another for comparison. Are you SURE your version is the only correct one of the two? If so, why?
Re: (Score:2)
Ah, I think I see what you're getting at now. If I was trying to say that firevox is not firefox, then yes, "as different from" makes sense. And that WAS part of my reasoning for mentioning firefox. However, I also find it strange that firefox doesn't support these standards, since it's pretty much famed as the more standard browser choice (over IE at least), so another thing I was trying to get across is that firefox is not an option.
Re: (Score:3, Informative)
Which is a rule, but which is very very stupid, and just looks wrong to every human I have ever asked anyway. So fix your rules already. I have.
Re: (Score:2)
The format you're using is now recognised, and considered correct when punctuation matters (such as in technical docs). It's called "logical quoting", I believe.
We have this now, from Microsoft (Score:2, Interesting)
Microsoft bought TellMe (1-800-555-TELL), which does some of that. (Call it from a cell phone; the behavior on land lines is entirely different. From a cell phone, you can get movie listings, driving directions, etc.; on a land line, all it does is phone directories.)
Re: (Score:2)
Ha-hah-hah! ohh...
http://www.tellme.com/you/faqs [tellme.com]
"Yes! 1-800-555-TELL is a toll-free number you can call from any phone."
Dr. Dolittle claims prior art . . . (Score:3, Funny)
Talk to the animals? Talk to the Web? Same difference.
OS/2 voice recognition (Score:2)
Achilles says "No." (Score:4, Interesting)
Voice tech has an Achilles' heel: it's called accents. Most voice software works great for English-speaking people in the midwestern United States. But if you have an accent and have ever tried to "interact" with one of those voice mail systems that are speech-activated rather than touch-tone, the words "unholy rage" don't begin to describe the frustration of listening to a soothing voice repeatedly saying "I'm sorry, I do not understand your request" and then endlessly repeating the menus. Pressing '0', if you're wondering, will only make the system remind you that it (a) only speaks English and (b) while it can process touch tones, it won't -- because it hates you.
And IBM wants to bring this unique hell to the web? What kind of sadists are these people? As if websites that require Flash and the horrors that server-side Java unleashed weren't enough...
Re:Achilles says "No." (Score:5, Insightful)
If that's true of this software developed by IBM's Indian research arm and pilot tested in Andhra Pradesh [wikipedia.org] and Gujarat [wikipedia.org], then I suspect it will also handle a lot of other English-speaking people.
As if English-speaking people from the midwestern United States don't.
Re: (Score:2)
If that's true of this software developed by IBM's Indian research arm and pilot tested in Andhra Pradesh and Gujarat, then I suspect it will also handle a lot of other English-speaking people.
It can handle accents, but they must be programmed in; voice recognition software is significantly about heuristic algorithms -- guessing what accent, doing differential analysis, etc. But it also succeeds because it often limits itself to yes/no or multiple choice answers -- that is, the answer must be one of those presented. Voice recognition that tries to do free-form recognition has an unacceptably high error rate. Therefore, it doesn't matter where it's tested, or what language. It only reaches a passable
Re: (Score:3, Insightful)
"English-speaking with a midwestern accent is generally viewed [BY AMERICANS] as the most easily understood amongst all english accents; And this accent is the one used for many (if not most) [AMERICAN] television reporters, voice recordings intended for mass [AMERICAN] audience, etc. Most other accents are defined [BY AMERICANS] by how they mangle certain syllables."
Fexed thaht fah yah.
Re: (Score:2)
But if you have an accent
As if English-speaking people from the midwestern United States don't.
As if English-speaking people from England don't.
Re: (Score:2)
"J2EE isn't for everything, but sometimes it is the only tool for the job."
Sort of like thermonuclear weaponry, right? Massive, utterly impractical to ever deploy, makes no discrimination between friend and foe, but for 'strategic applications'...
not just accents (Score:2)
Most voice software works great for english-speaking people in the midwestern United States. But if you have an accent ...
I have a southeastern Michigan accent - essentially the same as the "standard radio/TV accent" (Cincinnati OH). It was chosen for that service because it makes ALL the American English phonetic distinctions (vs. for example an east-coast accent which merges "l" and "r" making Kennedys sound like they're saying Fidel heads "Cuber") and because it's intelligible to speakers of ALL the Ame
Re: (Score:2)
Merging "l" and "r" is kind of tangential to that, since the Caribbean nation at issue isn't called "Cubal", either.
Re: (Score:2)
Oops. Meant "ah" and "arr".
(Thanks for the catch. Jumped tracks onto Engrish momentarily.)
Re: (Score:1)
That's annoying, but try working your way through one of those systems with kids running through the room halfway through the call (or, having your parents yell down to the basement, depending on your situation). Accent doesn't matter, the system will think you just agreed to buy something.
I wonder (Score:5, Funny)
Web: Oh Yea baby!
User: fap fap fap fap fap
Web: Wow that's it yea!
Re:Waste of Bandwidth (Score:4, Insightful)
When you're talking about millions of terminals vs. relatively few servers, the "dumb" terminals are cheap. Also, doing good voice recognition requires beefy hardware -- probably, ideally, DSP/GPU accelerator boards or a Google-style huge cluster of commodity PCs. Finally, for blind users, but also for others, listening to even the best synthesized voice gets tiring/grating after a while. It's much nicer to listen to good speech from a professional narrator, over even a normal human speaker, much less a "good" voice synth.
I still think it'd be better for everyone if they worked on supporting a globally usable standard that could be applied on any machine, like CSS aural media, though. TTS and voice recog is probably the future anyway, might as well start taking it seriously now.
Re: (Score:2)
That's true, but recognising individual words against a table of individual words is a much less complex task than grokking a sentence for grammar and essential concepts, and checking those against a huge dictionary or concept map for the difference between 'hey man' and 'hay, man', in real-time, with a long, on-going speech, given some background noise from passers-by. That said, just about any voice recog would probably come up with something better than txtspk :)
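To make the "table of individual words" case concrete, here is a minimal Python sketch. It assumes a recognizer has already produced a rough text guess, and it only snaps that guess to the nearest entry in a small command table using the standard library's `difflib`. The vocabulary, the `match_command` helper and the 0.6 cutoff are all made up for illustration; real recognizers score audio against acoustic models, not strings.

```python
import difflib

# Hypothetical closed vocabulary: the only answers the menu will accept.
VOCABULARY = ["yes", "no", "balance", "operator", "repeat"]


def match_command(heard, vocabulary=VOCABULARY, cutoff=0.6):
    """Return the closest vocabulary entry to a noisy guess, or None.

    `heard` stands in for whatever text a recognizer produced; the cutoff
    is an arbitrary similarity threshold chosen for this example.
    """
    cleaned = heard.lower().strip()
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this table, a near miss such as `match_command("balanse")` snaps to `"balance"`, while something unrelated returns `None` and the menu can ask again. The free-form case has no such table to fall back on, which is where the difficulty lies.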
Is this gonna be like CB radio? (Score:4, Funny)
Breaker breaker, good buddy! Thanks for visiting my online speakin' site! My handle is: The Delta Lady! If ya'll wanna visit my cousin Watts' site, just say "bacon." If'n'ya wanna hear a special Christmas story about varmints pullin' Santa's sleigh, say "Merry Chris'mas, ya'll!"
Re: (Score:1)
Fuck ya'll, I'm from Texas and I guarantee you my grasp of "langauge" is farly superious to yours!
April 1st (Score:1)
Is Patronising India Really Good for Business? (Score:2)
From the RTFA,
Andhra Pradesh and Gujarat and the kind of innovations that people came up with were just mind-boggling, Gupta said.
Is IBM saying that these people from Andhra Pradesh and Gujarat are mind-boggled when they are introduced to the "Phone Mazes From Hell" that the rest of us have had to endure from the Faceless Ones [wiktionary.org] for years? Or is Gupta saying that these noble folk were mind-boggled when they heard voices respond on a cell phone?
Defeats the purpose of the internet (Score:2)
Isn't the purpose of the internet to AVOID having to talk to people?
sounds like their Opera plugin on Zaurus 5600 (Score:2)
IBM had an addon or something for the Opera browser which was shipped with the Sharp Zaurus 5600 which took in speech and did recognition against web page stuff. I remember their demo having the ability to take in spoken orders for Pizza and flight reservations right into the browser. It worked pretty good but background noise was an issue from my experience.
It never went anywhere on the Zaurus mostly because the Zaurus didn't take off. Sharp attempted to build an open source software platform but didn't th
Can You Hear Me Now? (Score:2)
What good is any of this HSTP tech if the computer still can't parse speech into text or symbols? Speech recognition doesn't really work, not accurately enough for mass use on Web PCs or mobile phones. Even speech synthesis, a much easier problem, isn't really that great.
I smell another IBM submarine patent farm, not an actual "innovation factory".
After a slashdotting... (Score:2)
We are experiencing unusually high call volume. Your estimated hold time is 345987 minutes.
Hey doofus... (Score:1)
...wait until you get older and have to try and see that crap on those tiny screens with your old quadfocal eyes and try to type on them teeny designed for Japanese kids keyboards with your stiff fingers, then you *might* get a clue why a spoken way to interact with the web on those devices might be useful. I'd like that on my desktop, let alone some Lilliputian cellphone.
Now, don't get off my lawn, see that mower? Yank that cord and start pushing it and work off some of those cheetos!
It's not Dr. Spaetzo (Score:2)
Inevitable Response..... (Score:2)
Headline:
"IBM Develops Technology To Talk To Web"
Following-Up Story Headline:
"Web Talks Dirty To IBM"
Story is a Dup? (Score:1)
VoiceXML is old news.
http://slashdot.org/articles/01/03/14/1622217.shtml [slashdot.org]
HTTP works great with them so why do we need a new protocol anyways?
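For anyone who hasn't looked at it: a VoiceXML page is ordinary XML, so any web server can hand it out over HTTP like any other document. A rough Python sketch of reading one, using only the standard library. The element names (`vxml`, `form`, `field`, `prompt`) and the namespace follow VoiceXML 2.0; the page content and the `list_prompts` helper are invented for illustration, and a real voice browser does far more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical voice page; the dialog content is invented for this example.
VXML_PAGE = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="size">
      <prompt>What size pizza would you like?</prompt>
    </field>
    <field name="topping">
      <prompt>Which topping?</prompt>
    </field>
  </form>
</vxml>
"""

# Prefix mapping so the XPath-style queries can address namespaced elements.
NS = {"v": "http://www.w3.org/2001/vxml"}


def list_prompts(document):
    """Return (field name, prompt text) pairs from a VoiceXML string."""
    root = ET.fromstring(document)
    pairs = []
    for field in root.iterfind(".//v:field", NS):
        prompt = field.find("v:prompt", NS)
        text = prompt.text.strip() if prompt is not None and prompt.text else ""
        pairs.append((field.get("name"), text))
    return pairs
```

Calling `list_prompts(VXML_PAGE)` walks the form and pulls out what the caller would be asked for each field. Nothing about the format needs a transport other than HTTP.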
I saw those guys at a conference (Score:1)