Voice Recognition for a Techie?
kaybee asks: "I am a long-time developer, sysadmin, and general computer junkie (for fun and for work) who needs to seriously curb the usage of his hands. I'm curious as to the current voice recognition options, preferably usable on Linux and Windows. I prefer the command-line to a GUI, I prefer Vim to anything else, and I still read my email with Pine. I'd like to hear options for sending email via voice, which I hope is easy, and I'd love to hear of any solutions that allow effective coding via voice, which seems much more difficult."
Computer.... Computer... Hello computer... (Score:5, Funny)
Re:Computer.... Computer... Hello computer... (Score:3, Funny)
Write it yourself (Score:3, Informative)
'Course you could go the other way with some Open Source speech recognition and Cygwin or similar.
Circles within circles (Score:2)
Re:Circles within circles (Score:2)
Re:Circles within circles (Score:2)
Re:Circles within circles (Score:2)
Re:Circles within circles (Score:1)
Re:Write it yourself (Score:3, Informative)
Re:Write it yourself (Score:2, Funny)
Re:Write it yourself (Score:1, Informative)
Why is this modded down? (Score:1)
Had he said "with no need for wine" would you have modded him down?
Even if he said "wine is shit", just because you don't agree with an ON TOPIC, informative post is no reason to mod it down. Read the moderator guidelines.
You can mod this down too if you like.
Re:Why is this modded down? (Score:2)
Re:Why is this modded down? (Score:2)
Sera
Re:Why is this modded down? (Score:1)
User page:
http://slashdot.org/comments.pl?sid=164643&cid=13745635 [slashdot.org]:
I doubt that's supported (Score:1)
Sysadmining by voice (Score:5, Funny)
shudder
Re:Sysadmining by voice (Score:5, Funny)
Oh yeah?
{ } . ! /
& ; ^ # -
< > @ \
{ } _ SYSTEM HALTED
Left titty, right titty, dot bang slash.
Ampersand semicolon, caret pound dash.
Less than greater than, at back slash,
left titty, right titty, under score crash.
* # ! ! (
~ & | )
' " . . DEL
# ^G ! ! working... done.
Star pound bang bang, open-paren.
Tilde and pipe, close-paren.
One quote, two quote, dot dot delete,
pound bell, bang bang, process complete.
- Doktor Dynasoar posting some ASCII poetry [google.com], and the thread also includes the immortal Hatless Atlas [wards.net], which I'm not even going to fantasize about getting past the filters.
Re:Sysadmining by voice (Score:5, Funny)
$ $ $|-|1+
# 3 11 H E LL
A general protection fault has occurred. A general protection fault has occurred. This application will be terminated.
slash bang open bracket dot star plus,
dollar sign dollar sign code for cuss,
pound three eleven, H-E double-hockey-pucks
BSOD. BSOD. Windows really sucks.
Re:Sysadmining by voice (Score:3, Funny)
Re:Sysadmining by voice (Score:2)
Re:Sysadmining by voice (Score:2)
Re:Sysadmining by voice (Score:2)
Hand use (Score:5, Funny)
Lest they... *ahem* wander.
Re:Hand use (Score:1)
Re:Hand use (Score:2)
Re:Hand use (Score:2)
Re:Hand use (Score:2)
Re:Hand use (Score:2)
wouldn't bother yet (Score:3, Informative)
Speaking WPM != Chars Per Minute (Score:5, Insightful)
Let's say you type at about 40 wpm, or about 160 characters per minute (this is a low estimate of 4 chars per word), or about 2.7 characters per second.
To be as productive speaking, you'd probably have to speak about the same number of words per second as you type characters, or 2.7 words. That's really fast.
Sorry bub, doesn't look like speech is a very good alternative. Hell, Brain Implants on the other hand...
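For anyone who wants to check the numbers above, here is a rough sketch of the arithmetic in Python. It takes the post's worst-case assumption at face value (every character gets spoken as its own word), and the variable names are just illustrative. It comes out a little over 2.5 characters per second.

```python
# Back-of-the-envelope check of the typing-vs-speaking estimate.
TYPING_WPM = 40        # typing speed assumed in the post
CHARS_PER_WORD = 4     # the post's deliberately low estimate

chars_per_minute = TYPING_WPM * CHARS_PER_WORD
chars_per_second = chars_per_minute / 60

# Worst case assumed by the post: one spoken word per typed character,
# so the required speaking rate equals the typing rate in characters.
spoken_wpm_needed = chars_per_second * 60

print(f"{chars_per_minute} chars/min = {chars_per_second:.2f} chars/sec")
print(f"spelling everything aloud needs {spoken_wpm_needed:.0f} spoken words/min")
```

The estimate only holds if nothing is recognized as a whole word, which is the point the replies go on to argue about.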
Re:Speaking WPM != Chars Per Minute (Score:3, Insightful)
The best thing to do is give your hands a rest and get professional help. Voice recognition for coding sucks, believe me. You're better off doing something else altogether if it comes to that.
Coding is very precise work, and voice recognition just isn't good at that. If you try coding with your voice, you'll soon find that your voice hurts and that you're immensely frustrated by the whole experience.
Have you had medical attention to your hands?
Re:Speaking WPM != Chars Per Minute (Score:3, Informative)
No. Not true, even in English. For example, "c" does not make a sound distinctly different from all other characters. Some letters, such as "x" make sounds that can easily be made from a combination of other letters. Including pairings and such, linguists say that the English language includes something closer to 45 single sounds.
I used to teach Special Ed and saw software that could recognize entire words and use them in writing in a word processor. I have not
Re:Speaking WPM != Chars Per Minute (Score:2)
Re:Speaking WPM != Chars Per Minute (Score:1)
Re:Speaking WPM != Chars Per Minute (Score:3, Informative)
Low Back Shibboleths (Score:2)
With, of course, the classic case of cot, caught, and bother, which are defined with three different phonemes, but where the average person in the US uses only two of them based upon region.
Re:Speaking WPM != Chars Per Minute (Score:1)
I think what GP means is that you can speak whole words into a word processor and it will match them against words in its dictionary. Any it doesn't recognize would need to be spelled out.
When coding, we use a lot of words that aren't in the dictionary. if, else and switch would be OK, but Degrees2Radians isn't going to be in any dictionary so you're going to end
Re:Speaking WPM != Chars Per Minute (Score:2)
Re:Speaking WPM != Chars Per Minute (Score:2)
Yes it does, but only when followed by an h. The "ch" sound is distinctly different from sounds produced by any other letters. If it weren't for "ch", yes, 'c' would be a redundant letter.
Re:Speaking WPM != Chars Per Minute (Score:3, Informative)
As far as I can tell, you're saying that words would need to be spelled out character by character so you'd have to talk really fast to be productive. Custom dictionaries would go a long way towards fixing that. The main issue would be whether a particular speech recognition solution int
An itch to scratch (Score:2)
Re:An itch to scratch (Score:2)
Re:Speaking WPM != Chars Per Minute (Score:1)
Hmm, as for coding it does make one think a bit. I think you might see this sort of thing eventually for coding, but you'd need a special compiler (and perhaps language) that had a bit of AI in it to avoid silly mistakes with commenting, commands, variables, that sort of thing. It could work potentially thoug
Phonetic Punctuation to the Rescue! (Score:2)
Of course, we will have to extend it - Victor Borge didn't have sounds for #, < or > - but I'm sure we can come up with something.
Of course, some programming languages will be better than others - Ada will sound almost normal (other than having to bark out all the words in your best Drill Instructor parade voice), while Perl.... you'll need a good sock on the mike to keep the spit out, and people will t
Find ways to save typing effort (Score:3, Interesting)
Voice Wreck Ignition (Score:2, Funny)
Save your hands -- while you can (Score:5, Informative)
Seriously, if you're suffering hand or arm pain, you should think about the way you're doing things now. Speech recognition is unlikely to replace your current coding practices, although it might help with writing reports.
Instead, try using the keyboard break feature in GNOME. To start with, have it kick you off your computer every 30 minutes for a 3-minute break, and don't allow yourself to postpone breaks. Get some equivalent software for Windows too. Use your 3-minute breaks to walk around and stretch. Within a week, you won't be a lot less productive, but your arms will feel a lot better. Then you can maybe up it to 40 minutes. In the short term, a course of anti-inflammatories might help (ask your doctor).
Also, don't come home in the evening and play games on your computer, or do more work. Your arms probably can't take it. Equivalently, inform your employer of your condition and subsequent inability to work reckless overtime hours.
These two things should get you started for long-term sustainable maintenance of your arms.
the bonfire analogy (Score:2)
Re:the bonfire analogy (Score:1)
Re:the bonfire analogy (Score:2)
But I'm getting better now.
I had the same problem -- though mild (Score:2)
Mouse as little as possible.
Anti-inflammatories are your friend. Much of what's happening to your hands is your own body's doing.
You need to be able to heal overnight as much as you do damage during the day. Do
Linux Adaptability (Score:4, Informative)
Read down to the section about speech recognition. I hope that helps.
Try using a GUI for email, etc... (Score:2)
Voice recognition is still hit-or-miss.
Re:Try using a GUI for email, etc... (Score:3, Informative)
Re:Try using a GUI for email, etc... (Score:1)
But in Nintendogs, you're talking to a dog. It's okay if it doesn't quite understand the first time. In fact, that's expected and part of the "charm" of the dog. I don't mind telling my dog to "sit" a couple times, but if I had to tell my computer to "save" three
Shoot! (Score:5, Informative)
You can, with some tweaking, even get it to understand complicated stuff. If I say "manual g r u b", I can get "man grub". "Vi save quit" could be mapped to ":wq" without too much trouble.
Anything you can type, it can do.
I don't think it works under Linux. I don't know of anything like it under Linux. It does, however, work quite well inside PuTTY.
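To make the mapping idea above concrete, here is a minimal Python sketch. It is not how the product in this post works internally; the `MACROS` table and the `expand` function are hypothetical, and only the two example phrases come from the post. It assumes the recognizer hands back plain text, with spelled-out letters arriving as single-character words.

```python
# Hypothetical phrase table; the post only gives these two examples.
MACROS = {
    "vi save quit": ":wq",
    "manual": "man",
}

def expand(utterance: str) -> str:
    """Turn a recognized utterance into the text that should be typed."""
    phrase = utterance.lower().strip()
    if phrase in MACROS:                  # whole-phrase macro wins
        return MACROS[phrase]
    out, letters = [], []
    for word in phrase.split():
        if len(word) == 1:                # spelled-out letter: buffer it
            letters.append(word)
            continue
        if letters:                       # flush buffered letters as one word
            out.append("".join(letters))
            letters = []
        out.append(MACROS.get(word, word))
    if letters:
        out.append("".join(letters))
    return " ".join(out)

print(expand("manual g r u b"))
print(expand("Vi save quit"))
```

One known weakness of this sketch: genuine one-letter words such as "a" get swallowed into the spelling buffer, so a real setup would want an explicit "spell" mode.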
Re:Shoot! (Score:2)
In terms of Windows, the best that I know of is still Dragon NaturallySpeaking [nuance.com], though I strongly recommend pirating it first to decide if it serves your needs. Unfortunately, even with regular training it still gets things wrong with alarming frequency. You have to retouch e
mmmmmmmaudio (Score:4, Interesting)
I've wondered about this myself. I tend to use my computer with the headphones on. Often, I'm listening to music or.. well just plain silence, just the standard dings of Windows. I do pay attention, though, to the sounds coming from the computer (e.g. the traditional hoo-hoo of receiving an email). I've always wondered about what more could be done with sound to make the user more aware of the goings on with their computer, especially when a number of apps are actively working. I think I was inspired by an episode of Futurama I caught. One of the characters' personalities was in the Pilot's body. The Pilot, whose personality was in yet another body, was trying to describe how to interact with the ship. I remember him saying "Can you hear that faint little tone? That's the status of..".. or something or other.
In any event, it's fun to imagine. I wouldn't mind if a soft low-volume voice were to say "You have received an email from: John Smith." I had a job a few years ago where that would have been a nice little feature since messages would come in that required urgent attention. My solution to the problem at the time was to use a custom filter that would specifically notify me of important messages by bringing a little window up to the surface. That was fairly annoying, though, when the computer was busy and it was slow as molasses to get the window to go away.
Re:mmmmmmmaudio (Score:2)
Re:mmmmmmmaudio (Score:2)
Re:mmmmmmmaudio (Score:1)
OH man. I was thinking Farscape, and I typed Futurama. Geez. Yes, you're right. Man, double embarrassment.
Hehe. Thanks, man.
Techie (Score:2)
I hope better voice recognition and TTS will resolve that.
OSSRI, VoiceCoder (Score:3, Interesting)
perlbox using sphinx (Score:5, Informative)
http://perlbox.sourceforge.net/ [sourceforge.net]
http://cmusphinx.sourceforge.net/ [sourceforge.net]
Command and control is a lot easier to do with voice recognition since the dictionary the engine has to choose from is so much smaller. Having voice recognition engines understand arbitrary words well is still a bit difficult.
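A rough illustration of why the small dictionary helps, as a Python sketch. This is not what perlbox or Sphinx do internally (they constrain the acoustic search itself); it only shows the same effect on text, using the standard library's `difflib` to snap a slightly misheard phrase to the nearest entry in a short command list. The command list and the `snap_to_command` name are made up for the example.

```python
import difflib

# Hypothetical command vocabulary; a real setup would list its own commands.
COMMANDS = ["open terminal", "close window", "save file", "next tab", "run build"]

def snap_to_command(heard: str, cutoff: float = 0.6):
    """Return the closest known command, or None if nothing is close enough."""
    matches = difflib.get_close_matches(heard.lower(), COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(snap_to_command("safe file"))           # misheard "save file"
print(snap_to_command("clothes window"))      # misheard "close window"
print(snap_to_command("make me a sandwich"))  # nothing close: rejected
```

With only a handful of candidates, a near miss still lands on the right command. With an open vocabulary there is nothing to snap to, which is why free dictation is so much harder.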
IBM ViaVoice (Score:2, Funny)
Google it.
Not Voice Recognition but still helps RSI (Score:1)
Re:Not Voice Recognition but still helps RSI (Score:1)
A thought for coding via voice: (Score:4, Interesting)
Find a subset of words that are short, easy to remember, easy to say, and above all -- accurately translated by the chosen voice recognition software.
Then create a small perl script that can take this coded input and convert it into a nicely formatted chunk of code.
You can have different translators for different target languages... for example
In shell programming, you might have the following:
hash -> #
bang -> !
pipe -> |
test -> [
end test -> ]
mark -> '
quote -> "
end mark/quote (keeps them balanced for shell scripts)
for identifiers... don't name them. For example, let's say you wanted to do this:
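#!/bin/bash
# Greet the world, or greet "from" the lowercased first argument.
function hello_lcase {
    local HELLO=$1    # no spaces around "=" in a shell assignment
    if [ -z "$HELLO" ] ; then
        echo "Hello world"
    else
        echo -n "Hello from "
        # \L lowercases the whole match (GNU sed extension)
        echo "$HELLO" | sed -e 's/.*/\L\0/'
    fi
}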
you would say:
hash bang slash bin slash bash
new function 1
set local 1 ref in 1
if test empty ref local 1 end test
then
echo string 1
else
echo option n string 2
echo ref local 1 pipe program s e d option e space
mark s slash dot star slash back upper l back 0 slash end mark
end if
end function 1
you'd run the perl script and it'd ask you:
what do you want to call function 1: foo
what do you want to call local variable 1 in function 1: HELLO
what do you want to use for string resource 1: Hello World
what do you want to use for string resource 2: Hello from
and it'd output the script (maybe after running through indent)
You could substitute "1" for any easily recalled mnemonic or symbol the speech-to-text translator is unlikely to mistranslate (in this case "foo" and "hello" would probably be fine as is)
Then you'd get a chance to globally "refactor" your symbols and give them nice-looking names, only having to type them once.
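Here is a minimal sketch of the two-pass idea above, in Python rather than the Perl script proposed in the post, and covering only three of the spoken lines. Several things are assumptions: the `NAMES` dict stands in for the interactive prompts, and the rules "empty" -> `-z`, "option x" -> `-x` and "ref" -> `$` are inferred from the example rather than taken from the symbol table.

```python
import re

# Spoken phrase -> symbol. "test"/"end test"/"hash"/"bang"/"pipe" come from
# the post's table; "empty" -> "-z" is inferred from its example.
PHRASES = [("end test", "]"), ("test", "["), ("empty", "-z"),
           ("hash", "#"), ("bang", "!"), ("pipe", "|")]
OPTION = re.compile(r"\boption (\w+)\b")
PLACEHOLDER = re.compile(r"\b(ref )?(function|local|string) (\w+)\b")

def placeholders(spoken: str):
    """Pass 1: list each (kind, id) placeholder once, in order of first use."""
    seen = []
    for _ref, kind, ident in PLACEHOLDER.findall(spoken):
        if (kind, ident) not in seen:
            seen.append((kind, ident))
    return seen

def render(line: str, names: dict) -> str:
    """Pass 2: translate symbol words, then fill in the chosen names."""
    for phrase, symbol in PHRASES:          # longer phrases are listed first
        line = re.sub(rf"\b{phrase}\b", symbol, line)
    line = OPTION.sub(r"-\1", line)

    def fill(match):
        ref, kind, ident = match.groups()
        value = names[(kind, ident)]
        if kind == "string":
            return f'"{value}"'
        return f"${value}" if ref else value

    return PLACEHOLDER.sub(fill, line)      # names go in last, untouched

SPOKEN = ["if test empty ref local 1 end test",
          "echo string 1",
          "echo option n string 2"]
# Stand-in for the answers typed at the prompts (trailing space kept on purpose).
NAMES = {("local", "1"): "HELLO",
         ("string", "1"): "Hello world",
         ("string", "2"): "Hello from "}

print(placeholders(" ".join(SPOKEN)))
for spoken_line in SPOKEN:
    print(render(spoken_line, NAMES))
```

Filling in the names last means whatever gets typed at the prompts is never run through the symbol table, so a string containing the word "test" or "pipe" survives intact.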
Re:A thought for coding via voice: (Score:2)
So instead you might have:
new function shrink
set local hello ref in 1
if test empty ref local hello end test
do you want to rename function "shrink": hello_lower
do you want
More than just the technical aspect... (Score:2, Interesting)
VoiceCode (Score:2, Informative)
Re:VoiceCode (Score:4, Informative)
Unfortunately, IBM released it in 2001-2002, then forgot about it. They've since gone onto their non-training voice recognition solutions for sale to businesses. They seem to have advanced, but not in any retail product.
Dragon has come out with updates, but from people who have used and trained on *both*, ViaVoice has higher accuracy (~1% difference). The ViaVoice product price has fallen, and Dragon has, of course, gone up....
Whatever product you get, get a fast 2+CPU machine with lots of RAM - 2GB or more. The ViaVoice algorithm adapts to your talking speed -- it will perform more looks and comparisons and have greater accuracy as the processor speed goes up. ViaVoice stops comparing when it runs out of time (your speaking has gotten too far ahead). But it listens to the words, in context, to determine spelling. The more memory it has, the more vocabulary it can pull into memory. Note -- I am saying get a dual-cpu (or dual core) machine, the faster the better.
ViaVoice was also released on Linux, but without as much application support.
For coding support in voice products -- there just hasn't been enough demand.
But for "wrist support" -- try a multi-faceted approach. Maybe voice recognition, maybe a tablet for input? Ergo keyboards, trackballs? It's not a comfy field. There isn't a great financial incentive to develop voice input for coding when you can hire foreigners for peanuts, and keep having eager generations of new hackers to come and be sacrificial lambs on the keyboards of progress...;-)
speech recognition for Linux (Score:3, Informative)
For something that runs on Linux directly, you might have a look at the Accessible Speech Recognition Technology software [slashdot.org]. It's a research project, not a polished system, but you might be able to hack it to do what you need.
Re:speech recognition for Linux (Score:2)
Sorry, I screwed up the URL above. Here it is: Accessible Speech Recognition Technology software [msstate.edu].
Terminology (Score:1)
OTOH, "Voice Recognition" is used to describe the process of taking a voice sample and c
Re:Terminology (Score:2)
Re:Terminology (Score:2)
xvoice (Score:3, Interesting)
However once you get past all of these issues (actually even running the old gtk1 xvoice becomes hard on modern dists), it works a charm. As it's X clean, you can X to any X server, be it one run under OSX or Windows, or a Sun SPARC box. You just need the mic connected to the x86 Linux box the client runs on.
This meets your requirement for editing in vim etc. The accuracy, I found, was fantastic.
Using speech recognition for anything... (Score:1)
Coming from speech recognition (Score:2, Informative)
The main problem is that no one actually speaks or writes as eloquently as people present speech recognition.
Try this experiment: map backspace, delete and arrow keys to @ and try to write a letter or some code. You'll quickly give up. When you see demos of speech recognition, you neve
Voice Recognition may lead to RSI in Vocal Cords (Score:1)
Almost
It's sad that voice control has been deemphasized. (Score:2)
The initial product package even included a headset microphone in the box.
Not many people used it, and at that point in time it required some initial training to use in an effective manner (it had to learn each person's pronunciation habits), but there were still a few folks I knew that got a lot of mileage out of the technology at the tim
SpeechLion (Score:2, Interesting)
http://freshmeat.net/projects/speechlion [freshmeat.net]
Plug for Dragon (Score:1)