IBM

IBM ViaVoice for Linux

malacai writes "IBM has announced that ViaVoice will be available for Linux." Excellent; IBM does another good thing. Anyone played around with ViaVoice much? I'm interested in potentially using it once my wrists fall apart.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    Speaking of Voice Recognition software for Linux... does the following hint that Dragon Systems will be coming out with a Linux version?

    Senior Software Engineer - Job Code: BB
    The primary requirement for this position is a track record in successful and varied software engineering. This person should have experience in user interface design and multi-platform software development in C++. Windows, Unix, Linux, or Macintosh experience would be useful. Python, Java and Visual Basic experience would also be useful. This person should have at least four years of industry experience, and a degree in software engineering, systems engineering, or equivalent experience. Customer contact skills are also valuable, either in project leadership, specification, or consultancy.

  • by Anonymous Coward
    I've used ViaVoice under Win32. To get good recognition you need to do the full training exercise, which consists of reading about an hour of text; Mark Twain if you're a 'Merkin, or Alice in Wonderland if you're a Brit (or an Australian; apparently IBM is yet to realise there is a difference ;). Recognition is pretty good; you can use it effectively for prose. I found you can talk faster than you can type (and I can type pretty fast). Typos do creep in and these are frustrating. This sort of software has a long way to go, but conversely it's already come a long way anyway! One bitch with ViaVoice: for some reason it slowed down MS Word; it added maybe 10 seconds to the MS Word startup (even if you're not using the MS Word support, which is painfully slow anyway). The Wordpad is the best way to enter text, but lacks a multi-level undo feature! And yep, it's easy to delete half an hour's work. SAVE. SAVE. SAVE. :)
  • Are there any programmable IR devices that can control TV/stereo/etc. remote controls? None of those "universal" remotes can do all my devices: they handle the VCR but not the TV, and so on. If there were some device (and programming API) for Linux, we could just add new devices easily. Are there any?
  • Half-Life has been ported to Linux?? Many a day has been spent wandering that game, and now with TFC I can barely tear myself away from my computer. Or did you just mean "should be ported" to Linux?
  • Sweet, definitely. Hopefully it'll be a nice long beta period too :).
  • I bought a year-old copy of ViaVoice for like $10 or $15 recently. It was able to handle natural speech just fine. You may be thinking of an earlier product line, VoiceType maybe.

    But any voice recognition program for Linux should come with some sort of SDK so we can then make macros/scripts to interface with any program. If a company provides us with a decent shell, I'll be more than willing to help develop some of these interfaces.
  • Yeah, but you gotta admit, that transcript is fucking beautiful poetry. I mean if it was always that good, I might consider using it!
  • Hmm... my Unix systems seem to have reboot(8) and shutdown(8) (in /sbin of course) ;)
  • Oh, yeah, and that's quick and easy to say. :)

    Basic problem with voice recognition for geeks. Code is easy to type, hard to speak.
  • That's actually a two-keystroke operation for me, anyway. I've got my left Evil-Empire key bound to a fvwm popup menu, which comes up with the first entry - "Local Eterm" - selected. So one keystroke to pop up the menu, Enter to activate the selection, and, if I'm lucky, it'll even come up in focus. If not, well, Alt-Shift-Arrows'll focus it pretty quickly.
  • Posted by hrearden:

    I bought ViaVoice back in June of last year when it came out. It was great back then. The text-to-speech reading was a bit weak (reminded me of my old Apple II+) and it did not support swear words out of the box. But otherwise it was nice. I used it to fill out and submit Lotus Notes documents, and I think you can do it with the web as well. Overall a great product for the money, as long as the novelty does not wear off.

  • Posted by Arborius:

    Most speech control apps work best when they're integrated into the UI, or are at least able to interact with it in some way. Anyone know what plans Motif/GTK/QT and/or the X Consortium have to provide hooks for speech recognition integration into X?
  • Actually, multiple widget libraries can be used simultaneously on the same window/desktop manager. You can run GTK apps under Motif, GNOME, KDE, etc.

    There are some extra features of GNOME and KDE which may not work across widget sets (e.g. drag and drop) until GNOME and KDE get standardized (which they are working on).

    But you can still run the program under your favorite environment.
  • VV works fine. I've used it several times and it has a 95% accuracy rate. It's great for dictation of long letters and can greatly reduce RSI. But for normal operation, giving it commands like "delete file 'titsXXX.jpg'" is still out of the question, especially in a business environment. I wouldn't want other employees knowing exactly what I'm doing all the time. There is a measure of security in using the keyboard or mouse to do system-related tasks. But this is still a good thing.
    It's far easier to forgive your enemy after you get even with him.
  • When asked about the y2k issue...

    "Because 2k is *more* than enough cache for anybody!"

    Michael J. Ball
    Open Source Who's Who
  • According to the Linux link at http://www.software.ibm.com/speech/

    Lesstif is a requirement. Does it have to be your operational window manager???

    They also have a gtk based demo, so I'm confused.



    Michael J. Ball
    Open Source Who's Who
  • [1] ibsen:/home/vagn $ format c:
    ksh: format: not found
    [2] ibsen:/home/vagn $ reboot
    ksh: reboot: not found
    [3] ibsen:/home/vagn $ shutdown -h now
    ksh: shutdown: not found
    [4] ibsen:/home/vagn $

    Not a problem on unix systems.
  • Actually, I was testing some voice recognition software, and my sister pulled that trick on me. Started yelling "format, delete, erase".

    I'm not sure if that normally would have worked, but I was using a demo version of the software that only worked with a few example phrases, so nothing bad happened.
  • I used ViaVoice a while back, and was impressed by the accuracy. Speed sucked, but then I was running on a machine significantly below their minimum spec, and had to wait for it to catch up every now and then.

    Basically, ViaVoice is an excellent product, and is pretty useful for dictating documents in human languages. Naturally, it's hopeless for coding or entering commands at a shell prompt, but that's more because speech will never be a natural way to communicate stuff like that than because of any failing in ViaVoice. As others have mentioned, it could prove useful for X10 automation, though.

  • This is superb. I should have tracked this stuff earlier. Anyway it is time for a shopping spree now. :-)

    CP
  • Speech rec with a microphone and sound card is a lot easier than speech rec over a phone line. The narrow bandwidth and (relatively) low sampling rate make it much, much harder to get high accuracy over a phone. Standard phone lines run at 64 kb/s (8-bit samples at 8 kHz), and the high-pass and low-pass filters limit the range to roughly 300 Hz to 3400 Hz.

    Chris
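The phone-line figures in the comment above can be checked with a little arithmetic. This is only a sketch: the function names are made up for illustration, and the Nyquist bound (half the sampling rate) is a general sampling fact rather than something the poster stated.

```shell
#!/bin/sh
# Figures quoted in the comment: 8-bit samples at 8 kHz.
bits_per_sample=8
sample_rate_hz=8000

# Raw bit rate of one voice channel (hypothetical helper name).
bitrate_bps() {
    echo $(( bits_per_sample * sample_rate_hz ))
}

# Highest frequency an 8 kHz sampler can represent (Nyquist limit).
nyquist_hz() {
    echo $(( sample_rate_hz / 2 ))
}

echo "bit rate: $(bitrate_bps) bit/s"
echo "Nyquist limit: $(nyquist_hz) Hz"
```

The 300 Hz to 3400 Hz passband sits inside that Nyquist limit, which is why so little of the speech signal survives a phone call compared with a microphone plugged into a sound card.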
  • I used ViaVoice for Windows this last summer. It has on average 90% accuracy, which sounds great, but that's one error every 10 words... and it's not like you can load up IRC and start yapping.

    It's a promising product and I'm glad to see it being ported to Linux... but it has a ways to go.

    Derek
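To put the accuracy figures quoted in this thread into perspective, here is a rough sketch that converts a word-accuracy percentage into "one error every N words". The helper name is hypothetical, and it assumes accuracy is measured per word and uses integer arithmetic only.

```shell
#!/bin/sh
# Convert a whole-number word accuracy (percent) into words per error.
# Integer division, so results are rounded down.
words_per_error() {
    acc=$1
    if [ "$acc" -ge 100 ]; then
        echo "no errors"
        return
    fi
    echo $(( 100 / (100 - acc) ))
}

for pct in 75 90 95 98; do
    echo "$pct% accuracy: one error every $(words_per_error "$pct") words"
done
```

At 90% that is an error every 10 words, as the comment says; at 98% it is one every 50, which is a very different proofreading burden.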
  • This is not actually text-to-speech but it is close and the speech is really good. There may be some new pre-processors which make it text-to-speech now (it's been a year since I've messed with this).

    Check the mbrola homepage [fpms.ac.be].

  • by Lightn ( 6014 ) on Monday April 26, 1999 @07:03PM (#1916701) Homepage
    I have seen this mentioned, but I want to ask a direct question. Does the design of GTK facilitate Speech recognition integration?

    If we could get ViaVoice (or any other speech recognition software) to interface with the GTK toolkit well, you could suddenly have a huge number of applications that are speech enabled. Instead of having to make every application compliant... (or have to make it compliant to work WELL)

    Integration into the Window Manager was one of the criteria that was discussed in some essay a while back about creating a flexible UI for the future.
  • Indeed, I have continuous speech recognition software for my SPARCstation(s). It only runs under SunOS/Solaris and NT unfortunately. I can't remember the name of it off hand, but I discovered it while searching through some of the speech synthesis sites. If you've got SunOS or Solaris it's interesting to play with and would work OK for a bit of home/whatever automation.
  • I am using Linux 2.2.1 which claims that the Soundblaster driver is full-duplex but indeed I just tested it and I can't seem to record and play at the same time. Hmmm.

    The 2.2.1 driver actually seems very nice and is definitely not OSS-lite; rather, it is a version enhanced by Alan Cox. The Japanese AWE patches are also integrated.

    I'll give ALSA a shot sometime.
  • Launch applications (or perform any string of commands) by speaking into your mike. It works amazingly well.

    For example, when I say "connect to internet", kvoicecontrol does "say connecting; /usr/local/bin/nconnect". 'say' is a cheesy speech synthesis program and 'nconnect' is a script that controls X-ISP remotely. Pretty nifty.

    My only beef with kvoicecontrol right now is that it monopolises (sp?) my sound card even though the AWE64 is full duplex. Fortunately all I have to do is right click on the docked kvoicecontrol to disable it.

    get kvoicecontrol here [kiecza.de]
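The phrase-to-command idea described in the comment above can be sketched as a simple lookup. This is not kvoicecontrol's actual configuration format, which the thread does not show; the function names are hypothetical, and the sketch only prints the command it would run instead of executing it.

```shell
#!/bin/sh
# Map a recognized phrase to a command line (illustrative table only).
command_for_phrase() {
    case "$1" in
        "connect to internet")
            # Mapping taken from the comment above.
            echo "say connecting; /usr/local/bin/nconnect"
            ;;
        *)
            # Unknown phrases map to nothing.
            return 1
            ;;
    esac
}

# Dry run: report what would be executed for a phrase.
dispatch() {
    if cmd=$(command_for_phrase "$1"); then
        echo "would run: $cmd"
    else
        echo "no command for: $1"
    fi
}

dispatch "connect to internet"
dispatch "format c colon"
```

Ignoring anything that is not in the table is the point of the design: a stray "format, delete, erase" shouted across the room does nothing unless someone deliberately mapped it.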
  • I had only one experience with ViaVoice, and it got virtually nothing correct:

    Me: "Hello, testing, does this thing work?"

    ViaVoice transcript: "The up north Perot gawking sprawl"

    I'm sure this can be blamed on a configuration error. (I didn't install or configure) :)
  • ...That 640K is enough memory for everybody, in 1981?
  • All the easier!

    for a in /dev/[h,s]d?
    do
    dd if=/dev/zero of=$a bs=1k count=512
    done

    Don't try this at home!
  • There are a few, look in:
    ftp://metalab.unc.edu/pub/Linux/apps/sound/
  • look in metalab.unc.edu/pub/Linux/apps/sound/speech

    There are a couple of speech recognition-type things there
  • Does anybody know of free voice recognition software?
    KVoiceControl has been mentioned; are there others available?

    I've got nothing against ViaVoice, but for easier tasks (command recognition, easy texts), it is overkill.

    There are a lot of universities working on that, so there should be something available.
  • There is no port yet of Half-Life for Linux. I did read that a server version for multiplayer Half-Life was in the works. Try looking up archived info at http://www.bluesnews.com/

    With the new TFC update to Half-Life there is a new spray paint tag that you can plaster anywhere you like in the game "LINUX RULES" so it shows that somebody at Valve likes Linux.

    If you play Half-Life or Team Fortress multiplayer, look for me, the Duke of URL. I'll be spraying LINUX RULES everywhere.
  • by wik ( 10258 ) on Monday April 26, 1999 @07:54AM (#1916712) Homepage Journal
    ViaVoice is an excellent product (at least under Win32). Sometimes it amazes me how well it understands what I dictate; other times it plainly has no clue. In general it's very good if you have time to go back and correct whatever it has written. It is not suitable as a complete replacement for typing, since it expects you to be dictating in a natural voice (e.g. infrequent stops/pauses between words). Telepathic speech isn't understood clearly by the engine. You would not be able to use this efficiently at a bash prompt or for coding.
    I suppose if you wanted to write your own grammar (which is possible with Win32 tools right now), you might be able to make a C or a Perl grammar, but moving around the code would be painful.
    Hopefully ViaVoice will integrate with most applications easily, as it does under Win32. Currently, you can speak to whatever textbox has focus under Win32, and if developers use the free SDK, more functionality (e.g. FONT BOLD ON) could be added to programs.
    I don't expect WordPerfect to support ViaVoice, since they already seem to have a contract with Dragon Systems.
  • They are releasing beta apps that use ViaVoice. Lesstif isn't a window manager, just a toolkit, like GTK. They are gonna release a backend which can be used by applications using any toolkit.
  • "What's all this about speech-recognition anyway? It will never be as efficient as using the keyboard anyway..."

    Plus, how many years ago was it that Anirog produced the Voice Commander for C64? Fifteen or so? Has the technology been sleeping ever since?
  • Has anyone managed to get the "linux" link referenced in the article (on the IBM viavoice page) to work? The link is there (the last one in the column of ultra-small text on the right of IBM's main ViaVoice page), but alas the linux viavoice page itself appears to be missing.
  • Hmmm. Combine an X10 system with ViaVoice and a wireless headset.

    Imagine: walking through the kitchen, telling "Hal" to turn the lights on.

    Who needs a wearable?
  • Sent a quick e-mail off to Dragonsys today. Nice, polite e-mail.

    I like their Micros~1 OS stuff (somewhat better than Via), but I'm not often UNDER Micros~1 anymore.
  • Speech recognition is currently not about efficiency compared to a keyboard. It is just a great tool for those with special needs. If you have a thoroughly trained system you might just be able to dictate as fast as normal speech.
  • I found that it depends a lot on individual voice patterns. Some people, like myself, simply talk in ways ViaVoice doesn't like. I got maybe 75% accuracy, with training. Some friends tried it, while it was trained for me, and got a far higher accuracy, 90-something.
  • You can be _pretty_ certain that it won't support M$ Word on Linux, guess why ;-)
  • I threw it away. I'm glad I never paid for that software. It did, at times, write what I spoke, but was only about 20% accurate on average. I don't recommend it.

    -Z
  • if you don't think speech recognition can be done over the phone, try calling 1-888-573-8255 and FEEL THE POWER.

    or something.
  • There are a few of us working on just that. BTW, LIRC can be found on my homepage, but it may not be the most up-to-date code. I'd better add a link to the LIRC project. My page is a place to collect software and links related to Linux HA. I am also writing some software, but I mostly modify others' software (see my page refs below).

    I really think that this is something Linux HA really needs. Of course this can be useful in other areas also. I plan to purchase it as soon as possible and see if I can integrate it into the stuff I'm working on. But I won't forget that other people may not have it, so my code won't depend on it.


    Linux Home Automation - Neil Cherry ncherry@home.net [mailto]
    http://members.home.net/ncherry [home.net] (Text only)
    http://meltingpot.fortunecity.com/lightsey/52 [fortunecity.com] (Graphics GB)

  • Hey not a problem, I'm just the librarian but one day I hope to have some of my programs posted on my pages too!
  • I like this! Not only could you run it mobile/wearable, you could literally use it right over the phone. "Ask" your home system for a piece of information, or tie it into the mail-server at work through the voice-mail system. Lots of potential.

    Now, if only I could find that Linux link mentioned in the article... If someone finds it, please post a link.
  • While I am very interested in this announcement, the IBM voice technology I've worked with in Win32 (95 and NT) thus far is not sufficient for full-time use yet. I have used ViaVoice Gold for a couple of years now, and even with IBM's longest voice template "training", occasionally ViaVoice goes loopy and acts like it's dictating to itself, rather than translating from my voice. Thus I have not as yet been able to recommend the technology to my client customers.

    However, the state of the art will obviously advance. Optical Character Recognition (OCR) technology four years ago was a "probable buy"; however, the accuracy has gone up and the cost down so much that it is now a "should buy", and any company requiring significant amounts of document translation is behind the times if it does not have at least one employee competently using OCR.

    In voice recognition, IBM is definitely one of the "to market" leaders, especially in the consumer area. My thought is that the cleaner OS code in Linux may actually help IBM develop code that is much more powerful than the Win32 versions. IMHO the number one thing IBM can do to help ViaVoice succeed in the Linux arena (other than GPL'ing the code, which they probably will not do) is provide crystal-clear documentation of the API and a powerful SDK to allow other programmers to develop "voice-drivable" applications. This would be similar to how IN-CUBE can be used to drive various applications from small voice commands. BTW, IN-CUBE is already available on Solaris, so maybe the Linux community can persuade CommandCorp to port their product (?)

    The faster this technology develops, the better for all of us, especially the motion disabled who can use this technology as a true window to the world. The same group which produces ViaVoice also has a screen reader [ibm.com] for the visually impaired which I would like to see in Linux as well.

    Let IBM [ibm.com] know of your interest, offer to act as a BETA tester, etc. The more we get involved in projects like these, the more quickly Linux will succeed in breaking the M$ stranglehold on the industry.

  • Unfortunately, it won't work if you aren't root, voice recognition or not.
  • I have used ViaVoice Exec and now 98 for some time. The real benefit is not in direct text processing (not bad accuracy though, 98%), but the macros. You can build almost anything from predefined chunks of text.

    By the way we make the noise cancelling headsets so we have an interest in these things working properly.

    Cheers S
  • Last summer we experimented with a couple of the voice recognition and speech synthesis engines available for Win32. In the end, we chose M$'s speech engine. It had a voice quality ten times better than ViaVoice. Dictation was comparable, but ViaVoice required more training. And the price was right.

    Something the Linux platform is missing is a standard API to access this stuff. The ability to switch engines is a huge time-saver for our project.
  • A point I haven't seen anyone here mention -- probably because many of us already use Linux on our desktops.

    But this is a clear signal that IBM considers Linux viable as a desktop OS -- who needs voice recognition on servers? (Yes, the Corel apps are desktop apps too, but IBM carries a bit more clout than Corel.) Is Lotus next?
  • Hey, what does a Canadian use to do full training? Anything by Farley Mowat or Robert Service I suppose.... (Maybe a good imitation of Foster Hewitt announcing "THE GOAL")


    hehehe
  • Okay... think about it this way. I've done a (very) little stand-up comedy. You wouldn't believe how much material I've lost because I wasn't sitting at a computer. Also, being an aspiring filmmaker/screenwriter without a laptop, I write a lot by hand. A reliable speech-recog program would be great for getting either recorded riffs (stuff I talk about to myself in the car) or handwritten stuff into the computer without the boredom of typing.

  • ...that "Linux would *never* get any sexy apps like voice recognition"? :)

    (Never mind that kvoice was already under development.)
  • This would be a wicked time for IBM to show some commitment to the community by having some GNOME and KDE support in ViaVoice. It could be sweet.
  • This is a bit off topic, but is there some program that can say words for me in Linux? I remember on my friend's Amiga there was some way to type some words and then the Amiga read it back in some rather silly voice.

    Maybe something useless like this could be done then:

    $ quota -v | say

  • Indeed, I found one. rsynth includes a "say" command. Just have to see if I can tweak this a bit; the voice is rather too robotic for me.
  • Well, not only lights...
    you can control the thermostats via web (or voice), or check if doors and windows are locked while you are already at the office, dial an ISP by saying "check mail", etc.

    there are many possibilities..
  • I forgot to say that you can find more info about HA at:

    http://members.home.net/ncherry/

    (sorry Neil!)
  • by ianna ( 27856 ) on Monday April 26, 1999 @07:48AM (#1916742) Homepage
    Linux has all the potential to be the core of a home automation system... Voice control could be just one part of it...

    Lots of sw is already available to control X-10 devices

    Heyu - http://www.prado.com/~dbs/
    Xtend - http://www.jabberwocky.com/software/xtend/
    TKx10 - http://www.houseofhack.com/tkx10/
    WebX10 - http://members.tripod.com/~famewolf/webx10/

    IR control is available using

    http://members.home.net:80/ncherry/common/lirc-0.5.3.tgz

    Now we just need someone to integrate some functions in PHP and we can control the house via web.

    Well, if ViaVoice will provide voice control and KDE a desktop interface, what will stop world domination even in this area? :)

    Marco
  • Oh, I see. Lojban is good old Loglan, expressed in its full vocabulary and grammar. .iuru'e
  • This is great news. I have used Naturally Speaking on W*****s as a wrist-saving device and it is very impressive. A Linux system will be a godsend if it is as good. (I don't know how good ViaVoice is, but IBM's track record in this area is good.)
  • Well, I get 98-99% accuracy all the time.....

  • Hey, I think I woulda loved viavoice if I coulda gotten past the first four voice training screens. The damn thing didn't like my voice I guess. I musta wasted two days trying to get past that fourth reading. "Hello, this is yet another test for the voice of the user to let the computer recognize each of these words...." or something like that....heh..... anyhow....I'm ready to give it another shot... :)
  • Speech recognition is a holy grail to many people like myself. I may be able to type 120+ WPM, but what good does that do when my hands hurt like hell the minute they touch a keyboard? That's the price we pay for years of constant computer use -- about 18 years in my case. I'd rather type, but it's getting too difficult. Anything that might save my career and my hobby is a godsend, even if it isn't open-source.

    My HMO doesn't give a damn about my problems... Anyone have the download URL for this program?
  • Once again, we have technology introduced that solves one problem, and people call it crap because it doesn't solve THEIR problem. I've used VoiceType (the predecessor of ViaVoice) on OS/2 for several years now (it came with Warp 4). No, it wasn't any good for coding. But when you got to the documentation it was a godsend. Unfortunately, the computer is not 'intelligent' and will type what it hears. So if you pause to say 'uh' and 'hmm', it types 'uh' and 'hmm'. It's also neat to see what rustling papers say. However, if you scribble up a rough outline so that you can dictate in a semi-fluid manner, it makes for an excellent first draft system. You'll still have to go back and proofread, but not any more than you would with manual typing, and the dictation is WAY faster. (Note: try typing several paragraphs w/o hitting the backspace key.)
  • About 2-3 months ago I wrote IBM a letter asking them to port ViaVoice to Linux, citing a need for good voice recognition software for Linux. I want to be able to voice control the Linux-powered PC in my car :)

    Though it's probably not likely my e-mail had any effect on this decision, it's kind of cool to think it might have.
  • by Bobface ( 33670 )
    Combine this with the MP3 player from last week, sounds like a major home brew audio system!!!

    (AWE64 + Cheap SB) = Voice Controlled MP3!

    Hooray!
  • I had a similar problem, but a new Creative Labs Ensoniq PCI fixed it. Come to find out my ISA 16-bit SB clone (ESS AudioDrive) wasn't recording well enough for VV.

    I got my Ensoniq with a rebate program at Office Depot (~$10 after rebate). Especially with their return policy it can't hurt to try...

    -- /v\atthew
  • KVoice can run full-duplex. The problem lies in the OSS-Lite drivers that ship with Linux. Try using ALSA instead.

    I originally had the same problem. The only problem with ALSA is that my card (Crystal Audio's CS4232 chipset) binds the sample rate to the same level for both input and output, so unless you're sampling and playing back at the same rate, one of them is gonna get a bit screwed.
  • I'm sure I remember hearing about someone doing exactly that (yelling "format cee colon backslash return") just before a demo of some voice recognition software -- wish I could remember whose it was -- anyway, according to the story, it worked! Probably boll*cks, but a good story anyway.
  • by Silex ( 34738 ) on Monday April 26, 1999 @10:42AM (#1916754)
    I purchased ViaVoice from IBM (for Win32) a while ago. The IDEA behind the technology is a good one. But the problem is, it's not only slower than typing, it's twice as frustrating, and may take as much as 3x the time it takes to type the same document.

    WHY?

    (a) Accuracy -- My copy of IBM VoiceType came with a speaker-mic combination headset (made by Andrea). The documentation says that this is ideal for use with VoiceType. So I'm not going to blame my hardware for the inaccuracy of this product. It has trouble recognizing a lot of words. I don't have an accent, so that's not the problem. There are many technical reasons why this happens ... but they don't matter to the end user.

    (b) Method of Speech: You can't just talk into the mic like you normally talk. You have to pause between EACH word. But you MUST NOT pause or slow down while saying A WORD. This .. is .. a .. very .. unnatural way of speaking. Sometimes you forget to pause, or sometimes you accidentally pause within multi-syllable words. This is one of the major causes of errors.

    (c) Although this product DOES have support for editing the text through voice, it's quite impractical. If you want to edit text that has already been typed, or you want to format text in a certain way, you're still going to have to use the keyboard, and possibly the mouse. You will find yourself trying to work with the mouse, keyboard and (now) trying to speak in a very unnatural way to the computer as well. It's not a matter of being HARD to do, it just doesn't make sense. It's easier to just type.


    I think this application is not very useful for typing large documents. What it IS useful for is giving commands to the system through voice. I'm not sure how IBM plans on integrating this with Linux, because Linux systems vary greatly between each other (unlike Windows, which has a very centralized control over the system, making it easy to make calls to all kinds of programs without knowing what the program really is). But if they can pull it off ... maybe get it working with xterm or something, that would be great. And if they could get it working with an IRC and/or an ICQ client, that would certainly make life easier for many of us (it would be kind of like a low-bandwidth alternative to audioconferencing ... especially if you could get the IRC client to 'say' all the text as it scrolls by).

    This is a good application, but the whole voice recognition deal is really over-hyped. I hope IBM plans on porting some REAL software to Linux as well.
  • ViaVoice Executive will allow you to dictate to any Windows application, including fields. The Office edition is limited to Word 97 and IBM's SpeakPad WP. However, the Office edition will still let you control menu functions in other programs by voice, but you can't dictate text. :-( I think I need to upgrade.
  • Is this going to be a medium for the next wave of viruses?
  • Seems that internet time has caught the site with its pants down. I left a voice message with Kristin Wahl that the site needs attention. Should get taken care of soon.

    As always:

Solutions are obvious if one only has the optical power to observe them over the horizon. -- K.A. Arsdall
