Open Source Speech Recognition
Posted by
CmdrTaco
on Saturday January 19, @11:14AM
from the hello-computer dept.
bedahr writes "The first version of the open source speech recognition suite simon was released.
It uses the Julius large vocabulary continuous speech recognition engine to do the actual recognition and the HTK toolkit to maintain the language model.
These components are united under an easy-to-use graphical user interface.
Simon can import dictionaries directly from wiktionary (a subproject of wikipedia) or from files formatted in the HADIFIX or HTK format, and grammar structures directly from personal texts.
It also provides means to train the language model with new samples and add new words."
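For readers wondering what importing an HTK-format dictionary involves, here is a minimal sketch in Python. It assumes a simplified form of the HTK dictionary layout (a word, an optional bracketed output symbol, then whitespace-separated phones) and ignores pronunciation probabilities. The function name `parse_htk_dictionary` and the sample entries are made up for illustration; this is not simon's actual import code.

```python
import re
from collections import defaultdict

# Simplified HTK-style dictionary line (an assumption for this sketch):
#   WORD [OUTPUT] phone phone ...
# The bracketed output symbol is optional; pronunciation probabilities
# are not handled here.
_LINE = re.compile(r"^(\S+)\s+(?:\[([^\]]*)\]\s+)?(.+)$")


def parse_htk_dictionary(lines):
    """Map each word to a list of pronunciations (each a list of phones)."""
    lexicon = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = _LINE.match(line)
        if match is None:
            continue  # a word with no phones; nothing to store
        word, _output, phones = match.groups()
        lexicon[word].append(phones.split())
    return dict(lexicon)


# Hypothetical sample entries, including a word with two pronunciations.
sample = [
    "HELLO [hello] hh ax l ow",
    "HELLO hh eh l ow",
    "WORLD w er l d",
    "",
]
lexicon = parse_htk_dictionary(sample)
```

Keeping a list of pronunciations per word means a second entry for the same word adds an alternative instead of overwriting the first.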
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

been playing with it (Score:5, Interesting)
--
webmasters: personalized bookmarking [primadd.net] scripts for your site
wp and phpbb plugin available
Re:been playing with it (Score:5, Insightful)
Re: (Score:3, Informative)
Re:been playing with it (Score:4, Funny)
Re: (Score:3, Funny)
Re: (Score:2)
Prikaz: Start Firefox.
Prikaz: Open new tab.
Zamyechaniye: It would be quicker to use a mouse
Re: (Score:2)
Re:been playing with it (Score:5, Funny)
I could see how it could be useful in some apps (Score:2)
Mask out known audio? (Score:2)
Re: (Score:2)
Re: (Score:2)
It's a speaker-independent, continuous speech recognizer that can be configured to do everything from simple commands to full-text dictation. It's not Dragon'
Re: (Score:2, Interesting)
Are they productive? (Score:4, Insightful)
Re: (Score:2, Insightful)
Re: (Score:2, Funny)
Re:Are they productive? (Score:4, Insightful)
Re:Are they productive? (Score:5, Funny)
For those not familiar with this meme (Score:4, Informative)
Re:Are they productive? (Score:5, Insightful)
So how much it improves your productivity depends on who you are.
Re: (Score:2)
Re: (Score:2)
The biggest problem with speech to text is simply having to train the engine. I found Dragon NaturallySpeaking 9 not too bad; it's training it to recognize your own unique vocalizations tha
Re: (Score:3, Informative)
You might want to have a look at the voxforge project [voxforge.org]
And this doesn't require changes in the algorithm - just in the model.
-- bedahr
I use only computer dictation for medical notes (Score:3, Informative)
This is not about dictation software (Score:5, Interesting)
Double the Killer (Score:2)
Open Source, or Microsoft-Owned? (Score:5, Interesting)
[...]
you are not allowed to redistribute (parts of) HTK3
Re:Open Source, or Microsoft-Owned? (Score:5, Informative)
Simon does NOT contain the HTK toolkit - it merely executes commands.
HTK is free of charge and open source (in the strict sense of you-can-look-at-the-code). It is, however, not "free".
We are aware of that and have not packaged any parts of HTK for the release - you have to download it yourself if you want to modify the model from within simon.
It is not optimal, but we don't have the knowledge and / or manpower to code up something similar in a reasonable timeframe. And after all, it isn't that big of a deal, is it?
-- bedahr
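To illustrate the "merely executes commands" arrangement described above, here is a rough Python sketch of a program that ships no HTK code and only looks for separately installed HTK binaries on the PATH before calling them. HCopy, HCompV, HERest and HVite are real HTK program names, but which of them simon invokes, and with what arguments, is an assumption here; `locate_tools` and `run_tool` are hypothetical helpers, not part of simon.

```python
import shutil
import subprocess

# HTK command-line programs. Which ones simon actually calls is an
# assumption made for this sketch.
HTK_TOOLS = ("HCopy", "HCompV", "HERest", "HVite")


def locate_tools(names=HTK_TOOLS):
    """Return {tool name: absolute path, or None if not on PATH}."""
    return {name: shutil.which(name) for name in names}


def run_tool(name, args):
    """Run an externally installed tool; fail clearly if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"{name} not found on PATH; HTK has to be downloaded separately"
        )
    # Arguments are passed through untouched; none are invented here.
    return subprocess.run([path, *args], capture_output=True, text=True, check=False)
```

Calling `locate_tools()` at startup would let a front end disable its model-training features when HTK is absent, instead of failing halfway through a training run.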
Re: (Score:2)
Which languages are supported? (Score:4, Insightful)
Re: (Score:2)
Re:Which languages are supported? (Score:5, Informative)
If you follow the link to the Sourceforge project and look at any of the screenshots (including the one on the front page--at the time when I visited it, anyway), you'll see that they're actually training the software with German. So, it looks like the answer to your question is, yes, it supports more than English.
Open Source? (Score:2, Insightful)
Re: (Score:2)
Aisle of it (Score:5, Funny)
Re: (Score:3, Funny)
Wiktionary != Wikipedia (Score:4, Interesting)
I would've expected that kind of sloppiness on the Register, but not on Slashdot (yeah, I know, I must be new here...)
Re:Wiktionary != Wikipedia (Score:4, Funny)
Re: (Score:2)
Hey this is slashdot... pedantry is the base of most of the discussions here...
You must be new here, huh?
Pedant's Revolt (Score:4, Informative)
No it's not - Wiktionary is a sister project of Wikipedia. Not a subproject.
However, I must concur that in my experience speech recognition has been extremely patchy. While using it to issue voice commands is OK (and can be a real time-saver as it avoids going into Start, /Applications, Programs menu etc), dictation tends to be pretty rubbish. Especially when you're demonstrating the new speech recognition abilities in Windows Vista and just happen to work for Microsoft. And be in a loud, echoey expo hall. And using a dodgy mike.
Uses in Telephony (Score:2, Interesting)
Project's webpage in English? (Score:2)
Re:Project's webpage in English? (Score:5, Informative)
We are sorry that there is no international homepage for this yet.
BUT: you are strongly encouraged to contact me with any questions: grasch < at > simon-listens.org
-- Peter
Whither Microsoft? (Score:3, Insightful)
Actually, the reason we're not there yet is because most people don't want it. Keyboards and mice are simply a better way to give instructions to your computer than speech recognition is. Could you imagine the clatter of a dozen or more people in close proximity chattering to their computers?
Re: (Score:2)
"HTK was originally developed at the Machine Intelligence Laboratory (formerly known as the Speech Vision and Robotics Group) of the Cambridge University Engineering Department (CUED) where it has been used to
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
filthy open-source (Score:4, Informative)
julius is open source.
htk is *NOT* open source.
The latter is a micro$oft by-product, as clearly shown by the license [cam.ac.uk] that you have to first agree with and then send your email to them in order to download the tarballs...
myself never done this since 1995.
CMU Sphinx, another free speech recognizer (Score:3, Informative)
http://cmusphinx.sourceforge.net/ [sourceforge.net]
http://en.wikipedia.org/wiki/CMU_Sphinx [wikipedia.org]