Slashdot
Open Source Speech Recognition
Posted by
CmdrTaco
on Sat Jan 19, 2008 11:14 AM
from the hello-computer dept.
bedahr writes "The first version of the open source speech recognition suite simon was released.
It uses the Julius large vocabulary continuous speech recognition engine to do the actual recognition and the HTK toolkit to maintain the language model.
These components are united under an easy-to-use graphical user interface.
Simon can import dictionaries directly from Wiktionary (a subproject of Wikipedia) or from files formatted in the HADIFIX or HTK format, and grammar structures directly from personal texts.
It also provides means to train the language model with new samples and add new words."
been playing with it (Score:5, Interesting)
--
webmasters: personalized bookmarking [primadd.net] scripts for your site
wp and phpbb plugin available
Re:been playing with it (Score:5, Insightful)
Re: (Score:2, Funny)
Re:been playing with it (Score:5, Funny)
Re: (Score:3, Funny)
Re:been playing with it (Score:5, Informative)
Re: (Score:2)
Prikaz: Start Firefox.
Prikaz: Open new tab.
Zamyechaniye: It would be quicker to use a mouse.
Re: (Score:2)
Re:been playing with it (Score:5, Funny)
I could see how it could be useful in some apps (Score:2)
The software has to be intelligent to know what to do when you press a button and say "shopping list, plums" etc.
I don't think speech recognition is good
Mask out known audio? (Score:2)
Re: (Score:2)
Re: (Score:2)
It's a speaker-independent, continuous speech recognizer that can be configured to do everything from simple commands to full-text dictation. It's not Dragon's stuff, but it's pretty good.
They even have a pure Java version of it: http://cmusphinx.sourceforge.net/sphinx4/ [sourceforge.net]
Re: (Score:2, Interesting)
-- bedahr
Are they productive? (Score:4, Insightful)
Re: (Score:2, Insightful)
Re: (Score:2, Funny)
Re:Are they productive? (Score:4, Insightful)
Re:Are they productive? (Score:5, Funny)
For those not familiar with this meme (Score:4, Informative)
Re:Are they productive? (Score:5, Insightful)
So how much it improves your productivity depends on who you are.
Re: (Score:2)
Re: (Score:2)
The biggest problem with speech-to-text is simply having to train the engine. I found Dragon NaturallySpeaking 9 not too bad; it's training it to recognize your own unique vocalizations that is the problem. I think speech-to-text and voice recognition is a project that demands Wikipedia-like sourcing of voices in different noisy environments, using millions of samples of people's voices to improve the algorithms. I'm surprised no one at
Re: (Score:3, Informative)
You might want to have a look at the VoxForge project [voxforge.org]
And this doesn't require changes in the algorithm - just in the model.
-- bedahr
I use only computer dictation for medical notes (Score:3, Informative)
This is not about dictation software (Score:5, Interesting)
Double the Killer (Score:2)
Open Source, or Microsoft-Owned? (Score:5, Interesting)
[...]
you are not allowed to redistribute (parts of) HTK3
Re:Open Source, or Microsoft-Owned? (Score:5, Informative)
Simon does NOT contain the HTK toolkit - it merely executes commands.
HTK is free of charge and open source (in the strict sense of you-can-look-at-the-code). It is, however, not "free".
We are aware of that and have not packaged any parts of HTK for the release - you have to download it yourself if you want to modify the model from within simon.
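To illustrate the "download it yourself" arrangement, here is a minimal sketch (not simon's actual code) of how a front end might check whether separately installed HTK programs are on the PATH before trying to run them. HCopy, HCompV, HERest and HVite are HTK command-line tools; which of them simon really calls, and the helper name `find_htk_tools`, are assumptions made for illustration.

```python
import shutil

# HTK command-line programs a front end might call. Which ones simon
# actually uses is an assumption here.
HTK_TOOLS = ("HCopy", "HCompV", "HERest", "HVite")


def find_htk_tools(tools=HTK_TOOLS, which=shutil.which):
    """Return (found, missing).

    found maps each tool name to the path where it was located;
    missing lists the tool names that could not be located.
    The `which` parameter is injectable so the lookup can be tested
    without HTK installed.
    """
    found, missing = {}, []
    for name in tools:
        path = which(name)
        if path is None:
            missing.append(name)
        else:
            found[name] = path
    return found, missing


if __name__ == "__main__":
    found, missing = find_htk_tools()
    for name, path in found.items():
        print(f"{name}: {path}")
    if missing:
        print("Not found (install HTK separately):", ", ".join(missing))
```

Looking the tools up at run time, rather than shipping them, is what keeps the HTK binaries out of the release package.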
It is not optimal, but we don't have the knowledge and / or manpower to code up something similar in a reasonable timeframe. And after all, it isn't that big of a deal, is it?
-- bedahr
Re: (Score:2)
Which languages are supported? (Score:4, Insightful)
Re: (Score:2)
Re:Which languages are supported? (Score:5, Informative)
If you follow the link to the Sourceforge project and look at any of the screenshots (including the one on the front page--at the time when I visited it, anyway), you'll see that they're actually training the software with German. So, it looks like the answer to your question is, yes, it supports more than English.
Open Source? (Score:2, Insightful)
Re: (Score:2)
Aisle of it (Score:5, Funny)
Re: (Score:3, Funny)
Wiktionary != Wikipedia (Score:4, Interesting)
I would've expected that kind of sloppiness on the Register, but not on Slashdot (yeah, I know, I must be new here...)
Re:Wiktionary != Wikipedia (Score:4, Funny)
Re: (Score:2)
Hey, this is Slashdot... pedantry is the basis of most of the discussions here...
You must be new here, huh?
Pedant's Revolt (Score:4, Informative)
No it's not - Wiktionary is a sister project of Wikipedia. Not a subproject.
However, I must concur that in my experience speech recognition has been extremely patchy. While using it to issue voice commands is OK (and can be a real time-saver as it avoids going into Start, /Applications, Programs menu etc), dictation tends to be pretty rubbish. Especially when you're demonstrating the new speech recognition abilities in Windows Vista and just happen to work for Microsoft. And be in a loud, echoey expo hall. And using a dodgy mike.
Uses in Telephony (Score:2, Interesting)
Project's webpage in English? (Score:2)
Re:Project's webpage in English? (Score:5, Informative)
We are sorry that there is no international homepage for this yet.
BUT: you are strongly encouraged to contact me with any questions: grasch < at > simon-listens.org
-- Peter
Whither Microsoft? (Score:3, Insightful)
Actually, the reason we're not there yet is because most people don't want it. Keyboards and mice are simply a better way to give instructions to your computer than speech recognition is. Could you imagine the clatter of a dozen or more people in close proximity chattering to their computers?
Re: (Score:2)
"HTK was originally developed at the Machine Intelligence Laboratory (formerly known as the Speech Vision and Robotics Group) of the Cambridge University Engineering Department (CUED) where it has been used to build CUED's large vocabulary speech recognition systems (see CUED HTK LVR). In 1993 Entropic Research Laboratory Inc. acquired the rights to sell HTK and the development of HTK was fully transferred to Entropic in 1995 when the Entropic Cambridge Research
Re: (Score:2)
Re: (Score:2)
filthy open-source (Score:4, Informative)
julius is open source.
htk is *NOT* open source.
The latter is a micro$oft by-product, as clearly shown by the license [cam.ac.uk] that you have to first agree with and then send your email to them in order to download the tarballs...
I myself have never done this since 1995.
CMU Sphinx, another free speech recognizer (Score:3, Informative)
http://cmusphinx.sourceforge.net/ [sourceforge.net]
http://en.wikipedia.org/wiki/CMU_Sphinx [wikipedia.org]