
Comment Re:Yay for Google (Score 1) 385

It's amazing how you can take someone out of context even while providing the full quotation.

Is it so hard to comprehend that if you really need serious privacy, NO major search engine provider in the world will give you that?

And this doesn't refute anything. Just because they may be forced into handing over data doesn't mean they won't put up a fight. Which is exactly what Google does, and exactly the opposite of Yahoo and MS, who are very pliable in their dealings with governments.

Comment Re:I've been playing with Markov models lately (Score 1) 342

As for speech recognition, aren't there any libraries or code bases out there that convert sound to IPA? It seems the most obvious solution. Heck, you could probably get away with some on-body sensors for more accurate detection of particular IPA symbols.

The IPA is used to describe sound precisely, whereas a natural alphabet provides a general representation of a multitude of sounds. Which sounds, and at what level of abstraction (phone/allophone/phoneme), vary from alphabet to alphabet.

Ultimately, you must transcribe from your context-specific sample to the generalised representation in your target alphabet. Nothing is gained by using an IPA representation instead of hashed audio samples, because they are both context-specific.
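To make the phone/phoneme distinction concrete, here's a toy sketch (not a real speech pipeline, and the allophone table is an illustrative assumption) of how several distinct IPA-level phones collapse onto a single English phoneme -- which is why a narrow IPA transcription still has to be normalised before it says anything about meaning:

```python
# Hypothetical allophone -> phoneme table for English /t/ and /p/.
# Narrow (phone-level) IPA detail on the left, broad phonemes on the right.
ALLOPHONE_TO_PHONEME = {
    "tʰ": "t",   # aspirated, as in "top"
    "t":  "t",   # unaspirated, as in "stop"
    "ɾ":  "t",   # flap, as in American "butter"
    "pʰ": "p",   # aspirated, as in "pin"
    "p":  "p",   # unaspirated, as in "spin"
}

def phonemicise(phones):
    """Map a sequence of narrow IPA phones to broad phonemes."""
    return [ALLOPHONE_TO_PHONEME.get(ph, ph) for ph in phones]

# "stop" and an over-aspirated "stoph" phonemicise identically:
print(phonemicise(["s", "t", "ɒ", "p"]))    # ['s', 't', 'ɒ', 'p']
print(phonemicise(["s", "t", "ɒ", "pʰ"]))   # ['s', 't', 'ɒ', 'p']
```

Two acoustically different samples end up as the same phoneme string, so the IPA level buys you nothing that the generalised representation doesn't already need.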

Comment Why ignore linguistic models and information? (Score 1) 342

Speech recognition will continue to hit this wall because it disregards internal representation. What you hear may externally be a waveform with certain characteristics, but the phonetic structure also depends on phonemic, morphophonemic, syntactic, and semantic interactions. To actually understand a word, your exposure to phonetic information must trigger those interactions.

The best example of speech recognition learning comes from babies. Babies are born with the ability to distinguish a practically unlimited range of phones. Continued exposure to their native language then narrows this to the distinctions applicable to their use, i.e. their language. In English, things like aspirated p's get ignored for the purposes of meaning, such that I can hear "stoph" and know that it is not distinct from "stop". Built upon this, we discover morphemes and morphophonemic rules, so that I can tell that "stop" becomes "stopt" in the past tense. Upon this, in turn, we build syntactic and semantic relationships. This is context-based understanding. I need a context of "past" to start applying past-tense morphemes, but I also need the correct phonemic context to perform the correct allophonic substitutions. Similarly, if someone with a thick Scottish or Novocastrian accent comes up to me on the street, I need to combine my semantic context with my own abstract internal representations of my language to try and understand them.
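The "stop" -> "stopt" observation is a classic morphophonemic rule: the English past-tense suffix surfaces as /t/, /d/, or /ɪd/ depending on the stem's final sound. A minimal sketch of that rule (the phoneme sets are simplified assumptions, not a complete phonology):

```python
# Simplified English phoneme classes for the past-tense rule.
VOICELESS = set("ptkfsʃθ") | {"tʃ"}   # voiceless consonants (partial list)
ALVEOLAR_STOPS = {"t", "d"}

def past_tense(stem_phonemes):
    """Append the correct past-tense allomorph to a phoneme list."""
    final = stem_phonemes[-1]
    if final in ALVEOLAR_STOPS:
        return stem_phonemes + ["ɪ", "d"]   # "want" -> "wantɪd"
    if final in VOICELESS:
        return stem_phonemes + ["t"]        # "stop" -> "stopt"
    return stem_phonemes + ["d"]            # "plan" -> "pland"

print(past_tense(["s", "t", "ɒ", "p"]))   # voiceless final /p/, so suffix /t/
```

The point is that the suffix choice is driven by phonemic context, not by anything recoverable from the raw audio of the suffix alone.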

This provides a form of natural error correction that allows me to understand something I have never heard before and that might contain deviations or ambiguity (either inherent in the language or introduced by the speaker). My internal representation of English should prevent me from ascribing wrong phone clusters or wrong morphemes (runn-ingk) to the processed sound.
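One crude way to approximate that error correction in code is to snap a noisy transcription to the nearest entry in a lexicon, so an impossible cluster like "runningk" resolves to a real word. This is only a sketch of the idea; the tiny lexicon and the similarity cutoff are assumptions for illustration:

```python
from difflib import get_close_matches

# A toy lexicon standing in for the listener's internal word store.
LEXICON = ["running", "runner", "stopping", "stopped", "sunning"]

def correct(word):
    """Return the closest lexicon entry, or the input if nothing is close."""
    matches = get_close_matches(word, LEXICON, n=1, cutoff=0.6)
    return matches[0] if matches else word

print(correct("runningk"))  # "running"
```

A real listener does far more than string similarity -- the phonemic and semantic context constrains which corrections are even candidates -- but the shape of the process is the same: stimulus matched against an internal representation, with deviations repaired.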

It's stimulus plus rule matching plus context plus error correction that should ultimately help me decide if something can be understood.

All this ignores the complexity of graphemic translation, which builds up yet another set of rules.

The article said (somewhat in jest) that throwing out linguists helped improve the accuracy of the system. Sure, methods not representative of human language capability might give better results in the short term, and there is no definitive model of how language is represented in the mind. You can probably build a great system that ignores much linguistic information and functions in a limited context (i.e. one language, rigid domains such as yes/no or numbers). However, if the goal is ultimately to produce a speech system that functions like a human -- that is, performs error correction when appropriate, uses various types of linguistic information, and in certain circumstances asks for clarification -- then linguistic models are important.
