Microsoft Shows Off Adaptive, Multilingual Text to Speech System
MrSeb writes about a really cool project from Microsoft's speech research group. From the article: "Microsoft Research has shown off software that translates your spoken words into another language while preserving the accent, timbre, and intonation of your actual voice. In a demo of the prototype software, Rick Rashid, Microsoft's chief research officer, said a long sentence in English, and then had it translated into Spanish, Italian, and Mandarin. You can definitely hear an edge of digitized 'Microsoft Sam,' but overall it's remarkable how the three translations still sound just like Rashid. The translation requires an hour of training, but after that there's no reason why it couldn't be run in real time on a smartphone, or near-real-time with a cloud backend. Imagine this tech in a two-way setup. You speak into your smartphone, and it comes out in their language. Then, the person you're talking to speaks into your smartphone and their voice comes out in your language."
The Techfest 2012 keynote has a demo of the technology around minute 13:00.
But I miss Microsoft Sam! (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
"Don't run, we are your friends"
Re: (Score:2)
Microsoft Research comes up with some pretty awesome concepts. Not all of them ever see the light of day, but they are one of the best R&D shops around in the tech world.
Re: (Score:2, Funny)
Dear aunt, let's set so double the killer delete select all.
AZN (Score:2, Insightful)
Japanese please!!!!
Re: (Score:3)
Sounds cool....but.. (Score:2)
Will they license this for PBX systems other than their own?
I would love a multilingual system like this. The audio is really good compared to the paid software that I have access to.
Re: (Score:2)
It's built using the MS Speech platform. There may be a port for Mac (does Office for Mac have TTS?), but in general, for a PBX system to use this, this part of the system has to be running Windows.
That said, the voices have been free. You can buy MS Speech voices from 3rd parties for lots of money if you want some more natural voices. This seems like a step towards the eventual downfall of highly trained specialized voices. The concept here is, hire a new voice actress, spend an hour, and translate her into
Re: (Score:3)
Why the hell not? It's a product like any other.
They sell Microsoft Office for operating systems other than Windows.
I just hope they do the same with this and not tie it into their own PBX exclusively. If they do, it will see a hell of a lot less production, that is for sure.
Re: (Score:2)
Presumably they'll do some rather simple math and see if it's worth it.
Re: (Score:2, Troll)
They sell Microsoft Office for operating systems other than Windows.
This concession to the antitrust authorities and Apple is something of an exception to the general rule and it was a brutal fight to make it come about. Your use of "operating systems" in the plural is interesting. Other than Mac OS X, which? Windows Phone shouldn't count in this context. Are there any others?
Re:Sounds cool....but.. (Score:5, Informative)
They sell Microsoft Office for operating systems other than Windows.
This concession to the antitrust authorities and Apple is something of an exception to the general rule and it was a brutal fight to make it come about.
What rubbish! The first version of Microsoft Office EVER was for the Mac in August 1989. The Windows release came out in November 1990. With whom did they have this "brutal fight" to get this released for the Mac?
Interestingly, according to Wikipedia [wikipedia.org], after the release of Word for the Mac in 1985 (2 years after Word for MS-DOS and Xenix), "Word for Mac's sales were higher than its MS-DOS counterpart for at least four years". It seems that Microsoft were rather pragmatic about selling software where it would make a buck!
Re:Sounds cool....but.. (Score:4, Informative)
The selective memory of you 'softie fans is amazing. There's a reason for these things. In 1986 Windows looked like this [wikipedia.org]. Sales of Mac Office kept Microsoft alive in this period. Microsoft Office was moved to reinforce Windows as soon as Windows was a credible environment. Windows wasn't even a credible platform until Windows for Workgroups (Windows 3.11) was released in November 1993, some 7 years later (or 1/3 of the time to present day). Mac Office was so lagging for a long while after WfW launch that it was effectively discontinued, and Office's superior support of the Windows platform was a huge part of Windows assuming dominance over the superior Mac OS which had come to rely on Office, which now offered degraded inferior performance and features on the Mac OS. There were some other shenanigans you can read about in the above links. It was a very successful strategy you can read more about here [groklaw.net] - enough horrifying content to keep you awake for years. But if that's not enough, you might try these [catb.org]. Microsoft through these lessons evolved a strategy where all their products have to reinforce each other, and that became their core strategy. And then...
Apple got some traction in their TrueType font rendering patent suit against Microsoft [slashdot.org] and the Justice Department was closing in on an antitrust action [wikipedia.org] legendary in its scope and reach. Bill Gates blinked, and they settled, and now there's Mac Office, but you can't say that it's fully supported. The Mac versions lag the Windows versions by some years and are not fully compatible with each other in ways that can't be explained by OS platform differences. The Office platform supports Windows now, as you can see by all the sockpuppets who come out every time somebody mentions some non-Windows operating system to say "you can't get Microsoft Office for that and you never will." And then the rest of us chime in "Application virtualization solves that problem."
Eventually Microsoft discovered political advocacy and contributed in various ways to the installation of a government more supportive of their business activities. Then the enforcement of antitrust protections to limit them and protect us against their abuse of their monopoly became lax, the limits were quashed until those protections expired. But that's another long story for another day.
Re: (Score:2)
Re: (Score:2)
This is what comes of dealing with the devil. You get what you asked for, not what you wanted - and it costs your soul.
At that time (the hiatus when Office for Mac was poorly or not supported) - you're right! - I can't say that was illegal. That's not for me to decide. The courts have found so, but I don't own them. They got away with it, so they won that one for the nonce. To call it a successful strategy is to stretch it to a general case, and I can't do that.
Your own comment about how foolish th
Re: (Score:2)
BTW, how do you know what I am able to imagine or not? You don't know me (clearly). And talk about morality generally makes me, well.. not want to talk anymore. So, good day to you sir
Re: (Score:2)
Nobody used Windows 1.0. People were using DOS. DOS shipped with every PC and it was funding Microsoft. MS Office only became relevant when WYSIWYG became possible on hardware sold at the PC price point, and at that time, the Windows 3.0/3.1 days, MS didn't care about Apple, they cared about Lotus 1-2-3 and WordPerfect.
In 1988, the Mac was a pretty computer used by families who could afford $2k on a computer which wasn't compatible with anyone else's $1k clone.
Re: (Score:2)
The big boss was impressed by another demo (Score:5, Funny)
"Programmeurs, programmeurs, programmeurs, programmeurs, programmeurs!"
Re: (Score:3)
SAM: "Ich bin ein Developer! Developer! Developer! Developer! Developer! Developer! Developer! Developer!STOP 80000X21 OOM_MONKEYDANCE_INFINITE_LOOP"
Comment removed (Score:5, Funny)
Re:First translation fail (Score:4, Funny)
instead of bobcat, hovercraft contained eels. would not buy again.
Mod up! (Score:2)
Re: (Score:3)
That's what the low quality garbled voice sounded like. What the Microsoft system actually said was "Hey, google is full of evil".
Re: (Score:2)
Better still:
"We am thy freighter Ursva, six weeks out of Kronos. Over.
We is condemning food, things and... supplies."
I haven't thought about that in years...
Heh (Score:3)
Remember a couple of weeks ago when we had that story about scifi nitpicks and someone griped about aliens in Star Trek always speaking English?
Theatrical review, circa 1599 (Score:3)
Verily, theis latest so-called play of Mr Shakespeare sucketh most bigge. Knoweth he notte that ye Romans (and may I be flayed with my own fibbling-cloth if Julius Caesar weare notte such) spake ye Latin?
Given the torment that foreign language class (Score:2, Funny)
Re: (Score:2)
I took several different languages. I am admittedly biased in that I'm a dyed in the wool linguaphile, but maybe you just had a shitty professor. In a couple of my classes there were people who wanted nothing to do with learning a language, but a good professor is what made the experience (for them anyway) bearable or even at times enjoyable. Well, as enjoyable as a class can be anyway.
Re:Given the torment that foreign language class (Score:4, Informative)
Hehe....
I am bilingual in English and another language. When I go to that country, many of the tourist attractions have price lists in English, Spanish, Russian, Japanese, you name it. Then they have one in the local language. The prices on that one are half of what they are for the tourists. And they're written out in words, not numbers, so if you can't read them you're SOL.
So yup, you don't need to speak the other guy's language, if you're willing to play by his rules.
Re: (Score:3)
Then one day, I went to Spain, and it WAS really great. I could speak with anyone, and I was leading the group, translating for everyone. It lasted a week.
Then I got home, and asked myself, "was that worth the time it took learning Spanish?" And the answer is no, no it wasn't, not at all. Even if I travel to a Spanish-speaking country for a week of e
Re:Given the torment that foreign language class (Score:5, Insightful)
That said, I don't regret learning Spanish, but learning it just so you can get a cheaper tourist trap is not worth it at all.
Of course it's not worth it, if all the benefit you find in knowing another language is saving a couple of bucks at some touristy place. But knowing a different language is much more than that. You now have access to new worlds of literature, movies, poetry and music first hand, without a translator as intermediary (because, as the Italians say, "traduttore, traditore"!). You can talk to more people directly, understand their culture, expand your mind. You can read a whole set of new web sites, see different perspectives, or read news that isn't easily available otherwise. It opens lots of new possibilities for you - for example if you want to work for a global company, or if you ever feel like working in a different country for a few years. And even without any of those, the very effort of learning a different language improves your brain and slows mental aging.
I'm relatively fluent in three languages now, and can more or less read another two. I read books in all of them, and I find it really enriches my mind. I just started learning a fourth (Japanese), and am really looking forward to reading Japanese books in their original form (even though learning enough of the kanji characters will be a pain).
Re:Given the torment that foreign language class (Score:4, Informative)
I just started learning a fourth (Japanese), and am really looking forward to reading Japanese books in their original form (even though learning enough of the kanji characters will be a pain).
Might want to check out this book [amazon.com], it is good. And since I'm giving completely unsolicited advice, the exposition of grammar in "Communicating with Japanese by the Total Method" is my favorite of all language textbooks I've seen.
Re: (Score:2)
Might want to check out this book, it is good.
Thank you! After reading the reviews, the book seems to match pretty well my initial goal (reading cursively). 99 bucks new is a bit steep, but I'll check my local library, or see if I can get it second hand.
Re: (Score:2)
Re: (Score:2)
Oh, I didn't realize that was an old edition. The new one is cheaper.
Heh, much better, thanks again!
Re: (Score:3)
Just a quick tip. Start on kanji as soon as possible. Knowing the kanji creates mnemonics for learning vocabulary. It also helps you decipher new vocabulary that you've never seen before. I wasted a lot of time before I realized that learning the kanji and vocabulary at the same time is *faster* than learning the vocabulary alone.
One more quick tip while I'm here (somewhat controversial, probably). Completely ignore polite speech until you have a good grasp of the underlying plain form. This is op
Re: (Score:2)
Just a quick tip. Start on kanji as soon as possible. Knowing the kanji creates mnemonics for learning vocabulary. It also helps you decipher new vocabulary that you've never seen before. I wasted a lot of time before I realized that learning the kanji and vocabulary at the same time is *faster* than learning the vocabulary alone.
Thank you, that makes a lot of sense. I just finished learning hiragana and katakana, and still practising reading/writing, but I'll try to start on the kanji as soon as possible - even though I expect it'll take me years to become anywhere near fluent :)
My teacher insists on the polite forms (the course is sponsored by the company, and, obviously, they're mainly interested in business interactions), but I try to go beyond that - I expect reading as much as I can will help there.
Re: (Score:2)
Thank you, that makes a lot of sense. I just finished learning hiragana and katakana, and still practising reading/writing, but I'll try to start on the kanji as soon as possible - even though I expect it'll take me years to become anywhere near fluent :)
"Remembering the Kanji" by James Heisig is a useful method for learning kanji. You can download the first third of the book from the publisher as a pdf here [nanzan-u.ac.jp]. I personally disagree with the order of learning, but the technique is sound. He suggests learning English keywords for all the common kanji before learning Japanese. I don't think that's necessary. Learning it in the same order that Japanese students learn will allow you to read as you learn, which I have found more effective. Also, some of his
Re: (Score:2)
How do you retain all of them conversationally? Did you grow up multilingual? My non-English skills are continually crumbling from disuse.
Re: (Score:2)
How do you retain all of them conversationally? Did you grow up multilingual? My non-English skills are continually crumbling from disuse.
I didn't grow up multilingual as such, but I did start pretty young - my parents got me a private foreign language teacher before I was even in school. She taught me the basics, and then I started reading a lot and that helped me build a vocabulary. Truth is, I wasn't reading in order to improve my language skills - I just liked the books, and wanted to understand what's happening.
I find it's important to keep regular contact with a language, even if the contact is fairly limited - I try to read bo
Re: (Score:2)
I know this won't change your mind, and maybe it only applies to Spanish South America, but still...
A German acquaintance said he finds it difficult to speak and convey emotions as intensely as Spanish speakers do. He said that, in Spanish, you can speak as if you were singing, with a smile on your face. I guess it has to do with the extra use of vowels...
You speak a language that can do this! Isn't that cool? :)
Unrelated: Maybe you should try Latin America? People there are warmer than Spaniards, and in some p
Re: (Score:2)
A German acquaintance said he finds it difficult to speak and convey emotions as intensely as Spanish speakers do. He said that, in Spanish, you can speak as if you were singing, with a smile on your face. I guess it has to do with the extra use of vowels...
lol I think this says more about German than it does about Spanish......
Unrelated: Maybe you should try Latin America? People there are warmer than Spaniards, and in some places it's hard to find someone who knows English, so your knowledge will come in handy.
I love Spaniards! There are some really pretty women there too. I lived in El Salvador for a while, and it was fine, and there are many good reasons to learn a language, but learning just so you can save a few bucks in a tourist market is a horrible reason.
Re: (Score:3)
I teach English to Japanese high school students. The vast majority of them will never speak English ever again. Nor will they need to. Here's what I tell them.
Not everyone needs to speak English. If you plan to stay where you are, probably you can avoid having to speak English. This does not imply that learning English is not useful for some people. I live and work in Japan and can do so partly because I speak/read Japanese. Life in Japan is hard if you don't speak and read Japanese. This is true in
Learning a language is NOT easy (Score:3)
"One of the advantages of learning a language is that it is easy."
For you maybe, not for me. I spent 6 months trying to learn German 5 days a week because I was visiting there on holiday. Got nowhere. Some people have a talent for learning languages, others don't.
"All over the world there are amazingly stupid people who can speak their native language fluently"
That's because children are coached in their own language 7 days a week 12 hours a day and yet it still takes 5 years until they can put together even
Re: (Score:2)
First, let me assure you that your feeling is normal. You are almost certainly *not* language-learning-challenged, as much as you may fear that you are. I can't write as much as I would like to in a Slashdot posting, and you likely wouldn't read it anyway, but I'll try to shine a light on where I think you're having difficulty.
I want to point out that 5 year-olds are extremely fluent in their native language. They have all the basic grammar and a vocabulary of 4-6 thousand words. This is not enough to h
Re: (Score:2)
Well fair enough, but I suppose there are different definitions of easy. It could be easy in the sense that it's possible for almost anyone given enough time (like building your own house), but not in the sense that you'll manage it in a week. Problem is I only had time when I was commuting to learn German from a book and some MP3s; it's not like I had anyone to practice with 24/7 like you do when speaking Japanese. There's only so far you can go with a language on your own. Beyond that, unless you live surr
Re: (Score:2)
I can't tell if that's an argument against learning a foreign language, or an argument for learning more than one.
That's fine, but... (Score:2)
Re: (Score:2)
It just means "do what needs to be done". There's no particular subtext to it, though I'm sure it's probably more common in some regional dialects than others.
Re: (Score:2)
It means prepare to revert the same.
How Does It Compare With Project Festival? (Score:2)
Isn't this the same thing that Project Festival has been doing since about 2004?
http://www.cstr.ed.ac.uk/projects/festival/ [ed.ac.uk] (try the demo)
Just FAIL (pipe dream?) (Score:4, Interesting)
1) The translations aren't semantically equivalent (as pointed out by commenters above). I can already say "Ich bin ein dummer Amerikaner" in my own voice, without machine help. If the meaning isn't there, who cares?
2) The machine accent ain't that great, either.
All of this makes me think this is still somewhat of a pipe dream. The AI guys have been selling the idea of machine translation for years and years-- at least since the 50s, when it was promised to eliminate the need for trained State Department linguists. It's never emerged because it's still a hard problem. Even Google's translate, which beats the MS stuff by some yards, produces results which range from awkward phrasing to just plain inaccurate and misleading.
He's selling a great idea, but it's kind of like the Fountain of Youth. It ain't there, vaporware.
Re: (Score:2)
All of this makes me think this is still somewhat of a pipe dream. The AI guys have been selling the idea of machine translation for years and years-- at least since the 50s, when it was promised to eliminate the need for trained State Department linguists. It's never emerged because it's still a hard problem.
Yeap. If you can't solve the hard problem, solve an easier one that looks similar. That's what these guys have done.
Re: (Score:2)
Why so cynical? Why so "a problem is either solved or there can be no progress towards it"?
It's not that dichotomy. If you read, my meaning was, "these guys are not making progress towards it." That is quite different than saying there can be no progress towards it.
The biggest FAIL (Score:2)
3) You have to train it for an hour?
I was actually slightly interested until I got to this bit and realized, like any other Microsoft "innovation," it wasn't really at all. Anyone can make a custom voice sample in about an hour. Hooking up simple voice recognition and text-to-speech is incredibly dull.
Had they actually interpreted intonation for semantics, and simulated and learned your voice in real time, it would have been pretty neat.
Re: (Score:2)
Shut down innovation, folks, as nothing's perfect! Close it down, boys, and head back to the caves.
Seriously, this is a ridiculously-early look at the technology. Calling it a fail is incredibly premature. FUD's not cool when anyone spreads it, remember?
Re: (Score:2)
It isn't exactly innovative, though - there have been ways of adapting TTS voices to sound like a particular speaker [festvox.org] that require less training than this for years. There's even an open source implementation of it in Festvox.
Re:Just FAIL (pipe dream?) (Score:4, Insightful)
He's selling a great idea, but it's kind of like the Fountain of Youth. It ain't there, vaporware.
Is he actually trying to sell a mature product, or is he just showing something cool? I'm not sure where the innovation is, if it's in being able to train text-to-speech to sound like your voice, preserving intonations and such across the translation (even though it's obviously not great at it yet), or if it's just in putting a few existing technologies together, but you have speech recognition, and a translator, and text to speech that sounds like your voice, then this is what you can have. Include preserving the intonation and you have something cool. So what if it's just showing off a cool application of existing technologies?
Translators aren't great but are getting better...speech recognition isn't great but is getting better. Preserving intonation across the translation and including in text-to-speech in a voice that sounds kinda like your own can probably get better too. Put the 3 together and you get something useful. I think that's all it's trying to show, and I think as these technologies get better we could end up with something pretty cool.
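To make the "put the 3 together" idea concrete, here is a minimal Python sketch of the chain. Every name in it is a hypothetical stand-in: a tiny phrase table plays the translator, the "audio" is just a dict carrying a transcript, and the "voice" is only a label. None of this is Microsoft's actual system or API.

```python
# Toy speech-to-speech pipeline: recognize -> translate -> synthesize.
# All names are hypothetical placeholders for illustration only.

# Stand-in for a real machine translation engine: a tiny phrase table.
PHRASE_TABLE = {
    ("en", "es"): {"hello": "hola", "thank you": "gracias"},
}


def recognize(audio):
    """Pretend speech recognizer: our 'audio' already carries a transcript."""
    return audio["transcript"].lower().strip()


def translate(text, source, target):
    """Look the phrase up; fall back to the untranslated text if unknown."""
    return PHRASE_TABLE.get((source, target), {}).get(text, text)


def synthesize(text, voice):
    """Pretend TTS: describe what would be spoken, and in whose voice."""
    return {"voice": voice, "text": text}


def speech_to_speech(audio, source, target, voice):
    """Chain the three stages; a mistake in one stage flows into the next."""
    text = recognize(audio)
    translated = translate(text, source, target)
    return synthesize(translated, voice)


result = speech_to_speech({"transcript": "Thank you"}, "en", "es", voice="rashid")
print(result)
```

Because each stage only sees the previous stage's output, any one of them can be swapped for a better engine without touching the other two, which is roughly why incremental improvements in each technology add up.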
If this was something out of any other company, would the same people be criticizing it?
Re: (Score:2)
>If this was something out of any other company, would the same people be criticizing i
Ehhn. I dunno. I'll say this. I'll give your answer 10 microLenats.
The Future of International Business (Score:3)
Chinese Businessman (via translated phone call): "An excellent idea! I suggest we sign the papers over dinner at Translate Server Error [boingboing.net]. They have the best HuMan chicken in town. And the owner prides himself on his bilingual staff."
So, two problems.
One, our text translation software isn't foolproof, but people expect it to be. What happens when the software confuses "galleta" (Spanish for "cookie") with "callate" (Spanish for "shut up")? They do sound similar if you say them out loud, but no one notices because you'd almost never use both in the same conversation. I foresee someone attempting a friendly gesture by offering to share her mother's recipe for "shut up."
Two, live conversations depend upon both parties building on a shared experience. If each one has a different account of the experience, conversations break down very quickly. Ever tried to carry on a conversation with a schizophrenic? And that's just assuming the errors are innocent. What happens when corporations start using this? Your bank requires you to call a number to activate your new card and during the call they have the software "translate" some required disclosure for you, only the translation doesn't really convey what they are supposed to be disclosing. Don't think it won't happen... whoever implements this first on purpose will be running the company one day.
Then again, this whole discussion is purely academic. Gene Roddenberry's estate will just claim prior art [memory-alpha.org] and prevent this from ever becoming a reality. Hopefully.
Re:The Future of International Business (Score:4, Informative)
Context is context. Obviously, an English speaker hearing a Spanish speaker offer to share a recipe for "shut up" on a (up until this point) benign and friendly conference call is going to assume translation error. Better than that, translation software knows about these little mix-ups better than you do. With text-to-speech output, there's not much to do but suffer the mistranslation (or maybe they play an audible 'ping' to warn about a context or idiosyncrasy error), but in a system that displays something on a device, these things tend to be shaded a different color, with options offered as to what other possible meaning may have been meant, based on context.
No, they don't. No one even expects paid human translators to be perfect.
Honestly, with a schizophrenic, chances are I have, at some point in my life, on IRC. But more to your point, I've played games where opposing sides are communicating from different languages via Google Translate. Think Russia vs US, and the only way to talk to them is via delayed Google Translate results. It's slow, it's tedious, and yet we somehow managed to have amazing rapport with people of like mind. The assholes were still assholes via Google Translate, and the people we wanted to work with we managed to communicate with. Again, you are ignoring the fact that incrementally better translation is still better than its predecessor. For now. Sure, one day we'll identify some uncanny valley with voice translation, and we'll all spend lots of time plotting how bad the translation software has to be for us to feel it's robotic.... but for now, any small step forward is better than the previous one.
Yup, god forbid someone spends time and money on a problem that sci-fi writers got to magically make disappear in one sentence, and a prop. Maybe someday some brilliant young chap will figure out how to make warp drive not require 3x the mass of the universe for power, and Gene's children can make some more cash. Hopefully.
Partnership (Score:2)
Microsoft Research comes up with a prototype that barely works. Apple wraps it up and gives it a foreign name and sells it like crazy.
Fools (Score:3)
Translations We'd Like to See: (Score:2)
I will not buy this record; it is scratched.
I will not buy this TOBACCONIST, IT is scratched!
Would you laaahik... would you LIKE to come back to my place, bouncy bouncy?
My nipples explode with delight!
Aah just go watch it yourself! http://www.youtube.com/watch?v=G6D1YI-41ao [youtube.com]
Frank Zappa's entry:
This is my left hand.
This is my right hand.
I have a big bunch of dick.
Aah, just go watch it yourself! http://www.youtube.com/watch?v=CkCYJ6FK0T4 [youtube.com]
Isn't teh internets great?
Accent? (Score:2)
The summary says "preserving the accent, timbre, and intonation of your actual voice". Now I can get timbre and intonation, but accent? It made me wonder what Mandarin with a Scottish accent sounds like: does it apply Scottish speech tones, which would make it unintelligible, or is it clever enough to find a social equivalent, maybe an accent of a small semi-autonomous region of China?
Unfortunately checking TFA reveals this "accent" part to be the Slashdot reporter's fantasy.
I don't see the need for all the 'training'... (Score:2)
...as there already exists an International Phonetic Alphabet [wikipedia.org], an alphabet that includes annotations for lilts, guttural intonations and such. Why not just add the IPA pronunciation of each word to a given language dictionary, and have the computer read that? This would greatly reduce the 'training' work needed by the end user. It would also open new possibilities for text-to-speech translation, or even speech-to-speech translation.
To date I have found no text-to-speech reader on any platform that can under
Re: (Score:2)
The training has to handle the way real people speak as opposed to the idealized way the words are transcribed. The sounds of words change when they are pronounced in a single sentence as opposed to individually. A single word is often pronounced multiple ways in a single English sentence. The IPA dictionary is also unlikely to be able to handle accents. That article on the IPA mentions that not all tones are supported. Chances are there are various other phonemes that the IPA doesn't support.
Above
Re: (Score:2)
The closest text-to-speech program today is eSpeak which uses an ASCII variant of IPA phonemes. The problem with this is that it has a voice for each language (which is essentially the same voice) with a subset of the IPA phonemes available. I am intending to use IPA fully in my own text-to-speech program and associated voices (http://rhdunn.github.com/cainteoir/) but haven't gotten to implementing the text to phoneme and phoneme to audio parts yet, nor the associated tools for working with them and the dif
Re: (Score:2)
Microsoft voice recognition, because it just works (Score:2)
Dear aunt, let's set so double the killer delete select all
Would have watched the video... (Score:4, Insightful)
... if only my software could translate a bytestream of type video/x-ms-asf into a video.
In light of this experience, why should I believe that someone actually invented a unidirectional universal translator? Nice try.
Over the years (Score:2)
All fine and good, but... (Score:2)
Re: (Score:2)
Yeah, text translation is exactly the same thing as speech translation. It must have been really hard for Google to get the 'accent, timbre, and intonation' of all that text just right.
Re: (Score:2)
Re: (Score:2)
I know this is slightly off topic, but why doesn't Google Translate have 4 boxes?
If you are translating language A to language B, then any reply is likely to be in language B and need to be translated to language A.
A bit of history would also be useful so you could scroll back up the conversation too.
Still, Google Translate is pretty good as long as you avoid ambiguous phrasing.
I'm not sure if Microsoft is achieving anything of use here. The intonation and tone of a sentence of one language may be completely
Re: (Score:2)
I know this is slightly off topic, but why doesn't Google Translate have 4 boxes?
If you are translating language A to language B, then any reply is likely to be in language B and need to be translated to language A.
You can switch the languages around with a single click on a button (I'd post the symbol if /. wasn't broken). Having four boxes would just make the layout confusing for new users, in my opinion.
Re: (Score:2)
Two instances work as well (2 tabs), but you're always going to have to wait longer than what it's possible to achieve. There is probably an API for Google Translate which would give you the option to create a more advanced page.
Still, there is no real substitute for learning the 2nd language if you need it regularly.
Cue brain on Android is pretty good for vocabulary building, with lots of flash cards and, with the right language plugin, spoken words too.
(asked one of the developers for the option to just speak the words, ra
Re:Do they sound alike? (Score:5, Funny)
Re: (Score:2)
I completely agree. It is total garbage and if it isn't absolutely flawless in every possible regard, then it should not even have been attempted.
It's not garbage, and if they had real innovations, it would be nice. Instead, they've taken a few characteristics of a speaker, like pitch, and used those to model the computer voice in another language. It's about as interesting as if someone said, "what would you look like if you were a boy?" (or girl, if you are male), and then sampled your eye color, hair length, nose shape, etc, and then morphed those into a stock photo of a boy. Yeah, it would have some characteristics of you, but it also wouldn't be
Re:Do they sound alike? (Score:5, Insightful)
It's not garbage, and if they had real innovations, it would be nice. Instead, they've taken a few characteristics of a speaker, like pitch, and used those to model the computer voice in another language.
No, if you listened to the keynote, they took speech characteristics, and then broke the target voice pattern up into 5ms pieces and reconstructed the voice to match a reference translation from a different language. What they are doing is not only very interesting, but clearly has space for improvement and a variety of applications.
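For a sense of scale, here is a minimal sketch of just the slicing step, assuming plain mono samples in a list. The function name and the 16 kHz sample rate are made up for illustration; how the pieces are then matched against the reference translation is not described in this thread and is not shown.

```python
def split_into_frames(samples, sample_rate, frame_ms=5):
    """Cut a sequence of audio samples into consecutive frames of frame_ms milliseconds."""
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame is shorter than one sample at this sample rate")
    # The last frame may be shorter if the length is not a multiple of frame_len.
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

# One second of silence at an assumed 16 kHz sample rate.
frames = split_into_frames([0.0] * 16000, 16000)
print(len(frames), len(frames[0]))  # 200 80
```

At 16 kHz a 5 ms piece is only 80 samples, so one second of speech becomes 200 pieces to choose and stitch together.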
It's about as interesting as if someone said, "what would you look like if you were a boy?" (or girl, if you are male), and then sampled your eye color, hair length, nose shape, etc, and then morphed those into a stock photo of a boy. Yeah, it would have some characteristics of you, but it also wouldn't be what you would look like if you were a boy.
That's sort of the point. The sampled voice may not speak fluent Mandarin, but if you'd like it to, this technology will allow it to. A better analogy would be along the lines of taking a computerized sample of your body shape and texture, (skin, hair, face, etc), and then using 3D animation to reconstruct a model of you doing karate, even if you didn't actually know karate.
Eventually, as the 'resolution' improves, the bits of this that you disapprove of, (the computerized feel you are getting from the voice), will most certainly improve as well. But it's the underlying ideas and tech which are interesting here.
Re: (Score:2)
No, if you listened to the keynote, they took speech characteristics, and then broke the target voice pattern up into 5ms pieces and reconstructed the voice to match a reference translation from a different language. What they are doing is not only very interesting, but clearly has space for improvement and a variety of applications.
Incidentally, if you want to do similar voice matching of text-to-speech yourself I believe that the open source Festvox project has supported doing this for a few years now, though it's not terribly well-documented. See festvox/src/vc/HOWTO. You'll need to record some sample phrases in your own voice for the voice transformation code to work from, but apparently the Microsoft demo requires that too.
Re: (Score:2)
Re: (Score:2)
...Patents being the way they are these days, I guess I kind of do care who does it.
lol then teach your kids to not forget the big stuff if they choose to focus on the small stuff.
Re: (Score:2)
Yeah, they're just trying to play catch-up with Intellivision's latest technology [youtube.com]
Re:I see where this is headed. (Score:4, Funny)
I want to hear a TTS that can turn Punjabi into Valley Girl.
Re: (Score:2)
If you read the output of any given chatbot in a valley-girl voice, it will pass the Turing test.
Re: (Score:2, Funny)
Like as if
Re:I see where this is headed. (Score:5, Informative)
Provided that the speech recognition engine is good enough, it can distinguish between the /Q/ and /A/ sounds in lot (British English: /lQt/, General American English: /lAt/), cot, hot, etc, with /A/ also appearing in father /fA:D@/. This will mean that the speech recognition engine will record the actual phonemes spoken, rather than the phonemes it thinks are being spoken. With this, it can then build up a database of phonemes to the recorded audio.
When a given language is selected (strictly speaking it is a language + accent, as Liverpudlian English sounds different to Australian English and Mexican Spanish sounds different to Argentinian Spanish) it will have a set of rules that describe how to convert the text into phonemes specific to that accent (for example, "ook" is usually pronounced /Vk/ in English, but in Scouse English it can be /Vx/). These rules provide a set of phonemes required by the language+accent to speak it properly.
The phonemes are transcriptions of IPA-based phonemes (http://en.wikipedia.org/wiki/International_Phonetic_Alphabet). If you plot the phonemes available by the voice on the phoneme charts, you can fill in more phonemes that are similar (e.g. using /A/ instead of /Q/ if the voice does not support /Q/, or an untrilled /r/ if the trilled version is not supported, where a trilled /r/ can be found in Spanish).
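A rough sketch of that fallback idea, using the same ASCII phoneme symbols as above. The fallback table, the "r_trilled" label and the function name are hypothetical and only for illustration; this is not eSpeak's API or data format.

```python
# Hypothetical table: phoneme the voice may lack -> similar phonemes to try, in order.
FALLBACKS = {
    "Q": ["A"],          # use /A/ if the voice does not support /Q/
    "r_trilled": ["r"],  # use an untrilled /r/ if the trill is not supported
}

def map_to_voice(phonemes, voice_inventory):
    """Replace phonemes the voice cannot produce with the nearest supported one."""
    result = []
    for phoneme in phonemes:
        if phoneme in voice_inventory:
            result.append(phoneme)
            continue
        for candidate in FALLBACKS.get(phoneme, []):
            if candidate in voice_inventory:
                result.append(candidate)
                break
        else:
            # No supported substitute: the voice cannot speak this language+accent.
            raise KeyError(f"voice cannot produce /{phoneme}/ and has no fallback")
    return result

american_voice = {"l", "A", "t", "r"}  # assumed inventory of a General American voice
print(map_to_voice(["l", "Q", "t"], american_voice))  # ['l', 'A', 't']
```

A real table would be derived from positions on the IPA charts rather than written by hand.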
Then, provided that the voice can handle all the phonemes in a language+accent, you can then map between the two, allowing your English speaking voice to speak German, Chinese, Afrikaans or whatever language you have data for. The eSpeak text-to-speech program does a simple version of this to make the German, Polish, Swedish, Romanian, Dutch, Hungarian, French and Afrikaans MBROLA voices speak English.
You can also use it to have a voice support different accents, provided you have the rules for producing the correct phonemes.
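As a toy illustration of accent-specific rules, here is a sketch that reproduces the lot/cot/hot example from earlier in this comment. The accent tags, the tables and the letter-by-letter approach are assumptions for illustration; real rule sets are context-sensitive and far larger.

```python
# Letter-to-phoneme rules shared by both accents (toy subset).
SHARED_RULES = {"l": "l", "t": "t", "c": "k", "h": "h"}

# Hypothetical per-accent overrides: the vowel in lot/cot/hot.
ACCENT_RULES = {
    "en-GB": {"o": "Q"},  # British English
    "en-US": {"o": "A"},  # General American English
}

def to_phonemes(word, accent):
    """Convert a word to phonemes using the shared rules plus the accent's own rules."""
    table = {**SHARED_RULES, **ACCENT_RULES[accent]}
    return [table[letter] for letter in word]

print(to_phonemes("lot", "en-GB"))  # ['l', 'Q', 't']
print(to_phonemes("lot", "en-US"))  # ['l', 'A', 't']
```

The same voice then gets a different phoneme sequence per accent, which is all that "supporting an accent" means at this level.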
Re: (Score:3)
microsoft is like nestle, never to be trusted again
That reminds me. I need to pick up some chocolate milk powder.
Re:microsoft and their credibility (Score:5, Funny)
It is no surprise that Excel is being used for engineering [google.com] given its power and flexibility. Hell, a shop I worked for used Excel as its database.
Now let's get down to the nitty-gritty - Visual Studio is one of the most powerful IDEs on the face of the planet. You want power? You got it. You want speed? You got it. You want both? It empowers you, the ninety-pound weakling, with both, with minimal effort. I got a raise because I used Visual Studio. I got my dick sucked by my boss' hottest secretary because I wrote a patch in C# that prevented our ERP system from total meltdown.
Why be some boring open-source ODBC slob when you can be fast. Quick. Nimble. Packing.
Be potent. Be Microsoft.
Re: (Score:3)
Stay thirsty, my friend.
Re: (Score:2)
I got my dick sucked by my boss' hottest secretary because I wrote a patch in C# that prevented our ERP system from total meltdown.
Let me guess, that was two weeks ago and the ERP system was also from Microsoft? ;)
Re: (Score:2)
The first paragraph sounded like it should be in the voice of those youtube videos like the one with the "webscale" bears.
Re: (Score:2)
Why be some boring open-source ODBC slob when you can be fast.
That should be "open-source X/Open SQL CLI slob", given that ODBC is a Microsoft term for (more or less) the same thing.
</pedant>
Otherwise epic.
Re: (Score:2)
And, if it does do Gorn - will it blink?
Re: (Score:2)
Speech in Language 1 by the user gets converted internally to text using speech recognition software. This then passes through a Language 1 to Language 2 translation program. The text in Language 2 then gets spoken using text-to-speech software using a voice created from the user's own voice. The bit in the middle is transparent to the user, which is why it looks like speech-to-speech.
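The chain described above can be sketched as three stages composed in order. The function names and the toy stand-ins below are hypothetical; real speech recognition, translation and text-to-speech engines would take their place.

```python
def speech_to_speech(audio, recognise, translate, synthesise):
    """Chain the three stages; each argument after audio is a callable."""
    text_l1 = recognise(audio)    # speech in Language 1 -> text
    text_l2 = translate(text_l1)  # Language 1 text -> Language 2 text
    return synthesise(text_l2)    # Language 2 text -> audio in the user's voice

# Toy stand-ins so the sketch runs end to end.
def fake_recognise(audio):
    return audio["transcript"]

def fake_translate(text):
    return {"hello": "hola"}.get(text, text)

def fake_synthesise(text):
    return {"voice": "user", "says": text}

result = speech_to_speech(
    {"transcript": "hello"}, fake_recognise, fake_translate, fake_synthesise
)
print(result)  # {'voice': 'user', 'says': 'hola'}
```

The caller only sees audio in and audio out, which is why the middle steps are invisible to the user.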