Comment Re:I think you've already decided... (Score 1) 600

The fact remains that in most sensible implementations, the user is unable to run arbitrary code outside his own directory.

And it's a completely meaningless fact, since arbitrary code run from outside your own directory will have exactly the same privileges as arbitrary code run from inside your own directory.

I think you mean that the user is unable to run code under someone else's user ID (such as root).
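For what it's worth, this is easy to check for yourself. Here's a minimal sketch, assuming a POSIX system with Python available: it drops the same tiny script into two unrelated temporary directories (stand-ins for "your own directory" and "somewhere else") and runs each one. The helper names `euid_when_run_from` and `compare_locations` are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

# A tiny script that reports the effective user ID it runs under.
SCRIPT = "import os\nprint(os.geteuid())\n"


def euid_when_run_from(directory):
    """Write SCRIPT into `directory`, run it, and return the reported EUID."""
    path = os.path.join(directory, "whoami.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def compare_locations():
    """Run the script from two unrelated directories and return both EUIDs."""
    with tempfile.TemporaryDirectory() as inside, \
         tempfile.TemporaryDirectory() as outside:
        return euid_when_run_from(inside), euid_when_run_from(outside)


if __name__ == "__main__":
    first, second = compare_locations()
    print(first, second, first == second)
```

Both runs should report the same user ID, since the ID comes from whoever launched the process and not from where the file happens to live.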

Comment I totally called this. (Score 1) 226

I totally called this, back in 2007, when LiveJournal started to use SpinVox's services.

I was suspicious at the time, and started to look for information. What I found made me absolutely sure that at least part of it wasn't actually as automated as it was made out to be, and in fact, gave me the distinct feeling that it was mostly manually done by humans.

I started to write an article on the subject that I was going to publish in the LJ community "no_lj_ads". Being a Support volunteer, I had access to the feature before it was released for general use, and I was able to make some observations. However, although I made good progress on the article, it was never finished. There were lots of points to make, and it wasn't long after that that LiveJournal was the subject of a controversy known as "Strikethrough". The article got buried on my computer and forgotten about, half-finished.

In 2008, I dug up the article again, completed it using notes that I had left, and reposted it to my LiveJournal. I'll reproduce it here, too, because I think people will be interested.

Remember, this article was originally made in 2007. Because of that, some of the links are now defunct. The article has been slightly edited in places in order to note where this is the case; these edits will be noted [2008: Like this!] or [2009: Like this!], depending on whether I noticed it in my 2008 reposting, or in this 2009 reposting.

On to the article!

The Problem of Logistics
First, let me address the obvious problem of logistics. Yes, logistics are a big problem. LiveJournal has tons of users, to put it mildly, and SpinVox already has quite a lot of clients, I believe. If SpinVox weren't fully automated, how could they solve this problem? Is SpinVox some sort of sweatshop?

To tell the truth - I don't know how SpinVox solve that. It seems like for that reason alone SpinVox would be an automated system, and I'll be the first to admit that it's a good question that deserves an answer, and a good reason to believe it's automated. On the other hand, though, I believe I have evidence that shows pretty strongly that all is not automated. I'll be covering that evidence here.

The Evidence
Well, let's get started with some obvious points. The first thing to do is to look at some random people's journals and check out the quality of the transcription for yourself, so go check out the post in paidmembers and click to some random commenters' journals. Chances are, most of them will probably have made a voice post by now to test the system, and auto-transcription only occurs on public entries, so you have a good chance of finding some. Heck, some commenters link to their posts for you. Go check them out. I'll be here when you get back. (If you want, you can also try this Google Blog Search query for recent voiceposts too, but not all of them will be from paid members, and Google doesn't pick up all of them.)

Okay, you're back? Cool. You've probably noticed that the quality of the transcriptions is really pretty good, but obviously it still makes mistakes. That's okay - it's to be expected, from an automated system, right? And yes, it *is* to be expected. No automated system is perfect. Mistakes will always be made. I encourage you to bear this in mind and be skeptical about what I have to say. Analyse it for yourself; don't let me brainwash you. Be skeptical, it's healthy for you.

Having said that, however, SpinVox is still very awesome, if we consider it to be automated:

1. It understands a wide variety of accents.
2. It understands when you speak quickly.
3. It works over the phone.
4. It doesn't mind background noise, or quiet voices.
5. It knows when and how to filter words that aren't relevant, such as "um" and "er".
6. It can link Web addresses perfectly.
7. It can infer context and know what sort of words to use in a transcription.
8. It doesn't keep your mistakes - just start a sentence over and it'll use only the new one.

Hmmm... this is starting to sound a little less than fully automated, ne? While none of them are a smoking gun in themselves, taken together they do add up to quite a bit of suspicion. Let's examine them one by one.

1. It understands a wide variety of accents.
This one is fairly obvious. You'll have noticed during your paidmembers listening (you *did* do that, right?) that a wide variety of accents are represented. This is virtually assured by the LJ userbase - it encompasses people from everywhere. Even in the United States alone, there are tons of different accents to worry about.

The problem of accents may seem small (sometimes!) to a human, but they're big hurdles for a computer. One person's "eee" sound might be another's "ehh" sound, which could also sound like an "A". A computer would have to do some pretty extensive processing in order to work out the best way to interpret the accent - and even then everybody has their own voice *regardless* of their accent.

Accent recognition probably *can* be done, which is why it's not in itself a smoking gun. But generally most good voice recognition software packages have to be trained at least a little to your voice before they'll interpret free text reliably. The ones that don't are mostly limited in what they can do, generally being more suited for a limited number of choices.

2. It understands when you speak quickly.
Again, this is nothing new. It used to be that voice recognition systems required you to speak each word separately and distinctly for them to recognise you. That's no longer the case, and nowadays voice recognition systems will accept a 'normal' way of speaking, for some definitions of 'normal'.

SpinVox, however, doesn't mind if you talk quickly. See this post from ridicully - I think it's fair to say that she talks pretty quickly in this case. Notice how a lot of words are picked up even in the fastest bits. Some are done wrongly ("conscription" instead of "transcription", for example), but for the most part it's really good. Give that same recording to a commercial voice recognition program, however, and I doubt it would be quite so good. (Disclaimer: I don't have a recent voice recognition program to test this with, and for the purposes of this article I didn't feel it would be fair to try anything with an old program, so I've done no testing on this part; take it with a pinch of salt.)

As another example, check out this post from lacey. While the post itself isn't said fast, the "I don't care" of "Apparently, oh hell, I don't care" bit definitely is, and I was surprised to find it in the transcription. (Also note an invocation of rule #5 - it got rid of the "daa de daa de daa" just before that bit.)

3. It works over the phone.
Yes, yes, I know this is an obvious thing to say, considering how SpinVox works. But it's still important, since the quality of a phone conversation differs markedly from a normal conversation. Admittedly, in some ways, this can actually make it easier for a voice recognition system, especially for people who happen to be rather bassy in the first place. I still feel it's worth noting, however.

Additionally, using a phone also brings with it its own small crackles and noises, which also leads me nicely onto...

4. It doesn't mind background noise, or quiet voices.

This is one of the more surprising things. With a system as good as this, you would expect it to require a clean environment - no background noise and a loud, clear voice being two of the requirements (one of the others being to have a voice with the same characteristics as the one that was learnt from - see #1 above). However, SpinVox doesn't mind if it has background noise, or a quiet voice.

Take a listen to this post from spicy_mustard; the user here is speaking quietly, meaning it would be harder for an automatic transcription to pick up the voice; in a sense, this is akin to having more background noise, which is why I put the two in the same section.

For background noise, and random pops/crackles, check out this post by johno. Notice that despite the fact that there are all sorts of problems with the quality of this voice post (which, I should stress, is not the user's fault - it's more likely to be the phone they're using, or some weird problem with the local number they were calling), and the fact that the voice constantly clips (again, not the user's fault), SpinVox manages to get an awesome transcription. It seems to cut off in the middle though, for some reason; weird since the same thing doesn't happen to longer voiceposts.

5. It knows when and how to filter words that aren't relevant, such as "um" and "er".
I'm sure you've noticed in all the examples so far that the transcription doesn't include such filler words. This isn't such a feat in itself; assuming the rest of the transcription worked well, it's something that could probably be done fairly easily. But it's one more point to remember.

...and humming, laughing, blabbering, etc?
If that were the only thing that gets filtered, I'd be onto the next point by now. However, that isn't the only thing that gets filtered. It also manages to filter things that don't sound like filler words at all. You already saw/heard an example back in #2's writeup - lacey's voice post included a "daa de daa de daa" that got completely cut out. It didn't have a "___" or a word or two followed by "(?)", it just got completely cut out. Why? How would an automated system know that wasn't something that it should have at least *attempted* to parse?
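To show what I mean by filler words being the easy part, here's a minimal sketch of list-based filtering. It assumes the recogniser has already produced text, and the `FILLERS` list and the `strip_fillers` name are my own inventions for illustration, not anything SpinVox is known to use.

```python
import re

# Hypothetical filler list; a real system would need a far longer one.
FILLERS = {"um", "er", "uh", "erm"}


def strip_fillers(text, fillers=FILLERS):
    """Drop any whitespace-separated token that is a known filler word."""
    kept = []
    for token in text.split():
        # Compare case-insensitively and ignore surrounding punctuation.
        bare = re.sub(r"^\W+|\W+$", "", token).lower()
        if bare not in fillers:
            kept.append(token)
    return " ".join(kept)
```

A fixed list like this handles "um" and "er" without any trouble, but it leaves "daa de daa de daa" untouched, because nothing on the list matches it. Dropping that sort of thing takes some judgement about what the speaker actually meant.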

How about laughing? Take a look at these transcriptions of a post from azurelunatic. Note that since the post has multiple transcriptions, I'm linking to the transcriptions page so you can see SpinVox's version. She laughs a few times in the recording, the most obvious being after "It seems that everybody in the room knows Rent." But SpinVox somehow knows not to try to transcribe it. Granted, it might be a bit easier than the "daa de daa de daa" above, but still. I'm sure you can find more instances of this sort of thing.

And extraneous talking? See this post from thad_duir (warning: Swearing in recording!). The automatic transcription for this post somehow cut out the entirety of the beginning of the recording, including several bits which would have been perfectly transcribable automatically, and focused on the main, *actual* message. Hmm.

6. It can link Web addresses perfectly.
To demonstrate this, I need to talk about simoncrowfoot [2008: This journal existed back when I wrote the article, and it was a normal personal journal]. That journal was set up before LiveJournal had SpinVox integration, and it's a publicly available demonstration of SpinVox's services and how you can post to LiveJournal with them. It doesn't use voiceposts, so you can't get at the original recordings and people can't post alternative transcriptions (unless they use the comments). However, there are still some things you can glean from it.

One of these things is this post: [2008: This link is now defunct, as the journal was deleted. Basically, though, as part of the post there was a link to]. Notice how the Web address that was spoken in that post got converted into a link perfectly. Now, that may not seem like much to some of you but that would be really, really hard to do automatically, especially since the address is made up of compound words - techbrew and jetstream. "techbrew" isn't even a word, so it would have had to have special exceptions in place to take two separate words and splice them together when it believed it was in a Web address. Not easy.

7. It can infer context and know what sort of words to use in a transcription.
Now, this is where the fun really starts. English is one heck of a language and has all sorts of fun rules for what to do with words in certain contexts, etc. It's asking too much to expect me to believe that an automated system knows all those rules regarding context and to apply them in the correct manner, but that's exactly what SpinVox does.

Some examples? Okay. The first and most obvious one is the use of names - people and non-people. Some examples:

  • - it correctly recognised "Silent Hill" as a proper noun and capitalised the words, and in fact the poster was also suspicious about it. Even if an automated service were to know that "Silent Hill" could be a proper noun, it'd have to do some extensive analysis of the context and grammar to figure out that was what was expected.
  • - this post has both a mistake and some correct capitalisations. The "Kids In The Halls" is a mistake, I believe; it should have been "I have decided to read from the kids in the halls' Dr. Seuss' Bible." It's an understandable mistake - but only if you're using context to figure it out, since there's no way that it could have been capitalised otherwise [2008: By this, I meant that SpinVox was treating it as a book title based on context, not on whether it's an actual title or not]. The correct capitalisations are also only things you can work out by context - the capitalisation of "Son" and "Father". "King of the Jews" is also capitalised, but to be fair, that one occurs more often in contexts where the capitalisation of "King" would be expected, so you could theoretically just capitalise that phrase indiscriminately.
  • [2008: Now defunct.] - this post reads perfectly fine, doesn't it? But listen to the actual recording. The auto-transcription misses words; consider the sentence "if it's something I feel will work properly I think I may use it much more frequently than I have in the past". The actual recording says "If it's something that I feel will work properly...". Also, "Certainly I am grateful to some of the folks out there kind enough to transcribe what I post in the past..." becomes "Certainly I am grateful to some of the folks out there who have been kind enough to transcribe what I post in the past...". You be the judge on this one.


  • Full names are recognised - first names and surnames. To an extent, this can be explained by English rules; if you say something that's recognised as a first name, there's a good chance that the next word is going to be a surname. However, as I think you can tell, there are by now *lots* of different rules in the mix that SpinVox apparently knows about. The more rules you add to a system like this, the more likely that it a) becomes less flexible to variations on that rule, and b) comes up with false positives. SpinVox seems to do neither.

    There are also cases where context tells you virtually nothing about the status of a name, yet it still correctly capitalises. For example, the transcription from SpinVox on this post from intendent [2009: Now defunct.]. One of the sentences is: "Let's see if it can understand Brandy." (As it turns out, her name's Brandi, not Brandy, but that's nothing to do with this and should be discounted.) SpinVox somehow knew it was supposed to be a name, even though context tells you very little, if anything, and the word "brandy" is a legitimate noun. You'd basically have to know that most people testing a service like SpinVox will test it with their own name. The system would have to be incredibly complex in order to know that "Brandy" was likely to be a name in this case.

  • SpinVox uses abbreviations even when you say the full word. For example, 'Apt.' instead of 'apartment', 'hrs' instead of 'hours', 'Fri' instead of 'Friday', and of course, the ubiquitous and much-hated '&' instead of 'and' (link not given because, well, it's pretty obvious as it appears in just about every single voicepost). There are others too, of course.

    Why does it do this? What is there to be gained from using abbreviations in a supposedly automated service? Maybe I'm missing something here; if so, please let me know.

At this point, the original article was left unfinished. However, the following is what I continued with in 2008, with what I felt I would have intended to say from here on.

8. It doesn't keep your mistakes - just start a sentence over and it'll use only the new one.

At the time, I had come across at least one voicepost where the person started to speak a sentence, but then made a mistake and started over. The auto-transcription from SpinVox totally discarded the first attempt at the sentence and just kept the second.

Again, this is the sort of thing that you can only really tell is needed by context, and IMO requires a certain amount of understanding of what they're actually saying. While I fully believe it'll be possible to write an automated system to do this, it would be large and very complex.

Unfortunately, I have no link to the actual post as I didn't have it in the unfinished article.

I would have used this section to describe a number of problems with the supposedly 'automated' system. I have a few notes about what I would have put in here, but it mostly all comes under one heading, so I'll do it that way instead.

SpinVox makes spelling mistakes.

If the above points didn't convince you that SpinVox is powered at least mostly by humans, this will. On some posts, SpinVox actually made spelling mistakes. Below are some examples...

  1. - in this post, hammond talks about "freaking out" - but SpinVox transcribed it as "freeking out".
  2. - here, "absolutely" was misspelled as "absolutley". Also, notice that the transcription uses the non-word "ya".

This sort of mistake isn't something a computer program can do accidentally. It would be working to a dictionary. Okay, so maybe those bad words got into the dictionary. It's possible, I guess, but...
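To make the dictionary argument concrete, here's a minimal sketch. It assumes (as I do above) that a recogniser can only ever emit words from its own word list, so any word outside that list points to someone typing. The `VOCABULARY` set is a toy stand-in and `out_of_vocabulary` is a name I made up.

```python
import re


def out_of_vocabulary(transcript, vocabulary):
    """Return the words in `transcript` that are absent from `vocabulary`."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [word for word in words if word not in vocabulary]


# Hypothetical toy vocabulary; a real recogniser's would be far larger.
VOCABULARY = {"i", "was", "freaking", "out", "absolutely"}

if __name__ == "__main__":
    print(out_of_vocabulary("I was freeking out", VOCABULARY))
```

Run against a transcript containing "freeking", the check flags that one word, which is exactly the sort of output a dictionary-bound system shouldn't be able to produce on its own.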

Still not convinced? There's a smoking gun to reveal. However, SpinVox's site has changed in the last year so I'll need to describe what was on it.

There used to be a Flash applet on the front page of SpinVox's site that would invite you to call a number, dictate a message, and have the applet display the text version of what you wrote. Your converted message would be given an ID, and you could use that later to come back and look at the message. These converted messages were stored, and it was possible to look through the history of what had been converted. So I did so.

I found a message with the ID of 15217 that contained a smoking gun - the word "SinVox", where clearly it was meant to say "SpinVox".

Let's examine this in more detail.

  1. The 'word' is compound - that is, it consists of two words which are otherwise perfectly fine - "sin" and "vox".
  2. There is no reason, however, for those words to be merged - there is no dictionary word "sinvox", and even if the user had actually said that (you didn't get to hear the original recordings, unfortunately), there's nothing to say that it shouldn't have been transcribed "sin vox" instead.
  3. The word is capitalised, as if somebody had tried to type "SpinVox" and missed the "p" - which is just weird. We've already established that there's no dictionary word like this. The word is capitalised, and as such, it could not have been accidentally added to the dictionary, if one is indeed used; even using AI techniques, it would have been transcribed as "sinvox" or "Sinvox" at the most - not "SinVox". This is, absolutely, 100%, human error. I see absolutely *no* way that this could be anything else.
  4. It's also the only time that I've seen this transcription - if it was indeed an automatic transcription, then I'd expect to see it a lot more frequently.

Unfortunately, although I *thought* I'd made a screenshot of this at the time, I can't find it right now. In addition, the site has changed so that there's no way to see that sort of thing now. I'll keep looking for that screenshot.

Comment Re:Boy oh boy! (Score 1) 414

It's more analogous to counting people as drivers of cars when in reality they're only ever passengers.

Nobody's saying that Linux servers aren't used. What the GP is saying is that you can't count *every single user* of some popular site as a user of the OS that site runs on.

Or to put it another way: Let's say 70% of the Web-browsing public uses GMail. (which, of course, is a number I pulled straight out of my ass.) Does that mean 70% of the Web-browsing public are Linux/GoogleOS/whatever-OS-GMail-runs-on users? No, and to try to say otherwise is just outright skewing the numbers. They're GMail users, and that's all you can say about them. It makes no sense in this case to say that Linux use is up from a user perspective.

Now, had you framed it in the context of the servers themselves - with more users of the service equating to needing more Linux servers to cope with the load - then you might have a point. (though even then, it's still only use by one company.)

Comment Re:Reason? (Score 1) 779

Being able to inspect the source isn't the be-all and end-all. In some cases there may be more than you bargained for.

It depends very much on specific circumstances, of course, and with the fast progress of software nowadays you'd really need to be in control of both the compiler source and the target's source to pull this off. But the possibility is there.
