
Comment: Re:The Real Question Is ... (Score 1) 91

by DavidShor (#37304466) Attached to: Facebook Testing Translate Feature For Comments?
"So in theory slang and abbreviations would be no more difficult to translate than dictionary words."

Sure. The problem is that the slang and abbreviations need to show up in their corpus. From what I understand, Google mostly uses Canadian and European Parliament proceedings for its sample. "LOL" doesn't show up much there...

Comment: Re:The Real Question Is ... (Score 1) 91

by DavidShor (#37303068) Attached to: Facebook Testing Translate Feature For Comments?
Nowhere, though maybe they can do some statistical magic. I mentioned this down-thread. Corpus, as far as I know, applies to monolingual collections of text as well.

One thing they can do is use statistical models of language to infer what unknown words "should" mean. They could even incorporate phonetic priors (e.g., "qui" sounds like "ki").

Comment: Re:The Real Question Is ... (Score 1) 91

by DavidShor (#37302624) Attached to: Facebook Testing Translate Feature For Comments?
This is actually a legitimate issue with translation. I have a lot of teenage cousins from Paris, and they butcher their language as much as our teens do ("que" turns into "ku", "qui" into "ki", "non" into "nn", etc.). I actually speak French, so I can sort of trudge my way through it. But my cousins from Israel do it too, and the slang and misspellings completely throw translation software off.

Facebook right now has an oddly rich corpus of multilingual slang; they'd be in a good competitive position vs. Google Translate if they went through the effort to incorporate it into their translation.

Comment: Re:My prediction (Score 1) 228

by DavidShor (#37117906) Attached to: Santa Cruz Tests Predictive Policing Program
Actually, most statistical analyses of cop behavior show that cops still harass minorities more than you'd statistically expect from their greater chance to commit crime:

"In the period for which we have data, 1 in 7.9 whites stopped were arrested, compared with approximately 1 in 8.8 Hispanics and 1 in 9.5 blacks. These data are consistent with our general conclusion that the police are disproportionately stopping minorities; the stops of whites are more “efficient” and are more likely to lead to arrests"

http://www.stat.columbia.edu/~gelman/research/published/frisk9.pdf
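
To make the quoted figures a bit more concrete, here is a quick back-of-the-envelope sketch that turns "1 in N stopped were arrested" into arrest rates per stop. The three numbers come straight from the quote; the variable names and the choice to compare each group against whites are just my own illustration, not anything from the paper.

```python
# Stops per arrest, taken from the quoted passage ("1 in 7.9", etc.).
stops_per_arrest = {"white": 7.9, "hispanic": 8.8, "black": 9.5}

# The arrest rate per stop is just the reciprocal.
arrest_rate = {group: 1.0 / n for group, n in stops_per_arrest.items()}

# Illustrative comparison (my assumption, not the paper's method):
# each group's arrest rate relative to the rate for whites.
relative_to_white = {
    group: rate / arrest_rate["white"] for group, rate in arrest_rate.items()
}

for group in stops_per_arrest:
    print(
        f"{group:9s} arrest rate per stop: {arrest_rate[group]:.1%}, "
        f"relative to white: {relative_to_white[group]:.2f}"
    )
```

A lower arrest rate per stop for a group means more of its stops ended without an arrest, which is the sense in which the quote calls stops of whites more "efficient".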

Comment: Re:Graph theory (Score 2) 228

by DavidShor (#37107516) Attached to: Yahoo, Facebook Test "Six Degrees of Separation"
"I always thought that this was a result which was known through graph theory of what happens when you get a large number of nodes each with an arbitrary number of unique connections between them, that it would always tend towards the case that you got an average of no more than six degrees of separation for a sufficiently large network."

Not really, no. It's about scale-free networks: networks with preferential attachment, i.e., people with tons of friends are more likely to get new friends than people with no friends. Their degree distribution (the number of friends each person has) follows a power law, as opposed to the exponential distributions that come from friendship being totally random. Empirically, you can model social networks fairly well as scale-free networks. Roughly speaking, the average distance between two random nodes is proportional to the log of the log of the number of nodes.
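
If you want to play with this, here is a minimal sketch of preferential attachment using only the standard library. It is a toy growth model of my own choosing (each new node links to `m` existing nodes, picked with probability proportional to their current degree), not whatever Yahoo or Facebook actually use, and the function names, `m = 2`, and the sampled sizes are all assumptions for illustration. It only shows that average distances grow slowly as the network gets bigger; it does not by itself confirm any particular scaling law.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, seed=0):
    """Grow an n-node graph; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    if m < 1 or n < m + 1:
        raise ValueError("need m >= 1 and n >= m + 1")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Each node appears in this list once per edge it touches, so a
    # uniform pick from the list is a degree-proportional pick of a node.
    endpoints = []
    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def average_distance(adj, sources):
    """Mean shortest-path length from the given sources to every other
    reachable node, computed by breadth-first search."""
    total = pairs = 0
    for s in sources:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


for size in (100, 1000, 5000):
    graph = preferential_attachment_graph(size, m=2, seed=1)
    sample = random.Random(2).sample(range(size), 20)
    print(size, round(average_distance(graph, sample), 2))
```

The "people with tons of friends get more friends" part is the `endpoints` list: a node with ten friends appears ten times in it, so it is ten times as likely to be picked as a node with one friend.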
