skelterjohn - Slashdot User

Comment Re:The 1 in 200 bit is garbage (Score 1) 156

by skelterjohn on Sunday June 21, 2009 @11:34AM (#28411027) Attached to: Researchers Find Gaps In Iranian Filtering

I find that likely as well.

Comment Re:The 1 in 200 bit is garbage (Score 1) 156

by skelterjohn on Sunday June 21, 2009 @11:28AM (#28410993) Attached to: Researchers Find Gaps In Iranian Filtering

At the end of the article, they say

The probability that a fair election would produce both too few non-adjacent digits and the suspicious deviations in last-digit frequencies described earlier is less than .005. In other words, a bet that the numbers are clean is a one in two-hundred long shot.

The last sentence does not follow from the first. They are saying that P(these numbers | fair election) = P(fair election | these numbers). This is not the case! If they want to be correct, they need to take into account the prior, as I have said a few times.

Don't accuse me of bastardizing statistics when, first, I am not, and second, I am pointing out a bastardization of statistics. Try to actually know something about the subject matter, and combine that with some reading comprehension skills, before you make these accusations against me.

Comment Re:The 1 in 200 bit is garbage (Score 1) 156

by skelterjohn on Sunday June 21, 2009 @11:23AM (#28410947) Attached to: Researchers Find Gaps In Iranian Filtering

The claim was "in which the authors say the election results have a one in two-hundred chance of being legitimate."

Your claim was "if this election were legitimate, there is a one in two-hundred chance of things turning out this way."

These are two completely different statements. The key is that yours is conditioned on the election being legitimate. That probability is easy to compute, and is probably what was intended. The sentence provided, however, made an estimate of the odds of legitimacy, conditioning on what was observed.

Had they actually computed that, it would be a straightforward application of Bayes' rule, as someone was kind enough to point out, and it would require knowledge of the prior on legitimacy.
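To spell out where the prior enters, here is Bayes' rule for this case. The symbols are introduced only for illustration: F means "the election was fair" and D means "the digit patterns that were observed".

```latex
P(F \mid D) = \frac{P(D \mid F)\,P(F)}{P(D \mid F)\,P(F) + P(D \mid \neg F)\,P(\neg F)}
```

The article only supplies P(D | F) < .005. The prior P(F) and the likelihood under fraud P(D | not F) are both missing, so P(F | D) cannot be read off from that one number.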

Comment Re:The 1 in 200 bit is garbage (Score 2, Insightful) 156

by skelterjohn on Sunday June 21, 2009 @10:54AM (#28410727) Attached to: Researchers Find Gaps In Iranian Filtering

Responding to "The prior probability of mr.A cheating has no consequence - we're just looking at the distribution of the numbers."

The claim of the article was that the probability of Mr. A not cheating was 1 in 200. That was the claim I was disputing, not the fact that the ballot numbers were wonky. I thought my point was clear, given the subject I chose for my comment.

When claiming some quantifiable likelihood that there was fraud, the prior on fraud is most definitely relevant. At the same time, the prior is most definitely impossible to know. These two things together make any posterior estimate completely meaningless. *THAT* was my point.

Comment Re:The 1 in 200 bit is garbage (Score 0, Flamebait) 156

by skelterjohn on Sunday June 21, 2009 @10:50AM (#28410699) Attached to: Researchers Find Gaps In Iranian Filtering

I said exactly nothing false. Whether or not it is relevant to the topic at hand is left to the reader, but what I presented was just mathematics.

And, for what it's worth, this slashdotter is a PhD student in machine learning (responding to the GP's comment about 2 PhD students vs a slashdotter).

Comment Re:The 1 in 200 bit is garbage (Score 1) 156

by skelterjohn on Sunday June 21, 2009 @10:48AM (#28410683) Attached to: Researchers Find Gaps In Iranian Filtering

I am not claiming that it is likely that the election is fair. I am claiming that the "1 in 200" statistic is pulled out of a hat, much like the ballot numbers.

Comment The 1 in 200 bit is garbage (Score -1, Flamebait) 156

by skelterjohn on Sunday June 21, 2009 @09:54AM (#28410289) Attached to: Researchers Find Gaps In Iranian Filtering

To help illustrate, I am going to flip a fair coin 100 times. Actually I'll have a computer do it for me. I end up with...

*drumroll*

48 heads and 52 tails!

Seems pretty reasonable. The question is, now, how likely is it that I flipped exactly 48 heads and 52 tails?

If you know something about a binomial random variable (which is what we just sampled from), you know that this is (100 choose 48)*.5^(100) = .0735!

Wow... and that was with only 100 random coin flips. By their metrics, that is roughly a 1 in 14 chance that this was a fair set of coin flips (see where the logical incongruity happens?)

The bottom line is the probabilities we get out of this are not useful to think of as absolute...with so many possibilities the likelihood that any one of them in particular pops up is extremely small. However, we know that at least one of them *will* pop up. It is more useful to think of these likelihoods as relative probabilities...if you take the ratio of any two of them, that does tell you how many times more likely one is to happen than the other.
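A quick way to check both points (the small absolute probability and the more informative ratio) is a few lines of Python. This is only a sketch; the helper name binomial_pmf is made up for the example, and math.comb needs Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n independent flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_48 = binomial_pmf(48, 100)  # the outcome from the comment above
p_50 = binomial_pmf(50, 100)  # the single most likely outcome

print(round(p_48, 4))         # 0.0735
print(round(p_50 / p_48, 3))  # 1.082
```

Even the single most likely outcome, an exact 50/50 split, is only about 1.08 times as likely as the 48/52 split. The ratio says something useful; neither absolute number does.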

Maybe a useful test would have been to randomly generate some results and look at the likelihood ratio?

Beyond that, to truly say something like "and the probability that they cheated was X", you need to have prior distributions over cheating and not cheating.

A classic example of why this is true: you take a test for a disease that has a 99% chance of correctly diagnosing you, and one out of every 10000000 people has this disease. It diagnoses you as positive. Should you be worried?

The answer is: given only that information above, no, you should not be worried. Of 1000000000 people, there will be roughly 10000000 false positives (multiply by 1%) and 99 true positives. The rest will be negative (including one false negative, and assuming I did the arithmetic right, which is not a given). Given that you test positive, the likelihood that you are, in fact, sick is about 99/10000000, or roughly 1 in 100000. Not bad odds...

The information about how much of the population actually has the disease is what's called a prior. Without a prior on Ahmadi cheating, we cannot make a posterior (the odds after considering the test, or the election results) prediction.
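The disease example can be checked with a short Python sketch. It assumes that "99% chance of correctly diagnosing you" means a 99% true positive rate and a 1% false positive rate, which is how the arithmetic above treats it. The function and parameter names are made up for illustration.

```python
def posterior(prior, sensitivity=0.99, false_positive_rate=0.01):
    """P(sick | positive test) by Bayes' rule, given the prior P(sick)."""
    true_pos = sensitivity * prior                    # sick and tests positive
    false_pos = false_positive_rate * (1.0 - prior)   # healthy but tests positive
    return true_pos / (true_pos + false_pos)


rare = posterior(1 / 10_000_000)  # prior from the example above
common = posterior(0.01)          # same test, but 1 in 100 people are sick

print(f"prior 1 in 10000000 -> posterior {rare:.2e}")
print(f"prior 1 in 100      -> posterior {common:.2f}")
```

The same positive result means about a 1 in 100000 chance of being sick under the rare prior and about 50% under the 1 in 100 prior. The test did not change; only the prior did.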

There are lies, damn lies, and statistics... but actual statisticians are pretty good at this stuff. They don't often do political polling though.

Comment Re:!bug (Score 4, Insightful) 239

by skelterjohn on Monday June 08, 2009 @10:51AM (#28251051) Attached to: Software Bug Adds 5K Votes To Election

yeah, cause the difference in saying something like "x+y/2" or "(x+y)/2" is obvious fraud, as it is a bug that wouldn't crash the system.
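For anyone wondering why that kind of bug goes unnoticed: both expressions are valid and run without error, they just compute different things. A tiny Python illustration, with made-up values for x and y:

```python
x, y = 4, 6  # hypothetical values, chosen only for illustration

without_parens = x + y / 2   # division binds tighter, so this is x + (y / 2)
with_parens = (x + y) / 2    # the average of x and y

print(without_parens, with_parens)  # 7.0 5.0
```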

Comment Re:Yeah... (Score 1) 348

by skelterjohn on Friday June 05, 2009 @03:59PM (#28226885) Attached to: String Theory Predicts Behavior of Superfluids

Or that the object is magnetic. We can make entire trains float.

Comment Re:CPU Usage... (Score 1) 251

by skelterjohn on Friday June 05, 2009 @07:21AM (#28220303) Attached to: Google Announces Chrome For Mac and Linux Dev Builds

If you're on OS X, you should try out Stainless. It's quite good, and based on some of the same ideas as Chrome, only more mature. I've been using it as my primary browser (above Safari, Firefox, or the new Safari beta) for some time.

It's missing a few things, but honestly, I don't care. It's sleek, simple and multi-processed.

Actually one of my favorite features is the unified address/search bar. Only problem is I have to go to google first to search for queries that have periods and no white space.

Comment Re:Taking vs Excelling (Score 1) 588

by skelterjohn on Wednesday June 03, 2009 @10:19AM (#28195393) Attached to: The Myth of the Mathematics Gender Gap

Hah! Universities will hand out PhDs just to get rid of some people.

Comment Re:Well, Obama is nominating Sotomayor... (Score 5, Insightful) 456

by skelterjohn on Tuesday June 02, 2009 @04:56PM (#28187685) Attached to: Sotomayor's Position On Copyright Damages

It's not like Obama ran on a platform of copyright abolition.

There is no misrepresentation going on here, even if you had hoped that since you agreed with him on one thing that he would agree with you on another.

Comment Re:Dealing with Layered Problems (Score 2, Interesting) 154

by skelterjohn on Wednesday May 27, 2009 @01:00PM (#28111681) Attached to: How IBM Plans To Win Jeopardy!

Since you bring up crosswords as an example of this sort of issue, let me point you to http://www.oneacross.com/proverb/

It's an automated crossword puzzle solver. How it works (and my advisor led the project, though I don't work on anything remotely similar) is that it has a large number of solver modules that are each good at a certain kind of clue. One might be really good at looking up famous people based on keywords. Another might be good at... I dunno, some other type of crossword clue.

Then each of these modules made lists of possible answers for each clue (subject to length and letter constraints), complete with the confidence they had in various answers.

A central "merger" then collected the candidate answers for each clue from the different modules, and then did lots of tricky search-like algorithms to find a set of answers that seemed the most cohesive.
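To make the module/merger idea concrete, here is a toy Python sketch. It is not PROVERB's actual code or algorithm: the two modules, their hard-coded tables, and the "add up confidences and normalize" merge are all invented for illustration, and the grid-level search for a cohesive set of answers is left out entirely.

```python
from collections import defaultdict


def keyword_module(clue, length):
    # Hypothetical expert module: a hard-coded table stands in for a real knowledge source.
    table = {"Capital of France": {"PARIS": 0.9, "LYONS": 0.1}}
    return {w: c for w, c in table.get(clue, {}).items() if len(w) == length}


def pattern_module(clue, length):
    # A second hypothetical module with its own opinions about the same clue.
    table = {"Capital of France": {"PARIS": 0.6, "TOURS": 0.2}}
    return {w: c for w, c in table.get(clue, {}).items() if len(w) == length}


def merge(clue, length, modules):
    """Pool candidates from every module and rank them by normalized total confidence."""
    scores = defaultdict(float)
    for module in modules:
        for word, confidence in module(clue, length).items():
            scores[word] += confidence
    total = sum(scores.values())
    if total == 0:
        return []
    return sorted(((w, s / total) for w, s in scores.items()), key=lambda pair: -pair[1])


ranked = merge("Capital of France", 5, [keyword_module, pattern_module])
print(ranked[0][0])  # PARIS
```

In the real system the ranked lists for every clue would then feed the search over the whole grid, where crossing letters constrain which candidates can coexist.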

This system, PROVERB, was at one time the best computer system for solving crosswords, and it did fairly well in competitions in which humans competed too.

With Jeopardy! something similar could approach this issue, as well, except without the added constraint that questions to different answers have to relate to each other on the level of spelling.

Comment Re:geocentrism (Score 1) 146

by skelterjohn on Wednesday May 27, 2009 @12:06PM (#28110815) Attached to: Pulsar Signals Could Provide Galactic GPS

It doesn't rotate around the sun. It is fixed in both time and space.

Comment Re:How accurate does it need to be? (Score 1) 146

by skelterjohn on Wednesday May 27, 2009 @12:05PM (#28110793) Attached to: Pulsar Signals Could Provide Galactic GPS

Hide something in interstellar space, note its current GalacticPS coordinates and velocity, and come back years later to find it. Probably needs to be fairly accurate.
