Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

If you define correctness as "producing the desired results on one particular version of one particular platform", then you have a recipe for non-robust code that won't survive in the real world.

I think defining correctness as "producing the desired results" is pretty reasonable. Sure, correct doesn't mean robust, but they are different things, and this discussion has been about correctness.

Damn, I've been wasting my time. Now I know you didn't properly read the article. He includes the test programs he used. ...
Not worth responding to, just reinforces that you never looked at the source code

Yeah, I looked at the code. He replaces the sort, says the bug is gone, and concludes it must be a bug in how the sort is used. Sure, that seems likely, but where is the actual bug? If we are going to spend the time on a case study of a bug, I think showing exactly where the bug is and why it happens matters. But I suppose reasonable people can disagree.

The main thing this discussion has shown is that there are different metrics for program quality: an error versus a bad design. One is wrong, the other is undesirable but correct (at least for the current program). Both are bad, but in different ways, and they need to be handled with different techniques.

Conference proceedings don't count for much. I can't see any evidence that you know anything about developing robust programs. Not that I'd expect you to; it is quite a different game from developing verification software anyway (although if the verification software itself needs to be robust, then we have a problem).

If that is your opinion, fine. But I should note that IBM, Microsoft, Mozilla, etc. spend a fair bit of money to sponsor and send people to these things. So someone apparently thinks these proceedings count for a fair bit (including companies that care a lot about building robust systems).

I'll leave my part at that.

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

There are lots of dangerous but technically correct programs. In fact, almost every program written is ill-specified and most likely has methods that are used in ways that were not initially intended.

As for my example, I agree that in a "good" program one would always use it according to the given specifications. Doing anything else breaks modularity and invites errors when making later changes. If you fail to understand the difference between modularity/abstraction (using the semantics provided by the contracts) and correctness (producing the desired results), then I cannot help you.

No, what Rob demonstrated was that replacing the call to sort made the error go away. But several things were changed at the same time (most notably, the random number generation and initial seeding were probably altered substantially). So, while he is right that calling sort to get a permutation is a bad idea, and he is probably right that the sort was the problem, this is never actually shown in the article. He just assumes that because the bug went away, the sort must have been the problem (but as I showed above, this need not be the case).

As for your concern for me, I doubt my code has appeared on TheDailyWTF, but I do know that I recently had some of my work in the ACM proceedings of "Verification, Model Checking, and Abstract Interpretation" on modeling the semantics of programs for later verification applications. So I do know a bit about program correctness and specification. How about you?

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

You do realize there is a difference between code that is dangerous (but technically correct) and code that is actually buggy? As per my quicksort example: dangerous, but correct.

You are getting intent and results confused. Calling my quicksort routine with an invalid compare function clearly violates the intent of the contract abstractions but can still produce functionally correct results. The same may happen with your sqrt(-1) example: it is a bad idea from a modularity standpoint but may be functionally correct.
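To make that concrete, here is a JavaScript sketch (my own toy example, not the actual ballot code) of the kind of contract-violating call in question: the comparator ignores its arguments and answers at random, which breaks the consistency the sort contract requires, yet on a given engine it may still happen to produce acceptable results.

--------
// Sketch only: a "shuffle" done by abusing sort with a random comparator.
// The comparator ignores a and b, so it violates the sort contract.
var items = ["A", "B", "C", "D", "E"];
items.sort(function (a, b) {
  return Math.random() - 0.5; // randomly claims a < b or a > b
});
// Whether this yields anything close to a uniform random permutation
// depends entirely on the engine's sort implementation.
--------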

If you want to talk about whether that is the case for the browser selection, well, I am glad you agree that determining whether the implementation of the JS quicksort routine causes the bug (or whether it is something else) would be a good idea. Unfortunately, the author of the article was happy to jump to premature (although not unreasonable) conclusions.

And don't worry about me being a programmer. My code is quite reliable and in fact I do a lot of work on software reliability and correctness.

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

Look, I agree with you on misusing the sort routine. Using a method to do something unintended does not by itself cause bugs; it makes them more likely and is bad form. But without further analysis it is not certain whether, and why, the sort is the source of the bug.

As for your challenge, see the first example of quicksort from Wikipedia:

--------
function quicksort(array)
    var list less, greater
    if length(array) ≤ 1
        return array
    select and remove a pivot value pivot from array
    for each x in array
        if x ≤ pivot then append x to less
        else append x to greater
    return concatenate(quicksort(less), pivot, quicksort(greater))
--------

Note that this correctly implements quicksort. By induction on length we show the uniform distribution:

length = 1: trivial.
length = 2: a 50/50 comparison has a 50% chance of producing either order.
length = k: Each element has a 50% chance of going into less and a 50% chance of going into greater. By induction we get permutations of each of these uniformly. Since each element is placed independently (and the pivot is selected uniformly at random), the concatenation of the three produces all permutations with equal probability. (A small simulation sketch follows below.)
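And here is a quick way to sanity-check that claim empirically: a JavaScript sketch of the pseudocode above with the comparison replaced by a coin flip (the function name and trial count are mine, purely illustrative).

--------
// Sketch: the pseudocode above with the comparison replaced by a coin flip,
// plus a tally over many runs to check the distribution empirically.
function randomPartitionSort(array) {
  if (array.length <= 1) return array.slice();
  var rest = array.slice();
  var pivot = rest.splice(Math.floor(Math.random() * rest.length), 1)[0];
  var less = [], greater = [];
  rest.forEach(function (x) {
    (Math.random() < 0.5 ? less : greater).push(x); // coin-flip "comparison"
  });
  return randomPartitionSort(less).concat([pivot], randomPartitionSort(greater));
}

var counts = {};
for (var i = 0; i < 60000; i++) {
  var key = randomPartitionSort(["a", "b", "c"]).join("");
  counts[key] = (counts[key] || 0) + 1;
}
console.log(counts); // if the argument above is right, all six orderings land near 10000 each
--------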

So if you want to argue about the effects of possible implementation details, that is fine, but that was my point. These implementation details are key to determining what actually caused the problem (which, as shown above, may not be the sort at all). The article never addresses these; it only says the problem must be there, and then, after changing multiple things, that it isn't.

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

If you want to analyze the behavior of a specific implementation of an algorithm given input that violates the preconditions, then look up the source code for your javascript interpreter. That may be a fun exercise if you are bored, but it isn't very illuminating unless you want to understand why browser X produced that specific distribution.

Yeah, that is exactly what I wanted, because the article was about why a particular javascript implementation produced a poor distribution for the browser selection screen in the EU browser ballot. That is a specific bug in a specific implementation.

I can tell you that failing to do array bounds checking before loads/stores will lead to bugs as well. But that does not explain why a specific bug exists. If you want to discuss poor programming practices, that is fine, but I did not find this a very enlightening way to do it.

And if you run bubble sort, as described in basic algorithms texts, it will always terminate (regardless of the comparator), since progress is always made via the loop indexing. In fact, termination is often not a function of the loop body's behavior. One only needs to show that there is an order (which is not infinitely descending/ascending) on some function of the program state to ensure termination. For bubble sort this is just the order on the loop index counter (i).
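A sketch of what I mean (my own illustrative JavaScript, not anything from the article): the loop bounds below depend only on the indices, so the sort terminates no matter what the comparator returns.

--------
// Index-driven bubble sort: i strictly decreases and j strictly increases,
// so termination does not depend on the comparator's behavior at all.
function bubbleSort(arr, cmp) {
  var a = arr.slice();
  for (var i = a.length - 1; i > 0; i--) {
    for (var j = 0; j < i; j++) {
      if (cmp(a[j], a[j + 1]) > 0) {
        var tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
      }
    }
  }
  return a;
}

// Terminates (in some order) even with a random, inconsistent comparator:
bubbleSort([3, 1, 4, 1, 5], function () { return Math.random() - 0.5; });
--------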

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

No, there is no guarantee, and the article does not specify which sorting algorithm is actually used. But any reasonable sorting algorithm should terminate very quickly even with the incorrect comparison function. Which is one of the reasons I found the article annoying: lots of general claims but very little analysis in the end.

I usually don't post much, but for some reason the article struck me as designed to have just enough technical content to seem authoritative, while the real intent was simply to get in some cheap shots at Microsoft.
 

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

I agree with your general point. The particular implementation was not good, and violating method contracts leads to bugs. Yes, obviously. But the issue was the poor distribution in the random numbers, and the author never determined why this was the case. Was it the implementation of quicksort? An issue with seeding when run in a tight test harness? Or, as you mention, does it use different sorts for different input sizes?

So yes, we have a program that is poorly written and produces the wrong answer. We also have an article that is poorly written and does not explain why the results are what they are (other than don't violate method contracts, which is good advice but not very enlightening).

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

Most likely I graduated from a different school but I appreciate that you asked.

You are quite right that the formal precondition of the sort method is violated, and this implementation is quite sketchy in that respect. However, that does not tell us what caused the distribution errors. And since Rob is making other strange statements, like claiming possible non-termination (which should be impossible in any reasonable quicksort), I am not particularly inclined to just believe him without a more careful analysis (e.g. of the seeding, how the test harness was implemented, pivot selection, etc.).

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

Yes, I am well aware of this. In my post I specified a "simple quicksort" algorithm, which precludes several of the issues you mention, and I discussed the possibility of problems in the pivot selection. But as you will note, the article does not discuss any of these issues, so we have no idea whether it is one of these or not (it just claims it must be the sorting).

So we basically have an article that says: I ran a benchmark, I can't explain the results, so I'll just make up something that sounds good.

Comment Re:Might be a mistake but not where Rob is pointin (Score 1) 436

And to follow up after re-reading the article: he claims that non-termination is possible. But since in quicksort each recursive call is on a smaller list (we removed the pivot element; otherwise the algorithm has real problems), the sort must always terminate.

Now, this might not be a very elegant solution, and it definitely violates the abstract specification of a sort routine, but the problems pointed out by Rob are not the real problems. Way to go, Slashdot, always making sure articles that slag on Microsoft are factually accurate.

 

Comment Might be a mistake but not where Rob is pointing (Score 1) 436

Well, unless there is something else going on, Rob needs to go back to school too. A simple quicksort algorithm with the comparator replaced by a random choice becomes a recursive random partition of the array. A simple inductive argument shows that this will not produce the unbalanced results he found. So the problem is elsewhere, most likely in the RNG seeding or a bias in the pivot selection (if he is running it in a tight loop and the seed is something like the current second, then the same seed may appear several times).
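To illustrate the seeding scenario (purely hypothetical; Math.random in JavaScript is not seedable, so this uses a toy seedable generator of my own): if the seed is derived from the current second inside a tight loop, every run within that second produces the identical "random" sequence.

--------
// Hypothetical illustration of re-seeding inside a tight test loop.
function makeRng(seed) {
  var state = seed >>> 0;
  return function () {
    state = (state * 1664525 + 1013904223) >>> 0; // simple LCG step
    return state / 4294967296;
  };
}

for (var run = 0; run < 3; run++) {
  var rng = makeRng(Math.floor(Date.now() / 1000)); // same second => same seed
  console.log(rng(), rng(), rng()); // identical sequence on every run
}
--------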

Comment Re:Not fraud (Score 1) 406

Actually, this isn't always true. Sometimes, if the warranties are provided by the manufacturer for a fee, it can be cheaper to buy their extended warranty than to run the risk of your hardware failing. Although this doesn't make sense at first glance, it does for many consumer electronics where buying replacement parts to fix them is extremely expensive, as no market for them exists. The only choice is getting the original manufacturer to fix the product at a massively inflated price. In these instances, the manufacturers often have warranties available that could be less than the projected cost of fixing the failures yourself.

With something like a laptop, if it's 2 years old, out of warranty, and the logic board dies, it often costs about the same to buy a new laptop as it does to replace the logic board. If this happens to even 1 in 10 laptops (not counting other problems that are likely to occur), a warranty costing 1/10th of the price of the machine is a worthwhile purchase. In reality, the numbers are likely even higher than this, and many of the manufacturer extended warranties are worth it simply because the cost of getting them to fix your product otherwise is completely unreasonable.
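To put rough numbers on that (my own illustrative figures, not from any real price list):

--------
// Illustrative back-of-the-envelope figures only.
var laptopPrice   = 1000;                 // price of the machine
var warrantyPrice = 0.10 * laptopPrice;   // warranty at 1/10th of the price
var pFailure      = 0.10;                 // chance of an out-of-warranty logic board failure
var repairCost    = laptopPrice;          // repair costs about as much as a new machine

var expectedRepairLoss = pFailure * repairCost; // 0.1 * 1000 = 100
console.log(expectedRepairLoss >= warrantyPrice); // true: break-even before counting other failure modes
--------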

Of course, Best Buy and other stores' warranties rarely operate under these principles, and I largely view them as a ripoff. However, I will say buying them isn't always a bad idea. Sure, you'll end up behind in the long run, but it's the same with buying any insurance, even HEALTH insurance. You buy the insurance not because it's a good deal, but because the cost of fixing a potential problem is too great to take on your own.

Phil

Comment Re:Seems reasonable (Score 1) 505

The first part that you propose is COMPLETELY unreasonable. There is no way conferences or journals would remotely be able to "peer review" source code as part of their review process. You seem to think that scientific peer review is that strong; it really isn't. Reviewers read over the papers, and much of their decision is based on their own biases, although your ability to sell your idea in a paper can help it get in. I know conferences, and likely many journals, have papers with grammatical errors in them (have you ever read some of the foreign submissions to these things?); they get the most glaringly bad ones out most of the time, but they don't remove them all. Missing citations are sometimes caught, but not all reviewers know all the relevant work that is out there. Often, as a submitter to a conference or journal, you look at who will likely be on your review committee and make sure you are citing their work, even if it is only tangentially relevant.

As far as "peer review" for correctness goes, that tends to be a joke, depending on the field. Scientists aren't trying to reproduce your experiments before deciding whether to accept or reject a paper. They have to take it at face value that the work you do is real. Occasionally you'll hear stories about students who have been faking results for years, getting multiple publications, etc. It's not until years later that this gets revealed and the papers get invalidated. This can ruin the careers of relatively innocent bystanders, such as professors who trusted that their student was doing the work they said they were doing. There just isn't time to do all the fact checking necessary. Trying to add a code review to the process is akin to asking a scientist to stand behind another scientist whenever they're doing field or lab work to verify that their procedures are correct. Some things are taken at face value as being accurately done or not.

As for simulations, at least in my field (roughly computer architecture), the simulator you use or base your research on can greatly influence what people think of your results. Using older, less detailed simulators will get your work knocked down pretty quickly. However, I will say that credit is MOST DEFINITELY given to the authors of the simulators used. This is generally done through citations when describing the simulation infrastructure or the methodology of the work. Most simulators have a single paper describing how they work, normally written a few years after the simulator was developed, by which point the group that developed it has already gotten most of the use they could out of the base infrastructure.

Phil

Comment Re:Seems reasonable (Score 3, Insightful) 505

Actually, I'm pretty sure everyone keeps the data they're currently generating fairly close to the chest, to prevent other groups from beating them out the door with their idea. The exceptions to this rule are when professors trust one another and know that the other wouldn't use the shared information to do the same research they are already working on.

As a graduate student, you definitely don't want to share code you've developed immediately. You may spend 2 or 3 years of a PhD writing code and get a couple of papers out of it, but with the code base in place you plan on getting a handful more. More to the point, those papers become relatively easy to generate, because you spent those years developing the program that lets you do it. Writing papers, generating results, analyzing them, etc. takes time, so you can't do everything at once. Releasing your code too early means other groups can do those other experiments, and you, the grad student who spent so many years setting up the code or experiment for them, still wouldn't be able to graduate, because you have not produced enough original research and instead only developed the tools others used to pump out results.

As a student nears graduation, they might be more willing to release their code, since competition is less of a concern by then. Someone won't pick up your code and release a paper based on it in 2 or 3 months; it just takes too long to get up to speed. However, the BIGGEST impediment to releasing software in academia is the support you have to give to your software if anyone is going to use it. You first need to audit and clean up your code, a non-trivial task. You have to supply documentation on how to use the software, another non-trivial task, and then provide documentation on the basics of how it works. All of this takes a lot of time and doesn't tend to help a student graduate. Also, once code is released, there's an expectation that you'll provide some level of help with questions. Granted, that rarely happens in practice (the author has usually gone on to do other things and hasn't touched the code in years). It just becomes a difficult thing to do.

Phil

Comment Re:He got away with it. (Score 1) 402

I'm pretty sure it has to meet the US definition of treason. For instance, Robert Hanssen was sentenced for treason for spying for the Russian government and betraying the US government. As far as I know, we were never officially at war with the Soviets.

Looking at the US Constitution, treason is defined as follows:

"Treason against the United States, shall consist only in levying War against them, or in adhering to their Enemies, giving them Aid and Comfort. No Person shall be convicted of Treason unless on the Testimony of two Witnesses to the same overt Act, or on Confession in open Court.

The Congress shall have Power to declare the Punishment of Treason, but no Attainder of Treason shall work Corruption of Blood, or Forfeiture except during the Life of the Person attainted."

It only states adhering to the country's enemies, and NOT specifically countries we are at war with. Otherwise the definition of treason would end up being pretty ridiculous.

Phil
