Most static analysis tools look for bugs and potentially buggy behavior. They must rely on limited pattern matching and data flow analysis. They can't find all bugs. See: The Halting Problem.
So... we should re-write OpenSSL in a higher level language, yes? About how long is that going to take? How many code bases that currently use OpenSSL will not be able to use this new library, due to portability reasons? Your solution to the problem is impractical.
Should higher level languages be used when possible? Absolutely. I'm a fan of high level languages. I prefer to write software in Haskell and Scala when possible. Is suggesting the use of a higher level language at all helpful here? No.
There is idealism, and then there is pragmatism. I choose to be pragmatic. The problems in the OpenSSL code base are not impossible to solve. OpenBSD is written primarily in C, and compared to OpenSSL, it's light years beyond this library in terms of good programming style and proper use of language constructs. It's not failed advice to argue that these issues can be avoided. It's pragmatic advice, and it's more useful than throwing out a mature library and re-writing it from scratch in a higher level language.
Could a higher level language help? Sure. Is it a realistic and practical solution to OpenSSL's issues? Not really.
I've heard this argument, and I've seen blunders that led to vulnerabilities in Java, C#, Ruby, Python, and other higher level languages. This is not a language- or platform-specific problem. It's an industry-wide problem. Show me typical code in any language, and I will find bugs in that code. Some of those bugs can be exploited.
The biggest difficulty I have with advocates of higher level languages is that I have a harder time convincing them that bugs in their code can be exploited. "...but, this is Java. It's supposed to be secure." Better languages can help, but they are not a panacea. It takes dedication and hard work to write hardened code.
Whether the flaw was intentional or not is irrelevant. It should not have been possible to commit such a flaw into the codebase without someone noticing.
I can agree with that. Programming style often gets confused with window dressing, which goes against the original guidelines that Kernighan and Plauger laid down in The Elements of Programming Style, many of which are still quite valid today. I refer to this tendency as "cargo cult" programming style, where people attempt to poorly mimic the style guidelines that have been passed down for generations, getting all of the trappings of this original understanding, but none of the substance.
I'm trying to take back the phrase "programming style" to mean something in the spirit of what Brian Kernighan meant. Sometimes it means that I get a little curmudgeonly about its misuse.
Style, or the lack thereof, is absolutely related to this issue. It created the festering environment that this bug hid in for two years before it was discovered.
Style is about more than pretty-print formatting. It's about avoiding the god-awful raw pointer math found in this function. It's about properly bounding values. It's about enforcing the sorts of checks that come naturally to programmers with more experience and less bravado. You may not appreciate the need for good style yet, but I bet you that the OpenSSL team is rethinking this now. To know that such a sophomoric mistake lingered for two years, even though hundreds of eyes passed over that code, is the epitome of why good programming style matters. The people who looked at this code are likely much smarter than you or me. They could not follow the logic of this code, and their eyes glossed right over this glaring bug. That's bad style. Everything else is window dressing.
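To make "properly bounding values" concrete, here is a minimal sketch in C. It is not OpenSSL's code: the record layout (one type byte, a two-byte big-endian length, then the payload) and the names `echo_payload` and `ECHO_HEADER_LEN` are assumptions made up for illustration. The point is only that the length claimed inside the record gets checked against the number of bytes actually received before anything is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, for illustration only:
 *   byte 0      record type
 *   bytes 1..2  payload length, big-endian
 *   bytes 3..   payload
 * This is an assumption, not OpenSSL's actual structure. */
enum { ECHO_HEADER_LEN = 3 };

/* Copies the payload of an echo request into out.
 * Returns the number of bytes copied, or -1 if the length claimed
 * in the header does not fit in the record that was received or in
 * the output buffer. */
long echo_payload(const uint8_t *rec, size_t rec_len,
                  uint8_t *out, size_t out_cap)
{
    assert(rec != NULL && out != NULL);

    if (rec_len < ECHO_HEADER_LEN)
        return -1;                    /* too short to hold a header */

    size_t claimed = ((size_t)rec[1] << 8) | rec[2];

    /* The check that matters: never trust the length field.
     * rec_len >= ECHO_HEADER_LEN here, so the subtraction cannot wrap. */
    if (claimed > rec_len - ECHO_HEADER_LEN)
        return -1;
    if (claimed > out_cap)
        return -1;

    memcpy(out, rec + ECHO_HEADER_LEN, claimed);
    return (long)claimed;
}
```

With the lengths carried as `size_t` next to the pointers they describe, and the comparison written out in one place, a reviewer has exactly two lines to look at to decide whether the copy is safe.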
I meant that the refactor would make the bug obvious. However, as is the case with any bit of refactoring, one often finds bugs, writes test cases to capture these bugs, and then comes back to eliminate them. While the pedantic would argue that refactoring keeps functionality the same, refactoring is just one step in a larger process of code stewardship that includes the isolation and elimination of bugs. When a refactor makes a bug obvious, I contend that the refactor helps to eliminate that bug.
Either way, you are correct: refactoring does not fix bugs. But, in the larger sense, it brings them to light.
Theo de Raadt is correct, if a bit abrasive, in his assessment of the situation.
Two years of dedicated work on writing proper unit tests, refactoring code, and refreshing this library would do wonders for this project.
I agree. Code review with an eye towards modern programming style would have brought this bug to light years ago.
There is well-written C, and there is poorly written C. I've been through the bowels of OpenSSL, and there are parts of it that frighten me. Ninety percent of the issues in OpenSSL could be solved by adopting a modern coding style and using better static analysis. While static analysis tools can't find vulnerabilities, they can root out code smell that hides vulnerabilities. If, for instance, I followed the advice of two of the quality commercial static analyzers that I ran against the OpenSSL code base, I would have been forced to refactor the code in such a way that this bug would have either been eliminated altogether or been obvious to anyone casually reviewing it.
C and C++ are not necessarily the problem. It's true that higher level languages solve this particular kind of vulnerability, but they are not safe from other vulnerabilities. To solve problems like these, we need better coding style in critical open source projects.
I can explain every detail about how a bit of software I did not write works if I study it long enough. I can even do a pretty good job of explaining why I wrote the software a certain way, even if I did not write it. I can make said software look quite beautiful, and give a rather impressive presentation about how it works. I'm sorry, but there's probably no way that you can tell that I did not write this particular software, unless you happen to know the software. Just because I can describe, in detail, how a bit of software works doesn't mean that I'm capable of writing it or something like it myself.
I pass over hundreds of resumes a month, after they've been filtered by HR. I've interviewed countless people over the past fifteen years. I've seen it all. People often embellish or outright lie. Many are not nearly as good as they think they are. Board problems are objective. They can't be faked, and unless the problems are known ahead of time, they can't be rehearsed.
Actually, if the CS board problem is done correctly, then error handling, software design, and planning can be measured. There are plenty of corner cases in most of these problems. That's where the real fun begins. Even something as simple as adding and removing items in a linked list or a binary tree leads to corner cases.
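As a rough sketch of the corner cases I mean, consider removing an item from a singly linked list in C. The names `struct node`, `list_push`, and `list_remove` are made up for this example. A naive board answer tends to miss at least one of: the empty list, removing the head, removing the tail, a value that isn't present, and a failed allocation on insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Pushes value onto the front of the list.
 * Returns false (and leaves the list untouched) if allocation fails. */
bool list_push(struct node **head, int value)
{
    assert(head != NULL);

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return false;

    n->value = value;
    n->next = *head;
    *head = n;
    return true;
}

/* Removes the first node holding value.
 * Returns true if a node was removed, false if value was not found.
 * Walking a pointer to the link, rather than to the node, means the
 * head, middle, and tail cases all go through the same code path. */
bool list_remove(struct node **head, int value)
{
    assert(head != NULL);

    for (struct node **link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->value == value) {
            struct node *dead = *link;
            *link = dead->next;   /* also correct when dead is the head */
            free(dead);
            return true;
        }
    }
    return false;                 /* empty list or value not present */
}
```

Whether a candidate reaches for the pointer-to-link form or writes out the special cases by hand matters less than whether they notice the cases exist.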
Everything else comes down to Q&A. That's where we get into probing questions about design, documentation, etc.
As for code samples, I'm not a big fan of code samples. You won't believe how many times I've seen people plagiarize source code in interviews. The only exception here is the take-home test. The only problem is that take-home problems have to be rotated out pretty quickly. They hit the forums within a week.
As someone who gives CS interview problems, I have to disagree with your assessment. The problems aren't designed to prove that you can implement a bubble sort. They're meant to be representative of the sort of typical hard problem you'll be faced with when writing software. We choose CS problems because they are properly bounded, they have a finite number of correct answers, and if you get off course while working them out on the board, we can better help you to get back on course. Furthermore, there are decades of research that have gone into these problems, so a naive board implementation leads to all sorts of prompts for interesting questions.
Most of my evaluation has nothing to do with whether you get the right answer or the wrong answer. It has to do with how you arrive at the answer, and how you respond to constructive criticism, as you would have to in a pair programming environment. I couldn't care less if you can write a bubble sort coming in; if you solve the problem quickly, I'll just substitute a different one that you can't solve quickly. It's the process by which you arrive at an answer that interests me, and CS problems are, by far, the easiest way to uncover this.