Bug Hunting Open-Source vs. Proprietary Software - Slashdot

Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!

×

Bug Hunting Open-Source vs. Proprietary Software 244

Posted by Zonk on Saturday October 07, 2006 @01:41PM from the i-am-a-big-fan-of-quality dept.

PreacherTom writes "An analysis comparing the top 50 open-source software projects to proprietary software from over 100 different companies was conducted by Coverity, working in conjunction with the Department of Homeland Security and Stanford University. The study found that no open source project had fewer software defects than proprietary code. In fact, the analysis demonstrated that proprietary code is, on average, more than five times less buggy. On the other hand, the open-source software was found to be of greater average overall quality. Not surprisingly, dissenting opinions already exist, claiming Coverity's scope was inappropriate to their conclusions."

This discussion has been archived. No new comments can be posted.

Bug Hunting Open-Source vs. Proprietary Software

Search 244 Comments Log In/Create an Account

Comments Filter:

Re:So how did they test the proprietary software? (Score:5, Informative)

by msh104 ( 620136 ) writes: on Saturday October 07, 2006 @01:55PM (#16349333)

they tested it by using a program that systemattically scans code for common errors.

I don't know if the closed source statistics are online somewhere, but these are the open source statistics.
http://scan.coverity.com/ [coverity.com]

and if you ask me the "Defect Reports / KLOC" is pretty low, and such software would normally be considered "good" software.

Parent Share
twitter facebook
Re:How much is it really true? (Score:3, Informative)

by teg ( 97890 ) writes: on Saturday October 07, 2006 @02:00PM (#16349367)

An open source software is tested by a whole lot of people over the world and everyone is free to take the code and test if. On the other hand in case of proprietary software this is not the case and is tested by far less number of individuals.

That sounds rather idealistic... The coverage on OSS varies a lot. Most is not tested much, and the testing is not systematic and analyzed, but ad hoc. And if a bug is found, many just shrug and think of it as buggy software, but don't do more about it. There is a world of difference between the high standards of large projects like the linux kernel, apache and eclipse vs. the thousands of small projects found on freshmeat or just googling for something to scratch your itch.

Parent Share
twitter facebook
Misquoting TFA (Score:5, Informative)

by Harmonious Botch ( 921977 ) writes: on Saturday October 07, 2006 @02:07PM (#16349425) Homepage Journal

While I appreciate that PreacherTom was good enogh to bring this to us, the sentence "...no open source project had fewer software defects than proprietary code." just does not match TFA.

TFA says that no open source project is as good as the BEST of proprietary, but it also says that the AVERAGE open source is better than the AVERAGE proprietary.

Share
twitter facebook
Actually (Score:2, Informative)

by YetAnotherBob ( 988800 ) writes: on Saturday October 07, 2006 @02:16PM (#16349497)

the report (carried by Business Week) said that the Porpriatary software that beat out the open source stuff was avionics software or controls for reactors or other heavy industrial software. That stuff is all small, done in assembly, and extensively tested.

It was not an apples to apples comparison, more like apples to diamonds. Dom't worry, just fix any real problems identified. Many of the bugs found are theoretical, not real. Many others are style questions. the experts will probably never quit arguing about what is 'good' and what is 'bad'.

There are also some very real race conditions, memory leaks, and things like that. A real list by line number would be nice.

That said, what would really matter is a comparison program by program. I don't think my quick and dirty one off is really in the same class as the Kernal, or Firefox,I havn't seen how this diferentiates.

Share
twitter facebook
Re:So how did they test? -- badly (Score:3, Informative)

by AJWM ( 19027 ) writes: on Saturday October 07, 2006 @02:20PM (#16349541) Homepage

they tested it by using a program that systemattically scans code for common errors.

A method known to have flaws. It raises a ton of false positives, things that might "look like" potential bugs but aren't because of the data flow. You have to do a data flow analysis to see if they really are bugs.

For example, not checking for buffer overflows when copying strings, etc, is usually considered a (potential) bug. Certainly it is when dealing with unknown input. However, in a function buried deep behind layers of other code that has already filtered out potentially overrunning strings, it is quite safe. (At least until said code is changed, so it is considered bad practise, but the fact remains that if there's no data path that can cause a problem, it's not a bug.)

Mind, there's bugs and there's bugs. The Space Shuttle software is generally considered to be about the most defect-free software around, but they'll launch with some bugs. Specifically, minor bugs in the display (ie a misspelled word, a column that doesn't quite align) are allowed because they're a much lower risk than going in and changing the software to fix them (and possibly introducing a different bug).

Parent Share
twitter facebook
Re:Exactly what constitutes a software bug? (Score:3, Informative)

by dgatwood ( 11270 ) writes: on Saturday October 07, 2006 @02:22PM (#16349547) Homepage Journal

Security holes. Coverity specializes in programmatic detection of buffer overflows.

Oh, and I forgot some of the other obvious things you can check for: unreachable code, comparisons that always evaluate to true or false, possible uninitialized use of variables, global and/or heap storage of pointers to variables on the stack.... There are a lot of things that are usually unsafe to do and are usually bugs. It is usually too slow to check for this stuff during compilation, as it requires at least some degree of static (and possibly dynamic) function call tree analysis.

Parent Share
twitter facebook
Re:just an example of how "buggy" OSS software. (Score:4, Informative)

by Reziac ( 43301 ) * writes: on Saturday October 07, 2006 @02:35PM (#16349651) Homepage Journal

Quoth the poster:

linux 2.6: 3,315,274 lines of code, 0.138 / 1000 lines of code.
kde: 4,518,450 lines of code, 0.012 bugs / 1000 lines of code.

So far so good! But for contrast, I'll add this stat from TFChart:

Gnome: 31,596 lines of code, 1.931 bugs / 1000 lines of code.

Eeeep!!

(No wonder I prefer KDE :)

Parent Share
twitter facebook
Re:What's a bug? (Score:4, Informative)

by AJWM ( 19027 ) writes: on Saturday October 07, 2006 @02:35PM (#16349655) Homepage

Knuth used to have this great offer where he'd send you a check for pi or e or something if you managed to find a bug in his code.

I think you're conflating two things. The check was (is?) for $50 or some such. The version number of the software is pi (or e) to whatever number of decimals, where each subsequent release adds a decimal place (becomes a closer approximation to the real thing.)

No, his concept of a bug is a deviation from the specified functionality.

That's the only reasonable definition of a bug in the software.

But what if that functionality is wrong or sucks?

Then that's a bug in the specification or in the requirements. I spent the better part of six months debugging the requirements on a major project once. Part of that was getting mutual agreement from three major customers, part of that was resolving internal inconsistencies in the requirements document, and part of that was a high level design process in parallel, to be sure we had a chance of actually satisfying the requirements.

Of course the end user (especially of off-the-shelf software) generally doesn't differentiate between a bug in the software vs a bug in the specification or requirements. The end user generally never sees the spec, and only has a vague idea of the requirements. (Sometimes worse than vague -- how many people do you know who use a spreadsheet for a database?)

(And to BadAnalogyGuy -- I'm not disagreeing, just amplifying.)

Parent Share
twitter facebook
Re:Why is this surprising? (Score:5, Informative)

by tb3 ( 313150 ) writes: on Saturday October 07, 2006 @02:38PM (#16349681) Homepage

Are you nuts? Or are you just trying to see how many vapid over-generalizations you can jam into a single comment?

Propriety software traditionally undergoes a formalized, designed testing process. It's not perfect, but it's an ordered approach to boundary testing, design level implementation of quality, and more.
Says who? QA and testing covers the entire gamut, from formalized unit-testing at every level, to 'throw it at the beta testers and hope nothing breaks'. it's got nothing to do with 'proprietary' (not 'propriety') vs open source.

Open source software must rely on after-the-fact testing in the form of "this broke when I tried to do this".
Where on Earth did you get that? Are you completely oblivious to all the testing methodologies and systems developed by the open source community? Here's a few for you to research: JUnit, Test::Unit, and Selenium.

Commercial software has a strong QA engineering component. Open Source software relies primarily on a black box testing approach.
Again with the generalizations! Commercial software development is, by definition, proprietary, so you don't know how they do it! They might tell you they have a 'strong QA engineering component' (whatever that means) but they could be full of shit!

Parent Share
twitter facebook
A bug can be many things (Score:4, Informative)

by jchenx ( 267053 ) writes: on Saturday October 07, 2006 @02:41PM (#16349703) Journal
I work at MS. In my group (and I imagine it's the same in others), a bug can be many things. Here's what they typically are though:
1. A product defect
  - This is the typical meaning behind the word "bug".
2. DCR (Design Change Request)
  - That's where your TeX complaint would fall under. It's "by design" that it doesn't have an iconic user interface, but that doesn't mean it's something that shouldn't be addressed ever
3. Work item
  - This is actually a result of the bug tracking system that we use. Rather than sending e-mail, which often gets lost, we often track work items as bugs. For example, "Need to turn off switch X on the test server when we get to milestone Y"
To further complicate things, there is a severity and priority attached to every bug. Severity is a measure of the impact the bug has on the customer/end-product. It can range from 1 (Bug crashes system) to 4 (Just a typo). Priority is a measure of the importance of the bug. It ranges from 0 (Bug blocks team from doing any further work, must fix now), to 3 (Trivial bug, fix if there is time). (I don't know why the ranges don't match, BTW, seems silly to me)

As anyone who works on large-scale project probably knows, there are always a wide range of bugs, across all the pri/sev levels. To me, a simple count of all the bugs isn't terribly useful. A project could have a ton of bugs, but most of them being DCRs (which are knowingly going to be postponed till the next release) and/or low pri/sev bugs. Or maybe it's the beginning of the project and they're all known work items. Or a project could have only a few bugs, but with all of them being critical pri/sev ones.

So, whenever I see a report that simply talks about bug count, I take it with a huge grain of salt. If I had to guess (I skimmed the article), it seems like OSS projects have far more bugs, but perhaps lower pri/sev since the product itself has been evaluated as being higher quality. In the end, it's the quality that the customer really cares about.
Parent Share
twitter facebook
Re:What's a bug? (Score:2, Informative)

by tawhaki ( 750181 ) writes: on Saturday October 07, 2006 @03:23PM (#16350007)

Knuth used to have this great offer where he'd send you a check for pi or e or something if you managed to find a bug in his code.
It is a $2.56 check. The reasoning for that was that it is "an hexadecimal dollar".

Parent Share
twitter facebook
Re:How much is it really true? (Score:4, Informative)

by angel'o'sphere ( 80593 ) writes: <angelo,schneider&oomentor,de> on Saturday October 07, 2006 @03:33PM (#16350059) Journal

And n00b developers are also capable of finding bugs. Aren't they?
No they are not to the extend of a experienced developt.
going through the code dow not find bugs. Either you do a formal correct approach, that is a walk through or a code inspection then you may find bugs, or you only have the chance to find occasional off by one errors in a loop or array index. Just by looking over code as you say in your n00b appoach you only find suspicious pieces of code.
What now? You change it to be less suspicious? And then? You commit it? So you don't know if somethign elsewhere is breaking now because of your change? Ah .... you have test cases for the software? So you run them after your refactoring? What now? All pass as before? Oops, if so: then you had no test case for that piece of suspicious code you just have fixed! So you still don't know if there was an error or not!

Testing means to DEFINE how individual pieces of code should behave and writing a test case exactly for that. Changing software and fixing bugs means to have tests, lots of tests, not eyeballs.

angel'o'sphere

P.S. that does not mean that formal walk throughs / inspections don't work, they do!! But informal ones are only for educational purpose intersting.

Parent Share
twitter facebook
Re:Exactly what constitutes a software bug? (Score:3, Informative)

by angel'o'sphere ( 80593 ) writes: <angelo,schneider&oomentor,de> on Saturday October 07, 2006 @03:46PM (#16350151) Journal

Somebody please explain to me exactly what kind of software bug can be found by automatic scanning that isn't found by standard debugging and compile-time checks. If a computer can ascertain exactly what the programmer intended to do, why do we need programmers?
Decimal one = 1; Decimal two = 2; one.add(two); System.out.printline(one);
Guess whats printed? Similar errors are made if you use methods on java.lang.String like replace(pattern, replacement, pos).

The simple answer to this is that they can't.
Thats a very uninformed oppion! Tools like http://pmd.sourceforge.net/ [sourceforge.net] have a data base of over 100 "bug patterns" to check your code againt. That does not mean all found points are truely positive, but thy definitely are bad coding practice and my end in a bug later if the code gets changed. There are lots of simialr tools, check IBMs alaphaworks and developerworks e.g.

angel'o'sphere

Parent Share
twitter facebook
Re:Misquoting TFA (Score:2, Informative)

by sensei moreh ( 868829 ) writes: on Saturday October 07, 2006 @04:19PM (#16350335)

"...no open source project had fewer software defects than proprietary code." just does not match TFA. AMANDA,emacs, ntp, OpenMotif, OpenPAM, Overdose, Postfix, ProFTPD, Samba, Subversion, tcl, Thunderbird, vim, XMMS all now with 0 defect reports/KLOC. That must match the best closed source software!

Parent Share
twitter facebook
Re:So how did they test? -- badly (Score:3, Informative)

by belmolis ( 702863 ) writes: <billposer.alum@mit@edu> on Saturday October 07, 2006 @04:49PM (#16350515) Homepage

A guess is that this is because much of emacs' functionality is implemented in elisp code, which is not part of the core program and so not included in the source line count, whereas most of vim is implemented directly.

Parent Share
twitter facebook
Re:Exactly what constitutes a software bug? (Score:2, Informative)

by EvanED ( 569694 ) writes: <{evaned} {at} {gmail.com}> on Saturday October 07, 2006 @05:20PM (#16350693)

BTW, in full disclosure, CCured isn't a static analysis tool per se, and may have required running the programs to find the above bugs. But without the instrumentation CCured they would have gone undetected.

Another paper from UC Santa Barbra and (I think) the Technical Institute of Vienna used static analysis of compiled code (not even source!) to try to determine if a kernel module was malicious. (In this context, "malicious" means that it acts like a rootkit; in other words, if it modifies internal kernel data structures that don't belong to the kernel module itself.) They tested 7 rootkits and the entire driver database that comes with Red Hat. Their tool god a perfect detection rate -- no false positives, no false negatives. (The paper does have some issues -- like how did they pick the rootkits they tested, did they analyze them and then build the tool or the other way around, are there other rootkits they didn't test, how easily would it be to make a kernel module that would pass their test, etc., but it does provide a demonstration of the sort of power that static analysis can give you.)

Parent Share
twitter facebook
Re:meaningless, no data, and probably biased (Score:3, Informative)

by dkf ( 304284 ) writes: <donal.k.fellows@manchester.ac.uk> on Saturday October 07, 2006 @05:43PM (#16350833) Homepage

As a member of one of the OSS teams contacted, most of what Coverity found in our project were not actual bugs but rather places where the software wasn't smart enough to guess the preconditions on a function right. So they were more places where ill-advised maintenance might well have introduced a bug in the future. (Maybe the other spots were also like this, but we decided to clarify the code in all places anyway so the coverity problems were all cleansed.)

It should, however, be remembered that coverity does not and cannot find all bugs. It is just a more advanced form of Lint that catches a class of errors that is otherwise annoying to track down; a nice extra tool, not a magic bullet.

Parent Share
twitter facebook
Automatic code testing (Score:2, Informative)

by joe_plastic ( 704135 ) * writes: <stephen DOT pollei AT gmail DOT com> on Saturday October 07, 2006 @08:24PM (#16351713) Homepage Journal

I just ran across an interesting blog entry by Federico Mena Quintero [ximian.com] that linked to two interesting papers : When should a test be automated? [testing.com] and Classic testing mistakes. [testing.com]
Automated tests are often more expensive to write than manual tests, so when should you do automated tests versus manual ones? The answer is not always automated tests. Also typical unit testing is only part of what kind of testing needs to happen.
Using Ring Buffer Logging to Help Find Bugs [visibleworkings.com] was also another good link that Federico had in another entry.

Parent Share
twitter facebook
openoffice (Score:3, Informative)

by higuita ( 129722 ) writes: on Saturday October 07, 2006 @08:50PM (#16351829) Homepage

openoffice code is a mess because i is very old... remember that there was a staroffice in DOS time, the same code was update over and over and over before release to the opensource community...
since then many people try to clean it, but its hard and risky to clean a such big app

most projects have a coding style that everyone should follow, and many force you to comply if they want their code to be accepted

Parent Share
twitter facebook
OpenOffice is a bad example. (Score:4, Informative)

by Ayanami Rei ( 621112 ) * writes: <rayanami&gmail,com> on Saturday October 07, 2006 @09:32PM (#16352013) Journal

The codebase is very old, contains a bunch of legacy stuff nobody really understands, as the codebase has passed hands from a German company to Sun to the OpenOffice.org foundation. It's also picked up a layer of java along the way (for whatever reason).

It's too bad because it actually works kinda okay, but it's a real effort to get your hands dirty with.
Blender is also like that... it seems when a codebase has 'gotten around' it tends to pick up the bad habits of all the hands its been through.

MySQL is a bad state because it's really only developed by MySQL AB -- no one else is contributing to it so they have no reason to make it any more maintainable than it is. PostgreSQL, on the other hand, had the luxury of being the fruit of some academic research projects and was rewritten once or twice, so it's a little more maintainable.

Parent Share
twitter facebook

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Related Links Top of the: day, week, month.

413 commentsChatGPT Leans Liberal, Research Shows
347 commentsAmazon CEO Says 'It's Probably Not Going To Work Out' For Employees Who Defy Return-to-Office Policy
327 commentsHotel Owners Start To Write Off San Francisco as Business Nosedives
323 commentsChina is Building Nuclear Reactors Faster Than Any Other Country
315 commentsChina is Calling in Loans To Dozens of Countries

Intel CPUs are not defective, they just act that way. -- Henry Spencer