Microsoft

Microsoft Code at Fault for Half of all Windows Crashes

Flamester writes "In a ZDNet Australia story, Microsoft is claiming that half of all MS Windows crashes are the fault of third party code, not their own. That is, according to Dr. Watson. The article also goes into the 'rigor with which MS tests their products before release'."
This discussion has been archived. No new comments can be posted.

  • by rokzy ( 687636 ) on Wednesday August 13, 2003 @11:36AM (#6686222)
    Microsoft has laid the blame for half of all Windows crashes on third-party code.

    Scott Charney, chief security strategist at Microsoft, told developers at the TechEd 2003 conference in Brisbane, that information collected by Dr Watson, the company's reporting tool, revealed that "half of all crashes in Windows are caused not by Microsoft code, but third-party code".

    Charney's comments come as the company highlights the rigour with which it tests its own products before release. Microsoft emphasised that products such as Yukon and Exchange Server were undergoing thorough testing -- both internally and via independent third parties -- prior to their release to the market.

    The company is employing root cause analysis and event sequence analysis procedures to scrub out the creation of sloppy code. The result is that individual developers have a high degree of accountability for the code they produce, while the systems and processes associated with code development are rigorously monitored.

    Root cause analysis enables the company to check closely the work of individual developers. "If a developer has written vulnerable code, then we look at what else that developer has written and check it," Charney said.

    Event sequence analysis takes this further, analysing the reasons why the vulnerable code was written. Charney said the aim was not necessarily to sack whoever is writing vulnerable code, but to find out the reasons why and how Microsoft can improve its staff with training or more efficient processes.

    As Charney made his remarks, Charles Sturt University announced they would be offering a Master of Information Systems Security degree including MCSE:Security industry certification.

    Charney also reinforced Microsoft's message to developers and network administrators that they needed to build secure applications and networks "from the ground up".

    The chief security strategist's remarks have come at an unfortunate time, as mainstream and niche media outlets produce heavy coverage of the impact of the MSBlast worm, which has infiltrated corporate and enterprise networks worldwide.
  • by sphealey ( 2855 ) * on Wednesday August 13, 2003 @11:37AM (#6686232)
    John Dvorak developed some interesting stats on XP crashes [pcmag.com] based on information given in a speech by Bill Gates. He works out that there are 25 million blue screen crashes of XP per day. Interesting read. Also raises the question of exactly what happens to all those "crash reports".

    sPh

  • Re:Uhm, right... (Score:5, Informative)

    by andrewl6097 ( 633663 ) * on Wednesday August 13, 2003 @11:40AM (#6686284)
    Actually OS crashes do get sent. When you boot up, XP will recognize that it has just crashed and will offer to send the info.
  • Re:Uhm, right... (Score:5, Informative)

    by JimDabell ( 42870 ) on Wednesday August 13, 2003 @11:46AM (#6686382) Homepage

    So they're saying that a poorly designed application can take down the entire operating system?

    I suspect that they are referring to drivers and other kernel-space code. The standard Microsoft weenie excuse for instability in the past has been "it's the drivers!", blaming the video drivers is a favourite.

    Remember that Microsoft don't write most Windows drivers, they don't have to because their market share is so great, any hardware manufacturer who doesn't supply Windows drivers is not competitive.

    I believe this is the reason why Microsoft introduced their "Microsoft signed drivers" that are supposed to guarantee Microsoft-level stability (!).

    However, I have to laugh at Microsoft when they claim 50% of crashes aren't their fault. It's like an advert for a diet pill saying "Doesn't cause death in over 90% of people!".

  • Re:Uhm, right... (Score:3, Informative)

    by Malc ( 1751 ) on Wednesday August 13, 2003 @11:48AM (#6686414)
    They didn't say applications, they said code. From my experience, it's drivers that bring down my Win2K system, not applications. Well, Mozilla has been known to do it, but that goes back to the graphics drivers (kernel space) and related resources that Mozilla mismanages. On a single-user desktop machine, an app that brings down the graphical shell is no better than an app that brings down the whole system, IMHO - I've still lost all my graphical apps and any unsaved work in them.

    Let's be honest here, if it's bad drivers that are the main problem, they also affect Linux just as badly. I've seen sound drivers lock up my system many times under Linux. The difference between Linux and Windows is that more companies produce more drivers for Win32, and so the chances of a user encountering a problem are increased.
  • by Detritus ( 11846 ) on Wednesday August 13, 2003 @11:53AM (#6686485) Homepage
    It would probably help, but the fundamental problem is the design of the operating system. Running everything in kernel space, without memory protection, is begging for problems. This is aggravated by the complexity of many types of drivers.
  • by donutz ( 195717 ) on Wednesday August 13, 2003 @11:53AM (#6686486) Homepage Journal
    "Microsoft emphasised that products such as Yukon and Exchange Server were undergoing thorough testing -- both internally and via independent third parties -- prior to their release to the market."

    Hey, they're TESTING! Wow, they really are taking this trustworthy computing thing seriously.


    Probably just a flippant remark, but they actually do test all of their applications and OSes, and they have (you know, all those internal and public beta TESTS and such).

    But maybe this time they'll fix the bugs, instead of just making note of them. ;)
  • by welloy ( 603138 ) on Wednesday August 13, 2003 @11:58AM (#6686542)
    Gates said that 5 percent of Windows machines crash, on average, twice daily. Put another way, this means that 10 percent of Windows machines crash every day, or any given machine will crash about three times a month.
    Dvorak's quote makes no sense at all. And he is not talking about total number of reboots a day. He is talking about frequency of crashes.

    What percent of machines crash once a day? Gates did not say. It could be the case that 20% crash once a day and 5% crash twice a day. It could be the case that 90% crash once a day and 5% crash twice a day. The number of machines that crash twice a day gives no information on how many machines crash once a day.
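
    The point can be checked with a little arithmetic. Below is a minimal sketch in Python. The two fleets are invented purely for illustration (only the 5 percent twice-daily figure comes from the quoted speech). Both fleets agree with "5 percent crash twice daily", yet the share of machines that crash on a given day differs widely.

```python
# Each hypothetical fleet maps "crashes per day" -> fraction of machines.
# Only the 5% twice-daily figure comes from the quoted speech; the rest
# of each distribution is made up to show the claim is underdetermined.
fleet_a = {0: 0.95, 1: 0.00, 2: 0.05}
fleet_b = {0: 0.05, 1: 0.90, 2: 0.05}


def share_crashing_daily(fleet):
    """Fraction of machines with at least one crash per day."""
    return sum(frac for crashes, frac in fleet.items() if crashes >= 1)


def mean_crashes_per_day(fleet):
    """Average number of crashes per machine per day."""
    return sum(crashes * frac for crashes, frac in fleet.items())


for name, fleet in (("A", fleet_a), ("B", fleet_b)):
    print(name,
          round(share_crashing_daily(fleet), 2),
          round(mean_crashes_per_day(fleet), 2))
```

    Under fleet A the average works out to 0.1 crashes per machine per day, which is where "about three times a month" would come from, but that is an average over all machines and says nothing about how many distinct machines crash each day.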

  • by Anonymous Coward on Wednesday August 13, 2003 @11:59AM (#6686564)
    Dr. Watson catches OS crashes, not app crashes

    Say what? Dr. Watson most assuredly catches application crashes. XP doesn't say "Dr. Watson Error" anymore, but it is still Dr. Watson that is logging your error.
  • Re:Ring 0, Ring 3? (Score:5, Informative)

    by jpmorgan ( 517966 ) on Wednesday August 13, 2003 @12:07PM (#6686667) Homepage
    Dude, do you know what you're talking about? First, graphics drivers run in Ring0, along with most of the graphics subsystem. They haven't run in Ring3 since NT 3.51 days.

    Regardless, if a driver is running in the same memory space as the subsystem, a driver crash is going to take it out. It doesn't matter what ring the code is in. Again, back in NT 3.51 days graphics drivers were kept in separate memory spaces, in ring3, but that was dropped due to piss poor performance.

    The GDI subsystem (several layers away from any graphics drivers) currently sprawls Ring0 and Ring3.

  • Indeed BS (Score:4, Informative)

    by Otis_INF ( 130595 ) on Wednesday August 13, 2003 @12:11PM (#6686711) Homepage
    ... your posting. When a 3rd party driver crashes, it probably will take down the system as well, since it runs in ring 0, and can walk over kernel resources (and probably did).

    When Windows gets read-only mempages (IIRC win2k3 has them) for kernel processes, this will be ended, until then: the 3rd party drivers are mostly at fault.
  • Re:Uhm, right... (Score:5, Informative)

    by tomhudson ( 43916 ) <barbara.hudson@b ... minus physicist> on Wednesday August 13, 2003 @12:59PM (#6687300) Journal
    My original post stated North America because most jurisdictions have taken Microsoft to task over this, including, for example,
    1. all of Canada [www.ccpe.ca]
    2. the IEEE statement on the title of engineer [ieeeusa.org]
    Microsoft is not recognised by the Accreditation Board of Engineering and Technology (ABET) in the USA, and doesn't have the ability to grant a BSc., which is a prerequisite for using the term or title Engineer in most states.

    Guess you've been caught talking out of your ass again (but that's what ACs do)

  • Graphics in Ring0 (Score:2, Informative)

    by bstadil ( 7110 ) on Wednesday August 13, 2003 @01:00PM (#6687311) Homepage
    What inherently flawed OS structure? Could you please elaborate?

    How about moving the GDI to ring 0 for performance reasons, allowing a printdriver to crash the OS.

  • Re:Uhm, right... (Score:2, Informative)

    by Lurker_2k ( 319949 ) on Wednesday August 13, 2003 @01:10PM (#6687441)
    Sorry if I misunderstand something here but doesn't that indicate a memory/resource leak somewhere?
    It's still a bug even if it doesn't bring the system to its knees for days.


    Actually, it's a stress test. This is generally an automated test where we would run scripts to open and close various applications and whatnot for days. One script I ran when I was contracting at MS was something that opened up every single image in a certain directory (100+ jpgs) and at the same time, the machine would be also opening up several dozen excel spreadsheets, doing calculations on them, and exporting them to word files.

    The system would be pegged at 100% CPU usage and the memory usage would max out as well, hence it was unusable from an ordinary standpoint. The scripts generally can be set to autoterminate after a certain number of hours. Over the weekends I'd set them to terminate after 72 hours and would arrive back on Mondays to check out what ran and what didn't. For the systems that crashed, I'd have to send out reports to the various developers regarding how it crashed, what module actually crashed, and when it crashed.
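
    A rough sketch of that kind of harness, in Python. Everything in it is hypothetical (the function names, the toy workloads, and the report format are not Microsoft's tooling). It only illustrates the idea described above: loop over the workloads until a deadline, and log what failed and when.

```python
import time
import traceback


def run_stress(workloads, duration_seconds, clock=time.monotonic):
    """Cycle through workloads until the deadline; collect failures.

    workloads: list of (name, callable) pairs, hypothetical stand-ins
    for "open every image" or "recalculate a spreadsheet".
    Returns (number of full passes, list of failure records).
    """
    start = clock()
    failures = []
    passes = 0
    while clock() - start < duration_seconds:
        for name, work in workloads:
            try:
                work()
            except Exception:
                # Record which workload failed, when, and how.
                failures.append({
                    "workload": name,
                    "elapsed": clock() - start,
                    "trace": traceback.format_exc(),
                })
        passes += 1
    return passes, failures


def ok_task():
    sum(range(1000))  # stand-in for real work


def flaky_task():
    raise RuntimeError("simulated crash")


if __name__ == "__main__":
    # 72 hours in the post; a fraction of a second here.
    count, report = run_stress(
        [("ok", ok_task), ("flaky", flaky_task)], duration_seconds=0.05)
    print(count, "passes,", len(report), "failures logged")
```

    A real harness would also need to survive a crash of the whole machine, for example by writing its log to disk as it goes, which this sketch does not attempt.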
  • Re:Uhm, right... (Score:5, Informative)

    by EnVisiCrypt ( 178985 ) <groovetheorist@h ... .com minus punct> on Wednesday August 13, 2003 @01:20PM (#6687559)
    Yup.

    I've done an embedded system with QNX, and it is quite the nice RTOS.

    Under QNX, the devices hang out in the device manager, which is not in the kernel space, and the drivers are handled by the process manager, also not in the kernel. Since the kernel exists just to pass messages, essentially, it is uncrashable.
  • Re:Uhm, right... (Score:5, Informative)

    by AstroDrabb ( 534369 ) on Wednesday August 13, 2003 @01:24PM (#6687592)
    Yup, I have to agree with that. It depends on what you are using it for. For average desktop use, XP is a big improvement over win9x. However, I get a lot of crashes from XP, especially with Outlook, when I am doing some heavy compiling and heavy dev work. 6 months ago I switched to using Linux to develop with at work (without anyone's knowledge) and things have been great. This is at a Fortune 500 company. Some people caught wind of it and now a few other developers and most of the Oracle DBAs are asking and showing interest. I have been MS free on my home network for 3 years or so and it has been great. Being able to be almost MS free on my workstations at work has been icing on the cake.

    Oh, one other thing I don't think anyone has seemed to notice is that it doesn't matter whether those 50% of crashes are from drivers OR apps. The thing that sticks out to me is that MS is admitting that 50% of all crashes are because of their product. They are just saying it in a marketing friendly way to try to push the blame to driver developers.
  • Re:Uhm, right... (Score:4, Informative)

    by Keeper ( 56691 ) on Wednesday August 13, 2003 @01:42PM (#6687821)
    I can't speak for Canada, but I believe Texas is the only place that "Engineer" can't be used, because people with certifications don't take/pass some sort of official engineering exam. Sturt University can get away with it because they're not in Texas, and can't feel the wrath of a Texas court -- whereas MS can.

    And Watson can and does report back to "the mothership" for driver crashes, when the user allows it.
  • Re:Uhm, right... (Score:3, Informative)

    by aoteoroa ( 596031 ) * on Wednesday August 13, 2003 @01:44PM (#6687842)

    re: "Consider this: Microsoft has been ordered not to use the term MCSE in both the United States and Canada because Microsoft does not have the legal right to "certify" people as engineers."

    cite?


    Canadian Council of Professional Engineers (CCPE) opposes the use of the word "Engineer" in the MCSE designation [peo.on.ca]


    Microsoft Debating World-Wide MCSE Name Change [certcities.com]

  • Re:Uhm, right... (Score:3, Informative)

    by harvardian ( 140312 ) on Wednesday August 13, 2003 @02:09PM (#6688082)
    This has been said already in this article's comments, but people who say it aren't getting modded up, so I'll try again:

    Honestly, I think they may be including more than just OS crashes in these statistics. I'd say that in the past month, my computer (running WinXP) has crashed a handful of times. Of those crashes, one was severe (I think explorer restarted and apps closed? whatever happened I didn't need to restart).

    The other 5 (estimated) or so "crashes" were IE going down. Of the 5 times IE went down, a couple were caused by espn.com and a couple were caused by a nasty ad on nytimes.com.

    But here's my point: when I had my "severe crash", I reported it via Watson, and it didn't know wtf went wrong. When espn.com crashed the first time, I reported it via Watson and it told me Flash died. For the other 4 times Flash killed IE, I force-killed the program and DIDN'T report the problem because I knew what it was.

    So my statistics for the month are: a handful of app crashes (1 reported) and 1 os crash (1 reported). So I'm right on par with their data, that 50% of my REPORTED crashes were OS crashes (Microsoft's fault) and the other crash was IE going down (not Microsoft's fault).

    In the end, based on my personal experience, I'm guessing that they include app crashes in their data, or at least IE crashes (since it's "tied" to the OS). It might not be a driver issue, and it might not be Microsoft's inherently flawed paradigm for writing code at all.
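
    The gap between what crashed and what got reported can be tallied directly. A small sketch in Python, using the counts from this post (the 5 is the post's own estimate, and the category labels are just mine for illustration):

```python
# Counts taken from the post: (crashes that happened, crashes reported).
crashes = {
    "severe/OS": (1, 1),
    "IE via Flash": (5, 1),
}


def shares(counts):
    """Turn raw counts into fractions of the total."""
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


actual = shares({k: happened for k, (happened, _) in crashes.items()})
reported = shares({k: sent for k, (_, sent) in crashes.items()})

print("actual:  ", actual)
print("reported:", reported)
```

    The reported split is even, while the actual split is one in six against five in six, so a statistic built only from submitted reports can look quite different from what users experienced.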
  • Re:Uhm, right... (Score:5, Informative)

    by monkeydo ( 173558 ) on Wednesday August 13, 2003 @02:22PM (#6688198) Homepage
    But then again you could be worng, and look at that, YOU ARE!

    The "Texas Engineering Practice Act" has a whole page of exceptions, but they call them "exemptions".

    Lets see if we can find the relevant parts:
    Section 20. EXEMPTIONS.


    (a) The following persons shall be exempt from the licensure provisions of this Act, provided that such persons are not directly or indirectly represented or held out to the public to be legally qualified to engage in the practice of engineering: ...SNIP...

    (3) a person doing the actual work of installing, operating, repairing, or servicing locomotive or stationary engines, steam boilers, Diesel engines, internal combustion engines, refrigeration compressors and systems, hoisting engines, electrical engines, air conditioning equipment and systems, or mechanical and electrical, electronic or communications equipment and apparatus; ....SNIP...


    Well, that would seem to apply quite nicely not only to train engineers, but also software and systems engineers.

  • by billstewart ( 78916 ) on Wednesday August 13, 2003 @02:25PM (#6688222) Journal
    There are different kinds of crashing:
    • Individual Apps crashing themselves - that can happen on any OS. It shouldn't happen in major commercial products, but that's reality, and at least most of them are better about saving their state so they fail as safely as possible. I would have said that MS Office is pretty stable about that, except my MS Office has been crashing a lot the last couple of days, and of course there's all the Word Virus and Outlook Virus crap, so maybe I won't say that.
    • Hardware crashes - Unavoidable.
    • Crashes related to third-party device drivers - yeah, fine, you can't escape that, but the OS should be designed to minimize the need for drivers and provide mechanisms for isolating them.
    • The whole box crashing from applications - There's simply no excuse for this. That's why operating systems have kernels, and hardware has memory protection. Unix could pretty much defend itself from this by what, 1979? It wasn't rocket science like Multics or something. The 8086 memory architecture was too baroque, but the real advantage of all the segmentation stuff was that you *could* use it for memory management. Linus delayed at least one kernel release because a root user who opened a disk drive and scribbled on it _could_ cause the OS to crash. NT 3.5x was pretty safe about this, since it still looked VMS-like inside, but in NT 4 they moved a lot of the graphics capability into the kernel for "speed", and opened up the possibility of crashes again.
    • Applications using up some critical resources like disk drive so the machine becomes unusable - yes, this is possible, but the resources that are that critical are very very limited, e.g. a virtual memory system lets you page out or swap out application processes to prevent it.
    • Applications crashing some major subsystem that doesn't take down the OS. Unix has this risk - if you hang X Windows or the graphics system, applications that don't use X can still run fine, but you may need to telnet in to restart the subsystem. But this should also be minimized - keeping separate file systems for the OS's use vs. users' applications helps a lot.
  • Re:Uhm, right... (Score:4, Informative)

    by Idarubicin ( 579475 ) on Wednesday August 13, 2003 @03:21PM (#6688647) Journal
    It seems like a big scam to support the PE Ponzi scheme.

    I've been reading the replies to this thread, and I'm a little bit confused. The licensing of engineers has been a hotly-debated practice for...well, for as long as engineers have been licensed.

    Whether in favour of or opposed to licensing, I don't see how it could qualify as a Ponzi scheme [rr.com]. It may or may not be a worthwhile practice, but it's quite a stretch to describe it as a pyramid scheme.

  • by loraksus ( 171574 ) on Wednesday August 13, 2003 @04:29PM (#6689212) Homepage
    ATI has gotten their act together - it seems the drivers for their "good" cards, i.e. 8500, 9700 actually work, however their support for their older cards is terrible and I don't see that changing in future. ATI just doesn't think supporting older cards is a priority, and it shows clearly. I think the same with their lower end consumer cards, the 7500, etc. The drivers aren't that bad, but from what I've heard the 9700, etc are solid.
    Of course, the all-in-wonder pro I have is old (1998?) - so I can see why they want to kill it off, but dragging your customers kicking and screaming to a new product isn't very good for customer relations - and ATI knows this now. Nvidia and other companies made them wake up.
    Unfortunately they do have only 1 real competitor for the retail box market, so they aren't that concerned, but competition does help. Not that they will ever fix the drivers for the AIW Pro and their older cards, the PR damage has already been done, and the cards replaced.

    Their support is, of course, useless, just because they have to deal with so many buggy - and often weekly - releases. There just isn't time for them to find the problems. No point to call / ask for support because it will not be helpful. Of course venting is fun, but hey. Besides, half the games out there don't work properly and cause issues by themselves.

    Every once in a while, they get it mostly right, but it is a crapshoot. I've had drivers for my 7500 that would refuse to let me log on to 2k, but also the current version which works in both xp and 2k3 without any problems - i.e. I've had 0 bsod under 2k3 w/ my box with the 7500 in it since rc2 came out. A couple with "recording", or trying to with the AIW Pro - although that was expected. (the release for the 7500 is 6.14.1.6307 2/28/2003 if it helps anybody).

    As far as I can tell, 2k3 IS stable. I've abused my system - knocking out ide cables while the system is running, "hot pulled" pci cards, etc. Basically anything that would not cause the computer to reboot due to a short would keep the system up. My processor fan came out for a couple minutes, I saw it running at 95C and dove for the power switch, but 2k3 stayed up. Granted, it isn't that hot, but still.

    If I still lived in Ontario, I'd probably drive by at 120kph and throw a used tire rotor at their front door, it might cure an ulcer or two ;)
  • by Dasein ( 6110 ) * <tedcNO@SPAMcodebig.com> on Wednesday August 13, 2003 @05:26PM (#6689698) Homepage Journal
    Try
    Windows Crash Vs. Linux Crash
  • by moncyb ( 456490 ) on Wednesday August 13, 2003 @05:29PM (#6689729) Journal

    Which one? I think there were several, though I don't recall any of them affecting me--they all seemed to be caused by obscure stuff or in experimental drivers. The one specific incident I remember was a problem with ext3 not writing all the data on umount. If you synced before unmounting, you didn't lose data. I know Slackware puts a sync in the shutdown script, so I bet most Slackware users running ext3 didn't see the problem except when manually umounting filesystems. Ext3 was rather new then (still is), and I elected not to use it. In fact, I still go with ext2.

    The only problem I've ever had with ext2 was when I pulled out a floppy while it was writing. Hosed the disk pretty bad. I used minixfs for floppies from then on. I suppose it happened because ext2 is optimized for speed, not data recovery. If you want that, then go with FreeBSD's soft update and disable disk write caching.

    Maybe I haven't experienced problems with Linux because I just haven't encountered the brunt of Linux bugs, or maybe it's because I stay away from most experimental code and new features. Though I don't think Linux has nearly as many problems as MS flunkies try to make it out. My primary reason for migrating from MS to Linux was all the stupid problems with MS software--especially their OS, and the fact Linux had almost no problems. No matter what I did, Windows would crash at least a couple times a day. Linux almost never crashes, and when it does, I have been able to trace it down to either a hardware problem or a massive misconfiguration on my part.

  • Re:Uhm, right... (Score:4, Informative)

    by Captain Nitpick ( 16515 ) on Wednesday August 13, 2003 @05:56PM (#6689953)

    But then again you could be worng, and look at that, YOU ARE!

    Why do you think I have that sig? It's because everybody screws up occasionally. But since you don't want to play nice...(and you misspelled "wrong")

    Lets see if we can find the relevant parts: Section 20. EXEMPTIONS.

    (a) The following persons shall be exempt from the licensure provisions of this Act, provided that such persons are not directly or indirectly represented or held out to the public to be legally qualified to engage in the practice of engineering: ...SNIP...

    (3) a person doing the actual work of installing, operating, repairing, or servicing locomotive or stationary engines, steam boilers, Diesel engines, internal combustion engines, refrigeration compressors and systems, hoisting engines, electrical engines, air conditioning equipment and systems, or mechanical and electrical, electronic or communications equipment and apparatus; ....SNIP...

    Well, that would seem to apply quite nicely not only to train engineers, but also software and systems engineers.

    Your indentation is extremely misleading. Subsubsection (3) only applies if the requirements of subsection (a) are met.

    Since the requirements of 20(a) must be met first, let's take a look at it by itself:

    (a) The following persons shall be exempt from the licensure provisions of this Act, provided that

    such persons are not directly or indirectly represented or held out to the public to be legally qualified to engage in the practice of engineering:

    Wow, your options are:

    1. Make sure nobody ever refers to you as an engineer outside the company ever. ("We have a software engineer on staff")
    2. Every time you are referred to as a "software engineer", immediately follow it with "but he isn't legally qualified to practice engineering in the state of Texas." This would apply not only to you, but to everyone else at the company, and probably to your friends and family as well ("indirectly represented").
    3. Call yourself an engineer, but don't do anything resembling "the practice of engineering".

    The only way to ensure option 1 is to make sure nobody in the company calls you an engineer, so they won't slip up when talking to people outside the company. This is no different than not calling yourself an engineer at all.

    Option 2 is worse than calling yourself something other than a software engineer, and a lot less reliable.

    Now, you might say that software engineering doesn't fall under the "practice of engineering" bit.

    *ahem*

    Section 2. DEFINITIONS. As used in this Act the term:

    (4) "Practice of engineering" or "practice of professional engineering" shall mean any service or creative work, either public or private, the adequate performance of which requires engineering education, training or experience in the application of special knowledge or judgment of the mathematical, physical, or engineering sciences to such services or creative work.

    To the extent the following services or types of creative work meet this definition, the term includes consultation, investigation, evaluation, analysis, planning, engineering for program management, providing an expert engineering opinion or testimony, engineering for testing or evaluating materials for construction and other engineering uses, and mapping; design, conceptual design, or conceptual design coordination of engineering works and systems; development or optimization of plans and specifications for engineering works and systems; planning the use or alteration of land and water or the design or analysis of works or systems for the use or alteration of land and water; performing engineering surveys and studies; engineering for construction,

  • by EddWo ( 180780 ) <eddwo AT hotpop DOT com> on Wednesday August 13, 2003 @07:26PM (#6690537)
    Sometimes they do pass on the error reports they collected to the third party driver and application developers.

    http://msdn.microsoft.com/chats/windows/windows_081502.asp
  • by Guppy06 ( 410832 ) on Wednesday August 13, 2003 @07:40PM (#6690623)
    I look at the story title:

    "Microsoft Code at Fault for Half of all Windows Crashes"

    I look at the paragraph under it:

    "Microsoft is claiming that half of all MS Windows crashes are the fault of third party code, not their own."

    Anybody older than the age of, say, 10 should see that these are two very different statements. To assume that Microsoft is automatically to blame for the other half of OS problems completely ignores what everybody here should know is the #1 source of computer problems: User error.

    If you want to lament the lack of quality controls involved in Microsoft's "Made for Windows" branding, fine. If you want to conjecture just what that other half really is, also fine. But you can't print painfully obvious logical fallacies like this and hope to be taken seriously as a source of news.
