Before I answer the questions, allow me to correct some misperceptions I have seen regarding my earlier article on this topic in LinuxWorld.
The myth that is debunked by the Evans Data survey is that Linux is taking more market share away from Unix than from Windows. The data is conclusive on this matter with respect to Linux developers from a wide variety of software development environments. More developers who focus primarily on Linux used to focus primarily on Windows than Unix. So there is a much greater shift from Windows to Linux among developers than from Unix to Linux. The attempts to explain it away that I've seen are usually based on one or more errors:
1. Most people still use Windows as a client OS.
Of course this is true, but it has nothing to do with the myth. Unix has practically no market share at all on the client, so it is mystifying that anyone would think that the myth is based on client market share. Why would anyone try to perpetuate a myth that Linux is taking away more market share from Unix than Windows on the client if there's virtually no Unix client market share to divert to Linux? It should be self-evident that the myth relates primarily (if not exclusively) to servers. As Linux gains client market share, it will be from Windows, and there will be no myth to debunk.
2. Evans surveyed only those companies that are involved in Linux development.
First, the purpose of the Evans report was to uncover trends, needs, desires, opinions, and decisions among developers who use Linux at least part of the time. One does not ask developers who never use Linux for this information. The fact that the data debunked the popular myth that Linux takes market share primarily from Unix is a fascinating side-benefit, but it was not the purpose of the survey.
However, the same logic on who to ask still applies to this side-conclusion. Here is the question answered by the data: Among those developers who now focus primarily on Linux, which did more of them have as their primary focus beforehand -- Windows or Unix? The only way to get an answer to this question is:
1. You must ask the question of people who now have Linux as their primary development host or target.
2. You must ask the question of that subset of Linux developers who switched from some other platform as their primary platform to Linux as their primary host or target.
This is precisely what Evans did. Evans asked those who switched to Linux as their primary host or target what platform they switched from. Does anyone honestly believe it makes more sense to ask those who use Windows or Unix as their primary platform what they used to use before they switched to Linux as their primary platform?
The only question is, were these developers honest? The fact that 50% of these developers still focus primarily on Windows as of this year should tell you that we're not talking about a survey of Linux fanatics. That isn't the only thing that supports their candor, however. These developers were brutally honest in their answers regarding Linux tools, distributions, etc. All of the answers reflected reality, not zealous Linux advocacy. With the permission of Evans, I may address some of these other results in a future article.
Now, on to the questions:
1) So that 40% number...
by Anonymous Coward
...the one where 40% of developers are writing mainly to Linux. Where does that stat come from, and what does "developers" mean? It sounds really nice, but if it were true I as a Linux user would expect to see a lot more apps. Does it come from SourceForge numbers? Does it come from a poll at a website; maybe a Slashdot, Kuro5hin or NewsForge poll? Is it of *all* developers, or of *paid* developers, or of open-source developers or in-house developers or developers of commercial software? Does it include platform-agnostic developers (i.e. Java/Perl/ASP/PHP/.NET)? If so, which side does it put them on? Also, what is the error margin of the poll?
I know a bit about statistics, and more about Linux, and something smells fishy. Linux is good, so I figure the numbers are bad.
Evans Data sent out a survey to about 400 developers who are either known to have some involvement in Linux development, or work for companies that are involved in Linux development. The degree of involvement is not the critical issue, as is obvious from the results, since even now more of the developers they surveyed focus primarily on Windows than Linux. The developer responses indicate that ratio will reverse as of next year, but that was obviously unknown until they returned the survey, so it wasn't a qualifying factor. The survey included developers from all walks of life, but as far as I know, all of them are paid developers. There were many who work in very small companies (generally VARs and consultants), some who work at ISVs producing commercial software, some who work in IT departments and write custom applications for internal use, etc. The report is very detailed as to how this breaks down, and it even shows what kinds of decisions VARs tend to make as opposed to the kinds of decisions developers at ISVs tend to make.
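On the "error margin" part of the question: as far as I know the report does not publish one, but a back-of-the-envelope figure is easy to sketch. The calculation below assumes simple random sampling, which a pre-selected developer panel is not, so treat it as a rough sense of the sampling noise on a 40% figure from roughly 400 respondents, not a rigorous number from the report.

```python
import math

# Hypothetical back-of-the-envelope calculation, not from the Evans report.
# Standard 95% margin of error for a proportion, assuming simple random
# sampling -- a pre-selected panel does not meet that assumption, so this
# is only a rough indication of scale.
n = 400   # approximate number of developers surveyed
p = 0.40  # the reported proportion focusing primarily on Linux
z = 1.96  # z-score for a 95% confidence level

margin = z * math.sqrt(p * (1 - p) / n)
print(f"95% margin of error: +/- {margin:.1%}")  # roughly +/- 4.8 points
```

In other words, even under generous assumptions, a 40% result from a sample this size carries about a five-point uncertainty either way.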
Here's what you may have misunderstood. The 40% number does *not* mean that 40% of developers worldwide are focusing primarily on Linux, nor does Evans represent it that way. It means 40% of the developers Evans surveyed, and those developers were pre-selected by their use of Linux. This makes perfect sense, because the study was meant to discover information about the decisions, needs, and desires of developers who use Linux, whether they use it occasionally or all the time.
You may have been confused by some of my own pet theories as to why Linux market share growth is overlooked, but I offered that information as supplemental to the Evans data. One of my pet theories was confirmed by the data. But that was never intended as "proof" that Linux has 40% market share among *all* developers because that was not the conclusion of the survey. The 40% figure was part of several figures that refute the myth that Linux takes more share away from Unix than Windows.
The survey did not ask questions about languages such as PHP, Python and Perl, but the data suggests that a very large portion of developers use PHP, Perl or Python and other languages for web development since a very large number checked "other" in areas where those languages would apply. As far as I know, Evans plans to be specific about these languages and platforms in the next survey.
By the way, .Net is not platform-agnostic, but Mono and DotGNU promise to provide some of the .Net framework. About half the developers surveyed said they will adopt Mono or DotGNU if they are successful. Only a small portion of developers object to the idea of Mono and DotGNU enough to refuse to use it, so the vast majority do not have a strong philosophical objection to .Net (yet another confirmation that we're not talking about zealots). It is revealing, however, that only about 17% of the developers currently use .Net, and almost twice that amount use Sun ONE. This suggests they simply do not want to use .Net itself, or that it is not compelling enough to justify the price tag or to stick with Windows. There are other possible explanations, and perhaps Evans will uncover them in the next survey.
by The Bungi
What is the difference between you and the people who are demonized and flamed to no end because they quote seemingly unreliable and baseless statistics to support the idea that Windows is doing well in the market place? That Windows is better than Linux as a server OS?
It seems to me that for the past four or five years I've been seeing "statistics" and "studies" to the tune of "Linux is enterprise-ready" and "Linux will overtake the desktop" and "Linux rulez". What's different today?
If you're talking about being demonized, there is no difference. To one extent or another, I have been demonized for practically everything I write. I've learned to live with it, though, since it is an occupational hazard.
Statistics are reliable as long as you collect them properly. To the best of my knowledge, these were collected properly. What is generally unreliable - in those cases where statistics are rendered unreliable - is the analysis of those statistics. Whether the analysis is intentionally flawed or simply poorly done is debatable, depending on the circumstances. Sometimes it is flawed simply because the analyst speculated but failed to communicate that the conclusions were based on speculation.
As far as this particular report is concerned, I did my best to analyze the statistics based on what could be gleaned from the actual data, and whenever I applied speculation, I made every effort to communicate that it was speculation.
Survey results are often dubbed "unreliable" or "baseless" because the results are either designed to confirm the conclusions of the company that commissions the survey, or because the results are misrepresented, or both. I could easily misrepresent the Evans data if it were my intent to deceive. For example, the survey showed that the respondents have experienced virtually no security breaches or viruses on Linux (the number of incidents is so small as to be statistically insignificant). But each year, fewer respondents say the open source model is inherently more secure, despite the fact that their own experience contradicts this perception. If I wanted to spread fear about open source, I could quote what the respondents "feel" without revealing the hard data regarding what they actually experience. I'm afraid that's what some analysts or research companies may do, which is why they get a deservedly bad reputation. This study does not deserve that kind of reputation. As for what "rulez", my statement in my recent column that Linux is a better server platform than Windows is my own, although it is confirmed by several case studies. I would be surprised if these case studies haven't already appeared on Slashdot.
3) IDC credibility
IDC is always publishing those studies about future market share, but where are the studies comparing past IDC predictions with the actuals?
We can't even get solid Internet traffic statistics. Look at the mess Worldcom's inflated traffic numbers caused.
First, anyone who has read my articles for long would know that I am one of the world's most severe critics of research organizations and their analysts. I am still very suspicious of most research reports and the analysts who help produce them. Exceptions include Dan Kuznetsky (IDC), who is quite good, and Esther Schindler (Evans), who is also extremely good. There are others, of course, but these come immediately to mind.
I agree that someone should keep a record of research company predictions and hold them accountable for their errors. I maintain that this is a good idea, and I have suggested it before.
Having said that, allow me to correct your perceptions on a few issues. First, the report is from Evans Data, not IDC. Second, while the data does make predictions, the primary issue I addressed was not a prediction, it concerned data regarding existing Linux developers. Of those developers who currently focus primarily on Linux, more used to focus primarily on Windows than Unix. If you want to add the prediction to that, here it is: Of ALL these developers (including those we surveyed who still focus primarily on Windows), more plan to focus primarily on Linux next year than Windows. As it is, more focus primarily on Windows today.
Before I correct one last possible misperception, it may help you to understand how the process worked.
They gave me the survey to review. I commented on the questions as best I could, given that this was my first project of this type and I wasn't sure what they were most interested in discovering from the data. (Remember that these reports have a dual purpose. They exist to serve clients of Evans Data, and they also provide interesting results for the general public.) They made almost all of my suggested changes (some would have made the survey too long, which is a perfectly reasonable concern, so we condensed some questions to compensate).
By the way, do not read too much into that part about serving clients of Evans Data. Yes, I suspect that in some cases companies commission reports in order to get the results they want. That is one reason why I am so critical of research groups and their reports. But neither the survey nor the way Evans handled the process ever hinted at this kind of manipulation. As far as I can tell, the commercial purpose of the survey had nothing to do with whether or not developers are moving from Windows to Linux. It had more to do with what existing Linux developers want and need.
Anyway, some time later, I received the results, along with many standard cross-tabulations of the data. I had just over a week to produce the report, which was extremely difficult, but Evans was very responsive and cooperative. Sometimes the data suggested a conclusion or trend but didn't confirm it. In some cases, I was able to confirm my suspicions by requesting a cross-tabulation of data to isolate who was saying what. In other cases, the best I could say was something like, "the data suggests X, but there could be other explanations." But Evans ALWAYS responded to my questions about possible errors, ALWAYS produced the cross-tabulations I requested without even asking why, and NEVER suggested that I change my conclusions or approach to analysis. Evans even responded to requests for cross-tabulations when the answer I was looking to understand had little or nothing to do with its target audience.
In only one case did Evans ever suggest a trend I didn't see for myself, and they were very careful to say that I could toss out that conclusion if I didn't agree that the data suggested this trend. They pointed out that despite the increase in developers who make Linux their primary focus, there was also a big increase in those who develop on multiple platforms. That change was obviously valid, because the hard data confirmed it. The problem is that I was unable to explain from the data why this apparent paradox existed. We would have had to ask more questions to qualify it. Personally, I think the answer is obvious, but because we asked no questions to prove my analysis, I simply suggested it as a possibility in the report (along with at least one other possibility). In a strong economy, companies dedicate developers to projects full-time, and that produces more people who spend all their time on a single platform. A down economy tends to force companies to reduce the number of programmers who are dedicated to a project full time. More people work on several projects at once, some of which include platforms other than their primary platform, whether that is Windows, Linux, or something else.
Now, since I haven't seen the final version of the report, it is entirely possible that Evans edited what I finally submitted into something abominable. But given the way they handled the entire process from start to finish, I can't even begin to imagine why this would be the case. As far as any of their dealings with me and the data were concerned, I never even perceived a hint of integrity problems.
4) Linux Usage Growth
Ok, this statement was thrown in my face a while back.
It's easy to go from 1 to 2 users, or 2 to 4, and claim a fantastic growth rate, but what constitutes the magic number of users before it's truly a desktop operating system being used daily by a large enough mass to catch the attention of large software development firms that will create/port applications to Linux?
Is growth rate in terms of the number of desktops conquered (e.g. a growth rate of 1.5 million desktops a year) a better measuring stick than doubling/tripling the number of users in X years? What, in your opinion, is a good measuring tool for determining the growth rate/acceptance of Linux in the market?
I don't know of any good measure of growth rates. Your example is perfectly valid. One of the most amusing examples of error in this regard has to do with spin. I know of a magazine that quoted endlessly (years ago) that OS/2 was a dead-end operating system because it only had 2,000 native applications. Later, the same magazine published a story about how Windows NT was gaining good momentum, as evidenced by its 1,200 native applications.
And, as I said in my article, even accurate numbers about existing market share (not growth) can be deceiving. If one company uses 50 Linux servers to do the same job as 100 Windows servers used by another company, Windows appears to be the more popular platform because it has a greater market share. Yet the only reason Windows has a greater market share is because it takes more Windows boxes to do the same amount of work. Whether or not you agree with the assumption that Linux outperforms Windows, hopefully you can see that market share figures do not reflect important information.
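To make the arithmetic above concrete, here is a minimal sketch using the toy numbers straight from the example (not survey data) showing how box counts and work done produce different "shares":

```python
# Toy numbers echoing the example above; not survey data.
linux_servers = 50     # boxes one company needs for the job
windows_servers = 100  # boxes another company needs for the same job

# Installed-base "market share" counts boxes, not work done.
share_by_boxes = windows_servers / (linux_servers + windows_servers)

# If 50 Linux boxes do the same total work as 100 Windows boxes,
# each Linux box carries twice the load (the example's assumption).
work_linux = linux_servers * 2.0
work_windows = windows_servers * 1.0
share_by_work = work_windows / (work_linux + work_windows)

print(f"Windows share by boxes: {share_by_boxes:.0%}")  # 67%
print(f"Windows share by work:  {share_by_work:.0%}")   # 50%
```

The same two companies yield a two-thirds Windows "share" by installed base but an even split by work performed, which is the sense in which raw market share figures can mislead.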
5) Dear Nicholas Petreley (Score:5, Funny)
You might be unaware of this fact, but the words Usage Statistics, IDC, study, etc. trigger some deep emotions in the Slashdot community.
So can you tell me, Is BSD dying?
I have no idea if BSD is dying. I personally believe BSD is an excellent operating system, so I hope it does not die. Again, it was Evans, not IDC. The Evans study didn't ask much about BSD, but what it did ask revealed that the respondents consider it to be one of the most secure operating systems available (particularly OpenBSD).
6) Distros and numbers
Part of the problem in counting the number of Linux desktops/servers/etc. is that anyone can get it from any of a million different places (friends, FTP, subscriptions, etc.), but the industry tends only to count sales. I know for a fact that every Linux CD I have has been installed on at least 10 other systems...some are upgrades, others are new users, and still others are moving over from another distro.
And this leads to the other problem...what are the *real* usage stats on distros? It's hard to tell. From talking to people, a lot of people use Slackware and Debian for servers, Red Hat, Suse and Mandrake for desktops...but how can we really count who is using what?
This was a survey of developers who use Linux at least some of the time, so it had nothing to do with sales, friends, etc.
As I said in my LinuxWorld article, Red Hat was by far the most popular distribution. Debian was the most popular non-commercial distribution. You wouldn't find many surprises in the rest of the list. The survey didn't really identify what people use the most on servers vs. desktops. But they use Red Hat the most, period.
Interestingly, most respondents think the issue of commercial vs. non-commercial is irrelevant, although only by a slim margin over those who prefer a commercial version. Even more interesting was the dichotomy between those who prefer commercial distributions and those who prefer non-commercial. They seem to disagree the most about what constitutes the strength of a commercial or non-commercial distribution. One gets the impression they have not tried a distribution from the "other side".
7) Linux announcements from big companies...
Do you see announcements from heavy hitters (like Dell, IBM, etc) helping sway more 'desktop users' to switching to Linux?
This is only my opinion, not something from the report, but yes, I believe those announcements do help a lot. If anything from the survey supports this conclusion, it is that the respondents favor IBM by far over other hardware companies, and IBM has been the most vocal about Linux. On the other hand, Dell was highly rated, too, and there has been a lot of controversy over Dell support for Linux.
8) My question
Do you think statistics are nothing more than a marketing tool, and should the open source community use these numbers (usually skewed) to get some leverage when promoting open source alternatives to the higher-ups?
Statistics are what you make them. After that, assuming you tried to get the most objective and realistic statistics, their true value vs. manipulative value still depends on how you use the results.
As for what you make them, you can ask the same question hundreds of ways. Some forms of a question will get you the answer you "want". Other forms of the same question are more likely to get you an honest, objective and informative answer. There are techniques such as "distractors", etc., that can help, if honesty is what you want.
The problem is that honesty is not always what one wants. And even when the survey turns up honest answers, it is easy to distort the results if that is what you want to accomplish.
What the open source community does with statistics is up to the open source community and the conscience of the individuals within it. Personally, my religious convictions are the motivating force behind everything I do, so I am committed to the truth. If I stray from the truth, it is because I'm far from perfect, whether that imperfection surfaces as an imperfect attitude or simply a careless mistake. But I can't tell the open source community what should drive their motives. Each one should probably act according to his own conscience.
9) At what point will Linux reach critical mass?
At what level of penetration (% install base share) will Linux reach critical mass on the desktop? It's much less relevant from a server perspective, since it appears that Linux has already reached critical mass on that front. Should we assume that when Linux supplants Apple as the number two platform (although from what I have seen this has already happened, nobody is stating it yet in the mass media), we will see a proliferation of commercial Linux offerings and (more importantly) better OEM hardware support?
I have no idea what point would be considered critical mass. I'm not even sure it is a good idea to measure critical mass in terms of installed base. This was a good measurement for commercial products, but Linux has an appeal that transcends commercial software. It is open source and free, and those two elements make it difficult if not impossible for any commercial software company to compete.
The Evans Data survey showed that developers choose Linux first because it is stable, second because it is open source, and third because of the low cost. Commercial software can be made stable, but it is not likely that some companies will ever open their source code (make it truly open, that is, not just let people have a supervised peek at it now and then). And, as the folks at what used to be Netscape know only too well, it is very difficult to compete with "free as in beer", no matter how much propaganda you spout about total cost of ownership. (Of course, bundling was an issue there, too, but the point about free is still hard to deny.)
Speaking strictly for myself, I would say the desktop is a unique market that will transform radically over the next several years. Personally, I think digital rights management (DRM), Palladium (or whatever it's called this week), and the evolution of media centers and game consoles pose a much more serious threat to Linux on the desktop than market share or OEM bundling.
10) Gathering data.
How is the data gathered, and are the same techniques used for other OSes in comparison? Also, do you consider corporate desktops or personal desktops? I ask this because many employees would rather use Linux as their primary desktop, but management strong-arms Windows.
See above for more details on how the data was gathered. The survey had little to do with desktop vs. server Linux use. It was focused on development and the needs of developers. Evans asked whether they were developing more for desktop or server applications. As it turns out, most of the developers work on server-side applications. Based on the data, a good deal of the development is dedicated to web-based applications. There is also significant and growing activity and interest in using Linux on 64-bit architectures and embedded systems.
I know of many people who are strong-armed into using Windows. But since these companies are making money by developing on Linux and for Linux, I doubt their developers are being strong-armed into using Windows.
Bio: Nicholas Petreley is a consultant and freelance writer based in Asheville, NC. He was founding editor of LinuxWorld, and hosts the non-profit weblog VarLinux.org. He can be reached at email@example.com.