What Actually Makes Up "Linux"?
Posted by CmdrTaco on Wed Jun 20, 2001 11:14 PM
from the somebody-did-actual-research-for-a-change dept.
David A. Wheeler sent in linkage to his extensive analysis of the true size of Linux. There's an amazing amount of information in here, and although it focuses on Red Hat 7.1, it still has tons of interesting bits of information about the code that makes up the distribution. Breakdowns include languages, licenses, cost estimates, and stats that in no way clear up the legendary GNU/Linux debate that will undoubtedly be engraved on tombstones somewhere.

Linux is composed primarily of: (Score:4)
Several hundred utilities.
And three hundred and fifty thousand annoying slashdot trolls.
--
Windows is made up of the following (Score:5)
Windows is made up of the following:
You can moderate this down, but I challenge you to find proof that this situation is otherwise.
Re:GNU/Linux (Score:3)
A lot of the code they're listing as "Linux" code isn't GNU code at all. It's released under the BSD license (e.g. Apache). It's released under the Artistic license (e.g. Perl). Calling the system GNU/Linux simply because it has some GNU tools on it is like me calling my Windows box Netscape Windows because I have an old version of Navigator on it or GNU/Windows because I have GNU apps on it.
I think the reason people are more apt to further describe Linux as GNU/Linux is not because it uses GNU apps, but because it is released under the GNU General Public License.
GNU/Linux (Score:3)
The difference is the GNU System and the utilities that were built up beside the linux kernel and supporting it. The difference between linux the kernel and linux the system that we all know and love is the GNU System.
And that's why the system is GNU/Linux, and not 'Linux', which merely refers to the kernel.
Good up until the point he mentions Bill Gates (Score:5)
In particular, the bit about documentation. The thing that Linux lacks these days is decent documentation in a lot of areas, in particular things like devfs (which the author even admits is now poorly documented; the instructions that are available are out of date).
Coming from a BSD background (no, this isn't an excuse for a platform war - just hear me out), documentation is just as important as the code itself. This sometimes means that the availability of certain features in BSD is a generation behind that of Linux, but when they arrive, the documentation is top notch: it contains correct spelling and grammar, notes what bugs are present, provides examples of correct usage (this is especially relevant in documenting programming functions whose incorrect usage may have a security impact) and so on. Overall, it's an issue of documentation quality.
The author of the paper may scoff in the direction of Bill Gates, noting the ability of the Linux community to create and maintain an operating system, but in the process he has brought the whole paper down by exposing the one thing that Linux, as a "disparate sources, one distribution" operating system, can never have and that Microsoft products and, from my perspective, the BSD operating systems do have: documentation that exists in a single form and is written in a style that is consistent across the entire operating system. (This is not the case with Linux. Some things use manpages, others use "info", others use text files, others use HTML documentation. Heaven knows how a new user on Linux (advocacy is about attracting new users, right?) is supposed to navigate this mess without a considerable level of pain and/or persistence.)
And before you let the flames begin, have a poke around on, say, the manual page listings on the NetBSD/OpenBSD/FreeBSD websites and compare them to the ones you see on Red Hat and so on.
Linux is... (Score:4)
...I know this, and still I find myself compulsively rebuilding my kernel.
Linux - Microkernel (Score:3)
2437470 source lines of code for the Linux kernel. Doesn't that worry some people out there? We have a monolithic kernel almost two and a half million lines long. I think that by 2.6 the kernel is going to collapse under its own weight unless the designers decide to reorganize it in a fundamental way. Maybe it's time for a Linux-Hurd fusion project that will turn Linux into a true microkernel.
Re:What really makes up "Linux"... (Score:3)
That advance certainly didn't come in 1991, because the Amiga's clipboard.device already could do that earlier (at the latest, 1990 when 2.0 was released, and probably much earlier in the 1.x days of the 1980s but I'm not 100% sure). And this sort of thing wasn't really what the Amiga was famous for, so (I am speculating) that idea may have been stolen from the Mac.
Linux has faster filesystems. But Linux and NT both still suck at that, so I guess I should mention something more substantial:
An area where Linux is way ahead of Windows would be extensibility.
For example, Linux and Windows, when running on x86, both have a severe problem where code can be executed on the stack. If you run a network service and it has a buffer overflow bug, then bad people on the Internet can write their own code and execute it on your machine. So some guys [openwall.com] decided it wasn't such a good idea for that to be possible, and they released some kernel patches to make it so that this infiltration technique doesn't work.
This actually reveals two ways that Linux is further ahead than Windows.
That doesn't put Linux just a few years ahead of Windows. It puts Linux a whole generation ahead of Windows, and even my beloved (but no longer maintained) AmigaOS. Freeness itself is a huge feature. (Alas, it's about all that Linux really has. But it's a biggie!)
---
Re:What really makes up "Linux"... (Score:3)
How easy is this in Windows? I can do this with one command line in Linux (and any other *nix for that matter). Yes, I have to know REs to do it. Yes, it took me several hours to learn the RE syntax and several weeks of using REs to make them second nature to me, but now I can do tasks like these in a matter of minutes. With ANY Windows system this would take several weeks.
"Usability" is a slippery term. Also, while Microsoft products do meet a certain minimum level of usability, there is an equal amount of crappy third-party software out there that is every bit as "unusable" as the hobbyist stuff for Linux.
And just how "usable" is, for example, MS Office? Sure enough, even monkeys can do the basics, but I would bet you 2:1 that 90% of Word users only achieve 40% code coverage of Word. In other words, if you start digging into everything, Word is every bit as obtuse and difficult as state-of-the-art 1970 glass teletype editors. More difficult, I would argue, because you could learn everything there was to know about those "unusable" editors in about two hours. Of course, you couldn't make a marketing brochure with those editors (unless you wanted to go out of business), but my point is that "usability" is pretty danged meaningless. "Suitability" is more to the point. Word is lousy if you want to do accounting.
Microsoft has actually substantially held back the increasing usability of systems by keeping the PC the dominant platform. Most people do not need general purpose computing devices. Home users need an "appliance" that does Web, e-mail, instant messaging, personal finance, word processing and maybe a spreadsheet. Business users need that plus presentation software, calendar/scheduling etc. These devices could be "embedded" type devices (think the Palm metaphor) that are much easier to use than PCs. Why should ANYONE but the very few who need more need to know about clock speeds, RAM size, ISA/EISA/PCI, IRQs, USB, etc.?
The claim that Microsoft has advanced usability is absurd. They have been struggling against their own monopoly platform for over a decade, not because of their own failure, but because of the inappropriate design of the platform for its present use.
I will certainly grant that one must know a lot more to make good use of Linux on a PC than to make good use of Windows on a PC. But which is easier to use, a Palm Pilot or a Windows PC? A TiVo or a Windows PC? A Nintendo or a Windows PC?
Usability my rotund fundament!
Re:What really makes up "Linux"... (Score:3)
$ find
I used to do it with a find within a find and a sed command, but the perl trick is a very nice shortcut, esp. since it edits the files in place and leaves backups behind!
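For readers without the one-liner at hand, here is a rough sketch of the same idea (regex search-and-replace across a directory tree, edited in place, with backups left behind). It is not the command from this thread, which is cut off above; the function name `replace_in_tree`, the `*.txt` default and the `.bak` suffix are all assumptions made up for illustration.

```python
import re
import shutil
from pathlib import Path


def replace_in_tree(root, pattern, replacement, glob="*.txt"):
    """Regex-replace in every matching file under root, keeping a .bak copy.

    Hypothetical helper: returns the list of files that were changed.
    """
    regex = re.compile(pattern)
    changed = []
    # sorted() materializes the file list before anything is written,
    # so freshly created backups are never picked up by the walk.
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        original = path.read_text()
        updated = regex.sub(replacement, original)
        if updated != original:
            # Leave the untouched original next to the edited file.
            shutil.copy2(path, path.with_name(path.name + ".bak"))
            path.write_text(updated)
            changed.append(path)
    return changed
```

Files that do not match the pattern are left alone and get no backup, so a second run over the same tree is a no-op.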
Re:Linux - Microkernel (Score:3)
--
SecretAsianMan (54.5% Slashdot pure)
Re:Linux is composed primarily of: (Score:3)
And most of which preach a mantra they don't really understand.
---
Re:What really makes up "Linux"... (Score:5)
I don't see how that's true at all. In both technology and the bottom line, Microsoft is *years* ahead. Technology: let me offer one example: go to a web page (IE) with some kind of table with data in it. Copy the table. Paste it into Word. It actually becomes a Word table! Paste it into Excel. It actually places the data, and the formatting, into the cells! How far is Linux from that level of ease of use, that level of "object linking and embedding" across apps? Do you think the multiple desktop standards help or hinder this task?
And in terms of bottom line, Linux companies are still trying to figure out how to make a buck. Red Hat just now moved into the positive column; VA and others lay off people seemingly every week.
I'm a fan of Linux because I'm a hacker. I like the shell, I like the flexibility and customisability that come with having dozens of "glue" tools. But the fact is, hackers are the minority of computer users, and this is only going to be more and more true in the future. For the masses, ease of use is priority 1, and it seems, at least to me, that the "other" platform has a great lead in that arena.
---
What does a user actually need? (Score:3)
After reading the analysis, two things sprang out at me. The first is that a lot of the stuff on a Linux system is meant for development, rather than just using the system. The second is that lots of the stuff on the list clearly is "application" and not anyone's idea of an "operating system".
Specifically, in the top ten, we have:
Development Tools
Applications
(Also in the top 20 are libgcj, teTeX, postgresql, and xemacs. And we won't get into the issue of whether Mozilla (#2) should be considered part of the operating system.)
So my question is, what's the size of the non-development/non-application stuff? What's the size of the kernel plus the essential utilities (most of which are GNU, as RMS points out ad nauseam)?
Size of GPL disclaimers? (Score:5)
What really makes up "Linux"... (Score:5)
It's something I could go on and on forever about because it really is something special in a world dominated by the shadow of Gates and Jobs. "Those people" who work "over there" don't make this. We do! While all those numbers can start to quantify this, you can't really put a dollar value on it the same way you can't put a dollar value on freedom. Funny thing to be able to say that about a bunch of software...
"I may not have morals, but I have standards."
Re:What really makes up "Linux"... (Score:3)
I still think that the advancements made by Linux in the time frame it has been available are more than what MS has done since its inception.
Let's see: since its inception, M$ has developed several complete sets of development tools, including the first high-level language tool for any microcomputer. It has developed the world's three most popular desktop OSes (MS-DOS, Win9x, WinNT/2000) and an architecture that makes it easy to configure the later OSes with a remarkable variety of hardware, plus all the support tools that go with them. It has developed the world's most popular suite of office applications, the second most popular groupware system, and a framework that makes it relatively easy for the average computer user to use these tools together.
The Linux community has developed, in about half the time... a kernel. Wow!
OK, so in some cases M$ started by buying the product (e.g. the first versions of MS-DOS and Excel), but then Linus didn't start from scratch either; he started with Minix.
As far as I can see (Score:3)
---
GNU vs. Linux (Score:4)
Linux is in its simplest form much like a Japanese car built with 87% United States parts.
On a personal note:
In the beginning there was Linus and the word was with Linus. Accept Linux into your heart and you shall have uptime eternal.
Kernel 3:16:
For Linus loved man so much that he gave his first begotten OS.
Re:Linux - Microkernel (Score:4)
--
BACKNEXTFINISHCANCEL
LOD: Lines of Documentation (Score:3)
Which might help explain another number that keeps cropping up: 5% of the OS market.
Clarifications about the paper (Score:5)
> Using RedHat as a distro for this project isn't that good of an idea.... it's just an unrepresentative mass of programs and code! I can safely say that most Redhat users will never use about one-quarter of the programs in their distribution...
That's true for any of today's operating systems. No user uses all the code in Windows, either. Even real-time OS's have more code developed for them than is used by any given user. As a measure of effort, though, examining all the code makes sense.
> Since when is the number of lines of code proportional to the quality of the software? If Red Hat 7.1 has 30 million lines of code over 6.2's 17 million, does that mean the product is 76% better? Is the code getting more sloppy as more programmers get involved? I feel like counsel is leading the witness for the author to say 7.1 has "60% more effort" under the COCOMO model.
I never said it was "better", I said it included "60% more effort." Better is a value judgement. Effort is measured in person-years.
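The two percentages in that exchange measure different things, and a quick check makes the gap visible. The sketch below uses only the rounded line counts from the question (17 and 30 million); the "single project" scaling with exponent 1.05 is a hypothetical comparison added here, not how the paper computed its effort figure.

```python
# Rounded SLOC figures as quoted in the question above.
sloc_62 = 17_000_000
sloc_71 = 30_000_000

# Growth in size: this is where the "76%" in the question comes from.
size_growth_pct = (sloc_71 / sloc_62 - 1) * 100

# Hypothetical: effort growth if each release were one giant project
# whose effort scales as size ** 1.05 (assumption, for contrast only).
lumped_effort_growth_pct = ((sloc_71 / sloc_62) ** 1.05 - 1) * 100

print(f"size growth: {size_growth_pct:.0f}%")
print(f"single-project effort growth: {lumped_effort_growth_pct:.0f}%")
```

Neither number is a quality score. Size growth counts lines; the effort figure is an estimate of person-years, and it depends on how the code is split into packages.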
> The kernel shouldn't be two million lines of code. How much of that is drivers? And how much of the drivers are duplicated from one driver to another?
Section 3.2 specifically discusses this; 57% of the lines of code are drivers. Duplicate files are only counted once, but "partly duplicated" files are much harder to detect (and to discount when they happen); they certainly happen in the Linux kernel. However, the COCOMO model is based on real project data, and many other projects include cut-and-pasted code (for good or ill).
> Ok, so this guy claims that Linux would cost a little over $1 billion (US) to develop. I wonder what the big deal is. I'm sure Microsoft has spent that much over the years on Office+Win9x+WinNT+Backoffice+etc ... The only thing incredible about this number is that most of that billion was completely unpaid, or at least underpaid.
But I believe that is a big deal. Gates' "Open Letter to Hobbyists" assumed that if people just shared code, no large project would be developed. GNU/Linux and other open source/free software systems show the assumption wrong, and this paper has the numbers to prove it. You can argue which is "better", of course, but the notion that it can't be done is no longer debatable.
> Are there estimate[s of] how much money in form of salaries were ever paid to programmers for the code and how much was in effect done not only voluntarily, but also completely on an unpaid basis?
Unfortunately not; it's not even clear how to find out. You would have to go back to individual patches submitted to every project, and few people identify in their patches "I was paid to do this."
> 2437470 source lines of code for the Linux kernel. Doesn't that worry some people out there? We have a monolithic kernel almost two and a half million lines long. I think that by 2.6 the kernel is going to collapse under its own weight unless the designers decide to reorganize it in a fundamental way.
It's the nature of a monolithic kernel, and in any case, most of that is in modules (which are individually much smaller and only loaded when needed). I see no evidence of a "collapse", though clearly there are competitors (like HURD) that might eventually replace it in the market.
> Quoting statistics/data going back to '95 is way out of date by today's standards, even '99 is now very old.
It may be old, but it helps give perspective. A simple SLOC number doesn't mean much to people, unless it's compared to something else.
> The cost formula includes a term (ksloc**1.05): i.e. thousands of source lines to the power of 1.05. This reflects the fact that the bigger a program becomes, the harder it is to add new lines, because the system you are adding to is more complex. He plugs the size of the entire code base of RH7.2 into this formula. This seems unreasonable to me - these are many almost independent packages.
No, I don't do that (for the reason you cite). Section 2.3 of the paper discusses this: "Each build directory had its effort estimation computed separately; the efforts of each were then totalled." Appendix A mentions that sloccount was given the "--multiproject" option, which implements this.
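To make the difference concrete, here is a minimal sketch of per-package versus whole-distribution estimation. It assumes the textbook Basic COCOMO organic-mode formula, effort = 2.4 * KSLOC ** 1.05 person-months; only the 1.05 exponent is quoted in this thread, so the 2.4 coefficient is an assumption, and the package sizes are invented for illustration rather than taken from the paper.

```python
def cocomo_person_months(sloc, a=2.4, b=1.05):
    """Basic COCOMO (organic mode, assumed): a * KSLOC ** b person-months."""
    return a * (sloc / 1000.0) ** b


def total_effort(package_sloc):
    """Estimate each package separately, then add the efforts up."""
    return sum(cocomo_person_months(s) for s in package_sloc)


# Hypothetical package sizes in SLOC (not figures from the paper).
packages = [2_000_000, 500_000, 100_000]

separate = total_effort(packages)              # estimated per package
lumped = cocomo_person_months(sum(packages))   # one giant project

print(f"per-package total: {separate:,.0f} person-months")
print(f"single project:    {lumped:,.0f} person-months")
```

Because the exponent is greater than 1, treating everything as a single project always yields a larger estimate than summing the packages, which is why the split matters.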
Anyway, I hope people found this study interesting. It sounds like several people did.
Re:Well...there are more than some GNU (Score:4)
How about solving this by creating a fanciful glyph (vaguely 'L' shaped) and allocating a point in the Unicode codespace to replace the name? There would no longer be a spoken name for /The Operating System Formerly Known as (GNU\/)?Linux/.
The Glyph could mean all things to all people. Everyone would be happy enough to resume productive activities.
Re:GNU/Linux (Score:3)
But hey, I'm entitled; all the times you lousy morons write 'looser' when it's goddamned LOSER - buy a friggin' dictionary, already! - and I've never said a word about your inability to spell such a simple word correctly, until now....
Make you a deal: I'll call it GNU/Linux, as stupid as that sounds, when you convince all the twits to write 'loser' correctly. Then we'll all be happy campers.
Until then I hold the GNU hostage. And I'm armed.
Max
Growth of Linux (Score:3)
-- Dr. Mike
Re:The 1,000,000,000 Dollor Linux Standard (Score:3)
Frankly, show me one useful feature in the RH distribution that hasn't been done before.
And few realize that what really makes up Linux (Score:4)
x-windows???? (Score:4)
Am I the only person who cringes every time I read "x-windows?"
Or have they officially changed the name? (might as well...)
--