Experience at Excite@Home (Score 5, Interesting)

I was principal architect for Excite Clubs for 4 years. During a period of one year, we went from 100K page views to over 20M page views a day.

We had a rather unique situation. We started the project on Windows NT 4.0 and later migrated to Win2K. During that time, we were barely able to handle 1M page views per day on the Windows boxes. In addition, the average page generation time was 2 seconds. The 20 Windows boxes we had in production cost approximately 17K apiece (quad Compaq ProLiant with 1 GB of RAM) and were all experiencing 80% or more CPU usage.

The 20 boxes were managed by 1 sysadmin (6 years' experience from MS consulting services) with a full-time assistant. This does not count the high school students we had wandering the racks hard-rebooting terminally ill boxes.

Most admin time was spent on upgrades, boxes that would just stop working (we called it spontaneous server rot), and trying to use a host of opaque, inadequate tools to detect and eliminate bottlenecks. Build, rollout and staging tools were also a big time sink. Finally, installing software onto a new machine in the right order with all configuration parameters took an extraordinary amount of time.

In addition, I had one full-time engineer writing nothing but 'nanny' programs to monitor the program and restart the process when there were problems.
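
For readers who have not seen one, a 'nanny' is roughly the loop below. This is only a minimal sketch in Python, not the code we ran: the function name `nanny`, its parameters, and the stand-in crashing command are all hypothetical, and it only catches a process that exits with an error, not one that hangs.

```python
import subprocess
import sys
import time


def nanny(cmd, max_restarts=3, backoff_seconds=0.0):
    """Run cmd, restarting it whenever it exits with a non-zero status.

    Returns the number of restarts performed. Hypothetical helper for
    illustration; it does not detect hung (still running) processes.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)  # blocks until the child exits
        if exit_code == 0:
            return restarts  # clean shutdown, nothing to restart
        if restarts >= max_restarts:
            return restarts  # give up; a human has to look at the box
        restarts += 1
        time.sleep(backoff_seconds)  # pause to avoid a tight crash loop


if __name__ == "__main__":
    # Stand-in for a crashing server: a child process that always exits 1.
    flaky = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(nanny(flaky, max_restarts=2))
```

A real nanny would presumably also need health checks (for example, timing a test page request) to catch the boxes that stopped responding without actually dying.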

With all this work, the system still went down daily.

After much politicking, we translated the program to JSP (a straight page-for-page translation) and moved to Solaris machines. The Java middle tier ran on Solaris. The 20 Compaq boxes were replaced with 16 Solaris boxes. Oddly, we paid almost the same amount per box (20K versus 16K).

Immediately, we were able to handle more than 5M page views per day with no changes to the software. In addition, the page generation times went down to 0.1 seconds and the highest observable CPU load was less than 10%.

Our sysadmins were replaced with a part-time (less than 5 hours per week) Solaris admin. The rollout scripts were trivial to write and maintain. We had very few upgrades or security patches.

Most important, the host of tools provided to monitor system performance and tell exactly where the bottlenecks were, together with the truly deep understanding of the system internals by the sysadmins, allowed us to eliminate the remaining problems and scale to 20M page views per day.

That is right. Two orders of magnitude better performance for precisely the same code. And an order of magnitude less admin time.
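
For anyone who wants to check the raw ratios, here is a rough sketch in Python using only the figures quoted above (box counts, page views per day, page generation times). The variable names are mine, and it deliberately leaves out CPU headroom (80% or more on Windows versus under 10% on Solaris) and admin time, so it covers only part of the comparison.

```python
# Figures quoted in this comment; variable names are illustrative only.
windows_boxes = 20
windows_views_per_day = 1_000_000
windows_page_seconds = 2.0

solaris_boxes = 16
solaris_views_per_day = 20_000_000
solaris_page_seconds = 0.1

# Page views served per box per day on each platform.
windows_per_box = windows_views_per_day / windows_boxes
solaris_per_box = solaris_views_per_day / solaris_boxes

# Ratios of Solaris to Windows (throughput) and Windows to Solaris (latency).
throughput_ratio = solaris_per_box / windows_per_box
latency_ratio = windows_page_seconds / solaris_page_seconds

print(f"Windows: {windows_per_box:,.0f} page views per box per day")
print(f"Solaris: {solaris_per_box:,.0f} page views per box per day")
print(f"Per-box throughput ratio: {throughput_ratio:.1f}x")
print(f"Page generation speedup: {latency_ratio:.1f}x")
```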

Those were measurable results. Here is my 'opinion' of why the differences were so dramatic.

I taught Win32 programming and system internals for four years. I was also chief scientist for Redmond Communications, which publishes a technical journal on Microsoft software and strategies. So I am not a Linux bigot.

My observation has been that no one truly understands the internals of a Windows system. Just as I start to get a handle on the latest caching, memory management, and threading issues, there is an 'upgrade' via some patch that changes many of the internals. In addition, as shown by the above threads, most Windows sysadmins seem to have vastly different experience and understanding of how to configure and maintain systems.

Unlike most nerds, I will not blame the admin, but blame the system. In the scientific community, Windows, in practice, has proven to be somewhat opaque.

Unix, on the other hand, is incredibly well documented and all source is available. Uncertain how inodes are locked and released? No problem, there are many books and references to help you. If worst comes to worst, crack open the damn code.

This has nothing to do with open source, but more to do with which communities evolved the technology and the underlying motivations of companies hawking their wares.

Note, this is not a good thing or a bad thing; it is only a thing.

There were many people out there criticizing the study's accuracy. I must say that, among the colleagues I have spoken with, not a single one doubts its veracity, from personal experience on BOTH sides of the aisle. I just knew that I had to share my own experience with you. My only doubt about the story is that I would say 'order of magnitude' for production servers.

Thank you for your time,
Carmine Mangione
