High-Performance Web Server How-To



Posted by Hemos on Saturday October 19, 2002 @07:03AM from the build-it-and-maybe-they-will-come dept.

ssassen writes "Aspiring to build a high-performance web server? Hardware Analysis has an article posted that details how to build a high-performance web server from the ground up. They tackle the tough design choices and what hardware to pick and end up with a web server designed to serve daily changing content with lots of images, movies, active forums and millions of page views every month."

This discussion has been archived. No new comments can be posted.


High-performance web server (Score:5, Informative)

by quigonn ( 80360 ) writes: on Saturday October 19, 2002 @07:08AM (#4484219) Homepage

I'd suggest that anybody who needs a high-performance web server try out fnord [www.fefe.de]. It's extremely small and pretty fast (without any special performance hacks!); see here [www.fefe.de].

- Re:High-performance web server (Score:4, Informative)
  
  by Electrum ( 94638 ) writes: <david@acz.org> on Saturday October 19, 2002 @09:36AM (#4484533) Homepage
  
  Yep. fnord is probably the fastest small web server available. There are basically two ways to engineer a fast web server: make it as small as possible to incur the least overhead or make it complicated and use every possible trick to make it fast.
  
  If you need features that a small web server like fnord can't provide and speed is a must, then Zeus [zeus.com] is probably the best choice. Zeus beats the pants off every other UNIX web server. Its "tricks" include non-blocking I/O, linear scalability with regard to number of CPUs, platform-specific system calls and mechanisms (acceptx(), poll(), sendpath, /dev/poll, etc.), sendfile() and sendfile() cache, memory and mmap() file cache, DNS cache, stat() cache, multiple accept() per I/O event notification, tuning the socket buffers, disabling Nagle, tuning the listen queue, SSL disk cache, log file cache, etc.
  
  Which design is better? Depends on your needs. It is quite interesting that the only way to beat a really small web server is to make one really big that includes everything but the kitchen sink.
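A few of the socket-level tricks in that list (non-blocking I/O, socket buffer tuning, disabling Nagle, a longer listen queue) can be sketched with Python's standard `socket` module. This is only an illustration under assumptions, not how Zeus or fnord is implemented: the function names, the backlog of 1024 and the 256 KB buffer size are made up for the example, and the kernel is free to round or cap the requested values.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=1024, bufsize=256 * 1024):
    """Create a non-blocking TCP listener with a few common tunings applied."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Ask for larger socket buffers; the OS may adjust the actual size.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)
    srv.bind((host, port))
    srv.listen(backlog)      # length of the pending-connection queue
    srv.setblocking(False)   # accept() will no longer block the process
    return srv


def tune_connection(conn):
    """Apply per-connection tunings to a TCP socket."""
    # TCP_NODELAY turns off Nagle's algorithm, so small writes go out at once.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    conn.setblocking(False)
```

A real server would pair this with an event notification mechanism (for example the standard `selectors` module) and call `tune_connection()` on every socket returned by `accept()`.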
  
- - Re:High-performance web server (Score:2, Informative)
    
    by Fefe ( 6964 ) writes:
    
    fnord supports CGI and PHP can be run in CGI mode.
    Actually, at least two people are using fnord to host a PHP site.
    
    Don't expect stellar performance, though. PHP is by no means a small interpreter. I guess it would be possible to be fast and PHP compatible with some sort of byte code cache. If there is enough demand, someone will implement it.
10'000 RPM (Score:3, Insightful)

by Nicolas MONNET ( 4727 ) writes: <nicoaltiva@gmai l . c om> on Saturday October 19, 2002 @07:10AM (#4484221) Journal

The guys use 10'000 RPM drives for "reliability" and "performance" ... 10k drives are LESS reliable, since they move faster. Moreover, they're not even necessarily that much faster.

- Re:10'000 RPM (Score:4, Funny)
  
  by autocracy ( 192714 ) writes: <(slashdot2007) (at) (storyinmemo.com)> on Saturday October 19, 2002 @07:15AM (#4484233) Homepage
  
  In comparison to what? Yes, they're faster than the 7,200 you probably have - but they only run at 2/3 the speed of most really high end drives (15,000 RPM). Really it's not too bad a trade-off.
  Also, please note that the laws of physics say that it can read more data if the head is able to keep up - and I'm sure it is.
  
- Re:10'000 RPM (Score:3, Informative)
  
  by khuber ( 5664 ) writes:
  
  10k drives are LESS reliable, since they move faster.
  Okay, well, you can use ancient MFM drives since they move much slower and would be more reliable by your logic.
  Personally, I'd take 10k SCSI drives over 7.2k IDE drives for a server, no question.
  -Kevin
- Re:10'000 RPM (Score:5, Funny)
  
  by Krapangor ( 533950 ) writes: on Saturday October 19, 2002 @07:34AM (#4484278) Homepage
  
  10k drives are LESS reliable, since they move faster
  This implies that you shouldn't store servers in high altitudes, because they move faster up there due to earth rotation.
  Hmmm, I think we know now why these Mars missions tend to fail so often.
  
- Re:10'000 RPM (Score:5, Insightful)
  
  by Syre ( 234917 ) writes: on Saturday October 19, 2002 @03:48PM (#4485898)
  
  It's pretty clear that whoever wrote that article has never run a really high-volume web site.
  
  I've designed and implemented sites that actually handle millions of dynamic pageviews per day, and they look rather different from what these guys are proposing.
  
  A typical configuration includes some or all of:
  
  - Firewalls (at least two redundant)
  - Load balancers (again, at least two redundant)
  - Front-end caches (usually several) -- these cache entire pages or parts of pages (such as images) which are re-used within some period of time (the cache timeout period, which can vary by object)
  - Webservers (again, several) - these generate the dynamic pages using whatever page generation you're using -- JSP, PHP, etc.
  - Back-end caches (two or more)-- these are used to cache the results of database queries so you don't have to hit the database for every request.
  - Read-only database servers (two or more) -- this depends on the application, and would be used in lieu of the back end caches in certain applications. If you're serving lots of dynamic pages which mainly re-use the same content, having multiple, cheap read-only database servers which are updated periodically from a master can give much higher efficiency at lower cost.
  - One clustered back-end database server with RAID storage. Typically this would be a big Sun box running clustering/failover software -- all the database updates (as opposed to reads) go through this box.
  
  And then:
  
  - The entire setup duplicated in several geographic locations.
  
  If you build -one- server and expect it to do everything, it's not going to be high-performance.
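The split between read-only database servers and a single master that takes all updates can be sketched as a tiny query router. This is a minimal sketch under assumptions: the class name `QueryRouter`, the server labels, and the rule "statements starting with SELECT are reads" are all hypothetical simplifications, and a production router would also have to deal with replication lag and transactions.

```python
import itertools


class QueryRouter:
    """Send reads to read-only replicas in rotation and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, sql):
        """Return the server that should run this SQL statement."""
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Updates (and anything unrecognised) go through the master.
        return self.master
```

With `QueryRouter("master", ["ro1", "ro2"])`, successive SELECT statements alternate between `ro1` and `ro2`, while an UPDATE is always sent to `master`.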
  
But any web server is high-performance (Score:5, Insightful)

by Ed Avis ( 5917 ) writes: <ed@membled.com> on Saturday October 19, 2002 @07:11AM (#4484226) Homepage

Computer hardware is so fast relative to the amount of traffic coming to almost any site that any web server is a high-performance web server, if you are just serving static pages. A website made of static pages would surely fit into a gigabyte or so of disk cache, so disk speed is largely irrelevant, and so is processor speed. All the machine needs to do is stuff data down the network pipe as fast as possible, and any box you buy can do that adequately. Maybe if you have really heavy traffic you'd need to use Tux or some other accelerated server optimized for static files.

With dynamically generated web content it's different of course. But there you will normally be fetching from a database to generate the web pages. In which case you should consult articles on speeding up database access.

In other words: an article on 'building a fast database server' or 'building a machine to run disk-intensive search scripts' I can understand. But there is really nothing special about web servers.
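To put "millions of page views every month" in perspective, a back-of-the-envelope conversion to an average request rate helps. The function below is a rough sketch; the 30-day month and the `requests_per_view` multiplier (for images and other embedded objects) are assumptions for illustration, and real traffic peaks well above the average.

```python
def average_requests_per_second(page_views_per_month, requests_per_view=1,
                                days_per_month=30):
    """Convert a monthly page-view count into an average request rate."""
    seconds_per_month = days_per_month * 24 * 60 * 60  # 2,592,000 for 30 days
    return page_views_per_month * requests_per_view / seconds_per_month
```

For example, 5 million page views a month averages out to a little under 2 page views per second, which is why the static-content case rarely stresses the hardware.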

- Re:But any web server is high-performance (Score:5, Insightful)
  
  by khuber ( 5664 ) writes: on Saturday October 19, 2002 @07:37AM (#4484282)
  
  With dynamically generated web content it's different of course. But there you will normally be fetching from a database to generate the web pages. In which case you should consult articles on speeding up database access.
  I'm just a programmer, but don't big sites put caching in front of the database? I always try to cache database results if I can. Honestly, I think relational databases are overused, they become bottlenecks too often.
  -Kevin
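Caching database results in front of the database can be sketched as a small read-through cache with a timeout. This is a minimal illustration, not any particular site's implementation: `TTLCache`, `get_or_load` and the `loader` callback (standing in for the real database query) are hypothetical names, and there is no eviction or locking here.

```python
import time


class TTLCache:
    """Read-through cache: call the loader only when an entry is missing or expired."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no database hit
        value = loader(key)          # miss or expired: hit the database once
        self._store[key] = (now + self.ttl, value)
        return value
```

The timeout bounds how stale a page can get, so it is a trade-off between database load and freshness.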
  
  - Re:But any web server is high-performance (Score:5, Insightful)
    
    by NineNine ( 235196 ) writes: on Saturday October 19, 2002 @08:38AM (#4484378)
    
    Good databases are designed for performance. If databases are your bottleneck, then you don't know what you're doing with the database. Too many people throw up a database, and use it like it's some kind of flat file. There's a lot that can be done with databases that the average hack has no idea about.
    
    - Re:But any web server is high-performance (Score:2, Interesting)
      
      by khuber ( 5664 ) writes:
      
      Our databases are tuned. Some apps would just need to transfer too much data per request for a SQL call to be feasible.
      -Kevin
      - Re:But any web server is high-performance (Score:3, Informative)
        
        by NineNine ( 235196 ) writes:
        
        Our databases are tuned. Some apps would just need to transfer too much data per request for a SQL call to be feasible.
        
        I had this problem for a while... Sloppy coding on my part was querying 65K+ records per page. Server would start to crawl with a few hundred simultaneous users. Since I fixed it, 1000+ simultaneous users is no problem at all.
    - Re:But any web server is high-performance (Score:4, Interesting)
      
      by PhotoGuy ( 189467 ) writes: on Saturday October 19, 2002 @05:38PM (#4486352) Homepage
      
      A key question someone needs to ask themselves when storing data in a relational database, is "is this data really relational"?
      
      In a surprising number of cases, it really isn't. For example, storing user preferences for visiting a given web page; there is never a case where you need to relate the different users to each other. The powerful aggregation abilities of relational databases are irrelevant, so why incur the overhead (performance-wise, cost-wise, etc.)?
      
      Even when aggregating such information is useful, I've often found off-line duplication of the information to databases (which you can then query the hell out of, without affecting the production system) a better way to go.
      
      If a flat file will do the job, use that instead of a database.
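The user-preferences case can be sketched with one small JSON file per user. This is a hypothetical layout for illustration: the function names and the `<user_id>.json` naming are assumptions, and `user_id` is assumed to be a value that is safe to use in a file name (for example an integer ID).

```python
import json
import os
import tempfile
from pathlib import Path


def save_prefs(root, user_id, prefs):
    """Write one user's preferences to <root>/<user_id>.json."""
    path = Path(root) / f"{user_id}.json"
    # Write to a temporary file first, then rename it into place, so a reader
    # never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(prefs, fh)
    os.replace(tmp_name, path)


def load_prefs(root, user_id):
    """Read one user's preferences, or return an empty dict if none are stored."""
    path = Path(root) / f"{user_id}.json"
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}
```

Each lookup touches exactly one file and needs no database connection; the files can still be bulk-loaded into a database offline if aggregate queries are ever wanted.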
      
  - Re:But any web server is high-performance (Score:5, Interesting)
    
    by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @08:52AM (#4484404) Homepage
    
    As you say, databases are usually the bottleneck in a high-volume site. Contrary to what Oracle et al want you to believe, they still don't scale and in many cases it's not feasible to use a database cluster.
    Big sites, really big sites, put caching in the application. The biggest thing to cache is session data, easy if you're running a single box but harder if you need to cluster (and you certainly do need to cluster if you're talking about a high-volume site; nobody makes single machines powerful enough for that). Clustering means session affinity and that means more complicated software. (Aside: Is there any open source software that manages session affinity yet? )
    Frankly speaking, Intel-based hardware would not be my first choice for building a high-volume site (although "millions of page views per month" is really only a moderate volume site; sites I have worked on do millions per /day/). It would probably be my third or fourth choice. The hardware reliability isn't really the problem, it can be good enough, the issue is single box scalability.
    To run a really large site you end up needing hundreds or even thousands of Intel boxes where a handful of midrange Suns would do the trick, or even just a couple of high-end Suns or IBM mainframes. Going the many-small-boxes route your largest cost ends up being maintenance. Your people spend all their time just fixing and upgrading boxes. Upgrading or patching in particular is a pain in the neck because you have to do it over such a broad base. It's what makes Windows very impractical as host for such a system; less so for something like Linux because of tools like rdist, but even so you have to do big, painful upgrades with some regularity.
    What you need to do is find a point where the box count is low enough that it can be managed by a few people and yet the individual boxes are cheap enough that you don't go broke.
    These days the best machines for that kind of application are midrange Suns. It will probably be a couple of years before Intel-based boxes are big and fast enough to realistically take that away ... not because there isn't the hardware to do it (though such hardware is, as yet, unusual) but because the available operating systems don't scale well enough yet.
    
    - Re:But any web server is high-performance (Score:2, Informative)
      
      by khuber ( 5664 ) writes:
      
      Very good info Jim.
      Yeah, my experience is at a relatively large site. We use mostly large and midrange Suns, EMC arrays and so on. There's a lot of interest in the many small server architecture though that is still being investigated.
      -Kevin
      - Re:But any web server is high-performance (Score:5, Informative)
        
        by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:16AM (#4484473) Homepage
        
        I've seen both kinds and take it from me, many small servers is more of a headache than the hardware cost savings is worth. Your network architecture gets complicated, you end up having to hire lots of people just to keep the machines running and with up-to-date software, and database connection pooling becomes a lot less efficient.
        You save money in the long run by buying fewer, more powerful machines.
        
        
        Re:But any web server is high-performance (Score:2, Interesting)
        
        by khuber ( 5664 ) writes:
        
        The interest is primarily hardware cost (the big Suns cost over $1m, and EMC arrays are likewise). Another issue is that when you have a few big machines and you do a deployment or maintenance, it's a struggle for the other boxes to pick up the slack. If you had more small servers, you could upgrade one at a time without impacting capacity as much.
        What do you think about handling capacity? Do you see sites with a lot of spare capacity? We'd have trouble meeting demand if we lost a server during prime hours (and it happens).
        -Kevin
        
        Re:But any web server is high-performance (Score:5, Insightful)
        
        by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:43AM (#4484546) Homepage
        
        Yea, big Suns are too expensive and you do need to keep the server count high enough that a failure or system taken down for maintenance isn't a really big impact on the site. I mentioned in a different posting that my cut on this is that the midrange Suns, 4xxx and 5xxx class, provide good bang-for-the-buck for high-volume sites.
        Beware of false economy when looking at hardware. While it's true that smaller boxes are cheaper, they still require about the same manpower per box to keep them running. You rapidly get to the point where manpower costs dwarf equipment cost. People are expensive!
        Capacity is an issue. We try to plan for enough excess at peak that the loss of a single server won't kill you, and hope you never suffer a multiple loss. Unfortunately most often customers underequip even for ordinary peak loads, to say nothing of what you see when your URL sees a real high load.[1] They just don't like to spend the money. I can see their point, the machines we're talking about are not cheap; it's a matter of deciding what's more important to you, uptime and performance or cost savings. Frankly most customers go with cost savings initially and over time (especially as they learn what their peak loads are and gain experience with the reliability characteristics of their servers) build up their clusters.
        [1] People here talk about the slashdot effect, but trust me when I tell you that that's nothing like the effect you get when your URL appears on TV during "Friends".
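Planning for enough excess at peak that the loss of a single server won't hurt is simple arithmetic. The sketch below is illustrative only; the function names and the example figures (900 requests per second at peak, 250 per server) are made up, and measured per-server capacity should replace them.

```python
import math


def servers_needed(peak_rps, rps_per_server, spares=1):
    """Servers required to carry the peak load with `spares` machines to lose."""
    if rps_per_server <= 0:
        raise ValueError("rps_per_server must be positive")
    return math.ceil(peak_rps / rps_per_server) + spares


def survives_loss(server_count, peak_rps, rps_per_server, lost=1):
    """True if the remaining servers can still carry the peak load."""
    return (server_count - lost) * rps_per_server >= peak_rps
```

With those example figures, four servers carry the peak but cannot lose one; `servers_needed(900, 250)` gives five.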
        
        
        Re:But any web server is high-performance (Score:3, Interesting)
        
        by jimfrost ( 58153 ) writes:
        
        I have more than a few problems with that idea, but amongst them is:
        
        Diskless systems start to collapse the central servers even by forty or fifty clients. By the time you're talking the thousand or more Intel systems necessary for a big site you're looking at having to have a tiered system just to do software deployments, forget about data serving.
        
        Diskless systems don't work well if you have more data than you can realistically afford to store in memory. You start to see practical limits (like hardware limitations) in the low gigabyte range, when most larger websites have static content to deliver in the hundreds of gigabyte range.
        
        Applications are notoriously hungry because they have to do a lot of caching to offload the database since databases generally don't scale well. It's pretty common to see our application servers running with 2+ gig heaps, and we'll run one application server per CPU on a system, and you're probably running three or more 6 or 8 CPU systems just for the application server part. Try to make that diskless and you're now talking about machine configurations with something like 30G of RAM ... very expensive and impractical.
        
        We're talking about a totally different scale, really.
    - Re:But any web server is high-performance (Score:2)
      
      by Matey-O ( 518004 ) writes:
      
      "The hardware reliability isn't really the problem, it can be good enough, the issue is single box scalability."
      
      I dunno, our current major project is running on an ES7000 (8 processors, fully redundant, running Windows Datacenter) It seems pretty beastly to me.
      
      At the point here where X Unix implementation is x% faster than Y Microsoft implementation, the issue is decided by other factors. As long as either is fast enough to handle the load, n-th degree performance doesn't matter.
      
      In our case, the company that won the contract specified the hardware, it was part of a total cost contract (you get one amount of money to make this work, work within those boundaries.)
      
      _Presumably_ that company is happy enough with Windows performance on a 'big iron' box.
    - Re:But any web server is high-performance (Score:3, Informative)
      
      by Aldurn ( 187315 ) writes:
      
      Aside: Is there any open source software that manages session affinity yet?
      
      Yes. Linux Virtual Server [linuxvirtualserver.org] is an incredible project. You put your web servers behind it and (in the case of simple NAT balancing) you set the gateway of those computers to be the address of your LVS server. You then tell LVS to direct all IPs of a certain netmask to one server (i.e. if you set for 255.255.255.0, 192.168.1.5 and 192.168.1.133 will connect to the same server).
      
      The only problem I had with it was that it does not detect downtime. However, I wrote a quick script that used the checkhttp program from Nagios [nagios.org] to pull a site out of the loop when it went down (these were Windows 2000 servers: it happened quite frequently, and our MCSE didn't know why :)
      
      There are higher performance ways to set up clustering using LVS, but since I was lazy, that's what I did.
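The netmask grouping described above (with 255.255.255.0, 192.168.1.5 and 192.168.1.133 land on the same server) can be illustrated with the standard `ipaddress` module. This is not how LVS implements persistence internally; it is only a sketch of the grouping idea, and `pick_server` and the server names are hypothetical.

```python
import ipaddress


def pick_server(client_ip, netmask, servers):
    """Map every client in the same masked network to the same backend."""
    # strict=False lets us pass a host address and have the host bits masked off.
    network = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    # Drop the (all-zero) host bits so neighbouring networks get different keys.
    host_bits = network.max_prefixlen - network.prefixlen
    key = int(network.network_address) >> host_bits
    return servers[key % len(servers)]
```

Calling it with `"192.168.1.5"` and `"192.168.1.133"` under mask `"255.255.255.0"` returns the same backend, which is what keeps a user's session on one machine.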
    - - Re:But any web server is high-performance (Score:5, Interesting)
        
        by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:57AM (#4484575) Homepage
        
        If you're just serving static pages you're right. If you're doing dynamic content then you're wrong.
        But 2.5 million hits a day is still just a moderate volume site to me. One of the sites I worked on sees in excess of a hundred million hits per day these days; it was up over ten million hits per day back in 1998.
        I don't happen to know what Slashdot does for volume, but Slashdot is a very simplistic site when it comes to content production. Each page render doesn't take much horsepower and sheer replication can be used effectively. Things get more complicated when you're doing something like trying to figure out what stuff a user is likely to buy given their past buying history and/or what they're looking at right now.
        If you really think a 4-way Intel box is equivalent to a 12-way Sun, well, it's clear you don't know what you're talking about. You're wrong even if all you're talking about is CPU, and of course I/O bandwidth is what makes or breaks you -- and there's no comparison in that respect.
        
      - Re:But any web server is high-performance (Score:3, Informative)
        
        by Hast ( 24833 ) writes:
        
        How about reading the FAQ before you start giving out "facts"? Slashdot is running on:
        * 5 load balanced Web servers dedicated to pages
        * 3 load balanced Web servers dedicated to images
        * 1 SQL server
        * 1 NFS Server
        Either the "little 4 way intel" you mention has a serious case of schizophrenia or you're just full of it. (Guess which theory I'm going for.)
        
        Besides the poster mentioned that those sites /are/ bigger than Slashdot. E.g. the mention that "Getting your URL posted during Friends" is nothing like getting it posted on Slashdot.
        
        I know I shouldn't feed the trolls, but someone might actually believe this tripe.
    - - Re:But any web server is high-performance (Score:2)
        
        by jimfrost ( 58153 ) writes:
        
        I think we're going to see more and more of this kind of server. The Zseries mainframes running Linux are really interesting because you're not so dependent on scalable SMP capabilities and yet you get the same kind of manageability as if you were working with a big SMP box. Nice.
        I haven't personally done any deployments on such a system, but I like the idea.
  - Re:But any web server is high-performance (Score:5, Insightful)
    
    by Matey-O ( 518004 ) writes: <michaeljohnmiller@mSPAMsSPAMnSPAM.com> on Saturday October 19, 2002 @10:12AM (#4484604) Homepage Journal
    
    I think the big problem here is the tendency to DBify EVERYTHING POSSIBLE.
    
    Like the State field in an online form.
    
    Every single hit requires a tag to the databases. Why?
    
    Because, heck if we ever get another state, it'll be easy to update! Ummm, that's a LOT of cycles used for something that hasn't happened in, what, 50 years or so. (Hawaii, 1959)
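The alternative to a database round trip on every hit is to keep near-constant data such as the state list in the application itself. A minimal sketch, with hypothetical names (`US_STATES`, `render_state_options`) and two-letter postal codes standing in for whatever the form actually displays:

```python
# The 50 US state postal codes, kept as a constant instead of a database table.
US_STATES = (
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
)


def render_state_options(selected=None):
    """Build the <option> tags for a State drop-down without touching the database."""
    lines = []
    for code in US_STATES:
        attr = " selected" if code == selected else ""
        lines.append(f'<option value="{code}"{attr}>{code}</option>')
    return "\n".join(lines)
```

If the list ever does change, it is a one-line code edit and a redeploy rather than a query on every page view.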
    
- Re:But any web server is high-performance (Score:4, Insightful)
  
  by NineNine ( 235196 ) writes: on Saturday October 19, 2002 @08:41AM (#4484383)
  
  You're absolutely right. Wish I had some mod points left...
  
  Hardware only comes into play in a web app when you're doing very heavy database work. Serving flat pages takes virtually no computing effort. It's all bandwidth. Hell, even scripting languages like ASP, CF, and PHP are light enough that just about any machine will work great. The database though... that's another story.
  
gee, i wonder.. (Score:5, Funny)

by Anonymous Coward writes: on Saturday October 19, 2002 @07:13AM (#4484229)

.. if their webservers are as reliable as the ones in the article..
i guess there's only one way to find out..

slashdotters! advance! :P

- server load (Score:5, Funny)
  
  by MegaFur ( 79453 ) writes: <[moc.nzz.ymok] [ta] [0dryw]> on Saturday October 19, 2002 @07:36AM (#4484279) Journal
  
  Many other people will likely post a comment like mine, if they haven't already. But hey, karma was made to burn!
  
  According to my computer clock and the timestamp on the article posting, it's only been about 33 minutes (since the article was posted). Even so, it took me over a minute to finally receive the "Hardware Analysis" main page. The top of that page has:
  
  Please register or login. There are 2 registered and 995 anonymous users currently online. Current bandwidth usage: 214.98 kbit/s
  
  Draw your own conclusions.
  
  - Re:server load (Score:2)
    
    by Queuetue ( 156269 ) writes:
    
    Please register or login. There are 4 registered and 1428 anonymous users currently online. Current bandwidth usage: 1183.73 kbit/s
    
    Took about 3 minutes, next page would not load.
  - Re:server load (Score:5, Interesting)
    
    by fusiongyro ( 55524 ) writes: <faxfreemosquito@@@yahoo...com> on Saturday October 19, 2002 @08:16AM (#4484338) Homepage
    
    Well, they're about slashdotted now. They lost my last request, and it says they have almost 2000 anonymous users. I sometimes think the reason I like reading Slashdot isn't because of the great links and articles, but instead because I like being a part of the goddamned Slashdot effect. :)
    
    Which brings me to the point. Ya know, about the only site that can handle the Slashdot effect is Slashdot. So maybe Taco should write an article like this (or maybe he has?). The Slashdot guys know what they're doing, we should pay attention. Although I find it interesting that when slashdot does "go down," the only way I know is because for some reason it's telling me I have to log in (which is a lot nicer than Squid telling me the server's gone).
    
    --
    Daniel
    
    - Re:server load (Score:3, Interesting)
      
      by stevey ( 64018 ) writes:
      
      I seem to remember that there was an article just after the WTC attacks last year, which discussed how Slashdot had handled the massive surge in traffic after other online sites went down.
      
      From memory it involved switching to static pages, and dropping gifs, etc.
      
      Unfortunately the search engine on Slashdot really sucks - so I couldn't find the piece in question.
    - Re:server load (Score:3, Informative)
      
      by 1110110001 ( 569602 ) writes:
      
      Maybe the article Handling the Loads [slashdot.org], describing how Slashdot kept their Servers up at 9/11, is a bit of the thing you're looking for. b4n
  - how nice of them (Score:2)
    
    by twitter ( 104583 ) writes:
    
    Current bandwidth usage: 214.98 kbit/s
    
    Draw your own conclusions.
    
    How nice of them to share that information.
    
    The obvious conclusion is that my cable modem could take a minor slashdotting if Cox did not crimp the upload and block ports. Information could be free but thanks to the local Bell's efforts to kill DSL things will get worse until someone fixes the last mile problem.
    
    The bit about IDE being faster than SCSI was a shocker. You would think that some lower RPM SCSIs set to stripe would have greater speed and equivalent heating. The good IDE performance is good news.
  - Re:server load (Score:2)
    
    by blibbleblobble ( 526872 ) writes:
    
    There are a few registered and quite a few anonymous users currently online. Current bandwidth usage: 6.80 kbit/s Oct 19 12:02 EDT
    
    Guess they stopped counting. We're supposed to be impressed that their dynamic page with 7 embedded tables and 160 images loads in less than three minutes?
    
    If only they hadn't copied the review format from Tom's Hardware. Take a 1000-word article, add 2000 words of padding, and split between 9 pages including an index.
  - - - Re:server load (Score:5, Funny)
        
        by Anonymous Coward writes: on Saturday October 19, 2002 @08:04AM (#4484327)
        
        Please flush my dns entry, or better yet unplug me. There are 0 registered and millions of the slashdot horde currently refreshing their browser and laughing at my stats. Current bandwidth usage: 100 Mbit/s.
        
That "howto" sucks (Score:5, Interesting)

by Nicolas MONNET ( 4727 ) writes: <nicoaltiva@gmai l . c om> on Saturday October 19, 2002 @07:17AM (#4484238) Journal

There is no useful information in that infomercial. They seem to have judged "reliability" through vendor brochures and in a couple days; reliability is when your uptime is > 1 year.

This article should be called "M. Joe-Average-Overclocker Builds A Web Server".

This quote is funny:

That brings us to the next important component in a web server, the CPU(s). For our new server we were determined to go with an SMP solution, simply because a single CPU would quickly be overloaded when the database is queried by multiple clients simultaneously.

It's well known that single CPU computers can't handle simultaneous queries, eh!

- - Re:That "howto" sucks (Score:5, Insightful)
    
    by khuber ( 5664 ) writes: on Saturday October 19, 2002 @07:46AM (#4484296)
    
    Well, not to mention that high traffic sites usually have a bunch of webservers and then a load balancer in front of them. This article obviously isn't for big league web serving.
    -Kevin
    
    - Re:That "howto" sucks (Score:5, Informative)
      
      by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:02AM (#4484428) Homepage
      
      High traffic sites, the ones that are really dynamic anyway, do more than that.
      They start with a load balancer at the front end, or possibly several layers of load balancer. If they run a distributed operation they'll use smart DNS systems or routers to direct requests to the most local server cluster. The server cluster will be fronted by a request scattering system.
      Behind the request scattering system you'll find a cluster of machines whose job it is to serve static content (often the bulk of data served by a site) and route dynamic requests to another cluster of servers, enforcing session affinity for the dynamic requests.
      Behind the static content servers are the application servers. They do the heavy lifting, building dynamic pages as appropriate for individual users and caching everything they can to offload the database.
      Behind the application servers is the database or database cluster. The latter is really not that useful if you have a highly dynamic site as there are problems with data synchronization in database clusters (no matter what the database vendors tell you). But that's ok, single databases can handle a lot of volume if built correctly and caching is done appropriately at the application level.
      And there you have it, the structure of a really large site.
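The middle of that structure (serve static content directly, route dynamic requests to an application server while enforcing session affinity) can be sketched as a routing function. Everything here is an assumption for illustration: the name `route`, the list of static file suffixes, and the use of a hash of the session ID to pick a server. Real request scatterers use their own mechanisms.

```python
import hashlib

# File suffixes treated as static content in this sketch.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".gif", ".jpg")


def route(path, session_id, app_servers):
    """Return "static" for static files, else the app server owning this session."""
    if path.lower().endswith(STATIC_SUFFIXES):
        return "static"
    # A stable hash (unlike Python's built-in hash(), which varies per process)
    # keeps a session on the same server across requests and across routers.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(app_servers)
    return app_servers[index]
```

One design caveat: with plain modulo hashing, adding or removing an application server moves most sessions to a different machine, so the session cache on each box goes cold.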
      
"Three times the power?" (Score:5, Insightful)

by mumblestheclown ( 569987 ) writes: on Saturday October 19, 2002 @07:18AM (#4484239)

From the article:
If we were to use, for example, Microsoft Windows 2000 Pro, our server would need to be at least three times more powerful to be able to offer the same level of performance.
"three times?" Can somebody point me to some evidence for this sort of rather bald assertion?

- Re:"Three times the power?" (Score:5, Interesting)
  
  by khuber ( 5664 ) writes: on Saturday October 19, 2002 @07:43AM (#4484293)
  
  That was total FUD. The two operating systems have comparable performance on the same hardware.
  -Kevin
  
  - Re:"Three times the power?" (Score:5, Informative)
    
    by (H)elix1 ( 231155 ) writes: <slashdot.helix@nOSPaM.gmail.com> on Saturday October 19, 2002 @10:46AM (#4484696) Homepage Journal
    
    That was total FUD. The two operating systems have comparable performance on the same hardware.
    
    Win2k Pro limits you to 10 concurrent TCP/IP connections, Win2K Server has no (artificial) limit but won't cluster, and Advanced Server can cluster but I don't know a thing about it.
    
    Linux has no (artificial) limit... not sure about clustering options there either.
    
    Found out about the TCP/IP limit when I added SP2 and trashed my evening counter-strike server - this makes a HUGE difference.
    
    - Re:"Three times the power?" (Score:4, Informative)
      
      by Magila ( 138485 ) writes: on Saturday October 19, 2002 @02:17PM (#4485478) Homepage
      
      Win2k pro limits you to 10 concurrent TCP/IP connections.
      
      Whoa! Bullshit meter rising! While Win2K does have a limit on TCP/IP connections, it is in the thousands. A limit of 10 would be totally ridiculous; it would cripple the OS for MANY people. Also, most of the traffic for a CS server is UDP, so the TCP/IP connection limit isn't going to affect that much at all.
      
      - Re:"Three times the power?" (Score:5, Informative)
        
        by elemental23 ( 322479 ) writes: on Saturday October 19, 2002 @04:31PM (#4486065) Homepage Journal
        
        The maximum number of other computers that are permitted to simultaneously connect over the network to Windows NT Workstation 3.5, 3.51, 4.0, and Windows 2000 Professional is ten. This limit includes all transports and resource sharing protocols combined. This limit is the number of simultaneous sessions from other computers the system is permitted to host.
        
        From Microsoft Knowledge Base Article Q122920 [microsoft.com].
        (Warning: The page layout is broken in Mozilla)
        
        It's an artificial limitation. The idea is that if you need more simultaneous connections you should buy Win2k Server. In other words, MS wants you to spend more money.
        
- Re:"Three times the power?" (Score:5, Informative)
  
  by NineNine ( 235196 ) writes: on Saturday October 19, 2002 @08:45AM (#4484387)
  
  "Microsoft Windows 2000 Pro"
  
  I got a good laugh out of this... W2K Pro is the desktop version, not the server version. Wow. Great article. Really well informed author.
  
A little disappointing really (Score:5, Insightful)

by grahamsz ( 150076 ) writes: on Saturday October 19, 2002 @07:18AM (#4484243) Homepage Journal

The article seemed way too focused on hardware.

Anyone who's ever worked on a big server in this cash-strapped world will know that squeezing every last ounce of capacity out of apache and your web applications needs to be done.

- Re:A little disappointing really (Score:2, Insightful)
  
  by januschr ( 118746 ) writes:
  
  The article seemed way too focused on hardware.
  
  Well the name of the website is "Hardware Analysis"... ,-)
- Re:A little disappointing really (Score:2)
  
  by Zeinfeld ( 263942 ) writes:
  
  The article seemed way too focused on hardware
  Yeah, maybe if the site had not been slashdotted...
  It does not appear that the site considers the most effective way to make a Web server fly: replace the hard drives with RAM. Ditch the obsolete SQL engine and use in-memory storage rebuilt from a transaction log.
  Of course the problem with that config is that an outage tends to be a problem so just duplicate the hardware at a remote disaster recovery site.
  Sound expensive? Well yes, but not half as expensive as some of the systems people put together to run SQL databases...
my $0.02 (Score:5, Informative)

by spoonist ( 32012 ) writes: on Saturday October 19, 2002 @07:19AM (#4484245) Journal

* I prefer SCSI over IDE

* RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.

* I've used Dell PowerEdge 2650 [dell.com] rackmount servers and they're VERY well made and easy to use. Redundant power supplies, SCSI removable drives, good physical security (lots of locks).

- Re:my $0.02 (Score:3, Informative)
  
  by Door-opening Fascist ( 534466 ) writes:
  
  RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.
  
  OpenBSD doesn't have support for multiple processors, which are a necessity for database servers and dynamic web servers. I'd say FreeBSD is the way to go.
- Re:my $0.02 (Score:3, Interesting)
  
  by yomahz ( 35486 ) writes:
  
  RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.
  
  Huh?
  
  for i in `rpm -qa|grep ^mod_`;do rpm -e $i;done
  
  rpm -e apache
  cd ~/src/apache.xxx
  ./configure --prefix=/usr/local/apache \
  --enable-rule=SHARED_CORE \
  --enable-module=so
  make
  make install
  
  with mod_so (DSO - Dynamic Shared Object) support, module installation is trivial.
- - Re:my $0.02 (Score:3, Funny)
    
    by khuber ( 5664 ) writes:
    
    Back alley colocation. It's the only way to afford it these days.
    -Kevin
  - Re:my $0.02 (Score:3, Informative)
    
    by SuiteSisterMary ( 123932 ) writes:
    
    If your server isn't designed with 'security' in mind, including the ability to padlock the chassis, and at least send an SNMP trap when the chassis is opened, then you need to learn that as far as 'computer and data security' is concerned, protecting from external network attacks is actually quite low on the totem pole.
    
    Or, "If Joe Random Idiot can walk in and rip out the hard drive, who cares how 3117 your firewall and other network protections are."
Strange choice of processors (Score:5, Insightful)

by Ed Avis ( 5917 ) writes: <ed@membled.com> on Saturday October 19, 2002 @07:24AM (#4484254) Homepage

I know that in the server market you often go for tried-and-tested rather than latest-and-greatest, and that the Pentium III still sees some use in new servers. But 1.26GHz with PC133 SDRAM? Surely they'd have got better performance from a single 2.8GHz Northwood with Rambus or DDR memory, and it would have required less cooling and fewer moving parts. Even a single Athlon 2200+ might compare favourably in many applications.

SMP isn't a good thing in itself, as the article seemed to imply: it's what you use when there isn't a single processor available that's fast enough. One processor at full speed is almost always better than two at half the speed.
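
A rough way to check that last claim is a simple Amdahl-style model. This is only a sketch under stated assumptions: a fraction of the work splits perfectly across processors, the rest is serial, and cache, memory bandwidth and I/O waits are ignored. The function name and the sample fractions are illustrative.

```python
def run_time(work, cpu_speed, cpus, parallel_fraction):
    """Time to finish `work` units under a simple Amdahl-style model."""
    serial = work * (1.0 - parallel_fraction) / cpu_speed
    parallel = work * parallel_fraction / (cpu_speed * cpus)
    return serial + parallel


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        one_fast = run_time(1.0, cpu_speed=1.0, cpus=1, parallel_fraction=p)
        two_slow = run_time(1.0, cpu_speed=0.5, cpus=2, parallel_fraction=p)
        print(f"parallel={p:.1f}  one fast CPU: {one_fast:.2f}  "
              f"two half-speed CPUs: {two_slow:.2f}")
```

In this model the two half-speed processors take 2 - p time units against 1 for the single fast one, so they only draw level when the workload is perfectly parallel (p = 1).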

- Almost (Score:2, Insightful)
  
  by Anonymous Coward writes:
  
  > One processor at full speed is almost always better than two at half the speed.
  
  You can safely drop that 'almost'.
How to make a fool of yourself (Score:5, Funny)

by noxavior ( 581294 ) writes: on Saturday October 19, 2002 @07:25AM (#4484257) Homepage Journal

Step one: Submit story on high performance web servers.
Step two: ???
Step three: Die of massive slashdotting, loss of reputation and business

Still, if someone has a link to a cache...

And as the last step... (Score:5, Funny)

by MavEtJu ( 241979 ) writes: <slashdot&mavetju,org> on Saturday October 19, 2002 @07:27AM (#4484260) Homepage

... Don't forget to post an article on /. so you can actually measure high-volume bulk traffic.

[~] edwin@topaz>time telnet www.hardwareanalysis.com 80
Trying 217.115.198.3...
Connected to powered.by.nxs.nl.
Escape character is '^]'.
GET /content/article/1549/ HTTP/1.0
Host: www.hardwareanalysis.com

[...]
Connection closed by foreign host.

real 1m21.354s
user 0m0.000s
sys 0m0.050s

Do as we say, don't do as we do.

High powered webserver? (Score:5, Funny)

by Moonshadow ( 84117 ) writes: on Saturday October 19, 2002 @07:27AM (#4484263)

In an hour or so, I'm predicting it will be a high-powered heap of smoking rubble. It's almost like this is a challenge to us.
Maybe it's their idea of a stress test. It's kinda like testing a car's crash durability by parking it in front of an advancing tank.

Definition of irony (Score:3, Funny)

by nervlord1 ( 529523 ) writes: on Saturday October 19, 2002 @07:30AM (#4484267) Homepage

An article about creating high performance webservers being slashdotted

- - Re:Definition of irony (Score:2)
    
    by Electrum ( 94638 ) writes:
    
    Well, by using the same brilliant skills of analysis you do, this article is running on Apache, and the webserver is dead. That must mean that Apache is the Taco Bell of the webserver world, right?
    
    That would be about right. It's cheap, lots of people use it, but it's certainly not the best.
Alternative HowTo (Score:4, Informative)

by h0tblack ( 575548 ) writes: on Saturday October 19, 2002 @07:42AM (#4484289)

1. go to here [apple.com]
2. click buy
3. upon delivery open box and plug in
4. turn on Apache with the click of a button [apple.com]
5. happily serve up lots of content :)

6. (optional) wait for attacks from ppl for suggesting Apple hardware...

- Re:Alternative HowTo (Score:2)
  
  by GoRK ( 10018 ) writes:
  
  You forgot at least one step. Pick one to add but not both:
  
  4.5. Just because we're using a mac webserver, doesn't mean we're free from the responsibility of properly tuning our configuration. Anyone can buy a box of any type that's preconfigured to run apache when you first plug it in. Anyway, we tune the heck out of our Apache so that it will stand up to the load we're expecting.
  
  or
  
  7. Wonder what is going wrong when we realize we have no grasp of how our computer or applications actually work.
  
  ~GoRK
- - Re:Alternative HowTo (Score:2)
    
    by h0tblack ( 575548 ) writes:
    
    Definitely sounds like an interesting evaluation exercise.
    I'm of the opinion that it was a great move by Apple to move into this lower end server market. There are a lot of organisations that need some sort of server system for their network, but don't have the resources or the expertise to use some of the more traditional *nix based systems. That isn't to say that these are solely aimed at the "Idiots Guide to running a Server" market. There may be some nice user-friendly management and monitoring tools, but there's a lot under the hood to play with too. In the future there are also some interesting possibilities with clustering and the upcoming PPC970's from IBM. After all, this is really the first 'proper' server offering from Apple; future generations of the Xserve are definitely something to keep an eye on IMHO.
Why Apache? (Score:5, Informative)

by chrysalis ( 50680 ) writes: on Saturday October 19, 2002 @07:55AM (#4484309) Homepage

I don't understand.

Their article is about building a high performance web server, and they tell people to use Apache.

Apache is featureful, but it has never been designed to be fast.

Zeus [zeus.com] is designed for high performance.

The article supposes that money is not a problem. So go for Zeus. The Apache recommendation is totally out of context.

- Re:Why Apache? (Score:2, Redundant)
  
  by khuber ( 5664 ) writes:
  
  Any web server can be good enough as long as you spread the load over enough boxes. Apache is much more flexible than Zeus.
  -Kevin
  - Re:Why Apache? (Score:2)
    
    by jimfrost ( 58153 ) writes:
    
    Apache is more flexible, but in traditional versions (1.x) you have a problem in that a new program instance is used for each request. That makes things like maintaining persistent connections to the application servers really hard.
    Using something like iPlanet each server instance opens a number of connections to each application server in your cluster; you get a nice connection pool that way. With the Apache design (again this is 1.x) you can't use a pool so TCP setup/teardown costs between the web server and the application servers start to be an issue.
    Not that people don't do it, but it's a lot less efficient.
    I can't speak for Zeus, and as I understand it the most recent version of Apache allows threaded deployments that can take advantage of connection pooling, but most high volume sites use IIS or iPlanet as their front end web server.
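
The connection-pool idea can be sketched like this, assuming a threaded server whose workers borrow from a fixed set of already-open connections instead of opening one per request. `ConnectionPool` and the `connect` factory are hypothetical names for illustration, not the API of iPlanet, Apache, or any other server.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed number of open connections and hand them out for reuse."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that opens one connection.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def borrow(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


if __name__ == "__main__":
    opened = []

    def fake_connect():
        # Stand-in for a TCP connect to an application server.
        opened.append(object())
        return opened[-1]

    pool = ConnectionPool(fake_connect, size=2)
    for _ in range(100):
        with pool.borrow() as conn:
            pass  # a real worker would send the request over `conn` here
    print("connections opened for 100 requests:", len(opened))
```

Because every request reuses one of the two pooled connections, the setup and teardown cost is paid twice in total rather than once per request.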
    - - Re:Why Apache? (Score:2)
        
        by jimfrost ( 58153 ) writes:
        
        You're talking about one particular application I imagine. MQ Series is actually pretty rare in large scale deployments, DB2 is like my third choice in databases, and I'd prefer not to use HTTP servers as the actual application server.
        YMMV.
  - Re:Why Apache? (Score:2)
    
    by Electrum ( 94638 ) writes:
    
    Any web server can be good enough as long as you spread the load over enough boxes. Apache is much more flexible than Zeus.
    
    Sure, but if you need 2+ Apache boxes to handle the load of one Zeus box, wouldn't it make more sense to buy Zeus in the first place?
    
    I would like you to qualify your statement about Apache being more flexible. Zeus is a lot easier to configure than Apache. In what aspects is Apache more flexible?
    
    When it comes to mass virtual hosting, Zeus beats the pants off Apache. Zeus' configuration is fully scriptable out of the box. Apache's is not. Zeus can do wildcard subservers. Apache cannot. Zeus does not require restarting to make configuration changes or add sites. Apache does. Sites can only be added in Apache if using the very limited mass vhost module.
Server running at near 100% load (Score:5, Informative)

by ssassen ( 613109 ) writes: on Saturday October 19, 2002 @08:09AM (#4484332)

I'm writing this from the SecureCRT console, connected through SSH1, as the backend is giving me timeouts. I can tell you that we're near 100% server load and are still serving out those pages to at least 1500 clients. I'm sure some of you get timeouts or can't even reach the server at all; for that I apologize, but we just have one of these, not a whole rack full of them.
Have a good weekend,
Sander Sassen
Email: ssassen@hardwareanalysis.com
Visit us at: http://www.hardwareanalysis.com

- - Re:Server running at near 100% load (Score:2)
    
    by happystink ( 204158 ) writes:
    
    Yeah, but 1500 clients WHAT? this minute, this second?
  - - Re:Server running at near 100% load (Score:3, Insightful)
      
      by Anonymous Coward writes:
      
      I'm sorry, but if your server cannot handle 2000 connections then NineNine is right, you have a crappy backend. How is the fact that you have Flash animation relevant? Isn't a 200k flash animation the same as a 200k jpeg from the server's point of view? If your server cannot handle 2000 connections, what business do you have writing an article about "high performance" webservers? It would be a different story if you entitled it "high performance webserver for less than $1000," but you didn't.
      
      Personally I think the new trend on Slashdot of "hey, I saw this article about ____, it's really insightful and just great!" being submitted by the author of that article is sort of shitty. If anybody knows about building a high traffic webserver, it would be Slashdot, so you'd think they'd be a little pickier about what they post regarding high performance servers.
how to build a high performance/reliable webserver (Score:4, Informative)

by jacquesm ( 154384 ) writes: <j@NoSpam.ww.com> on Saturday October 19, 2002 @08:42AM (#4484384) Homepage

1) use multiple machines / round robin DNS
2) use decent speed hardware but stay away from 'top of the line' stuff (fastest processor, fastest drives) because they usually are not more reliable
3) replicate your databases to all machines so db access is always LOCAL
4) use a front end cache to make sure you use as little database interaction as you can get away with (say flush the cache once per minute)
5) use decent switching hardware and routers, no point in having a beast of a server hooked up to a hub now is there...

that's it ! reasonable price and lots of performance
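
Point 4 above can be sketched as a small time-limited cache. This is a minimal illustration under assumptions: `TTLCache` and its `load` callback are hypothetical names, and it expires each entry 60 seconds after it was loaded rather than flushing the whole cache on a timer, which is a slight variation on what the list describes.

```python
import time


class TTLCache:
    """Cache query results and drop each one after `ttl` seconds."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expiry_time, value)

    def get(self, key, load):
        """Return the cached value for `key`, calling `load()` on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = load()  # the only place the database is touched
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl=60.0)
    # Stand-in for a real database query.
    front_page = cache.get("front_page", lambda: "rows from the database")
    print(front_page)
```

With a one-minute lifetime, a page requested thousands of times a minute costs the database one query per minute instead of one per request.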

- Re:how to build a high performance/reliable webser (Score:2)
  
  by SuperCal ( 549671 ) writes:
  
  Thanks, I wish I hadn't posted early in this article so I could use my mod points. Now, my only question is how fast is decent speed? I'm about to build my own server (actually I'm going to have some help, but I want to at least sound like I know what I'm doing), nothing fancy. I don't expect a huge hit count or anything, so would using older (500-750 MHz) second-hand computers, with properly upgraded memory and storage, work? Also, would you recommend replacing the power supply? One of the guys who's helping me swears that will save me money in the long run on energy costs, but I don't know if it's worth the cost.
  - Re:how to build a high performance/reliable webser (Score:2, Informative)
    
    by jcrowe ( 207448 ) writes:
    
    The company I work for successfully runs our webserver(php & MySQL) on an old pentium 166. We have several thousand visitors every month & use it for an ftp site for suppliers, a router, firewall, gateway & squid server.
    
    I think that your 700mhz machine would work fine for just web pages. :)
  - Re:how to build a high performance/reliable webser (Score:2, Interesting)
    
    by jacquesm ( 154384 ) writes:
    
    we serve up between 5 and 7 million pageviews daily to up to 100,000 individual IP's
    
    Decent speed to me is one in which the server is no longer the bottleneck; in other words, even when serving up dynamic content you should be able to saturate the pipe that you are connected to.
    
    I have never replaced the power supply because of energy costs; it simply isn't a factor in the overall scheme of things (salaries, bandwidth, amortization of equipment)
    
    500-700 Mhz machines are fine for most medium volume sites, I would only consider a really fast machine to break a bottleneck, and I'd have a second one on standby in case it burns up
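
To put the figures in this post into per-second terms, here is a back-of-the-envelope sketch. The daily pageview numbers come from the post; the 100 Mbit/s link and the 50 KB page size are assumptions made up for illustration, and traffic is treated as evenly spread over the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(pageviews_per_day):
    """Average pageviews per second if traffic were spread evenly over a day."""
    return pageviews_per_day / SECONDS_PER_DAY


def pipe_limit_per_second(pipe_mbit_per_s, page_kbytes):
    """Pages per second at which the link, not the server, becomes the limit."""
    return (pipe_mbit_per_s * 1_000_000) / (page_kbytes * 1000 * 8)


if __name__ == "__main__":
    for daily in (5_000_000, 7_000_000):  # figures from the post
        print(f"{daily:,} per day is about {average_per_second(daily):.0f} per second")
    # Assumed for illustration: a 100 Mbit/s link and 50 KB per page.
    print(f"link limit: {pipe_limit_per_second(100, 50):.0f} pages per second")
```

By the definition given above, a server has "decent speed" once it can generate pages at least as fast as that link limit.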
- Re:how to build a high performance/reliable webser (Score:3, Interesting)
  
  by Electrum ( 94638 ) writes:
  
  3) replicate your databases to all machines so db access is always LOCAL
  
  This is probably a bad idea. Accessing the database over a socket is going to be much less resource intensive than accessing it locally. With the database locally, the database server uses up CPU time and disk I/O time. Disk I/O on a web server is very important. If the entire database isn't cached in memory, then it is going to be hitting the disk. The memory used up caching the database cannot be used by the OS to cache web content. A separate database server with a lot of RAM will almost always work better than a local one with less RAM.
  
  This Apache nonsense of cramming everything into the webserver is very bad engineering practice. A web server should serve web content. A web application should generate web content. A database server should serve data. These are all separate processes that should not be combined.
  - Re:how to build a high performance/reliable webser (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    Not so...
    You can cache with technologies like Sleepycat's DBM (db3).
    
    We have a PHP application that caches lookup tables on each local server. If it can't find the data in the local cache, then it hits our PostgreSQL database. The local DBM cache gets refreshed every hour.
    
    Typical comparison
    -------------------
    DB access time for query: .02 secs
    Local cache (db3) time: .00003 secs
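
The effect of those two timings on an average lookup can be sketched as follows. The two constants are the figures quoted above; the hit ratios are assumptions for illustration, and the model assumes a miss pays for the cache check and then the database query.

```python
DB_SECONDS = 0.02        # from the post: database query time
CACHE_SECONDS = 0.00003  # from the post: local db3 lookup time


def expected_lookup_time(hit_ratio):
    """Average lookup time when every request checks the local cache first."""
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * CACHE_SECONDS + miss_ratio * (CACHE_SECONDS + DB_SECONDS)


if __name__ == "__main__":
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):  # assumed hit ratios
        print(f"hit ratio {hit_ratio:.2f}: "
              f"{expected_lookup_time(hit_ratio) * 1000:.3f} ms per lookup")
```

Since a cache hit is several hundred times cheaper than a query, the average is dominated by the miss ratio, so even a modest hit ratio removes most of the database load.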
    
    Web server load dropped from a typical 0.7 to an acceptable 0.2, and the load on the DB server dropped like a rock! This is with over a million requests (no graphics, just GETs to the PHP script) every day.
    
    We also tuned the heck out of Apache (Keepalive, # of children, life of children etc).
    
    Some other things we realized after extensive testing:
    1. Apache 2.0 sucks big time! Until modules like PHP and mod_perl are properly optimized, there's not much point in moving there.
    2. AolServer is great for Tcl, but not for PHP or other plugin technologies
    
    Because of all these changes, we were able to switch from a backhand cluster of 4 machines, back down to a single dual-processor machine, with another machine available on hot standby. Beat that!
OK so where do I start? (Score:2)

by SuperCal ( 549671 ) writes:

I was really excited to see this article, because oddly enough I am seriously considering setting up my own webserver. In fact, I am thinking of running slashcode. So far everyone has been saying that the article generally sucks. So the question remains: where should I start? I was thinking of buying a few of my company's used PCs and building a cluster... that scares me a bit, as I'm not a computer genius, but I can get a great deal on these computers (between 5 and 10 500mhz wintel computers)

OK, I know that was rambling, so to recap simply: is it better to go with an expensive single MP solution like the article, or with a cheaper cluster of slow/cheap computers?
- Re:OK so where do I start? (Score:3, Informative)
  
  by ssassen ( 613109 ) writes:
  
  People are negative because the server has been unreachable for some, but they tend to conveniently forget that we did not design for 2000+ simultaneous clients, just a couple of hundred really. Just thought I'd let you know, as we only have one of these whereas most websites (like Anand and Tom) have a rack full of them. Still we're handling the load pretty well and are serving out the pages to about 1500 clients.
  Have a good weekend,
  Sander Sassen
  Email: ssassen@hardwareanalysis.com
  Visit us at: http://www.hardwareanalysis.com
- Re:OK so where do I start? (Score:2, Insightful)
  
  by drouse ( 34156 ) writes:
  
  I wouldn't worry too much.
  
  Probably 90% of all non-profit websites could be run off a single 500 MHz computer and most could be run from a sub 100 MHz CPU -- especially if you didn't go crazy with dynamic content.
  
  A big bottleneck can be your connection to the Internet. The company I work for once was "slashdotted" (not by slashdot) for *days*. What happened was our Frame Relay connection ran at 100%, while our web server -- a 300 MHz machine (running Mac OS 8.1 at the time) had plenty of capacity left over.
- Re:OK so where do I start? (Score:2)
  
  by cymen ( 8178 ) writes:
  
  Well where do you plan on putting all these boxes? Are you going to serve your pages over a DSL connection? Or colocate? If you are planning on colocating, you'll be investigating smaller sized servers, like 1U or 2U size, unless you have money to blow. To be honest, you should just set up one server and get some page hits. Then think about how you'll survive the hordes of people that may come in the future. Unless you're serving porn. I would imagine the loads are always fairly high on porn servers. Someone here can surely offer suggestions if porn is involved.
Apache 1.3x? (Score:2)

by djupedal ( 584558 ) writes:

What kind of 'high performance' web server uses back-leveled software? Apache 2.x may not be totally API compliant, but it certainly provides more than 1.3x in terms of performance.

I am glad they used an IDE RAID, however. The SCSI myth can now go on the shelf.
- Re:Apache 1.3x? (Score:2, Informative)
  
  by Pizza ( 87623 ) writes:
  
  Actually, their disk tests are fundamentally flawed. RAID0 is only good for boosting raw sustained throughput; it has pretty much no effect on access time. If you want a boost in access time, go for RAID1, as you can load-balance reads across two drives.
  
  Furthermore, RAID0+1 is also not really worth it, as it still only gives you the ability to fail one drive, and instead of two logical spindles you only have one to do all of the work. But I suppose if your software is inflexible enough to only be able to operate on one partition, so be it.
  
  I'd like to see some numbers for their boxes loaded up with RAM and high numbers of random I/O operations, which is where the high rotational speed of modern SCSI drives really shines. And this is the access pattern of a dynamic database-driven web site.
  
  And as others have said, it's not the hardware that makes the most difference in these circumstances, it's how the software is set up, and how the site/database is coded.
  
  Hell, I've completely saturated a 100mbps network serving dynamic content via pure Java Servlets, and this was only a dual P3-650 with a RAID5 array of 50G 7200RPM SCSI drives, hardly cutting edge even at the time. Dropping in a RAID1 array of WD120 IDE drives couldn't come anywhere close. But once the working set of data was loaded into RAM, they both performed about the same.
  
  Their IDE RAID setup is certainly considerably cheaper though, and that's a tradeoff that most people can easily make.
- Re:Apache 1.3x? (Score:4, Insightful)
  
  by GoRK ( 10018 ) writes: on Saturday October 19, 2002 @10:08AM (#4484598) Homepage Journal
  
  Their IDE-RAID is actually software RAID. The SCSI myth can go off the shelf, sure, but don't take the RAID myth down.
  
  The Promise FastTrak and Highpoint and a few others are not actually hardware RAID controllers. They are regular controllers with enough firmware to allow BIOS calls to do drive access via software RAID (located in the firmware of the controller), and OS drivers that implement the company's own software RAID implementation at the driver level, thereby doing things like making only one device appear to the OS. Some of the chips have some performance improvements over a purely software RAID solution, such as the ability to do data comparisons between two drives in a mirror during reads, but that's about it. If you ever boot them into a new install of Windows without preloading their "drivers", guess what? Your "RAID" of 4 drives is just 4 drives. The hardware recovery options they have are also pretty damned worthless when it comes to a comparison with real RAID controllers - be they IDE or SCSI.
  
  A good solution to the IDE RAID debacle are the controllers by 3Ware (very fine) or the Adaptec AAA series controllers (also pretty fine). These are real hardware controllers with onboard cache, hardware XOR acceleration for RAID 5 and the whole bit.
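
For anyone wondering what the XOR work in RAID 5 actually is, here is a minimal sketch of the parity arithmetic that such controllers accelerate in hardware. The function names and the four-byte blocks are illustrative only, and a real array rotates parity across the drives rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # data blocks on three drives
    parity = xor_blocks(stripe)           # parity block for this stripe
    # Pretend the second drive died and rebuild its block.
    recovered = rebuild_missing([stripe[0], stripe[2]], parity)
    print(recovered == stripe[1])
```

Every write has to update the parity block as well, which is the per-byte work that either a dedicated XOR engine or the host CPU ends up doing.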
  
  Anyway, I'm not really all that taken aback that this webserver is floundering a bit, but seems really responsive when the page request "gets through," so to speak. If it's not running low on physical RAM, it's probably got a lot of processes stuck in D state due to the shit Promise controller. A nice RAID controller would probably have everything the disks are thrashing on in a RAM cache at this point.
  
  ~GoRK
  
More Advice from the site (Score:4, Funny)

by HappyPhunBall ( 587625 ) writes: on Saturday October 19, 2002 @08:59AM (#4484421) Homepage
Once you have the hardware setup and the software configured, it is time to design your site to perform. The following tips will help you create a site that is just as scalable as ours. Enjoy.
1. Use lots, and I mean lots of graphics. Cute ones, animated ones, you name it and people expect to see them. Skimping here will hurt your image.
2. CSS style sheets may be the way of the future, but just for now make sure you include dozens or even hundreds of font tags, color tags, and tables in your site. Trust us. This has the added benefit of increasing your page file size by at least 30%. You do want a robust site right?
3. Make sure you are serving plenty of third party ads! Their bandwidth matters also, and you know the way to make money on the web is by serving lots of "fun" animated ads. This will not slow down the user experience of your site one bit! Those ad people are slick, they know that you are building a high bandwidth / high performance site and will be expecting the traffic.
4. A site is not a high performance site until it has withstood the infamous Slashdot effect. You will want to post a link to your site on /. post haste to begin testing.
That should be enough to get you started. Now you too can build a rocking 200K per page site, and having read our hardware guidelines, you can expect it to perform just as well as ours did. One more free tip: Placing a cool dynamic hit counter or traffic meter on your site in a prominent position will encourage casual visitors to hit the reload button again and again, driving the performance of your site through the roof.
How not to get slashdotted? (Score:2, Funny)

by Bahamuto ( 227466 ) writes:

Does building this high performance web server prevent you from being slashdotted?
This is wrong on soooo many levels. (Score:5, Interesting)

by (H)elix1 ( 231155 ) writes: <slashdot.helix@nOSPaM.gmail.com> on Saturday October 19, 2002 @09:59AM (#4484584) Homepage Journal

(include standard joke about high performance web serving getting /.)

I'd post sooner, but it took forever to get to the article.. here are my thoughts...

First off SCSI.

IDE drives are fast in a single user/workstation environment. As a file server for thousands of people sharing an array of drives? I'm sure the output was solid for a single user when they benched it... looks like /. is letting them know what multiple users do to IDE. 'Overhead of SCSI controller'... Methinks they do not know how SCSI works. The folks who share this box will suffer.

Heat issues with SCSI. This is why you put the hardware in a nice climate controlled room that is sound proof. Yes, this stuff runs a bit hot. I swear some vendors are dumping 8K RPM fans with ducting engineered to get heat out of the box and into the air conditioned 8'x19" chassis that holds the other 5-30 machines as well.

I liked the note about reliability too... it ran, it ran cool, it ran stable for 2 weeks. I've got 7x9G Cheetahs that were placed into a production video editing system and ran HARD for the last 5+ years. Mind you, they ran about $1,200 each new... but the downtime costs are measured in minutes... Mission critical, failure is not an option.

OS

Let's assume the Windows 2000 Pro was service packed to at least SP2... If that is the case, the TCP/IP stack is neutered. Microsoft wanted to push people to Server and Advanced Server... I noticed the problem when I patched my counter strike server and performance dogged on w2kpro w/sp2 - you can find more info in Microsoft's KB... (The box was used for other things too, so be gentle) Nuking the TCP/IP stack was the straw that cracked my back to just port another box to Linux and run it there.

Red Hat does make it easy to get a Linux box up and running, but if this thing is going outside the firewall, 7.3 was a lot of work to strip out all the stuff that is bundled with a "server" install. I don't like running any program I did not actually install myself. For personal boxes living at my ISP, I use slackerware (might be moving to gentoo however). Not to say I'm digging through the code or checking MD5 hashes as often as I could, but the box won't even need an xserver, mozilla, tux racer, or anything other than what it needs to deliver content and get new stuff up to the server.

CPU's (really a chassis problem):

I've owned AMD's MP and Intel's Xeon dually boards. These things do crank out some heat. Since web serving is usually not processor bound, it does not really matter. Pointing back to the over heating issues with the hard drives, these guys must have a $75 rack mount 19" chassis. Who needs a floppy or CD-ROM in a web server? Where are the fans? Look at the cable mess! For god's sake, at least spend $20 and get rounded cables so you have better airflow.

- Re:This is wrong on soooo many levels. (Score:3, Interesting)
  
  by seanadams.com ( 463190 ) writes:
  
  IDE drives are fast in a single user/workstation environment. As a file server for thousands of people sharing an array of drives? I'm sure the output was solid for a single user when they benched it... looks like /. is letting them know what multiple users do to IDE. 'Overhead of SCSI controller'... Methinks they do not know how SCSI works. The folks who share this box will suffer.
  
  Methinks it's been a LONG time since you've read up on IDE vs SCSI, and me also thinks you don't have the first clue about how a filesystem works. Yes, there was a time when IDE drives were way slower, mainly because the bus could only have one outstanding request at a time. IDE has since advanced to support tagged command queuing and faster data rates, closing the gap with all but the most horrendously expensive flavors of SCSI. Really, the bottleneck is spindle and seek speed - both IDE and SCSI are plenty fast now.
  
  The only thing SCSI really has going for it is daisy-chainability and support for lots of drives on one port. HOWEVER there are some really killer things you can do with IDE now. In my web server I'm using the promise RM8000 subsystem: a terabyte of RAID5 storage for about $3500 including the drives IIRC. Try doing that with SCSI drives!
  
  Anyway.... you suggest that this server is slashdotted because it's disk-bound. Serving the exact same page over and over again. Uh huh. Go read up on any modern file system, then figure out how long it takes to send a 100KB web page to 250,000 people over a DSL line, and then tell me where you think the problem lies.
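For anyone who wants to actually run that calculation, here is a rough back-of-the-envelope sketch. The 100 KB page and 250,000 readers come from the comment above; the uplink speeds are pure assumptions, since the comment only says "a DSL line", and `transfer_hours` is just an illustrative helper name.

```python
def transfer_hours(page_bytes: int, visitors: int, uplink_bits_per_sec: float) -> float:
    """Hours needed to push one page to every visitor over a single uplink."""
    total_bits = page_bytes * visitors * 8
    return total_bits / uplink_bits_per_sec / 3600


PAGE_BYTES = 100 * 1000   # 100 KB page (from the comment)
VISITORS = 250_000        # readers (from the comment)

# Assumed link speeds; neither figure appears in the thread.
LINKS = [
    ("1 Mbit/s DSL uplink (assumed)", 1e6),
    ("100 Mbit/s colo port (assumed)", 100e6),
]

for label, bits_per_sec in LINKS:
    hours = transfer_hours(PAGE_BYTES, VISITORS, bits_per_sec)
    print(f"{label}: {hours:.1f} hours")
```

Under those assumed speeds the pipe, not the disk, sets the delivery time, which is the point being made above.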
Not to flame, but the article is bad for newbies (Score:2, Insightful)

by Anonymous Coward writes:

I'll just mention a couple of items:

1) For a high performance web server one *needs* SCSI. SCSI can handle multiple requests at one time and performs some disk-related processing itself, whereas IDE can only handle requests for data in single file and uses the CPU for disk-related processing a lot more than SCSI does.

SCSI disks also have higher mean times to failure than IDE. The folks writing this article may have gotten benchmark results showing their RAID 0+1 array matched the SCSI setup *they* used for comparison, but most of the reasons for choosing SCSI are what I mention above -- not the comparative benchmark results.

2) For a high performance webserver, FreeBSD would be a *much* better choice than Redhat Linux. If they wanted to use Linux, Slackware or Debian would have been a better choice than Redhat Linux for a webserver. Ask folks in the trenches, and lots will concur with what I've written on this point due to maintenance, upgrading, and security concerns over time on a production webserver.

3) Since their audience is US based, it would make sense to co-lo their server in the USA, both from the standpoint of how many hops packets take from their server to their audience, and from the logistical issues of hardware support -- from replacing drives to calling the data center if there are problems. Choosing a USA data center over one in Amsterdam *should* be a no brainer. Guess that's what happens when anybody can publish to the web. Newbies beware!!
Slashdotted (Score:3, Funny)

by entrylevel ( 559061 ) writes: <jaundoh@yahoo.com> on Saturday October 19, 2002 @11:27AM (#4484794)

Ooh! Ooh! I really want you guys to teach me how to build a high performance webserver! What's that? You can't, because your webserver is down? Curses!

(Obligatory disclaimer for humor-impaired: yes I understand that the slashdot effect is generally caused by lack of bandwidth rather than lack of webserver performance.)

"millions of page views every month" not High-Perf (Score:3, Insightful)

by Anonymous Coward writes: on Saturday October 19, 2002 @12:37PM (#4485069)

Too bad "millions of page views every month" is simply not even in the realm that would require "High-Performance Web Server"(s). These guys need to come back and write an article once they've served up 5+ million page views per day. Not hits. Page views.
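To put numbers on that, a quick sketch converting page-view totals into an average request rate. The 5 million per day figure is from the comment; "millions per month" is taken as 3 million over a 30-day month, which is an assumption, and `views_per_second` is an illustrative helper. These are flat averages; real traffic peaks well above them.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def views_per_second(page_views: float, days: float) -> float:
    """Average page views per second over the given number of days."""
    return page_views / (days * SECONDS_PER_DAY)


# Assumed: "millions per month" read as 3 million over a 30-day month.
monthly_rate = views_per_second(3_000_000, 30)
# From the comment: 5 million page views in one day.
daily_rate = views_per_second(5_000_000, 1)

print(f"3M views/month is about {monthly_rate:.2f} views/s on average")
print(f"5M views/day is about {daily_rate:.2f} views/s on average")
```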

Western Digital Drives?? (Score:3, Interesting)

by zentec ( 204030 ) writes: <zentec AT gmail DOT com> on Saturday October 19, 2002 @04:27PM (#4486056)

The mere fact that they recommended 7200 rpm Western Digital drives for their high performance system gives me the impression they haven't a clue.

I disagree with the assertion that a 10,000 rpm SCSI drive is more prone to failure than a 7,200 rpm IDE drive because it "moves faster". I've had far more failures with cheap IDE drives than with SCSI drives. Not to mention that IDE drives work great with minor loads, but when you start really cranking on them, the bottlenecks of IDE start to haunt the installation.

- - Re:Troll? Informative is more like it. (Score:2, Interesting)
    
    by OpCode42 ( 253084 ) writes:
    
    Every time that you click on a link and get bumped back to the front page here on Slashdot, it's a failure of mysql. So much for high-performance.
    
    Why hasn't Slashdot changed to postgresql?
    
    I thought this was a good question, if slightly off-topic.
- Re:So fast and soo goo... (Score:3, Funny)
  
  by khuber ( 5664 ) writes:
  
  It's still running. It's just extremely slow. Or maybe it's so fast it's zipping through space-time and it only seems slow from our reference frame.
  -Kevin
- Re:So fast and soo goo... (Score:3, Informative)
  
  by irc.goatse.cx troll ( 593289 ) writes:
  
  Server has nothing to do with it.
  10,000 slashdotters * 500 KB per page = 5 GB in about an hour.
  These figures are both estimates, but you can see that network congestion is obviously more of a bottleneck than their performance server.
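Turning that estimate into a sustained line rate, as a rough sketch: the visitor count, page size and one-hour window are the estimates from the comment, while the 10 Mbit/s link it is compared against is an assumption, and `required_mbit_per_sec` is an illustrative helper.

```python
def required_mbit_per_sec(visitors: int, page_bytes: int, window_seconds: float) -> float:
    """Sustained Mbit/s needed to serve every visitor one page within the window."""
    total_bits = visitors * page_bytes * 8
    return total_bits / window_seconds / 1e6


VISITORS = 10_000        # estimate from the comment
PAGE_BYTES = 500 * 1000  # 500 KB per page, estimate from the comment
WINDOW = 3600            # "about an hour"

needed = required_mbit_per_sec(VISITORS, PAGE_BYTES, WINDOW)
ASSUMED_LINK_MBIT = 10.0  # hypothetical uplink, not stated in the thread

print(f"Needed: {needed:.1f} Mbit/s sustained")
print(f"Fits an assumed {ASSUMED_LINK_MBIT:.0f} Mbit/s link: {needed <= ASSUMED_LINK_MBIT}")
```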
- Re:Not-so high performance (Score:4, Insightful)
  
  by chrysalis ( 50680 ) writes: on Saturday October 19, 2002 @07:59AM (#4484317) Homepage
  
  The article is about *WEB* high performance.
  
  I don't see your point. "ping" has never been designed to benchmark web servers AFAIK.
  
  My servers don't answer to "ping". Does it mean that the web server is down? Nope... it's up and running...
  
  "ping" is not an all-in-one magic tool. By using "ping" you can test a "ping" server. Nothing else.
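If the question is whether the web server is up, asking it over HTTP is the more direct test. A minimal sketch using Python's standard `http.client`; `http_is_up` is a hypothetical helper name, and treating any non-5xx status as "up" is an assumption made here, not a standard.

```python
import http.client


def http_is_up(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if the host answers an HTTP HEAD request with a non-5xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return response.status < 500
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, DNS failures and malformed replies
        # all count as "not up" for this check.
        return False
    finally:
        conn.close()
```

Calling `http_is_up("www.example.com")` exercises the same path a browser would, so it still works on a box that drops ICMP echo requests.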
  
  - - Re:Not-so high performance (Score:2)
      
      by chrysalis ( 50680 ) writes:
      
      You are pinging Sourceforge.
    - - Re:Not-so high performance (Score:2)
        
        by chrysalis ( 50680 ) writes:
        
        ICMP REPLY doesn't exist. Maybe you mean ICMP ECHO REPLY which has nothing to do with MTU discovery.
- Re:Not-so high performance (Score:2, Informative)
  
  by Fluffy the Cat ( 29157 ) writes:
  
  Servers will generally carry on answering pings even if they're heavily overloaded. Lag or missing packets generally point to either a congested or bad link.
- Re:Building a Better Webserver in the 21st Century (Score:3, Informative)
  
  by khuber ( 5664 ) writes:
  
  I hate to do this, but actually MS has put out some good stuff that's relevant to larger sites.
  http://www.microsoft.com/backstage/whitepaper.htm
  -Kevin
