High-Performance Web Server How-To - Slashdot

High-Performance Web Server How-To 281

Posted by Hemos on Saturday October 19, 2002 @07:03AM from the build-it-and-maybe-they-will-come dept.

ssassen writes "Aspiring to build a high-performance web server? Hardware Analysis has an article posted that details how to build a high-performance web server from the ground up. They tackle the tough design choices and what hardware to pick and end up with a web server designed to serve daily changing content with lots of images, movies, active forums and millions of page views every month."

This discussion has been archived. No new comments can be posted.


High-performance web server (Score:5, Informative)

by quigonn ( 80360 ) writes: on Saturday October 19, 2002 @07:08AM (#4484219) Homepage

I'd suggest that everybody who needs a high-performance web server try out
fnord [www.fefe.de]. It's extremely small and pretty fast (without any special performance hacks!); see here [www.fefe.de].

my $0.02 (Score:5, Informative)

by spoonist ( 32012 ) writes: on Saturday October 19, 2002 @07:19AM (#4484245) Journal

* I prefer SCSI over IDE

* RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.

* I've used Dell PowerEdge 2650 [dell.com] rackmount servers and they're VERY well made and easy to use. Redundant power supplies, SCSI removable drives, good physical security (lots of locks).

Re:10'000 RPM (Score:3, Informative)

by khuber ( 5664 ) writes: on Saturday October 19, 2002 @07:25AM (#4484256)

10k drives are LESS reliable, since they move faster.
Okay, well, you can use ancient MFM drives since they move much slower and would be more reliable by your logic.
Personally, I'd take 10k SCSI drives over 7.2k IDE drives for a server, no question.
-Kevin

Re:So fast and soo goo... (Score:3, Informative)

by irc.goatse.cx troll ( 593289 ) writes: on Saturday October 19, 2002 @07:33AM (#4484274) Journal

Server has nothing to do with it.
10,000 slashdotters * 500k pages = 5gigs in about an hour.
these figures are both estimates, but you can see that network congestion is obviously more of a bottleneck than their performance server.
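
To put the estimate above in bandwidth terms, here is a rough back-of-the-envelope sketch. It uses the two figures from the post (10,000 visitors, 500 KB per page) and assumes decimal units and requests spread evenly over exactly one hour, neither of which comes from the post.

```python
# Back-of-the-envelope check of the estimate above.
# Assumptions (not from the post): decimal units (1 KB = 1000 bytes)
# and requests spread evenly over exactly one hour.
VISITORS = 10_000
PAGE_KB = 500
WINDOW_SECONDS = 3600

total_bytes = VISITORS * PAGE_KB * 1000
total_gb = total_bytes / 1e9
avg_mbit_per_s = total_bytes * 8 / WINDOW_SECONDS / 1e6

print(f"total transfer: {total_gb:.1f} GB")
print(f"average rate:   {avg_mbit_per_s:.1f} Mbit/s")
```

Under those assumptions the average comes to roughly 11 Mbit/s. Real traffic arrives in bursts, so the peak rate would be higher than this average.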

Alternative HowTo (Score:4, Informative)

by h0tblack ( 575548 ) writes: on Saturday October 19, 2002 @07:42AM (#4484289)

1. goto here [apple.com]
2. click buy
3. upon delivery open box and plug in
4. turn on Apache with the click of a button [apple.com]
5. happily serve up lots of content :)

6. (optional) wait for attacks from ppl for suggesting using apple hardware...

Slashdotted again (Score:1, Informative)

by Anonymous Coward writes: on Saturday October 19, 2002 @07:43AM (#4484292)

These guys got taken down a few weeks back:

Hard Drives Evaluated for Noise, Heat and Performance [slashdot.org]

I'm sure spreading out their content over nine pages is definitely helping their server load.

Why Apache? (Score:5, Informative)

by chrysalis ( 50680 ) writes: on Saturday October 19, 2002 @07:55AM (#4484309) Homepage

I don't understand.

Their article is about building a high performance web server, and they tell people to use Apache.

Apache is featureful, but it has never been designed to be fast.

Zeus [zeus.com] is designed for high performance.

The article supposes that money is not a problem. So go for Zeus. The Apache recommendation is totally out of context.

Server running at near 100% load (Score:5, Informative)

by ssassen ( 613109 ) writes: on Saturday October 19, 2002 @08:09AM (#4484332)

I'm writing from the SecureCRT console, connected through SSH1, as the backend is giving me timeouts. I can tell you that we're near 100% server load and are still serving out those pages to at least 1500 clients. I'm sure some of you get timeouts or can't even reach the server at all; for that I apologize, but we just have one of these, not a whole rack full of them.
Have a good weekend,
Sander Sassen
Email: ssassen@hardwareanalysis.com
Visit us at: http://www.hardwareanalysis.com

Re:Building a Better Webserver in the 21st Century (Score:3, Informative)

by khuber ( 5664 ) writes: on Saturday October 19, 2002 @08:12AM (#4484336)

I hate to do this, but actually MS has put out some good stuff that's relevant to larger sites.
http://www.microsoft.com/backstage/whitepaper.htm
-Kevin

Re:Not-so high performance (Score:2, Informative)

by Fluffy the Cat ( 29157 ) writes: on Saturday October 19, 2002 @08:40AM (#4484381) Homepage

Servers will generally carry on answering pings even if they're heavily overloaded. Lag or missing packets generally point to either a congested or bad link.

how to build a high performance/reliable webserver (Score:4, Informative)

by jacquesm ( 154384 ) writes: <j@NoSpam.ww.com> on Saturday October 19, 2002 @08:42AM (#4484384) Homepage

1) use multiple machines / round robin DNS
2) use decent speed hardware but stay away from
'top of the line' stuff (fastest processor,
fastest drives) because they usually are not
more reliable
3) replicate your databases to all machines so
db access is always LOCAL
4) use a front end cache to make sure you use
as little database interaction as you can
get away with (say flush the cache once per
minute)
5) use decent switching hardware and routers, no
point in having a beast of a server hooked up
to a hub now is there...

that's it ! reasonable price and lots of performance
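
Points 1 and 4 above can be sketched in a few lines. This is only an illustration under stated assumptions: the host names, the `TTLCache` class and the `render_front_page` function are hypothetical, and the in-process rotation merely imitates what round robin DNS does for clients.

```python
import itertools
import time

class TTLCache:
    """Front-end cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (time stored, value)

    def get(self, key, load):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]          # still fresh: no database work
        value = load(key)          # expired or missing: rebuild once
        self._store[key] = (now, value)
        return value

# Point 1: hand out servers in rotation (hypothetical host names).
SERVERS = ["web1", "web2", "web3"]
rotation = itertools.cycle(SERVERS)
picks = [next(rotation) for _ in range(4)]

# Point 4: many requests, one expensive page build per minute.
db_calls = []

def render_front_page(key):
    db_calls.append(key)           # stands in for the database queries
    return f"<html>{key}</html>"

cache = TTLCache(ttl=60.0)
for _ in range(1000):
    cache.get("front-page", render_front_page)

print(picks)
print(len(db_calls))
```

The one-minute lifetime matches the "flush the cache once per minute" suggestion; a shorter `ttl` trades database load for fresher pages.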

Re:"Three times the power?" (Score:5, Informative)

by NineNine ( 235196 ) writes: on Saturday October 19, 2002 @08:45AM (#4484387)

"Microsoft Windows 2000 Pro"

I got a good laugh out of this... W2K Pro is the desktop version, not the server version. Wow. Great article. Really well informed author.

Re:But any web server is high-performance (Score:3, Informative)

by NineNine ( 235196 ) writes: on Saturday October 19, 2002 @08:54AM (#4484412)

Our databases are tuned. Some apps would just need to transfer too much data per request for a SQL call to be feasible.

I had this problem for a while... Sloppy coding on my part was querying 65K+ records per page. Server would start to crawl with a few hundred simultaneous users. Since I fixed it, 1000+ simultaneous users is no problem at all.
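
The post does not say how the query was fixed. One common approach, shown here purely as an assumption, is to fetch only the rows a page actually displays. The table, column names and page size below are hypothetical, and SQLite stands in for whatever database was really used.

```python
import sqlite3

# Hypothetical table with 65,000 rows, the size mentioned above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    ((f"item-{i}",) for i in range(65_000)),
)

PAGE_SIZE = 25  # assumed number of rows shown per page

def fetch_page(conn, page):
    """Return only the rows needed for one page instead of the whole table."""
    offset = page * PAGE_SIZE
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()

rows = fetch_page(conn, 0)
print(len(rows), rows[0])
```

Each request now moves 25 rows rather than 65,000, which is the kind of reduction that lets the same server handle far more simultaneous users.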

Re:That "howto" sucks (Score:5, Informative)

by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:02AM (#4484428) Homepage

High traffic sites, the ones that are really dynamic anyway, do more than that.
They start with a load balancer at the front end, or possibly several layers of load balancer. If they run a distributed operation they'll use smart DNS systems or routers to direct requests to the most local server cluster. The server cluster will be fronted by a request scattering system.
Behind the request scattering system you'll find a cluster of machines whose job it is to serve static content (often the bulk of data served by a site) and route dynamic requests to another cluster of servers, enforcing session affinity for the dynamic requests.
Behind the static content servers are the application servers. They do the heavy lifting, building dynamic pages as appropriate for individual users and caching everything they can to offload the database.
Behind the application servers is the database or database cluster. The latter is really not that useful if you have a highly dynamic site as there are problems with data synchronization in database clusters (no matter what the database vendors tell you). But that's ok, single databases can handle a lot of volume if built correctly and caching is done appropriately at the application level.
And there you have it, the structure of a really large site.
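
As a rough illustration of the middle tier described above, the sketch below serves static paths locally and pins each session to one application server. The server names, the `/static/` prefix and the hashing scheme are assumptions made for the example, not details from the post.

```python
import hashlib

APP_SERVERS = ["app1", "app2", "app3"]  # hypothetical application servers

def route(path, session_id):
    """Pick a destination: static tier, or an app server chosen by session."""
    if path.startswith("/static/"):
        return "static"
    # Hash the session ID so the same session always lands on the same
    # application server (session affinity).
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(APP_SERVERS)
    return APP_SERVERS[index]

print(route("/static/logo.png", "abc123"))
print(route("/forum/post", "abc123") == route("/forum/reply", "abc123"))
```

A plain modulo like this reshuffles most sessions whenever the server list changes; production load balancers usually use a scheme that avoids that.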

Re:OK so where do I start? (Score:3, Informative)

by ssassen ( 613109 ) writes: on Saturday October 19, 2002 @09:03AM (#4484430)

People are negative because the server has been unreachable for some, but they tend to conveniently forget that we did not design for 2000+ simultaneous clients, just a couple of hundred really. Just thought I'd let you know, as we only have one of these whereas most websites (like Anand and Tom) have a rack full of them. Still we're handling the load pretty well and are serving out the pages to about 1500 clients.
Have a good weekend,
Sander Sassen
Email: ssassen@hardwareanalysis.com
Visit us at: http://www.hardwareanalysis.com

Re:But any web server is high-performance (Score:2, Informative)

by khuber ( 5664 ) writes: on Saturday October 19, 2002 @09:05AM (#4484433)

Very good info Jim.
Yeah, my experience is at a relatively large site. We use mostly large and midrange Suns, EMC arrays and so on. There's a lot of interest in the many small server architecture though that is still being investigated.
-Kevin

Re:But any web server is high-performance (Score:5, Informative)

by jimfrost ( 58153 ) writes: <jimf@frostbytes.com> on Saturday October 19, 2002 @09:16AM (#4484473) Homepage

I've seen both kinds and take it from me, many small servers are more of a headache than the hardware cost savings are worth. Your network architecture gets complicated, you end up having to hire lots of people just to keep the machines running and with up-to-date software, and database connection pooling becomes a lot less efficient.
You save money in the long run by buying fewer, more powerful machines.

Re:Apache 1.3x? (Score:2, Informative)

by Pizza ( 87623 ) writes: on Saturday October 19, 2002 @09:23AM (#4484504) Homepage Journal

Actually, their disk tests are fundamentally flawed. RAID0 is only good for boosting raw sustained throughput; it has pretty much no effect on access time. If you want a boost in access time, go for RAID1, as you can load-balance reads across two drives.

Furthermore, RAID0+1 is also not really worth it, as it still only gives you the ability to fail one drive, and instead of two logical spindles you only have one to do all of the work. But I suppose if your software is inflexible enough to only be able to operate on one partition, so be it.

I'd like to see some numbers for their boxes loaded up with RAM and high numbers of random I/O operations, which is where the high rotational speed of modern SCSI drives really shines. And this is the access pattern of a dynamic database-driven web site.

And as others have said, it's not the hardware that makes the most difference in these circumstances, it's how the software is set up, and how the site/database is coded.

Hell, I've completely saturated a 100mbps network serving dynamic content via pure Java Servlets, and this was only a dual P3-650. With a RAID5 array of 50G 7200RPM SCSI drives, hardly cutting edge even at the time. Dropping in a RAID1 array of WD120 IDE drives couldn't come anywhere close. But once the working set of data was loaded into RAM, they both performed about the same.

Their IDE raid setup is certainly considerably cheaper though, and that's a tradeoff that most people can easily make.

Re:High-performance web server (Score:4, Informative)

by Electrum ( 94638 ) writes: <david@acz.org> on Saturday October 19, 2002 @09:36AM (#4484533) Homepage

Yep. fnord is probably the fastest small web server available. There are basically two ways to engineer a fast web server: make it as small as possible to incur the least overhead or make it complicated and use every possible trick to make it fast.

If you need features that a small web server like fnord can't provide and speed is a must, then Zeus [zeus.com] is probably the best choice. Zeus beats the pants off every other UNIX web server. Its "tricks" include non blocking I/O, linear scalability with regard to number of CPUs, platform specific system calls and mechanisms (acceptx(), poll(), sendpath, /dev/poll, etc.), sendfile() and sendfile() cache, memory and mmap() file cache, DNS cache, stat() cache, multiple accept() per I/O event notification, tuning the socket buffers, disabling nagle, tuning the listen queue, SSL disk cache, log file cache, etc.
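
A few of the socket-level items in that list (non-blocking I/O, disabling Nagle, socket buffer size, listen queue) can be shown with the standard socket API. This is a generic sketch, not how Zeus does it; the `make_listener` helper, the buffer size and the backlog value are arbitrary assumptions.

```python
import socket

def make_listener(host="127.0.0.1", port=0, backlog=1024):
    """Create a listening TCP socket with a few common tuning options set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Disable Nagle's algorithm so small writes are sent without delay.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Ask for a larger send buffer; the kernel may round or cap the value.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 256 * 1024)
    sock.bind((host, port))
    # A longer listen queue lets more pending connections wait for accept().
    sock.listen(backlog)
    # Non-blocking mode: accept()/recv() raise instead of waiting, so one
    # thread can multiplex many sockets with select/poll/epoll.
    sock.setblocking(False)
    return sock

listener = make_listener()
print(listener.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
print(listener.getblocking())
listener.close()
```

Port 0 asks the operating system for any free port, which keeps the example runnable without special privileges.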

Which design is better? Depends on your needs. It is quite interesting that the only way to beat a really small web server is to make one really big that includes everything but the kitchen sink.

Re:how to build a high performance/reliable webser (Score:2, Informative)

by jcrowe ( 207448 ) writes: on Saturday October 19, 2002 @09:59AM (#4484582) Homepage

The company I work for successfully runs our webserver(php & MySQL) on an old pentium 166. We have several thousand visitors every month & use it for an ftp site for suppliers, a router, firewall, gateway & squid server.

I think that your 700mhz machine would work fine for just web pages. :)

Re:my $0.02 (Score:3, Informative)

by Door-opening Fascist ( 534466 ) writes: <skylar@cs.earlham.edu> on Saturday October 19, 2002 @10:23AM (#4484634) Homepage

RedHat is a pain to strip down to a bare minimum web server, I prefer OpenBSD [openbsd.org]. Sleek and elegant like the early days of Linux distros.

OpenBSD doesn't have support for multiple processors, which are a necessity for database servers and dynamic web servers. I'd say FreeBSD is the way to go.

Re:"Three times the power?" (Score:5, Informative)

by (H)elix1 ( 231155 ) writes: <slashdot.helix@nOSPaM.gmail.com> on Saturday October 19, 2002 @10:46AM (#4484696) Homepage Journal

That was total FUD. The two operating systems have comparable performance on the same hardware.

Win2k pro limits you to 10 concurrent TCP/IP connections, Win2K Server has no (artificial) limit but won't cluster, Advanced Server can cluster but I don't know a thing about it..

Linux has no (artificial) limit... not sure about clustering options there either.

Found out about the TCP/IP limit when I added SP2 and trashed my evening counter-strike server - this makes a HUGE difference.

Re:High-performance web server (Score:2, Informative)

by Fefe ( 6964 ) writes: on Saturday October 19, 2002 @11:52AM (#4484883) Homepage

fnord supports CGI and PHP can be run in CGI mode.
Actually, at least two people are using fnord to host a PHP site.

Don't expect stellar performance, though. PHP is by no means a small interpreter. I guess it would be possible to be fast and PHP compatible with some sort of byte code cache. If there is enough demand, someone will implement it.

Re:But any web server is high-performance (Score:3, Informative)

by Hast ( 24833 ) writes: on Saturday October 19, 2002 @12:45PM (#4485103)

How about reading the FAQ before you start giving out "facts"? Slashdot is running on:
* 5 load balanced Web servers dedicated to pages
* 3 load balanced Web servers dedicated to images
* 1 SQL server
* 1 NFS Server
Either the "little 4 way intel" you mention has a serious case of schizophrenia or you're just full of it. (Guess which theory I'm going for.)

Besides the poster mentioned that those sites /are/ bigger than Slashdot. E.g. the mention that "Getting your URL posted during Friends" is nothing like getting it posted on Slashdot.

I know I shouldn't feed the trolls, but someone might actually believe this tripe.

Re:Alternative HowTo (Score:2, Informative)

by mcowger ( 456754 ) writes: on Saturday October 19, 2002 @12:56PM (#4485142)

You missed a few steps:

3a) Pull off god awful packaging
3b) Install in rack with mickey mouse install setup that requires removing the cover from the machine, exposing all the internal electronics while you're at it
3c) Making sure the system sags in the middle while installed in the rack.

and

4a) Wipe OS because you have to before you can set up RAID
4b) Setup RAID, have the disk set utility fail multiple times with cryptic errors, only to find that Apple's own docs say this is 'normal behavior'
4c) When disks fail and are removed, must reboot server to single user mode to reconstruct failed data. May or may not work...apple says 'normal behavior'

and

5a) Hope that your machine doesn't exhaust its TCP connection pool, which it will if you make too many SSH connections to it.

Sorry, I'm just so pedantic today.

Really, though, the XServes are a cheap attempt at a server that just doesn't work. It's a mickey mouse hack from the beginning. And yes, I have set them up personally. Only 2, because I won't recommend the purchase of any more after THAT experiment.

Re:But any web server is high-performance (Score:3, Informative)

by Aldurn ( 187315 ) writes: on Saturday October 19, 2002 @01:38PM (#4485338)

Aside: Is there any open source software that manages session affinity yet?

Yes. Linux Virtual Server [linuxvirtualserver.org] is an incredible project. You put your web servers behind it and (in the case of simple NAT balancing) you set the gateway of those computers to be the address of your LVS server. You then tell LVS to direct all IPs of a certain netmask to one server (i.e. if you set for 255.255.255.0, 192.168.1.5 and 192.168.1.133 will connect to the same server).
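
The netmask grouping in that example can be checked with Python's standard `ipaddress` module. This only illustrates the arithmetic; it is not LVS code, and the `persistence_key` helper is a made-up name.

```python
import ipaddress

def persistence_key(client_ip, netmask="255.255.255.0"):
    """Return the network address a client falls into under the netmask."""
    network = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return network.network_address

a = persistence_key("192.168.1.5")
b = persistence_key("192.168.1.133")
print(a, b, a == b)
```

Both addresses reduce to 192.168.1.0, so a balancer that keys on this value would send them to the same server, while a client in 192.168.2.x would get a different key.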

The only problem I had with it was that it does not detect downtime. However, I wrote a quick script that used the checkhttp program from Nagios [nagios.org] to pull a site out of the loop when it went down (these were Windows 2000 servers: it happened quite frequently, and our MCSE didn't know why :)
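
The script itself is not shown in the post. A minimal sketch of the same idea, using only Python's standard library rather than the Nagios plugin, might look like the following; the function names and the timeout are assumptions, and the step that actually reconfigures the balancer is left out.

```python
import http.client
import urllib.error
import urllib.request

def http_ok(url, timeout=3.0):
    """Return True if the URL answers with a 2xx/3xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connections, timeouts, HTTP errors and malformed replies
        # all count as "down".
        return False

def healthy_servers(pool, check=http_ok):
    """Keep only the servers that pass the health check."""
    return [url for url in pool if check(url)]
```

Run periodically (from cron, for instance), the result of `healthy_servers` would be compared with the balancer's current pool, and any missing server pulled out of rotation until it passes again.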

There are higher performance ways to set up clustering using LVS, but since I was lazy, that's what I did.

Re:my $0.02 (Score:3, Informative)

by SuiteSisterMary ( 123932 ) writes: <slebrunNO@SPAMgmail.com> on Saturday October 19, 2002 @01:43PM (#4485360) Journal

If your server isn't designed with 'security' in mind, including the ability to padlock the chassis, and at least send an SNMP trap when the chassis is opened, then you need to learn that as far as 'computer and data security' is concerned, protecting from external network attacks is actually quite low on the totem pole.

Or, "If Joe Random Idiot can walk in and rip out the hard drive, who cares how 3117 your firewall and other network protections are."

Re:"Three times the power?" (Score:2, Informative)

by Aldurn ( 187315 ) writes: on Saturday October 19, 2002 @01:47PM (#4485378)

At a website I used to work at, they decided they needed to use Windows 2000 Advanced Server for web clustering. That is, quite possibly, the worst decision they ever made (aside from going with Windows 2000; trust me on this one.)

Win2k AS Load Balancing (aka WLBS: Windows Load Balancing Service) works by detecting other computers on the network with the same service, and they decide who will handle what request. They both have a primary IP, which is unique, in addition to a "virtual" address, which is the same on all of them. They also have a fake MAC address which is identical on both (makes for interesting ping responses.)

An interesting thing we noticed about WLBS is that, unless a computer is off the network, it will still be in the cluster. I.e. if IIS fails on one machine, as long as you can ping it, it will still get traffic.

When we moved from WLBS to LVS [linuxvirtualserver.org], we noticed a 50% drop in average CPU usage. This is probably due to the fact that now the clustering horsepower was moved off the web servers, but still, a free product versus a rather expensive one. And we've had better uptime now than ever before.

Re:"Three times the power?" (Score:4, Informative)

by Magila ( 138485 ) writes: on Saturday October 19, 2002 @02:17PM (#4485478) Homepage

Win2k pro limits you to 10 concurrent TCP/IP connections.

Whoa! bullshit meter rising! While Win2K does have a limit on TCP/IP connections, it is in the thousands. A limit of 10 would be totally ridiculous; it would cripple the OS for MANY people. Also, most of the traffic for a CS server is UDP so the TCP/IP connection limit isn't going to affect that much at all.

Re:"Three times the power?" (Score:5, Informative)

by elemental23 ( 322479 ) writes: on Saturday October 19, 2002 @04:31PM (#4486065) Homepage Journal

The maximum number of other computers that are permitted to simultaneously connect over the network to Windows NT Workstation 3.5, 3.51, 4.0, and Windows 2000 Professional is ten. This limit includes all transports and resource sharing protocols combined. This limit is the number of simultaneous sessions from other computers the system is permitted to host.

From Microsoft Knowledge Base Article Q122920 [microsoft.com].
(Warning: The page layout is broken in Mozilla)

It's an artificial limitation. The idea is that if you need more simultaneous connections you should buy Win2k Server. In other words, MS wants you to spend more money.

Re:server load (Score:3, Informative)

by 1110110001 ( 569602 ) writes: <(slashdot-0904) (at) (nedt.at)> on Saturday October 19, 2002 @04:33PM (#4486077)

Maybe the article Handling the Loads [slashdot.org], describing how Slashdot kept their Servers up at 9/11, is a bit of the thing you're looking for. b4n


