The Internet

Scaling Server Performance 349

An anonymous reader writes "When Ace's Hardware's article Hitchhiker's Guide to the Mainframe was posted on Slashdot, they got 590,000 hits and over 250,000 page requests during one day. This kind of traffic caused only a 21% average CPU load on their Java-based web server, which is powered by a single 550MHz UltraSparc-II CPU. In their newest article, Scaling Server Performance, Ace's Hardware explains how this was possible."
This discussion has been archived. No new comments can be posted.

  • 6 per second. (Score:4, Insightful)

    by blair1q ( 305137 ) on Friday January 17, 2003 @01:43PM (#5103223) Journal
    Are we supposed to be impressed with a computer that can serve 8 hits and 4 pages per second?
    • yes. (Score:5, Funny)

      by krog ( 25663 ) on Friday January 17, 2003 @01:47PM (#5103260) Homepage
      seeing as it took Slashdot 35 seconds to serve me up this comments.pl?op-Reply page, yes, i think we are supposed to be impressed.
      • Reply pages are dynamically generated (as are many other pages on Slashdot). News content on a site like Aceshardware would be static.

        Even for dynamic content, it seems any reasonable web server should easily be able to generate half a dozen pages per second. Of course, it won't be able to if you do something stupid like put all your content into a database.

      • seeing as it took Slashdot 35 seconds to serve me up this comments.pl?op-Reply page, yes, i think we are supposed to be impressed.

        I think that you must be having a problem on your end, since it loaded in under two seconds for someone else and loaded instantaneously for me.
      • I get that all the time as well. Pages on slashdot either load very fast, or very very slowly. this only happens to me on slashdot, so I assume it is their problem.
      • 35 seconds to load on your dialup? 35 seconds to render on your 486? 35 seconds to transfer from the other side of the world over a congested backbone? there could be many reasons why it takes so long to load... in my case the content is downloaded much quicker than the machine (a humble 250mhz) can render it.
    • Are we supposed to be impressed with a computer that can serve 8 hits and 4 pages per second?

      Sounds kinda weak to me too. I am currently working on a web server application that's supposed to serve highly dynamic, personalized pages (perhaps comparable to slashdot). Our perf goal is 200 pages/sec. Of course, it would be on bigger hardware but I think we could easily beat the number mentioned in the article on a 500Mhz PC.
    • Re:6 per second. (Score:3, Interesting)

      by mrtroy ( 640746 )
      perhaps. perhaps id be impressed if their cpu could keep up with the hits IF THEIR BANDWIDTH COULD KEEP UP
      *REQUEST TIMED OUT*
My 1GHz server with 3 terabytes of RAM can handle any traffic you can throw at it!!! Now to upgrade that 56k....

      Burning karma :(
    • How many per second? (Score:4, Informative)

      by steveha ( 103154 ) on Friday January 17, 2003 @06:04PM (#5105094) Homepage
They said that the peak load was 11 hits per second, with 4 pages per second being served. They also said that their CPU was 21% loaded to serve this much traffic.

      This says nothing about what they can serve under ideal conditions; this is what they actually served up during an actual slashdotting. If you want to max out their server, you will need to get more /. readers to hit them all at once, or perhaps they need a bigger pipe connecting them to the Net.

      Read the article; on ApacheBench with one particular page they tested, the server tested out at five dozen pages served up per second.

      I don't know about you, but I was somewhat impressed by all this. A $1000 Sun does seem to have been a wise choice for them.

      steveha
  • by darkov ( 261309 ) on Friday January 17, 2003 @01:45PM (#5103237)
    they got 590,000 hits and over 250,000 page requests during one day. This kind of traffic caused only a 21% average CPU load ... they didn't respond to any of them.
  • will be tested to see if it's meaningful. I like that. That is definitely putting your money where your mouth is.

Of course, it is incumbent upon all of us to rush out and try the link to the article. And some of us to actually read it as opposed to just reading the title.

  • by Anonymous Coward on Friday January 17, 2003 @01:46PM (#5103247)
    This kind of traffic caused only a 21% average CPU load to their Java-based web server, which is powered by a single 550MHz UltraSparc-II CPU. In their newest article, Scaling Server Performance, Ace's Hardware explains how this was possible.
    Battlestations!

    SLASHDOT THEM AGAIN!!!
  • by vanyel ( 28049 ) on Friday January 17, 2003 @01:47PM (#5103253) Journal
    When I was benchmarking web servers in *1994*, servers could handle 100,000/hr, which is only about 30/sec. You may need a T3 to handle the bandwidth, but any server that can't handle it today is misconfigured.
    • by stratjakt ( 596332 ) on Friday January 17, 2003 @02:00PM (#5103376) Journal
In 1994 websites were nothing more than text documents with perhaps a handful of small .gifs in them. They weren't plastered with media-intensive ads, Java applets and Shockwave whizbangers, background music, video clips streaming off the same server and blah blah blah innovation.

      The web-design and server world seems to be focused on quantity, not quality.

      And frankly, much of what /. links to are personal sites run off of a DSL line. I think the effect has more to do with bandwidth than server load.
      • by vanyel ( 28049 ) on Friday January 17, 2003 @02:13PM (#5103519) Journal
Yes, but each one of those wizbang annoyances is just another hit to the server. Dynamic generation of pages is the real server killer, depending on how much hoop-de-loop you're going through to make them.
        • True enough, I meant to say that too.. Dynamic web pages are relatively 'new'.

          The difference between just showing a page and creating one is like the difference between a pre-rendered .avi file and rendering it realtime in hardware.

          I still figure bandwidth is the big killer. I mean you can only stuff watermelons through a garden hose so fast.
        • by FyRE666 ( 263011 ) on Friday January 17, 2003 @03:33PM (#5104083) Homepage
Yes, but each one of those wizbang annoyances is just another hit to the server. Dynamic generation of pages is the real server killer, depending on how much hoop-de-loop you're going through to make them.

          Maybe it's just late, but I'm having a problem following all this technical jargon ;-)
    • by Kunta Kinte ( 323399 ) on Friday January 17, 2003 @02:06PM (#5103434) Journal
Those guys are using persistent server-side applications. Try getting those numbers from a reasonably complex PHP script, even with an opcode cache, on such a small box (see my sig. for more info)

Lots of people could use this type of performance. I only had a chance to use JSP on one project, a while back. Tomcat was notoriously difficult to install back then. But when it was up, the difference between a JSP application server and PHP became apparent. Application servers can make quite the difference.

Just having an application scope for variables saved us a trip to the LDAP server per request. PostNUKE, SquirrelMail, and lots of other large PHP apps could be sped up drastically if some of those features were available in the PHP engine.
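
      To make the application-scope point concrete, here is a minimal, hypothetical sketch of that pattern for a Java servlet container (the class name, attribute name, and the LDAP lookup are illustrative, not taken from Ace's setup): the expensive directory query runs once, and every request-handling thread reuses the cached result.

      // Hypothetical sketch: cache an expensive directory lookup in application scope.
      import javax.servlet.ServletContext;
      import javax.servlet.http.HttpServlet;
      import javax.servlet.http.HttpServletRequest;
      import javax.servlet.http.HttpServletResponse;
      import java.io.IOException;

      public class DirectoryServlet extends HttpServlet {
          @Override
          protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                  throws IOException {
              ServletContext app = getServletContext();
              // Application scope is shared by every request-handling thread, so the
              // LDAP query runs once instead of once per request.
              synchronized (app) {
                  if (app.getAttribute("departmentList") == null) {
                      app.setAttribute("departmentList", loadDepartmentsFromLdap());
                  }
              }
              resp.setContentType("text/plain");
              resp.getWriter().println(app.getAttribute("departmentList"));
          }

          private Object loadDepartmentsFromLdap() {
              // Placeholder for the real (slow) directory query.
              return java.util.Arrays.asList("Engineering", "Sales", "Support");
          }
      }

      Synchronizing on the context is crude but enough for a sketch; the point is simply that the lookup result lives as long as the application, not the request.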

      • Those guys are using persistant server-side applications. Try getting those numbers from a reasonably complex PHP script, even with an opcode cache on such a small box

        Like I said, misconfigured ;-)

        (yes, I'm joking)

    • When I was benchmarking web servers in *1994*, servers could handle 100,000/hr, which is only about 30/sec.

      But this is a Java-based server we're talking about.
    • That could be true, but this article is talking about dynamic database based pages. There's a huge difference.
  • by ites ( 600337 ) on Friday January 17, 2003 @01:48PM (#5103263) Journal
Which is very funny: this is an article explaining how a web site survived the /. effect, thus luring the /. readers back for a second round and getting lots of advertising hits at the same time. If only that server could keep up.
    Now, a while back on /. I saw a report about a 200MHz (?) PC running Windows 95 with about 30 hard disks that also seemed to do very well under the /. effect.
There are plenty of Web server machines that rarely crash, if ever. Many of these sites rely on the work of only one machine, just like Aces Hardware. If they have more than one Web machine, they split them up by category (e.g. Google has a machine each for their "catalogs", "search", and "images" utilities)

    Academic: MIT.edu, Stanford.edu, Maryland.edu
    Business: Amazon.com, CDNow.com, Slashdot.com, Google.com
    Pleasure: TheHun.net, Playboy.com, Napster.com
Ones that crash (Score:5, Interesting)

    by MCMLXXVI ( 601095 ) on Friday January 17, 2003 @01:52PM (#5103291)
I would be more interested in stats on a webserver that took a puke. It would be interesting to see what started the dominoes falling and what ultimately brought it down. It would be as good a learning experience as this article is.
  • ...so I could implement it on my PHProjekt server! I have become so dependent upon it that it's almost scary. I'd love to speed it up a bit though. Time to start reading I suppose.
  • Not to be cynical... (Score:2, Interesting)

    by Xunker ( 6905 )
    Not to be cynical, but serving (nearly) static pages shouldn't be a huge load by any standard. Even with dynamic (fully dynamic) pages, 250,000 isn't a huge number.

As an example, I run a pretty popular site that pumps out about 250,000 as well, all CGI-created and database-fed pages. This is being served by two 1GHz web heads and a 1GHz db server. Granted that those three machines run at 100% load during peak hours, it's still not a huge deal (this is because I haven't finished the local caching mechanism yet). Did I mention that the two webservers also toss 1 million images a day?

Of course, I don't want to belittle the article that much -- if anything, it shows the performance gains you get when you use efficient hardware (I have no doubt that their 550 MHz UltraSparc II has nearly the same horsepower as a 1 GHz x86) and efficient caching (caching data in RAM and serving from there, avoiding disk access penalties, is a huge performance increase).
    • by buysse ( 5473 )
Incorrect. The 550 MHz USII mentioned is in a Blade 100 or 150, which means that it's fucking slow. It's an IIe, not a II. A PIII/550 would smoke it for web serving.

      256K of cache on die, ALI chipset board that's a lot like a PC, slow PC133 (with very high latency) memory, dog-fucking-slow disk, unless they're using SCSI.

      This is not your father's E450.

  • by shoppa ( 464619 ) on Friday January 17, 2003 @01:56PM (#5103339)
    When one of the sites that I serve, The Computer History Simulation Project [trailing-edge.com], was slashdotted, I was serving 40-50 pages per second (which is nearly ten times the rate attributed to Ace's Hardware) on a 4-year-old webserver (a K6II-500) that cost about $200 to put together. And the server itself was ticking along with only a few percent CPU usage.

    OTOH, my puny little SDSL connection was seriously maxed out.

    Even old hardware can happily serve up hundreds of documents a second, if the pages are static.

    • Well then shoppa... tell us how you did it! The point is not to brag, but to help show others what you've learned in tuning your own box.
      • tell us how you did it!

        The steps:

        1. Install Linux [linuxfromscratch.org]
        2. Connect to the 'net
        3. Install Apache [apache.org], configure.
        4. Set up documents to serve
        The only limit when serving static documents to the 'net at large is network bandwidth.
    • A quick look at your web site reveals that it is not a database-backed web site with a lot of scripts running. Serving up dynamic content is much more challenging to do.
      • by shoppa ( 464619 ) on Friday January 17, 2003 @02:35PM (#5103699)
I agree, dynamic content is very much more challenging - but is it wise for Ace's (or any of the other sites serving up static stories) to do so through dynamic methods?

        I am familiar with serving dynamic content of very high information density, and let me tell you, Ace's doesn't compare. The data I serve from work is updated every second; the stories on Ace's (and most other hardware-review sites) change every couple of days.

  • by Kevin Stevens ( 227724 ) <kevstev.gmail@com> on Friday January 17, 2003 @01:57PM (#5103348)
I never really thought that the problem lay with the server's hardware, but in the bandwidth to the host. Shouldn't an article be written about how to conserve bandwidth during a slashdot effect? Even older servers should be able to handle 100 requests per second. I think most FPSes are a lot more taxing than that.
    • One word: mod_gzip [freshmeat.net].
      • by pjrc ( 134994 ) <paul@pjrc.com> on Friday January 17, 2003 @03:22PM (#5104027) Homepage Journal
        One word: mod_gzip.

Yes, mod_gzip is great and I use it on my own server [pjrc.com], but for any "normal" website the main advantage is an interactive speed-up for dialup users. It really doesn't save huge amounts of bandwidth (in this case, not enough to matter for withstanding the slashdot effect).

As an example, the page slashdot linked to is 22443 bytes of compressible html, and approx 84287 bytes of images (not including the ads and two images that didn't load because they're not handling the slashdot effect as well as they think they can). At -9, the slowest and best compression (remember, this is a dynamic JSP site, not static content you can compress ahead of time), the html compresses to 5758 bytes, thereby reducing the total content from 106730 bytes to 90045.

        That's only a 15.6% reduction in bandwidth.

Also, a typical HTTP response header, which can't be compressed, is about 300 bytes (not including TCP/IP packet overhead, which we'll ignore hoping that HTTP/1.1 keepalives are putting it all in one connection...). There were 18 images (actually 20, but junkbuster filtered 2 out for me). That's 19 HTTP headers, at 300 bytes each, all incompressible. Adding in HTTP overhead we're at (approx) 112430 bytes without compression and 95745 with mod_gzip. So the incompressibility of the headers reduces the bandwidth savings to 14.8%.

The big advantage that makes mod_gzip really worthwhile for a site like that is that a dialup user can get all the html in about 2 seconds, rather than 5-6 (assuming the modem's compression is on). Then they can start reading, while the remaining 82k of images slowly appear over the next 20-30 seconds.

        Now in some cases, like slashdot's comments pages, mod_gzip makes a massive difference. But for most sites, the majority of the bandwidth is images that are already compressed. That 10% to 20% reduction in bandwidth from simply installing mod_gzip is pretty small compared to a bit of effort redesigning pages to trim the fatty images.
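
        For anyone who wants to sanity-check that arithmetic, here is a tiny, throwaway sketch (in Java, since that's what the site in question runs on) that just redoes the percentages above. The byte counts are copied from this comment, and the 300-byte header size is the same rough assumption.

        // Back-of-the-envelope check of the compression savings discussed above.
        public class GzipSavings {
            public static void main(String[] args) {
                int html = 22443, htmlGz = 5758, images = 84287;
                int responses = 19, headerBytes = 300;        // assumed header size

                int plain = html + images;                    // 106730 bytes
                int gz = htmlGz + images;                     // 90045 bytes
                System.out.printf("content only: %.1f%% saved%n",
                        100.0 * (plain - gz) / plain);        // ~15.6%

                int plainH = plain + responses * headerBytes; // 112430 bytes
                int gzH = gz + responses * headerBytes;       // 95745 bytes
                System.out.printf("with headers: %.1f%% saved%n",
                        100.0 * (plainH - gzH) / plainH);     // ~14.8%
            }
        }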

        • Content expiration (Score:3, Informative)

          by ttfkam ( 37064 )
This is why people should set an expiration time on their static content. If, for example, I set up the images to expire one hour from the access time, multiple visits to the page (and to images shared between multiple pages) would only request them once. An ISP's proxy servers down the chain would only help in this regard.

In addition, for static content, "LastModified" is easy to compute. Clients can request a page and send an "If-Modified-Since" header with the timestamp of the static item, and if the item hasn't changed, the server returns a 304 response and no data.

          The same can be done for dynamic content, but it requires a bit more work. Most web servers do these things for static content out of the box.

          As was said in the article, the fastest request is the request that never has to be made.
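
          A minimal sketch of what that can look like for dynamic content in a Java servlet, assuming a servlet container; the timestamp lookup and the one-hour expiry are illustrative assumptions, not anything from the article. (HttpServlet can also handle much of this for you if you override getLastModified().)

          // Hypothetical conditional-GET handling for a dynamically generated article page.
          import javax.servlet.http.HttpServlet;
          import javax.servlet.http.HttpServletRequest;
          import javax.servlet.http.HttpServletResponse;
          import java.io.IOException;

          public class ArticleServlet extends HttpServlet {
              @Override
              protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                      throws IOException {
                  long lastChanged = lookupArticleTimestamp(req.getParameter("id"));
                  long ifModifiedSince = req.getDateHeader("If-Modified-Since"); // -1 if absent

                  // If the client's copy is at least as new as ours (compare at one-second
                  // resolution, like HTTP dates), answer 304 and send no body at all.
                  if (ifModifiedSince != -1 && lastChanged / 1000 <= ifModifiedSince / 1000) {
                      resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                      return;
                  }
                  resp.setDateHeader("Last-Modified", lastChanged);
                  resp.setDateHeader("Expires", System.currentTimeMillis() + 3600 * 1000L); // 1 hour
                  resp.setContentType("text/html");
                  resp.getWriter().println("<html>...article body...</html>");
              }

              private long lookupArticleTimestamp(String id) {
                  // Placeholder: a real application would read this from the cache or
                  // database record for the article.
                  return 1042830000000L;
              }
          }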
  • by Glonoinha ( 587375 ) on Friday January 17, 2003 @01:57PM (#5103358) Journal
    I tried to go to the home page but it wasn't responding. Maybe they need to apply this technology to their home page.
  • by Wanker ( 17907 ) on Friday January 17, 2003 @01:58PM (#5103360)
    ... just put the webserver on a cheap network link. Then instead of being CPU-bound you're network-bound. Problem solved!

    I'd like to see some of their network performance graphs from that same day. That might make more interesting reading. I recall waiting a good long time for some of those pages to come up.
  • by AssFace ( 118098 ) <stenz77.gmail@com> on Friday January 17, 2003 @01:58PM (#5103369) Homepage Journal
    I'm going to write an extensive article on the 40 hits per day that I get.
    I manage all that with only an Athlon something or other with some amount of ram.

    I know a lot about the system - you have to when you are good like that.
  • I think that the main thrust of the article was essentially that size doesn't matter -- it's how you use what you've got.

    One thing that sort of made me think though, was the focus on being able to deliver massive numbers of "pages" and "hits". For most sites, this is not an issue -- their bandwidth would be hosed before the server would be. You can only stuff eight great tomatoes into that itty-bitty can. It doesn't take much to saturate a T-1.

    If you have nearly unlimited bandwidth, then these server-tuning issues start to become important. I think it is a good idea to focus on how applications are built and used when thinking about performance of servers. Too often, the sole focus is "can I do task X" and not "what is the most-efficient way I can do task X".

    A nifty article, all in all.

    GF.
  • Mad propz for surviving a slashdotting and everything, but:

    "an overall average CPU utilization of 21% for a modest 550 MHz uniprocessor machine is not too shabby."

    Firstly, when said CPU is an UltraSparc II, then 550Mhz is anything but modest. Secondly, I would not expect the CPU to be busy during a slashdotting; it would be hanging around waiting for the disk drives and network card to come up with something useful.

    • Unfortunately, the comment on the processor isn't quite right.

      The UltraSparc II only goes up to 480MHz, and the UltraIII starts at 750. In between is the grey area of the IIe and IIi, and the ONLY Sun box with a 550MHz processor is the SunBlade 100/150.

      If that's their web server, then the CPU is the least of their worries--the thing has internal IDE drives, two (only) 33MHz narrow PCI slots, and not much else. Assuming that one of the PCI slots is used for a faster and/or redundant network connection (QFE card most likely), then the other one is the only connection to SCSI disks. That CPU, low-end as it is (for Sun), is definitely going to spend its time waiting for the rest of the system.

      (And yes, I know that was your second point--I just wanted to back it up with some detail)
  • We Win! (Score:4, Funny)

    by PhoenxHwk ( 254106 ) on Friday January 17, 2003 @02:05PM (#5103428) Homepage
    "The page cannot be displayed".

    The lesson here is: put your money where your mouth is and you may end up eating it.
    • Connect failed


      Your request for http://www.aceshardware.com/read.jsp?id=45000241 could not be fulfilled, because the connection to www.aceshardware.com (216.87.214.213) could not be established.

      This is often a temporary failure, so you might just try again.

And if you do get a page, not many of the images are loading. They think they are hot for surviving a slashdotting that peaked at 1:30 AM, and the article telling the rest of us how to do it gets slashdotted :)
  • Updating Cache data (Score:2, Interesting)

    by andawyr ( 212118 )
    But here's the rub, and he mentions it at the end of the article:

Of course, there are more complex applications where data caching can be implemented, such as discussion forums where multiple users can be adding, editing, and deleting messages simultaneously. But that's a topic for another article.

Most of the applications I write involve updating data almost as often as fetching it from the database. In an environment like Apache where you have individual processes serving content (and database connections are process-centric), implementing caches that are updatable becomes a very complex exercise without adding an additional layer.

eToys used a b-tree (Sleepycat?) database situated in front of the database layer - they would store objects in the b-tree, and fetch them from there if they had not expired. One cache amongst all the servers made this worth doing; a Java web server can do something similar, since the objects are stored in memory shared between the various serving threads. The end result is similar to what Ace's Hardware has done.

    What have other people done? Since I use Apache, I'm leaning towards a disk-based caching system.
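
    For what it's worth, in a single JVM the simplest version of an updatable cache is just a map shared by the serving threads, with the write path invalidating the affected keys. A rough, hypothetical sketch (the class and method names are made up, not from eToys or Ace's):

    // Hypothetical in-process cache shared by all serving threads in one JVM.
    import java.util.concurrent.ConcurrentHashMap;

    public class ArticleCache {
        private static class Entry {
            final String html;
            final long loadedAt;
            Entry(String html, long loadedAt) { this.html = html; this.loadedAt = loadedAt; }
        }

        private final ConcurrentHashMap<Integer, Entry> cache = new ConcurrentHashMap<>();
        private final long maxAgeMillis;

        public ArticleCache(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis; }

        public String get(int id) {
            Entry e = cache.get(id);
            if (e == null || System.currentTimeMillis() - e.loadedAt > maxAgeMillis) {
                // Miss or stale: rebuild from the database and remember the result.
                // (Two threads may occasionally rebuild the same page; that's harmless here.)
                String html = renderFromDatabase(id);
                cache.put(id, new Entry(html, System.currentTimeMillis()));
                return html;
            }
            return e.html;
        }

        // Called by whatever code path adds, edits, or deletes an article, so the
        // next read rebuilds the page instead of serving a stale copy.
        public void invalidate(int id) { cache.remove(id); }

        private String renderFromDatabase(int id) { return "<html>...</html>"; } // placeholder
    }

    With Apache's process-per-request model you can't share a map like this across requests, which is exactly why the parent ends up looking at a disk-based (or Sleepycat-style) layer instead.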

  • We didn't get them the first time, but that's okay. Just buck up and let's give them another shot! If they don't go down the first time, just keep hitting 'em until they do go down. You've got to give it a full 150%, you hear me? Now get out there and click some links!

    Go team /. go!
I hope they share the information on how and why their server got trashed this time one of their stories appeared on Slashdot. That would give us both success and failure, excellent edutainment!
  • by pjrc ( 134994 ) <paul@pjrc.com> on Friday January 17, 2003 @02:10PM (#5103486) Homepage Journal
    To point out the obvious, it looks like they're slashdotted.

    Reloading their page a couple times (2nd page of the article, not the one slashdot linked to), I'm getting occasional 503 errors, and the rest are taking a very long time to load. Usually the page comes up with some "broken" images that didn't load.

At the bottom of each page, there's a number that seems to indicate the time they believe their server spent serving the page. Usually it says something like "2 ms" or "3 ms"... That may be how long their code spent creating the html, but the real world performance I see (via a 1.5 Mbit/sec DSL line) is many seconds for the html and many more for the images, some of which never show up, and sometimes a 503 error instead of anything useful at all.

So, Brian, if you're reading this comment (which will probably be worthy of "redundant" moderation by the time I hit the Submit button)... it ain't workin' as well as you think. Maybe the next article will be an explanation of what went wrong this time, and you can try again???

  • I would paste the contents of the page in this post, because their page is slashdotted at the moment...I think that would be a bit ironic, though. ;)
  • It is not off line yet but it was not ready for the slashdotting. That box is slow as helllllll...

  • by backtick ( 2376 ) on Friday January 17, 2003 @02:18PM (#5103565) Homepage Journal
We have OLD Cobalt Raq3's (300 MHz AMD K6, 128 MB RAM, single IDE drive) running the latest Cobalt OS, and we JUST had one of these boxes get hammered this week; in a 12 hour period, it handled 625,000 hits (mostly CGIs, but it had a reasonable amount of static content), and at the same time handled 35,000 POP requests, sent 4,500 emails, and did some other random functions (and things like hostname lookups are enabled for weblogs, FTP uploads are happening for weather-site webcams that were associated with the heavy traffic, etc, so there's obviously not a huge amount of "tune it till it's ONLY gonna do one thing" going on here). Now, the box was taking a whipping compared to its normal load, but c'mon. I can't say the "Poor little 550 MHz UltraSPARC story" makes me tear up :-)
  • OLD ARTICLE (Score:3, Informative)

    by mgkimsal2 ( 200677 ) on Friday January 17, 2003 @02:18PM (#5103570) Homepage
    Tuesday, November 27, 2001 8:07 AM EST
    ------
    It was published over a year ago, and undoubtedly was based on their spring/summer 2001 trials. Even then this info wasn't revolutionary, and is even less so now.
  • They've really simply discovered that dynamically generating essentially static content is a bad idea : the 'dynamic' pages they are talking of are just articles which once written stay the same, and so are serving identical pages to each user.

Using scripting with database lookups to create such pages is obviously not good - much better is to compile your data into static pages and serve those. I have done this for my own website using XSLT to generate the html pages with consistent links and menus etc. - but you do have to remember to re-build it after making any changes or adding new content (I use gnu make to handle the dependencies of one page upon another so it doesn't rebuild the entire site every time.)

They've taken the alternative approach of still using a database for the requests, but then caching future requests for the same page IDs, which has the advantage of being compatible with their original dynamic generation system, but they don't mention how they handle the dependency / cascading alterations problem if they change the content (though they could always flush the entire cache of course....).

Neither of these approaches can help you though if you have real dynamic pages where every request is unique or there are too many possible pages for caching to be feasible (for example amazon or google).
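
    For the "compile your data into static pages" approach, the core can be as small as re-rendering a file whenever the content changes and letting the web server serve it as a plain file. A hedged sketch in Java (the output path and the render step are hypothetical); it writes to a temporary file first and then renames it into place so readers never see a half-written page:

    // Hypothetical "bake to static HTML" step, run whenever an article changes.
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class StaticBaker {
        public static void bakeArticle(int id, String renderedHtml) throws IOException {
            Path out = Paths.get("/var/www/htdocs/articles", id + ".html");
            Files.createDirectories(out.getParent());
            // Write the new version next to the old one, then swap it in.
            Path tmp = Files.createTempFile(out.getParent(), "article", ".tmp");
            Files.write(tmp, renderedHtml.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, out, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    The dependency problem the parent mentions still has to be handled by whoever calls this (re-baking every page that embeds the changed content), which is what the make-based dependency tracking buys you for free.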

Google I'll take, but I think that even Amazon has a cacheable amount of data. Theirs probably changes once every day, though stock on hand could change more often. Vignette takes a similar caching approach, where each dynamic page can be cached on the webserver. You can have certain modules that are always dynamic or have everything static. Once it's served up once, it saves the html and can re-serve that same page again statically.
  • by ledbetter ( 179623 ) on Friday January 17, 2003 @02:24PM (#5103621) Homepage
I remember reading that original article, and yes, I was impressed at the responsiveness of the server. But before they are congratulated so much, consider this. The original story was posted on slashdot at 1AM.. so the initial spike of activity resulting from the link being in the top few on Slashdot was directly proportional to the number of people on Slashdot at the time. As you can see from their graphs (if they're showing up for you), traffic spiked, then continued on during the day.

    This time around, the link got posted at 2PM not 1AM, and so far as I can see, they handle this flurry of hits much less gracefully than the previous ones! There are a lot more people online at 2PM than 1AM (all arguments of nocturnal nighthawks and people in other time zones aside).
  • for the "slashdot effect"....if nothing else, this article gives me a good idea of number hits / time period. My little domain's web server runs on a 70MHz sparc 5, and I wondered what it could take if only serving *static* pages (which most of mine are)........so if I put an article on there with a 100k or less page and submitted to slashdot...hmmmmmm, maybe
  • by Jahf ( 21968 ) on Friday January 17, 2003 @02:31PM (#5103675) Journal
    Seriously ... the numbers aren't that great. I used to admin a DEC Alpha Digital Unix server running at a whopping 300Mhz and it routinely served over 1.5M hits per day along with email, authentication and accounting for over 5,000 people and we rarely if ever saw it over a 0.5 load average. This was 4 years ago.

    It's not apples to apples, since we weren't serving the same set of pages (we had around 500 personal homepages, each with a varied combination of static HTML, images and CGI programs) but honestly, if the numbers in this article are supposed to be impressive, we've grown too accustomed to web server feature bloat.
  • by mmcshane ( 155414 ) on Friday January 17, 2003 @02:32PM (#5103677)
    Queuing approaches have proven to be much more scalable in other areas - no reason to think it wouldn't work for web servers. Check out SEDA: An Architecture for Highly Concurrent Server Applications [berkeley.edu] for a working implementation in Java that outperformed Apache [insert benchmark caveat here].

    More on event-driven servers that minimize data copies and context-switching here [pl.atyp.us].
  • by pcraven ( 191172 ) <paul&cravenfamily,com> on Friday January 17, 2003 @02:34PM (#5103692) Homepage
    Some static story pages? Who cares?

    It all depends if you are actually doing something of interest.

    Like the comments in Slashcode, most apps go from static, to dynamic, to static caching of dynamic pages.

    At DTN we served up customized portal pages to people with commodity and equity quotes, news, graphs, etc. Since they didn't have any money we had to use a load balanced Pentium Pro and a Pentium II. The app had no problem serving the load, and it was fast.

    Now that I work for companies that have money, our apps run really slow. Developers get expensive machines and don't know how to optimize any more.
Notice that on their site the graph indicates that the page was linked on Slashdot at around 1:30 AM. Their actual peak traffic wasn't until 8 hours later. So instead of being truly hammered by the /. effect, they were able to watch it trail off through the day, probably with a lower number of hits than they would have had if, say, the referring article were posted mid-day. I suspect that if the referring link was posted on /.'s front page during a peak traffic time things would have been far heavier. Basically, as far as Slashdotting is concerned, this was probably a best-case scenario. How many people actually scroll down more than a fold's worth when reading /.? They probably missed out on a huge Slashdotting because of that. Lucky them.
  • He says you should avoid tying up database connections in processes that aren't using them. With mod_perl we do this by using a reverse proxy. You could do the same with PHP. He also says you should cache. Well, duh. It just seems odd how he puts this in terms of "Java saved us" when in fact these techniques are universal and any experienced developer would be using them by now.
  • but we are back, stronger than ever!
It's nice to see an article like that, just what I was looking for [slashdot.org]
  • Ace HW needs a clue (Score:4, Informative)

    by LunaticLeo ( 3949 ) on Friday January 17, 2003 @04:04PM (#5104280) Homepage
    Ace's Hardware needs to research real servers before talking about their "scalable" servers. Their numbers are really saying that their box performs like a dog.

For those of you interested in this topic, here are a few pointers and words of wisdom.

Server scalability and performance have three basic metrics: throughput (URLs/sec), simultaneous connections, and performance while overloaded. Of course, you could add latency, but I'd argue that with the correct design latency is directly proportional to the real work you are doing; bad design inserts arbitrary waits.

I know of an HTTP Proxy by a large ISP that does user authentication & URL authorization (re: database), header manipulation, and on-the-fly text compression at 3000 URLs/sec for 2000-4000 simultaneous connections and maintains that performance under load by shedding connections, all this on a dual 1GHz Intel PIII box running an Open Source OS that starts with "L". That is a maximum of 260 Million URLs/day, three orders of magnitude greater performance than Ace's Hardware's stats.

    The simple answer to the question "How do I create a scalable fast network server?" is Event-driven GOOD & Threads BAD. Event driven network communication is two to three orders of magnitude better performing than thread/thread-pool based network communications. See Dan Kegel's C10K web page [kegel.com]. That means you must use non-blocking IO to client sockets and databases. Once you accomplish that small feat, dynamic content just consumes CPU; with 2.8 Ghz Xeon processors you have plenty of cycles for parsing HTML markup or whatever. Threads cause cache thrashing, and context switching. While thread programmers don't see the cost in their code, just read the kernel code and you'll see how much work HAS TO BE DONE to switch threads. Event driven programming just takes some state lookups (array manipulation) and a callback (push some pointers onto the stack and jump to a function pointer).

Design is FAR MORE IMPORTANT than which runtime you use (execution tree, byte code, or straight assembly). I have done some very high load network programming with Perl using POE [perl.org].

    Python has Twisted Python [twistedmatrix.com]

Java has java.nio [sun.com] and the brilliant event/thread hybrid library SEDA [berkeley.edu] by Matt Welsh.

I am also looking into the programming language Erlang [erlang.org], which builds concurrency and event-driven programming into the language. Further, Erlang is used by some big telco manufacturers to great effect (high performance and a claimed nine-nines, 99.9999999%, reliability on a big app).
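
    To make the event-driven point concrete in Java (the language the site in question runs on), here is a bare-bones java.nio selector loop. It is only a sketch -- it just echoes bytes back and ignores partial writes -- but it shows the single-threaded, non-blocking structure being advocated, as opposed to one thread per connection.

    // Minimal single-threaded, non-blocking event loop using java.nio.
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class TinyEventLoop {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(8080));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (true) {
                selector.select();                         // sleep until something is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client == null) continue;
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        int n = client.read(buf);
                        if (n == -1) { client.close(); continue; }
                        buf.flip();
                        client.write(buf);                 // echo; a real server parses HTTP here
                    }
                }
            }
        }
    }

    A real server would plug request parsing and response generation into the read path (or use a hybrid like SEDA, mentioned above, to farm CPU-heavy work out to a small thread pool).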
  • What I find much more interesting than this article is the way Slashdot handles the massive load. I might be a little off, but I believe Slashdot essentially has the main page updated every minute (five minutes?), so if you just load the main page, you're getting a static document, which is *much* faster.

I've always thought more sites should do this. Why not have the pages you can get away with be static (updated every couple minutes for a 'real-time' feel), and only have the pages that need to be dynamic be generated on the fly? I was playing with ab (the Apache benchmark tool) on one of my computers, and I couldn't believe the difference -- loading a static page, I got something like 100,000 hits (I don't remember the time period); PHP got about 5,000 in the same (unknown) time period. My numbers could be off, but assuming they're not, it would be 20x more effective to have the page generated every few minutes and saved as a static page, at least for high traffic sites. (For low traffic sites, this could probably consume *more* resources...)
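
    A hedged sketch of that "regenerate it on a timer" idea in Java (the output path and render method are made up): a background task re-renders the front page every minute and writes it where the web server picks it up as an ordinary static file, so the expensive dynamic work happens once per interval instead of once per hit.

    // Hypothetical periodic regeneration of a mostly-static front page.
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class FrontPageRefresher {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                try {
                    String html = renderFrontPage();       // the expensive dynamic part
                    Files.write(Paths.get("/var/www/htdocs/index.html"),
                            html.getBytes(StandardCharsets.UTF_8));
                } catch (Exception e) {
                    e.printStackTrace();                   // keep the timer alive on failure
                }
            }, 0, 60, TimeUnit.SECONDS);
        }

        private static String renderFrontPage() {
            return "<html><body>latest stories...</body></html>"; // placeholder
        }
    }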
  • 4 ms? (Score:3, Interesting)

    by suwain_2 ( 260792 ) on Friday January 17, 2003 @04:10PM (#5104326) Journal
    The load time, copied-and-pasted, from the bottom of their site:

    138974 ms

A little over their 4 ms goal. 138,970 ms over, to be specific.

Basically, the whole article has one message: you should cache stuff. I couldn't agree more. Why do a database request every time a page is hit, even if you're going to show the same information, say, 1000 times? By combining dynamic and static elements, the "server load" part of the slashdot effect can be eliminated. I think slashdot also does this, but differently.

    Obviously, if you don't have enough bandwidth, you are screwed anyway, but usually it's the server load that is the problem.

Regards, shurdeek
  • by aussersterne ( 212916 ) on Friday January 17, 2003 @04:19PM (#5104396) Homepage
    As a bunch of people have pointed out, it is unlikely that the /. effect is a matter of "crashing" servers. It is much more likely that most of the "slashdotted" sites on the front page on a given day involve a server which is doing just fine and a bandwidth pipe which is seriously about to burst.

    You can saturate most any small-business-affordable pipe with a Pentium classic machine as a Web server. Or to put it another way, there's no point sticking a dual-P4-Xeon Web server with 4GB memory and a RAID-5 on a DSL line.

    The computer I'm using right now (a PIII system) could run Apache very nicely in the background and would likely survive quite a hitrate without too much trouble. But if even just a few thousand people were to hit it all at once, there would be a traffic jam, some people wouldn't get served, and the ISP would probably close me down, because I'm only sitting on a 256k pipe.
Whatever you can say about them being slashdotted, they are apparently squeezing the max out of their box; the site went down while they did some tuning and it's now back online. According to the article, they were allocating only 1GB out of the 2 available for their current needs; they probably had to use the whole 2GB to survive /. So, kudos to them, they survive at peak time with a few links from the Slashdot front page. Time to finish reading the NSF (not so f..) article!
The best attack is thttpd's bandwidth throttling.. I have seen thttpd take a sound pounding serving pages and it happily throttled back everyone to a dull roar. BUT serving a nice steady 30 pages a second is nothing... when you get 90,000 requests a second in bursts, especially when the story first hits the front page...

    Nothing but a gigabit ethernet connection can even come close to handling that.. and last time I checked a T-1000 line was not an option on internet-1
  • by rufusdufus ( 450462 ) on Friday January 17, 2003 @05:55PM (#5105039)
    This story is dopey. If you have a web server and it is hitting a CPU bottleneck, you have done something wrong.

    Ok, if the server actively plays chess against a hundred people, I'll let you be cpu bound.

  • by bloxnet ( 637785 ) on Friday January 17, 2003 @06:49PM (#5105334)
I hate to do this and get into some kind of "look at my l33t skills" type thing...but seriously, those numbers are just nothing to be impressed with. As several people have pointed out, usually the limitation on a well configured server is the bandwidth available.

    I have a buddy who runs a few adult sites, and I keep his machines updated, optimized, etc, etc. On one web server alone, by simply rebuilding Apache with a higher HSL and streamlining down to only essential services, this *one* server is handling an average of 16,000,000 hits, 5,000,000 pageviews, and 450,000 unique visitors per day. In fact, only last month did we set up a separate database server in anticipation of him getting even more traffic (I wanted to separate the web server from the db server, esp. if we were gonna move to load balancing)...even still, the cpu load was consistently low and the site was/is serving dynamically generated content (php), all driven by a mysql content management system. I have yet to even max out the usage of the server and do some ulimit type stuff or hard adjustments via kernel changes.... so what is the big deal about this article?

    I think it would be good to put up an article about how to optimize your web servers, both in layout and actual configurations, to allow for Slashdot levels of traffic. I doubt this will happen, just like the suggestions to mirror content on featured stories to help ease bandwidth, or other similar ideas. The saddest part is that once you spend the time to really optimize a machine or machines...it takes far less time to maintain them.
