Apache Software

Handling 1,000,000 Hits a Day? 22

Mr. Ed asks: "Hi! We run a very busy subscription-based service and we receive about 1,000,000 hits per day. My question is what is the best way to deal with this? DBM based access isn't very flexible and so I would like to switch to MySQL/mod_auth_mysql but MySQL doesn't seem to be able to deal with the load very well. What is everyone else doing? Am I going to have to go to a closed commercial solution or is there some open source software that can handle this sort of a beating? Thanks in advance!" We've talked about optimizing Apache/MySQL in a production environment, but this sounds more like a configuration problem. Thoughts?
This discussion has been archived. No new comments can be posted.

  • Unless their subscribers are either very well regulated or very well distributed around the world I doubt that your calculation is correct. Simply saying 1,000,000 hits per day is not enough information to do the queuing calculations, but I think it's safe to assume that hits will not be evenly distributed throughout the day and night.

    At a rough guess, I'd say that a site receiving 1,000,000 hits per day needs to support at least 5000 hits per minute if they don't want to be unavailable at peak times.
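To make the arithmetic concrete, here is a minimal sketch of the even-spread average and of how far above it the 5000 hits/minute guess sits. The function names are made up for illustration, and the only inputs are the figures quoted in this thread.

```python
def average_hits_per_minute(hits_per_day: int) -> float:
    """Average rate if hits were spread evenly over 24 hours."""
    return hits_per_day / (24 * 60)


def peak_to_average_ratio(hits_per_day: int, peak_hits_per_minute: float) -> float:
    """How many times the even-spread average a given peak rate is."""
    return peak_hits_per_minute / average_hits_per_minute(hits_per_day)


if __name__ == "__main__":
    avg = average_hits_per_minute(1_000_000)
    ratio = peak_to_average_ratio(1_000_000, 5000)
    print(f"average: {avg:.0f} hits/minute")
    print(f"a 5000 hits/minute peak is {ratio:.1f}x the average")
```

In other words, the 5000 figure assumes peaks of roughly seven times the daily average; the real ratio has to come from your own logs.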
  • AOLserver [aolserver.com], if you're not using many Apache-specific things. PHP4 allegedly works with it.
  • I thought that it was the HP Vectras running the "Unstoppable" Windows NT.
  • I run a site that gets up to 3,000,000 hits a day, and I disagree with the idea that you should try to use as much static content as possible. The problem with that is maintaining it. I try to make all my sites customizable by using templates and database lookups. This way, when I need to make a change, I just go to one spot.

    What I highly recommend is getting a big hairy box for your DB server. Maybe something like a Sparc 4500 if you've got the cash. Maybe a 4-processor x86 box if you don't. Evaluate your database products thoroughly. You need to know how the product performs in the way you are using it, not what someone tells you is the best. MySQL is free of course, and you can get evals for DB2, Oracle, x10, etc... Spend some serious time here.

    Spread out your webservers. Your webserver shouldn't run on your DB server. Cheap webservers are, IMHO, the best way to go. Need more power? For under $2000 you should be able to get a high-quality rackmount with SCSI.

    RAID on the DB server, not the web servers. The web servers themselves should be hot-swappable. Keep at least one more in your round robin than you need.

    Additionally, you should implement some sort of caching system on the webserver. You could write an Apache module to do this, if one isn't already out there. Or you could go for ColdFusion, now that it runs on Linux. ColdFusion has caching built in, and it's fast.
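As a rough illustration of the caching idea (not an Apache module, and not how ColdFusion does it), here is a minimal time-based page cache in Python. The class name, the `render` callback and the choice of a fixed time-to-live are all assumptions made for the sketch.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep rendered pages in memory for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_render(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]          # still fresh: no database work at all
        page = render()              # missing or expired: rebuild once
        self._entries[key] = (now, page)
        return page
```

With a 60-second lifetime, a page that takes one database query to build costs at most one query per minute per web server process, however many hits it gets.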

    Finally, get a good load testing program that allows you to put a simulated load on the group of webservers. This will allow you to know your limits.

    Remember, you have to evaluate your situation yourself, and get some trials of products and see how they work with your stuff. That counts a whole helluva lot more than the benchmarks they release.

  • ...Phil and Alex's guide to web publishing

    http://www.photo.net/wtr/thebook/
  • by johno ( 37036 )
    I'd do what slashdot itself does.... Apache with MySQL (reference [slashdot.org])

    I'd say they deal with a comparable load, too...

  • Well, apart from all the software/content tweaking explained above, how about simply upgrading the hardware? Like:

    - a separate DB server, tweaked for MySQL and connected to the web server through dedicated 100 Mbit Ethernet.
    - both servers could have 2 fast P3 CPUs and a 10,000 rpm Ultra2 SCSI hard drive (or several, so that you can spread different DB tables across different drives and get simultaneous access). Add plenty of RAM.

    All this hardware is rather cheap, can be found anywhere, and can take some load. After all, with Moore's law being what it is, there's no need to depreciate your Web content with static stuff...
  • Lemme check... 1,000,000 hits/day = 41,666 hits/hour = about 700 hits/minute.
    That should not be too much for a decent server with enough RAM.

    Be sure to:
    - use an index on the username column! :)
    - use a socket connection instead of TCP/IP
    - make sure that max_clients(mysql) >= max_clients(apache)
    (this one is important!)

    Do the MySQL processes eat much CPU time?

    You could also give MySQL more memory; the default is about 20 MB or so.

    Michael
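For anyone wondering what the index on the username column buys, here is a small self-contained sketch. It uses Python's built-in SQLite as a stand-in for MySQL (an assumption made only so the example runs anywhere), and the table and column names are hypothetical. MySQL has its own `EXPLAIN` statement for the same kind of check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password_hash TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(f"user{i}", "x") for i in range(1000)],
)


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would run the login lookup."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT password_hash FROM users WHERE username = ?",
        ("user500",),
    ).fetchall()
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)   # without an index: scans the whole table
conn.execute("CREATE INDEX idx_users_username ON users (username)")
plan_after = query_plan(conn)    # with the index: looks the row up directly

if __name__ == "__main__":
    print(plan_before)
    print(plan_after)
```

Since the authentication lookup runs on every protected hit, going from a table scan to an index lookup is usually the cheapest fix on the list.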
  • How do you force it to use a socket instead of TCP/IP in the Perl modules?
  • ...however you may want to separate scripts that do database access and ones that don't into different "Applications" in fhttpd config to reduce the number of persistent connections to the database.
  • AOLserver has really nice Tcl support. Some people would prefer scripting in Tcl to PHP4. Just thought it should be mentioned.

  • Far more information is required for such a question. Are you running it on a 486 or a quad Xeon? Are you utilising clustering capabilities? Do you have SCSI or IDE disk drives? How much RAM do you have? What is the nature of your application? Is your SQL designed well to fit a MySQL-style structure? (I.e., are you attempting to do heavy relational work, which is not MySQL's strong point, or do you have it well designed for the shallow style at which MySQL excels?) Is your SQL efficient? Do you have MySQL set up properly? (Do you have all those performance-hitting log functions turned off?) What software are you using for your front end: PHP, Perl, C? Do you use persistent database connections? Do you have the database on the same machine as the web server?

    I could go on for hours. The simple fact is that this question (1) requires far more information and (2) should be posted to the MySQL list rather than to /.
  • by Hiro ( 20938 ) on Wednesday February 02, 2000 @07:53AM (#1311138)
    At work, we use an old PHP version, but we handle more than half a million hits/day easily. Here are my tips:

    1) separate db server & web server
    2) if possible, get some load balancing system for web servers
    3) go static! static! static! Avoid hitting MySQL as much as possible
    4) mod_auth_mysql is okay, but basic authentication is not well suited to the average Joe user (use cookies and sessions to track logins)
    5) tweak MySQL as much as you can
    6) buy enough RAM for Apache; its processes eat up a lot
    7) use the fastest HD you can (no need for striping or anything complex, because that adds more things to worry about)
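Tip (4) can be sketched roughly as follows: after one real login, hand the browser a signed cookie and verify the signature on later hits instead of querying the database each time. This is a minimal illustration only. The cookie layout, the function names and the `SECRET_KEY` value are assumptions, and a real version would also carry an expiry time.

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"change-me"  # hypothetical; load a real secret from configuration


def _sign(username: str, key: bytes) -> str:
    return hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()


def make_session_cookie(username: str, key: bytes = SECRET_KEY) -> str:
    """Build 'username|signature' after a successful login."""
    return f"{username}|{_sign(username, key)}"


def verify_session_cookie(cookie: str, key: bytes = SECRET_KEY) -> Optional[str]:
    """Return the username if the signature checks out, otherwise None."""
    username, sep, signature = cookie.rpartition("|")
    if not sep:
        return None
    expected = _sign(username, key)
    # Constant-time comparison; compare bytes so odd input cannot raise.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None
```

Verifying the cookie is pure CPU work on the web server, so the database is only touched at login time.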

    I think (3) is _the_ most important factor. Wherever possible, use a batch program to spew out HTML files. Although PHP & MySQL can handle heavy hits, I believe static HTML is more responsive (much quicker to show up in your browser). Many users hate waiting, ya know?
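A batch program of that kind can be very small. The sketch below is one possible shape, with a hypothetical template and function name: it renders a page and swaps it into place atomically, so the web server never serves a half-written file. In practice the title and body would come from the database and the script would be run periodically.

```python
import os
import tempfile
from string import Template

PAGE_TEMPLATE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def publish_static_page(path: str, title: str, body: str) -> None:
    """Render the page and atomically replace the file the web server serves."""
    html = PAGE_TEMPLATE.substitute(title=title, body=body)
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
        os.chmod(tmp_path, 0o644)    # mkstemp files start out private to the owner
        os.replace(tmp_path, path)   # readers see the old file or the new one
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Note that the sketch does no HTML escaping, so it is only safe for content you generate yourself.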
  • by malice95 ( 40013 ) on Wednesday February 02, 2000 @11:42PM (#1311139)
    Depending on your budget you can do quite a number of things..

    The poor/freeware route

    1. Separate static content from dynamic content. Try to make as much content static as possible. Does that page need to be built on every hit? Maybe it could be built from cron every 10 minutes instead?

    2. Set up static content on one server or port (on the same machine) and dynamic content on another, with multiple Apache servers running.

    3. Tons of RAM! Cache as much as you can in RAM.

    4. If you have another box, set it up as a reverse caching proxy server using Squid to cache the static content. Avoid using the no-cache tag if at all possible so clients can cache locally.

    5. Tune MySQL/Apache to death. You can do a lot to MySQL/Apache to improve its speed; see the docs.

    6. Optimize all SQL queries. Index everything important to your site even if it requires separate tables. Indexing saves a LOT of time.

    7. Make sure you use something like mod_perl to maintain persistent connections to the database so a separate connection isn't made for every single hit.

    8. reduce your images down as much as possible and try to keep the site layout simple.

    9. Make sure apache is not doing dns lookups for each hit. Reduce apache and mysql logging if possible.

    10. Put the MySQL server on a separate box if possible, with at least 100 Mbit between it and the web server(s).

    11. Compile everything from scratch with all optimizations for your processor/platform. Only include modules that you really NEED.

    12. DO NOT use .htaccess files at all.

    13. And don't forget security in all of this.

    Layer security. MySQL should have TCP wrappers installed, and a firewall should protect the site as well. Don't do NAT on the firewall unless REALLY needed. Security is a whole other arena, but very critical as well. If the site is popular you WILL get attacked many times per day. Heck, my server is dinky and basically unknown, and I get hit 4-5 times a day by port scans.

    14. Make sure all your servers are in the /etc/hosts table of all the others. File access is faster than DNS lookups. Heck, using IPs instead of names would be best, although it can be a pain with a lot of machines.

    15. Tune your OS as much as possible. There are quite a number of Linux tuning sites out there.

    16. Set up RAID 1 or 0+1 if possible on wide SCSI drives with multiple controllers (at least one for each side of the mirror).
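Item 7 in the list above (one database connection kept open per server process rather than one per hit) can be illustrated with a short sketch. This is not mod_perl: it is plain Python with the built-in SQLite module standing in for MySQL, and the class and function names are hypothetical.

```python
import sqlite3
from typing import Optional


class PersistentConnection:
    """Open the database connection on first use, then keep reusing it."""

    def __init__(self, database: str = ":memory:"):
        self.database = database
        self.opened = 0  # counts real connects, only to make the reuse visible
        self._conn: Optional[sqlite3.Connection] = None

    def get(self) -> sqlite3.Connection:
        if self._conn is None:
            self._conn = sqlite3.connect(self.database)
            self.opened += 1
        return self._conn


shared = PersistentConnection()  # one holder per long-lived server process


def handle_hit(username: str, holder: PersistentConnection = shared) -> int:
    """Hypothetical request handler: runs one query on the shared connection."""
    row = holder.get().execute("SELECT length(?)", (username,)).fetchone()
    return row[0]
```

The point is only that connection setup is paid once per process; with a real database server the saving per hit is a network round trip plus the login handshake.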

    Now, the route if you have some bucks to spend:

    1. Everything above, plus put everything on separate servers with 100 Mbit between them (switched if possible).

    2. IMHO two 400 MHz CPUs (SMP) are better than one 600 MHz CPU when using Apache, since it serves hits from many processes at once.

    If you want to build a site like slashdot then..

    1. Load balancing! Get a load balancer like a BIG-IP product (F5), and plenty of back-end web servers.

    2. Pick up Sybase Replication Server and replicate your content to multiple databases. I am told it should work with other databases besides Sybase. Load balance the database queries then as well.

    3. Get a T3 at least, with a 3 Mbit backup line.

    Document everything you do and write a nice article for Slashdot so others can have a reference on how to hit 1 million plus hits per day :) And you get free advertising at the same time.

    Malice95
  • by Matts ( 1628 ) on Friday February 04, 2000 @03:58AM (#1311140) Homepage
    I worked for a while at an extremely large web site (in excess of 50m page views a day). Here's how they cope:

    Distributed servers. Their content is served from several different servers around the world. Static content is distributed with a simple script that copies static content to other servers. I think this is only really necessary when your hits reach the scale of Yahoo (although it wasn't Yahoo I was working at).

    shtml. Server side includes provided enough templating facilities to "get by" for most content.

    No fluff. Cookies and javascript were mostly banned. You had to get extra special permission to use either.

    Simple Perl CGI. Although the content wasn't particularly dynamic, simple Perl CGIs can go an awfully long way, and often scale better than most people assume for simple scripts.

    The questioner's comment, though, related to scaling authentication to 1m hits/day. So let's deal with that.

    I'm pretty certain that mod_auth_mysql will provide enough for you. You don't need locking, transactions or any fancy facilities. So MySQL's raw speed will do you just fine. Handling DBMs for that many users or hits is just going to kill you.

    If that doesn't work, consider writing your own authentication handler, either in C, or pick up mod_perl.
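To give a feel for what a hand-written authentication handler has to do, here is a minimal sketch in Python rather than C or mod_perl. It parses an HTTP Basic `Authorization` header and checks the password against a salted hash. The function names and the in-memory `users` dictionary (a stand-in for the database lookup) are assumptions for illustration.

```python
import base64
import hashlib
import hmac
from typing import Dict, Optional, Tuple


def parse_basic_auth(header: str) -> Optional[Tuple[str, str]]:
    """Extract (username, password) from an HTTP Basic Authorization header."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:               # bad base64 or bad UTF-8
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def authenticate(header: str, users: Dict[str, Tuple[bytes, bytes]]) -> bool:
    """users maps username -> (salt, stored_hash); a stand-in for the database."""
    credentials = parse_basic_auth(header)
    if credentials is None:
        return False
    username, password = credentials
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    return hmac.compare_digest(hash_password(password, salt), stored_hash)
```

A slow password hash on every hit is expensive at this traffic level, so a real handler would probably cache successful checks for a short time or switch to sessions after the first login.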

    To all the other posters going on about how you need Zeus or khttpd to serve that many hits - you obviously don't run a site taking that many hits. The benchmarks that show Zeus faster than Apache are talking about 200 million hits a day. I don't know anybody mad enough to try and do that on a single server. The reality is that Apache provides the right level of stability and configuration options and speed to suit almost every site out there.
  • by drix ( 4602 ) on Wednesday February 02, 2000 @01:56PM (#1311141) Homepage
    Well, this might be painfully apparent, but you'd be amazed how many times I've seen people do stupid things like turn their front page into an .asp script merely to have today's date on it (cron! I tell them). Anyways, make everything static, as much as possible. Stat out what pages are receiving the most hits and endeavor to make them as static as possible. This is what most big sites do - Yahoo receives well over 100 million hits daily and they sure as hell aren't dynamically creating anything besides search results. Any page that is not immediately created or modified based on a user request can be made static when used in combination with scripting.
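"Stat out what pages are receiving the most hits" can be done with a few lines over the access log. The sketch below assumes Common Log Format lines (the request in double quotes, as Apache writes by default); the function name is made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_pages(log_lines: Iterable[str], limit: int = 10) -> List[Tuple[str, int]]:
    """Count requested paths in Common Log Format lines, most-hit first."""
    counts: Counter = Counter()
    for line in log_lines:
        try:
            request = line.split('"')[1]   # e.g. 'GET /index.html HTTP/1.0'
            path = request.split()[1]
        except IndexError:
            continue                       # skip malformed lines
        counts[path] += 1
    return counts.most_common(limit)
```

Feed it an open log file, for example `top_pages(open("access_log"))`, and start with whatever comes out on top.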

    This being said, don't enslave yourself to Apache. There are much faster ways of shooting static text over a socket, namely Zeus and khttpd, the kernel HTTP server. It (khttpd) only serves static pages, but by placing it out of user space you get to bypass all the kernel gunk that accompanies a user-level process. Needless to say, it's really, really fast (especially compared to Apache [demon.nl]) and I have no doubt that even on a modest Pentium 1 you could crank out hundreds of thousands of pages a day using static HTML and khttpd. Also, it can be used in combination with another webserver, so you don't have to sacrifice any dynamic functionality. Check Zeus if you need more features. Either way, they're faster than Apache. Other than that, all the obvious advice applies - put the SQL server on a separate box and make SQL queries as sparingly as possible. If your site is image-laden, consider putting the images on a separate box. Or clone your boxes and employ load-balancing. Or just pray for Apache 2 to be released :) Lots of ways to skin the cat here.
