
Managing Site Growth?

Posted by Cliff
from the growing-pains dept.
markmcb asks: "I started a web site about two years ago. When it began it was simple. The code was 75% hacked, and administration was easy. However, the times they are a changin'. Now I get hundreds of thousands of hits and have a steady flow of new users. I'm noticing an ever-increasing gap between my site's popularity and its technological progression. Specifically, I have all sorts of 'XYZ for Beginners' books that are no longer of use to me. Even the so-called non-beginner publications seem to only scratch the surface of running a site. As problems get more complex, trying to Google every situation/issue I have with site administration has become less useful as well. I'm finding things like writing optimal code, configuring servers for high volume, balancing ad income vs. server costs, and maximizing the efficiency of my moderation team have all become issues that aren't addressed in most books. What is the best way for a low-income, non-professional, but enthusiastic web designer/administrator like myself to manage site growth as it leaves the realm of just-for-fun?"
This discussion has been archived. No new comments can be posted.

  • Offer him some Coke and M&Ms. Profit.
  • by Jerf (17166) on Wednesday August 02, 2006 @10:03PM (#15836728) Journal
    What is the best way for a low-income, non-professional, but enthusiastic web designer/administrator like myself to manage site growth as it leaves the realm of just-for-fun?
    Unfortunately, the only answers are either hire a professional, or become one.

    "Scalable" and "customized" are two things that when put together simply require a professional. And quite a lot of people calling themselves "professional" can't handle it, either.

    Now, by "professional" I don't necessarily mean a degreed guy who makes at least $X thousand a year with Y years of experience. What I mean is, you're stepping into the domain where you can't hardly acquire the experience and skills necessary with anything less than full dedication usually brought on by having a job in the relevant domain.

    There is, however, one other possibility for you to consider. If you analyze your needs and the available packages for your type of website, you may find that you can drop the "customized" aspect of it, if you can find a project close enough to your needs to require only minimal customization, perhaps even no actual code customization. Then you just need to import the data, and you will presumably have satisfied yourself that this package can meet your performance needs.

    If the website you are referring to is the "OmniNerd" site you have a link to, then I would imagine this should be feasible. There are a lot of "news" packages, free and otherwise, and at least on first blush I don't see anything particularly unique about it. It looks an awful lot like slash, although from what I've heard that's not the easiest thing to customize. (slash hackers feel free to comment.)

    Really, there's no excuse nowadays to start a new web framework from scratch, and your first impulse if your hack-job is starting to come apart at the seams should be to change to one of the umpteen bajillion tested, performant frameworks. Depending on your skill levels, which you did anything but talk up, you may even be missing basic pieces like caching, which is pretty important on a site like that. Non-professionals should not attempt to write website caching routines. Unless you want to go insane. (It's not that it's hard to write... it's that it's hard to get correct, and debugging cache problems is close to sheer hell.)
    • If it is OmniNerd he is talking about, he should definitely look at something like Scoop, Drupal, or Slash. Any of them should meet the needs of that site easily, and he will probably have a much easier time finding assistance (paid or volunteer). I've played around with Drupal a fair amount; it's easy to install and configure, and it has a lot of modules and themes. I'd definitely recommend he try it out first.
    • I'll echo that - I used some professional help to quickly set up a fast, scalable user-driven site using Plone [plone.org].

      I am now in the process of rolling out several more Plone sites on my own, with the help of a large and helpful Plone community. [plone.org]

      Designed to scale well, themable (CSS-based), and good caching options (squid and the new CacheFu [plone.org]).

      I have found Plone to provide a great foundation to build upon.

  • please disregard any suggestions from "CmdrTaco."
  • by Beuno (740018) <argentina@@@gmail...com> on Wednesday August 02, 2006 @10:27PM (#15836818) Homepage
    Taking a look at what others have done [wikimedia.org] to solve these issues seems like the best option.
    I think it's very unlikely you will find books in that area considering that when you reach a certain level of complexity technology changes too fast to make a book relevant.
    • by Anonymous Coward on Wednesday August 02, 2006 @10:55PM (#15836953)
      I agree with your point, but not your example. Wikimedia is built on MediaWiki, which is poorly written PHP+Memcached+MySQL saved by SquidCache. They need twice as many boxes as a well-designed architecture would need. Another bad example is MySpace, which is a ColdFusion/ASP.NET shop that adds dozens of servers a day to keep up with demand. Most people estimate that they need 5 times as many boxes.

      Adding more hardware is always a way to dig yourself out of a digg, but be careful you don't just look at how the big boys do it and think that's right. Smart people can do more with less. Look to Python, Perl, or C# (but not ASP.NET unless you really know what you're doing), which have mature libraries. PHP and ColdFusion suffer too much from the type of app that's built up and destroyed on every page load. That model doesn't encourage separating what only needs to be done once (app initialisation) from the page response, and that separation is a large part of optimisation (both in not doing things on every page load that should be done once, and in separating out the page data and having layers of cache between the DB and the web templating). Learn about HTTP headers and SquidCache.
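
The split between one-time app initialisation and per-request work described above can be sketched roughly as follows. This is only an illustration, assuming a long-lived Python process that serves requests through the standard WSGI calling convention; `load_settings`, `SETTINGS`, `INIT_COUNT`, and the template text are hypothetical names invented for the example, not anything from the sites discussed.

```python
INIT_COUNT = 0  # counts how often the expensive setup runs (for illustration only)

def load_settings():
    """Stand-in for expensive startup work such as parsing config or compiling templates."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {
        "site_name": "Example Site",
        "template": "<h1>{site_name}</h1><p>{body}</p>",
    }

# Runs once, when the long-lived server process first imports this module.
SETTINGS = load_settings()

def app(environ, start_response):
    """WSGI entry point: only per-request work happens here."""
    body = SETTINGS["template"].format(
        site_name=SETTINGS["site_name"],
        body="Path: " + environ.get("PATH_INFO", "/"),
    )
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

However many requests `app` handles, `load_settings` runs only once per process. A script that is rebuilt on every page load would pay that cost on every hit.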

      So far as software recommendations I'd say Python with CherryPy and Kid Templates (but not turbogears). It's fast and simple.

  • ...i thought this was going to be about your PHYSICAL site growth (ie. Datacenter stuff.)

    Maybe we need to have that discussion later... or tomorrow...
  • by mabu (178417) on Wednesday August 02, 2006 @10:39PM (#15836878)
    You haven't elaborated much on your situation. To be honest, the scalability and technology available to you have a lot to do with what platform you're using. The initial design of a data-driven web site ultimately determines how easily, securely, and efficiently it can evolve to meet changing needs and increased demand.

    Open source technology tends to be more scalable and solid, but even there, a bad choice stifles your progress. If money is no object, I guess you can always scale up, but the commercial platforms often have their admins spending more time patching and maintaining the status quo than progressing. The bigger question is: Did you do your homework when you initially designed the system? If you're stuck, that's likely the problem.

    If you have a choice to redesign or redeploy your site, what you need to do is ask yourself, not whether or not the technology you're familiar with can do what you want, but instead, are you using the right technology to do what you want?
  • by identity0 (77976) on Wednesday August 02, 2006 @11:56PM (#15837237) Journal
    1) Have fun.
    2) Allow posting comments on your stories.
    3) When people abuse comments, put in moderation system to stop them.
    4) Hire some random writers with axes to grind, like "Geeks are oppressed by stupid conformist society".
    5) When people abuse moderation points, throw in meta-moderation system to stop them.
    6) Hype yourself up, claiming you are part of some revolution in media that will bring control to the masses.
    7) Sell out to some venture capitalists.
    8) Abuse moderation and metamoderation system yourself when comments piss you off.
    9) Cover the site in more ads than Times Square.
    10) Stop putting any effort into the site whatsoever.
    11) Let people pimp their own blogs in story submissions.
    12) Charge money to preview stories so people can read links before hordes of visitors take sites down, a problem you caused in the first place.
    13) Charge money to view the site without ads.
    14)???
    15).... profit!

    As you can see, the main goal of the Slashdot Method is to cause problems yourself, then charge people money to fix them. But remember, the most important step is to have fun! Hope you have as much success as they did!
  • Go over to Rent A Coder or Elance, find yourself some really smart foreign techies who will each give you 2 weeks of his undivided attention for $500-600, and set them loose on tightening up your code and configs. It will be the best $1000 you ever spent.

    For strategic planning, go to SCORE (a volunteer group of retired executives who advise small business owners) and get a couple of advisors... one to show you how to break down and analyze the financial numbers, one to help you learn how to better manage y
    • by Anonymous Coward
      I'm afraid he's pretty much hit the nail on the head for this one.

      Programmer, Tester, DBA, System Admin, Application Admin, Content Admin, and User Wrangler. There's a lot to keep track of, and that's without managing the business side of it.

      As for things I've learned as part of a team managing a large site:
      (Note, we're J2EE, with a sizeable budget, so not all applies)

      Get a good load balancing solution in place early. Get a strategy for where and when to add servers. Assume you will need to scale by addi
  • Interesting question (Score:3, Informative)

    by stevey (64018) on Thursday August 03, 2006 @04:58AM (#15838069) Homepage

    I run a community website [debian-adm...ration.org] which is written in Perl with a MySQL back end.

    Despite having just under 5000 users I had 3 million hits last month, and shifted 13 Gb of traffic. Not bad for a single (dedicated) host!

    There are two things that I'd suggest above all:

    • Minimize database queries
    • Caching, caching, and more caching

    I use Danga's memcached [danga.com], which has a Perl interface, but there are PHP ones too. This allows me to sensibly cache database queries (don't forget to test things to make sure you expire the cache appropriately!)

    A combination of minimising queries and caching has kept me going even under a slashdotting.

    If you have written the site code yourself I'd urge you to add a test suite. My site runs a full test suite every day [debian-adm...ration.org], and I run it manually whenever I make changes - this allows me to be sure that I'm not breaking things when I make changes.

    Of course the standard development model of having a "live" site and a "test" site helps here too. I develop the code on a laptop and store it under version control (CVS in my case, but it doesn't matter which system you use as long as you pick one) and only when it has passed the test suite do I push it to the live site.

    Adding extra hardware can be an option for bigger sites, but I'm not at that point now. I had my biggest strain when the site reached around 1000 users, since then things keep ticking over nicely, and although it is growing it isn't growing terribly quickly which suits me fine. (There are a lot of users who visit the site via google searches and never register/return; I'd like to fix that, but I don't mind too much!)

    • One thing which I just remembered - so I'll post it now even if it is a little late - the single biggest speedup/optimisation I made to my site was to disable DNS lookups in the Apache logfiles.

      In the normal course of things this isn't a big deal, but try surviving under a /. attack whilst you're getting a ton of incoming connections and your bandwidth is saturated - suddenly there's nothing "spare" to do the DNS lookups for logging purposes.
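
For readers wondering where that switch lives: in a stock Apache httpd configuration it is most likely the `HostnameLookups` directive. The fragment below is a minimal sketch under that assumption, not the poster's actual configuration.

```apache
# httpd.conf: log client IP addresses instead of resolving each one to a hostname
HostnameLookups Off
```

If hostnames are still wanted for reports, the logged IP addresses can be resolved later, offline, for example with the `logresolve` utility that ships with Apache.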

      Nowadays I disable DNS lookups for the logfiles as a matter of poli

  • I didn't think I had a lot to learn about programming before I started uni... 3 years later, and I'm confident I didn't learn a single language aspect from any of my courses (with the exception of the new languages we were introduced to for assignments - such as ASML).

    However, even from day one, I could see they were trying to teach us something else, and not just how to program. This wasn't clear or evident to anyone else who didn't know how to program in a compiled language, because the workload of simply und
  • I am also in charge of server administration, scripting, programming, and database administration for several websites and web servers. In the beginning my Windows servers crashed several times because of bad, rookie configuration. I then decided to switch to Gentoo Linux and learned a lot about how to increase performance. I'm very glad that I threw myself into that hell of crashes, because of all the lessons I learned by trying to fix them.

    Googl
  • I knew MySpace's infrastructure was bad, but I didn't know it was this bad!
  • I ran into this dilemma after Wrong Planet [wrongplanet.net] got slashdotted [slashdot.org] for an interview with Bram Cohen. I've had to hire a professional publicist and contract out freelancers to help me code. Managing a big site like yours really is hard but the hardest part is getting people to go to the site. Congratulations on your success. Your dilemma is what many would call a "high quality problem."
  • by FooAtWFU (699187) on Thursday August 03, 2006 @09:31AM (#15839062) Homepage
    Tips I learned running a millions-of-hits-a-year website...

    • Write valid XHTML. Use the magical DOCTYPES that keep Internet Explorer relatively happy. Develop the CSS to target Opera and Firefox (and Konqueror if you can), and install hacks for IE afterwards in some way that lets you keep track of them. Anything else will probably blow up for you.
    • Consider keeping your structural CSS and your shiny CSS (colors and such) separate. Try to come up with a decent scheme.
    • You'll have all sorts of files and maybe even stacks of logs, presumably. At some point, you may want to replace a file, and name the previous version .old - don't. Never name anything .new either. There will always be a new new, an old new, an older old, a new old... Use dates - 20060802. YYYYMMDD sorts nicely too.
    • Use mod_rewrite (or equivalent) and pretty URLs which are purely logical. Try to hide things like 'php' and 'asp' and 'cgi' and such. If you ever want to replace whatever's driving that URL, you will be glad. For God's sake, avoid query strings in your URLs unless someone is sending a query. You'll also keep the search engines happier.
    • Sign up for a little Google Analytics account if you can. preeeeeeety shiny stats. everyone loves them.
    • PHP is a fine hypertext preprocessor. It's a lousy programming language. It's excessively convenient... a lot like Windows, really. Avoid it, of course.
    • If you're writing your own stuff, consider FastCGI and lighttpd instead of PHP+Apache.
    • I have heard good things about Ruby On Rails. Check it out some time.
    • You know HTML? Now learn HTTP. Headers. Lots of headers. Beautiful headers. And status codes of all kinds. Learn when to return a 304 versus a 206. And 301 Moved Permanently is a blessing when you're restructuring. [meta http-equiv="refresh" content="..."] is a hack.
    • Cacheability is your friend. If you can keep your public-facing content cacheable and install Squid, things go very very quickly. Otherwise, cache in your application. This usually requires something that's not PHP. FastCGI can do the trick.
    • There are always many ways to do something. Use the most elegant way possible, the simplest neatest prettiest way. Use the strictest dialect of your language, the most rigorous form of whatever you're doing - go the extra mile, do things which would make your computer science professors proud. When you don't, things will fall apart faster than you think they will.
    • If you're writing a big function suite or something to get something done, pause and reconsider. There's probably a library or such out there to do it already. Check CPAN if you're Perl, PEAR if you're stuck with PHP... Even if you think you have things under control, check them anyway. Better to look now than say "that would have been so much better" weeks in the future.
    • Keep the search engines happy. Include your page metadata. Use things like [link rel="previous"] and such. There's a whole suite of these things. Make RSS feeds and sitemaps for Google Sitemaps.
    • Get a version control system! Stuff your entire site in a Subversion repository. Then develop in a sandbox off to the side and synchronize it back. If you insist on skipping the sandbox, even just having old versions will save you, eventually. (This could also be good for backups- just back up the repository.)
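
To make the "learn HTTP" and cacheability tips above a little more concrete, here is a rough sketch of answering a conditional GET with 304 Not Modified. It assumes the application derives an ETag from the page bytes. The `respond` and `etag_for` helpers are hypothetical, and the `If-None-Match` comparison is deliberately simplified (a real header may carry several validators or weak ones).

```python
import hashlib

def etag_for(body):
    """Derive a validator from the response bytes; identical bytes give an identical tag."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'

def respond(body, request_headers):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    tag = etag_for(body)
    headers = {"ETag": tag, "Cache-Control": "public, max-age=60"}
    if request_headers.get("If-None-Match") == tag:
        # The client or proxy already holds this exact version, so send no body.
        return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

A 304 tells a browser or a proxy such as Squid to reuse the copy it already has. A 206 Partial Content, by contrast, answers a `Range` request for part of a resource, so the two are not interchangeable.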
    • In my opinion, caching is not a good idea unless the website is very big (like Slashdot or Digg). Careful coding should take the place of a caching system, which is just another layer on every request.

      And by the way, another tip, if the coder necessarily has to use PHP - try organizing the functions using OOP. In the end the bad organizing on the website staggers the coder because of the confusing construction.
    • And once you've done all that and there's time left: read up on XmlHttpRequest. It is not just for fanciful features but can also reduce server load for Javascript-enabled clients (while remaining 100% compatible for others). Don't go overboard with it, but there are plenty of occasions where you just want some data to update and not your entire DOM tree.

      Also, consider writing your own Apache module. Scripting languages are *slow*.
  • by spacefem (443435) on Thursday August 03, 2006 @12:58PM (#15840720) Homepage
    Something similar happened to me when I started advicenators... a hackjob concept that floated along for a few months and then suddenly grew to 10,000 members. I was overwhelmed. I scrambled to write code for a moderator system and abuse reporting, and any security issues we had took time away from that. Finally, about a year ago, I just gave the site away to a loyal member and his wife who knew their stuff and promised to keep the site pointed at the goals I'd started for it (I'm usually pretty satisfied with their methods in doing that).
     
    The site had gotten so it wasn't fun for me. I was home every night after work writing PHP, and coming back from a weekend vacation was a nightmare. I also felt like all those members deserved more than I could give them.
     
    Business professionals will tell you that it takes a certain type of person to get a business going, another type to get it stable, another type to get it to the top, etc. The web is the same way, and if you're a "starter" who can come up with innovative concepts that take off, then go do that. Don't get tied down with old projects.

