The Media The Almighty Buck

Yahoo! Orders Wikipedia Hardware 240

Edit This Page writes "Jimmy Wales announced today that Yahoo! has ordered 23 HP servers for the Wikimedia Foundation. The three database servers are model DL 385, and will come with dual Athlons, 8GB of RAM, and 6x 146GB 15K RPM drives each. They will also provide rackspace and bandwidth. The announcement comes four months after Google's announcement of support, and two months after Yahoo's own. Google has not yet made their intentions clear. You can read more about the specifications of what will soon be a 100+ server cluster at the Wikimedia Servers wiki article."
This discussion has been archived. No new comments can be posted.

  • Also! (Score:5, Informative)

    by Raul654 ( 453029 ) on Sunday June 26, 2005 @02:30PM (#12915424) Homepage
    As I write this, our developers are switching the entire site over to MediaWiki 1.5 (from 1.4), and most of the changes will make it run faster. So we're lowering the per-transaction cost of the software and increasing the server capacity -- this is a good thing.
    • Re:Also! (Score:3, Insightful)

      by slavemowgli ( 585321 )
      Out of curiosity, why are you switching to 1.5 already when the last release is still listed as "not recommended for use in a production environment"?
      • Re:Also! (Score:5, Informative)

        by Jon Chatow ( 25684 ) * <slashdot@jdforrester.org> on Sunday June 26, 2005 @02:54PM (#12915571) Homepage
        Because the devs and the sysadmins are one and the same (generally), and they like playing fire with fire. :-)

        Seriously, "not recommended" is because it hasn't been properly tested yet in a large-scale environment; this is what is being done right now. If this version of MediaWiki works for Wikimedia, it should work for everyone else, too (barring the funny odd bits we don't use).

      • Re:Also! (Score:3, Interesting)

        by Jamesday ( 794888 )
        Because the technical team at Wikipedia includes the developers and we know that there are sure to be problems as it is introduced to full service. Anything from outright bugs to database queries with unacceptable load properties. It'll probably be released for a general audience in four to eight weeks, once it's been very thoroughly tested at its biggest user site.
      • Because "not recommended..." is carny for "we are covering our asses with this caveat." It's usually fine, and cowboy admins love this shit! And it makes me very happy that wiki has cowboy admins.
  • by creimer ( 824291 ) on Sunday June 26, 2005 @02:31PM (#12915429) Homepage
    Wikipedia Hardware?! I didn't know they make hardware. Does anyone have the Wikipedia link for this? ;)
  • required? (Score:2, Interesting)

    by cryptoz ( 878581 )
    Does wikipedia seriously need all that? I thought the data they were serving up was mostly just text and wasn't really a huge problem. As in, weren't their current servers enough? Or am I missing something?
    • Re:required? (Score:5, Informative)

      by xMilkmanDanx ( 866344 ) on Sunday June 26, 2005 @02:39PM (#12915479) Homepage
      Just think of all the links that get posted in slashdot to wikipedia and it doesn't falter under the load. That and it's not just static pages, between building, rebuilding, keeping reversion history, indexing for searches and constant slashdotting...
      • Re:required? (Score:3, Interesting)

        by m50d ( 797211 )
        Slashdot links barely touch the database. Any popular links are handled by the squid caches. It's the zillions of people all looking at different pages that stress the database.
      • Re:required? (Score:3, Informative)

        by bobbozzo ( 622815 )
        FWIW, they have Squid caches in front of the web farm, so there are cached static copies of busy pages.
      • Re:required? (Score:5, Interesting)

        by Pendersempai ( 625351 ) on Sunday June 26, 2005 @05:25PM (#12916358)
        I saw a presentation by Jimbo Wales in which he compared the readership of Wikipedia, Slashdot, and NYTimes.com. Wikipedia recently passed NYTimes, and slashdot doesn't even compare. In fact, he noted with something of a smile that Wikipedia would probably bring Slashdot to its knees with a front-page link.

        Slashdot ain't got squat on Wikipedia.
        • Oh, I'd seriously love to see that. Maybe if we expand the page about Slashdot (http://en.wikipedia.org/wiki/Slashdot [wikipedia.org]) and make it a featured article... :)
        • even if wp did direct link /. on the front page (which goes against their way of doing things) i doubt it would have that much effect. /. is story orientated. folks go to whatever pages are listed in the current story. wikipedia is an encyclopedia, people generally go there looking for something in particular and might look at the other stuff on the homepage if they are bored.
        • it's not the number of readers.. it's the bandwidth of the readers ;) do you know anyone on slashdot with LESS than a 3 megabit connection? who reads the main page? I don't. I know people with less who read journals here, on less but not front page readers. how many dialup users are using wikipedia? how many aol users are reading nytimes?

          but you've got a point, a slashdotting just isn't what it used to be. It hasn't been for a long time, in the golden days of tech tv TSS was slashdotting sites that slas
    • Why does Google need more than 100,000 boxes? All they do is serve (mostly) text, too, after all.

      You underestimate the sheer number of hits that Wikipedia gets.
    • Re:required? (Score:5, Insightful)

      by teslatug ( 543527 ) on Sunday June 26, 2005 @03:03PM (#12915622)
      Have you looked at the MediaWiki features? There's tons of dynamic features. What doesn't hit the cache goes to the DB. Wikipedia is 67th in the Alexa ratings (Slashdot is 1,441st, of course not too many slashdotters use Alexa, but check some of the other sites, CNN is in the 20s, and Wikipedia gets more traffic in a day than /. gets in a month).

      Additionally, Wikipedia's lag is a dampening factor to its popularity. As more servers are added, it becomes more responsive, servers go to capacity again, and yet more hardware is needed.
    • Re:required? (Score:2, Informative)

      by midom ( 535130 )
      Well, first of all, everything grows. The number of users increases all the time - it doubles every two or three months. The number of pageviews increases as well. And last but not least, there are more and more, bigger and bigger articles with more and more history. Wikipedia is growing and it is running on really low-budget hardware. And... every time we make the site run faster, more users come and use the available resources. Therefore, we can do two things. Optimize our software platform and increase our har
    • Wikipedia doesn't need that. It needs more - those aren't enough to handle the full load. :) They should be enough for the Asia-Pacific region for a few months at least. Wikipedia growth is still limited by performance when it comes to viewing pages not in cache and editing (adding and changing content).
    • "Does wikipedia seriously need all that? I thought the data they were serving up was mostly just text and wasn't really a huge problem. As in, weren't their current servers enough? Or am I missing something?"

      You're not missing anything. It's because they're generally proactive in adding servers that you tend to think of Wikipedia as being fast. All else being equal, this is the proper way to do it... to add more iron before you need it, and not adopting an interrupt-driven hardware acquisition policy.

  • So it seems now that Wikipedia has more street cred than either Yahoo OR Google, since they're both clamoring to be seen as being in support.

    And with Google at approximately 211 street cred units as of the last survey, Wikipedia is definitely doing well.
  • by mz001b ( 122709 ) on Sunday June 26, 2005 @02:36PM (#12915462)
    The trouble of course with wiki-hardware is that the system administration is left to the community.
    • Dear Moderators. This sentence is currently rated 4, Funny.

      Actually, mz001b made a valid point: all the donated hardware (for which Wikimedia is of course very thankful) has to be maintained by volunteers.

      So if HP or IBM or whatever company feels like, they should consider donating a full-time-employee-equivalent-sponsorship to someone who is doing this great job. IMHO.
  • by Anonymous Coward on Sunday June 26, 2005 @02:44PM (#12915511)
    Not Athlon
  • South Korea? (Score:2, Interesting)

    by s0rbix ( 629316 )
    Does anyone know why they are being set up in South Korea?
    • Re:South Korea? (Score:5, Informative)

      by Jon Chatow ( 25684 ) * <slashdot@jdforrester.org> on Sunday June 26, 2005 @02:56PM (#12915578) Homepage
      'Cos Yahoo! offered to host them at their facility there, and our overall global reach has a bit of a paucity in Asia.
    • by commodoresloat ( 172735 ) on Sunday June 26, 2005 @03:05PM (#12915631)
      Because only old people will administer the servers.
    • The current server farms (US and Paris) are far away from around there, it's nice if there's a server nearby for most users and this will improve that. As to why South Korea specifically, it's a country with very high internet connectivity (IIRC they have the greatest proportional broadband coverage for any decent-sized country) and also a strong interest in democracy (because they've got North Korea right next to them), so I'd imagine there are a lot of wikipedia users there.
    • They have a big cluster in the USA, and just a few weeks ago they got 10 or so dual Opteron servers in the Netherlands (that will serve Europe more or less completely), and these servers could take the Asian part of the load (plus increase redundancy).

      It's just wasteful routing everything around large parts of the globe, plus keeping the database in different physical locations can't hurt, either.
  • by TERdON ( 862570 ) on Sunday June 26, 2005 @02:59PM (#12915591) Homepage
    As I noticed, the summary says dual Athlon, and those aren't really current anymore (as far as I know the Opteron was introduced about two years ago). AMD did make Athlon MP processors earlier, which was why I reacted (why buy three year old tech?).

    The server hardware spec link said the "athlons" in fact are opterons. *sigh*

  • by stimpleton ( 732392 ) on Sunday June 26, 2005 @02:59PM (#12915595)
    This is sort of like those school yard spats over a girl.

    Wiki is the girl. Google and Yahoo are the two guys.

    My mother's advice surely applies to this situation(that I got many years back):

    "Stay away from that little trollop! Anyone that causes a fight is not worth it."

    Of course, I did hang round that girl. Pretty wee thing. It was all fruitless of course.

    Bitch! You whore Wiki!

    *begins to cry*
    • I don't know.. I think a better analogy would have Wikipedia as the school's football team, and Google and Yahoo competing to become quarterback, knowing that they'll both get respect from the larger community if they do. The only difference is that in this case, there is room for more than one quarterback.
    • Interesting, the analogy between trollop and wiki is fitting:

      Everyone puts their bit in.

  • Cool. (Score:2, Interesting)

    by Vegeta99 ( 219501 )
    Since I can't think of anything really insightful to say, I'll just say thanks.

    Thanks to Yahoo, for supporting the Wikimedia Foundation, and thanks to the Wikimedia folks and all of their contributors for their great contributions to what I hope will become (and is already on its way) one of the world's best disseminators of human knowledge. It's meant to be free, at least as in speech, but they're pulling it off as in beer, too.

    Much kudos to them - One day when I'm not a poor college student, I'll help o
  • Yahoo/Google war (Score:3, Interesting)

    by BonoLeBonobo ( 798671 ) on Sunday June 26, 2005 @03:18PM (#12915703) Homepage
    Seems to be a war to be the best "opensource" helper. See: Google wants to help Wikipedia, Yahoo helps Wikipedia, Google launches Summer of Code ...

    What's next ;-) ?
    • Googlepedia?
    • In other news...
      • Microsoft announced today that the Windows platform source code will be released onto SourceForge under an OSI-compatible license...
      • Duke Nukem Forever date was moved forward, and not back. According to developers, the game is complete, they are "just trying to beat it first"...
  • Out of interest, does anyone know how much total bandwidth Wikipedia is consuming?

    I wonder if there is somebody somewhere working on a peer-to-peer variant for distributing Wikipedia content and cutting some of the bandwidth costs.

    • Up until recently when they moved to a new co-lo this data was out there, but it is unfortunately no longer available. I can say as a fact though that they are currently pushing out about 17 terabytes per month and growing strong. There's a bandwidth graph and instructions to read it on this page [qwikly.com] of my site.
    • by Jamesday ( 794888 ) on Sunday June 26, 2005 @04:20PM (#12916055)
      Averaging 60-70 megabits per second over a whole month. Peaks at 320 megabits per second in extreme cases. Typical daily peaks in the 120 megabit per second range. 6 months ago it was more than 200 million database queries per day and it's probably several times that today.

      I'm wondering about setting up a network of boxes running the Coral software. Those have built in fault tolerance so it wouldn't take lots of admin work and would allow accepting many small bandwidth offers, in countries with comparatively low traffic. Makes most content even closer to the end users and spreads the bandwidth load around. Nothing actually happening on this front yet, though.

      A very large number of places with full database servers and page builders, like this Yahoo announcement, would have too much admin overhead - 3-6 of those places is about right.

      P2P is a security problem. People can always modify P2P programs to add nasty content and Wikipedia has already seen people trying to upload that and has filters in place to catch and block some things.
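
As a rough cross-check of the two bandwidth figures quoted in this thread (about 17 terabytes pushed out per month, and an average of 60-70 megabits per second over a whole month), here is a minimal Python sketch of the unit conversion. It assumes a 30-day month and decimal units (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits); the function names are made up for illustration and are not from any Wikimedia tooling.

```python
# Convert between "terabytes per month" and "average megabits per second".
# Assumptions (not from the thread): 30-day month, decimal SI units.

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day month
BITS_PER_BYTE = 8


def tb_per_month_to_mbps(terabytes: float) -> float:
    """Average rate in Mbit/s needed to move `terabytes` in one month."""
    total_bits = terabytes * 1e12 * BITS_PER_BYTE
    return total_bits / SECONDS_PER_MONTH / 1e6


def mbps_to_tb_per_month(mbps: float) -> float:
    """Terabytes moved in one month at a sustained average of `mbps` Mbit/s."""
    total_bits = mbps * 1e6 * SECONDS_PER_MONTH
    return total_bits / BITS_PER_BYTE / 1e12


if __name__ == "__main__":
    print(f"17 TB/month  is about {tb_per_month_to_mbps(17):.1f} Mbit/s on average")
    print(f"65 Mbit/s    is about {mbps_to_tb_per_month(65):.1f} TB/month")
```

Under those assumptions the two numbers land in the same ballpark (17 TB per month works out to a bit over 50 Mbit/s sustained), so the figures from the two posters are not obviously in conflict. The quoted peaks of 120 to 320 Mbit/s are a separate matter, since an average says nothing about peak load.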

    • A peer-to-peer Wikipedia would be a completely different project. It'd be a great project, in my opinion, but the way to go about it would be to build a P2P publishing network first, and then to just upload Wikipedia to it.

      Basically, you'd need to create a version of freenet which isn't ungodly slow, and probably wouldn't have all the encryption/anonymity features so as to accomplish that goal. Once you had that, like I said, it'd be a simple matter of uploading Wikipedia (and re-uploading it on a regula

  • Oh wait, now we have to be careful to avoid the Yahoo! View of History being predominant on Wikipedia!
  • The three database servers are model DL 385, and will come with dual Athlons, 8GB of RAM, and 6x 146GB 15K RPM drives each.

    AFAIK, the DL385 is a quad-Opteron model. Athlon64 is only for desktops. Just saying.

    Marcos
  • by njyoder ( 164804 ) on Sunday June 26, 2005 @05:25PM (#12916360) Journal
    This is a classic case of considering the hardware to be the problem rather than the software. The software has serious issues when it comes to performance and the developers are very slow to address it. Hell, Tim Starling, a lead developer, even stated that one of the design goals of the MediaWiki software was to spend as little time as possible developing it. I kid you not, that's paraphrasing something (with NO exaggeration) that was said in a presentation document which I can find if anyone doesn't believe me.

    I've heard some whining from some of the developers because they didn't have a ready made solution for certain things, meaning they would have to put actual *effort* into making their own. The idea of writing glue code (to C code) to make up for a feature lacking in existing php libraries was considered an abhorrent thing.

    Their best response to me pointing out flaws in their "development philosophy" was to retort with the oh-so-clever "well why don't you write something better yourself?" Of course, that phrase is just a code word for "we know it sucks and we're just not willing to put all the extra effort into rewriting major portions of it." Really, it's sad when you have to define your software in terms of someone else (your opponent specifically) not writing something better.

    These aren't just unfounded complaints either. The developers have often complained that the existing implementation (and especially the choice to write the original code in PHP) needs to be gotten rid of. They've said it has "everything and the kitchen sink" and that it degrades performance, but aren't trying that hard to get rid of it. They know this as a matter of fact through testing--MediaWiki has a massive overhead in setup time compared to other wiki software.

    Not just that, but the Wikipedia admins are all volunteers and aren't exactly the cream of the crop. They took them as volunteers since they were the best ones to devote that much time to it and unfortunately that means they're mediocre and they REALLY are not experienced for such a high traffic website.

    If they actually had a paid full time admin who had considerable background in sites like this, you'd suddenly see a massive drop in down time and other problems.
  • Wikipedia Servers (Score:2, Informative)

    by silverz ( 803241 )
    Here is their servers list. [wikimedia.org]
  • by aminorex ( 141494 ) on Sunday June 26, 2005 @11:20PM (#12917957) Homepage Journal
    Since I presumably have moderation to burn, I'll say frankly that I'm appalled. Wikipedia is enormously valuable as a resource in objective domains such as hard science and mathematics, but its articles in politically and culturally sensitive areas are abysmal reflections of popular delusion and political correctness that do an enormous disservice to us all. The cockles of my heart are not warmed.
  • So that explains why the database is locked.
