
IBM Introduces Petabyte-Capacity 'Storage Tank'

Posted by timothy
from the just-squash-it-in-there dept.
statikuz writes "Wired is reporting that IBM's new data storage system, codenamed "Storage Tank", uses software to link servers in multiple locations over an IP network, creating a sort of mega-server capable of connecting thousands of computers and processing multiple petabytes of data. 'Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library,' said Dan Colby, general manager of storage systems at IBM. 'It reinvents the way information is filed, managed, shared and accessed within an organization.' CERN is currently using a beta version of the system to store data from the Large Hadron Collider particle accelerator, which is being used to recreate the first moments of the Big Bang. IBM expects Storage Tank eventually will be able to handle 10 to 20 terabytes of CERN data. Get your own 'starter configuration' for only $90,000!"

  • by Anonymous Coward
    For my house, of course.
  • Dewey decimal? (Score:5, Insightful)

    by chennes (263526) * on Monday October 13, 2003 @10:46PM (#7204945) Homepage
    Quote:
    "Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library"

    Strange that he compares it to a system that few libraries use anymore. Yes, it revolutionized cataloguing. Right before it became obsolete (because it cost too much).

    Not too long ago Slashdot reported [slashdot.org] on the owners of the Dewey Decimal system suing a hotel [libraryhotel.com] in New York for using it as the theme for their room numbering. How long until IBM starts suing everyone with a storage tank [google.com]?
    • Yes, they had better be careful because the Dewey Decimal system is NOT public domain!
    • Strange that he compares it to a system that few libraries use anymore. Yes, it revolutionized cataloguing. Right before it became obsolete (because it cost too much).

      A lot of municipal libraries (you know, the markedly inferior, off campus establishments) in the United States use Dewey.
    • Don't IEEE numbers use the Dewey Decimal system as the theme for their specs? 802.11a, 802.3, etc.
      • Check OIDs in SNMP; they look exactly like a decimal-position addressing system.
      • by shogun (657)
        Hmm, I borrowed something from 823.912 W454 today, maybe I should have wandered over a couple of aisles and checked if there was network connectivity in 802.x
      • by b!arg (622192)
        Actually I do believe 802 refers to the year and month the first spec of these was formed...that being February of 1980. I could be wrong, but I'm fairly sure I'm not...
    • Strange that he compares it to a system that few libraries use anymore

      Heh, well every library I've ever seen uses it.

      (I'm in the UK)
    • Strange that he compares it to a system that few libraries use anymore.

      Few academic libraries, perhaps. LOC seems to have almost entirely supplanted it there. But public libraries, typically on the trailing edge of library science, still use the Dewey system extensively.

  • Quote:
    "IBM expects Storage Tank eventually will be able to handle 10 to 20 terabytes of CERN data. By 2007, when the proton smashing is scheduled to commence in earnest, CERN will be generating data at a minimum rate of 5 to 8 petabytes a year."

    Wow! This monster storage tank will be able to handle 20 terabytes of data! In four years?! That's just amazing!! A whole 1/250th of the required yearly storage, at best!
    • I'm betting that SHOULD be 10 to 20 petabytes. 10 to 20 terabytes isn't actually all that much; Maxtor has 300 gigabyte drives out. A very simple array could be built that is easily 10-20 terabytes.
      • Indeed, the figures are almost certainly wrong. FWIW, the old LHC experiments (of which there were 6, I think) would generate raw data at about a few gigabytes every 10ms, so their raw data requirements are really quite, well, astronomical :-). Of course, this data then had to be immediately processed, as there was far too much of it to record as such. The 'machine room' at CERN is just amazing - still has the NSA-mandated guard posts from when the SGIs, Crays, and so on that they had were export-licenced '
    • 10 to 20 terabytes of data is what the LHC is going to generate each second while it is running. CERN is expecting to generate at least 5 petabytes of data per year.
      It should also be noted that CERN is a large user of lower cost large storage arrays based on 3ware cards, but those won't scale to what the LHC will require.
  • Wait 'til the NeoCons get a hold of this term.
  • 'Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library,'

    I'd be careful about making that comparison, unless you want a lawsuit from the Online Computer Library Center.

  • huh (Score:1, Redundant)

    by xao gypsie (641755)
    funny how pr0n still seems to drive technology..

    xao
  • Had a hard drive crash the other day without backups.

    Are there any easy solutions that can write data out to two HDs redundantly, perhaps to two SCSI or USB external drives?

  • That's 90,000 dollars and comes preinstalled with one 36GB drive. Additional drives can be purchased at the low-low price of 4,000 dollars apiece.
  • Assuming you can put 6 HDDs in a 1U enclosure, a
    standard 44U rack and those 300 Gig monster HDDs,
    that's still 12 racks' worth of HDDs. Holy...
    • So would anyone take a shot at actually specifying the hardware and cost for a 1 Petabyte system? Include HDs, systems, # of racks, (don't forget the switches for the network). Assume no RAID.
      • 3,125 320GB Maxtor HDDs @ $283 = $884,375
        391 8-bay 4U rackmount enclosures @ $140 = $54,740
        391 4-channel IDE controllers @ $17 = $6,647
        391 CPU+mobo+RAM combos @ $100 = $39,100
        22 racks @ $328 = $7,216
        17 24-port switches @ $61 = $1,037
        4 spools of Cat5 cabling @ $40/1000' = $160
        800 Cat5 connectors @ $10/100 = $80

        Grand total = $993,355

        So roughly $1 million with shipping for a cheap-arse, cruddy, minimalistic way of doing it.
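The line items in the list above are easy to re-add mechanically. A minimal sketch in Python, using only the quantities and unit prices quoted in that comment; the `BOM` name, the tuple layout, and the choice to work in whole cents are illustrative assumptions, not anything the poster specified:

```python
# Bill of materials from the comment above: (item, quantity, unit price in cents).
# Prices are kept in integer cents so the sum is exact.
BOM = [
    ("320GB Maxtor HDD", 3125, 28300),
    ("8-bay 4U rackmount enclosure", 391, 14000),
    ("4-channel IDE controller", 391, 1700),
    ("CPU + mobo + RAM combo", 391, 10000),
    ("rack", 22, 32800),
    ("24-port switch", 17, 6100),
    ("1000' spool of Cat5", 4, 4000),
    ("Cat5 connector", 800, 10),  # quoted as $10 per 100
]


def grand_total_dollars(bom):
    """Sum quantity * unit price over every line item, in whole dollars."""
    total_cents = sum(qty * cents for _, qty, cents in bom)
    return total_cents // 100


def raw_capacity_gb(drives, gb_per_drive):
    """Raw (no RAID) capacity of the drive pool in decimal gigabytes."""
    return drives * gb_per_drive


print(f"Grand total: ${grand_total_dollars(BOM):,}")
print(f"Raw capacity: {raw_capacity_gb(3125, 320):,} GB")
```

Keeping the list as data makes it simple to swap in a different drive size or price and see how the total moves.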
    • Hopefully you're not actually putting them in a rackmount case. It would be much more efficient to rig it up so they are just bare drives out in the open. Sure, it looks like hell, but you should be able to stack almost all of the drives in a single rack with a bad-ass motherboard and with 3 or 4 of these [3ware.com] in each of them. You should be able to find a decent dual motherboard with four 64-bit PCI slots [amdmb.com] on it. That would be (4x12+4) 52 drives per computer. At 300GB per drive, that would be over 15 terabytes
  • Maxtor introduces [slashdot.org] a "monster", and IBM introduces a "storage tank". Coincidence? I think not...

    :-) for the :-)-impaired

    Now I'll sit back and wait for the obligatory "... bah! Tank-shmank! Gimme a few of these Maxtor monsters, and I'll roll my own "storage tank" using a spare full-tower chassis, a PIC controller and some duct tape..."

  • This has been done. I think they call it KaZaa though.

    -----
  • Considering it isn't too difficult or expensive for an average Joe to assemble a terabyte's worth of storage from off-the-shelf parts, a petabyte isn't really that much.
    • Only about 1000 times as much.

      Consider that you can get a 4U 24-drive array and stock it full of 300GB drives, for 7.2 TB.

      Now, fill up the rack. 72 TB.

      Now fill up _ten_ of those racks. 720TB.

      (Actually, you'd need about 14 racks, but ...)

      That would be 22 feet by 7 feet of storage -- not raided, just JBOD.
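The rack count in this comment follows from a short ceiling division. A rough sketch, assuming (as the comment implies) ten 4U arrays per rack, i.e. 40U of usable space, and decimal units; the function and parameter names are made up for illustration:

```python
def racks_needed(target_tb, drive_gb=300, drives_per_array=24,
                 array_height_u=4, usable_rack_u=40):
    """Racks required to reach target_tb of raw JBOD capacity (decimal units)."""
    arrays_per_rack = usable_rack_u // array_height_u           # 10 arrays
    rack_gb = arrays_per_rack * drives_per_array * drive_gb      # 72,000 GB per rack
    target_gb = target_tb * 1000
    return -(-target_gb // rack_gb)                              # ceiling division


print(racks_needed(720))    # the "ten racks" case
print(racks_needed(1000))   # a full decimal petabyte
```

Changing `usable_rack_u` to 44 shows how much the answer depends on how tightly the racks are packed.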
  • "Get your own 'starter configuration' for only $90,000!"

    Can I get $10,000 if I include some Cracker Jack box tops? If not, I'm ordering a petabyte-capacity storage tank for sea monkeys.

    :P
  • Storage Tank comes extremely late - it was first promised to come out in early 2001.

    According to this article [theregister.co.uk] at The Register, IBM failed to deliver such promised Storage Tank features as the ability to "link servers and storage systems from all vendors, making it possible to view and access a file from any system." Instead, it will only support AIX and Windows platforms starting this November. Support for other Unix versions, including Linux, is expected no earlier than mid-2004.

  • ...who read that one link as 'Large Hardon Collider' ...yeesh, I think I need to get out more.
  • Imagine a Beowulf cluster of storage tanks!
  • Imagine... (Score:5, Funny)

    by billbaggins (156118) on Monday October 13, 2003 @11:05PM (#7205132)
    Man, imagine a Beo... <blam>

    <thud>

  • Hmm... (Score:3, Interesting)

    by JoeLinux (20366) <joelinux&gmail,com> on Monday October 13, 2003 @11:08PM (#7205163) Homepage
    I always thought a good idea was multiple RAID storage across the entire network. So all the files are spread throughout the network. With multiple copies so if two or three computers go down, that data is not lost...kind of a cross between SAN and RAID.
    • I'd be interested in how a Storage Tank differs from EMC's Centera, which provides WAN access to large amounts of storage. Centera is a Linux-based rack of P2P nodes with ATA-based storage, all accessible only via HTTP. You "POST" the content of a file, and get back a cryptographic checksum as a file identifier which you then use to retrieve the content later. This lets you verify that the data is still intact. Here's the marketing-speak:

      Centera's architecture is based on redundant arrays of independen

    • Sounds a lot like the Mango Medley file system (for Windows) and Coda (for Unix/Linux systems). Info at nearest Google-search.

      The Mango system was only produced for versions of Windows up to 95, with spotty NT support. The premise was pretty cool: each user of the system allocated part of their hard drive to a single network share. All of this space was added up and appeared as a single shared mapping to the network. Each file was copied to two users for safety. If a user accessed a file, they would g

    • by Servo (9177)
      Nope, not a SAN...
      Not RAID either..

      it's SAID!
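The "multiple copies spread across the network" idea at the top of this thread can be sketched in a few lines. This is a toy illustration only, not how Storage Tank, Centera, Mango Medley, or Coda actually place data; the hash-based placement rule, the node names, and the default of three copies are all assumptions made for the example:

```python
import hashlib


def place_replicas(filename, nodes, copies=3):
    """Pick `copies` distinct nodes for a file, deterministically from its name."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Consecutive slots around the ring of nodes, so the picks never repeat.
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def is_available(filename, nodes, down, copies=3):
    """True if at least one replica lives on a node that is still up."""
    return any(n not in down for n in place_replicas(filename, nodes, copies))


nodes = [f"host{i}" for i in range(10)]
holders = place_replicas("thesis.tex", nodes)
print(holders)
print(is_available("thesis.tex", nodes, down=set(holders[:2])))  # two holders offline
print(is_available("thesis.tex", nodes, down=set(holders)))      # all three offline
```

With three copies, any two machines can go down without losing the file, which is the "if two or three computers go down" property described above, at the cost of storing everything three times.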
  • Petafile (Score:1, Funny)

    by Anonymous Coward
    So what's a file called in a petabyte-capacity storage tank?

    A petafile!

    Ha, I crack myself up.
  • by Anonymous Coward
    open source solution that already stores 100s of terabytes that is called LUSTRE... LUSTRE is already deployed in a few live applications run by the NCSE (hope I remembered that right)....

    At the symposium this year, the fellow mentioned they were working on scaling to petabyte storage for next year.

  • by Saeger (456549) <farrellj@gmUMLAUTail.com minus punct> on Monday October 13, 2003 @11:11PM (#7205194) Homepage
    Dear Manufacturer,
    Only petafiles have need for petabytes! Consider yourself boycotted!

    Sincerely,
    Mentally Challenged Parents Association

    (What's a Petafile, Walter?)

    --

    • Why is this only +3? This is the funniest thing I've read on slashdot in months.
    • Mentally Challenged Parents Association

      You laugh, but a few years ago in Wales a mob of angry "locals" stormed a house that they'd heard belonged to... a paediatrician. Those simple folk couldn't read beyond the first few letters.

      The whole issue is very strange. I mean, being anti-child-abuse is nothing special, it's just the default setting for civilized people. Yet some people seem to think that being rabidly anti-paedophile is some sort of shining badge of virtue. It's the same with fascism, being ant
  • Shh! (Score:2, Redundant)

    by Transcendent (204992)
    Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library.

    Shh!!! Don't mention Dewey Decimal or you might get sued [slashdot.org]!!
  • by Manhigh (148034) on Monday October 13, 2003 @11:30PM (#7205335)
    Hope they have lots of backup. Of course, how do you backup a system like this?
  • I've been considering an idea like this for years. I mean, what's the problem in splitting a file system up into lots of smaller chunks and storing them on many different computers? My idea was to introduce redundancy so that even if not all of the nodes are active or reachable at any given time, the information could be located or constructed from other information. By doing so, a distributed storage system could be placed on millions of computers worldwide, in a sort of SETI@home-like setup, and users cou
    • By doing so, a distributed storage system could be placed on millions of computers worldwide, in a sort of SETI@home-like setup, and users could donate a tiny chunk of their hard drive to help scientific research or student projects or whatever, where the people using the storage can't afford to pay for it. What's 50 megs, or 100, or 200, in today's hard drives anyway? People could easily "donate" the unused space on their drives and never feel a difference.

      And my hard drive starts smoking and you lose so
    • See a previous article regarding the Google File System. [slashdot.org] Google works in a very similar manner.
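The parent comment's notion that missing data "could be located or constructed from other information" is what parity schemes do. A minimal sketch using single XOR parity (the RAID-4/5 trick), which tolerates the loss of exactly one chunk; this is a swapped-in textbook technique chosen for illustration, not a description of the Google File System or Storage Tank, and every name below is hypothetical:

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data, n_chunks):
    """Split data into n_chunks equal blocks (zero-padded) plus one parity block."""
    size = -(-len(data) // n_chunks)  # ceiling division
    padded = data.ljust(size * n_chunks, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(n_chunks)]
    return chunks, xor_blocks(chunks)


def rebuild_missing(chunks, parity):
    """Recover the single chunk marked None from the survivors and the parity."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing chunk")
    survivors = [c for c in chunks if c is not None]
    return xor_blocks(survivors + [parity])


data = b"petabytes of collider data"
chunks, parity = split_with_parity(data, 4)
lost = chunks[2]
chunks[2] = None  # pretend the node holding chunk 2 is unreachable
print(rebuild_missing(chunks, parity) == lost)
```

A volunteer network where nodes vanish all the time would need a code that survives many simultaneous losses, so treat this as the smallest possible instance of the idea.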

  • The Register [theregister.co.uk] claims that contents may have settled during shipping.
  • Again ... (Score:1, Offtopic)

    by petabyte (238821)
    ... I have nothing profound to say but hey, as my nick is a petabyte I figured I should chime in.

    Then again, I'm only a petabyte here, usually I'm in a larger configuration. [yottabyte.org]

    Ah, good times good times ..
  • Isn't this just one way to implement a P2P network? By selling it for enterprise use, IBM is supporting the argument that P2P networks have legitimate use and should not be outlawed as the RIAA has attempted.

    I have not used either, but Storage Tank seems to deliver similar functionality as Waste, though on a larger scale and with a different UI paradigm. Perhaps if Nullsoft had released Waste as a way for small and medium sized businesses to share files, AOL would have acted differently.
  • That is the hottest thing I've seen all day!(My girlfriend is out of town)
  • While the Dewey Decimal system was revolutionary for its day, it's long fallen into disuse in any serious library. A lot of school libraries still use it and some local libraries use it, but I can't think of one university or college library I've been to that didn't use the Library of Congress system. It's a lot more useful, as most people who have used both would say.

    Also, it's interesting to note that the library at Amherst College, where the Dewey decimal system was created (by Dewey!) no longer uses t
  • It's still not enough.
  • Hey, if you make a file that is 100GB you would have a file that is a petabyte or a petafile...get it hehe :) flame me if u want
  • Maybe I'm just old, but doesn't it strike anyone that a system built to handle petabytes of data should cost more than $90,000? That's not a whole lot of money for enterprise-level hardware.

    Hell, I remember seeing an IBM System 38 with 16 gigabytes of storage; bloody thing took up a room and cost a couple million bucks. All they did with it was keep a driver's licence database on it and run print batches.

    $90,000? CHEAP!
  • a porn solution that works!
  • Oh my god, that's $630,000 dog dollars!
  • It seems that IBM's system is just a specialized P2P file sharing/serving network, not really anything new and "revolutionary."
  • by Apuleius (6901) on Tuesday October 14, 2003 @12:24AM (#7205680) Journal
    ...you will discover that 1 petabyte is enough
    room for more Divx encoded porn than a man could
    watch in a lifetime with no sleep or bathroom
    breaks. Think about that for a second.
    • High Definition?
      Multi Screen?

      Come on, we need more than petabyte storage on the desktop
    • Yeah, but I need MP3s too.

    • 1 petabyte = 1,024 terabytes = 1,048,576 gigabytes

      Let's say the DivX pr0ns are encoded so that 1 hour of video takes up 1 gigabyte (Divx video is often encoded at lower bandwidth than this but who wants compression artifacts in their pr0n?)

      1 petabyte will therefore store 1,048,576 hours of pr0n.

      1,048,576 hours = 43691 days = 120 years

      Yep, that's enough pr0n.
    • Assuming we are going with high quality, 23.9 fps DivX, a 120 minute porno should be about 730MB.

      Petabyte: 1,125,899,906,842,624 bytes
      730MB: 1,048,576*730 = 765,460,480 bytes

      That's 1,470,879 120-minute porno films.
      This comes to a total of 176,505,480 minutes.

      Hours worth of porn: 2,941,758
      In a non-leap year, there are 8760 hours per year.

      That's right.. It would take 335 years to watch all that porn with no breaks.

      My suggestion would be to purchase one of those monitor arrays featured on slashdot
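Both estimates in this sub-thread reduce to the same division, differing only in the assumed bitrate. A small sketch with binary units (1 petabyte taken as 2**50 bytes, as both posters did) and an 8760-hour year; the function name and constants are illustrative assumptions:

```python
GIB = 2**30
MIB = 2**20
PIB = 2**50
HOURS_PER_YEAR = 8760  # non-leap year


def years_of_video(capacity_bytes, bytes_per_hour):
    """Years of continuous playback that fit in capacity_bytes."""
    return capacity_bytes / bytes_per_hour / HOURS_PER_YEAR


# First estimate: 1 GiB per hour of video.
print(round(years_of_video(PIB, GIB), 1))

# Second estimate: 730 MiB per 120 minutes, i.e. 365 MiB per hour.
print(round(years_of_video(PIB, 730 * MIB / 2), 1))
```

The gap between the two answers comes entirely from the bitrate assumption: the second poster packs almost three times as many hours into each gigabyte.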
  • If I defrag (Score:5, Funny)

    by superpulpsicle (533373) on Tuesday October 14, 2003 @12:30AM (#7205731)
    startime - 10-01-03
    endtime - 10-01-13
    • Re:If I defrag (Score:3, Interesting)

      by slim (1652)
      I know you were only joking, but seriously it bothers me that in this day and age we still need a defrag command.

      There have been "grown up" filesystems on UNIX and Linux for years -- I believe even extfs managed defragmentation on the fly.

      That NTFS on Windows still just leaves fragmented files lying around until you manually ask a program to fix them is frankly outrageous.
  • typo (Score:3, Interesting)

    by painehope (580569) on Tuesday October 14, 2003 @12:52AM (#7205867)
    that "10-20" terabytes line has to be a typo.
    I spoke w/ some people from CERN regarding their CASTOR HSM, and a few years ago they were up in the petabyte range already. By now, they're probably sitting on at least a few hundred TB online, and probably 5 PB offline, as a conservative guess.
    IBM's been doing GPFS filesystems in the > 50 TB size, w/ > 1 GB/sec. throughput for years. That, and even IBM's mid-tier FAStT products can comfortably carry 12 TB on one dual-controller storage head.
    Still, further abstracting the issue of locality is very exciting stuff. I'd be interested to see exactly how they go about doing it, and if it's anything that you can't get w/ Lustre [lustre.org] when it's ready.
    • Citation from here [web.cern.ch]:

      "The LHC itself is expected to run for 15-20 years, giving rise to a total data volume of between 75-100PB"

      Everybody expects these numbers to be underestimated by at least a factor of 2.

      Cheers, Rolf
  • the Andrew File System (AFS), and to some degree its successor DFS.
  • The RIAA sues Big Blue for creating a "haven for filthy music pirates"..

    Kazaa file numbers shoot up

    SCO sues IBM (again) because this "Storage tank" is just like the one they got to hold the shit that comes out of the SCO office toilet before it is tossed at Linux users

    ok so it's not THAT funny
    Suchetha
  • 'Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library,' said Dan Colby, general manager of storage systems at IBM.

    So the Storage Tank is obsolete and irrelevant, like the Dewey Decimal System? In the US, most libraries other than those of K-12 schools have converted over to the Library of Congress system. Possibly at least partially because they have to pay $500/year [slashdot.org] to use the Dewey Decimal System!

  • Only one word for it:

    owned!

  • Shouldn't that read Pitybyte or something like that? ;-P

    Okay okay, it's PebiByte, but hey...one can try =)
  • If you're curious about how Storage Tank works, check out the paper from the IBM Systems Journal [ibm.com].
