Linux Software

Linux-based Solution for Massive Tape Library? 13

Charlie Zender asks: "The Earth System Science department at the University of California at Irvine is proposing to buy a 5-50 TB DLT tape archive system to manage satellite and model-generated data. This will allow us to integrate the data we need for studies of climate and climate change. We all use Linux, and are wondering if there is a Linux-based solution. Vendors like StorageTek charge ~$250k for a solution based on their proprietary hardware and software. It seems like a Linux consulting company could beat this price and still make a nice profit. The software must make the tape library accessible as 'near-line' storage and make transferring data from tape to disk transparent to the user."
This discussion has been archived. No new comments can be posted.

  • by Xtacy ( 12950 )
    Man, I just came on here to ask that question :)

    I've been searching around and all I could find was "backup professional", and it doesn't work for me. I would really like to see this for Linux, though. Yet another area for Linux to move into the enterprise arena would be cool. The company I work for makes a backup solution for multiple platforms, and one of them is NT. If someone can use NT for that, I'm sure Linux could fit the bill easily as well, but alas, we don't make a Linux server.

    Hopefully one day though.
  • The standards for controlling the robots are available; they are part of the SCSI standard, if I recall correctly. If that's the case, it shouldn't be hard to program: tell the robot which tape to put in which drive, and you're done.

    I just looked at the Amanda page, and it appears that, if you can control the robot, it supports external programs to change tapes.

    With the amount of money you are talking about, you should be able to talk your STK sales rep into giving you a copy of the documentation needed to control the robot. Salesmen don't like to see sales fall through over minor points, so make this a condition of sale. Even if you just buy hardware, you are talking about a nice pile of money.

    Disclaimer: I work for STK (though not on the tape side of the business) and own some of their stock. However, I do not and cannot speak for STK, nor do I know the company's official position on anything relevant to the above. This is just my opinion.
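
    A minimal sketch of the "external program to change tapes" idea, assuming the robot is reachable through the `mtx` media-changer utility. The tool, the `/dev/sg1` device path and the slot numbers are assumptions for illustration, not something the thread or Amanda's page confirms. The function only builds the command line; running it is left to the caller.

```python
from typing import List


def changer_command(device: str, action: str, slot: int, drive: int = 0) -> List[str]:
    """Build an mtx command line that loads or unloads one tape.

    device: SCSI generic device of the robot (hypothetical, e.g. /dev/sg1)
    action: "load" (slot -> drive) or "unload" (drive -> slot)
    slot:   storage slot number, counted from 1
    drive:  tape drive number, counted from 0
    """
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")
    if slot < 1 or drive < 0:
        raise ValueError("slot must be >= 1 and drive must be >= 0")
    return ["mtx", "-f", device, action, str(slot), str(drive)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe anywhere.
    print(" ".join(changer_command("/dev/sg1", "load", 12)))
```

    A backup tool's changer hook could pass the resulting list to `subprocess.run` once the device path is known to be right.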

  • This is really the realm of ADSM and Legato Networker with a StorageTek or IBM library attached, and yes, it is expensive.

    The closest Open Source software I've come across is Amanda.

    The thing is though, the backup hardware is only part of the solution:

    You need:

    1: Fast tape streamers.

    2: Well designed and very fast network.

    3: Fast system to handle the library, spool the incoming backups to disk and stream spooled data to all the tapes.

    4: A 600+ cartridge library + robot.

    5: Offsite storage for 600 copied tapes.

    6: A documented plan for when the bombs go off.

    And you have to do all this from scratch. The commercial systems have all of the above covered already.

    I've never *really* tried Amanda so I'm not sure how it'd handle the volume you mentioned.

    I know ADSM can handle that kind of volume and the hardware, even though it is like beating your head repeatedly against a padded cell wall.
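
    To put the library size in perspective, here is a rough cartridge-count sketch. The 35 GB native capacity per DLT cartridge is an assumption (the thread never states one), and compression is ignored.

```python
import math


def cartridges_needed(archive_tb: float, cartridge_gb: float, copies: int = 1) -> int:
    """Number of cartridges needed to hold archive_tb terabytes.

    Uses decimal units (1 TB = 1000 GB) and rounds up to whole cartridges.
    copies > 1 accounts for duplicate sets, such as an offsite copy.
    """
    if cartridge_gb <= 0 or copies < 1:
        raise ValueError("cartridge_gb must be positive and copies >= 1")
    total_gb = archive_tb * 1000 * copies
    return math.ceil(total_gb / cartridge_gb)


if __name__ == "__main__":
    ASSUMED_CARTRIDGE_GB = 35  # assumption, not a figure from the thread
    for size_tb in (5, 50):
        print(size_tb, "TB ->", cartridges_needed(size_tb, ASSUMED_CARTRIDGE_GB), "cartridges")
```

    Under that assumption, a 600-cartridge library holds about 21 TB, so the upper end of the 5-50 TB range would need a larger library or denser media.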
  • Arkeia runs on Linux and Solaris, and handles big jukeboxen: http://www.arkeia.com. This is not a commercial plug; I had just done some similar research.
  • Where I work (UCSB in the Institute for Computational Earth Science Systems) we have a 1 TB tape backup system in place. We are using a Qualstar tape cabinet (with two drives and room for something like forty 25GB native DAT tapes). This cabinet is attached to an old Sun Ultra 1, which is in turn attached to the network via a 100baseT connection (soon to be gigabit when CalRen2 comes on line). All backups are scheduled through Legato, which has Linux clients and is also capable of backing up NT clients.

    Legato is a pretty good backup program with a nifty GUI and the capability to schedule everything two ways from Sunday. You can also attach bar code labels to the tapes; the Legato/cabinet combo reads each tape and then remembers what it stored on it, which makes recoveries much easier. I would recommend that you attach your tape cabinet to a Solaris box because of the hardware reliability and because of software support. Ultra 1s are pretty cheap today and they work great for the task.

    This is all pretty old as far as computer time is concerned, but it works and we have made it scale to 1TB. Our sister org, the Bren School here at UCSB, is preparing to buy their own backup system that should easily handle 5TB (they are getting a bigger cabinet with four faster, bigger drives, and lots more tape bays). If you are interested, e-mail me and I'll try to hook you up with those in charge of researching and purchasing the new system. They can probably tell you more than I can.

    --Chris
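
    A back-of-the-envelope sketch of why the network link matters at this scale: it estimates how long a full pass over the data takes at a given link speed. The `efficiency` factor is an assumed knob for protocol overhead, not a measured value.

```python
def transfer_hours(data_tb: float, link_mbit_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb terabytes over a link of link_mbit_s Mbit/s.

    Uses decimal units (1 TB = 1e12 bytes, 1 Mbit = 1e6 bits).
    efficiency is the assumed fraction of line rate actually achieved.
    """
    if link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for mbit in (100, 1000):
        print(f"1 TB over {mbit} Mbit/s: {transfer_hours(1, mbit):.1f} hours at line rate")
```

    Even at full line rate, 1 TB over 100baseT takes roughly 22 hours, which is why the gigabit upgrade matters for full backups.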
  • DMF is an SGI/Cray product that does transparent file migration between disk and tape. We make pretty extensive use of it on our mass storage server (8 CPU Origin 2000 + ~1TB FC RAID + ~40TB tapes in an IBM 3494 robot).

    Unfortunately, I don't think there's anything like that just yet for Linux. SGI's OpenVault [sgi.com] may do some of the things you want, but it's mostly the device communication layer and not a complete solution.

    --Troy
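
    For readers unfamiliar with transparent migration, the policy side can be sketched roughly as follows. This is a generic least-recently-used selection, not DMF's actual algorithm, and `FileInfo` and `pick_migration_candidates` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def pick_migration_candidates(files: List[FileInfo], disk_used: int,
                              target_used: int) -> List[FileInfo]:
    """Choose files to move to tape, oldest access time first.

    Files are selected until the bytes freed would bring disk usage
    down from disk_used to target_used (both in bytes).
    """
    to_free = disk_used - target_used
    chosen: List[FileInfo] = []
    for info in sorted(files, key=lambda f: f.last_access):
        if to_free <= 0:
            break
        chosen.append(info)
        to_free -= info.size_bytes
    return chosen


if __name__ == "__main__":
    catalog = [
        FileInfo("model_run_a.nc", 100, 3.0),
        FileInfo("model_run_b.nc", 200, 1.0),
        FileInfo("model_run_c.nc", 300, 2.0),
    ]
    for info in pick_migration_candidates(catalog, disk_used=1000, target_used=600):
        print("migrate", info.path)
```

    A real HSM also has to leave a stub on disk and recall the file on access, which is the part that needs kernel or filesystem support.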
  • I think you are definitely in the realm of "if you want it, make it!" As far as I know, there is no complete open source solution out there, but it doesn't seem like it will be long before there is. Everyone I know has been pretty unhappy with Legato and various other proprietary and *hideously* expensive solutions.

    If you are considering rolling your own, try NDMP, which is actually a protocol (client/server) designed for network backups, like a "super rmt." It seems like it was mostly developed by Network Appliance for their boxen, but it is open source:
    http://www.ndmp.org

    It seems to be built around the premise that you will buy commercial software to use the NDMP interface provided by their servers. But an open source client/server is available which can build (easily!) under just about any UNIX.

    The stock NDMP client and server basically support dumping and restoring, but the API supports controlling a mechanical tape loader, and someone has written an open source utility to manipulate a StorageTek loader (10 or 20 DLTs). Not the same scale, I know, but I would imagine that StorageTek uses the same command interface for all of their loaders. This utility is available from:
    http://now.netapp.com/download/tools/ndmputil
    You might have to register for a free login to get in there.

    Also, you can roll your own from scratch! Our StorageTek loader came with a pretty clearly documented manual, and all of the commands for moving the robot arm are simply SCSI commands sent over the SCSI bus. The Linux SCSI API is probably "reasonably" easy to program to send arbitrary SCSI commands to arbitrary devices. I know that someone (somewhere) has implemented a Solaris utility (and I am almost certain it is open source) to manipulate the robot arm of a StorageTek via the Sun SCSI driver.

    I think that a *lot* of the robots out there are StorageTek too, just simply relabelled, but I'm pretty sure that most of the non-StorageTek ones (if there are any) also talk to the robot arm over the SCSI bus.

    Good luck!
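
    As a rough illustration of what "simply SCSI commands" means here, this sketch builds the 12-byte MOVE MEDIUM command block from the SCSI medium-changer command set, as I understand its layout. It does not send anything to a device, and the element addresses are made up; real ones come from the loader's manual or from asking the library for its element status.

```python
import struct

MOVE_MEDIUM = 0xA5  # opcode of the medium-changer MOVE MEDIUM command


def move_medium_cdb(transport: int, source: int, destination: int,
                    invert: bool = False) -> bytes:
    """Return the 12-byte command block for MOVE MEDIUM.

    transport:   element address of the robot arm (0 selects the default arm)
    source:      element address the cartridge is taken from
    destination: element address the cartridge is placed in
    invert:      ask the arm to flip the medium (for two-sided media)
    """
    for address in (transport, source, destination):
        if not 0 <= address <= 0xFFFF:
            raise ValueError("element addresses are 16-bit values")
    return struct.pack(
        ">BBHHHHBB",
        MOVE_MEDIUM,
        0,                    # reserved
        transport,
        source,
        destination,
        0,                    # reserved
        1 if invert else 0,   # invert flag in bit 0
        0,                    # control byte
    )


if __name__ == "__main__":
    # Hypothetical addresses: storage slot 0x0010 to tape drive 0x0100.
    print(move_medium_cdb(0, 0x0010, 0x0100).hex())
```

    On Linux, a block like this would be handed to the SCSI generic (sg) driver, which is the "arbitrary SCSI commands to arbitrary devices" part.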
  • A couple of customers have tried doing this by hand. After all, college students are cheap, and you feel good hiring a few college kids overnight. They can study when not loading tapes, and earn some extra money. It should be perfect.

    One customer, however, noticed a problem with tapes breaking. They are just plastic, so an occasional break wasn't a surprise, but it seemed like too many. So he went into the office one night (about 2am) to check up on his help. He saw one person with a hockey stick and another with a goalie mask and glove. When the computer called for a tape, the goalie would yell "12c56" (or whatever the number was), the other guy would find that tape, drop it on the floor, and slapshot it to the goalie, who would catch it and pop it into the drive. The customer fired those people right away and called STK for a robot. (Probably a 2 million dollar order, but I don't have the numbers and wouldn't share them if I did.)

    Now, any of you can figure out how long you could pay students before the cost difference is made up, but when you take into account the cost of maintenance and normal wear, you realise that money isn't the reason for robots.

    The above story is true, but it has been through several people so some of the details are likely wrong.

    I do not speak for STK, but I do own their stock.

  • I have used STK's Near-line silos (6 connected together) on IBM's MVS, using ADSM and FDR/ABR. I have also used an IBM 3494 tape library with ADSM on AIX and an IBM 3570 SCSI media changer on AIX. STK's Near-line silos are awesome! With HSM, disk files are migrated from disk to tape, and when a user reads a file it gets restored automatically without affecting their process.

    I know ADSM supports HSM on various UNIX platforms, but while there is a client for Linux, it is not supported, and you would need the server version on Linux if that is where the tape devices are going to be. If IBM did put one out, the driver issue would probably be solved, since ADSM Server comes with a large number of drivers.

    If you are using a robot library, SCSI commands won't help much unless it is a SCSI Media Changer (SMC); if it is an SMC, you are all set. But I have seen robot libraries use TCP/IP to communicate with a host that already had the tape drives directly attached via SCSI. Some libraries, like STK's Near-line, can use a large variety of tape devices that aren't limited to SCSI for connecting to the host.

    Another thing to consider: if you are looking to hook this up to a Linux server, is it going to be an x86 box? If so, how much data can it realistically move? I have found that to get decent throughput on SCSI tape devices you can't put more than 2 per adapter, and then there's your graphics card and networking, and if you are using a lot of disk on the same box, is it IDE or SCSI (another card)? If you don't run out of slots or IRQs, there's still the issue that the PCI bus on a PC doesn't scale too well. I haven't seen any VLDBs on x86 yet, though with Beowulf I am sure that is not an issue.
  • We archive a fair bit of video: I guess we have somewhere over 2 terabytes (107 20GB (native) 8mm tapes) and zillions of CDs of audio. We use an app we wrote, and a human to feed the drives (ever price these things? A human looks fairly reasonable). We have 21 CD drives and 11 tape drives, plus 3 tape drives for writing new tapes (one off site, one on site backup, one for the "changer"). We use Solaris on Ultras for it. We also have 200GB of disk for caching the data, and are going to expand that.
  • Why use a commercially made robot? This looks like a job for Lego Mindstorms and an array of Commodore 64 tape drives. You can do it, we have faith in you. -Z
  • AIT-2 tape libraries are kewl and work with ADSM. Also, ADSM comes with a Linux client already.
