NetBSD's Real-Time Network Backup

jschauma writes "One of NetBSD's developers, der Mouse, was interviewed by DaemonNews about his real-time network backup system (originally presented at BSDCan 2005), where changes to your local filesystem are automatically propagated to a backup server. In his interview der Mouse tells about his idea, how it works, and of course, how cool it is."
This discussion has been archived. No new comments can be posted.

  • by thedletterman ( 926787 ) on Monday March 06, 2006 @04:13PM (#14861021) Homepage
    But hasn't Sun been doing this with Solaris for at least 3 years?
  • B.S. D? (Score:2, Interesting)

    by ExE122 ( 954104 ) *
    So we could have backup servers all over the world keeping track of disk write commands...

    This is indeed very neat, but isn't it sorta how transactional databases have been working?

    I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require
    • Re:B.S. D? (Score:5, Insightful)

      by ThePiMan2003 ( 676665 ) on Monday March 06, 2006 @04:19PM (#14861086)
      I think the point is that it could be used for an off site backup. Raid does not protect you from Hurricanes, or even fires.
    • by autopr0n ( 534291 )
      Obviously, RAID servers don't help you in the case of accidental deletion. And they certainly don't help if your whole computer gets blown up.

      Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.
      • Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.

        In UNIX systems, all temp files usually reside in /tmp, which need not be on a RAID partition (unless you want those processes to stay up when you lose one of your drives).

      • by Umrick ( 151871 )
        What I want is something like Plan9's Fossil+Venti file system. Versioned, with permanent copies offloaded to archive media. It's a rather nice, though not blazingly fast, complete view of data. Not the rather ephemeral view most of us take. Restore to any point in time since inception.

        Failing that, something like OpenAFS with mirrored globally addressable volumes that can work at the system level rather than user level. Sure you can use IP security for OpenAFS, and a few brave folks have even gotten ne
    • Re:B.S. D? (Score:4, Insightful)

      by Amouth ( 879122 ) on Monday March 06, 2006 @04:23PM (#14861160)
      yes you are missing the point..

      take 10 small servers that do the front end grunt work with 2-3 backup servers that keep complete working images of the servers and have access to their data..

      a front end server dies: service can roll over to a backend until the front is replaced and quickly made just like the original. a backend dies and you have a second, and if all the backups die then you still have the front end to recreate the backups..

      you don't normally consider the bandwidth costs as they are typically on a high-speed network between them, and it offers you the option of replication over different connections and areas..

      all redundant disks help with is if a disk dies, not if ram or cpu fails

      some people have gotten too attached to their physical backups and tapes - personally, a backup is worthless if i can't have live access to it in a few min even if i am not physically at the point of failure..

      this isn't particularly useful for small setups but is great for mid to large scale setups and offers plenty of room to grow.
      • "front end server dies service can roll over to a backend until the front is replaced and is quickly made just like the original"

        This is a pretty shaky solution at best. You could change the default bootloader config and tell the machine to reboot using the chosen image, but if there were the slightest problem that required a tweak on the system it may never come up and give network access again. If you replicate using a hybrid system where some portion of the configuration is already set up on the backup ser
    • Re:B.S. D? (Score:2, Interesting)

      by dpilot ( 134227 )
      I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.

      In a RAID cabinet, you have a bunch of identical drives, most likely purchased together, too. Then you submit them to an essentially identical environment and operating history. Barring a defect, and assuming wearout-type phenomena, something bad may well happen.

      The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data. The stress o
      • Re:B.S. D? (Score:3, Insightful)

        The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data.

        well, no. enterprise level raid has spinning spares and hotswappable everything. you can lose two drives and still be running as long as you get those replacements in there before number 3 goes. been there, and yes, it happened when we shut down for maintenance. In the real world catastrophic failure happens. Raid is not used as a backup usually, it is used to keep data availabl
      • Re:B.S. D? (Score:5, Insightful)

        by PartialInfinity ( 856052 ) on Monday March 06, 2006 @05:05PM (#14861593)
        Why do you have to settle for one or the other? A proper backup strategy, like any security strategy, should involve more than one technology.

        Hotswappable RAID has saved my servers on more than one occasion. Likewise, the servers have also been saved by tape backups. RAID5, tape backups, and data replication all have different pros and cons.

        I think it is incorrect to say RAID5 is not acceptable in any backup strategy. The more chances you get at data redundancy, recovery, and failover, the better off your organization.
      • Re:B.S. D? (Score:5, Insightful)

        by Desert Raven ( 52125 ) on Monday March 06, 2006 @05:12PM (#14861677)
        I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.

        I'll comment on this later...

        The weakest drive fails first. Power down the RAID box to replace the bad drive...

        OK, this is where I start getting dizzy. If their data is valuable enough to have RAID, why were they such cheap bastards that they didn't get hot-swap drives? I've worked in a LOT of places that have RAID systems, and three of my own servers have RAID, yet to date, none of them were anything but hot-swap. Additionally, with a small amount of intelligence and a few extra dollars, the administrator always puts in a hot-standby drive that will automatically take over if a drive fails, allowing for the failed drive to be replaced at a more convenient time than 1:30am without sacrificing the redundancy. Sysadmins running really critical systems will often have multiple hot-standby drives.

        The stress of the power-down and restart is enough to kill the second-weakest drive.

        Now, see, here's the funny part. When you spend the bucks for SCA hot-swap drives, you actually get drives of decent enough quality that this is very rarely a problem. Even if you did have to shut the array down, which you won't because you bought proper hardware.

        enough so that they've quit using RAID as "backup"

        Further evidence of idiocy. RAID is not a backup. RAID allows you to keep running in the event of a specific type of hardware failure. But that is all it protects you from. Backups are still just as critical as they were before you had RAID. Anyone who uses a RAID array instead of proper backups deserves to have their data sacrificed to the gods of entropy, shortly followed by their own careers.

        As for my delayed comment on the first sentence... Well, I suggest you get smarter friends.
    • It's better than RAID in that if your entire main server gets toasted (e.g. literally, your house burns down, or similar), you've still got a backup.

      From what I gathered in the article, it isn't similar to a database's logs because it just mirrors the writes, without saving the old disk image. Although it sounds like it would be fairly easy to just save the packet stream and have the ability to replay your disk image to any particular moment in the past.
      That would be much better than RAID. After all, back
    • We have about 20 SQL servers around the country connected via leased-line T1s originally designed to be constantly replicating to HQ. Not a huge system, but about 2TB of data total. One can imagine the kind of bandwidth such a redundant system sucks up. The cost and performance hits associated with this are absolutely extraordinary and there simply is no reason for about 90% of load.

      It's kind of like the SunRay system. Yeah, the idea is neat, but the architecture is a network-clogging, CPU-leeching nightmar
    • It's essentially an append-only remote filesystem. That comes with both benefits and drawbacks. The fundamental benefit is point-in-time recovery. Coincidental benefits include dramatically lower average throughput (since backups are always happening) and the potential for lower total backup bandwidth (if a 5GB log file gets 200MB of new entries, an incremental would have to back up the entire 5.2GB log file; a log-based backup system would only back up the new 200MB). It would probably also make the ba
      • you don't have to use it for everything. i, for example, would use it for some small databases and my svn repos. if your box dies between the nightly backups, it ain't such a big deal that you lost, say, a day's worth of apache logs and mail. but it would suck if you lost a day's worth of orders or that code you cranked out furiously over lunch. i say, prioritize!
    • 1. Point in time recovery. This allows you to restore back to some point in the past. Good for recovering deleted files.
      2. Off site backups. A second server located at another office just in case of Earthquake, Fire, Flood, Hurricane, Tornado, or some other disaster.
      Raid doesn't replace backups. With encryption you could keep your backup server at a co-location facility, branch office, or home. Handy if the worst does happen.
      Another option would be for a local consulting firm to offer this as part of their ser
    • Re:B.S. D? (Score:2, Interesting)

      I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require a lot to set up another.

      I only skimmed TFA and it's not clear to me how like or unlike Windows' Distributed File System it is, but I'll give you a quick picture of what DFS does fo
    • If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually).

      Unless you have hotswap ability, if a hard drive fails, you still have to power it down to remove it. If you have a second server up and running, you won't have any downtime other than changing an IP address.

      Sure it might be only 20 minutes to swap hard drives tops, but a server down during business hours is still a pain.
  • Neat. (Score:4, Interesting)

    by Pig Hogger ( 10379 ) on Monday March 06, 2006 @04:17PM (#14861061) Journal
    This is definitely the way to go. With huge hard-disks that offer capacities beyond tape drives, it is less and less feasible to use traditional tape-based backup systems in many organizations, if only by the time taken by the frigging tape drive...

    Here is the idea behind the setup I am currently using: Easy Automated Snapshot-Style Backups with Linux and Rsync.

    • Re:Neat. (Score:5, Interesting)

      by Lord of Ironhand ( 456015 ) on Monday March 06, 2006 @04:35PM (#14861308) Homepage
      I prefer Dirvish, and I highly recommend that people looking for a good harddisk-based backup system take a look at it. I've looked long and hard for a good backup system and this is the first that seems to fit the bill for me.
      • Looks neat :-). I use a similar mechanism for backups of one of my boxes. Another one, however, uses backup space that I only have FTP, SCP, SFTP and rsync access to (no other shell commands); for that I use Duplicity, which is very clever. It even encrypts your backups using gpg.

        I should look into something like dirvish though to replace my current homemade 'backupd' which basically does the same thing with less flexibility.
      • Does anyone know how Dirvish compares to rsnapshot?
    • rsnapshot packages Mike Rubel's concept into an easy-to-use package. I found some Red Hat RPMs somewhere too. It works great on our server.
    • If by "the way to go" you mean, hard-disk-based backups, then I'd agree with you. In this particular case, this acts as a poor-man's replacement for RAID-1 (mirroring), with the same problems inherent in that system that make it unsuitable for general backups. Consider a simple command - "rm -rf s *". Ooops! With a point-in-time backup, you're not necessarily SOL, though of course, you weigh that against the data lost between your backups.
    • Well, the largest LTO 3 drives offer 400 GB uncompressed per tape, at 80 MB/s native transfer rate, which isn't too shabby.
  • Volume shadow storage is exactly this kind of incremental, real-time backup process. How does this differ technically from that? (Other than the fact that you can now dynamically back up your morning toast, which is useful if a slice goes up in flames...)
  • by TheFlyingGoat ( 161967 ) on Monday March 06, 2006 @04:20PM (#14861105) Homepage Journal
    This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated. It also doesn't sound like there's any sort of priority setting for this or any type of data filtering.

    Personally I'd like to see something like the MS filesystem in development that allows SQL calls to be run against it (not sure if there's any other filesystems that are similar). Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

    That would achieve the same thing, but more flexibly and without affecting normal use.
    • A hook into each driver does seem like a strange way to do this, you would think that it could be done once at a higher level.

      Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

      Then you lose the realtimeness.

    • by ivoras ( 455934 ) <ivoras @ f> on Monday March 06, 2006 @04:31PM (#14861248) Homepage
      This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated.
      FreeBSD's GEOM is solving that.

      Also, there's "GEOM gate" on FreeBSD.
      For other cool stuff with GEOM see here and here. See also this discussion thread about ggate's limits.

    • Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

      Unless I'm reading you wrong here, with a 5 minute delay you can already do this with rsync, a shell script and a cron job. According to the article this guy is doing it in near real time across the network (from what I can tell) by intercepting the write calls to the file system driver(s).

      Not sure how else yo
    • Device drivers would be the best solution for me. I want an exact copy of what I wrote to a physical drive. Hook, encrypt, send to another HD to repeat. Realtime, low-level. This allows it to be relatively fast (as opposed to having to process through layers of abstraction), accurate (as opposed to something an abstraction might do to it), and realtime...

      I want dual transactions. 1 for onsite and 1 for offsite. I'm not even interested in encrypting the data. I need to be able to kill my onsite immediately a
    • There is actually a better way, and it is being implemented in DragonFlyBSD. Instead of duplicating writes at the device level, VFS operations are logged to a journal descriptor, which may be a file or network pipe. As this is performed in a VFS layer, it is possible to use with any filesystem. However, it is not limited to remotely (or locally) mirroring a filesystem; with the journal available, it will also be possible to rewind the state of the filesystem to any point in time, subject to the journal s
    • No political party has a monopoly on wisdom or ignorance

      No, but when it isn't frustrating, it's hysterical watching them try to corner the market.
    • You can already do this - just do a recursive descent of the filesystem tree. SQL is just an interface for doing the same thing. It may be that the MS filesystem is more efficiently organized for doing this kind of query, but that's another issue.
    • All this means is that when the call occurs to save data to disk, the call writes not only to the primary disk but also to a network device in parallel. It is almost like a network RAID setup; realtime only means that the writes are "requested in parallel", as network latency will be much higher than on-device storage latency. For mission-critical use, the data to be saved would have to remain in memory until it can be verified that it has been successfully stored at both points, which should really be done regardless.

      As for
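
      A minimal Python sketch of the "keep it in memory until both copies are verified" idea from the comment above. Everything here is hypothetical (`MemoryStore`, `MirroredWriter`, and the `online` flag are invented for illustration) and says nothing about how der Mouse's NetBSD code is actually written. Each write is queued, sent to both targets in parallel, and dropped from the queue only once both report success.

```python
from concurrent.futures import ThreadPoolExecutor


class MemoryStore:
    """Hypothetical stand-in for a local disk or a network backup target."""

    def __init__(self, size):
        self.image = bytearray(size)
        self.online = True  # flip to False to simulate an unreachable target

    def write(self, offset, data):
        if not self.online:
            return False
        self.image[offset:offset + len(data)] = data
        return True


class MirroredWriter:
    """Issues each write to both targets and keeps it queued until both succeed."""

    def __init__(self, local, remote):
        self.targets = (local, remote)
        self.pending = []  # writes not yet confirmed on both targets, oldest first
        self.pool = ThreadPoolExecutor(max_workers=2)

    def _send(self, offset, data):
        # Request both writes in parallel, then wait for both results.
        futures = [self.pool.submit(t.write, offset, data) for t in self.targets]
        results = [f.result() for f in futures]
        return all(results)

    def write(self, offset, data):
        self.pending.append((offset, bytes(data)))
        return self.flush()

    def flush(self):
        # Replay queued writes in order; block writes are idempotent, so
        # re-sending one that already reached a single target is harmless.
        while self.pending:
            offset, data = self.pending[0]
            if not self._send(offset, data):
                return False
            self.pending.pop(0)
        return True
```

      If the remote target is down, `write` returns False and the data stays in `pending`; calling `flush()` after the target comes back brings the two images into agreement again.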
    • On FreeBSD, you just use GEOM Gate (ggated and ggatec) to create a network filesystem/partition. Then you use GEOM Mirror (gmirror) to create a RAID1 array using the local disk and the ggate disk. The GEOM disk layer handles everything for you from there on. No special driver hooks required, works with any and all disks.

      If you want to get fancy, you could use ggate to create two network disks/partitions, and graid3 to create a RAID3 array. But the performance probably wouldn't be all that great. :)
  • DoubleTake (Score:2, Insightful)

    by ROOK*CA ( 703602 ) *
    Sounds like it's essentially a DoubleTake daemon for BSD, cool, I wonder how well it scales? Say if you wanted to fully mesh 10 or more servers or something. Sounds like it might come in handy for keeping the content in web farms in synch as well....
  • Cool, but not new (Score:2, Informative)

    by BlankStare ( 709795 )
    This concept has been in play for years as a commercial product for Disaster Recovery, Veritas Volume Replicator (VVR).
  • Those crazy Germans.
  • NBD? (Score:4, Informative)

    by mikeee ( 137160 ) on Monday March 06, 2006 @04:29PM (#14861236)
    How does this compare with Linux Network Block Device? Sounds very similar.

    There are pretty mature commercial tools for this stuff, as well - Veritas' VVR replication comes to mind.
    • Re:NBD? (Score:4, Insightful)

      by tpgp ( 48001 ) on Monday March 06, 2006 @04:44PM (#14861399) Homepage
      How does this compare with Linux Network Block Device? Sounds very similar

      It doesn't compare at all.

      From my (quick) scan of the article - think of NBD as a replacement for NFS (well, sorta) & this as a sort of network RAID (kinda, not realtime).

      They're not really alike - for linux drbd is probably closer.
      • Ah, that looks nice. I've heard of people running RAID-1 over NBD ('cause it's a block device!), but NBD apparently is a little flaky - there are a lot of kernel deadlock issues, and it doesn't sound like it ever quite picked up a userbase.

        Understandable; the people interested in remote replication mostly aren't interested in doing it with Alpha software. :)
        • by rafa ( 491 )
          We had it enabled where I work - but it took a while to get it tweaked right. In the beginning we got massive lag-spikes on our nfs-exported home dirs. It's a good idea, and I hope the problems with it can be ironed out.
      • Repeat after me:

        RAID is not a backup solution!

        To answer the OP's question, it doesn't compare at all. NBD lets you export drives over the network so that they show up as block devices on remote systems (meaning you can do raw operations on them, use LVM, etc.); this, on the other hand, replicates changes to another filesystem.

        At first glance, this might not seem very effective for backup if deletes are replicated as well. That said, the benefit (as with replication in MySQL) seems (to me) to be that you can
    • Re:NBD? (Score:2, Informative)

      by SmallSpot ( 183360 )
      Not the same as NBD, but it is very similar to DRBD. I've used DRBD before, and it works quite nicely.
  • Shouldn't this technically be called a point in time recovery solution? When I think of a backup solution, I expect to be able to retrieve arbitrary files from an arbitrary point in time. Also, rather than mucking with the kernel, wouldn't it have been simpler to use the geom system?
  • Isn't this guy reinventing the wheel? Why not just run a RAID 1 setup using iSCSI? Wouldn't that accomplish the same thing a lot easier?
    • Possibly, but it would be a lot more EXPENSIVE as well: iSCSI HBAs plus the iSCSI SAN device. And what if you want to replicate your backups to multiple locations? Then you're looking at replication agents on your iSCSI device.
      • You can do iSCSI pretty much entirely in software (I've done it with two VMware VMs before). The only "special" hardware you would need is GbE NICs that support jumbo frames, as well as a switch that does. And you don't have to have those, though performance might not be quite as good. I don't know if there's iSCSI target/initiator software for NetBSD though.

        I'm sure having expensive HBAs can give you a lot better performance (as well as the ability to boot from an iSCSI drive), but j
  • do you get ALL the data on the backup server to start with? Pushing the writes off to the backup server in real-time is identical to what the HP VA7410 SAN I work with does internally in RAID 1+0 except that this happens over the network. But how are the disks in the backup server ever going to get all the original filesystem data if that data already exists AFTER you build your backup server? Even if you have a log of writes, you can't reconstruct the data. You'll only be able to reconstruct rec
    • Every time the server is started, it sends a command to all the clients causing a full sync of all changes that occurred while the server was offline. The same thing happens when a client is restarted: it sends a full sync to the backup server, and any blocks that do not match the client checksum are re-sent.

      Thus the first time you ran this thing it would copy the whole disk image to the backup server. After that subsequent writes would be the only output.
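
      The checksum comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4096-byte block size, the choice of SHA-256, and the function names are all invented here, and the real system's wire protocol is not shown in this thread.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this sketch


def block_checksums(image, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the image."""
    return [
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    ]


def full_sync(client, backup, block_size=BLOCK_SIZE):
    """Copy every block whose checksum differs; return how many were sent."""
    # Make the backup image the same length as the client image first.
    if len(backup) > len(client):
        del backup[len(client):]
    else:
        backup.extend(b"\0" * (len(client) - len(backup)))

    backup_sums = block_checksums(bytes(backup), block_size)
    sent = 0
    for index, digest in enumerate(block_checksums(client, block_size)):
        if digest != backup_sums[index]:
            start = index * block_size
            backup[start:start + block_size] = client[start:start + block_size]
            sent += 1
    return sent
```

      Run against an empty `bytearray`, the first sync copies every non-zero block, which matches the "first time you ran this thing" case; later syncs only move the blocks that changed.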

  • I've been doing this with a web-based system. Not as direct but works automatically when you connect to the site. Platform independent that way.
    • You built this yourself? How do you handle differential compression through a web browser? How do you compare file signatures? Handle permissions?
      • Java applets can do darn near anything.
        • Do you have a link to any code? Or references on the net? What you describe is a significant improvement over rsync/rdiff-backup for mobile users, and I'd like to know more.
          • I haven't yet decided if I'll opensource the system or not. I'm leaning towards opensourcing the client portion (the Java applet) but licensing the server software out. Or maybe I'll go with the new GPL if it protects my rights on software that is used as a service rather than distributed.

            I've done a few test runs which was enough to let me know I had to look into getting a multi-TB server farm before I could open it to the public. I've been trying to get investors for that (it costs around $250/mo per TB o
  • DRBD (Score:1, Informative)

    by Anonymous Coward
    How is this any different from DRBD?

    From the website:

    DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1.

    Each device (DRBD provides more than one of these devices) has a state, which can be 'primary' or 'secondary'. On the node with the primary device the application is supposed to run and to access the device (/dev/drbdX; used to be /dev/nbX). Every
  • In every case I've actually needed backups to date, I find that, if I did them instantly instead of nightly, I'd end up losing data. The most common need for a backup for me comes when I've made a mistake with the main data, and I need to go back to what I had, say, yesterday.

    This isn't to say that instant backups wouldn't be nice for failover architectures, though. I just don't deal with systems that large, yet.
    • Nothing that says you can't do delayed backups with this solution as well, replicate to your (near) real-time backup machine across the network, then tape back-up the replicated machine, this way you're never having to run backups (loading) against your production box and you've got a near-line image sitting on your replicated machine for quick restores.
  • protection (Score:2, Insightful)

    by NynexNinja ( 379583 )
    How does this protect against an rm -rf against the filesystem... I guess it would trash the backup on the other side.
  • by evilviper ( 135110 ) on Monday March 06, 2006 @05:25PM (#14861799) Journal
    This is basically RAID over the network. Personally, I can't see a lot of use for it... Just put the second drive in the machine, and use software RAID, rather than putting the second drive in a network server. Less network slowdown and congestion that way, not to mention CPU-time wasted packetizing, encrypting, etc.

    As always, RAID (and now this) is not a backup solution.

    • To belabor that point just a little bit, my personal observation is that for every hardware-based data loss event I've experienced, there have been 10 user-based events.

      Just today I had to recover the Inbox of a user who deleted a message but didn't know who sent it, when it arrived, or what the subject of it was. He also wasn't sure when he deleted it, so I had to do the restore twice.

      I keep a lot of data backed up on disk, rsynched once a day. Some data I even back up once an hour. It doesn't cost anyt
  • oops! (Score:5, Funny)

    by krismon ( 205376 ) on Monday March 06, 2006 @05:43PM (#14861964)
    Oh no! the rootkit got replicated to the backup server!
  • It seems to me that if you use a journalling filesystem that journals everything, not just meta-data, you can just send the journal logs off to your backup device. Presuming your backup device starts with the same baseline data (i.e. a full level-zero dump) Then you would have the ability to restore your files, or entire filesystem, to the state it was at any point in time just by playing back the journal logs. Presumably a "smart" replay algorithm could be implemented that would use some sort of regular
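
    The journal-replay idea in the comment above might look roughly like this in Python. It is a sketch under assumptions, not any real filesystem's journal format: `JournalEntry` and `replay` are hypothetical names, and each entry is assumed to record a timestamp, a byte offset, and the data written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    timestamp: float  # when the write happened
    offset: int       # byte offset into the filesystem image
    data: bytes       # bytes that were written


def replay(baseline, journal, until):
    """Rebuild the image as it stood at time `until` from a level-zero baseline."""
    image = bytearray(baseline)
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > until:
            break  # everything after this point is "in the future"
        end = entry.offset + len(entry.data)
        if end > len(image):
            image.extend(b"\0" * (end - len(image)))  # the write grew the image
        image[entry.offset:end] = entry.data
    return bytes(image)
```

    Restoring to "just before the mistake" is then a matter of picking `until`. The cost is that the journal grows without bound unless it is periodically folded into a new baseline.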
  • It needs to do a full rescan on reboot?


    That kills it for me.
  • I just wanted to point out that there are several FUSE-based filesystem implementations that do the same thing (functionally, not implementation-wise) and they do not require hooks in the device drivers -- they don't even care what the filesystem is for the original or the backup.

    And, yes, RAID is a very good solution if you've got the money and are smart enough to recognize when a disk fails...
  • Wouldn't this also replicate deletes across to the offsite machine in near-real-time? So if one were to accidentally delete a file, or a $HOME directory, or a complete filesystem, then there would be no way to recover from this from the "backup" machines, because their files would have gotten nuked too?
    • If you chose to implement it that way, then yes it would. Dunno about the NetBSD implementation, but real commercial ones know the difference between create, update, and delete and handle each differently, allowing for file versioning and deletion versioning in the backups.
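
      As a rough illustration of "deletion versioning", here is a Python sketch in which a replicated delete is stored as just another version instead of destroying the backup copy. `VersionedBackup` and its methods are hypothetical and do not describe any particular commercial product.

```python
class VersionedBackup:
    """Keeps every version of each path; a delete is recorded, not applied."""

    def __init__(self):
        self.history = {}  # path -> list of contents, with None marking a delete

    def apply(self, op, path, content=None):
        versions = self.history.setdefault(path, [])
        if op in ("create", "update"):
            versions.append(content)
        elif op == "delete":
            versions.append(None)  # tombstone: older versions stay available
        else:
            raise ValueError(f"unknown operation: {op}")

    def current(self, path):
        """What a plain mirror would show: None if deleted or never seen."""
        versions = self.history.get(path, [])
        return versions[-1] if versions else None

    def last_good(self, path):
        """Most recent content from before any delete, for recovery."""
        for content in reversed(self.history.get(path, [])):
            if content is not None:
                return content
        return None
```

      With this layout an accidental delete replicates as a tombstone, so `current()` agrees with the primary while `last_good()` still returns the lost file.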
  • oh, wait, what?
  • rsync -avz ~/ user@remote:homebackup

    in crontab?
  • If you aren't looking for network functionality, there's a filesystem called ext3cow that lets you roll back to older versions of the contents of the filesystem.
