
What About A File System That Uses Snapshots?

equitir asks: "The file system scene for Linux is now all about journaling. While this is good and necessary, I'd like to ask if anyone is working on snapshot capabilities for any of the file systems available. Is it possible? Should it be done? Is there someone already working on something like this?"

"I'm referring to what Network-Appliances are doing: at set times the top-level inodes are replicated but kept pointing to the same hierarchy (i.e. the file system gets a new entry point). New inodes and blocks are only allocated and pointed to for changes. This keeps a large number of entry points within the file system, each representing the state of the file system at the time a snapshot was taken. This is used for backups and allows easy retrieval of deleted or overwritten files. Regular users can do this themselves, from any point at which a snapshot was taken - this is just like undelete, only better...

This is very useful. With the cost of backups being as high as it is and with restorations from tape being admin level work (and slow at that) - I know from experience that this ability is extremely useful. I have also asked myself why this isn't implemented in any of the file systems available for Linux (in fact, I only know of its existence in Network-Appliances file servers)."


  • Peter Braam (one of the Coda guys) and some others are working on a new distributed filesystem called InterMezzo [inter-mezzo.org]. Its intention is to provide Coda's features, but utilise the features (ie. journalling) and performance (cf. reiserfs [devlinux.com]) of the local filesystem on both the server and client.

    It is my hope that it will prove a lot better integrated with Linux(-based GNU systems :-) than Coda. If it fulfills its promise, I have at least a hundred machines which I am looking to install it on.

    Matthew.

  • From what I know of WAFL (the Netapp filesystem) snapshots should have been a fairly easy thing to implement.

    WAFL is a "write anywhere" filesystem. Greatly simplified, this essentially means that when a disk write (don't confuse this with a filesystem write, which will usually be cached and not hit disk for a while) occurs, the newly written block will be written in the free block nearest to the current disk-head position.

    When the transaction is complete (be it a write, a file extension, or whatever) the original data block (which will be located elsewhere on disk) will be released.

    When your filesystem works that way, you already have a lot of code which deals with the possibility that block N of a file may be written in multiple locations. From there, it's not unusually difficult to handle the case that block X on disk may hold data for multiple files. (The cases really aren't as independent as you might think.)

    So, rather than returning obsoleted blocks to the pool of available blocks immediately, a copy-on-write strategy is used to hold chosen versions around without wasting enormous amounts of space with copies or (more importantly) I/O bandwidth with copying.
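
To make the copy-on-write idea above concrete, here is a minimal Python sketch. It is a toy model and not how WAFL is actually implemented; the class `CowStore` and all of its method names are invented for illustration, and a "file" is reduced to a single list of block ids.

```python
class CowStore:
    """Toy block store: the live file is a list of block ids; snapshots share blocks."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes
        self.refs = {}        # block id -> number of block maps pointing at it
        self.next_id = 0
        self.live = []        # block map of the live file
        self.snapshots = {}   # snapshot name -> frozen copy of a block map

    def _alloc(self, data):
        # "Write anywhere": every write lands in a freshly allocated block.
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = data
        self.refs[bid] = 1
        return bid

    def _release(self, bid):
        self.refs[bid] -= 1
        if self.refs[bid] == 0:       # no block map points here any more
            del self.refs[bid]
            del self.blocks[bid]

    def write(self, index, data):
        """Write block `index` of the live file to a new location."""
        if index > len(self.live):
            raise IndexError("sparse writes are not modelled")
        new = self._alloc(data)
        if index == len(self.live):
            self.live.append(new)     # file extension
        else:
            old = self.live[index]
            self.live[index] = new
            self._release(old)        # really freed only if no snapshot holds it

    def snapshot(self, name):
        """Copy the block map only, never the data blocks."""
        self.snapshots[name] = list(self.live)
        for bid in self.live:
            self.refs[bid] += 1

    def read(self, index, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[index]]


store = CowStore()
store.write(0, b"a")
store.write(1, b"b")
store.snapshot("monday")
store.write(0, b"A")                  # the old block 0 survives for the snapshot
print(store.read(0), store.read(0, "monday"))
```

Taking the snapshot costs one copy of the block map; data blocks are shared until the live file overwrites them, which is the space and I/O saving the paragraph above describes.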

    There was a thread on the reiserfs list about copy-on-write a while ago (prompted by the silly "Microsoft invents symlinks" article here). Hans Reiser has said that he would like to have efficient copy-on-write for reiserfs, though it's not a priority. He has also said in the past that what he calls a "wandering log" (which will boil down to the same thing as write-anywhere) is fairly high up the TODO list for reiserfs.

    Don't expect to see this sort of stuff done well on top of a traditional Unix filesystem like FFS or ext2, though. Even with journalling, ext2 is, I think, too far from the requirements of this sort of thing to provide a useful base.

    Matthew.

  • We have developed a filesystem, OBDFS (based on ext2), which HAS snapshot capability, among other things. It's GPL, and available at http://www.lustre.org/ [lustre.org] for download.

    What we have done is split the ext2 filesystem into two layers:

    • the VFS interface, which knows about files and directories
    • the disk interface, which knows about files, blocks, and bitmaps

    What this allows is for the on-disk data handling to be abstracted from what the mounted filesystem looks like, so we can do RAID/mirroring, snapshots, RPC (remote access like NFS), encryption, etc, just by stacking a small mid-level driver between the filesystem and the disk. Currently we have snapshots implemented, and RPC is in progress. The snapshots let you mount several "versions" of your filesystem, with all of the older copies being read-only filesystems.
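
The stacking idea can be sketched in a few lines of Python. This is not OBDFS code or its real interface: `MemoryDisk`, `MirrorLayer`, `TinyFS` and the `read_block`/`write_block` methods are hypothetical names, and mirroring is used only because it is the simplest layer to show.

```python
class MemoryDisk:
    """Bottom layer: stands in for a real disk."""

    def __init__(self):
        self.blocks = {}                  # block number -> bytes

    def read_block(self, n):
        return self.blocks.get(n, b"")

    def write_block(self, n, data):
        self.blocks[n] = data


class MirrorLayer:
    """Mid-level layer: same interface as a disk, duplicates every write."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def read_block(self, n):
        return self.primary.read_block(n)

    def write_block(self, n, data):
        self.primary.write_block(n, data)
        self.secondary.write_block(n, data)


class TinyFS:
    """Top layer: knows about named files, not about what sits below it."""

    def __init__(self, device):
        self.device = device
        self.names = {}                   # file name -> block number

    def put(self, name, data):
        # A new name gets the next unused block number; an old name keeps its block.
        n = self.names.setdefault(name, len(self.names))
        self.device.write_block(n, data)

    def get(self, name):
        return self.device.read_block(self.names[name])


disk_a, disk_b = MemoryDisk(), MemoryDisk()
fs = TinyFS(MirrorLayer(disk_a, disk_b))  # mirroring added without touching TinyFS
fs.put("notes", b"hello")
print(fs.get("notes"), disk_a.blocks == disk_b.blocks)
```

Because every layer exposes the same two calls, a snapshot, RPC or encryption layer could in principle be slotted in at the same place as `MirrorLayer` here.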

    It is definitely NOT for production yet, but OK to play with. A regular OBDFS filesystem can be mounted as ext2 and vice versa, but a snapshot filesystem cannot be mounted/fscked by the normal ext2 tools, unfortunately, although this may change in the future.

  • On a similar note, the Coda network filesystem offers a "disconnected" mode on the client, in which changes can later be "reintegrated" with the server. See http://www.coda.cs.cmu.edu/ [cmu.edu].

    I too would really like to see other filesystems have this capability.
  • check out LinLogFS [tuwien.ac.at].

    from the FAQ: [tuwien.ac.at]

    What is a Log-Structured Filesystem, Anyway?

    A Log-Structured Filesystem (LFS) brings database-like semantics to disk writes. Disk writes are performed in a way that they are either recognized as "completed as a whole" or "hasn't happened at all" even in the case of a power failure/OS crash... whilst a write is in progress.

    Basically, this can be achieved by writing to the disk in an append-only manner. So every change to the filesystem causes the updated information to be appended at the end of the log (i.e: no "update in place").

    So, an LFS can offer you a variety of advantages:

    • Fast crash recovery.
    • Easy support for logical volume management (resizing partitions)
    • Undoing (even multiple levels of) filesystem changes
    • Increased write performance (since all writes are performed in large, sequential chunks).
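
As a rough illustration of the append-only idea and the "undo" advantage listed above, here is a small Python sketch. It is not LinLogFS code; `AppendOnlyLog` and its methods are invented names, and files are reduced to name/content pairs held in memory.

```python
class AppendOnlyLog:
    """Toy log: changes are only ever appended, never updated in place."""

    def __init__(self):
        self.records = []            # (name, content) pairs, oldest first

    def put(self, name, content):
        """Append a change; content=None records a deletion."""
        self.records.append((name, content))
        return len(self.records)     # log position just after this change

    def state(self, upto=None):
        """Replay the log up to position `upto` (default: the whole log)."""
        view = {}
        for name, content in self.records[:upto]:
            if content is None:
                view.pop(name, None)
            else:
                view[name] = content
        return view


log = AppendOnlyLog()
p1 = log.put("a.txt", "v1")
p2 = log.put("a.txt", "v2")
log.put("a.txt", None)               # delete the file
print(log.state())                   # current state: file is gone
print(log.state(p2))                 # state before the delete
print(log.state(p1))                 # state before the second write
```

Every earlier log position is effectively a snapshot, which is why undoing several levels of changes comes almost for free in this design. A real LFS also needs checkpoints and a cleaner to reclaim old log segments, neither of which is modelled here.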

    Is a Log-Structured Filesystem the Same as a Journaling Filesystem?

    Actually, these two terms are often used as synonyms, but they aren't. Journaling can be viewed as a way that allows you to add fast crash recovery capabilities to "classic" filesystems without making big changes to the existing filesystem. This is done by applying the logging approach to filesystem metadata changes. Journaling filesystems put things like "block x allocated to file y" into a log that they use for replaying filesystem changes after a crash.
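
For comparison, replaying a metadata journal after a crash might look roughly like the Python sketch below. The record layout (`begin` / `alloc` / `commit` tuples) and the function name `replay` are assumptions made for this example, not the format of any particular journaling filesystem.

```python
def replay(journal, allocations):
    """Re-apply committed metadata records to `allocations` after a crash.

    `journal` is a list of tuples:
        ("begin", txid), ("alloc", txid, block, file), ("commit", txid)
    Records belonging to a transaction with no commit record are ignored,
    so a half-written transaction looks as if it never happened.
    """
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "alloc" and rec[1] in committed:
            _, _, block, file = rec
            allocations[block] = file     # "block x allocated to file y"
    return allocations


journal = [
    ("begin", 1), ("alloc", 1, 7, "y"), ("commit", 1),
    ("begin", 2), ("alloc", 2, 8, "z"),   # crash before the commit of tx 2
]
print(replay(journal, {}))
```

Only the small journal has to be scanned, not the whole disk, which is where the fast recovery comes from. Unlike the log-structured case, the data itself still lives in a classic update-in-place layout.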

    greetings, eMBee.
    --

  • And a linux port is planned. OTOH, it's commercial and expensive. But if you've gotta have it...
  • by Anonymous Coward

    LVM (logical volume manager) has snapshot support. LVM is included in 2.3.47 and later.

    http://linux.msede.com/lvm/ [msede.com]

    -- caudle
  • Although I don't actually know whether anyone's working on this for Linux, I'd just like to chip in and say that it would be an extremely useful feature.

    The Snapshot feature on the NetApp [netapp.com] filer boxes (which are highly recommended, btw) is described here [netapp.com] - for a simple idea, it's extraordinarily useful, and it's saved my hide a couple of times.


  • I'm fairly sure that Tru64 UNIX's AdvFS (from Compaq/Digital) provides such snapshotting.

    --LP
  • by Anonymous Coward on Monday March 20, 2000 @11:43AM (#1190901)
    Just played around with this problem.
    I think the following "rdist" script works
    and can be placed in a cron job:

    rdist -f snapshot.rdist

    #------------------snapshot.rdist----------------
    ~ -> localhost
    install -oremove ~/.snapshot/.current ;
    except ~/.snapshot ;
    cmdspecial ~ "DATE=`/bin/date +\"\%%Y-%%m-%%d.%%T\"` ; cp -al ~/.snapshot/.current ~/.snapshot/snapshot.$DATE" ;
    #------------------------------------------------

    The first time this is run, it has the overhead of creating a full duplicate of the user's home directory. It then creates a second (visible) copy that is nothing but hard links to the first copy. (Minimal space requirements).
    This copy is named with a date-stamp.
    On subsequent runs, only changed files will be replaced, but the hard links in the date-stamped directories will preserve the older files in their archives. The only additional space required is the files that have changed plus a whole mess of hard links.
    Because of all the stat comparisons, this method isn't suitably efficient for entire systems, as a specialized filesystem would be, and it also preserves entire files, rather than just the blocks that have changed. I suppose something like xdelta could be incorporated to further reduce the space overhead.

    Modes and ownership are preserved, so the user must be careful not to modify the files in the snapshot directory.
    One nice thing about using rdist is that the destination "backup" directory can be on a different machine, hidden from the user if desired.
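
The hard-link step (the `cp -al` part of the script) can also be sketched in Python for anyone who wants to experiment without rdist. This is an illustrative stand-in, not a replacement for the script above: the function name `hardlink_snapshot` and its parameters are made up, it assumes the snapshot directory lives outside the tree being copied and on the same filesystem, and it does not handle symlinks or special files.

```python
import os
import time


def hardlink_snapshot(current, snapshot_root, stamp=None):
    """Create snapshot_root/snapshot.<stamp> as a tree of hard links to `current`."""
    stamp = stamp or time.strftime("%Y-%m-%d.%H:%M:%S")
    dest = os.path.join(snapshot_root, "snapshot." + stamp)
    for dirpath, dirnames, filenames in os.walk(current):
        rel = os.path.relpath(dirpath, current)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir)           # directories are real, files are links
        for name in filenames:
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))
    return dest
```

As with the rdist version, the old contents survive only if the mirror copy is updated by replacing files (write a new file, then rename it over the old one). Editing a file in place changes every snapshot that links to it, which is the same caveat as the warning above about modifying files in the snapshot directory.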
