Filesystems with Transactions?
Bryan Andersen asks: "I'm looking for a filesystem in which I can roll back all the changes made by a user to a given date/time. Are there any for Linux or *BSD, or is my only option to go to one of the NAS vendors? I want this so I can more easily clean up after users trash all the files they can access. Yes, I know this would mean I'd have to have much larger partition sizes, but I feel with disk prices the way they are I can't go wrong doing this." I'm not aware of any filesystems that can specifically do this, and I'm not up enough on my JFS knowledge to know if any of those can be adapted to this task without code changes. It would seem like the easiest way to do this would be to mirror the drives at set times (your "commit"), and then a "rollback" would be a simple matter of restoring from those images. Of course, there may be just such a file system in the works that I simply haven't heard about yet. Have you?
Cascadeless rollbacks? (Score:4, Insightful)
Hmmm, as soon as you talked about rolling back "trashed" files, I began thinking about some sort of optimistic validation protocol, where transactions attempt to write to the same file and one of them rolls back (based on timestamp, say). Then I realized you just want a restore point for users, and I'm wondering: why the overhead? Why not just a backup to external tape or, as you suggest, to added internal HD locations?
OK, so maybe having a file system handle the restoration rather than doing it yourself might seem easier, but how hard is backup software?
somewhat unrelated, but... (Score:2, Informative)
What I did was have my computer back up every text file in my home directory (other than a list of patterns to exclude) using cvs, every 3 hours. This did not take up much space, because cvs only stores the changes to a file.
Every time I did a full backup, I backed up everything including the CVS directory, and then emptied that directory.
It was really easy to set up, and I can dig up the script if anyone is interested.
How about using Samba VFS? (Score:3, Interesting)
OpenVMS (Score:1)
'nuff said.
And hey, you can make a KICKASS cluster of these! Forget that beowulf stuff... why not have a cluster that actually _does_ something?
--nbvb
Re:OpenVMS - ClearCase? (Score:1)
Isn't ClearCase based on the versioned file systems in VMS? It's based on something like that.
We have a transactional system built on top of ClearCase where I work. It's OK but I'm sure there are neater solutions to the problem.
I hate to do this, but (Score:2)
Re:I hate to do this, but (Score:1, Insightful)
Re:I hate to do this, but (Score:2)
Network Appliance (Score:3, Informative)
One place where this is quite clever is in providing stable snapshot views of the filesystem for the backup software to read while applications continue to use it.
Some thoughts... (Score:5, Interesting)
To implement it, you need to create three subpartitions. The naive implementation has three distinct areas; better implementations would interleave them somehow.
The first subpartition contains the live filesystem, and it could be *any* filesystem. It really doesn't matter - like the loopback FS, this approach creates a new virtual device that only cares about individual blocks.
The second subpartition contains a circular buffer with the *previous* contents of each block as it is written.
The third subpartition contains an index, one entry for each block in the second partition. Again, it would be a circular buffer on the disk. (Indeed, for performance it should be interleaved with the cache, e.g., one index block followed by the 256 cache blocks it represents, repeating.) The index contains the block number and the time it was updated. Alternatively, you could store just the last block number and maintain a separate list containing time stamps and "last index written."
Write access is straightforward: immediately before you write any block, you copy the existing block into the circular buffer, update the index, then write the new block. This is not much different from regular journaling systems.
Read access is a bit more complex. If you are "live," you always read the live FS. If you have rolled back the FS, you check the index for the first update to that block after the time in question. If it exists, you return the cached block. Otherwise you return the block from the live FS. But in practice you will undoubtedly explicitly mount each rolled-back version of the FS. With a fixed time, you can create a bitmap of changed blocks and quickly load the appropriate block. The driver would have to update this bitmap if the 'live' FS is also mounted. With a "delayed realtime" mount (e.g., showing changes as they occurred 12 hours ago) you would update the bitmap from the index prior to each read.
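The write and read paths described above can be sketched as a toy in-memory model in Python. This is only an illustration under stated assumptions: the names RollbackDevice and IndexEntry are hypothetical, the index and the saved block are folded into one record rather than kept in separate subpartitions, and a bounded deque stands in for the on-disk circular buffer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class IndexEntry:
    block_no: int    # which live block was overwritten
    saved_at: float  # time of the overwrite
    old_data: bytes  # previous contents of that block


class RollbackDevice:
    """Toy block device that keeps old block contents in a bounded ring."""

    def __init__(self, num_blocks, history_size, block_size=4):
        self.live = [bytes(block_size)] * num_blocks
        # deque(maxlen=...) drops the oldest entry when full,
        # which mimics a circular buffer.
        self.history = deque(maxlen=history_size)

    def write(self, block_no, data, now):
        # Save the existing block first, then overwrite the live copy.
        self.history.append(IndexEntry(block_no, now, self.live[block_no]))
        self.live[block_no] = data

    def read(self, block_no, as_of=None):
        if as_of is None:
            return self.live[block_no]
        # The first update to this block after `as_of` saved exactly
        # the contents the block had at `as_of`.
        for entry in self.history:  # iterates oldest to newest
            if entry.block_no == block_no and entry.saved_at > as_of:
                return entry.old_data
        return self.live[block_no]


dev = RollbackDevice(num_blocks=4, history_size=8)
dev.write(0, b"aaaa", now=1.0)
dev.write(0, b"bbbb", now=2.0)
dev.write(0, b"cccc", now=3.0)
print(dev.read(0))             # b'cccc'
print(dev.read(0, as_of=1.5))  # b'aaaa'
```

Note that once the ring wraps, reads for times older than the oldest surviving entry can no longer be trusted. A real driver would have to refuse such mounts, and would replace the linear scan with the bitmap mentioned above.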
Transactional Versioning? (Score:1)
You could also look into Oracle Intermedia, which lets you store files of any type in a transaction database.
Otherwise, you could kludge something with CVS/cron/tar. I've seen some interesting things done during the day with software mirrors splitting for backups (OpenVMS volume shadowing or Solaris Disk Suite), but you risk hosing open files at that point.
Best of luck.
Union mounts (Score:4, Informative)
I'm surprised nobody's yet mentioned union mounts, available at least in OpenBSD and FreeBSD.
The classical use for a union filesystem is to make a CD-ROM appear to be read-write. You mount the CD and then mount another partition on top of it with the union option. Any changes are made to the union-mounted partition.
The underlying filesystem doesn't have to be a CD-ROM, of course. Your problem could be quite easily solved with three disk partitions: two large enough to hold everything, and one large enough to hold the changes.
Start by mounting one of the large partitions and then union mounting the smaller one on top of it. If you need to roll back, simply unmount and newfs the union partition. When you want to commit, assume that wd1c and wd2c are your large partitions and wd3c is your small partition and do something like:
As an added bonus, the union-mounted filesystem can be mounted normally later and you only see the modified files.
Of course, if you're working with really large filesystems and time is critical, this is likely to be too slow for you.
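The roll-back and commit idea can be modeled in a few lines of Python, with dictionaries standing in for partitions. This is a toy sketch, not mount commands: the UnionMount class and its method names are hypothetical, and the commit step (copy the merged view onto the spare large partition, then start a fresh change layer on top of it) is one plausible reading of the scheme, stated here as an assumption.

```python
_WHITEOUT = object()  # marks a file deleted in the upper layer


class UnionMount:
    """Toy model: a writable upper layer stacked on a read-only lower layer."""

    def __init__(self, lower):
        self.lower = lower  # stands in for a large partition
        self.upper = {}     # stands in for the small change partition

    def read(self, path):
        value = self.upper.get(path, self.lower.get(path))
        if value is _WHITEOUT or value is None:
            raise FileNotFoundError(path)
        return value

    def write(self, path, data):
        self.upper[path] = data  # the lower layer is never touched

    def delete(self, path):
        self.upper[path] = _WHITEOUT

    def rollback(self):
        # Analogous to running newfs on the small partition.
        self.upper.clear()

    def commit(self, spare):
        # Copy the merged view onto the spare large partition, then
        # start a fresh upper layer on top of it.
        spare.clear()
        for path in set(self.lower) | set(self.upper):
            value = self.upper.get(path, self.lower.get(path))
            if value is not _WHITEOUT:
                spare[path] = value
        old_lower, self.lower = self.lower, spare
        self.upper = {}
        return old_lower  # untouched; can serve as the next spare


fs = UnionMount({"/etc/motd": "hello"})
fs.write("/etc/motd", "trashed")
fs.rollback()
print(fs.read("/etc/motd"))  # hello
```

The copy in commit is the expensive part, which matches the caveat about large filesystems: rolling back is cheap, committing costs a full pass over the data.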
b&