Comment Re:Already in Linux and FreeBSD (Score 1) 342

You should ALWAYS have backups.

What's interesting is how much people like to trash ZFS as being unstable, referring to all of the reports of catastrophic data loss.

I read a number of these data-loss reports before deciding to use ZFS on OpenSolaris for my new fileserver.

When you refuse to follow the repeated warnings of the vendor about best practices and you lose your data, it's not the fault of the vendor or the technology; it's yours.

I've read a ton of documentation, articles, manuals, and guides stating that many USB enclosures lie when given the "cache flush" command: they report the flush as complete when in fact data is still sitting in the cache. This can lead to a corrupted uberblock and catastrophic pool failure. USB isn't just not recommended; there are explicit warnings against its use.

VirtualBox has this issue by default, and other VMs may not honor write cache flush commands either.

And yet, so many morons out there are still using USB enclosures and whining loudly when they lose everything. Others using VMs which don't honor cache flush commands are doing the same.

I have a bare-metal OpenSolaris install which is using it the way it was intended. I've actually read the documentation. I follow the best practices.

Comment Re:Already in Linux and FreeBSD (Score 1) 342

ZFS in OpenSolaris is more than just a File System.

It's the entire storage system stack AND the associated file system.

This is why there are TWO major commands used to administer it: "zpool" and "zfs". "zfs" deals with the file system; "zpool" deals with the equivalent of the volume manager.
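A quick sketch of the split, using a hypothetical pool name ("tank") and device names (not from the original comment):

```shell
# zpool: the volume-manager layer -- build and inspect the pool
zpool create tank mirror c1t0d0 c1t1d0   # create a mirrored pool
zpool status tank                        # check pool health

# zfs: the filesystem layer -- manage datasets inside the pool
zfs create tank/home                     # create a filesystem
zfs set compression=on tank/home         # set a per-filesystem property
zfs list                                 # show filesystems and space usage
```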

While the file system is supported by Linux, the stack isn't. The stack is one of the things that makes ZFS great.

While the ZFS stack has been described as "a rampant layering violation", it's the fact that it's not obsessively partitioned that gives it some pretty incredibly useful abilities.

The stack offers end-to-end checksums. Data is checksummed from the time it leaves the drive until it hits RAM, where hopefully ECC takes over in providing data integrity.

Rebuilding a traditional RAID5 or RAID6 array takes the same amount of time whether it's 10% full or 100% full, because the volume manager doesn't know about files; it only deals with volumes (partitions). Rebuilding a RAIDZ, RAIDZ2, or RAIDZ3 that is 40% full takes half the time of one that is 80% full, thanks to those "rampant layering violations". :) If a drive starts racking up unrecoverable read errors leading to corruption, ZFS detects this (on scrub or access) and repairs the bad blocks, provided there's a redundant copy or parity. This does not require a full rebuild; the time it takes depends on the amount of corruption. Triple-parity RAID (RAIDZ3) support was recently added.
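The detect-and-repair cycle above maps onto a couple of commands; a sketch, assuming a pool named "tank" and illustrative device names:

```shell
zpool scrub tank                   # walk all data and verify every checksum
zpool status -v tank               # per-device error counts, plus any corrupted files
zpool replace tank c1t2d0 c1t3d0   # swap a failing disk for a new one; only live data is resilvered
```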

Mirrors under ZFS write at single-drive speed, but read at the speed of a stripe. If you have a co-located server, where you pay more for after-hours access and for rack units, you can set up a 3-way or 4-way mirror that provides extremely high data redundancy with RAID-0 read speeds. How much of a web server's disk I/O load is writes depends on what it's serving, but it is generally overwhelmingly read-heavy, so for one of my friends this is extremely useful.
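Setting up such a mirror is a one-liner; a sketch with hypothetical device names:

```shell
# 3-way mirror: survives two drive failures, reads striped across all three
zpool create webpool mirror c1t0d0 c1t1d0 c1t2d0
zpool status webpool
```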

ZFS offers storage pools as the basic volume structure. A pool consists of some number of RAIDZ groups, mirrors, single drives, or iSCSI shares. Data is striped across all devices currently in the pool when it is written. Online expansion of storage space is accomplished with a single "zpool add" command, very quickly. Adding storage is as painless as JBOD often is, but with the performance of striping. The downside is that once you've added a mirror, RAIDZ group, or drive to a pool, it's there for keeps. Support for removal was supposed to be added at some point...
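Online expansion in practice, sketched with assumed pool and device names:

```shell
# start with one 4-disk RAIDZ group
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0

# later, grow the pool by striping in a second RAIDZ group -- no downtime
zpool add tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
zpool list tank    # new capacity is available immediately
```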

Unlike anything else I've ever used, when you create a volume/partition/filesystem on a zpool, you don't specify a size. Space is allocated dynamically: a filesystem takes whatever its contents require, so there's no more manually managing the amount of space allocated to a given partition. Filesystem limits and quotas are available if you need them.
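For example (pool and dataset names assumed), a filesystem needs no size up front, and limits are just properties you can set later:

```shell
zfs create tank/projects                   # no size given -- grows as needed
zfs set quota=50G tank/projects            # optional upper bound
zfs set reservation=10G tank/projects      # optional guaranteed minimum
zfs get quota,reservation tank/projects    # inspect the limits
```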

The ZFS stack supports high-speed pool caches: a separate log device for the ZIL (writes), and the L2ARC (reads). This allows for huge performance increases using one or more solid state drives. Seagate is doing something similar in hardware with its new Momentus XT laptop drive, which according to AnandTech is comparable to the WD VelociRaptor.
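Attaching SSDs as cache devices is done with the same "zpool add" command; a sketch with hypothetical SSD device names:

```shell
zpool add tank log c3t0d0      # SSD as separate ZIL (intent log) device, speeds up sync writes
zpool add tank cache c3t1d0    # SSD as L2ARC, a second-level read cache behind RAM
zpool status tank              # log and cache devices show up in the pool layout
```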

Referring to ZFS as a filesystem alone, or using it via FUSE neglects 90% or more of what ZFS under OpenSolaris offers. ZFS as a filesystem alone is NOT compelling. It's the whole stack that makes it incredible.
