Comment Re:Google doesn't need journaling? (Score 4, Interesting) 348
So there's a major problem with Soft Updates, which is that you can't be sure that data has hit the disk platter and is on stable store unless you issue a barrier operation, which is very slow. What Soft Updates apparently does is assume that once the data is sent to the disk, it is safely on the disk. But that's not a true assumption! The disk drive, especially modern ones with large caches, can reorder writes which are sent to the disk, sometimes (with the right pathological workloads) for minutes at a time. You won't notice this problem if you just crash the kernel, or even if you hit the reset button. But if you pull the plug or otherwise cause the system to drop power, data in the disk's write cache won't necessarily be written to disk. The problem that we saw with journal checksums and ext4 only showed up on a power drop, because there was a missing barrier operation, so this is not a hypothetical consideration.
In addition, if you have a very heavy write workload, the Soft Updates code will need to burn a fairly large amount of memory tracking the dependencies and burn quite a bit of CPU figuring out which dependencies need to be rolled back. I'm a bit suspicious of how well they perform and how much CPU they steal from applications --- which granted, may not show up in benchmarks which are disk bound. But if the applications or the large number of jobs running on a shared machine are trying to use lots of CPU as well as disk bandwidth, this could very much be an issue.
BTW, while I was doing some quick research for this reply. it seems that NetBSD is about to drop Soft Updates in favor of a physical block journaling technology (WAPBL), according to Wikipedia. They didn't get a reference to this, nor did they say why NetBSD was planning on dropping Soft Updates, but there is a description of the replacement technology here: http://www.wasabisystems.com/technology/wjfs. But if Soft Updates is so great, why is NetBSD replacing it and why did Free BSD add file system journaling alternative to UFS?