Comment Re:Death knell (Score 2, Informative) 361

In the link you posted, the admin found three uberblocks (there are supposed to be four). ZFS correctly made multiple uberblocks, per design. It appears that all three were corrupt.

ZFS keeps a history of the last 256 uberblocks in four different places in the pool. So even if all copies of the most recent uberblock got corrupted, it could still fall back to one of the older ones. You'd maybe lose the last few minutes of work, but that's not nearly as catastrophic as losing the whole pool. It could fall back, but it doesn't; instead it panics the kernel. This is where a userspace fsck would help, for example by giving you the choice to safely invalidate the most recent uberblocks. I was not feeling very comfortable when I wrote the dd/ud script that automated that task, but I had nothing to lose at that point.
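To make the fallback idea concrete, here is a rough sketch in Python. It is not ZFS code and does not touch a disk: `Uberblock`, `checksum_ok`, `pick_uberblock` and `max_txg` are names I made up for illustration, and real ZFS has more to consider than a transaction group number. It only shows the selection rule I would expect: take the newest uberblock that still verifies, and optionally let the admin cap how new it may be.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Uberblock:
    """Hypothetical, simplified stand-in for an on-disk uberblock."""
    txg: int           # transaction group number; higher means newer
    checksum_ok: bool  # True if the stored checksum matched when read


def pick_uberblock(candidates: Iterable[Uberblock],
                   max_txg: Optional[int] = None) -> Optional[Uberblock]:
    """Return the newest uberblock that verifies, or None if none does.

    max_txg is an assumed knob: it ignores anything newer than the given
    transaction group, which mimics invalidating the last few uberblocks
    without overwriting them.
    """
    usable = [
        ub for ub in candidates
        if ub.checksum_ok and (max_txg is None or ub.txg <= max_txg)
    ]
    # max() with default=None avoids raising when nothing is usable.
    return max(usable, key=lambda ub: ub.txg, default=None)
```

With something like this, a corrupt newest uberblock costs you the transactions after the last good one instead of the whole pool, and a returned `None` is a clean error rather than a panic.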

The silent corruption was just an example. It doesn't matter what causes the corruption. But _if_ you end up in a situation like the admin's or mine, you have to resort to such ugly tricks to recover your pool. And that is something I'm not willing to accept on a production system - or any system at all, for that matter.

Comment Re:Death knell (Score 2, Interesting) 361

Even if you have 100% bug-free drive firmware, silent data corruption is still possible (resilience against silent data corruption is one of ZFS's selling points!). Filesystems that can't handle that simply have no place in today's world. The problem is that one flipped bit can cause ZFS to think that the whole pool is unusable - even though it keeps redundant copies of the metadata, which it then completely ignores! What does it keep the copies for, then?

Comment Re:Death knell (Score 5, Insightful) 361

Every disk will corrupt data eventually; it's just a matter of time. Not even the best hardware will help you there. So the question is how well the filesystem catches these errors and corrects them. It turns out ZFS is really bad at this, as it can get into a state where you can't even import the pool (zpool either stops with an error or, in worse cases, causes a kernel panic). There have been numerous bug reports on the zfs mailing list and the opensolaris bug tracker. So far nobody seems interested in fixing those. My pool got corrupted in such a way. I had to manually poke around the filesystem and invalidate metadata until zpool was able to import the pool. That is something a 'fsck' could have easily done, but Sun refuses to create such a tool because, according to them, ZFS is robust enough. All credit goes to this guy who had the idea to invalidate the uberblocks directly on the disk: http://opensolaris.org/jive/message.jspa?messageID=318457#318457

Comment Re:Screen works welll (Score 1) 288

Of course two different users can't connect to the same screen session. That would be a big security risk. Look into /var/run/screen and you'll see that only you have access to the directory where your screen sockets are stored. Maybe it's possible to fiddle with the file permissions, but I wouldn't recommend it. Just create a new user that all the people have access to and tell them to use 'screen -x'.

Submission + - .. this software has bugs.

no@bo.dy writes: I know of a pretty heavy bug (it causes a crash) in very popular software which affects thousands of users. As I am a software developer myself, I have found out exactly why it crashes (an off-by-one error). I also know how to work around it, but the 'fix' is too complicated for the majority of users. The developer (a big company) has neither a web form nor an email address for bug reporting. I could send it to their 'technical support' or use a forum, but the chance of getting a bug fixed that way has proven to be marginal. The hardest thing is that the company doesn't give any sort of feedback about accepted bugs or their status. The bug has already been reported many times through the available channels, but I don't even know whether they are aware of it (it may have been overlooked, etc.). How do I best tell the developer? What are my options (besides making all relevant data public and hoping that public pressure will force the company to fix it)? Thanks.
