Slashdot readers might be interested in the following story about how we recovered from an interesting disk corruption issue.
One of the instruments in PLATO used four 500GB hard disk drives in external FireWire enclosures (model AE5SACSUF from Addonics Technologies). The drives appeared to work perfectly during testing, but in Antarctica we noticed data corruption, and only then did we realize that there was a 127GB limit due to firmware problems in the Oxford Semiconductor 911 chipset. The Addonics website claims that you can get around the limit by simply creating a number of 127GB partitions. However, we found this not to be the case.
We originally noticed the problem following a file system check. The fsck showed errors, which we let it correct, but subsequent checks showed even more errors, and so on, until the filesystem rapidly
To understand the problem we wrote a short program to write a diagnostic block to every block on the disk. The block simply contained the number of the block being written to. Upon reading the data back, we discovered that about 1% of the time some blocks were written with the wrong number, and multiple reads of the same block would sometimes give different results. Furthermore, the problem appeared to be dependent on the previous history of reading/writing the disk.
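A minimal sketch of that kind of diagnostic is shown here in Python. It is not the program we actually ran: the 512-byte block size, the 8-byte big-endian encoding of the block number, and the function names are all assumptions made for illustration, and an in-memory buffer stands in for the raw device.

```python
import io
import struct

BLOCK_SIZE = 512  # assumed block size; the real value is not stated above


def make_block(n, block_size=BLOCK_SIZE):
    """Return one block filled with the 8-byte big-endian block number n."""
    return struct.pack(">Q", n) * (block_size // 8)


def write_pattern(dev, n_blocks, block_size=BLOCK_SIZE):
    """Write block number i into block i, for every block on the device."""
    for i in range(n_blocks):
        dev.seek(i * block_size)
        dev.write(make_block(i, block_size))


def verify_pattern(dev, n_blocks, block_size=BLOCK_SIZE):
    """Read every block back and return (expected, found) for each mismatch."""
    mismatches = []
    for i in range(n_blocks):
        dev.seek(i * block_size)
        data = dev.read(block_size)
        if data != make_block(i, block_size):
            # Report the number stored at the start of the block, if readable.
            found = struct.unpack(">Q", data[:8])[0] if len(data) >= 8 else None
            mismatches.append((i, found))
    return mismatches


# Demonstration on an in-memory buffer standing in for the raw disk.
demo = io.BytesIO()
write_pattern(demo, 16)
# Simulate a misdirected write: data meant for block 9 lands on block 3.
demo.seek(3 * BLOCK_SIZE)
demo.write(make_block(9))
print(verify_pattern(demo, 16))
```

Running the verify pass several times, and after different read/write histories, is what exposes behaviour like ours, where repeated reads of the same block disagree.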
So, we had four 500GB disks that would unpredictably read/write data from/to the wrong blocks. We spent a week trying to see if there were any patterns in the errors, and there was some interesting bit-twiddling going on between the desired and actual block numbers. However, a consistent pattern didn't emerge.
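One simple way to look for such patterns is to XOR each desired block number with the block number actually found and tally which bit positions differ. The sketch below assumes a list of (expected, found) pairs from a diagnostic pass. The function names are hypothetical, the sample pairs are invented for illustration and are not our measurements, and this is not necessarily the analysis we performed.

```python
from collections import Counter


def differing_bits(expected, found):
    """Return the bit positions at which the two block numbers differ."""
    x = expected ^ found
    return [b for b in range(x.bit_length()) if (x >> b) & 1]


def bit_histogram(pairs):
    """Count, over all mismatches, how often each bit position was flipped."""
    counts = Counter()
    for expected, found in pairs:
        counts.update(differing_bits(expected, found))
    return counts


# Invented sample data, for illustration only.
sample = [(8, 0), (9, 1), (4, 5)]
for bit, count in sorted(bit_histogram(sample).items()):
    print(f"bit {bit}: flipped {count} time(s)")
```

A histogram dominated by one or two bit positions would point to a simple addressing fault. A flat one, like the one we effectively saw, suggests no simple rule.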
We had hoped that, since the problem was first identified back in 2001, we would at least be able to find out what the firmware issue was so that we could work around it. However, Oxford Semiconductor were completely uninterested in providing any assistance. IMHO it is fairly extraordinary that you can buy FireWire enclosures in 2008 that have unpatched firmware bugs from 2001.
Anyway... what to do? Remember that our disks were now alone at Dome A, with no chance of human intervention on-site for 12 months.
Those of us who are old enough to remember the VMS tape backup utility will recall that it was legendary for its ability to recover from all sorts of media errors. You could cut and re-splice the tape and backup would fix it for you.
Perhaps there was something similar that could be used for disks?
Yes, there is: par2. par2 is a program that creates "parchive parity files": it starts with an original data file and adds a user-specifiable amount of redundant parity information to it. Given enough redundancy, you can then take a par2 file and delete, say, 20% of it, or reorder blocks in it, or corrupt parts of it, and par2 will be able to verifiably reconstruct the original file. Wonderful!
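To illustrate the idea of rebuilding data from parity, here is a deliberately simplified Python sketch using plain XOR parity, which can restore exactly one missing block out of a group. It is not how par2 works internally: par2 uses Reed-Solomon codes, which can repair many damaged blocks at once. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute one parity block covering all the data blocks."""
    return xor_blocks(data_blocks)


def recover_missing(blocks, parity):
    """Rebuild the single block marked None using the parity block."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    present = [b for b in blocks if b is not None]
    restored = list(blocks)
    # XOR of the parity with all surviving blocks yields the lost block.
    restored[missing[0]] = xor_blocks(present + [parity])
    return restored


data = [b"ABCD", b"EFGH", b"IJKL"]
parity = make_parity(data)
damaged = [data[0], None, data[2]]  # pretend the middle block was lost
print(recover_missing(damaged, parity) == data)
```

With one parity block per group the overhead is small but so is the protection. par2 lets you choose the redundancy level and, unlike this sketch, also detects which blocks are damaged by checksumming them.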
We wrote a Perl wrapper for par2 that writes par2 archives to the disks in raw mode. The wrapper keeps an index of the starting block number of each file, pointers to the next file, and various md5 checksums. We also store multiple special blocks at the beginning and end of each par2 file on disk so that it is possible to recover the index by scanning the disk.
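The marker-block idea can be sketched as follows, in Python rather than Perl. Everything specific here is an assumption made for illustration and not our actual on-disk format: the magic string, the 512-byte block size, the three marker copies at each end, the header layout, and the function names. An in-memory buffer stands in for the raw disk, and arbitrary bytes stand in for a par2 archive.

```python
import hashlib
import io
import struct

BLOCK = 512                         # assumed block size
MAGIC = b"PLATOIDX"                 # hypothetical 8-byte marker signature
COPIES = 3                          # hypothetical number of marker copies per end
HEADER = struct.Struct(">8sQQ16s")  # magic, payload start block, length, md5


def marker_block(start, length, digest):
    """Build one marker block describing where a payload lives."""
    return HEADER.pack(MAGIC, start, length, digest).ljust(BLOCK, b"\0")


def store(dev, first_block, payload):
    """Write markers, payload, markers; return the next free block number."""
    digest = hashlib.md5(payload).digest()
    n_payload = -(-len(payload) // BLOCK)  # ceiling division
    start = first_block + COPIES
    marker = marker_block(start, len(payload), digest)
    dev.seek(first_block * BLOCK)
    dev.write(marker * COPIES)
    dev.write(payload.ljust(n_payload * BLOCK, b"\0"))
    dev.write(marker * COPIES)
    return start + n_payload + COPIES


def scan_index(dev, n_blocks):
    """Rebuild {start_block: (length, md5)} by scanning every block."""
    index = {}
    for i in range(n_blocks):
        dev.seek(i * BLOCK)
        blk = dev.read(BLOCK)
        if len(blk) >= HEADER.size and blk.startswith(MAGIC):
            _, start, length, digest = HEADER.unpack(blk[:HEADER.size])
            index[start] = (length, digest)
    return index


def load(dev, start, length, digest):
    """Read a payload back and check it against its md5 checksum."""
    dev.seek(start * BLOCK)
    payload = dev.read(length)
    if hashlib.md5(payload).digest() != digest:
        raise ValueError("checksum mismatch")
    return payload


disk = io.BytesIO()
next_free = store(disk, 0, b"first archive " * 100)
next_free = store(disk, next_free, b"second archive")
for start, (length, _) in sorted(scan_index(disk, next_free).items()):
    print(start, length, len(load(disk, start, length, _)))
```

Because each marker records the payload's start block explicitly, a marker that survives anywhere in the scan is enough to locate its file, even if the other copies are damaged. A payload that happens to begin a block with the magic string would confuse this simple scan, so a real format would need a checksum over the marker itself.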
This technique worked very well, and we were able to store about 400GB of compressed data per disk and accurately recover any file we wished despite the firmware occasionally reading/writing the wrong blocks.
In fact, we are so pleased with the par2 technique that we are thinking of using it for archiving data on disks that aren't afflicted with an Oxford Semiconductor 911 controller. It is much easier to reconstruct the data from a par2-encoded disk than it is from any standard filesystem. A single bad block can result in substantial data loss with a normal filesystem, but with par2 you can have 10% or more of the disk unreadable and still recover everything.