You mean so all the windows on my XP machine have a different look and the "minimize" buttons don't line up when I stack multiple windows on top of each other? No thanks.
At a glance, I thought that the article title meant that Firefox 4.0 was going to be based upon the Chrome browser, and therefore WebKit... no such luck, I guess. A browser which has full compatibility with the Firefox legacy of plug-ins and runs on the WebKit rendering engine would almost certainly replace Safari as my default browser on both my Macintosh and my PC -- and I would hazard a guess that I'm not the only one who could say this. What's more, the "browser wars" would then effectively be whittled (back) down to a boxing match between Internet Explorer and WebKit, instead of this wild-and-crazy free-for-all that's been going on ever since Netscape gave up the fight and sold out to AOL. Maybe then the collective market share of all of these WebKit-based browsers might drive web development more strongly toward a "standards centered" philosophy of design and away from the "IE workaround" philosophy of design.
Ah, well. A guy can dream, can't he?
4.1.1. Data Integrity in HDFS
HDFS transparently checksums all data written to it and by default verifies checksums when reading data. A separate checksum is created for every io.bytes.per.checksum bytes of data. The default is 512 bytes, and since a CRC-32 checksum is 4 bytes long, the storage overhead is less than 1% (4/512, or about 0.8%).
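The per-chunk scheme and the overhead figure can be illustrated with a minimal sketch. This is not Hadoop code: it uses Python's zlib.crc32 as a stand-in for the CRC-32 checksum, and the names chunk_checksums and storage_overhead are hypothetical, chosen only for this illustration.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # mirrors the io.bytes.per.checksum default quoted in the text
CHECKSUM_SIZE = 4         # a CRC-32 value occupies 4 bytes


def chunk_checksums(data, bytes_per_checksum=BYTES_PER_CHECKSUM):
    """Return one CRC-32 value for every chunk of bytes_per_checksum bytes.

    The final chunk may be shorter than bytes_per_checksum.
    """
    return [
        zlib.crc32(data[offset:offset + bytes_per_checksum])
        for offset in range(0, len(data), bytes_per_checksum)
    ]


def storage_overhead(bytes_per_checksum=BYTES_PER_CHECKSUM):
    """Fraction of extra storage spent on checksums (checksum bytes per data byte)."""
    return CHECKSUM_SIZE / bytes_per_checksum
```

With the defaults, storage_overhead() evaluates to 4/512 = 0.0078125, which is the "less than 1%" figure, and 1300 bytes of data produce three checksums (two full chunks and one partial chunk).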
Datanodes are responsible for verifying the data they receive before storing the data and its checksum. This applies to data that they receive from clients and from other datanodes during replication. A client writing data sends it to a pipeline of datanodes (as explained in Chapter 3), and the last datanode in the pipeline verifies the checksum. If it detects an error, the client receives a ChecksumException, a subclass of IOException.
When clients read data from datanodes, they verify checksums as well, comparing them with the ones stored at the datanode. Each datanode keeps a persistent log of checksum verifications, so it knows the last time each of its blocks was verified. When a client successfully verifies a block, it tells the datanode, which updates its log. Keeping statistics such as these is valuable in detecting bad disks.
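The read-side check can be sketched as follows, again as an illustration under stated assumptions rather than HDFS's actual implementation. ChecksumError is a hypothetical Python stand-in for the ChecksumException mentioned earlier, verification_log is a plain in-memory dictionary standing in for the datanode's persistent log, and the function names are invented for this example.

```python
import time
import zlib


class ChecksumError(OSError):
    """Hypothetical stand-in for ChecksumException (an I/O error subclass)."""


def compute_checksums(data, bytes_per_checksum=512):
    """One CRC-32 value per chunk, as stored alongside the data."""
    return [
        zlib.crc32(data[offset:offset + bytes_per_checksum])
        for offset in range(0, len(data), bytes_per_checksum)
    ]


def verify_block(data, stored_checksums, bytes_per_checksum=512):
    """Recompute each chunk's checksum and compare it with the stored one."""
    actual = compute_checksums(data, bytes_per_checksum)
    if len(actual) != len(stored_checksums):
        raise ChecksumError("checksum count does not match data length")
    for index, (got, expected) in enumerate(zip(actual, stored_checksums)):
        if got != expected:
            raise ChecksumError(f"checksum mismatch in chunk {index}")


# Assumed in-memory substitute for the datanode's persistent verification log:
# block id -> time of the last successful verification.
verification_log = {}


def read_block(block_id, data, stored_checksums):
    """Verify a block on read; record the time only if verification succeeds."""
    verify_block(data, stored_checksums)
    verification_log[block_id] = time.time()
    return data
```

A block whose bytes match the stored checksums is returned and logged as verified; flipping a single bit in the data causes read_block to raise ChecksumError and leaves the log untouched.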
Aside from block verification on client reads, each datanode runs a DataBlockScanner in a background thread that periodically verifies all the blocks stored on the datanode. This is to guard against corruption due to "bit rot" in the physical storage media. See Section 10.1.4.3 for details on how to access the scanner reports.
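A single pass of such a background check might look like the sketch below. It is only an illustration of the idea and not the DataBlockScanner itself: the blocks dictionary, scan_blocks, and checksums_for are hypothetical names, and the periodic scheduling and background thread are left out.

```python
import zlib


def checksums_for(data, bytes_per_checksum=512):
    """One CRC-32 value per chunk of the block's data."""
    return [
        zlib.crc32(data[offset:offset + bytes_per_checksum])
        for offset in range(0, len(data), bytes_per_checksum)
    ]


def scan_blocks(blocks, bytes_per_checksum=512):
    """Check every stored block once and return the ids that fail verification.

    blocks maps a block id to a (data, stored_checksums) pair.
    """
    corrupt = []
    for block_id, (data, stored_checksums) in blocks.items():
        if checksums_for(data, bytes_per_checksum) != stored_checksums:
            corrupt.append(block_id)
    return corrupt
```

Because the scan touches every block regardless of whether any client has read it, it can catch corruption in data that is rarely accessed, which client-side verification alone would miss.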