
Comment: Re:./ed !!!! Server Reboot Time? (Score 5, Informative) 596

by bradfitz (#11371245) Attached to: LiveJournal Servers Go Down
At this point all my whiteboards are full of boxes for each database cluster and the machines in that cluster: which have passed their checksum tests (InnoDB checksums each 16k page), which have replayed their replay/undo logs, where in the binlogs each was writing/reading/executing, etc...

So lots of waiting now on the checksum validators. I don't want to put a machine back in and find out in a week that there was a corrupt database page because the battery-backed write-back cache on the RAID card didn't work as advertised (which happens on about 95% of RAID cards, in my experience, because they're mostly crap, even the most expensive ones...).
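
For anyone wondering what a per-page checksum validator does, here is a rough sketch in Python. It is not InnoDB's actual on-disk format or checksum algorithm: the toy layout (a 4-byte CRC32 stored at the start of each 16k page, covering the rest of the page) and the names `make_page` and `find_corrupt_pages` are assumptions made up for illustration.

```python
import struct
import zlib

PAGE_SIZE = 16 * 1024  # 16k pages, as mentioned above


def make_page(payload: bytes) -> bytes:
    """Build a toy page: 4-byte big-endian CRC32, then the zero-padded payload.

    Hypothetical layout for illustration only, not InnoDB's real format.
    """
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    if len(body) != PAGE_SIZE - 4:
        raise ValueError("payload too large for one page")
    return struct.pack(">I", zlib.crc32(body)) + body


def find_corrupt_pages(data: bytes) -> list[int]:
    """Return the indexes of pages whose stored checksum does not match."""
    bad = []
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]
        index = offset // PAGE_SIZE
        if len(page) < PAGE_SIZE:
            bad.append(index)  # a truncated page is treated as corrupt
            continue
        (stored,) = struct.unpack(">I", page[:4])
        if zlib.crc32(page[4:]) != stored:
            bad.append(index)
    return bad
```

Every page has to be read and re-checksummed, so the run time scales with the size of the data files, which is why this kind of validation means a lot of waiting.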

Also, whenever there's any doubt about something's integrity, we back up or snapshot the potentially corrupt version before operating on it. That operation can take time too.
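
The "snapshot before you touch it" habit can be sketched roughly like this. It is a file-level illustration only, not the tooling actually used here; `snapshot_before`, the backup naming scheme, and the `operation` callback are all hypothetical.

```python
import shutil
import time
from pathlib import Path


def snapshot_before(path, operation, backup_dir):
    """Copy `path` into `backup_dir`, then run `operation(path)`.

    Returns (backup_path, operation_result). The suspect file is preserved
    as it was, even if the operation rewrites or damages the original.
    """
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Nanosecond timestamp keeps successive snapshots from overwriting each other.
    backup = dest_dir / f"{src.name}.{time.time_ns()}.bak"
    shutil.copy2(src, backup)  # copies file contents and metadata
    result = operation(src)
    return backup, result
```

The copy is a full read and write of the file, so for large database files the snapshot alone can take a long time.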

It's going to be a fun night.

