cyan's Journal: Repairs

Journal by cyan

The server that runs RRX has been in need of some repairs lately. One of the drives in the RAID6 array failed, both CPUs were hitting their temperature thresholds way too early, and the kernel needed to be patched to fix the sendpage() vulnerability, among a few other things.

Since work has decided to bless me with a wonderful start time of 2:00am on Monday, I decided to switch up my sleep schedule last night so that it wouldn't be as much of a blow to my system. Thus, I was up at midnight, and brought the server down at around 3:00am to do a variety of repairs.

The first thing I regretted was not getting a can of compressed air. No doubt part of the problem with the temperature thresholds was that dust caked the fins of the heat sink, causing the air flow to be less efficient. I took those puppies outside and blew on them until I was blue in the face. Unsurprisingly, the old thermal compound between the CPU and heat sink was dry and cracked. So, I removed that with some isopropanol and q-tips, and re-applied some fresh Arctic Silver.

I had always been a little afraid to take the CPU assembly apart, since server boards can be a little trickier to disassemble than desktop boards, but it turned out to be pretty simple. The fans needed a good blowing out as well, what with thick dust collected on the insides and along the fan blades. Many a q-tip was sacrificed to clean those.

My 3ware RAID controller was beginning to creep out of the PCI-X socket, too. With that re-seated, I fired the machine up. There's always a small risk that you blew something in the disassembly, or the reassembly, so powering up is always the most exciting part. All it takes is just a bit too much downward force on a CPU, or some isopropanol dripping where it shouldn't, and I'd be going off to Memory Express to buy a new motherboard as soon as they opened.

Fortunately, the machine just worked. Next was a quick patch and recompile of the kernel, followed by the removal and replacement of the bad hard drive. After that, I finally organized the RAID hot-swap bays to be in alphabetical order (i.e., sda at the top, sde at the bottom).

It felt good to get all of these little maintenance items done in one go. This server still has a ton of life left in it. RAM, CPU, and disk usage are all below 25% on average, so it's going to be a few years yet before it needs to be replaced. Considering I bought the thing four or five years ago, that's not bad value for money at all.
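
For anyone wanting to run the same kind of headroom check, here is a rough sketch in Python. It is not how the figures above were measured. The function names and the 25% threshold parameter are made up for illustration, the load average divided by core count is only a crude stand-in for CPU usage, and the script takes a single snapshot rather than an average over time.

```python
import os
import shutil


def usage_fraction(used: float, total: float) -> float:
    """Return used/total as a fraction; 0.0 if total is not positive."""
    if total <= 0:
        return 0.0
    return used / total


def has_headroom(fraction: float, threshold: float = 0.25) -> bool:
    """True when usage is strictly below the (assumed) 25% threshold."""
    return fraction < threshold


def disk_fraction(path: str = "/") -> float:
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage_fraction(usage.used, usage.total)


def cpu_fraction() -> float:
    """1-minute load average per core (Unix only; a rough proxy for CPU use)."""
    load1, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    return load1 / cores


if __name__ == "__main__":
    for name, frac in (("disk", disk_fraction()), ("cpu", cpu_fraction())):
        print(f"{name}: {frac:.0%} used, under threshold: {has_headroom(frac)}")
```

Run from cron and logged, a snapshot like this could be averaged over weeks to back up an "on average" claim.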

