Comment Re:FTFY (Score 4, Interesting) 629
I'd expect a systems admin to be able to diagnose a problem like that -- not that ours can. But most programmers I meet can't. They'll be trying to fix their code all day long when their system has bad RAM.
Our customers have the same problem. They'll ask why our software is slow on "just this one node" and tell us to "fix the bug".
I have to look through system call timings, application logs, kernel messages, kernel dev tools, blah blah, to give them evidence of what I already know: "It's a hardware problem. This seems to be a known failure pattern in the Linux kernel for cache coherency errors between SMP CPUs"... or whatever. We're an application vendor. I guess these companies spend enough money with us that it's worth it to my employer for me to play tinker-toy remote systems admin for them by way of systems debugging.
I get roped into these problems because no one else on my team can figure them out.
It pays.