I find the file you cite very readable. It's well formatted, it's clear what the code is supposed to do, and there are comments where necessary. Why do you think it's sarcasm?
Well, for one thing, it’s just code in a file. There is no obvious indication of how this code fits into the wider design of the program, because C doesn’t have much of a module/namespace system. (There are some comments right at the end that seem to be about build order dependencies, but it’s not clear to me what they are trying to achieve. I assume there is some sort of project standard that requires them.)
Next, consider the first function, xor_blocks. It appears to take about 20 lines of code just to call one of four other functions based on how many entries are in an array that was passed in. A significant proportion of the code is only there because the input arrived as a void** and a count rather than a typed array. The rest repeats essentially the same pattern almost verbatim four times. It’s not clear whether the four do_N functions are completely different algorithms or just the same algorithm using defaults if there aren’t enough inputs provided. In the former case, you could express the entire function in about five or six lines in numerous other mainstream languages, and most of those lines would just be a look-up table identifying the required functions. In the latter case, the entire 20+ line function would probably be redundant in many languages. And I see no reason why another language that can express this kind of logic without the overhead shouldn’t generate code behind the scenes that is just as efficient as the example.
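To make the look-up table idea concrete, here is a rough C++ sketch. Everything in it (Block, Sources, xor_n, kRoutines, and the signature I gave xor_blocks) is my own invention for illustration, not what the file actually contains, and I’m assuming at most four sources of the same length as the destination.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these names and
// signatures are assumptions, not the API of the file under discussion.
using Block   = std::vector<std::uint8_t>;
using Sources = std::vector<const Block*>;   // typed array instead of void** plus a count
using XorFn   = void (*)(Block& dest, const Sources& srcs);

// One routine per source count. Here they share an algorithm; genuinely
// different algorithms would slot into the same table unchanged.
template <std::size_t N>
void xor_n(Block& dest, const Sources& srcs) {
    for (std::size_t s = 0; s < N; ++s) {
        assert(srcs[s]->size() == dest.size());
        for (std::size_t i = 0; i < dest.size(); ++i) {
            dest[i] ^= (*srcs[s])[i];
        }
    }
}

// The look-up table: index k holds the routine for k + 1 sources.
const std::array<XorFn, 4> kRoutines = {{xor_n<1>, xor_n<2>, xor_n<3>, xor_n<4>}};

// The whole dispatcher: check the count, then index the table.
void xor_blocks(Block& dest, const Sources& srcs) {
    assert(!srcs.empty() && srcs.size() <= kRoutines.size());
    kRoutines[srcs.size() - 1](dest, srcs);
}
```

A call such as xor_blocks(dest, {&a, &b}) selects the two-source routine from the table. Because the sources arrive as a typed container, none of the casting or counting boilerplate is needed.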
A little further down, we start defining macros like BENCH_SIZE. When these are later used elsewhere, you can’t tell from the call site whether you’re working with a constant or a function call with side effects. (This is my main objection when people complain that C++ overloaded operators could do almost anything and then argue that we should use C instead because everything is explicit.)
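Here is a small sketch of that ambiguity. The names (FIXED_SIZE, DYNAMIC_SIZE, next_size, g_calls) are hypothetical and chosen by me; I’m not claiming BENCH_SIZE itself hides a call, only that nothing at the point of use would tell you either way.

```cpp
#include <cassert>

// Hypothetical names throughout; none of these come from the file under discussion.
int g_calls = 0;  // counts how many times next_size() has run

int next_size() {
    ++g_calls;    // visible side effect
    return 4096 * g_calls;
}

#define FIXED_SIZE   4096          // expands to a plain constant
#define DYNAMIC_SIZE next_size()   // expands to a function call with a side effect

// At the point of use, the two macros look exactly alike.
int twice_fixed()   { return FIXED_SIZE + FIXED_SIZE; }
int twice_dynamic() { return DYNAMIC_SIZE + DYNAMIC_SIZE; }
```

The first function just adds two constants. The second one quietly calls next_size() twice and changes global state each time, yet the two function bodies read identically unless you go and look up the macro definitions.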
That brings us to the second big function, do_xor_speed, in which we again encounter our ambiguous struct containing function pointers and void* parameters. This time, we also use a magic number, rely on (presumably) a global variable and implicit side effects for the main loop control logic, apparently try very hard not to let that loop be optimised in some unspecified way, and cause various implicit side effects on some other (I assume) global variable.
The final major function, calibrate_xor_blocks, has similar issues, and further complicates things by interweaving local macro definitions that mean some of the code isn’t executed, or is executed but is immediately overridden anyway, as well as apparently obfuscating a simple function call behind another macro with a name that looks like a regular function itself.
Now, I do realise that much of this is simply how industrial C gets written in practice. I also realise that there are few realistic choices for a low-level, systems programming language today, and none that I know of has much better readability than C. But that doesn’t negate the criticism that the C code has fairly horrible readability/maintainability properties compared to what could be achieved in a more expressive language.