Yup. I can't believe that there were such dodgy trade-offs made for SPEED (at the expense of code readability and complexity) in openSSL.
At least a couple of reasons:
- First of all, OpenSSL was designed with much slower hardware in mind. MUCH slower. And much of is in still in use - embedded devices that last for 15+ years, for example.
- Then there's the problem that while you can dedicate your PC to SSL, the other end seldom can. A single server may serve hundreds or thousands of requests, and doesn't have enough CPUs to dedicate one to each client. Being frugal with resources is very important when it comes to client/server communications, both on the network side and the server side.
Certain critical functionality should be written highly optimized in low level languages, with out-of-the-box solutions for cutting Gordian knots and reduce delays.
A problem is when you get code contributes who think high level and write low level, like in this case. Keeping unerring mental track of what's data, pointers and pointers to pointers and pointers to array elements isn't just a good idea in C - it's a must.
But doing it correctly does pay off. The often repeated mantra that high level language compilers do a better job than humans isn't true, and doesn't become true through repetition. The compilers can do no better than the person programming them, and for a finite size compiler, the optimizations are generic, not specific. And a good low level programmer can take knowledge into effect that the compiler doesn't have.
The downside is a higher risk - the programmer has to be truly good, and understand the complete impact of any code change. And the APIs have to be written in stone, so an optimization doesn't break something when an API changes.