Your notion of "optimizing for the hardware" isn't a real thing. By your theory, Linux shouldn't perform well either, since it too is hardware-agnostic.
As for what OEMs have to do: a modern mobile device is immensely complex, with dozens of processors, many on the SoC (system on a chip) and many off it. All of them have to be configured, which is complex, tedious, and easy to get wrong, and every custom board requires a custom configuration. In addition, there are drivers for all of the bits and pieces, which have to be assembled and tested together. There's also typically a complex, multi-stage boot process that has to be orchestrated to bring up all of the hardware in the right way and in the right order. And other stuff that I don't know about because I'm not a hardware systems guy.
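To make the boot-ordering point concrete, here's a toy sketch in Python. The component names and their dependencies are made up for illustration (real boards differ, and real bring-up is far messier than a dependency table); the only real API used is the standard library's graphlib.TopologicalSorter.

```python
from graphlib import TopologicalSorter

# Hypothetical bring-up dependencies: each component maps to the set of
# components that must already be initialized before it can start.
# These names are illustrative assumptions, not any real board's layout.
BRING_UP_DEPS = {
    "pmic": set(),                      # power management comes first
    "clocks": {"pmic"},
    "dram": {"pmic", "clocks"},
    "bootloader": {"dram"},
    "modem": {"clocks", "bootloader"},
    "display": {"clocks", "bootloader"},
    "kernel": {"bootloader"},
}


def bring_up_order(deps):
    """Return one valid initialization order.

    Raises graphlib.CycleError if the dependencies are circular,
    i.e. if no valid order exists.
    """
    return list(TopologicalSorter(deps).static_order())


order = bring_up_order(BRING_UP_DEPS)
print(order)
```

Even in this toy version, adding or moving one component can change what a valid order looks like, which is part of why every custom board needs its own bring-up work.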
Some of the above doesn't depend on the OS, and can be done before it's available. But much of it does depend on OS requirements and has to wait.
And then if the OEM decides to customize Android, they have to do that too, with whatever skin and default apps they want, plus whatever changes they need deep in the system to support the hardware and their changes to the software. Finally there is lots and lots of testing, because such complex, custom devices always expose new interactions between components that have to be debugged and fixed. Oh, and lots of hardware testing as well, including endless burn-in tests to validate that the stuff not only works but keeps working, and that some subtle design flaw doesn't eventually stop it.
And I'm sure there's more at that level that I don't know about.
Then they have to run Google's compliance tests to find out what they've broken with all of their changes, or what they missed in configuring their device for proper support, and then go back and fix what's broken until it passes, or else negotiate with Google for waivers on things they think should be okay. (Actually, the compliance testing is something they do throughout, not just at the end.)
Then comes carrier validation and testing, more rounds of fixes, etc.
Little or none of this has anything to do with "optimization". Optimization is mostly the compiler's job, and it does that job well.