Comment: Re:Too funny reading these comments (Score 1) 753
Assume a switch with eight possible cases whose values are sparse and non-contiguous, so a simple jump table optimization is not possible. Two of the cases may be vastly more common than the other six, and PGO can pull them out of the switch or move them to the top. You can also do this by hand, but it will make your code messier and harder to maintain.
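To make the "by hand" version concrete, here is a rough sketch in C. The opcode names and values are made up for illustration, and it simply assumes a profile showed OP_ADD and OP_LOAD to be the hot cases. The hoisted checks duplicate logic that already lives in the switch, which is exactly the maintenance cost mentioned above.

```c
#include <stdint.h>

/* Hypothetical opcodes with sparse values, so a dense jump table is not an option. */
enum {
    OP_LOAD = 3,    OP_STORE = 17,   OP_ADD = 100,   OP_SUB  = 257,
    OP_MUL  = 1024, OP_DIV   = 4099, OP_NOP = 70000, OP_HALT = 900001
};

/* Straightforward version: the compiler chooses the comparison order. */
int cost_plain(uint32_t op)
{
    switch (op) {
    case OP_LOAD:  return 2;
    case OP_STORE: return 2;
    case OP_ADD:   return 1;
    case OP_SUB:   return 1;
    case OP_MUL:   return 3;
    case OP_DIV:   return 20;
    case OP_NOP:   return 0;
    case OP_HALT:  return 0;
    default:       return -1;   /* unknown opcode */
    }
}

/* Hand-tuned version: ASSUMES profiling showed OP_ADD and OP_LOAD dominate.
 * The two hot cases are tested first and everything else falls through to
 * the switch. The costs are now written in two places and must be kept in
 * sync by hand. */
int cost_hoisted(uint32_t op)
{
    if (op == OP_ADD)  return 1;
    if (op == OP_LOAD) return 2;
    return cost_plain(op);
}
```

Both functions return the same values. The only difference is the order of the comparisons, which PGO could pick for you from measured data instead of a guess baked into the source.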
Or take the frequency of code execution. Two routines may execute a lot, so PGO places them together in memory to improve cache locality. But in the source, it may make no sense to have these two functions anywhere near each other.
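The closest manual equivalent I know of is annotating functions yourself. The sketch below assumes GCC or Clang, which accept the hot and cold function attributes; the HOT/COLD macros and the function names are hypothetical. It also bakes your guess about what is hot into the source, whereas PGO measures it.

```c
/* HOT and COLD are made-up macro names wrapping the GCC/Clang attributes.
 * On other compilers they expand to nothing. */
#if defined(__GNUC__) || defined(__clang__)
#  define HOT  __attribute__((hot))
#  define COLD __attribute__((cold))
#else
#  define HOT
#  define COLD
#endif

/* Rarely executed: a candidate for being placed away from the hot code. */
COLD int report_error(int code)
{
    return -code;
}

/* Frequently executed: a candidate for being grouped with other hot code. */
HOT int checksum(const unsigned char *buf, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = (acc * 31 + buf[i]) & 0xFFFF;   /* keep acc within 16 bits */
    return acc;
}
```

Whether the functions actually end up next to each other still depends on the compiler and linker. The GCC documentation also notes that the hot attribute is ignored when profile feedback is available via -fprofile-use, since hot functions are then detected automatically.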
PGO also frequently decides when to inline things and when not to. Even telling the compiler to inline isn't guaranteed to work, since the inline keyword is only a hint. You'd have to resort to compiler-specific attributes (always_inline/__forceinline). And the list goes on and on.
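For reference, this is roughly what the compiler-specific route looks like. FORCE_INLINE is a made-up macro name and clamp_byte is just a placeholder function; the attribute and keyword underneath are the real GCC/Clang and MSVC spellings.

```c
/* FORCE_INLINE is a hypothetical wrapper macro, not a standard name. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline   /* fallback: back to a plain hint */
#endif

/* Placeholder hot helper: clamp a value to the range 0..255. */
FORCE_INLINE int clamp_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```

Now every hot helper carries a macro like this, and the decision is frozen into the source even if the call pattern changes later.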
Taking your argument to the extreme, hand-writing your code in pure assembler can get you even more speed, and now you don't even need -O3 for decent speed. But there's a reason people use high-level languages. PGO is just one more tool that gets you the speed while keeping your code stable and easy to maintain.