The optimization flags have nothing to do with CPU errata. You should know that.
Most of that is due to the compiler taking a few liberties with floating-point correctness, which may or may not work out OK.
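To make "liberties" concrete, here is a rough sketch of one such liberty: reassociating additions, which fast-math style options (GCC's -ffast-math, for example) allow the compiler to do. It is written in Python only because the arithmetic is the same IEEE 754 double arithmetic; the values are made up for illustration and nothing here is specific to any particular compiler.

```python
# Illustrative values only (an assumption, not taken from any real program).
# 1e16 is large enough that adding 1.0 to it is lost to rounding in a double.
a, b, c = 1e16, -1e16, 1.0

# Source order: the two large values cancel first, so the 1.0 survives.
strict = (a + b) + c

# Reassociated order, which a fast-math style optimization may pick:
# b + c rounds back to -1e16, so the 1.0 is lost before the cancellation.
reordered = a + (b + c)

print(strict, reordered)
```

Mathematically the two expressions are identical, but in double precision they give 1.0 and 0.0. Whether that kind of difference matters depends entirely on the code being compiled.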
From memory (and this was a few years ago), I recall using the Intel compiler to generate parts of binaries with some pretty hard-core optimizations, where a threaded application using SIMD instructions could access overlapping registers (compliant with the specification) and process the data in those registers simultaneously. It ran perfectly on Intel and Transmeta processors at the time, but not so well on AMD.