You don't seem to realize how the economics of this really work out. Nobody will set up a production run beforehand and say "this line only needs to produce 3 usable cores". Nobody will do this because no fabrication process has 100% yield... in fact, most cutting-edge processes yield far less.

In practice there are multiple stages of testing, and a part may be down-binned for numerous reasons:
1> Frequency - i.e. some part of the chip doesn't function at the frequency/voltage specified.
2> Power - i.e. the part would function, but it would consume more power than spec allows.
3> Functional - i.e. some portion just doesn't work. For example, part of the cache is so messed up that the repair mechanisms can't compensate and that section of cache has to be disabled completely, so a die with patterns for an 8M cache ships as a 4M cache. Or maybe one core's branch prediction doesn't operate properly, or one opcode gives the wrong result in a certain corner case, so that core is disabled.
4> Supply/demand - i.e. the actual yield at the top bins is higher than the actual demand, so parts have to be down-binned to meet the demand in the lower bins. This may mean that certain lots - or parts with certain characteristics - get run through a test program that automatically jumps to a lower bin if there is already more than enough supply at the top bins. Testing is expensive: if you can shorten test time by an average of roughly 10% because on 30% of the parts shipped you cut the test time by 30%, that's a multi-million dollar benefit (see the sketch just after this list).
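To put rough numbers on point 4> - a minimal back-of-the-envelope in Python. The percentages are just the illustrative figures from above, not anyone's real test data:

    # Back-of-the-envelope for point 4>: all numbers are illustrative.
    frac_parts_shortened = 0.30   # 30% of shipped parts get a shortened test flow
    time_saved_per_part  = 0.30   # those parts spend 30% less time on the tester

    avg_saving = frac_parts_shortened * time_saved_per_part
    print(f"Average test-time reduction: {avg_saving:.0%}")   # 9%, i.e. "about 10%"

When tester time is the bottleneck, a ~10% reduction in average test time means roughly 10% fewer tester-hours, and with ATE platforms costing millions apiece that shows up directly in how many machines you have to buy.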
If a part gets down-binned at an earlier stage of testing (and there are normally multiple stages), the disabled portion normally won't be tested at the later stages. For example, if you test at the wafer level and determine that you need to down-bin some parts because they're almost certainly going to consume too much power, you only test those parts at the lower frequency/core-count once they're in packages. Then Joe Q. Hacker gets the part and re-enables a disabled core - he doesn't know how much that part was tested. It's quite possible the core he re-enabled wasn't tested as thoroughly as the ones that were enabled when he got it. Since he's got a liquid-cooled setup, he doesn't have issues with the power dissipation - but maybe there was some other latent issue that was never even encountered.
Or maybe the core gets disabled when the part is socketed in a Credence Sapphire ATE (Automated Test Equipment), but the next stage is a more PC-like environment - and at that stage the part already has a core disabled, so the 4th core never gets the full testing in that PC-like environment.
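Here's a toy sketch of that kind of staged flow in Python - emphatically NOT any vendor's real flow (which isn't public); the stage names, power limit, and numbers are all made-up assumptions, just to show how a fused-off core falls out of later test coverage:

    # Toy model of a staged test flow - stage names, limits, and numbers
    # here are all made-up assumptions, not any vendor's real flow.
    POWER_LIMIT = 10.0  # assumed per-core power budget (watts)

    def wafer_sort(part):
        # Early ATE stage: fuse off cores that blow the power budget.
        for core in list(part["enabled"]):
            if part["power"][core] > POWER_LIMIT:
                part["enabled"].remove(core)  # disabled from here on

    def package_test(part):
        # Later, PC-like stage: only still-enabled cores get exercised.
        for core in part["enabled"]:
            part["fully_tested"].add(core)

    part = {
        "enabled": {0, 1, 2, 3},
        "power": {0: 8.0, 1: 9.0, 2: 12.5, 3: 8.5},  # core 2 draws too much
        "fully_tested": set(),
    }

    wafer_sort(part)
    package_test(part)

    print("shipped enabled:", sorted(part["enabled"]))                          # [0, 1, 3]
    print("never fully tested:", sorted({0, 1, 2, 3} - part["fully_tested"]))   # [2]

Core 2 ships disabled and never sees the later test stage - and that's exactly the core Joe Q. Hacker would be re-enabling.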
In your example you didn't say what the demand was. If the demand is 10% four-core and 90% two-core, it makes sense to meet demand by skipping the four-core testing 3/4 of the time and jumping right to the two-core testing. It saves test time, and that saves money, because maybe you can get by with 8 ATE platforms instead of 10. And the code to implement that took maybe a month of engineering effort to write and test (spread across 2-5 people), which is much, much cheaper than even one ATE.
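A minimal sketch of that arithmetic - the demand split and skip rate are from my example above; the per-part test times and the 10-platform baseline are assumptions I picked just to show how the savings fall out:

    # Sketch of the demand-driven shortcut. The "skip 3/4 of the time" comes
    # from the example above; the test times are made-up assumptions.
    skip_fraction  = 0.75    # jump straight to the two-core program 3/4 of the time
    full_test_sec  = 60.0    # assumed: full four-core test program
    short_test_sec = 40.0    # assumed: shorter two-core-only program

    avg_always_full = full_test_sec
    avg_shortcut = (1 - skip_fraction) * full_test_sec + skip_fraction * short_test_sec

    print(f"avg test time, always-full flow: {avg_always_full:.0f} s")   # 60 s
    print(f"avg test time, shortcut flow:    {avg_shortcut:.0f} s")      # 45 s

    # If tester capacity scales with average test time, so does the ATE count:
    print(f"equivalent ATE count: {10 * avg_shortcut / avg_always_full:.1f} vs. 10")  # 7.5

Note the shortcut flow still fully tests 25% of parts as four-core, comfortably above the 10% four-core demand, so you meet demand with margin while cutting average tester time by a quarter.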
Without knowing the actual test flow AMD uses and the yields (neither of which will be revealed to the public), it's impossible to know how likely it is that a core sold disabled is actually good, or how thoroughly that (disabled) core was tested before it was shipped out to customers.
Variables don't; constants aren't.