I don't necessarily hate the marketing concept of 'The Cloud', but I am fascinated by the business decisions that organisations make and the risks they are willing to accept, i.e. the typical: "Demanding high availability and hot failover, instantaneous incident resolution, and 'we are your primary customer'... but also a low cost." I think that Amazon and their competitors *may* get there with their offerings, but until there is a bit more maturity, I expect to see more incidents like this.
My wild guess is that a change triggered this, which of course raises the questions of why the backout plan failed and who signed off on the risk. I can't imagine that this is not change-related; otherwise there is a serious architectural design flaw here somewhere.