who said that that flow (dev->qa->staging->prod) didn't happen?
how many people had bugs in prod that didn't show up in the previous steps? how many people had problems only hours after deploying to prod?
you can't always test all conditions. dev and qa may not have the number of users/access/info needed to really replicate a problem. staging may read the prod DB but still not trigger the conditions: maybe they are intermittent, require a special corner case, etc
in this case, a prod DB change is ALWAYS harder to test, and even if you replicate the DB, make the changes and still get the prod updates in real time, you may not trigger the issue. As they said, sometimes the file reached 200 entries, other times it was lower. Everything can be fine until a certain problem shows up and you end up with a cascading, runaway problem. staging may not have the real-life load for that runaway problem to show up!!
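a rough way to put numbers on that (just a sketch: the per-run probability and the run counts are made up for illustration, they are not from the incident, and it assumes each run is independent): if a condition triggers with a small probability on each run, the chance of seeing it at least once depends heavily on how many runs you get.

```python
def chance_of_seeing_bug(p_per_run: float, runs: int) -> float:
    """Probability that at least one of `runs` independent runs triggers the bug."""
    return 1.0 - (1.0 - p_per_run) ** runs


# Hypothetical numbers, for illustration only.
P_PER_RUN = 0.001      # assumed chance that a single run hits the bad condition
STAGING_RUNS = 50      # assumed number of runs staging sees before release
PROD_RUNS = 10_000     # assumed number of runs prod sees over the same period

staging = chance_of_seeing_bug(P_PER_RUN, STAGING_RUNS)
prod = chance_of_seeing_bug(P_PER_RUN, PROD_RUNS)

print(f"staging: {staging:.2%} chance of hitting it at least once")
print(f"prod:    {prod:.2%} chance of hitting it at least once")
```

with those made-up numbers staging sees the problem less than 5% of the time, while prod is almost guaranteed to hit it. same code, same config, only the volume is different.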
adding canary deploys or traffic-shifting deploys could help, and we already know that cloudflare does that... but again, the DB is harder. most DBs can't run a cluster of multiple masters with different configs, so it is hard to have one master DB on the old config and another master DB on the new config so that each app version can use its own DB... and it is much worse if those DBs also need to be fast!
so talk is easy, but there is no perfect solution.
Several years ago a datacenter went dark because the extremely redundant setup (multiple UPSes, generators, different power sources), a very complex setup that was built to never fail... failed due to a small problem that a simple setup would not have cared about. In the extremely complex setup it caused a small cascading issue, and in the end everything failed. Yes, it may be possible to fix that, but maybe we are just adding another layer that can act strangely and cause another cascading problem of its own.
Similarly with the Iberian power failure a few months ago: all systems were distributed, redundant and the like... and a small issue cascaded out of control
you cannot predict all possible problems, and the more complex the system is, the harder it is to predict and control them