This back-and-forth ignores a critical point: not all bugs are created equal, and not all systems fail in the same ways or share the same risk profile and scale. What if your REST service returns a 500 to a user because of something you just released? OK, that's bad if you rolled it out to all your servers and it hits every user. But what if the client always retries three times (as REST clients should), and you only rolled it out to 5% of your servers? As long as retries can land on a different server, most clients are unlikely to see anything wrong at all, and it's still obvious from the server-side errors that you should pull the release back immediately. In fact, the pull-back should be automatic as soon as monitoring shows the failure profile is worse than before.
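To put rough numbers on that, here is a minimal sketch in Python. It assumes every attempt is routed independently and uniformly across servers, that the new servers always fail and the old ones always succeed, and that "3 retries" means one initial attempt plus three more. The function names and the 1% tolerance are made up for illustration and do not come from any particular deployment tool.

```python
def p_client_sees_failure(canary_fraction: float, attempts: int) -> float:
    """Chance that every attempt lands on a broken canary server.

    Assumption: attempts are routed independently, canary servers always
    fail, and all other servers always succeed.
    """
    return canary_fraction ** attempts


def should_roll_back(baseline_error_rate: float,
                     canary_error_rate: float,
                     tolerance: float = 0.01) -> bool:
    """Hypothetical automatic pull-back rule: roll back when the canary's
    error rate exceeds the baseline by more than `tolerance`."""
    return canary_error_rate > baseline_error_rate + tolerance


if __name__ == "__main__":
    # 5% of servers run the bad release.
    no_retry = p_client_sees_failure(0.05, attempts=1)    # 1 in 20 users hit
    with_retry = p_client_sees_failure(0.05, attempts=4)  # roughly 6 in a million
    print(f"no retries: {no_retry:.4%}, with 3 retries: {with_retry:.6%}")

    # The servers still see the errors even when clients do not.
    print(should_roll_back(baseline_error_rate=0.002, canary_error_rate=0.05))
```

The point of the second function is that the rollback decision is driven by what the servers observe, not by what the clients end up seeing after their retries.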
As for risk and scale, compare a banking application that is used only thousands of times a day with a social network used thousands of times a minute. In the banking case, the risk of getting something wrong and drawing regulators' ire is great, while on the social network a few entries missing from your wall range from a minor annoyance to unnoticeable. And the likelihood that you'll actually see the problem quickly is huge on the social network, while it may not be on the less-used app. The social network is obviously a good candidate for DevOps-style continuous-release systems, while the banking app would need more evaluation to see where the line should be drawn.
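The difference in how quickly a problem shows up can be sketched with some back-of-the-envelope arithmetic. The figures below (5,000 requests per day or per minute, a 5% rollout, and 50 failures needed before anyone trusts the signal) are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_to_observe(failures_needed: int,
                       requests_per_minute: float,
                       canary_fraction: float,
                       failure_rate: float = 1.0) -> float:
    """Expected minutes until `failures_needed` failures have occurred,
    assuming traffic is spread evenly and canary requests fail at
    `failure_rate`."""
    failing_per_minute = requests_per_minute * canary_fraction * failure_rate
    return failures_needed / failing_per_minute


if __name__ == "__main__":
    # Assumed traffic: "thousands" taken as 5,000 in both cases.
    social = minutes_to_observe(50, requests_per_minute=5000,
                                canary_fraction=0.05)
    banking = minutes_to_observe(50, requests_per_minute=5000 / MINUTES_PER_DAY,
                                 canary_fraction=0.05)
    print(f"social network: {social:.1f} min, banking app: {banking / 60:.1f} h")
```

Under these assumptions the social network gathers enough failures in well under a minute, while the banking app needs several hours, which is part of why the same release process may not suit both.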