I disagree.
1. Backups were stored on the same volume as live data, and were destroyed by the same command. I agree that is a bad design on the vendor's part, but dude's responsibility was to read and understand the system he was using, and he tacitly admits he didn't understand that:
This is the part that should be a red alert for every Railway customer reading this. Railway markets volume backups as a data-resiliency feature. But per their own docs: "wiping a volume deletes all backups."
2. No, I think you misread - he says he didn't understand the token's scope:
We had no idea — and Railway's token-creation flow gave us no warning — that the same token had blanket authority across the entire Railway GraphQL API, including destructive operations like volumeDelete. Had we known a CLI token created for routine domain operations could also delete production volumes, we would never have stored it.
3. DR != backups. Disaster recovery is ensuring you have a path back to operational health from disasters. It is a set of plans, procedures, and assets that has to be rehearsed. We test ours once a year; if you are not exercising your procedures, you don't have a DR plan.
Further, the "agent obtained the key itself" - from stuff it was allowed to dig through. It found the credential hardcoded in a script it had access to. This required three different fuckups to happen:
(1) They didn't understand the scope of the token - see above.
(2) They hardcoded the token (which they didn't understand to be 'root' scoped) in a script. This turns any disclosure into a full compromise.
(3) They obviously let the robot root around in lots of stuff it shouldn't have access to. Even aside from the disaster that happened, that's an invitation for adversarial disclosure - if this didn't get them, something else would have at some point.
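For what it's worth, fixing (2) alone would have shrunk the blast radius considerably: a script that reads its credential from the environment leaks nothing when the script itself is disclosed. A minimal sketch (the variable name `RAILWAY_DEPLOY_TOKEN` is my invention, not anything Railway actually defines):

```python
import os
import sys


def load_deploy_token() -> str:
    """Read the API token from the environment instead of hardcoding it.

    The secret never lives in the script, so an agent (or anyone else)
    reading the file learns nothing. The hard failure when the variable
    is unset beats silently running with a blank credential.
    """
    token = os.environ.get("RAILWAY_DEPLOY_TOKEN")  # hypothetical name
    if not token:
        sys.exit("RAILWAY_DEPLOY_TOKEN is not set; refusing to run.")
    return token
```

That's ten lines of discipline, and it's orthogonal to whatever scoping the vendor does or doesn't offer.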
Replace the word "AI agent" with "rogue employee". Would you blame yourself for them going postal and burning your business down?
To start with the utterly obvious, an LLM is not a human, and if you attempt to substitute one for the other, you are necessarily taking responsibility for the robot's actions. This is the same logic as not leaving weapons lying around where kids can find them, except some kids do have the capacity to know better than to use them.
That aside, I do agree that in early-stage companies you're not going to have the safeguards you need to survive a rogue employee or carelessly deployed robot, except probably around the bank account. Which is all the more reason to be careful and understand your tools, or pay someone to do that for you.
The industry is shoehorning this shit into every product and service out there despite multiple documented examples of safeguards not working.
Oh my god. Tech companies are exaggerating their capabilities. This is a never-before-seen crisis - how can other companies possibly be expected to understand that advertised claims may not be accurate or products might even be dangerous? My faith in capitalism is crushed. Please pass me my High Noon beverage so I can drink it while driving my Ford Pinto as my kid uses their Samsung Galaxy in the back seat.