Comment Re:Is Ohio shooting themselves in the foot? (Score 1) 94
That's why everything running in the cloud runs in containers on a cluster (Kubernetes or similar). If a physical server dies, the cluster control software just drops that server from the cluster, and load management automatically moves its containers to the remaining servers in the pool. When enough servers are dead, they send out a tech with a truckload of replacements. Same for storage: everything's on RAID arrays, and as physical SSDs die the array drops the failed drive and keeps going with no data loss, as long as the failures stay within the array's redundancy. Once enough drives in the bay are dead, they send someone to swap them out, and the RAID controller takes care of initializing the new drives and rebuilding their data from the surviving ones as required. It's not uncommon for 30% of the capacity to be out of service before replacements are ordered.
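For anyone who wants to see the idea rather than read about it, here's a rough toy sketch of the drop-and-reschedule behavior described above. It is not how Kubernetes or any real scheduler is implemented; the names Server, Cluster, schedule, mark_dead and needs_replacements are all made up for illustration, and the 30% threshold is just the figure from this comment plugged in as a default.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: int                 # max containers this (hypothetical) server can hold
    alive: bool = True
    containers: list = field(default_factory=list)


class Cluster:
    """Toy model only: drop dead servers, move their containers elsewhere."""

    def __init__(self, servers, replace_threshold=0.30):
        self.servers = list(servers)
        # Assumed policy: order replacements once this fraction of capacity is dead.
        self.replace_threshold = replace_threshold

    def schedule(self, container):
        # Place the container on the live server with the most free slots.
        candidates = [s for s in self.servers
                      if s.alive and len(s.containers) < s.capacity]
        if not candidates:
            raise RuntimeError(f"no capacity left for {container}")
        target = max(candidates, key=lambda s: s.capacity - len(s.containers))
        target.containers.append(container)
        return target.name

    def mark_dead(self, name):
        # Drop the server from the pool and reschedule whatever it was running.
        dead = next(s for s in self.servers if s.name == name)
        dead.alive = False
        orphans, dead.containers = dead.containers, []
        return {c: self.schedule(c) for c in orphans}

    def needs_replacements(self):
        # True once the dead share of total capacity reaches the threshold.
        total = sum(s.capacity for s in self.servers)
        lost = sum(s.capacity for s in self.servers if not s.alive)
        return lost / total >= self.replace_threshold
```

With four 4-slot servers and six containers, killing one server loses 25% of capacity, which stays under the threshold; killing a second pushes it to 50% and the toy model says it's time to send the truck. Everything keeps running in both cases because the survivors had spare slots.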
They still have to catch up to IBM's old mainframes, though. With those, you could go in during peak business hours and start pulling and replacing CPUs, memory modules, and I/O controllers while everything was live, without disrupting anything.