A friend of mine lost his job over a similar "automation" task on Windows.
The upgrade script was tested on a lab environment that was supposed to be exactly like production (but it turned out it wasn't - someone had tested something there earlier without telling anyone and never reverted it). The script was then scheduled to run on production during the night.
Result - the \windows\system32 dir got deleted from all the "upgraded" machines. Hundreds of them.
On the Linux side I personally had Red Hat making some "small" changes on the storage side, and PowerPath getting disabled on the next boot after patching. Unfortunate, since all Volume Groups were using /dev/emcpower devices. Or Red Hat making some "small" changes in the clustering software from one month to the next. No budget for test clusters. Production clusters refusing to mount shared filesystems after patching. Thankfully, in both cases the admins were up & online at 1AM when the patching started and we were able to fix everything in time.
Then you can have glitchy hardware/software deciding not to come back up after a reboot. RHEL GFS clusters are known to randomly hang/crash at reboot. HP Blades sometimes have to be physically removed & reinserted before they will boot.
Get the business side to tell you how much the downtime is going to cost the company until:
- Monitoring software detects that something is wrong;
- Alert reaches sleeping admin;
- Admin wakes up and is able to reach the servers.
Then see if you can risk it.
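The back-of-envelope math is trivial but worth actually writing down. A minimal sketch - every number below is made up, plug in your own figures from your monitoring intervals, paging setup, and what the business says a minute of downtime costs:

```python
# Hypothetical numbers for an unattended nighttime patch gone wrong.
detect_min  = 10   # monitoring interval until the failure is flagged
alert_min   = 5    # paging delay until the sleeping admin's phone goes off
respond_min = 30   # admin wakes up, gets online, reaches the servers

cost_per_min = 500.0  # what the business says one minute of downtime costs

outage_min = detect_min + alert_min + respond_min
expected_cost = outage_min * cost_per_min
print(f"{outage_min} min of downtime ~= ${expected_cost:,.0f}")
```

If that figure is bigger than the cost of having an admin awake and watching the patch run, the answer writes itself.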