There are a few things that can cause tremendously widespread outages, essentially all of them network configuration changes. Actually deleting customer data is dramatically more difficult, to the point of being impossible - there are so many different services in so many different locations with so many layers of access control. There is no "one command" that can do such a thing - at the scale of a worldwide network of data centers there is no "rm -rf /".
Ah, but you fail to account for Google's incredible knack for building tools designed to do things at scale. Or for putting AI in things that don't need it.
The possibility that Google will unleash a malicious AI on their infrastructure, or develop a way to destroy a lot of data at scale quite efficiently, or some combination of the two, is far from zero.
"We deployed this private cloud with a missing parameter and it wasn't caught" is as different from "we wiped out all customer data" as hello world is from Kubernetes.
No one promised this "should be impossible". Did you confuse that with "we'll take steps to ensure this never happens again"?
You contend there's no global rm -rf for a global cloud provider, but clearly a missing parameter can rm -rf a customer in an irrecoverable manner.
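To make the "missing parameter" failure mode concrete, here is a minimal hypothetical sketch - every name in it is invented for illustration, not taken from any real provisioning API. The point is that an omitted argument can silently select a destructive default, with no error raised anywhere:

```python
# Hypothetical sketch: how one omitted parameter can schedule
# irrecoverable deletion. All names here are invented for illustration.

DEFAULT_TERM_DAYS = 365  # destructive default applied when the caller omits term_days


def provision_private_cloud(customer_id, term_days=None):
    """Provision a customer cloud.

    If term_days is omitted, a fixed-term default is applied and the
    deployment is scheduled for deletion when the term expires.
    """
    if term_days is None:
        # The dangerous part: no error, no warning -- just a silent default.
        term_days = DEFAULT_TERM_DAYS
    return {
        "customer": customer_id,
        "delete_after_days": term_days,  # deletion timer starts now
    }


# The operator intended an indefinite deployment but left the parameter out:
deployment = provision_private_cloud("some-customer")
print(deployment["delete_after_days"])  # silently set to 365
```

Nothing in that call path looks wrong in review; the destructive behavior lives entirely in the default that applies when the parameter is absent.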
The only thing you're missing is... how every major cloud outage happens today... a bad configuration update. These companies have hundreds of thousands of servers, but they also use orchestration tools to distribute sets of changes to all of them.
You only need a command to rm -rf one box if you are distributing that command to every box.
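The fan-out mechanism above can be sketched in a few lines. This is a toy model, not any real orchestrator - the fleet, the transport, and the function names are all invented - but it shows why no special "global wipe" command needs to exist; the orchestrator just repeats one per-box command everywhere:

```python
# Hypothetical sketch of orchestration fan-out: one per-box command,
# amplified across the whole fleet. All names are invented for illustration.

fleet = [f"host-{n}.dc.example.com" for n in range(100_000)]


def run_on_host(host, command):
    # Stand-in for whatever transport a real orchestrator uses
    # (SSH, an on-box agent, an RPC). Here we just record the action.
    return f"{host}: executed {command!r}"


def rollout(command, hosts):
    """Distribute one command to every host -- the amplification step."""
    return [run_on_host(h, command) for h in hosts]


# A command that is merely bad on one machine...
results = rollout("rm -rf /var/data", fleet)
# ...becomes a fleet-wide incident when fanned out.
print(len(results))  # 100000
```

Real orchestrators add canarying, staged rollouts, and approval gates in front of this loop - which is exactly what the next paragraph is about: those checks exist, and they usually work.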
Now sure, there are tons of security precautions and checks and such to prevent this! But pretending it's impossible is delusional. People do stupid stuff, at scale, every day.
The most likely scenario is a zero day in an environment necessitating an extremely rapid global rollout, combined with a plain, simple error.
And the most telling thing about most of these outages is that the provider later admits in their postmortem that they just didn't really understand how the system they built worked until it fell over and forced them to learn how it really works.
It's the sort of thing that used to keep me up at night.
The release process, monitoring checks, etc. for a customer's private cloud are generally significantly different from those for a global product. I'm not going to get any more specific for all the standard NDA reasons, but having worked for Google and Microsoft among others... no, the risk you describe doesn't translate from one to the other.
I understand you believe the checks cannot fail that catastrophically, and I agree that the likelihood they do is quite low.
But it can happen, and it only has to happen once. (Also, FYI, telling me your work history just tells me you've drunk the Kool-Aid; it ain't proof you know more.)