First things first
Returning a service to a green status is the priority in an outage.
This mean looking first into rolling back recent changes or falling back to unaffected regions.
Avoid the temptation to forward fix or spending time to identify the root cause before you do this.
Software Engineering from the Frontlines Course on Maven
If you liked this article, I will be teaching a “Software Engineering from the Frontlines” course on Maven where I will teach hard-learned lessons I acquired developing large-scale products at companies such as Uber, Airbnb, and Microsoft.