Software Engineering Tidbits


Blameless Postmortem

Georges El Khoury
Mar 22, 2022

Building a blameless postmortem culture is one of the most effective ways to improve quality in an engineering organization.

One of the best managers I had once told us: "The only sure way not to have an outage is not to deploy to production, and I surely expect you to ship and deploy a lot of features. The only thing I care about once we have an outage is that we write a postmortem and learn from it."

A good outline for a postmortem outage report:

Executive Summary
Timeline
Impact
Root Cause Analysis
Detection
Mitigation
Prevention
Next Steps
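
The outline above can be turned into a reusable template so that every report starts with the same sections. What follows is a minimal sketch in Python, not a prescribed tool: the `Postmortem` class, its `missing_sections` and `render` methods, and the placeholder text are all hypothetical names and values made up for illustration. Only the section names come from the outline.

```python
from dataclasses import dataclass, field

# Section names taken from the outline above, in the same order.
SECTIONS = [
    "Executive Summary",
    "Timeline",
    "Impact",
    "Root Cause Analysis",
    "Detection",
    "Mitigation",
    "Prevention",
    "Next Steps",
]


@dataclass
class Postmortem:
    """Hypothetical container for one outage report (illustration only)."""

    title: str
    sections: dict[str, str] = field(default_factory=dict)

    def missing_sections(self) -> list[str]:
        """Return the outline sections that are absent or blank."""
        return [s for s in SECTIONS if not self.sections.get(s, "").strip()]

    def render(self) -> str:
        """Render the report as plain text, marking unfilled sections as TODO."""
        lines = [f"Postmortem: {self.title}", ""]
        for name in SECTIONS:
            body = self.sections.get(name, "").strip() or "TODO"
            lines += [name, body, ""]
        return "\n".join(lines)


# Placeholder content, not a real incident.
report = Postmortem(
    title="Example outage",
    sections={
        "Executive Summary": "Placeholder summary.",
        "Impact": "Placeholder impact.",
    },
)
print(report.missing_sections())
print(report.render())
```

Checking `missing_sections()` before a report is shared is one simple way to make sure no part of the outline gets skipped, especially Prevention and Next Steps, which are the sections that drive follow-up work.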

Writing a good report, sharing it broadly and following up on next steps will unlock a culture of continuous improvement in your organization.





© 2023 Georges El Khoury