A post-mortem (also called an incident review or retrospective) is a structured analysis conducted after an incident has been resolved. Its purpose is to understand what happened, why it happened, what was done to resolve it, and what can be improved to prevent similar incidents in the future.
A thorough post-mortem document typically includes a timeline of events, a root cause analysis, the impact on users and the business, what went well during the response, what could be improved, and a list of action items with owners and deadlines.
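One way to make these sections concrete is to model the document as a small data structure. The following is a minimal sketch, not a standard schema: the class names, field names, and the `missing_sections` and `overdue_items` helpers are all hypothetical, chosen only to mirror the sections listed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class TimelineEntry:
    at: datetime  # when the event occurred
    event: str    # what happened or what a responder did


@dataclass
class ActionItem:
    description: str
    owner: str    # each action item has an owner
    due: date     # and a deadline
    done: bool = False


@dataclass
class PostMortem:
    """Hypothetical container mirroring the sections of a post-mortem document."""
    title: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    root_cause: str = ""
    impact: str = ""
    went_well: list[str] = field(default_factory=list)
    to_improve: list[str] = field(default_factory=list)
    action_items: list[ActionItem] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        sections = {
            "timeline": self.timeline,
            "root_cause": self.root_cause,
            "impact": self.impact,
            "went_well": self.went_well,
            "to_improve": self.to_improve,
            "action_items": self.action_items,
        }
        return [name for name, value in sections.items() if not value]

    def overdue_items(self, today: date) -> list[ActionItem]:
        """Return open action items whose deadline is before `today`."""
        return [a for a in self.action_items if not a.done and a.due < today]
```

A review could call `missing_sections()` before the document is circulated, and `overdue_items(date.today())` afterwards to follow up on action items that have slipped past their deadline.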
The most effective post-mortems are blameless: they focus on systemic improvements rather than individual fault, which encourages honest reporting and knowledge sharing. Google's SRE practices and many modern engineering organizations treat blameless post-mortems as a core practice for building reliable systems.