Postmortem Index

Explore incident reports from various companies

Category

Automation

Incidents caused or amplified by automated systems acting incorrectly, too aggressively, or without enough human review.

Postmortems: 101
Companies: 45
Years covered: 31
Date range: Jun 1996 – Feb 2026
Title Company Date Other categories
GitHub Actions and Codespaces outage of February 2026 GitHub 2026-02-02 – 2026-02-03
Cloudflare systemwide outage, November 18, 2025 Cloudflare 2025-11-18
incident.io service disruption during AWS us-east-1 outage on October 20, 2025 incident.io 2025-10-20
LaunchDarkly service disruption due to AWS us-east-1 outage and internal cascading failures (October 2025) LaunchDarkly 2025-10-20 – 2025-10-21
Amazon DynamoDB US-EAST-1 outage of October 2025 Amazon 2025-10-20
Global Google Cloud API outage due to Service Control null pointer exception Google 2025-06-12 – 2025-06-13
CircleCI workflows latency and failures on April 4, 2025 CircleCI 2025-04-04
CircleCI UI and build capabilities disruption on April 4, 2025 CircleCI 2025-04-04
GitHub DNS infrastructure failure and service degradation on October 11, 2024 GitHub 2024-10-11 – 2024-10-12
GitHub.com database configuration change causes 36-minute outage GitHub 2024-08-14
CrowdStrike Falcon Content Update Incident of July 2024 CrowdStrike 2024-07-19
GitHub Copilot degradation on July 13, 2024 GitHub 2024-07-13
Cloudflare 1.1.1.1 lookup failures on October 4, 2023 Cloudflare 2023-10-04
GitHub availability incidents May 9-11, 2023 GitHub 2023-05-09
Datadog Infrastructure Connectivity Issue March 2023 Datadog 2023-03-08 – 2023-03-10
Cloudflare service token incident on January 24, 2023 Cloudflare 2023-01-24
Intermittent downtime from repeated crashes incident.io 2022-11-18
BigQuery Storage WriteAPI elevated error rates in US Multi-Region Google 2022-10-13 – 2022-10-14
Cloud Filestore ListInstances API failed with error code 429 globally Google 2022-09-13
Google Cloud europe-west2 outage due to cooling system failure Google 2022-07-19 – 2022-07-21
Google Cloud Networking, Storage, and BigQuery reduced capacity for lower priority traffic Google 2022-07-15
Cloudflare outage on June 21, 2022 Cloudflare 2022-06-21
Google Cloud Networking packet loss May 2022 Google 2022-05-20
Heroku April 2022 security incident Heroku 2022-04-07 – 2022-04-14
Atlassian April 2022 customer site deletion outage Atlassian 2022-04-05 – 2022-04-18
GitHub mysql1 cluster repeated service disruptions (March 2022) GitHub 2022-03-16 – 2022-03-23
Slack’s Incident on 2-22-22 Slack 2022-02-22
AWS US-EAST-1 Internal Network Congestion on December 7, 2021 Amazon 2021-12-07 – 2021-12-08
GitHub November 2021 Availability Incident due to MySQL Schema Migration GitHub 2021-11-27
Google Cloud Networking and Load Balancing outage of November 2021 Google 2021-11-16
Google Cloud Networking issues in Europe and other regions on November 12, 2021 Google 2021-11-12
Roblox 73-hour outage due to Consul and BoltDB issues (October 2021) Roblox 2021-10-28 – 2021-10-31
Fastly global outage of June 8, 2021 Fastly 2021-06-08
Delay in starting Docker Jobs. Machine & remote Docker environments blocked CircleCI 2021-05-21 – 2021-05-22
GitHub Actions and Pages impacted by scoped token INT32 overflow GitHub 2021-05-16
Slack Outage on January 4th 2021 Slack 2021-01-04
Amazon Kinesis US-EAST-1 outage November 2020 Amazon 2020-11-25 – 2020-11-26
Cloudflare API and dashboard availability incident on 2020-11-02 Cloudflare 2020-11-02
GitHub background job system degraded availability October 2020 GitHub 2020-10-09 – 2020-10-10
Datadog US region infrastructure connectivity issue Datadog 2020-09-24 – 2020-09-25
GitHub February 2020 mysql1 service disruptions GitHub 2020-02-19 – 2020-02-27
Amazon EC2 and EBS Issues in Tokyo (AP-NORTHEAST-1) on August 23, 2019 Amazon 2019-08-23
Google Cloud Network Outage in Eastern USA, June 2019 Google 2019-06-02
Mandrill Postgres XID Wraparound Outage February 2019 Mandrill 2019-02-04 – 2019-02-05
Elastic Cloud AWS us-east-1 outage of February 2019 Elastic 2019-02-04
GitHub October 2018 Service Degradation due to MySQL Failover GitHub 2018-10-21 – 2018-10-22
Gentoo GitHub Organization compromise of June 2018 Gentoo 2018-06-28 – 2018-07-03
Travis CI database truncation and cross-account session exposure Travis CI 2018-03-13
Travis CI production database truncation Travis CI 2018-03-13
Fortnite service outages of February 3-4, 2018 Epic Games 2018-02-03 – 2018-02-05
GoCardless API and Dashboard outage on 10 October 2017 GoCardless 2017-10-10
Amazon S3 US-EAST-1 outage of February 2017 Amazon 2017-02-28
Instapaper AWS RDS MySQL 2TB File Size Limit Outage Instapaper 2017-02-09 – 2017-02-14
GitLab.com database outage of January 31, 2017 GitLab 2017-01-31 – 2017-02-01
Buildkite outage of August 22nd, 2016 Buildkite 2016-08-22
Platform.sh EU region outage of August 2016 Platform.sh 2016-08-18
Reddit outage and degraded performance on August 11, 2016 Reddit 2016-08-11 – 2016-08-12
Travis CI GCE base image deletion Travis CI 2016-08-09
AWS Sydney Region EC2 and EBS power disruption Amazon 2016-06-05
Google Compute Engine global connectivity loss April 2016 Google 2016-04-12
GitHub January 28th, 2016 datacenter power disruption GitHub 2016-01-28
Sentry hosted Postgres XID wraparound outage Sentry 2015-07-20 – 2015-07-21
Azure Storage service interruption November 2014 Microsoft 2014-11-19
BrowserStack security incident due to Shellshock vulnerability on prototype machine BrowserStack 2014-11-09 – 2014-11-10
Strava upload outage Strava 2014-07-29 – 2014-07-30
Untitled postmortem GitLab 2014-07-08
GitHub DDoS attack of March 2014 GitHub 2014-03-11
GitHub DNS Outage on January 8, 2014 GitHub 2014-01-08
AWS SA-EAST-1 Availability Zone Power and Network Incident, December 2013 Amazon 2013-12-18
MongoHQ security breach impacting CircleCI customer data CircleCI 2013-10-27 – 2013-10-30
Stackdriver Intelligent Monitoring application outage on October 23, 2013 Stackdriver 2013-10-23 – 2013-10-26
Healthcare.gov launch failure Centers for Medicare & Medicaid Services (CMS) 2013-10-01 – 2013-12-31
PagerDuty notification dispatch system outage of April 2013 PagerDuty 2013-04-13
Cloudflare global outage due to router configuration error Cloudflare 2013-03-03
Amazon ELB Service Event in US-East Region on December 24, 2012 Amazon 2012-12-24 – 2012-12-25
GitHub.com outage of December 2012 GitHub 2012-12-22 – 2012-12-23
GitHub network problems on November 30, 2012 GitHub 2012-11-30 – 2012-12-01
AWS US-East Region Service Event of October 22, 2012 Amazon 2012-10-22 – 2012-10-23
Netflix's response to October 2012 AWS EBS degradation Netflix 2012-10-22
GitHub.com availability issues in September 2012 GitHub 2012-09-10
Knight Capital SMARS algorithmic trading incident of August 2012 Knight Capital 2012-08-01
Knight Capital SMARS deployment incident Knight Capital 2012-08-01
AWS US East-1 power failure and service disruption in June 2012 Amazon 2012-06-30
Amazon EC2 and Amazon RDS Service Disruption in US East Region Amazon 2011-04-21 – 2011-04-24
Ariane 5 Flight 501 launch failure of June 1996 European Space Agency 1996-06-04
Amazon EC2, EBS, and RDS EU West Region Service Event Amazon
Untitled postmortem Dropbox
Untitled postmortem Elastic
Foursquare MongoDB memory exhaustion outage Foursquare
GitHub availability incidents in February and March 2026 GitHub
Untitled postmortem Gliffy
Untitled postmortem Google
Google Cloud GCVE deletion incident impacting UniSuper Google
Google Code Jam 2014 Repeated Email Incident Google
Honeycomb operational burden and scaling issues in September and October Honeycomb
Multiple Slack service disruptions in October 2014 Slack
Untitled postmortem PagerDuty
Subversion SHA1 Collision Affects WebKit Repository WebKit code repository
Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region Amazon
Supermarket Intermittent Unresponsiveness Chef.io
Untitled postmortem WebKit code repository