Postmortem Index

Explore incident reports from various companies

Stackdriver

In October 2013, Stackdriver, experienced an outage, when its Cassandra cluster crashed. Data published by various services into a message bus was being injested into the Cassandra cluster. When the cluster failed, the failure percolated to various producers, that ended up blocking on queue insert operations, eventually leading to the failure of the entire application.

Source | Edit Metadata | JSON file