Postmortem Index

Explore incident reports from various companies

Amazon

A “network disruption” caused metadata services to experience load that caused response times to exceed timeout values, causing storage nodes to take themselves down. Nodes that took themselves down continued to retry, ensuring that load on metadata services couldn’t decrease.

Source | Edit Metadata | JSON file