BBC Online outage on Saturday 19th July 2014

BBC Online · BBC iPlayer, BBC iPlayer Radio, bbc.co.uk audio and video playback, BBC homepage

2014-07-19 – 2014-07-21

On Saturday, July 19, 2014, at 9:30 AM, BBC Online experienced a serious incident affecting several key services. The initial problem involved a metadata service, which saw its database load spike, causing requests to application servers to fail. Simultaneously, a caching layer pool failed, rendering products like BBC iPlayer and the BBC homepage inaccessible.

The metadata service, comprising 58 application servers and 10 database servers, provides program and clip metadata for various BBC iPlayer applications and other BBC Online sites. The sudden increase in database load led to widespread request failures, particularly impacting older applications that did not cache metadata locally.
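The difference between applications that cached metadata locally and those that did not can be illustrated with a minimal sketch. The report does not describe how any BBC application implemented its cache, so the class name `StaleOnErrorCache`, the `fetch` callable, and the TTL value are all hypothetical. The idea is only that a client holding a local copy can keep serving the last good response when the backend starts failing.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class StaleOnErrorCache:
    """Hypothetical local cache that serves the last good value on backend failure."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # assumed call to the remote metadata service
        self._ttl = ttl_seconds      # how long an entry counts as fresh
        self._clock = clock          # injectable clock, useful for testing
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        now = self._clock()
        entry = self._entries.get(key)

        # Fresh entry: answer locally without touching the backend.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]

        try:
            value = self._fetch(key)
        except Exception:
            # Backend failed: fall back to the stale copy if one exists.
            return entry[1] if entry is not None else None

        self._entries[key] = (now, value)
        return value
```

A client without such a layer sends every request to the metadata service, so each backend failure becomes a user-visible failure. With it, only keys that were never fetched successfully return nothing during an outage.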

Concurrently, a critical caching layer, which sits in front of most BBC Online products, suffered a complex failure. This failure directly impacted major services such as BBC iPlayer and the BBC homepage, making them inaccessible to users. The exact root cause of this caching layer failure was still under forensic investigation at the time of the report.

Remediation prioritized restoring the caching layer. Significant remedial work was also carried out on the metadata service throughout Saturday afternoon and evening. Some services returned to a “walking wounded” state, which allowed close to normal operation for much of the site, but BBC iPlayer remained unavailable on several devices.

The services remained in a degraded state through the rest of the weekend. Full restoration was planned for Monday morning, July 21st, to avoid further risk during Sunday evening’s peak usage. As a result, users had less time than usual to watch or listen to programs within the normal seven-day catch-up window, particularly content aired on July 12th and 13th, whose windows ran out during the affected weekend.
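The date arithmetic behind those two broadcast dates can be checked with a short sketch. It assumes the catch-up window is counted as seven calendar days from the broadcast date. The report does not spell out the exact expiry rule, and the function names here are hypothetical.

```python
from datetime import date, timedelta

CATCH_UP_DAYS = 7  # the "normal seven-day catch-up window" from the report

# Incident dates as listed for this entry.
OUTAGE_START = date(2014, 7, 19)
OUTAGE_END = date(2014, 7, 21)


def catch_up_expiry(aired: date) -> date:
    """Last day of the catch-up window, assuming whole calendar days."""
    return aired + timedelta(days=CATCH_UP_DAYS)


def expires_during_outage(aired: date) -> bool:
    """True if the window closes while the services were degraded."""
    return OUTAGE_START <= catch_up_expiry(aired) <= OUTAGE_END


for aired in (date(2014, 7, 11), date(2014, 7, 12), date(2014, 7, 13)):
    print(aired, catch_up_expiry(aired), expires_during_outage(aired))
```

Under that assumption, programs aired on July 12th and 13th reach the end of their window on July 19th and 20th, inside the degraded period, while a program aired on July 11th expires the day before the incident began.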

Keywords

bbc online, iplayer, iplayer radio, metadata, database, caching, outage, july 2014