BBC website down – Proxar has a better uptime than BBC

Today, 6th February 2013, just after 10 am GMT, the BBC website started experiencing intermittent problems. Our monitoring platform, which checks the BBC homepage every minute, shows that the website was unstable between 10:15 and 10:23, followed by a full outage from 10:23 to 10:27.
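As a rough illustration of how per-minute checks can be turned into the windows quoted above, here is a minimal Python sketch. It is not our monitoring platform's code: the function name `summarise_windows`, the three state labels and the sample data are assumptions made for the example, and it presumes each check has already been labelled "up", "unstable" or "down" by the probe.

```python
from datetime import datetime, timedelta
from itertools import groupby


def summarise_windows(checks, interval=timedelta(minutes=1)):
    """Collapse ordered (timestamp, state) checks into (start, end, state) windows.

    Consecutive checks with the same state are merged. Each check is taken
    to cover the interval up to the next scheduled check.
    """
    windows = []
    for state, group in groupby(checks, key=lambda check: check[1]):
        group = list(group)
        start = group[0][0]
        end = group[-1][0] + interval
        windows.append((start, end, state))
    return windows


# Illustrative data only, shaped like the incident described above.
first_check = datetime(2013, 2, 6, 10, 13)
states = ["up"] * 2 + ["unstable"] * 8 + ["down"] * 4 + ["up"] * 2
checks = [(first_check + timedelta(minutes=i), s) for i, s in enumerate(states)]

for begin, end, state in summarise_windows(checks):
    print(f"{begin:%H:%M}-{end:%H:%M} {state}")
```

With a one-minute interval the window boundaries are only accurate to the minute, which is why the times above are quoted to the minute as well.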

When managing large dynamic websites, a number of factors have to be taken into consideration when discussing capacity and availability, including:

  • CDN – a content delivery network mostly provides caching, but it can also mitigate a number of DoS attacks
  • Capacity of network devices including switches, routers, firewalls and Intrusion Prevention Systems
  • Resiliency of the network – in an ideal world each device should have a backup. It is possible to set up devices in Active-Active High Availability mode, but that often increases the risk of running both devices above 50% utilisation. If one device then fails, the other cannot handle the combined load, so the pair effectively becomes two single points of failure.
  • Caching – all static resources should be cached, preferably on a layer separate from the web/application servers. There are a number of ways to implement caching, including Apache mod_cache, Squid and file system caching. It is also worth mentioning mod_pagespeed, a relatively new solution developed by Google for optimising websites on the fly. Although the stable release only became available to the public in the 3rd quarter of 2012, many companies have already substantially reduced the size of their content and the number of requests, and improved the structure of their websites, by implementing mod_pagespeed
  • Application performance optimisation – if one request from a client generates 10 requests to the database, a popular website that receives over 1,000 homepage impressions per second at peak time will generate over 10,000 database requests per second. Combining and caching requests can significantly reduce the workload of the web/application layer as well as the database layer.
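The arithmetic behind the Active-Active and database points can be sketched in a few lines of Python. This is a simplified model, not a sizing tool: it assumes identical devices with load expressed as a percentage of one device's capacity, and the function names and the 90% cache hit ratio are hypothetical values chosen for illustration.

```python
def survives_failover(utilisation_percent):
    """Return True if the surviving devices can absorb the load of one failed device.

    utilisation_percent: current load of each device, as a percentage of
    its own capacity. All devices are assumed to be identical.
    """
    total_load = sum(utilisation_percent)
    remaining_capacity = 100 * (len(utilisation_percent) - 1)
    return total_load <= remaining_capacity


def db_requests_per_second(page_views_per_second, queries_per_page, cache_hit_percent=0):
    """Database requests per second once a share of queries is served from a cache."""
    uncached_percent = 100 - cache_hit_percent
    return page_views_per_second * queries_per_page * uncached_percent / 100


# Two devices at 45% each: the survivor would run at 90%.
print(survives_failover([45, 45]))  # True
# Two devices at 60% each: the survivor would need 120% of its capacity.
print(survives_failover([60, 60]))  # False

# 1,000 page views per second with 10 queries each, no caching.
print(db_requests_per_second(1000, 10))  # 10000.0
# Same traffic with a hypothetical 90% cache hit ratio.
print(db_requests_per_second(1000, 10, cache_hit_percent=90))  # 1000.0
```

The failover check makes the 50% rule explicit for a pair of devices, and the second function shows how quickly the database load falls as the cache hit ratio rises.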


