Gmail outage caused by rogue code
03 March, 2009 08:52
New code triggered a failure during routine maintenance of Google's European datacentres, which led to a two-hour shutdown of its Gmail system around the world last week.
The outage was an "unforeseen side-effect of some new code that tries to keep data geographically close to its owner," Acacio Cruz, Google's Gmail site reliability manager, wrote in a Google blog post.
The rogue software overloaded a datacentre in Europe, which set off cascading problems from one datacentre to another.
"It took us about an hour to get it all back under control," wrote Cruz.
During the two-hour outage last Tuesday, users around the world either could not access their inboxes or had to wait a minute or more for them to open.
Google has had trouble with Gmail before, and users have voiced concerns over the reliability of the service. In the past six months, Gmail has suffered some form of downtime on five separate occasions. In the month of August alone, Gmail had three significant outages that affected not only individual consumers of the free web mail service but also companies and organisations paying for Apps Premier, the company's hosted suite of collaboration, messaging and office productivity services.
According to Google, the bugs have been found and fixed.
Cruz wrote: "We know how painful an outage like this is - we run Google on Gmail, so outages like this affect us the same way they affect you."