The Inquirer-Home

Amazon cloud problems downed Netflix at Christmas

Load balancing issues interrupted movie streaming service
Wed Jan 02 2013, 12:17

ONLINE BOOKSELLER Amazon has explained how a load balancing issue at its facilities took down the Netflix TV-on-demand service on Christmas Eve.

Netflix went down in parts of the US on Christmas Eve, a time when people like to watch films.

Netflix admitted to the problems on its Twitter feed, explaining that as soon as it became aware of the issue its engineers started working on a fix.

Judging by the time between that tweet and another one that said the problem had been fixed, Netflix service was interrupted for about four hours.

Netflix had called on Amazon's web services engineers to help it, and Amazon has issued its own explanation of the problem and how it affected Netflix and its other customers.

"While the service disruption only affected applications using the Amazon Elastic Load Balancing Service (ELB) service (and only a fraction of the load balancers were affected), the impacted load balancers saw significant impact for a prolonged period of time," it said.

"A portion of the ELB state data was logically deleted [by] a maintenance process that was inadvertently run against the production ELB state data... Unfortunately, the developer did not realize the mistake at the time. After this data was deleted, the ELB control plane began experiencing high latency and error rates for API calls to manage ELB load balancers."

Amazon added that while this situation continued some of its customers "began to experience performance issues with their running load balancers".

It said that it has fixed the ELB to prevent the same thing from happening again: permission must now be granted before data can be deleted. It also apologised to all of its business users.

"We want to apologize," it added in closing its lengthy apology. "We know how critical our services are to our customers' businesses, and we know this disruption came at an inopportune time for some of our customers. We will do everything we can to learn from this event and use it to drive further improvement in the ELB service." µ

 
