One Customer Reportedly Triggered The Outage That Broke The Internet
One customer caused the major internet outage that made several well-known websites crash earlier this week, according to the cloud service at the root of the problem.
“The problems were reportedly due to issues experienced by the cloud computing provider Fastly, which identified concerns within its global content delivery network (CDN) and was in the process of implementing a fix.”
“We identified a service configuration that triggered disruptions across our POPs globally and have disabled that configuration,” Fastly announced in a statement.
“The issues began at around 11am BST and lasted for an hour,” according to the BBC. Among the websites impacted were The New York Times, PayPal, CNN, Twitch, Hulu, Vimeo, BBC.com, Shopify, the Financial Times, and The Guardian, as well as specific services, such as Twitter’s emoji feature.
In a blog post titled “Summary of June 8 outage,” Fastly’s Senior Vice President of Engineering Nick Rockwell explained what happened.
“This outage was broad and severe, and we’re truly sorry for the impact to our customers and everyone who relies on them,” Rockwell wrote, before claiming that the crash was caused by a single customer.
“On May 12, we began a software deployment that introduced a bug that could be triggered by a specific customer configuration under specific circumstances,” Rockwell said. “Early June 8, a customer pushed a valid configuration change that included the specific circumstances that triggered the bug, which caused 85% of our network to return errors.”
After providing a timeline of events — between “Initial onset of global disruption” at 09:47 UTC and “Bug fix deployment began” at 17:25 UTC — Rockwell continued.
“Once the immediate effects were mitigated, we turned our attention to fixing the bug and communicating with our customers. We created a permanent fix for the bug and began deploying it at 17:25,” he said.
Among some short-term next steps laid out by Fastly was a “complete post mortem of the processes and practices we followed during this incident” in order to “figure out why we didn’t detect the bug during our software quality assurance and testing processes,” and to “evaluate ways to improve our remediation time.”
“Even though there were specific conditions that triggered this outage, we should have anticipated it. We provide mission critical services, and we treat any action that can cause service issues with the utmost sensitivity and priority. We apologize to our customers and those who rely on them for the outage and sincerely thank the community for its support,” Rockwell concluded.
Fastly is based in San Francisco, and in 2017 “it launched an edge cloud platform designed to bring websites closer to the people who use them.”
As CNET explained, “Effectively this means that if you’re accessing a website hosted in another country, it will store some of that website closer to you so that there’s no need to waste bandwidth by going to fetch all of that website’s content from far away every time you need it.”
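The caching idea CNET describes can be illustrated with a short sketch. This is only a toy model, not Fastly's actual software: the `EdgeCache` class, the `slow_origin` function, and the `/index.html` path are all hypothetical names invented for the example. It shows a nearby "edge" node going to the faraway origin server only the first time a page is requested, then serving its stored copy afterward.

```python
from typing import Callable, Dict


class EdgeCache:
    """Hypothetical edge node that keeps copies of origin content nearby."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_fetches = 0  # counts trips to the distant origin

    def get(self, path: str) -> bytes:
        # Only contact the origin when no local copy exists yet.
        if path not in self._store:
            self.origin_fetches += 1
            self._store[path] = self._fetch_from_origin(path)
        return self._store[path]


def slow_origin(path: str) -> bytes:
    # Stand-in for a website hosted far away (an assumption for this sketch).
    return f"content of {path}".encode()


edge = EdgeCache(slow_origin)
first = edge.get("/index.html")   # fetched from the origin and stored
second = edge.get("/index.html")  # served from the nearby copy
```

In this sketch the second request never reaches the origin, which is the bandwidth saving the CNET quote refers to. Real content delivery networks also expire and refresh stored copies, which this toy version leaves out.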