SAN FRANCISCO - Facebook said on Thursday that it had repaired a technical error that led to long lapses in service at its various properties, including Instagram, WhatsApp and Messenger.
The interruption lasted nearly 24 hours on some of the services and was the longest in Facebook's recent history. It was an eye-opening reminder that even the most powerful internet companies, employing the best computer scientists and cutting-edge technology, can still be crippled by human error.
"All of the big web companies have multiple lines of defense, but sometimes a coding mistake made by one engineer can make its way onto many thousands of computers and cause major errors," said Alex Stamos, a former chief security officer at Facebook and a lecturer at Stanford University. "In other words, rebooting something as complex as Facebook is very, very hard."
A "server configuration change" made on Wednesday had a cascading effect through the company's network, a Facebook spokesman said. That created a repeating loop of problems that kept growing and could not be immediately fixed, according to one current and one former Facebook employee, who spoke on the condition of anonymity because they were not allowed to talk to reporters.
That small mistake had big consequences. Instagram users couldn't view other profiles, WhatsApp users couldn't send messages, and news feeds across Facebook's main app went blank.
Downdetector, which likens itself to a weather report for the internet, said it had received 7.5 million problem reports about Facebook's apps. In comparison, widespread problems on YouTube in October prompted just 2.7 million reports. Downdetector measures service interruptions in part by counting reports from users who are experiencing problems.
"Never before have we seen such a large-scale outage," said Tom Sanders, a co-founder of Downdetector.
Early Thursday, Facebook was able to pull most of its systems back online. The company is still trying to figure out how that error reverberated throughout its network. Facebook officials emphasized that the problem had not been caused by hacking or a cyberassault like a so-called denial-of-service attack, which would hit servers with a wave of traffic that caused them to stop working.
For years, Facebook has recruited engineers on the idea that within weeks they can release computer code that touches billions of people.
"I still get a large amount of fulfillment from seeing my work make a meaningful impact on so many people's lives," a testimonial from one employee says on Facebook's "careers" recruiting page.
But that also means a single employee's mistake can have widespread consequences, especially as Facebook works on a recently detailed plan to consolidate the infrastructure of its "family of apps." The more tightly woven a computer network becomes, the more likely it is that a small technical problem can grow into a large one.
Facebook, like other internet giants, prides itself on never going offline. That predictability has helped it become one of the most influential, and criticized, companies in the world. An estimated two billion-plus people use one or several of its services daily.
As people become more dependent on Facebook's services, for chatting with family and friends as well as doing their jobs, they have higher expectations for performance, Mr. Sanders said.
"The tolerance for down time decreases, and people are increasingly expecting services to operate flawlessly 365 days per year," he said.
Although the incident was an irritation for many users, it had more urgent consequences for businesses, like advertising, that rely on Facebook's network to generate revenue.
Kieley Taylor, global head of social at the advertising agency GroupM, said her firm hadn't been able to get access to Facebook's system, meaning new advertising campaigns were delayed.
"It's never a good day for an outage," she said. "Luckily, it was relatively a short period, but it was fully out."
Her company was still trying to determine how many ad campaigns had been hit. Ms. Taylor said that because Facebook's ad system worked on a pay-as-you-go basis, GroupM wouldn't need to seek reimbursements from Facebook for ad campaigns that weren't delivered.
GroupM diverted advertising to Google search, YouTube and other websites, but said Facebook had unique reach given its size.
"Because of all the people who are on the platform, it continues to be a really powerful digital marketing platform," Ms. Taylor added.