URL: https://thenewstack.io/cdn-outages-exploring-ways-to-increase-resilience/

CDN Outages: Exploring Ways to Increase Resilience
DevOps / Frontend Development / Security

Issues at Content Delivery Network (CDN) providers caused several high-profile outages in recent years — here's how to increase resilience.
Jul 6th, 2022 8:55am by Jeff Goldman
Feature image via Shutterstock.

Issues at Content Delivery Network (CDN) providers have caused several high-profile outages over the past few years: a Cloudflare outage last month affected a large proportion of its customers; a Fastly outage a year ago knocked out websites ranging from Amazon to CNN; and another Cloudflare outage in 2020, this time affecting its 1.1.1.1 DNS service, had a similarly broad impact.

In each case, the specific trigger was different, but the common denominator was human error. Cloudflare’s recent outage was caused by an error in a configuration change, Fastly’s 2021 outage by a bug in newly deployed software, and Cloudflare’s 2020 outage by a misconfiguration that overloaded a router in Atlanta.

Responding to the Unexpected

Those issues point to a key challenge in a constantly changing online environment: as Fastly chief product and strategy officer Lakshmi Sharma told The New Stack, no amount of testing or simulations can anticipate every possible contingency. “The internet and the large edge and cloud infrastructures that support it are highly resilient, but, as with any complex system with interdependencies on many fixed and variable pieces, the unexpected can happen — resulting in outages,” she said.

Fastly’s answer, Sharma said, is to be as direct and forthcoming as possible if and when issues do come up. “The best we can do is to be transparent with customers, be honest about what happened and share key learnings to not only help resolve the issue in the moment but help reduce recovery time when any outage occurs in the future,” she said.

In part, that means encouraging customers to consider a diversified strategy for increased resilience. “We want our customers to be successful, so we also recommend changes they can make in their own infrastructure, including implementing a multi-CDN or multicloud strategy for business continuity where strategically appropriate,” Sharma said.
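
To make the multi-CDN idea concrete, here is a minimal Python sketch of one possible form of it: application-level failover that tries a list of CDN endpoints in order and falls back to the next when one is unreachable. It is an illustration under stated assumptions, not a description of how Fastly or its customers implement this. The function names and the `cdn-a.example.com` / `cdn-b.example.com` hostnames are hypothetical, and multi-CDN setups are commonly steered at the DNS or load-balancing layer rather than in application code.

```python
from typing import Callable, Iterable, Tuple
from urllib.request import urlopen
from urllib.error import URLError


def http_fetch(url: str, timeout: float = 2.0) -> bytes:
    """Fetch a URL with a short timeout so an unreachable CDN fails fast."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(
    path: str,
    cdn_bases: Iterable[str],
    fetch: Callable[[str], bytes] = http_fetch,
) -> Tuple[str, bytes]:
    """Try each CDN base URL in order; return (base_that_worked, body).

    `cdn_bases` is a hypothetical, operator-supplied priority list.
    """
    errors = []
    for base in cdn_bases:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return base, fetch(url)
        except OSError as exc:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            errors.append(f"{base}: {exc}")
    raise RuntimeError("all CDNs failed: " + "; ".join(errors))
```

A caller would pass the asset path and the ordered provider list, for example `fetch_with_failover("/app.js", ["https://cdn-a.example.com", "https://cdn-b.example.com"])`. The `fetch` parameter is injectable so the failover path can be exercised in tests without waiting for a real outage.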

Cloudflare declined to comment for this article beyond the information in the blog posts linked above, but the company has been similarly transparent about the causes of these events and the actions it is taking in response.

Still, when simple configuration mistakes can have this powerful an impact, is there a broader lesson to learn?

Considering Smaller Providers

Mark Boost, CEO of cloud service provider Civo, told The New Stack that outages like these suggest it’s never a good idea to put all your eggs in one basket: a single error at a single provider shouldn’t be able to take down a large swath of the internet. “Over-reliance on one or two providers is what companies should be looking at in de-risking themselves,” he said.

For a larger company with a global customer base, Boost said, it makes sense to count on a CDN to distribute content worldwide — but that may not be the case for many smaller companies, which would likely benefit from considering other options. “A lot of them probably aren’t using cached content in terms of images and things like that, and there are other solutions that can help with security,” he said. “There’s appliances you could buy, and there’s various smaller providers that you can use.”

And while many larger companies have free offerings that make it easy to get started with them, Boost said, it’s important to do so with an awareness of the risks. He pointed to the 2020 AWS outage that disabled iRobot vacuums as an example of the potential downsides of relying on a larger provider, which can face outages due both to human error and to cyber attacks. “I don’t know if people are really thinking about the security risks of targeted attacks against some of these huge companies,” he said.

With vast scale, Boost said, often comes vast complexity — which can lead both to issues on the provider’s end and to misconfiguration by the user. “If you think of the 150 services that Amazon offers, all with different options, there [are] lots of ways you can accidentally misconfigure things,” he said. “You may think you’ve got a high availability setup, but it turns out you haven’t configured it in the right way, and that could lead to a security risk.”

A Diversified Landscape

Still, there’s clear resistance to switching from larger providers to smaller ones: a recent Civo survey [PDF] of 1,000 developers found that 51% see smaller cloud providers as less secure, and 47% think they suffer more outages. “Sometimes people are unwilling to give them a chance, these small providers — but in reality, just because they’re small doesn’t mean they’re not very good at what they do,” Boost said.

All of that could, of course, be seen as a sales pitch for Boost’s own company, but he stressed that he’s more interested in persuading potential customers to broaden their range of options than simply selling them on his own offerings. “There [are] niche providers out there that might focus on certain areas, even more so than AWS — like a real high-security, lockdown, zero trust type environment,” he said. “There [are] lots of other people that are available.”

And the potential benefit of considering a range of providers is clear, whether it’s prompted by the kind of multi-CDN or multicloud strategy advised by Fastly’s Sharma, or by a desire to bring smaller providers into the mix as Boost suggests — the two aren’t mutually exclusive. “If we share some of that overall load that is currently with these hyperscalers with some of these small and medium-sized providers, it would mean the world is not locked into very few people — which is a dangerous place to be from a security perspective,” Boost said.

In that sense, Sharma’s and Boost’s suggestions, coming from a larger provider and a smaller one, are ultimately very similar: it makes sense to turn to multiple providers to increase resilience where appropriate; and regardless of the provider or providers you consider, transparency is key. “There’s not going to be a world where we’re not going to have any outages,” Boost said. “There’s always going to be something that goes wrong — it’s how you deal with it.”

Jeff Goldman is a Los Angeles-based technology journalist.