URL: https://thenewstack.io/move-away-from-manual-with-automated-incident-response/

Move Away from Manual with Automated Incident Response - The New Stack



Move Away from Manual with Automated Incident Response

The journey to automated incident response is one that can’t be completed overnight. Teams can do this with a crawl-walk-run approach.
Apr 12th, 2022 6:39am by Hannah Culver
Featured image via Pixabay
PagerDuty sponsored this post.

As companies ramp up their digitization efforts, a growing number of incidents is putting extra pressure on teams, which in turn puts team health at risk through burnout and attrition. In fact, we saw a 19% growth in critical incidents from 2019 to 2020 on our platform, and initial 2021 data shows that this number has only gone up over the past year.

Hannah Culver
Hannah is a solutions marketer at PagerDuty interested in how real-time, urgent work plays out across all industries in this digital era.

In addition to the increase in incidents, the timing of these incidents resulted in disruptions to both personal time and focused work time. According to platform data, teams experienced:

  • 9% more off-hour interruptions.
  • 7% more holiday/weekend interruptions.
  • 5% more business-hour interruptions.

This additional strain is even more apparent when you look at our users’ working hours. Based on our data, we saw that users worked an additional two hours per day in 2020, which adds up to nearly 12 extra weeks per year! This is unsustainable and bad for team and responder health both long term and short term.
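The article does not say how the 12-week figure was derived. One way to arrive at a similar number is sketched below. The working-days, working-weeks and hours-per-week values are assumptions chosen for illustration, not PagerDuty's methodology.

```python
# Only the two extra hours per day comes from the article.
# Every other constant here is an illustrative assumption.
EXTRA_HOURS_PER_DAY = 2       # from the article
WORKING_DAYS_PER_WEEK = 5     # assumption
WORKING_WEEKS_PER_YEAR = 47   # assumption: 52 weeks minus holidays and leave
HOURS_PER_WORK_WEEK = 40      # assumption: a standard work week


def extra_work_weeks(extra_hours_per_day: float = EXTRA_HOURS_PER_DAY) -> float:
    """Convert extra daily hours into equivalent standard work weeks per year."""
    extra_hours_per_year = (
        extra_hours_per_day * WORKING_DAYS_PER_WEEK * WORKING_WEEKS_PER_YEAR
    )
    return extra_hours_per_year / HOURS_PER_WORK_WEEK


print(extra_work_weeks())  # 11.75
```

Under these assumptions, two extra hours a day comes to 470 hours a year, or 11.75 standard work weeks, which is in line with "nearly 12 extra weeks."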

And there’s a toll here that extends beyond the individual. This extra work only increased attrition in an era commonly referred to as the Great Resignation. As teams lose people and look to hire for these open positions, their workloads only grow. Something must change, and IT leaders are looking to a few strategic initiatives to help.

We asked 700 IT decision-makers about their priorities for 2021 and 73% of technical leaders reported that they’re investing in AIOps and automation to help boost their operational processes and remove the burden of manual work. This is because nine out of 10 respondents said that traditional ITOps approaches are no longer able to keep up with today’s pace and complexity.

To relieve some of the pressure and adapt to this new era, 72% of leaders are ramping up digital transformation efforts and 74% are using DevOps to drive better alignment. The overall purpose of these investments is better, faster incident response with less toil for responders. Another way to think of this is that these aspirations set the stage for the category of automated incident response (AIR).

What Is Automated Incident Response?

Gartner introduced the term automated incident response last year as an evolution of the long-standing definition of incident response. This change reflects the growing need for teams to adopt automation and stems from the increasing complexity in our technology environments.

According to Gartner, “Automated incident response (AIR) solutions automate incident response processes by enabling centralized alert or incident routing. Using a policy or rule-based engine, on-call scheduler, or streamlined collaboration, this can improve operational efficiencies with action-oriented insights.”
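To make the definition more concrete, here is a minimal sketch of a rule-based engine that routes an alert to whoever is on call. It is an illustration only and does not represent PagerDuty's API or any vendor's implementation. The `Alert`, `Rule`, `ON_CALL` and `route` names, the team names and the responders are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    service: str
    severity: str  # for example "critical" or "warning"
    summary: str


@dataclass
class Rule:
    matches: Callable[[Alert], bool]  # predicate deciding whether the rule applies
    team: str                         # team that owns alerts matching the rule


# Hypothetical on-call schedule: team name -> responder currently on call.
ON_CALL: Dict[str, str] = {
    "payments": "alice",
    "platform": "bob",
    "triage": "carol",
}

# Hypothetical routing rules, evaluated in order. The first match wins.
RULES: List[Rule] = [
    Rule(lambda a: a.service == "checkout", "payments"),
    Rule(lambda a: a.severity == "critical", "platform"),
]

DEFAULT_TEAM = "triage"  # fallback when no rule matches


def route(alert: Alert) -> str:
    """Return the responder who should be notified about this alert."""
    for rule in RULES:
        if rule.matches(alert):
            return ON_CALL[rule.team]
    return ON_CALL[DEFAULT_TEAM]


print(route(Alert("checkout", "warning", "Elevated error rate")))  # alice
print(route(Alert("search", "critical", "Cluster unreachable")))   # bob
print(route(Alert("search", "warning", "Slow queries")))           # carol
```

Keeping the rules and the schedule as plain data means routing can change without touching the routing logic itself.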

At PagerDuty, we’ve been thinking about how automation plays a role in incident response for a long time. In fact, our standard solution has met this definition since 2017. Our original on-call management capabilities provided scheduling. Our IT service alerting and escalation policies enabled centralized alert routing. And our stakeholder notifications and response plays streamlined collaboration across incident response teams. That left us asking the question, “What’s next?”

Expanding the Value of Automation Across More of the Incident Response Life Cycle

Adding automation into the incident response process should be done strategically at points where humans are taking on undue burden. We’ve found it most helpful to visualize where automation could ease this burden by asking the types of questions shown in the graphic below.

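[Image: questions for identifying where automation can ease the burden across the incident response life cycle]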

Some of these needs must be addressed by humans. After all, retrospectives can’t be completed with machines only; you need humans there to actually do the learning. But other parts of this process don’t require humans as the first line of defense.

It’s apparent from our research that the traditional manual way of incident response is no longer enough to satisfy customers and is too toilsome and exhausting for responders to maintain. Humans are burning out addressing issues and completing tasks that machines could be resolving without intervention. The gap between the manual and automated processes is becoming more painful.

The manual way involves interrupting humans from whatever they’re doing, whether that’s walking the dog, sleeping, or focusing on the next key project, and asking them to find the root cause all by themselves. But the right humans at the right time is no longer enough. Teams need to lean on automation to manage this increased pace and complexity.

The new automated way is about preventing humans from being the first line of defense. It introduces ways to leverage machines to shoulder some of the burden and help humans balance critical workloads. And it works in real time or on demand to address a multitude of use cases based on what each team is ready for. Sounds great, so how can teams actually get started?

How Do I Get from Manual to Automated?

The journey to automated incident response is one that can’t be completed overnight. When it comes to automation, it’s important to focus on reducing operational loads to get more done while at the same time increasing organizational speed and innovation. Teams can do this with a crawl-walk-run approach. The key is starting where your organization is today and having a plan for continued maturity.

At PagerDuty, we mark operational maturity across five stages ranging from manual to preventative. One of the most important parts of the journey is understanding where your team and organization are on this model currently. Then you can start by picking a specific area to focus on.

[Image: PagerDuty’s five stages of operational maturity, from manual to preventative]

For organizations in the manual and reactive stages, you can identify and enable those in your organization with an affinity for automation. Leaning into automation can feel daunting, so encourage people to use the skills and languages they already have to keep it feeling familiar.

Teams in the early stages of operational maturity should favor action and focus on turning manual documented steps into automated steps. Once you’ve done this, you’ll have pockets of automation across your organization that make your subject matter experts more effective.
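As a rough illustration of turning documented steps into automated ones, the sketch below expresses a runbook as an ordered list of functions and runs them in sequence. The step names and the `run_runbook` helper are hypothetical. In practice each function would wrap a real command or API call rather than simply returning `True`.

```python
from typing import Callable, List, Tuple

# A step pairs the wording from the written runbook with the function automating it.
Step = Tuple[str, Callable[[], bool]]


def run_runbook(steps: List[Step]) -> List[str]:
    """Run steps in order, stopping at the first failure. Returns a readable log."""
    log: List[str] = []
    for name, action in steps:
        ok = action()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break  # leave the remaining steps for a human to review
    return log


# Hypothetical stand-ins for steps that used to be performed by hand.
def check_disk_space() -> bool:
    return True


def rotate_logs() -> bool:
    return True


def restart_service() -> bool:
    return True


RUNBOOK: List[Step] = [
    ("Check disk space", check_disk_space),
    ("Rotate logs", rotate_logs),
    ("Restart service", restart_service),
]

for line in run_runbook(RUNBOOK):
    print(line)
```

Because each step keeps the name it has in the written runbook, the log reads like the checklist responders already know.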

When teams reach the responsive stage, the objective becomes to standardize the incident response process and enable self-service. Standardization helps you build automation that you can reuse across teams and services. Self-service is how you leverage automation for greater value by enabling others to do what previously only your subject matter experts could.

Standardization and self-service distribute the operational load, provide more effective use of resources and enable SMEs to get out from under toil and focus on what moves the business forward. Incidents will be resolved much more quickly because first responders have the tools they need.

PagerDuty is the global leader in AI-first operations management serving more than 35,000 organizations worldwide. The PagerDuty Operations Cloud is a comprehensive, multi-product operations cloud platform that sits at the center of the enterprise technology stack.

In the proactive stage, automation is optimized for real-time work. This means running automation in response to incidents, creating auto-remediation capabilities and removing more of the real-time burden on the teams responding to critical work.
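A minimal sketch of that idea follows: try a known automated fix first, and page a human only when no fix exists or the fix fails. This is an assumption-laden illustration, not PagerDuty's implementation. The `handle_incident` function, the incident type names and the `page_human` callback are all hypothetical.

```python
from typing import Callable, Dict, List


def handle_incident(
    incident_type: str,
    remediations: Dict[str, Callable[[], bool]],
    page_human: Callable[[str], None],
) -> str:
    """Attempt auto-remediation first and escalate to a human only if needed."""
    action = remediations.get(incident_type)
    if action is not None:
        try:
            if action():
                return "auto-resolved"
        except Exception:
            pass  # a crashed remediation is treated the same as a failed one
    page_human(incident_type)
    return "escalated"


# Hypothetical remediation registry and pager, for illustration only.
paged: List[str] = []


def clear_cache() -> bool:
    return True  # pretend the fix worked


remediations = {"cache_full": clear_cache}

print(handle_incident("cache_full", remediations, paged.append))    # auto-resolved
print(handle_incident("unknown_error", remediations, paged.append)) # escalated
print(paged)  # ['unknown_error']
```

The human remains the fallback rather than the first line of defense, which matches the shift the article describes.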

People capacity is an organization’s most precious resource. The best way to protect team capacity is to resolve as much as you can without human intervention. It’s not about replacing humans; it’s about augmenting your humans with automation that keeps repetitive or noisy tasks away from them so they can focus on innovating.

When teams can effectively link automation with their incident response processes, they benefit both in terms of fewer total incidents and shorter response times. This ultimately means less firefighting for teams, less burnout, less attrition and more time spent innovating. Doesn’t this truly sound like a breath of fresh AIR?

To see what automated incident response with PagerDuty looks like in action, check out this webinar. Or, if you want to try it out yourself, sign up for our 14-day free trial.

Hannah Culver is a senior product marketing manager at PagerDuty with more than five years in the incident management space.
TNS owner Insight Partners is an investor in: Pragma.