
URL: https://thenewstack.io/take-the-human-out-of-3-a-m-incident-responses/

Take The Human Out of 3 A.M. Incident Responses - The New Stack


2022-06-23 11:00:46
DevOps

Take The Human Out of 3 A.M. Incident Responses

PagerDuty updated its Operations Cloud with incident-workflows capabilities that help teams handle incident response quickly and repeatably.
Jun 23rd, 2022 11:00am by B. Cameron Gain

Responding to incidents and deciding how to take the right actions when they occur represent one of the major pain points in the DevOps world today.

Not only do many unlucky site reliability engineers and operations team members face getting that 3 a.m. emergency wake-up call or page, but significant amounts of resources, time and effort often continue to be wasted when proper processes and incident response protocols are not in place.

Improving and, especially, automating event responses was one of the major themes of the recent PagerDuty Summit 2022 user conference. The conference also served as a springboard for announcing and discussing a major expansion of PagerDuty Operations Cloud capabilities.

Incident Workflows and Automation Actions

These updates help to ensure that modern work realities and the systems that teams use are aligned, Sean Scott, chief product officer for PagerDuty, said. Operations Cloud’s new incident-workflows capabilities help to solve the challenge of how to handle incident response quickly and repeatably, he said.

“Best practices and team learnings have already surfaced regarding what steps responders should follow in various cases, such as creating an incident-specific Slack channel or sending a stakeholder update. But how do teams ensure that those steps are followed in the heat of a high-severity incident?” Scott told The New Stack. “Some teams might have a wiki page with a checklist, but remembering to bookmark and then check the wiki becomes a point of failure in the process, especially when onboarding new team members. With Incident Workflows, teams can define the steps they want to be followed in different cases and what should trigger them.”

Oftentimes, an incident is classified by priority or urgency. The process should leave no ambiguity about the action that needs to be taken. With Incident Workflows in place, a checklist of steps is followed automatically when such events occur, “the same way every time,” Scott said.
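The idea of a trigger that runs the same checklist every time can be sketched in a few lines of Python. This is a minimal illustration only, not PagerDuty's API: the `Incident` class, the `WORKFLOWS` table and the step functions are all hypothetical names, and the steps merely record what a real integration (such as creating a Slack channel) would do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical model for illustration only; not PagerDuty's API.
@dataclass
class Incident:
    title: str
    priority: str  # e.g. "P1", "P2"
    log: List[str] = field(default_factory=list)


Step = Callable[[Incident], None]


def create_chat_channel(incident: Incident) -> None:
    # Stand-in for creating an incident-specific chat channel.
    incident.log.append(f"channel created for '{incident.title}'")


def send_stakeholder_update(incident: Incident) -> None:
    # Stand-in for sending a stakeholder update.
    incident.log.append(f"stakeholders notified about '{incident.title}'")


# Each trigger (here, a priority) maps to an ordered checklist of steps.
WORKFLOWS: Dict[str, List[Step]] = {
    "P1": [create_chat_channel, send_stakeholder_update],
    "P2": [create_chat_channel],
}


def run_workflow(incident: Incident) -> List[str]:
    """Run every step registered for the incident's priority, in order."""
    for step in WORKFLOWS.get(incident.priority, []):
        step(incident)
    return incident.log


if __name__ == "__main__":
    print(run_workflow(Incident("checkout latency", "P1")))
```

Because the checklist lives in data rather than on a wiki page, a new team member does not need to remember where the steps are documented; an unknown priority simply runs no steps.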

The Automation Actions capability addresses the question of who activates automation and how, Scott said. For example, running a simple network or database test is often part of troubleshooting a problem, but the permissions and know-how needed to run such a test reside with a specialist team.

“During an incident or customer support interaction, escalating to a specialist team to run a routine test takes time and adds to the cost of the incident in terms of increased duration and more people working on the issue,” Scott said. “With Automation Actions, first responders and customer service agents can take action directly and run an automated diagnostic test, for example, to validate or troubleshoot an issue.”
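One way to picture this delegation is a registry of pre-approved diagnostics that non-specialists may run by name. The sketch below is an assumption-laden illustration, not PagerDuty's implementation: `ActionRegistry`, the role names and `check_database_connection` are hypothetical, and the diagnostic returns a fixed string instead of touching a real database.

```python
from typing import Callable, Dict, Iterable, Set


class ActionRegistry:
    """Hypothetical registry of pre-approved diagnostics (not PagerDuty's API)."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}
        self._allowed_roles: Dict[str, Set[str]] = {}

    def register(
        self, name: str, action: Callable[[], str], allowed_roles: Iterable[str]
    ) -> None:
        # The specialist team decides once which roles may run the action.
        self._actions[name] = action
        self._allowed_roles[name] = set(allowed_roles)

    def run(self, name: str, role: str) -> str:
        # Callers only need the action's name, not the specialist's know-how.
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        if role not in self._allowed_roles[name]:
            raise PermissionError(f"role '{role}' may not run '{name}'")
        return self._actions[name]()


def check_database_connection() -> str:
    # Stand-in for a real diagnostic written by the database team.
    return "database reachable"


if __name__ == "__main__":
    registry = ActionRegistry()
    registry.register(
        "db_check", check_database_connection, {"first_responder", "support_agent"}
    )
    print(registry.run("db_check", role="support_agent"))
```

The design choice here is that the specialist team packages the test and its permissions up front, so no escalation is needed at incident time.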

Event Orchestration

Event Orchestration is also a major component of the Operations Cloud release. “Teams can even define a logical flow that triggers a step, like running a test or restarting a node, without any human intervention,” Scott said. “This accelerates incident resolution and reduces the number of teams and cost that are involved.”

PagerDuty said Incident Workflows streamlines the chain reactions that occur during an incident and makes the response process faster and more consistent. Its benefits include:

  • Workflows that are easy to design and activate with no-code capabilities.
  • Automated sequences for common incident actions.
  • Customizable workflows that enable more consistent responses across the organization.

The benefits of using Automation Actions to orchestrate automated diagnostics and remediation steps include:

  • Immediate automated actions, triggered either by PagerDuty’s Event Intelligence or manually by responders, and automated diagnostics that investigate status, gain context or directly initiate runbook automation to remediate an incident.
  • Allowing customer service agents to validate customer issues by running automated actions directly from the PagerDuty application in the Salesforce Service Cloud, thus reducing resolution time and the number of incidents escalated to back-end teams.

PagerDuty's Frank Emery on @PagerDuty's platform's incident response automation: “50% of knowledge reduction that happens is done with suppression, where people use event orchestration to target flows where they know events aren't really adding value.” @thenewstack #PDSummit22. pic.twitter.com/SeZHBjfXpA

— BC Gain (@bcamerongain) June 10, 2022

As part of Event Orchestration, low-value incidents and events are automatically reduced or downgraded so that more important incidents and other tasks receive priority. Also known as “suppression,” this capability should help to alleviate a major pain point that operations teams have struggled with.

“This is what event rules are built to do originally, and what Event Orchestration really cranks up to 11. And so I would say maybe out of 50% of all knowledge reduction that happens is actually done with suppression, where people are using Event Orchestration to target different sorts of flows where they know events aren’t really adding value,” Frank Emery, senior product manager for PagerDuty, said during a summit talk. “What’s interesting here is what you can do with the basic tier vendor orchestration … You can start to really weed out some of those noisy events that are not adding value for your team.”
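Suppression of this kind amounts to testing each incoming event against a set of rules and setting aside the ones that match. The following Python sketch shows the general pattern under stated assumptions: the `Event` fields, the two example rules and the `route` function are hypothetical and are not taken from PagerDuty's Event Orchestration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# Hypothetical event shape for illustration only; not PagerDuty's schema.
@dataclass
class Event:
    source: str
    severity: str
    summary: str


Rule = Callable[[Event], bool]

# Example rules (assumptions): informational events and anything from
# a staging environment are treated as noise.
SUPPRESSION_RULES: List[Rule] = [
    lambda e: e.severity == "info",
    lambda e: e.source == "staging",
]


def route(events: List[Event]) -> Tuple[List[Event], List[Event]]:
    """Split events into (actionable, suppressed) using the rules above."""
    actionable: List[Event] = []
    suppressed: List[Event] = []
    for event in events:
        if any(rule(event) for rule in SUPPRESSION_RULES):
            suppressed.append(event)  # kept for review, but pages no one
        else:
            actionable.append(event)
    return actionable, suppressed


if __name__ == "__main__":
    incoming = [
        Event("prod", "critical", "database unreachable"),
        Event("prod", "info", "nightly backup finished"),
        Event("staging", "critical", "test node restarted"),
    ]
    actionable, suppressed = route(incoming)
    print(len(actionable), "actionable;", len(suppressed), "suppressed")
```

Suppressed events are retained in their own list rather than discarded, so a team could still audit what the rules filtered out.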

BC Gain is founder and principal analyst for ReveCom Media. His obsession with computers began when he hacked a Space Invaders console to play all day for 25 cents at the local video arcade in the early 1980s. He then...