URL: https://thenewstack.io/tech-debt-incidents-and-on-call/

Tech Debt, Incidents and On Call - The New Stack



Tech Debt, Incidents and On Call

On-call rotation could be used for tech debt work. It's not a good time for developers to be working on features, but a great time to dig into incidents.
Jun 29th, 2022 6:28am by Dormain Drewitz
Feature image via Pixabay.
PagerDuty sponsored this post.

I was recently chatting with a cloud ops and platform team leader who was navigating how to manage incident response. Like many organizations, they were trying to adopt a “build and run” approach. This is sometimes called “full-service ownership.” Whatever the term, this approach refers to software development teams taking responsibility to make sure the code they write also runs well in production.

Dormain Drewitz
Dormain is vice president of product marketing and developer relations at PagerDuty. Prior to PagerDuty, she led product marketing and content strategy for VMware Tanzu and held similar roles at Pivotal and Riverbed Technology. She also spent over five years as a technology investment analyst, closely following enterprise infrastructure software companies and industry trends. Dormain holds a bachelor’s degree in history from the University of California at Los Angeles.

Naturally, I asked if the software development teams were taking on-call rotations to support their code in production. After a deep sigh and an “it’s complicated” type of response, he said something really insightful: Yes, software development teams are often in the escalation path for an incident, but they had moved toward building a site reliability engineering team to take primary on-call duties.

Why? Why not live the values of “build and run” to their fullest? His answer reflected a divide I hadn’t heard articulated before, but it made perfect sense. Although developers know the code best, they aren’t as helpful in an incident. While ops teams want a service restored as quickly as possible, development teams want to root out the underlying issue.

On the surface, these sound similar. But imagine you’re at the checkout line at a grocery store and your credit card is declined. You’ve been living paycheck to paycheck so tightly that your last credit card payment bounced. All you need to restore that credit card and check out is to pay the minimum balance. Call it $25. But the real root issue is that you’re buried in debt, carrying a balance and falling behind. Paying off the balance costs $25,000. It not only frees up your credit card, but eliminates a source of high-interest tech debt on your personal balance sheet.

Coming up with the $25 is relatively easy. For the ops team, it may mean something like restarting a system. A service is quickly brought back online, and the incident is “resolved.” The underlying source, however, still lingers. It looms as a future incident waiting to happen again. But coming up with $25,000 is a longer, more complicated endeavor. And there’s food to get on the table once you finish checking out from the grocery store.

Who’s right?

Both perspectives have a point. But the right thing to do is sort out the $25 problem first, then quickly figure out the $25,000 problem. After all, wouldn’t you find the quicker fix to get through the grocery line and get dinner on the table first? But the challenge is that we rarely find the time to come back to the $25,000 problem. So we face the same problem a week later on our grocery run. It’s exhausting and demoralizing.

How Do We Make Time to Unwind Tech Debt?

First, to define tech debt, I prefer to consider all code as technical debt. Why? Because all code will require servicing or maintenance at some point. At a minimum, security updates for libraries with vulnerabilities are bound to come up. Just look at the Log4j vulnerability of late 2021. It required widespread, urgent maintenance work across many organizations.

Rather than debate whether code is debt or not, the better question is how easily you can service that code. If it’s easy to change and update, you’re in a much better position to unwind that $25,000 problem shortly after an incident. But that still doesn’t answer the question of when you do that work.

The answer to that might, ironically, come back to involving developers in that “build and run” responsibility. In her talk at PagerDuty Summit 2022, Charity Majors suggested using on-call rotation time for tech debt work. It’s already not a good time for developers to be working on feature work. They could be interrupted at any moment by an incident that needs their attention. But it’s a great time to dig into what’s been causing incidents.

Using on-call time to “pay down” tech debt serves a couple of purposes. First, it should reduce recurring incidents that have been resolved without addressing the root cause. Second, as the title of Majors’s talk, “On Call Doesn’t Have to Suck,” implies, it makes on-call rotations more attractive. As that cloud ops and platform team leader observed, developers want to address the root cause. Ultimately, no one wants to be woken up by an issue, especially one they already know exists. That seems like pain that could have been avoided.
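
One way to decide what to dig into during an on-call shift is to look for incidents that keep coming back after a quick fix. The Python sketch below is only an illustration of that idea, not anything described in the talk: the `Incident` fields, the `root_cause_fixed` flag and the sample history are all hypothetical, and a real incident export would look different.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # All field names are hypothetical; real incident exports will differ.
    service: str
    symptom: str            # e.g. "out of memory"
    root_cause_fixed: bool  # False when only the quick fix was applied


def recurring_candidates(incidents, min_count=2):
    """Rank (service, symptom) pairs that recur without a root-cause fix."""
    counts = Counter(
        (i.service, i.symptom) for i in incidents if not i.root_cause_fixed
    )
    # most_common() sorts by count, highest first.
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


# Made-up history: the "$25 fix" was applied to checkout three times.
history = [
    Incident("checkout", "out of memory", False),
    Incident("checkout", "out of memory", False),
    Incident("search", "slow queries", True),
    Incident("checkout", "out of memory", False),
    Incident("auth", "expired certificate", False),
]

for (service, symptom), n in recurring_candidates(history):
    print(f"{service}: {symptom} ({n} unresolved incidents)")
```

With this sample data, the only candidate is the checkout service's repeated out-of-memory incident, which is the kind of $25,000 problem an on-call developer could pick up between pages.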

Check out the talk by Charity Majors, co-founder and CTO of Honeycomb, for more insights on using on-call rotations to focus on tech debt and other tips to improve the on-call experience.

PagerDuty is the global leader in AI-first operations management serving more than 35,000 organizations worldwide. The PagerDuty Operations Cloud is a comprehensive, multi-product operations cloud platform that sits at the center of the enterprise technology stack.
Dormain Drewitz is vice president of marketing at ZEDEDA, the leader in edge computing platforms. Before joining ZEDEDA, Dormain was vice president of product marketing and developer relations at PagerDuty and led product marketing and content strategy for VMware Tanzu....
TNS owner Insight Partners is an investor in: Pragma, Honeycomb.