
URL: https://thenewstack.io/googles-formula-for-elite-devops-performance/

Google's Formula for Elite DevOps Performance - The New Stack



Google’s Formula for Elite DevOps Performance

Jan 1st, 2020 6:00am by Jennifer Riggins

Every organization wants to be successful, but who decides which ones are? For Google, there’s a clear definition of how to measure the success of a DevOps team. At CloudNative London last year, Jez Humble, Google Cloud Platform advocate and co-author of “Continuous Delivery,” explained Google’s four key metrics, commonly referred to as the DevOps Research and Assessment (DORA) metrics, and how to become one of the few, proud elite teams.

Let’s start by clarifying how Google defines DevOps, namely as “an organizational and cultural movement that aims to increase software delivery velocity, improve service reliability, and build shared ownership among software stakeholders.” This definition further goes into how DevOps teams should work to “improve the speed, stability, availability, and security of your software delivery capability.”

Humble says a lot of teams think they can just install Kubernetes and start deploying apps, but few organizations have the technical and managerial capabilities that drive real success. So what are those few top performers doing right?

The 4 DORA Metrics of a Successful DevOps Team


For Google, it comes down to three teams (software development, software deployment, and service operations) who care about four metrics, plus a fifth, availability, that can’t be measured directly but nonetheless can’t be compromised during this process. The four key DORA metrics are:

  • Lead time for changes
  • Deployment frequency
  • Time to restore service
  • Change failure rate

Successful DevOps teams understand the shared ownership of these objectives.

Humble further defines DevOps high performers as those that do better at throughput, stability and availability. These elite performers:

  • Release many times per day.
  • Take less than a day to move a change into production (lead time for changes).
  • Restore service in less than an hour (low performers take between a week and a month).
  • Keep the change failure rate between zero and 15%.
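As a rough illustration of how a team might compute these four metrics for itself, here is a minimal Python sketch. The `Deployment` record, the function names, and the dictionary keys are hypothetical and not part of any DORA tooling. The thresholds follow the elite figures quoted above, with two assumptions: “many times per day” is read as more than one deployment per day on average, and a window with no failures passes the restore check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the change degrade service?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list, window_days: int) -> dict:
    """Summarize the four key metrics over an observation window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    restores = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "lead_time_hours": median(
            hours(d.deployed_at - d.committed_at) for d in deployments
        ),
        "restore_hours": median(restores) if restores else None,
        "change_failure_rate": len(failures) / len(deployments),
    }


def is_elite(metrics: dict) -> bool:
    """Apply the elite thresholds quoted in the article (see assumptions above)."""
    restore = metrics["restore_hours"]
    return (
        metrics["deploys_per_day"] > 1        # assumption: "many times per day"
        and metrics["lead_time_hours"] < 24   # lead time under a day
        and (restore is None or restore < 1)  # restore in under an hour
        and metrics["change_failure_rate"] <= 0.15
    )
```

Medians are used here rather than means so that a single unusually slow change does not dominate the result; that is a design choice of this sketch, not something the talk prescribes.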

These elite performers reach corporate goals because they do well at the following metrics:

  • Profitability
  • Productivity
  • Market share
  • Number of customers
  • Quality of products or services
  • Operating efficiency
  • Customer satisfaction
  • Quantity of products or services provided
  • Achieving organizational and mission goals

All of these feed into the belief that software delivery and operations (SDO) performance predicts overall organizational performance. Humble also says there is a further correlation between your SDO performance and your cultural performance. “They predict culture. The extent to where your culture is mission-oriented, not pathological, controlling-oriented,” Humble said.

All these elite performers share in fostering a climate for learning, highly participatory retrospectives, and the encouragement of trust, voice, and autonomy.

But it’s also not just about the people. Tech is just as important as culture in DevOps.

Can You Have DevOps Without Cloud Computing?

The Google State of DevOps 2019 report found that 80% of its respondents were primarily hosting on some sort of cloud platform. Humble explained that Google applied the National Institute of Standards and Technology’s (NIST) definition of cloud computing to SDO performance. This definition outlines the five essential characteristics of cloud computing:

  1. On-demand self-service: provisioning computing resources without human interaction from a cloud provider.
  2. Broad network access: heterogeneous access through phones, laptops, and tablets, not just work stations.
  3. Resource pooling: multi-tenant, can be abstracted through country, state or data center.
  4. Rapid elasticity: capabilities can scale up and down easily.
  5. Measured service: cloud systems automatically control, optimize and report key resource usage.

Only 29% of respondents to Google’s survey met all five requirements. Unsurprisingly, these lined up with DevOps performance. In fact, elite performers were 24 times more likely than low performers to have met all of these essential cloud characteristics.

Last year’s report found, and this year’s confirmed, that it didn’t matter whether teams were working on a public, private or hybrid cloud: a team focusing on cloud-based execution should see success in terms of speed, stability, and availability.

Humble said enterprises are typically running hundreds of thousands of services, made up of heterogeneous tech, but there are many other companies where more than 70% of the IT budget is “keeping lights on and adding capacity.” Then when they have to support CI/CD, they need to buy unsupported hardware on eBay or they are “running something mission-critical that no one has the code to anymore.”

Are You Fostering an Elite Culture?

Elite teams have a clear understanding of who does what and automate as much as possible.

Humble said that the hardest of the four key DevOps metrics to measure is lead time. This looks to answer questions like: How long would it take your organization to deploy a change that involves just one single line of code? Can you do this on a repeatable, reliable basis?

He went on to highlight several areas that dramatically affect lead time, and that the strongest software development and operations teams all seem to have answers for:

  1. Garbage collection: Who should I be billing for this virtual load balancer or database instance? What would happen if I deleted this service? Are people still using it? The platform should ensure that every virtual resource is assigned to either an app or the platform itself.
  2. Making changes: If an app has a vulnerability, how do I fix and deploy it? If I need to update this dependent service, where is the source code? Humble says it should be possible to redeploy any app at the click of a button.
  3. Multitenancy: How do we enable developers with self-service deployments or configuration? Humble admitted that making AWS and Kubernetes multitenant is difficult but it’s an essential requirement of any enterprise platform-as-a-service (PaaS).
  4. Managing complexity: Making sure the stack is up-to-date. How will we hire people in 15 years that know how to work this? Humble says to limit options. For example, all apps must be built on predefined approved runtime stacks that PaaS operators can patch and redeploy on demand.
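The garbage collection point lends itself to a small example. The Python sketch below assumes a hypothetical inventory in which every virtual resource carries an `owner` field; the `Resource` type, the `PLATFORM` constant, and both function names are illustrative only. It flags resources that are not assigned to a known app or to the platform itself, and groups the rest so each owner can be billed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

PLATFORM = "platform"  # hypothetical owner name for platform-level resources


@dataclass(frozen=True)
class Resource:
    """Hypothetical inventory entry for one virtual resource."""
    name: str
    kind: str              # e.g. "load_balancer", "database"
    owner: Optional[str]   # an app name, PLATFORM, or None if unclaimed


def find_orphans(resources, known_apps):
    """Return resources not assigned to a known app or to the platform."""
    valid_owners = set(known_apps) | {PLATFORM}
    return [r for r in resources if r.owner not in valid_owners]


def group_by_owner(resources):
    """Map each owner to the names of its resources, for billing or cleanup."""
    grouped = defaultdict(list)
    for r in resources:
        grouped[r.owner or "unassigned"].append(r.name)
    return dict(grouped)
```

Anything `find_orphans` returns is a candidate for the questions above: who is paying for it, and would anyone notice if it were deleted?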

And probably most importantly, when there’s a vulnerability in your stack, how long will it take you to patch, build and redeploy all of your impacted applications? Humble illustrated the need for this kind of rapid patching with Equifax’s headline-grabbing breach, which stemmed from a flaw in the Apache Struts framework.
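Answering that question starts with knowing which applications are affected. The following Python sketch assumes a hypothetical dependency inventory (a mapping from app name to its pinned package versions) and plain dotted numeric version strings; the package names and versions in the usage example are made up for illustration.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as "2.5.13" into (2, 5, 13)."""
    return tuple(int(part) for part in version.split("."))


def impacted_apps(inventory: dict, package: str, fixed_version: str) -> list:
    """List apps that pin `package` at a version older than `fixed_version`.

    `inventory` maps app name -> {package name: pinned version}.
    """
    fixed = parse_version(fixed_version)
    return sorted(
        app
        for app, deps in inventory.items()
        if package in deps and parse_version(deps[package]) < fixed
    )


# Hypothetical inventory; names and versions are invented for this example.
inventory = {
    "billing": {"example-web-framework": "2.3.5"},
    "search": {"example-web-framework": "2.5.13"},
    "reports": {"example-pdf-lib": "1.0.0"},
}
to_redeploy = impacted_apps(inventory, "example-web-framework", "2.5.0")
```

Real version schemes often include pre-release tags or non-numeric parts that this simple parser does not handle, so treat it as a starting point. On a platform where app stacks are centrally managed, the resulting list is what the operators would patch and redeploy.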

The Architecture Behind Elite DevOps Teams

During his keynote, Humble outlined the architectural outcomes that allow teams the flexibility needed for high-performing DevOps without security risks. It’s not surprising that Conway’s Law rears its head here.

Ask yourselves these questions:

  1. Can my team make large-scale changes to the design of its systems without the permission of somebody outside the team or depending on other teams?
  2. Can my team complete its work without needing fine-grained communication and coordination with people outside the team?
  3. Can my team deploy and release its product or service on demand, independently of other services the product or service depends on?
  4. Can my team do most of its testing on demand, without requiring an integrated test environment?
  5. Can my team perform deployments during normal business hours with negligible downtime?

Humble suggests the need for a Platform as a Service. He says a self-service, multi-tenant PaaS minimizes the attack surface area.

For Humble, the key principle of a PaaS is a separation of responsibilities:

  1. The platform team is responsible for the PaaS.
  2. Make the application part as small as possible.
  3. Leverage a self-service API for deployment.

He calls this a “Function-as-a-Service model,” where ideally app stacks are part of the platform too so you can patch them easily. This all helps maintain compliance.

As you transition to a PaaS, Humble suggests you start by thinking about the outcomes you want and the contracts that exist across the entire software development lifecycle. A PaaS will help you achieve short lead times and short times to restore.

He suggests a JAMstack — JavaScript, APIs and Markup — for your website and apps. Then anything involving business logic or state changes happens through functions that your API talks to. All on top of your PaaS.

Humble said the advantages of a PaaS include:

  • Extremely cheap and highly scalable
  • Minimizes attack surface area
  • A separation of concerns between content and functionality
  • Super easy to configure and deploy
  • Decouples presentation and services
  • All around easier to develop/test

The Psychological Safety Behind High-Performing Teams

Finally, Humble talked about how teams, not individuals, achieve outcomes, which is why team and organizational culture are what make or break these elite DevOps performers.

So, what’s the secret to building high-performing teams and enabling them to deliver with speed and stability? Psychological safety.

Organizations that foster psychological safety — where team members feel safe to take risks and be vulnerable around each other — have a greater capacity for dependability, structure, clarity, meaning and impact on the organization.

All of the above combines to strongly affect both culture and SDO performance, which in turn help organizational performance. It also helps reduce burnout, deployment pain, and rework.

Author’s Note, November 22, 2023: For years, I have been driving part of the focus on defining DORA metrics as solely those famous four – or sometimes five if you include reliability – measurements. Yes, deployment frequency, lead time for changes, change failure rate, and failed delivery recovery time (previously called mean time to recovery, or MTTR) are important, especially when benchmarking yourself against your own team’s or company’s previous measurements, but they aren’t nearly the whole perspective of the sociotechnical power of DORA. In a more recent conversation with some of the creators of the State of DevOps Report 2023, I highlight several of the other 45 or so DORA questions in my piece Google Says You Might Be Doing DORA Metrics Wrong.

Jennifer Riggins is a tech storyteller and journalist, event and panel host. She bridges the gap between business, culture and technology, with her work grounded in the developer experience. She has been a working writer since 2003, and is based...