URL: https://dev.to/yash_step2dev/the-devops-metrics-that-actually-matter-and-how-to-track-them-3123

The DevOps metrics that actually matter (and how to track them)

The 4 DORA metrics

| Metric                | Elite        | Low          |
|-----------------------|--------------|--------------|
| Deployment Frequency  | Multiple/day | < once/month |
| Lead Time for Changes | < 1 hour     | > 1 month    |
| Change Failure Rate   | 0-5%         | 15-30%       |
| MTTR                  | < 1 hour     | > 1 week     |
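
The snippets in this post cover deployment frequency, lead time, and MTTR, but not change failure rate. As a minimal sketch, assuming each deployment record carries a flag saying whether it led to a failure (the `Deployment` class and its `caused_failure` field are hypothetical and not taken from any tool), the rate and its band could be computed roughly like this:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Deployment:
    service: str
    caused_failure: bool  # hypothetical flag: rollback, hotfix, or incident


def change_failure_rate(deployments: Sequence[Deployment]) -> float:
    """Return failed deployments as a percentage of all deployments."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failed = sum(1 for d in deployments if d.caused_failure)
    return 100.0 * failed / len(deployments)


def cfr_band(rate_percent: float) -> str:
    """Map a rate onto the Elite and Low bands listed above."""
    if rate_percent <= 5.0:
        return "elite"
    if rate_percent >= 15.0:
        return "low"
    return "between"  # the intermediate bands are not named in this post
```

What counts as a "failure" is a team decision; the important part is to apply the same definition to every deployment so the trend is comparable over time.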

Track deployment frequency in GitHub Actions

- name: Record deployment metric
  if: success()
  run: |
    # Emit one data point per successful production deployment
    aws cloudwatch put-metric-data \
      --namespace "DevOps/Deployments" --metric-name "DeploymentCount" \
      --value 1 --unit Count \
      --dimensions "Service=${{ env.SERVICE_NAME }},Environment=production"
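
Each run of that step records a single deployment; the frequency itself comes from aggregating those data points over time. The arithmetic can be sketched in plain Python, assuming the deployment timestamps have already been exported into a list (the function names here are illustrative, not part of any AWS API):

```python
from collections import Counter
from datetime import date, datetime
from typing import Dict, Sequence


def deployments_by_day(deploy_times: Sequence[datetime]) -> Dict[date, int]:
    """Count deployments per calendar day."""
    return dict(Counter(t.date() for t in deploy_times))


def average_per_day(deploy_times: Sequence[datetime], window_days: int) -> float:
    """Average deployments per day over a window, counting idle days as zero."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return len(deploy_times) / window_days
```

Dividing by the window length rather than by the number of active days keeps quiet days in the denominator, which is what separates "multiple per day" from "a burst once a month".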

Track lead time (commit → production)

- name: Record lead time
  if: success()
  run: |
    # Committer timestamp (Unix epoch seconds) of the deployed commit
    COMMIT_TIME=$(git show -s --format=%ct ${{ github.sha }})
    LEAD_TIME=$(($(date +%s) - COMMIT_TIME))
    aws cloudwatch put-metric-data \
      --namespace "DevOps/Deployments" --metric-name "LeadTimeSeconds" \
      --value "$LEAD_TIME" --unit Seconds \
      --dimensions "Service=${{ env.SERVICE_NAME }}"
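
The step above measures lead time from the deployed commit only. When a deployment ships several commits, each one has its own lead time. A minimal sketch of that calculation, assuming the commit timestamps are already available as Unix epoch seconds (the function names are hypothetical):

```python
from statistics import median
from typing import List, Sequence


def lead_times_seconds(commit_epochs: Sequence[int], deploy_epoch: int) -> List[int]:
    """Lead time of each commit: deployment time minus commit time."""
    for c in commit_epochs:
        if c > deploy_epoch:
            raise ValueError("commit timestamp is later than the deployment")
    return [deploy_epoch - c for c in commit_epochs]


def median_lead_time(commit_epochs: Sequence[int], deploy_epoch: int) -> float:
    """Median lead time, which is less sensitive to one very old commit."""
    return median(lead_times_seconds(commit_epochs, deploy_epoch))
```

Using the median here is a design choice for this sketch; a mean or a percentile would work with the same inputs.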

Track MTTR via alarm state changes (Lambda)

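from datetime import datetime

import boto3

ssm = boto3.client('ssm')
cw = boto3.client('cloudwatch')

def handler(event, context):
    # Invoked with a CloudWatch alarm state change event
    alarm = event['detail']['alarmName']
    state = event['detail']['state']['value']
    ts = datetime.fromisoformat(event['time'].replace('Z', '+00:00'))

    if state == 'ALARM':
        # Incident opened: store the time the alarm fired
        ssm.put_parameter(Name=f'/incidents/{alarm}/start',
                          Value=ts.isoformat(), Type='String', Overwrite=True)
    elif state == 'OK':
        try:
            start = ssm.get_parameter(Name=f'/incidents/{alarm}/start')
        except ssm.exceptions.ParameterNotFound:
            return  # OK without a recorded ALARM: nothing to measure
        # Incident resolved: recovery time is now minus the stored start
        mttr = (ts - datetime.fromisoformat(start['Parameter']['Value'])).total_seconds()
        cw.put_metric_data(Namespace='DevOps/Incidents',
                           MetricData=[{'MetricName': 'MTTR', 'Value': mttr, 'Unit': 'Seconds'}])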
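
Each value the handler publishes is the duration of one incident; MTTR is the mean of those durations. The arithmetic can be checked locally without AWS. The sketch below uses two hypothetical, trimmed events that contain only the fields the handler reads (`time`, `detail.alarmName`, `detail.state.value`); the alarm name and timestamps are made up for illustration:

```python
from datetime import datetime
from statistics import mean
from typing import Sequence


def event_time(event: dict) -> datetime:
    """Parse the event's ISO 8601 timestamp, treating 'Z' as UTC."""
    return datetime.fromisoformat(event["time"].replace("Z", "+00:00"))


def recovery_seconds(alarm_event: dict, ok_event: dict) -> float:
    """Seconds between an alarm firing and the same alarm returning to OK."""
    a, o = alarm_event["detail"], ok_event["detail"]
    if a["alarmName"] != o["alarmName"]:
        raise ValueError("events belong to different alarms")
    if a["state"]["value"] != "ALARM" or o["state"]["value"] != "OK":
        raise ValueError("expected an ALARM event followed by an OK event")
    return (event_time(ok_event) - event_time(alarm_event)).total_seconds()


def mttr_seconds(durations: Sequence[float]) -> float:
    """Mean time to recovery across a set of incident durations."""
    return mean(durations)


# Hypothetical sample events, reduced to the fields used above
fired = {"time": "2024-01-01T10:00:00Z",
         "detail": {"alarmName": "api-5xx", "state": {"value": "ALARM"}}}
cleared = {"time": "2024-01-01T10:42:30Z",
           "detail": {"alarmName": "api-5xx", "state": {"value": "OK"}}}
```

With these two events, `recovery_seconds(fired, cleared)` covers 42 minutes and 30 seconds.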

Leading indicators

  • Alert fatigue rate: a rising rate means alert signal quality is degrading
  • Deployment size: larger deployments fail more often
  • Test coverage trend: declining coverage points to more failures later
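
All three indicators are about direction rather than absolute value. One simple way to put a number on direction is a least-squares slope over evenly spaced samples. This is a sketch under that assumption; the function name and the weekly sampling are illustrative, not something the tools above provide:

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values sampled at evenly spaced intervals.

    A negative result means the series is declining per interval,
    a positive result means it is rising.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx
```

For example, weekly test coverage readings of 82.0, 81.5, 81.0, and 80.5 percent fall by half a point per week, so the slope is negative; the same function applied to alert counts or deployment sizes would flag a rising series with a positive slope.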

Step2Dev includes deployment metrics instrumentation in the workflow it generates.

👉 step2dev.com