URL: https://docs.datadoghq.com/deployment_gates/setup/preconfigured/

Set Up Preconfigured Deployment Gates

Join the Preview!

Deployment Gates are in Preview. If you’re interested in this feature, complete the form to request access.

With preconfigured Deployment Gates, gates and rules are persisted in Datadog and referenced by service and environment at evaluation time. Preconfigured gates are a good fit when you want to share rules across many deployments, manage configuration in Terraform, or let non-CI users edit rules in the Datadog UI.

Looking to define rules inline in your deployment config? See Just-In-Time (JIT) Deployment Gates.

Create a gate

In addition to using the Deployment Gates UI, you can manage gates and rules programmatically with the Deployment Gates API or Datadog Terraform provider.
  1. Go to Software Delivery > Deployment Gates > Configuration.
  2. Click Create Gate.
  3. Configure the following settings:
    • Service: The service name (example: transaction-backend).
    • Environment: The target environment (example: dev).
    • Identifier (optional, defaults to default): A unique name that distinguishes multiple gates on the same service and environment. Use this to:
      • Allow different deployment strategies (example: fast-deploy vs default)
      • Distinguish deployment phases (example: pre-deploy vs post-deploy)
      • Define canary stages (example: pre-deploy vs canary-20pct)
    • Evaluation Mode: Enable Dry Run to test gate behavior without impacting deployments. A dry-run gate always responds with a pass status, while the in-app result reflects the real evaluation. This is useful for an initial assessment of the gate before it can block the deployment pipeline.
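
As a rough illustration of how identifiers separate several gates on the same service and environment, the sketch below prints one datadog-ci invocation per deployment phase. The helper name gate_command and the chosen phase names are hypothetical; the flags are the ones used by the datadog-ci deployment gate command documented on this page.

```shell
#!/bin/sh
# Hypothetical helper: prints the datadog-ci command for one gate.
# Arguments: service, environment, identifier (falls back to "default").
gate_command() {
  service=$1
  env_name=$2
  identifier=${3:-default}
  printf 'datadog-ci deployment gate --service %s --env %s --identifier %s\n' \
    "$service" "$env_name" "$identifier"
}

# One gate per phase of the same service/environment pair.
for phase in pre-deploy canary-20pct post-deploy; do
  gate_command transaction-backend dev "$phase"
done
```

Each printed command targets a different preconfigured gate, so every phase can carry its own rules.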

Add rules to a gate

Each gate requires one or more rules to evaluate. All rules must pass for the gate to succeed. For each rule, specify:

  1. Name: A descriptive label that appears on the Deployment Gates Evaluations page (for example, Check all P0 monitors).
  2. Type: Select Monitor or Faulty Deployment Detection.
  3. Additional settings based on the selected rule type. See Rule types for the available options.
  4. Evaluation Mode: When a rule is set as a Dry Run, its result is not taken into account when computing the overall gate result.
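
The interaction between rule results and the two dry-run settings can be modeled with a few lines of shell. This is an illustrative model only, not Datadog's implementation: the function name gate_result and the "status:dry_run" argument format are assumptions made for the sketch, and it assumes the gate has at least one rule.

```shell
#!/bin/sh
# Illustrative model only (not Datadog's implementation).
# Usage: gate_result GATE_DRY_RUN RULE...
#   GATE_DRY_RUN: "true" or "false"
#   RULE: "<status>:<dry_run>", for example "fail:true"
# Prints the status returned to the caller and the status shown in the UI.
gate_result() {
  gate_dry_run=$1
  shift
  real_status=pass
  for rule in "$@"; do
    status=${rule%%:*}
    rule_dry_run=${rule##*:}
    # Dry-run rules are ignored when computing the gate result.
    if [ "$rule_dry_run" = "true" ]; then
      continue
    fi
    # Every remaining rule must pass for the gate to succeed.
    if [ "$status" != "pass" ]; then
      real_status=fail
    fi
  done
  # A dry-run gate always responds with pass; the UI keeps the real result.
  if [ "$gate_dry_run" = "true" ]; then
    returned_status=pass
  else
    returned_status=$real_status
  fi
  echo "returned=$returned_status in_app=$real_status"
}

gate_result false pass:false fail:true   # failing rule is dry-run, so ignored
gate_result false pass:false fail:false  # active rule fails the gate
gate_result true pass:false fail:false   # dry-run gate still responds pass
```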

Rule types

For the full schema and all available options, see the Deployment Gates API reference.

The Monitor rule evaluates the state of a set of monitors over a configurable period of time. It fails if at any time during the evaluation period:

  • No monitors match the query.
  • More than 50 monitors match the query.
  • Any matching monitor is in ALERT or NO_DATA state.

Configuration settings
  • Search Query: The query used to find the monitors to evaluate, based on the Search Monitor syntax. Filter on monitor tags:
    • Monitor static tags: service:transaction-backend
    • Tags within the monitor’s query: scope:"service:transaction-backend"
    • Tags within a monitor grouping: group:"service:transaction-backend"
  • Duration: The period of time (in seconds) for which the matching monitors are evaluated. Default is 0 (monitors are evaluated instantly). Maximum is 7200 seconds (2 hours).

Example queries
  • env:prod service:transaction-backend
  • env:prod (service:transaction-backend OR group:"service:transaction-backend" OR scope:"service:transaction-backend")
  • tag:"use_deployment_gates" team:payment
  • tag:"use_deployment_gates" AND (NOT group:("team:frontend"))

Notes:

  • group filters evaluate only matching groups.
  • Muted monitors are automatically excluded from the evaluation (the query always includes muted:false).
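
The three failure conditions of the Monitor rule can be sketched for a single point in time as follows. This is an illustrative model rather than the actual evaluation logic; the function name monitor_rule_status is hypothetical, and it takes the states of the monitors matched by the query as arguments.

```shell
#!/bin/sh
# Illustrative model only: checks the Monitor rule conditions for one
# point in time, given the states of the monitors matched by the query.
# Usage: monitor_rule_status STATE...   (for example: OK ALERT NO_DATA)
monitor_rule_status() {
  # No monitors match the query.
  if [ "$#" -eq 0 ]; then
    echo fail
    return
  fi
  # More than 50 monitors match the query.
  if [ "$#" -gt 50 ]; then
    echo fail
    return
  fi
  # Any matching monitor is in ALERT or NO_DATA state.
  for state in "$@"; do
    case $state in
      ALERT|NO_DATA)
        echo fail
        return
        ;;
    esac
  done
  echo pass
}

monitor_rule_status OK OK
monitor_rule_status OK NO_DATA
monitor_rule_status
```

With a non-zero Duration, the rule fails if these conditions hold at any time during the evaluation period, which this single-instant sketch does not model.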

The APM Faulty Deployment Detection rule uses Watchdog’s APM Faulty Deployment Detection analysis to compare the deployed version against previous versions of the same service. The analysis detects:

  • New types of errors.
  • Significant increases in error rates compared to previous versions.

The analysis is automatically performed for all APM-instrumented services, and no prior setup is required.

Configuration settings
  • Operation Name: Auto-populated from the service’s APM primary operation settings.
  • Duration: The period of time (in seconds) for which the analysis runs. For optimal analysis confidence, this value should be at least 900 seconds (15 minutes) after a deployment starts. Maximum is 7200 seconds (2 hours).
  • Included Resources (optional): A comma-separated list of APM resources to include in the analysis. When specified, only the listed resources are analyzed.
  • Excluded Resources (optional): A comma-separated list of APM resources to ignore (such as low-volume or low-priority endpoints).

Notes:

  • The rule is evaluated for each additional primary tag value, as well as in aggregate. To consider only a single primary tag, specify it when requesting a gate evaluation.
  • New errors and error rate increases are detected at the resource level.
  • This rule type does not support services marked as a database or as an inferred service.
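
If rules are managed programmatically, a small pre-flight check on the Duration value can catch mistakes early. The sketch below is a hypothetical helper (check_fdd_duration is not part of any Datadog tool) that applies the limits listed above: a 7200-second maximum and a suggested minimum of 900 seconds.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the Duration setting (in seconds) of a
# Faulty Deployment Detection rule, based on the limits listed above.
check_fdd_duration() {
  duration=$1
  case $duration in
    ''|*[!0-9]*)
      echo "invalid: not a non-negative integer"
      return 1
      ;;
  esac
  if [ "$duration" -gt 7200 ]; then
    echo "invalid: above the 7200-second maximum"
    return 1
  fi
  if [ "$duration" -lt 900 ]; then
    echo "ok: below the 900 seconds suggested for analysis confidence"
    return 0
  fi
  echo "ok"
}

check_fdd_duration 600
check_fdd_duration 900
check_fdd_duration 9000 || true
```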

Evaluate a gate from your pipeline

After the gate is configured, request an evaluation when deploying the related service, and decide whether to block or continue the deployment based on the result.

The datadog-ci deployment gate command runs the evaluation in a single command:

datadog-ci deployment gate --service transaction-backend --env staging --identifier default

If the Deployment Gate contains APM Faulty Deployment Detection rules, also specify the version (for example, --version 1.0.1).

The command:

  • Sends a request to start the gate evaluation and blocks until the evaluation is complete.
  • Provides a configurable timeout for how long to wait for an evaluation.
  • Has built-in automatic retries for errors.
  • Accepts --fail-on-error to customize behavior on unexpected Datadog errors.

The deployment gate command is available in datadog-ci versions v3.17.0 and above.

Required environment variables:

  • DD_API_KEY: Your API key.
  • DD_APP_KEY: Your application key.
  • DD_BETA_COMMANDS_ENABLED=1: The deployment gate command is a beta command.
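
A pipeline step can verify these variables before calling the command, so that a missing key fails fast with a clear message. The guard below is a minimal sketch; missing_gate_env is a hypothetical helper name, and it only checks that each variable is non-empty.

```shell
#!/bin/sh
# Hypothetical guard: prints the names of required variables that are unset
# or empty, and returns non-zero if any is missing.
missing_gate_env() {
  missing=""
  for name in DD_API_KEY DD_APP_KEY DD_BETA_COMMANDS_ENABLED; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Intended use in a pipeline step (not executed here):
#   missing_gate_env && datadog-ci deployment gate \
#     --service transaction-backend --env staging --identifier default
```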

For complete configuration options and usage examples, see the deployment gate command documentation.

Call Deployment Gates from an Argo Rollouts Kubernetes Resource by creating an AnalysisTemplate or a ClusterAnalysisTemplate. The template runs the datadog-ci deployment gate command to interact with the Deployment Gates API.

Use the template below as a starting point:

  • Replace <YOUR_DD_SITE> with your Datadog site name.
  • Define the API key and application key as environment variables. The example uses a Kubernetes Secret called datadog with two data values: api-key and app-key. You can also pass the values in plain text with value instead of valueFrom.

apiVersion: argoproj.io/v1alpha1
kind: ClusterAnalysisTemplate
metadata:
  name: datadog-job-analysis
spec:
  args:
    - name: service
    - name: env
  metrics:
    - name: datadog-job
      provider:
        job:
          spec:
            ttlSecondsAfterFinished: 300
            backoffLimit: 0
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: datadog-check
                    image: datadog/ci:v3.17.0
                    env:
                      - name: DD_BETA_COMMANDS_ENABLED
                        value: "1"
                      - name: DD_SITE
                        value: "<YOUR_DD_SITE>"
                      - name: DD_API_KEY
                        valueFrom:
                          secretKeyRef:
                            name: datadog
                            key: api-key
                      - name: DD_APP_KEY
                        valueFrom:
                          secretKeyRef:
                            name: datadog
                            key: app-key
                    command: ["/bin/sh", "-c"]
                    args:
                      - datadog-ci deployment gate --service {{ args.service }} --env {{ args.env }} --identifier default

  • The analysis template can receive arguments from the Rollout resource (such as service, env, and version). For more information, see the official Argo Rollouts docs.
  • ttlSecondsAfterFinished removes finished jobs after 5 minutes.
  • backoffLimit is set to 0 because the job should not be retried if the gate evaluation fails.

After you create the analysis template, reference it from the Argo Rollouts strategy:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
  labels:
    tags.datadoghq.com/service: transaction-backend
    tags.datadoghq.com/env: dev
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        ...
        - analysis:
            templates:
              - templateName: datadog-job-analysis
                clusterScope: true # Only needed for cluster analysis
            args:
              - name: env
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/env']
              - name: service
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/service']
              - name: version # Required for APM Faulty Deployment Detection rules
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/version']
        - ...

The Datadog Deployment Gate GitHub Action runs the evaluation as part of a workflow.

Add a DataDog/deployment-gate-github-action step to your existing deployment workflow:

name: Deploy with Datadog Deployment Gate
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Deploy Canary
        run: |
          echo "Deploying canary release for service:'my-service' in 'production'. Version 1.0.1"
          # Your deployment commands here
      - name: Evaluate Deployment Gate
        uses: DataDog/deployment-gate-github-action@v2.1.0
        env:
          DD_API_KEY: ${{ secrets.DD_API_KEY }}
          DD_APP_KEY: ${{ secrets.DD_APP_KEY }}
        with:
          service: my-service
          env: production
          identifier: default
      - name: Deploy
        run: |
          echo "Deployment Gate passed, proceeding with deployment"
          # Your deployment commands here

If the Deployment Gate contains APM Faulty Deployment Detection rules, also specify the version (for example, version: 1.0.1).

The action:

  • Sends a request to start the gate evaluation and blocks until the evaluation is complete.
  • Provides a configurable timeout for how long to wait for an evaluation.
  • Has built-in automatic retries for errors.
  • Accepts fail-on-error to customize behavior on unexpected Datadog errors.

Required environment variables:

  • DD_API_KEY: Your API key.
  • DD_APP_KEY: Your application key.

For complete configuration options and usage examples, see the DataDog/deployment-gate-github-action repository.

Use this script as a starting point. It evaluates a preconfigured gate without inline rules.

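Replace the following:

  • <YOUR_DD_SITE>: Your Datadog site name.
  • <YOUR_API_KEY>: Your API key.
  • <YOUR_APP_KEY>: Your application key.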

#!/bin/sh
# Configuration
MAX_RETRIES=3
DELAY_SECONDS=5
POLL_INTERVAL_SECONDS=15
MAX_POLL_TIME_SECONDS=10800 # 3 hours
API_URL="https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation"
API_KEY="<YOUR_API_KEY>"
APP_KEY="<YOUR_APP_KEY>"
# Inputs: $1 = service, $2 = environment, $3 = version
PAYLOAD=$(cat <<EOF
{
  "data": {
    "type": "deployment_gates_evaluation_request",
    "attributes": {
      "service": "$1",
      "env": "$2",
      "version": "$3"
    }
  }
}
EOF
)
# Step 1: Request evaluation
echo "Requesting evaluation..."
current_attempt=0
EVALUATION_ID=""
while [ "$current_attempt" -lt "$MAX_RETRIES" ]; do
  current_attempt=$((current_attempt + 1))
  # The body is written to response.txt; only the status code is captured.
  RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X POST "$API_URL" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: $API_KEY" \
    -H "DD-APPLICATION-KEY: $APP_KEY" \
    -d "$PAYLOAD")
  HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
  RESPONSE_BODY=$(cat response.txt)
  if [ "$HTTP_CODE" -ge 500 ] && [ "$HTTP_CODE" -le 599 ]; then
    echo "Attempt $current_attempt: 5xx Error ($HTTP_CODE). Retrying in $DELAY_SECONDS seconds..."
    sleep "$DELAY_SECONDS"
    continue
  elif [ "$HTTP_CODE" -ge 400 ] && [ "$HTTP_CODE" -le 499 ]; then
    echo "Client error ($HTTP_CODE): $RESPONSE_BODY"
    exit 1
  fi
  EVALUATION_ID=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.evaluation_id')
  if [ "$EVALUATION_ID" = "null" ] || [ -z "$EVALUATION_ID" ]; then
    echo "Failed to extract evaluation_id from response: $RESPONSE_BODY"
    exit 1
  fi
  echo "Evaluation started with ID: $EVALUATION_ID"
  break
done
# EVALUATION_ID is still empty only if every attempt returned a 5xx error.
# Checking the attempt counter instead would also skip polling when the
# last allowed attempt succeeds.
if [ -z "$EVALUATION_ID" ]; then
  echo "All retries exhausted for evaluation request, but treating 5xx errors as success."
  exit 0
fi
# Step 2: Poll for results
echo "Polling for results..."
start_time=$(date +%s)
poll_count=0
while true; do
  poll_count=$((poll_count + 1))
  current_time=$(date +%s)
  elapsed_time=$((current_time - start_time))
  if [ "$elapsed_time" -ge "$MAX_POLL_TIME_SECONDS" ]; then
    echo "Evaluation polling timeout after ${MAX_POLL_TIME_SECONDS} seconds"
    exit 1
  fi
  RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X GET "$API_URL/$EVALUATION_ID" \
    -H "DD-API-KEY: $API_KEY" \
    -H "DD-APPLICATION-KEY: $APP_KEY")
  HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
  RESPONSE_BODY=$(cat response.txt)
  if [ "$HTTP_CODE" -eq 404 ]; then
    echo "Evaluation not ready yet (404), retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
    sleep "$POLL_INTERVAL_SECONDS"
    continue
  elif [ "$HTTP_CODE" -ge 500 ] && [ "$HTTP_CODE" -le 599 ]; then
    echo "Server error ($HTTP_CODE) while polling, retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
    sleep "$POLL_INTERVAL_SECONDS"
    continue
  elif [ "$HTTP_CODE" -ge 400 ] && [ "$HTTP_CODE" -le 499 ]; then
    echo "Client error ($HTTP_CODE) while polling: $RESPONSE_BODY"
    exit 1
  fi
  GATE_STATUS=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.gate_status')
  if [ "$GATE_STATUS" = "pass" ]; then
    echo "Gate evaluation PASSED"
    exit 0
  elif [ "$GATE_STATUS" = "fail" ]; then
    echo "Gate evaluation FAILED"
    exit 1
  else
    echo "Evaluation still in progress (status: $GATE_STATUS), retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
    sleep "$POLL_INTERVAL_SECONDS"
    continue
  fi
done

The script:

  • Receives three inputs: service, environment, and version. version is required if the gate has APM Faulty Deployment Detection rules. You can also add identifier and primary_tag if needed.
  • Sends a request to start the evaluation and records the evaluation_id. Handles HTTP response codes:
    • 5xx: server error, retries with delay.
    • 4xx: client error, evaluation fails.
    • 2xx: evaluation started.
  • Polls the evaluation status endpoint with the evaluation_id until the evaluation is complete:
    • 5xx: server error, retries with delay.
    • 404: evaluation not started yet, retries with delay.
    • 4xx (except 404): client error, evaluation fails.
    • 2xx: check gate_status and retry with delay if not complete.
  • Polls every 15 seconds until the evaluation completes or the maximum polling time (10800 seconds = 3 hours by default) is reached.
  • If all retries are exhausted for the initial request (5xx responses), the script treats this as success to be resilient to API failures.

Adapt the script to your use case. It uses curl (to perform the request) and jq (to process the returned JSON). If those commands are not available, install them at the beginning of the script (for example, with apk add --no-cache curl jq).
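
One possible adaptation is to send identifier and primary_tag only when they are provided. The helper below is a sketch under stated assumptions: build_payload is a hypothetical name, the argument order is arbitrary, and values are inserted verbatim, so they must not contain quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: builds the evaluation request body, adding optional
# attributes only when a value is given. Values are inserted verbatim, so
# this sketch assumes they contain no quotes or backslashes.
# Usage: build_payload SERVICE ENV [VERSION] [IDENTIFIER] [PRIMARY_TAG]
build_payload() {
  attrs=$(printf '"service":"%s","env":"%s"' "$1" "$2")
  if [ -n "${3:-}" ]; then
    attrs=$(printf '%s,"version":"%s"' "$attrs" "$3")
  fi
  if [ -n "${4:-}" ]; then
    attrs=$(printf '%s,"identifier":"%s"' "$attrs" "$4")
  fi
  if [ -n "${5:-}" ]; then
    attrs=$(printf '%s,"primary_tag":"%s"' "$attrs" "$5")
  fi
  printf '{"data":{"type":"deployment_gates_evaluation_request","attributes":{%s}}}\n' "$attrs"
}

build_payload transaction-backend staging v123-456 my-custom-identifier region:us-central-1
```

In the script, the PAYLOAD assignment could then become PAYLOAD=$(build_payload "$@"), with the extra values passed as the fourth and fifth arguments.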

Deployment Gate evaluations are asynchronous. When you trigger an evaluation, it’s started in the background, and the API returns an evaluation ID that you can use to track its progress:

  • First, request a Deployment Gate evaluation, which starts the process and returns an evaluation ID.
  • Then, periodically poll the evaluation status endpoint with the evaluation ID to retrieve the result when the evaluation is complete. Polling every 10-20 seconds is recommended.

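Replace the following:

  • <YOUR_DD_SITE>: Your Datadog site name.
  • <YOUR_API_KEY>: Your API key.
  • <YOUR_APP_KEY>: Your application key.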

Request an evaluation for a gate that already exists in Datadog:

curl -X POST "https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: <YOUR_API_KEY>" \
  -H "DD-APPLICATION-KEY: <YOUR_APP_KEY>" \
  -d @- << EOF
{
  "data": {
    "type": "deployment_gates_evaluation_request",
    "attributes": {
      "service": "transaction-backend",
      "env": "staging",
      "identifier": "my-custom-identifier",
      "version": "v123-456",
      "primary_tag": "region:us-central-1"
    }
  }
}
EOF

Optional attributes:

  • identifier: Optional, defaults to default.
  • version: Required for APM Faulty Deployment Detection rules.
  • primary_tag: Optional, scopes down APM Faulty Deployment Detection analysis to the selected primary tag.

Note: A 404 HTTP response can mean the gate was not found, or the gate was found but has no rules.

If the gate evaluation was successfully started, a 202 HTTP status code is returned:

{
  "data": {
    "id": "<random_response_uuid>",
    "type": "deployment_gates_evaluation_response",
    "attributes": {
      "evaluation_id": "e9d2f04f-4f4b-494b-86e5-52f03e10c8e9"
    }
  }
}

The field data.attributes.evaluation_id contains the unique identifier for this gate evaluation.

Fetch the status of a gate evaluation by polling the status endpoint with the evaluation ID:

curl -X GET "https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation/<evaluation_id>" \
-H "DD-API-KEY: <YOUR_API_KEY>" \
-H "DD-APPLICATION-KEY: <YOUR_APP_KEY>"

Note: If you call this endpoint too soon after requesting the evaluation, a 404 HTTP response may be returned because the evaluation has not started yet. Retry a few seconds later.
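
When polling from your own tooling, it can help to isolate the decision taken for each HTTP status code. The function below is a minimal sketch; poll_action is a hypothetical name, and treating unexpected or non-numeric codes (for example, 000 when curl cannot connect) as a failure is an assumption of this sketch, not documented API behavior.

```shell
#!/bin/sh
# Hypothetical helper: maps the HTTP status code of a polling request to
# the next action: "retry", "fail", or "check_status".
poll_action() {
  code=$1
  case $code in
    ''|*[!0-9]*)
      # Not a number: treated as a failure (assumption of this sketch).
      echo fail
      return
      ;;
  esac
  if [ "$code" -eq 404 ]; then
    echo retry          # evaluation not started yet
  elif [ "$code" -ge 500 ] && [ "$code" -le 599 ]; then
    echo retry          # server error
  elif [ "$code" -ge 400 ] && [ "$code" -le 499 ]; then
    echo fail           # client error
  elif [ "$code" -ge 200 ] && [ "$code" -le 299 ]; then
    echo check_status   # inspect gate_status in the response body
  else
    echo fail           # unexpected code (assumption of this sketch)
  fi
}

poll_action 404
poll_action 200
```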

When a 200 HTTP response is returned, it has the following format:

{
  "data": {
    "id": "<random_response_uuid>",
    "type": "deployment_gates_evaluation_result_response",
    "attributes": {
      "dry_run": false,
      "evaluation_id": "e9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
      "evaluation_url": "https://app.datadoghq.com/ci/deployment-gates/evaluations?index=cdgates&query=level%3Agate+%40evaluation_id%3Ae9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
      "gate_id": "e140302e-0cba-40d2-978c-6780647f8f1c",
      "gate_status": "pass",
      "rules": [
        {
          "name": "Check service monitors",
          "status": "fail",
          "reason": "One or more monitors in ALERT state: https://app.datadoghq.com/monitors/34330981",
          "dry_run": true
        }
      ]
    }
  }
}

The field data.attributes.gate_status contains the result of the evaluation, with one of these values:

  • in_progress: The Deployment Gate evaluation is still in progress; continue polling.
  • pass: The Deployment Gate evaluation passed.
  • fail: The Deployment Gate evaluation failed.

Note: If the field data.attributes.dry_run is true, the field data.attributes.gate_status is always pass.
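
A pipeline typically turns gate_status into one of three decisions. The mapping below is a small sketch; gate_decision is a hypothetical name, and treating any unrecognized value as "keep polling" is an assumption that mirrors the behavior of the example script on this page.

```shell
#!/bin/sh
# Hypothetical helper: maps gate_status to a pipeline decision.
# Prints "continue", "block", or "poll"; unknown values are treated as
# "poll" here, which is an assumption of this sketch.
gate_decision() {
  case $1 in
    pass) echo continue ;;
    fail) echo block ;;
    in_progress) echo poll ;;
    *) echo poll ;;
  esac
}

gate_decision pass
gate_decision fail
gate_decision in_progress
```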

Recommendation for first-time onboarding

When integrating Deployment Gates into your Continuous Delivery workflow, an evaluation phase helps confirm the product is working as expected before it impacts deployments. Use the Dry Run evaluation mode and the Deployment Gates Evaluations page:

  1. Create a gate for a service and set the Evaluation Mode to Dry Run.
  2. Add the gate evaluation to your deployment process. While the gate is in dry-run mode, the API always returns pass and deployments are not impacted by the gate result.
  3. After a period of time (for example, 1-2 weeks), check the gate and rule executions on the Deployment Gates Evaluations page. The UI shows the real status, so you can see when the gate would have failed and the reason behind it.
  4. When you are confident that the gate behavior is as you expect, edit the gate and switch the evaluation mode from Dry Run to Active. Afterwards, the API starts returning the actual status and deployments start getting promoted or rolled back based on the gate result.
