Deployment Gates are in Preview. If you’re interested in this feature, complete the form to request access.
With Just-In-Time (JIT) Deployment Gates, rules are defined inline in the evaluation request. No gate needs to exist in Datadog ahead of time, which makes JIT a good fit for rules-as-code and per-deployment flexibility.
Looking for persistent gates managed in the Datadog UI, API, or Terraform? See Preconfigured Deployment Gates.
Example configuration:
{
"configuration": {
"dry_run": false,
"rules": [
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:transaction-backend env:production",
"duration": 300
}
}
]
}
}
Top-level fields:
- rules (required): One or more rule entries. All rules must pass for the gate to pass.
- dry_run (optional): When true, the gate always returns pass over the API while the real result is recorded in the UI. Useful for onboarding. See Recommendation for first-time onboarding.
Each rule has these fields:
- type (required): The rule type, monitor or faulty_deployment_detection. See Rule types for what each evaluates.
- name (required): A human-readable label that shows up on the Deployment Gates Evaluations page.
- options (required): Rule-specific settings; see Rule types.
- dry_run (optional): Per-rule dry-run override. Overrides the gate-level dry_run.
For the full schema and all available options, see the Deployment Gates API reference.
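The relationship between the gate-level and per-rule dry_run fields can be sketched in a few lines of POSIX shell. This is only an illustration of the override described above; the function name effective_dry_run is hypothetical and not part of any Datadog tool.

```shell
#!/bin/sh
# Hypothetical helper, not part of any Datadog tool.
# $1: gate-level dry_run ("true" or "false")
# $2: per-rule dry_run, or empty when the rule does not set one
effective_dry_run() {
  if [ -n "${2:-}" ]; then
    printf '%s\n' "$2"   # the per-rule value overrides the gate-level value
  else
    printf '%s\n' "$1"   # no per-rule value: fall back to the gate-level value
  fi
}

effective_dry_run false true   # rule opts in to dry-run on its own
effective_dry_run true         # rule inherits the gate-level value
```

A sketch like this can help when reviewing a config file by hand: a rule is evaluated in dry-run mode whenever the value it resolves to is true.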
The Monitor rule evaluates the state of a set of monitors over a configurable period of time. It fails if at any time during the evaluation period:
- A matching monitor is in the ALERT or NO_DATA state.
Options:
- query: The monitor search query, based on the Search Monitor syntax. Filter on monitor tags:
  - service:transaction-backend
  - scope:"service:transaction-backend"
  - group:"service:transaction-backend"
- duration: The period of time (in seconds) for which the matching monitors are evaluated. Default is 0 (monitors are evaluated instantly). Maximum is 7200 seconds (2 hours).
Example inline rule:
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:transaction-backend env:production",
"duration": 300
}
}
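The documented bounds on duration (default 0, maximum 7200 seconds) can be checked locally before a request is sent. The following is a minimal sketch; valid_duration is a hypothetical helper name, and the API performs its own validation regardless.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is a whole number of
# seconds between 0 and 7200, the range documented for rule durations.
valid_duration() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, negative, or not a whole number
  esac
  [ "$1" -le 7200 ]
}

if valid_duration 300; then
  echo "300: within the documented range"
fi
if ! valid_duration 9000; then
  echo "9000: above the 7200-second maximum"
fi
```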
Notes:
- group filters evaluate only matching groups.
- … muted:false).
This rule type uses Watchdog’s APM Faulty Deployment Detection analysis to compare the deployed version against previous versions of the same service. The analysis detects:
The analysis is automatically performed for all APM-instrumented services, and no prior setup is required.
Options:
- duration: The period of time (in seconds) for which the analysis runs. For optimal analysis confidence, this value should be at least 900 seconds (15 minutes) after a deployment starts. Maximum is 7200 seconds (2 hours).
- included_resources (optional): APM resources to include in the analysis. When specified, only the listed resources are analyzed.
- excluded_resources (optional): APM resources to ignore (such as low-volume or low-priority endpoints).
Example inline rule:
{
"type": "faulty_deployment_detection",
"name": "APM Faulty Deployment Detection",
"options": {
"duration": 900,
"excluded_resources": ["GET /healthcheck"]
}
}
Notes:
- … primary_tag in the request attributes.
- … database or inferred service.
You can request a gate evaluation from your deployment pipeline in several ways. The datadog-ci CLI, Argo Rollouts integration, and GitHub Action accept inline rules through a JSON config file using camel case keys (dryRun). Direct API calls and the generic script send the same configuration in the request payload using snake case keys (dry_run), matching the API schema.
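To reuse one rule set across both forms, the CLI config file can be rewritten into the API form with a small filter. The sketch below assumes that dryRun is the only key that differs between the two, which is true of the examples in this document but is not confirmed for the full schema; to_api_keys is a hypothetical name.

```shell
#!/bin/sh
# Sketch: rewrite the camel case dryRun key used by the CLI config file
# into the snake case dry_run key used by the API payload.
# Assumption: dryRun is the only key that differs between the two forms.
to_api_keys() {
  sed 's/"dryRun"[[:space:]]*:/"dry_run":/g'
}

# Example: convert a small config read from standard input.
printf '%s\n' '{"dryRun": true, "rules": []}' | to_api_keys
```

A text substitution is enough here because the key name is distinctive; a JSON-aware tool would be a safer choice if rule names or queries could contain the same text.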
The datadog-ci deployment gate command runs the evaluation in a single command. Pass a JSON config file with the --config flag:
datadog-ci deployment gate --service transaction-backend --env production --version 1.2.3 --config ./gate-config.json
Example gate-config.json:
{
"dryRun": false,
"rules": [
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:transaction-backend env:production",
"duration": 300
}
},
{
"type": "faulty_deployment_detection",
"name": "APM Faulty Deployment Detection",
"options": {
"duration": 900,
"excluded_resources": ["GET /healthcheck"]
}
}
]
}
The command:
- Accepts --fail-on-error to customize behavior on unexpected Datadog errors.
The deployment gate command is available in datadog-ci versions v3.17.0 and above. The --config flag requires version v5.19.0 or above.
Required environment variables:
- DD_API_KEY: Your API key.
- DD_APP_KEY: Your application key.
- DD_BETA_COMMANDS_ENABLED=1: The deployment gate command is a Preview command.
For complete configuration options and usage examples, see the deployment gate command documentation.
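A pipeline step can check that these variables are present before it invokes the CLI. The following is a minimal sketch; missing_vars is a hypothetical helper, and it only checks that each variable is non-empty, not that the keys are valid or that DD_BETA_COMMANDS_ENABLED is set to 1.

```shell
#!/bin/sh
# Hypothetical preflight check: print the name of each required variable
# that is unset or empty, one per line. Prints nothing when all are set.
missing_vars() {
  for name in DD_API_KEY DD_APP_KEY DD_BETA_COMMANDS_ENABLED; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

missing=$(missing_vars)
if [ -n "$missing" ]; then
  # A real pipeline step would probably stop here with a non-zero exit code.
  echo "Missing required variables:" $missing >&2
fi
```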
Call Deployment Gates from an Argo Rollouts Kubernetes Resource by creating an AnalysisTemplate or a ClusterAnalysisTemplate. The template runs the datadog-ci deployment gate command to interact with the Deployment Gates API.
Use the template below as a starting point:
- Replace <YOUR_DD_SITE> with your Datadog site name.
- Create a Kubernetes Secret named datadog with two data values: api-key and app-key. You can also pass the values in plain text with value instead of valueFrom.
- Use a datadog-ci image that supports the --config flag (version v5.19.0 or higher).
Store the gate config in a ConfigMap, then mount it into the job and pass --config to the CLI:
apiVersion: v1
kind: ConfigMap
metadata:
  name: gate-config
data:
  gate-config.json: |
    {
      "dryRun": false,
      "rules": [
        {
          "type": "monitor",
          "name": "Service monitors",
          "options": {
            "query": "service:transaction-backend env:production",
            "duration": 300
          }
        },
        {
          "type": "faulty_deployment_detection",
          "name": "APM Faulty Deployment Detection",
          "options": {
            "duration": 900,
            "excluded_resources": ["GET /healthcheck"]
          }
        }
      ]
    }
---
apiVersion: argoproj.io/v1alpha1
kind: ClusterAnalysisTemplate
metadata:
  name: datadog-job-analysis
spec:
  args:
    - name: service
    - name: env
    - name: version
  metrics:
    - name: datadog-job
      provider:
        job:
          spec:
            ttlSecondsAfterFinished: 300
            backoffLimit: 0
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: datadog-check
                    image: datadog/ci:latest
                    env:
                      - name: DD_BETA_COMMANDS_ENABLED
                        value: "1"
                      - name: DD_SITE
                        value: "<YOUR_DD_SITE>"
                      - name: DD_API_KEY
                        valueFrom:
                          secretKeyRef:
                            name: datadog
                            key: api-key
                      - name: DD_APP_KEY
                        valueFrom:
                          secretKeyRef:
                            name: datadog
                            key: app-key
                    command: ["/bin/sh", "-c"]
                    args:
                      - datadog-ci deployment gate --service {{ args.service }} --env {{ args.env }} --version {{ args.version }} --config /etc/datadog/gate-config.json
                    volumeMounts:
                      - name: gate-config
                        mountPath: /etc/datadog
                volumes:
                  - name: gate-config
                    configMap:
                      name: gate-config
- The analysis template receives its arguments from the Rollout resource (service, env, version). For more information, see the official Argo Rollouts docs.
- ttlSecondsAfterFinished removes finished jobs after 5 minutes.
- backoffLimit is set to 0 because the job should not be retried if the gate evaluation fails.
After you create the analysis template, reference it from the Argo Rollouts strategy:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
  labels:
    tags.datadoghq.com/service: transaction-backend
    tags.datadoghq.com/env: dev
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        ...
        - analysis:
            templates:
              - templateName: datadog-job-analysis
                clusterScope: true # Only needed for cluster analysis
            args:
              - name: env
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/env']
              - name: service
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/service']
              - name: version # Required for APM Faulty Deployment Detection rules
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/version']
        - ...
The Datadog Deployment Gate GitHub Action runs the evaluation as part of a workflow. Commit a gate configuration file to the repository and pass its path with the config input. The config input requires version v2.1.0 or above:
name: Deploy with Datadog Deployment Gate
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v5
      - name: Deploy Canary
        run: |
          echo "Deploying canary release for service:'my-service' in 'production'. Version 1.0.1"
          # Your deployment commands here
      - name: Evaluate Deployment Gate
        uses: DataDog/deployment-gate-github-action@v2.1.0
        env:
          DD_API_KEY: ${{ secrets.DD_API_KEY }}
          DD_APP_KEY: ${{ secrets.DD_APP_KEY }}
        with:
          service: my-service
          env: production
          version: 1.0.1
          config: .github/gate-config.json
      - name: Deploy
        run: |
          echo "Deployment Gate passed, proceeding with deployment"
          # Your deployment commands here
Example .github/gate-config.json:
{
"dryRun": false,
"rules": [
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:my-service env:production",
"duration": 300
}
},
{
"type": "faulty_deployment_detection",
"name": "APM Faulty Deployment Detection",
"options": {
"duration": 900,
"excluded_resources": ["GET /healthcheck"]
}
}
]
}
The action:
- Accepts fail-on-error to customize behavior on unexpected Datadog errors.
Required environment variables:
- DD_API_KEY: Your API key.
- DD_APP_KEY: Your application key.
For complete configuration options and usage examples, see the DataDog/deployment-gate-github-action repository.
Use this script as a starting point. It evaluates a gate using inline JIT rules.
Replace the following:
- <YOUR_DD_SITE>: Your Datadog site name
- <YOUR_API_KEY>: Your API key
- <YOUR_APP_KEY>: Your application key
#!/bin/sh
# Configuration
MAX_RETRIES=3
DELAY_SECONDS=5
POLL_INTERVAL_SECONDS=15
MAX_POLL_TIME_SECONDS=10800 # 3 hours
API_URL="https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation"
API_KEY="<YOUR_API_KEY>"
APP_KEY="<YOUR_APP_KEY>"
PAYLOAD=$(cat <<EOF
{
"data": {
"type": "deployment_gates_evaluation_request",
"attributes": {
"service": "$1",
"env": "$2",
"version": "$3",
"configuration": {
"dry_run": false,
"rules": [
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:$1 env:$2",
"duration": 300
}
},
{
"type": "faulty_deployment_detection",
"name": "APM Faulty Deployment Detection",
"options": {
"duration": 900,
"excluded_resources": ["GET /healthcheck"]
}
}
]
}
}
}
}
EOF
)
# Step 1: Request evaluation
echo "Requesting evaluation..."
current_attempt=0
while [ $current_attempt -lt $MAX_RETRIES ]; do
current_attempt=$((current_attempt + 1))
RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X POST "$API_URL" \
-H "Content-Type: application/json" \
-H "DD-API-KEY: $API_KEY" \
-H "DD-APPLICATION-KEY: $APP_KEY" \
-d "$PAYLOAD")
HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
RESPONSE_BODY=$(cat response.txt)
if [ ${HTTP_CODE} -ge 500 ] && [ ${HTTP_CODE} -le 599 ]; then
echo "Attempt $current_attempt: 5xx Error ($HTTP_CODE). Retrying in $DELAY_SECONDS seconds..."
sleep $DELAY_SECONDS
continue
elif [ ${HTTP_CODE} -ge 400 ] && [ ${HTTP_CODE} -le 499 ]; then
echo "Client error ($HTTP_CODE): $RESPONSE_BODY"
exit 1
fi
EVALUATION_ID=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.evaluation_id')
if [ "$EVALUATION_ID" = "null" ] || [ -z "$EVALUATION_ID" ]; then
echo "Failed to extract evaluation_id from response: $RESPONSE_BODY"
exit 1
fi
echo "Evaluation started with ID: $EVALUATION_ID"
break
done
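# EVALUATION_ID is set only when a request succeeded. Testing the attempt counter here would also match a success on the final attempt.
if [ -z "$EVALUATION_ID" ]; then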
echo "All retries exhausted for evaluation request, but treating 5xx errors as success."
exit 0
fi
# Step 2: Poll for results
echo "Polling for results..."
start_time=$(date +%s)
poll_count=0
while true; do
poll_count=$((poll_count + 1))
current_time=$(date +%s)
elapsed_time=$((current_time - start_time))
if [ $elapsed_time -ge $MAX_POLL_TIME_SECONDS ]; then
echo "Evaluation polling timeout after ${MAX_POLL_TIME_SECONDS} seconds"
exit 1
fi
RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X GET "$API_URL/$EVALUATION_ID" \
-H "DD-API-KEY: $API_KEY" \
-H "DD-APPLICATION-KEY: $APP_KEY")
HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
RESPONSE_BODY=$(cat response.txt)
if [ ${HTTP_CODE} -eq 404 ]; then
echo "Evaluation not ready yet (404), retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
sleep $POLL_INTERVAL_SECONDS
continue
elif [ ${HTTP_CODE} -ge 500 ] && [ ${HTTP_CODE} -le 599 ]; then
echo "Server error ($HTTP_CODE) while polling, retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
sleep $POLL_INTERVAL_SECONDS
continue
elif [ ${HTTP_CODE} -ge 400 ] && [ ${HTTP_CODE} -le 499 ]; then
echo "Client error ($HTTP_CODE) while polling: $RESPONSE_BODY"
exit 1
fi
GATE_STATUS=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.gate_status')
if [ "$GATE_STATUS" = "pass" ]; then
echo "Gate evaluation PASSED"
exit 0
elif [ "$GATE_STATUS" = "fail" ]; then
echo "Gate evaluation FAILED"
exit 1
else
echo "Evaluation still in progress (status: $GATE_STATUS), retrying in $POLL_INTERVAL_SECONDS seconds... (attempt $poll_count, elapsed: ${elapsed_time}s)"
sleep $POLL_INTERVAL_SECONDS
continue
fi
done
The script:
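- Receives three inputs: service, environment, and version. version is required if one or more APM Faulty Deployment Detection rules are evaluated.
- Sends a request to start the gate evaluation and records the evaluation_id. Handles HTTP response codes:
  - 5xx: retries with a delay, up to MAX_RETRIES attempts. If every attempt fails, the script treats this as success and exits with code 0.
  - 4xx: prints the response body and exits with a failure.
- Polls the evaluation status endpoint with the evaluation_id until the evaluation is complete:
  - 404 or 5xx: retries with a delay.
  - Other 4xx: prints the response body and exits with a failure.
  - Otherwise: checks gate_status and retries with a delay if the evaluation is not complete.
Adapt the script to your use case. It uses curl (to perform the request) and jq (to process the returned JSON). If those commands are not available, install them at the beginning of the script (for example, with apk add --no-cache curl jq).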
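When adapting the script, the status-code handling in the polling loop can be pulled into one function so that it is easier to read and test. This is a sketch of that refactoring, not a change to the behavior; classify_poll_code is a hypothetical name.

```shell
#!/bin/sh
# Sketch: classify an HTTP status code from the polling request.
# Prints "retry" for 404 and 5xx, "fatal" for other 4xx codes, and "ok"
# for anything else, mirroring the branches of the polling loop.
classify_poll_code() {
  code=$1
  if [ "$code" -eq 404 ] || { [ "$code" -ge 500 ] && [ "$code" -le 599 ]; }; then
    echo retry
  elif [ "$code" -ge 400 ] && [ "$code" -le 499 ]; then
    echo fatal
  else
    echo ok
  fi
}

# Example: decide what to do with a 503 response.
classify_poll_code 503
```

Inside the loop, a case statement on the function's output could then replace the chain of numeric comparisons.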
Deployment Gate evaluations are asynchronous. When you trigger an evaluation, it’s started in the background, and the API returns an evaluation ID that you can use to track its progress:
Replace the following:
- <YOUR_DD_SITE>: Your Datadog site name
- <YOUR_API_KEY>: Your API key
- <YOUR_APP_KEY>: Your application key
Pass configuration with inline rules (snake_case at the API boundary):
curl -X POST "https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation" \
-H "Content-Type: application/json" \
-H "DD-API-KEY: <YOUR_API_KEY>" \
-H "DD-APPLICATION-KEY: <YOUR_APP_KEY>" \
-d @- << 'EOF'
{
"data": {
"type": "deployment_gates_evaluation_request",
"attributes": {
"service": "transaction-backend",
"env": "production",
"version": "1.2.3",
"configuration": {
"dry_run": false,
"rules": [
{
"type": "monitor",
"name": "Service monitors",
"options": {
"query": "service:transaction-backend env:production",
"duration": 300
}
},
{
"type": "faulty_deployment_detection",
"name": "APM Faulty Deployment Detection",
"options": {
"duration": 900,
"excluded_resources": ["GET /healthcheck"]
}
}
]
}
}
}
}
EOF
If the gate evaluation was successfully started, a 202 HTTP status code is returned:
{
"data": {
"id": "<random_response_uuid>",
"type": "deployment_gates_evaluation_response",
"attributes": {
"evaluation_id": "e9d2f04f-4f4b-494b-86e5-52f03e10c8e9"
}
}
}
The field data.attributes.evaluation_id contains the unique identifier for this gate evaluation.
Fetch the status of a gate evaluation by polling the status endpoint with the evaluation ID:
curl -X GET "https://api.<YOUR_DD_SITE>/api/v2/deployments/gates/evaluation/<evaluation_id>" \
-H "DD-API-KEY: <YOUR_API_KEY>" \
-H "DD-APPLICATION-KEY: <YOUR_APP_KEY>"
Note: If you call this endpoint too soon after requesting the evaluation, a 404 HTTP response may be returned because the evaluation has not started yet. Retry a few seconds later.
When a 200 HTTP response is returned, it has the following format:
{
"data": {
"id": "<random_response_uuid>",
"type": "deployment_gates_evaluation_result_response",
"attributes": {
"dry_run": false,
"evaluation_id": "e9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
"evaluation_url": "https://app.datadoghq.com/ci/deployment-gates/evaluations?index=cdgates&query=level%3Agate+%40evaluation_id%3Ae9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
"gate_id": "e140302e-0cba-40d2-978c-6780647f8f1c",
"gate_status": "fail",
"rules": [
{
"name": "Service monitors",
"status": "fail",
"reason": "One or more monitors in ALERT state: https://app.datadoghq.com/monitors/34330981",
"dry_run": false
}
]
}
}
}
The field data.attributes.gate_status contains the result of the evaluation, with one of these values:
- in_progress: The Deployment Gate evaluation is still in progress; continue polling.
- pass: The Deployment Gate evaluation passed.
- fail: The Deployment Gate evaluation failed.
Note: If the field data.attributes.dry_run is true, the field data.attributes.gate_status is always pass.
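The three gate_status values map naturally onto pipeline decisions. The following sketch shows one way to express that mapping in shell; the function name gate_decision and the words promote, rollback, and wait are illustrative choices, not values returned by the API.

```shell
#!/bin/sh
# Sketch: turn a gate_status value into a pipeline decision.
# Prints "promote", "rollback", or "wait". Returns non-zero for any value
# other than the three documented statuses.
gate_decision() {
  case "$1" in
    pass)        echo promote ;;
    fail)        echo rollback ;;
    in_progress) echo wait ;;
    *)           echo "unknown gate_status: $1" >&2; return 1 ;;
  esac
}

# Example: a passing gate lets the deployment proceed.
gate_decision pass
```

Treating unknown values as an error, rather than as in_progress, keeps a malformed response from causing an endless polling loop.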
When integrating Deployment Gates into your Continuous Delivery workflow, an evaluation phase helps confirm the product is working as expected before it impacts deployments. Use dry-run mode and the Deployment Gates Evaluations page:
- Set dry_run: true on the configuration (or dryRun: true in the CLI config file). To mark only some rules as dry-run, set dry_run per rule. A dry-run evaluation always returns pass over the API, but the real result is recorded in the UI.
- When the recorded results match your expectations, set dry_run to false. Afterwards, the API starts returning the actual status and deployments start getting promoted or rolled back based on the gate result.