URL: https://docs.datadoghq.com/security/ai_guard/setup/

Set Up AI Guard

Complete the following steps to set up AI Guard:

1. Check prerequisites

Before you set up AI Guard, ensure you have everything you need:

  • While AI Guard is in Preview, Datadog needs to enable a backend feature flag for each organization in the Preview. Contact Datadog support with one or more Datadog organization names and regions to enable it.
  • Certain setup steps require specific Datadog permissions. An admin may need to create a new role with the required permissions and assign it to you:
    | Permission | Type | Description |
    |---|---|---|
    | AI Guard Evaluate (ai_guard_evaluate) | Write | Required to call the AI Guard evaluate API and to create an application key with the ai_guard_evaluate scope. |
    | AI Guard View (ai_guard_view) | Read | Required to view the AI Guard UI, including signals, spans, and read-only settings (service blocking policies, evaluation sensitivity, tool policies, tool allowlist). Also required to report false positives. |
    | AI Guard Write (ai_guard_write) | Write | Required to modify AI Guard configuration, including blocking policies, sensitive data scanning, tool policies, tool blocking, tool allowlist, and evaluation sensitivity thresholds. |
    | User Access Manage (user_access_manage) | Write | Required to create a restricted dataset that limits access to AI Guard spans with Data Access Control. |

Usage limits

The AI Guard evaluator API has the following usage limits:

  • 1 billion tokens evaluated per day.
  • 12,000 requests per minute, per IP.

If you exceed these limits, or expect to exceed them soon, contact Datadog support to discuss possible solutions.
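
To stay under the per-IP request limit, one option is to throttle calls on the client side. The following is a minimal sketch of a sliding-window limiter, not part of any Datadog SDK: the class name and method are hypothetical, and the default of 12,000 calls per 60 seconds is taken from the limit listed above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: allow at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=12_000, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls inside the window

    def try_acquire(self):
        """Record the call and return True if under the limit, else return False."""
        now = self.clock()
        # Discard timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

Call `try_acquire()` before each evaluation request and back off or queue the request when it returns False. The limit applies per IP, so several processes behind one address would need to share a budget.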

2. Create API and application keys

To use AI Guard, you need at least one API key and one application key set in your Agent services, usually using environment variables. Follow the instructions at API and Application Keys to create both.

When adding scopes for the application key, add the ai_guard_evaluate scope. The user creating the application key must have the AI Guard Evaluate permission.
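
As an illustration, a service can check at startup that both keys are present. This sketch assumes the keys are exposed as `DD_API_KEY` and `DD_APP_KEY`, the names Datadog libraries conventionally read; confirm the names your SDK expects. The helper `load_datadog_keys` is hypothetical.

```python
import os


def load_datadog_keys(env=os.environ):
    """Return (api_key, app_key), or raise if either variable is unset or empty."""
    # ASSUMPTION: conventional Datadog variable names; adjust to your setup.
    names = ("DD_API_KEY", "DD_APP_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["DD_API_KEY"], env["DD_APP_KEY"]
```

Failing fast here surfaces a misconfigured deployment before the first evaluation call is attempted.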

3. Instrument your application

Choose an instrumentation approach based on your framework and language:

SDK

The AI Guard SDK provides language-specific libraries (Python, JavaScript, Java, Ruby) to call the AI Guard REST API and monitor activity in real time in Datadog.

Automatic integrations

Automatic integrations provide out-of-the-box AI Guard protection for supported frameworks. When you run your application with the Datadog SDK, AI Guard evaluations are automatically performed without requiring any code changes.

| Language | Supported Frameworks |
|---|---|
| Python | LangChain |
| Node.js | AI SDK |

Manual integrations

Manual integrations require additional configuration to enable AI Guard protection for supported frameworks.

| Language | Supported Frameworks |
|---|---|
| Python | Amazon Strands, LiteLLM Proxy |

HTTP API

The AI Guard HTTP API lets you call the AI Guard JSON:API endpoint directly with any HTTP client, for languages or environments the SDK doesn’t cover.
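
The sketch below shows roughly what a direct call could look like, using only the Python standard library. It builds the request without sending it. The endpoint path `/api/v2/ai-guard/evaluate`, the `messages` attribute, and the chat-style message shape are assumptions for illustration only; take the actual path and schema from the AI Guard HTTP API reference. `DD-API-KEY` and `DD-APPLICATION-KEY` are the standard Datadog authentication headers.

```python
import json
import urllib.request


def build_evaluate_request(base_url, api_key, app_key, messages):
    """Build (but do not send) a POST request for an AI Guard evaluation."""
    # ASSUMPTION: path and body layout are illustrative, not confirmed by this page.
    url = base_url.rstrip("/") + "/api/v2/ai-guard/evaluate"
    # JSON:API-style envelope: the payload sits under data.attributes.
    body = {"data": {"attributes": {"messages": messages}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response. The base URL depends on your Datadog site.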

4. Create a custom retention filter

To view AI Guard evaluations in Datadog, create a custom retention filter for AI Guard-generated spans. Follow the linked instructions to create a retention filter with the following settings:

  • Retention query: resource_name:ai_guard
  • Span rate: 100%
  • Trace rate: 100%

5. Configure AI Guard policies

AI Guard provides settings to control how evaluations are enforced, how sensitive threat detection is, and whether sensitive data scanning is enabled.

Configure service policies

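On the Security > AI Guard > Settings > Services page, you can configure policies that determine what actions AI Guard should take when it detects unsafe content. For each policy, you determine the blocking behavior and whether sensitive data scanning is enabled, as described in the sections that follow.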

Beside Default policy, click Edit to set AI Guard’s default behavior. To override the default behavior, click Add Service Policy, select the service and environment you want your override to apply to, then configure the more specialized policy.

Blocking policy

By default, AI Guard evaluates conversations and returns an action (ALLOW, DENY, or ABORT) but does not block requests. To enable blocking so that DENY and ABORT actions actively prevent unsafe interactions from proceeding, configure the blocking policy for your services.

You can configure blocking at different levels of granularity, with more specific settings taking priority:

  • Organization-wide: Apply a default blocking policy to all services and environments.
  • Per environment: Override the organization default for a specific environment.
  • Per service: Override the organization default for a specific service.
  • Per service and environment: Override all of the above for a specific service in a specific environment (for example, enable blocking in production but not in staging).
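
The precedence rule can be pictured as a lookup that tries the most specific scope first. This is a sketch of the idea, not Datadog's implementation: the function and the `policies` mapping are hypothetical, and the ordering between a per-service and a per-environment policy is an assumption, since the list above only states that more specific settings win.

```python
def resolve_blocking(policies, service, env):
    """Return the blocking setting from the most specific matching policy.

    `policies` maps (service, env) tuples to booleans; None means "any".
    """
    # Most specific scope first.
    # ASSUMPTION: a per-service policy outranks a per-environment policy.
    for key in ((service, env), (service, None), (None, env), (None, None)):
        if key in policies:
            return policies[key]
    # No policy configured: evaluate without blocking, matching the default.
    return False


policies = {
    (None, None): False,         # organization-wide default
    (None, "prod"): True,        # per environment
    ("checkout", "prod"): False, # per service and environment
}
```

With these policies, `resolve_blocking(policies, "billing", "prod")` falls through to the per-environment entry, while `resolve_blocking(policies, "checkout", "prod")` uses the service-and-environment override.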

Sensitive data scanning

AI Guard can detect personally identifiable information (PII) such as email addresses, phone numbers, and SSNs, as well as secrets such as API keys and tokens, in LLM conversations. When you create or edit a policy for a service, you can choose to enable or disable sensitive data detection.

When enabled, AI Guard scans the last message in each evaluation call, including user prompts, assistant responses, tool call arguments, and tool call results. Findings appear on APM traces for visibility. Sensitive data scanning is detection-only; findings do not independently trigger blocking.

Block specific tools

You can configure AI Guard to block requests for specific tools, for specific services and environments. To do so, go to Security > AI Guard > Settings > Tool Blocklist. Click Add Tool Blocking Configuration, select the service, environment, and tool, and choose whether AI Guard should follow the default service policy or block all requests for the tool.

Evaluation sensitivity

AI Guard assigns a confidence score to each threat category it detects (for example, prompt injection or jailbreaking). You can control the minimum confidence score required for AI Guard to flag a threat by going to Security > AI Guard > Settings > Evaluation Sensitivity.

Evaluation sensitivity is a value between 0.0 and 1.0, with a default of 0.5.

  • A lower value increases sensitivity: AI Guard flags threats even when the confidence is low, surfacing more potential attacks but also more false positives.
  • A higher value decreases sensitivity: AI Guard only flags threats when the confidence is high, reducing noise but potentially missing some attacks.
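
The effect of the threshold can be sketched as a filter over per-category confidence scores. The function name and category names are hypothetical, and the inclusive comparison (`>=`) is an assumption; this page does not say how a score exactly equal to the threshold is treated.

```python
def flagged_categories(scores, sensitivity=0.5):
    """Return the threat categories whose confidence meets the threshold."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0.0 and 1.0")
    # ASSUMPTION: a score equal to the threshold counts as flagged.
    return sorted(name for name, score in scores.items() if score >= sensitivity)


scores = {"prompt_injection": 0.7, "jailbreak": 0.3}
```

Here the default of 0.5 flags only `prompt_injection`. Lowering the value to 0.2 also flags `jailbreak`, and raising it to 0.9 flags nothing.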

Add context with your system prompt

AI Guard evaluates the full conversation, including your system prompt, when assessing threats. Adding context about your agent’s purpose, the data it handles, and the tools it is authorized to use helps AI Guard distinguish legitimate operations from genuine threats, which reduces false positives without reducing security coverage.

What to include

In your system prompt, describe:

  • Agent purpose: The agent’s role and intended scope.
  • Authorized data: The categories of data the agent is expected to read, write, or export.
  • Authorized tools: The tools and operations the agent is permitted to call.

Example

A system prompt with minimal context is more likely to result in false positives for legitimate operations:

You are a helpful assistant.

A system prompt with explicit context helps AI Guard evaluate intent accurately:

You are a financial data analyst assistant for internal employees. You are authorized to:
- Query internal financial databases (read-only) using the `sql_query` tool.
- Export query results to CSV or PDF using the `file_export` tool.
- Retrieve and summarize internal financial reports.
Do not access external systems or process requests unrelated to financial reporting.

With this context, AI Guard treats SQL queries and file exports as expected, authorized operations, and is less likely to flag them as data exfiltration or destructive tool calls.
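
If several agents need prompts in this shape, a small helper can keep them consistent. This is a sketch under stated assumptions: `build_system_prompt` and its parameters are hypothetical, not part of AI Guard, and the layout mirrors the example prompt in this section.

```python
def build_system_prompt(purpose, authorized_operations, restriction=""):
    """Assemble a system prompt stating purpose, authorized operations, and limits."""
    lines = [f"{purpose} You are authorized to:"]
    # One bullet per authorized operation, naming the tool where relevant.
    lines.extend(f"- {operation}" for operation in authorized_operations)
    if restriction:
        lines.append(restriction)
    return "\n".join(lines)


prompt = build_system_prompt(
    "You are a financial data analyst assistant for internal employees.",
    [
        "Query internal financial databases (read-only) using the `sql_query` tool.",
        "Export query results to CSV or PDF using the `file_export` tool.",
    ],
    "Do not access external systems.",
)
```

Keeping the purpose, authorized operations, and restrictions as separate arguments makes it harder to leave one of them out.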

Limitations

Do not use the system prompt to override AI Guard’s security checks or to instruct AI Guard directly. AI Guard evaluates the system prompt as part of the conversation context, and ignores instructions that attempt to disable or weaken its own security checks.

6. (Optional) Limit access to AI Guard spans

To restrict access to AI Guard spans for specific users, you can use Data Access Control. Follow the linked instructions to create a restricted dataset, scoped to APM data, with the resource_name:ai_guard filter applied. Then, you can grant access to the dataset to specific roles or teams.
