After you instrument your application with Agent Observability, you can use its metrics in dashboards and monitors. These metrics capture span counts, error counts, token usage, and latency for your LLM applications, and they are calculated from 100% of the application’s traffic.
ml_obs.* entries on this page are Datadog Metrics: numerical values that describe an aspect of your LLM application over time, derived from your LLM spans (counts, distributions of cost, tokens, latency, errors). They are 100%-sampled, follow standard Datadog metric retention (15 months at full granularity), and are queryable from dashboards, monitors, and notebooks like any other Datadog metric.

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.span | Total number of spans with a span kind | Count | env, error, ml_app, model_name, model_provider, service, span_kind, version |
| ml_obs.span.duration | Total duration of spans in seconds | Distribution | env, error, ml_app, model_name, model_provider, service, span_kind, version |
| ml_obs.span.error | Number of errors that occurred in the span | Count | env, error, ml_app, model_name, model_provider, service, span_kind, version |
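
Because `ml_obs.span` and `ml_obs.span.error` are both Count metrics that share the same tags, a span error rate can be expressed as one query divided by the other. The following is a minimal sketch that only assembles query strings in the general Datadog form `aggregator:metric{scope} by {group}`. The helper names `build_query` and `error_rate_query` and the `ml_app` value `chatbot` are hypothetical and not part of any Datadog library.

```python
def build_query(aggregator, metric, tags=None, group_by=None, as_count=False):
    """Assemble a metric query string (hypothetical helper, string building only)."""
    # Scope: comma-separated tag:value pairs, or "*" when no filter is given.
    scope = ",".join(f"{k}:{v}" for k, v in sorted((tags or {}).items())) or "*"
    query = f"{aggregator}:{metric}{{{scope}}}"
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    if as_count:
        # .as_count() is intended for Count metrics such as ml_obs.span.
        query += ".as_count()"
    return query


def error_rate_query(ml_app, group_by=None):
    """Ratio of errored spans to all spans for one ml_app (illustrative)."""
    tags = {"ml_app": ml_app}
    errors = build_query("sum", "ml_obs.span.error", tags, group_by, as_count=True)
    spans = build_query("sum", "ml_obs.span", tags, group_by, as_count=True)
    return f"{errors} / {spans}"


if __name__ == "__main__":
    print(error_rate_query("chatbot", group_by=["model_name"]))
```

Grouping by `model_name` or `span_kind` works here because both metrics carry those tags, so the numerator and denominator split into matching series.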

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.span.llm.input.tokens | Number of tokens in the input sent to the LLM | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.output.tokens | Number of tokens in the output | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.output.reasoning.tokens | Number of reasoning tokens in the output | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.prompt.tokens | Number of tokens used in the prompt | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.completion.tokens | Tokens generated as a completion during the span | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.total.tokens | Total tokens consumed during the span (input + output + prompt) | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.cache_write.tokens | Number of input tokens written to the prompt cache in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.cache_read.tokens | Number of input tokens served from the prompt cache in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.non_cached.tokens | Number of input tokens that did not interact with the prompt cache in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.characters | Number of characters in the input sent to the LLM | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
| ml_obs.span.llm.output.characters | Number of characters in the output | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |
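
The three cache-related input token metrics can be combined to estimate how much of the input was served from the prompt cache. The sketch below assumes that cache-read, cache-write, and non-cached tokens together account for all input tokens; the page does not state this explicitly, so treat it as an assumption. The function name `cache_read_share` and the sample numbers are hypothetical.

```python
def cache_read_share(cache_read, cache_write, non_cached):
    """Fraction of input tokens served from the prompt cache.

    Arguments are summed values of the corresponding metrics over the same
    time window and tag scope:
      cache_read  -> ml_obs.span.llm.input.cache_read.tokens
      cache_write -> ml_obs.span.llm.input.cache_write.tokens
      non_cached  -> ml_obs.span.llm.input.non_cached.tokens
    Assumes these three categories partition the input tokens.
    """
    total = cache_read + cache_write + non_cached
    if total == 0:
        # No input tokens in the window: report 0.0 instead of dividing by zero.
        return 0.0
    return cache_read / total


if __name__ == "__main__":
    # Made-up sums for illustration only.
    share = cache_read_share(cache_read=600, cache_write=100, non_cached=300)
    print(f"cache read share: {share:.0%}")
```

The same ratio could also be written as a metric query formula in a dashboard; computing it in code is shown only to make the arithmetic explicit.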

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.span.embedding.input.tokens | Number of input tokens used for generating an embedding | Distribution | env, error, ml_app, model_name, model_provider, service, version, matched_model_name, matched_model_provider |

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.span.llm.input.cost | Estimated input cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.embedding.input.cost | Estimated input cost in an embedding span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.output.reasoning.cost | Estimated reasoning output cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.output.cost | Estimated output cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.total.cost | Estimated total cost in an LLM or embedding span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.cache_write.cost | Estimated cache-write input cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.cache_read.cost | Estimated cache-read input cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |
| ml_obs.span.llm.input.non_cached.cost | Estimated non-cached input cost in an LLM span | Distribution | env, error, ml_app, model_name, model_provider, service, version, source, matched_model_name, matched_model_provider |

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.trace | Number of traces | Count | env, error, ml_app, service, span_kind, version |
| ml_obs.trace.duration | Total duration of all traces across all spans | Distribution | env, error, ml_app, service, span_kind, version |
| ml_obs.trace.error | Number of errors that occurred during the trace | Count | env, error, ml_app, service, span_kind, version |

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.estimated_usage.llm.input.tokens | Estimated number of input tokens used | Distribution | evaluation_name, ml_app, model_name, model_provider, model_server |

| Metric Name | Description | Metric Type | Tags |
|---|---|---|---|
| ml_obs.estimated_usage.llm.output.tokens | Estimated number of output tokens generated | Distribution | evaluation_name, ml_app, model_name, model_provider, model_server |
| ml_obs.estimated_usage.llm.total.tokens | Total estimated tokens (input + output) used | Distribution | evaluation_name, ml_app, model_name, model_provider, model_server |