This guide describes the Datadog Agent’s behavior when it fails to send HTTP requests to the Metrics, Logs, APM, and Processes intake endpoints. All retry strategies use exponential backoff with randomized jitter. See the backoff source code for implementation details.
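As an illustration of exponential backoff with randomized jitter, here is a minimal Python sketch. It is not the Agent's implementation: the function name `backoff_delay` and the values for `base`, `factor`, and `max_delay` are hypothetical, chosen only to show the shape of the technique.

```python
import random

def backoff_delay(attempt, base=2.0, factor=2.0, max_delay=64.0, rng=random):
    """Return a randomized delay in seconds for a 1-based retry attempt.

    All default values are illustrative assumptions, not Agent defaults.
    """
    if attempt < 1:
        return 0.0
    # Exponential growth of the delay window, capped at max_delay.
    ceiling = min(max_delay, base * factor ** (attempt - 1))
    # Jitter: draw uniformly from the upper half of the window so that
    # clients that failed together do not all retry at the same instant.
    return rng.uniform(ceiling / 2, ceiling)

for attempt in range(1, 6):
    print(attempt, round(backoff_delay(attempt), 2))
```

Drawing from only the upper half of the window keeps delays growing on average while still spreading retries out. Other jitter schemes, such as drawing from the full window, are equally valid.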
2xx HTTP response.
The Agent retries failed HTTP requests using an exponential backoff strategy. The Agent uses the following default retry configurations for the metrics intake:
The Agent retries failed requests for the following scenarios:
- 4xx responses (see note for exceptions)
- 5xx responses

Note: For 4xx responses, the Agent does not retry requests with status codes 400, 403, or 413. 404 responses are retried because they often indicate a configuration or availability issue that could be resolved.
When the Agent fails to send a metric to the Datadog intake, it compresses and stores the metric in an in-memory retry buffer. See Buffer configurations for the available settings.
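The status-code rules for the metrics intake can be expressed as a small predicate. This is a sketch under stated assumptions: the function name `should_retry` is hypothetical, and it covers only the HTTP status codes named in this section, not transport-level failures.

```python
# Status codes the text says are NOT retried for the metrics intake.
NON_RETRYABLE_4XX = frozenset({400, 403, 413})

def should_retry(status_code):
    """Return True if a metrics request with this status should be retried."""
    if 200 <= status_code < 300:
        return False  # success, nothing to retry
    if 400 <= status_code < 500:
        # 4xx responses are retried except for the listed exceptions;
        # 404 is therefore retried.
        return status_code not in NON_RETRYABLE_4XX
    if 500 <= status_code < 600:
        return True
    # Assumption: codes outside the ranges discussed here are not retried.
    return False

print(should_retry(404), should_retry(413), should_retry(503))
```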
The Agent also supports an optional on-disk retry buffer. If you enable this setting, the Agent:
This prioritization helps ensure that the Agent sends recent and live metrics before it backfills older data.
The Datadog Agent has the following default configurations for metric retry buffering:
You can configure the default maximum in-memory buffer size using the forwarder_retry_queue_payloads_max_size setting.
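To show what a size-capped retry buffer looks like in practice, here is a hedged Python sketch. The class `RetryBuffer`, the drop-oldest eviction policy, and the newest-first retrieval are assumptions made for illustration. Only the idea of a configurable maximum size (forwarder_retry_queue_payloads_max_size) comes from this guide.

```python
from collections import deque

class RetryBuffer:
    """Hypothetical in-memory buffer bounded by total payload bytes."""

    def __init__(self, max_size_bytes):
        self.max_size_bytes = max_size_bytes
        self._payloads = deque()
        self._size = 0

    def add(self, payload):
        """Store a payload; return how many payloads were dropped to fit it."""
        self._payloads.append(payload)
        self._size += len(payload)
        dropped = 0
        # Assumed policy: evict the oldest payloads until the cap is respected.
        while self._size > self.max_size_bytes and self._payloads:
            self._size -= len(self._payloads.popleft())
            dropped += 1
        return dropped

    def pop_newest(self):
        """Return the most recently stored payload, or None if empty."""
        if not self._payloads:
            return None
        payload = self._payloads.pop()
        self._size -= len(payload)
        return payload

buf = RetryBuffer(max_size_bytes=10)
buf.add(b"aaaa")
buf.add(b"bbbb")
print(buf.add(b"cccc"))  # the oldest payload no longer fits and is dropped
```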
During restart, the Agent:
During shutdown, the Agent:
The Logs Agent retries failed HTTP requests indefinitely using an exponential backoff strategy. The Agent uses the following default retry configurations for the logs intake:
The Agent retries failed log payloads until the logs intake endpoint becomes available.
400, 401, 403, or 413.
The Logs Agent guarantees log delivery during transmission. When a payload fails to send, the Agent applies backpressure: it stops reading from the log source and resumes from the last known position when the intake becomes available.
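The backpressure behavior described above can be sketched as a reader that advances its offset only after a successful send. This is an illustrative model, not the Logs Agent's code: `ship_lines` and the `send` callback are hypothetical names, and an in-memory byte stream stands in for a log file.

```python
import io

def ship_lines(source, send, offset=0):
    """Send lines from `source` starting at `offset`.

    Stops reading at the first failed send and returns the offset to
    resume from, so no line is skipped.
    """
    source.seek(offset)
    while True:
        line = source.readline()
        if not line:
            return offset  # reached the end of the source
        if not send(line):
            return offset  # backpressure: stop reading, keep last position
        offset = source.tell()

log = io.BytesIO(b"first\nsecond\nthird\n")
delivered = []

def intake_down_after_one(line):
    # Simulated intake that accepts one line and then becomes unavailable.
    if delivered:
        return False
    delivered.append(line)
    return True

resume_at = ship_lines(log, intake_down_after_one)
# The intake is available again: resume from the saved position.
final = ship_lines(log, lambda line: delivered.append(line) or True, resume_at)
print(resume_at, final, delivered)
```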
logrotate
HTTP logs:
TCP logs:
The Logs Agent maintains a registry that tracks log sources and current read offsets. The Agent flushes the registry to disk every second and reloads it when the Agent restarts. You cannot configure this process.
On restart, the Agent resumes reading from the position recorded in the registry.
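A registry of read offsets that survives restarts can be modeled in a few lines. The sketch below is an assumption-heavy illustration: the class `OffsetRegistry`, the JSON file format, and the atomic-rename write are choices made for the example and do not describe the Agent's actual registry format.

```python
import json
import os
import tempfile

class OffsetRegistry:
    """Hypothetical registry mapping a log source to its last read offset."""

    def __init__(self, path):
        self.path = path
        self.offsets = {}
        # Reload previously flushed state, as happens on restart.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.offsets = json.load(f)

    def update(self, source, offset):
        self.offsets[source] = offset

    def flush(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # cannot leave a truncated registry behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.offsets, f)
        os.replace(tmp_path, self.path)

    def resume_position(self, source):
        # Unknown sources start from the beginning in this sketch.
        return self.offsets.get(source, 0)
```

In this model, a periodic timer would call `flush()` (every second, per the text above), and a new `OffsetRegistry` created on the same path after a restart returns the saved offsets from `resume_position()`.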
When you enable dual shipping:
For the Agent logic when is_reliable is enabled, see Logs Dual Shipping.
The Agent retries failed APM requests using an exponential backoff strategy. The Agent uses the following default retry configurations for the APM intake:
The Agent retries failed requests for the following scenarios:
- 408 responses
- 5xx responses

The Agent compresses and stores failed APM payloads in memory, dropping them when queues are full.
- apm_config.stats_writer.queue_size
- int(max(1, max memory / payload size))
- int(max(1, (250 * 1024 * 1024) / 1500000)) = 174 payloads

When you enable dual shipping for the APM intake, each endpoint has an independent sender and queue.
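The queue-size formula above can be checked with a short calculation. The function name `queue_size` and its parameter names are hypothetical. The formula and the two input values (250 MiB of memory, 1,500,000-byte payloads) are taken from the text.

```python
def queue_size(max_memory_bytes, max_payload_bytes):
    """Number of payloads that fit in the memory budget, never below 1."""
    return int(max(1, max_memory_bytes / max_payload_bytes))

default_queue_size = queue_size(250 * 1024 * 1024, 1_500_000)
print(default_queue_size)  # 174
```

The `max(1, ...)` guard means the queue always holds at least one payload, even if the memory budget is smaller than a single payload.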
The Agent retries failed Processes requests using an exponential backoff strategy. The Agent uses the same default retry configurations as the metrics intake:
See Metrics retry strategy for complete details on retry scenarios and exceptions.
The Process Agent uses the metrics forwarder for downstream delivery. Before forwarding check results, the Process Agent stores them in an in-memory queue.
The in-memory queue buffers data when the intake is unavailable or during transmission delays.
DefaultProcessQueueSize)
DefaultProcessQueueBytes)
With checks running every 10 seconds, these settings buffer approximately 30 minutes of process data.
Agent versions 7.38 and earlier:
Agent versions 7.39 and later: