Watchdog proactively looks for anomalies on your systems and applications. Each anomaly is then displayed in the Watchdog Alert Explorer with more information about what happened, the possible impact on other systems, and the root cause.
An alert overview card contains the sections below:
Status: ongoing, resolved, or expired. (An anomaly is expired if it has been ongoing for over 48 hours.)
Clicking anywhere on an alert overview card opens the alert details pane.
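The status rule above (an anomaly counts as expired once it has been ongoing for over 48 hours) can be sketched as a small function. This is only an illustration: the function name, its parameters, and the idea of an alert record with start and resolution timestamps are assumptions, not part of any Watchdog API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# From the text: an anomaly is expired if it has been ongoing for over 48 hours.
EXPIRY_THRESHOLD = timedelta(hours=48)


def alert_status(started_at: datetime, resolved_at: Optional[datetime], now: datetime) -> str:
    """Classify a hypothetical alert record as 'resolved', 'expired', or 'ongoing'."""
    if resolved_at is not None:
        return "resolved"
    if now - started_at > EXPIRY_THRESHOLD:
        return "expired"
    return "ongoing"


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
status = alert_status(now - timedelta(hours=49), None, now)  # unresolved for 49 hours
```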
In addition to repeating the information in the alert overview card, the Overview tab may contain one or more of the following fields:
Additionally, Watchdog suggests one or more monitors you can create to notify you if the anomaly happens again. These monitors do not exist yet, so the table lists their status as suggested. Click Enable Monitor to enable the suggested monitor for your organization. A series of icons pops up allowing you to open, edit, clone, mute, or delete the new monitor.
You can use the time range, search bar, or facets to filter your Watchdog Alerts feed.
Available facets:
| All Alerts Group | Description |
|---|---|
| Alert Category | Display all apm, infrastructure, or logs alerts. |
| Alert Type | Select alerts using metrics from APM or infrastructure integrations. |
| Alert Status | Select alerts based on their status (ongoing, resolved, or expired). |
| APM Primary Tag | The defined APM primary tag to display alerts from. |
| Environment | The environment to display alerts from. See Unified Service Tagging for more information about the env tag. |
| Service | The service to display alerts from. See Unified Service Tagging for more information about the service tag. |
| End User Impacted | (Requires RUM.) Whether Watchdog found any impacted end users. See Impact Analysis for more information. |
| Root Cause | (Requires APM.) Whether Watchdog found the root cause of the anomaly or the critical failure. See Root Cause Analysis for more information. |
| Team | The team owning the impacted services. Enriched from the Catalog. |
| Log Anomaly Type | Only display log anomalies of this type. The supported types are new log patterns and increases in existing log patterns. |
| Log Source | Only display alerts containing logs from this source. |
| Log Status | Only display alerts containing logs of this log status. |
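
To make the effect of combining facets concrete, here is a minimal sketch that filters in-memory alert records. It assumes that selected facets narrow the feed conjunctively (every selected value must match). The field names (`category`, `status`, `env`, `service`) and the sample records are hypothetical and do not reflect how Datadog stores or queries alerts.

```python
from typing import Dict, Iterable, List

Alert = Dict[str, str]


def filter_alerts(alerts: Iterable[Alert], **facets: str) -> List[Alert]:
    """Keep alerts whose fields equal every requested facet value (logical AND)."""
    return [a for a in alerts if all(a.get(k) == v for k, v in facets.items())]


# Hypothetical alert records, for illustration only.
alerts = [
    {"category": "apm", "status": "ongoing", "env": "prod", "service": "checkout"},
    {"category": "logs", "status": "resolved", "env": "prod", "service": "checkout"},
    {"category": "infrastructure", "status": "ongoing", "env": "staging", "service": "db"},
]

ongoing_prod = filter_alerts(alerts, status="ongoing", env="prod")
```

With no facets selected, the function returns every alert unchanged.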
Watchdog Alerts cover multiple application and infrastructure metrics:
Ingested logs are analyzed at the intake level where Watchdog performs aggregations on detected patterns as well as environment, service, source, and status tags.
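As a rough illustration of this kind of aggregation, the sketch below counts log events per combination of detected pattern and the `env`, `service`, `source`, and `status` tags. It is not Watchdog's implementation: the `pattern` field is assumed to have been computed already, and the record layout is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (pattern, env, service, source, status)
AggregationKey = Tuple[str, str, str, str, str]


def aggregate(logs: Iterable[Dict[str, str]]) -> Counter:
    """Count logs per (pattern, env, service, source, status) combination."""
    counts: Counter = Counter()
    for log in logs:
        key: AggregationKey = (
            log["pattern"], log["env"], log["service"], log["source"], log["status"]
        )
        counts[key] += 1
    return counts


# Hypothetical sample logs with a precomputed pattern.
sample_logs = [
    {"pattern": "Timeout calling *", "env": "prod", "service": "checkout",
     "source": "nginx", "status": "error"},
    {"pattern": "Timeout calling *", "env": "prod", "service": "checkout",
     "source": "nginx", "status": "error"},
    {"pattern": "User * logged in", "env": "prod", "service": "auth",
     "source": "python", "status": "info"},
]
counts = aggregate(sample_logs)
```

Counting per key rather than per raw message keeps the number of tracked series small, which is what makes scanning each series for changes over time practical.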
These aggregated logs are scanned for anomalous behaviors, such as the following:
All log anomalies are surfaced as Insights in the Log Explorer, matching the search context and any restrictions applied to your role.
Log anomalies that Watchdog determines to be particularly severe are surfaced in the Watchdog Alert Explorer and can be alerted on by setting up a Watchdog logs monitor.
A severe anomaly is defined as:
noise score (to avoid generating too many alerts for a given service). The noise score is calculated at the service level by:
Watchdog requires some data to establish a baseline of expected behavior. For log anomalies, the minimum history is 24 hours. Watchdog starts finding anomalies once the minimum required history is available, and its detection improves as history grows. Watchdog performs best with six weeks of history.
To disable log anomaly detection, go to the Log Management pipeline page and click the Log Anomalies toggle.
Watchdog scans all services and resources to look for anomalies on the following metrics:
Watchdog filters out barely used endpoints and services to reduce noise and avoid anomalies on small amounts of traffic. An endpoint must receive at least 0.5 requests per second to be monitored. Additionally, if an anomaly on hit rate is detected but has no impact on latency or error rate, the anomaly is ignored.
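The two filtering rules above can be expressed as a short predicate. This is a sketch under stated assumptions: the `EndpointAnomaly` type and its field names are hypothetical, and only the 0.5 requests-per-second threshold and the hit-rate rule come from the text.

```python
from dataclasses import dataclass

# From the text: an endpoint needs at least 0.5 requests per second to be monitored.
MIN_REQUESTS_PER_SECOND = 0.5


@dataclass
class EndpointAnomaly:
    """Hypothetical summary of what was detected on one endpoint."""
    requests_per_second: float
    hit_rate_anomalous: bool
    latency_anomalous: bool
    error_rate_anomalous: bool


def should_surface(anomaly: EndpointAnomaly) -> bool:
    """Return True if the anomaly passes both noise filters described above."""
    # Rule 1: skip endpoints with too little traffic.
    if anomaly.requests_per_second < MIN_REQUESTS_PER_SECOND:
        return False
    # Rule 2: a hit-rate anomaly alone is ignored; latency or error rate must be affected.
    return anomaly.latency_anomalous or anomaly.error_rate_anomalous


hit_rate_only = EndpointAnomaly(12.0, True, False, False)
with_latency = EndpointAnomaly(12.0, True, True, False)
low_traffic = EndpointAnomaly(0.2, False, True, True)
```

Here `should_surface(hit_rate_only)` and `should_surface(low_traffic)` are both `False`, while `should_surface(with_latency)` is `True`.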
Watchdog requires some data to establish a baseline of expected behavior. For metric anomalies, the minimum history is two weeks. Watchdog starts finding anomalies once the minimum required history is available, and its detection improves as history grows. Watchdog performs best with six weeks of history.
Watchdog looks at infrastructure metrics from the following integrations:
Watchdog requires some data to establish a baseline of expected behavior. For metric anomalies, the minimum history is two weeks. Watchdog starts finding anomalies once the minimum required history is available, and its detection improves as history grows. Watchdog performs best with six weeks of history.
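The history requirements stated in this document (24 hours for log anomalies, two weeks for metric anomalies, and six weeks for best results) can be summarized in a small helper. The function name and the three state labels are assumptions made for illustration. Watchdog does not expose such a state.

```python
from datetime import timedelta

# Minimum history before anomalies can be detected, as stated in the text.
MIN_HISTORY = {
    "log": timedelta(hours=24),
    "metric": timedelta(weeks=2),
}
# History after which detection is described as performing best.
BEST_HISTORY = timedelta(weeks=6)


def baseline_state(kind: str, history: timedelta) -> str:
    """Return a hypothetical label describing how mature the baseline is."""
    if history < MIN_HISTORY[kind]:
        return "insufficient"  # not enough data to detect anomalies yet
    if history < BEST_HISTORY:
        return "detecting"     # detection is active and still improving
    return "best"              # six weeks or more of history


state = baseline_state("metric", timedelta(days=10))
```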
Watchdog uses the same seasonal algorithms that power monitors and dashboards. To look for anomalies on other metrics or to customize the sensitivity, the following algorithms are available:
Watchdog Alerts appear in the following places within Datadog:
When Watchdog detects an irregularity in an APM metric, the pink Watchdog binoculars icon appears next to the impacted service in the APM Catalog.
To see more detail about a metric anomaly, go to the Watchdog Insights carousel at the top of a Service Page.
You can also find the Watchdog icon on metric graphs.
Click on the binoculars icon to see a Watchdog Alert card with more details.
To archive a Watchdog Alert, open the side panel and click the folder icon in the upper-right corner. Archiving hides the alert from the explorer, as well as other places in Datadog, like the home page. If an alert is archived, the pink Watchdog binoculars icon does not show up next to the relevant service or resource.
To see archived alerts, select the checkbox option to Show _N_ archived alerts in the top left of the Watchdog Alert Explorer. The option is only available if at least one alert is archived. You can see who archived each alert and when it was archived, and restore archived alerts to your feed.
Note: Archiving does not prevent Watchdog from flagging future issues related to the service or resource.