Pipelines

Overview

The pipelines and processors outlined in this documentation are specific to cloud-based logging environments. To aggregate, process, and route on-premises logs, see Observability Pipelines.

Datadog automatically parses JSON-formatted logs. You can then add value to all your logs (raw and JSON) by sending them through a processing pipeline. Pipelines take logs from a wide variety of formats and translate them into a common format in Datadog. Implementing a log pipeline and processing strategy is beneficial because it introduces an attribute naming convention for your organization.

With pipelines, logs are parsed and enriched by chaining them sequentially through processors. This extracts meaningful information or attributes from semi-structured text to reuse as facets. Each log that comes through the pipelines is tested against every pipeline filter. If a log matches a pipeline's filter, all of that pipeline's processors are applied sequentially before the log moves to the next pipeline.
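
The flow described above can be pictured with a small model. The following Python sketch is illustrative only and is not Datadog's implementation: the `Pipeline` class, the `run_pipelines` function, and the two toy processors are hypothetical names, and a plain Python predicate stands in for a pipeline filter query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Log = Dict[str, object]


@dataclass
class Pipeline:
    """Hypothetical stand-in for a pipeline: a filter plus ordered processors."""
    name: str
    matches: Callable[[Log], bool]  # stands in for the pipeline filter query
    processors: List[Callable[[Log], Log]] = field(default_factory=list)


def run_pipelines(log: Log, pipelines: List[Pipeline]) -> Log:
    """Test the log against every pipeline filter, in order."""
    for pipeline in pipelines:
        if pipeline.matches(log):
            # Apply every processor of a matching pipeline before moving on.
            for processor in pipeline.processors:
                log = processor(log)
    return log


def parse_status_code(log: Log) -> Log:
    """Toy parser: read a trailing HTTP status code from the message."""
    parts = str(log.get("message", "")).split()
    if parts and parts[-1].isdigit():
        return {**log, "http.status_code": int(parts[-1])}
    return log


def flag_server_errors(log: Log) -> Log:
    """Toy enrichment: mark 5xx responses with an error status."""
    code = log.get("http.status_code")
    if isinstance(code, int) and code >= 500:
        return {**log, "status": "error"}
    return log


pipelines = [
    Pipeline("web", lambda log: log.get("source") == "nginx",
             [parse_status_code, flag_server_errors]),
    Pipeline("db", lambda log: log.get("source") == "postgresql"),
]

result = run_pipelines({"source": "nginx", "message": "GET /cart 502"}, pipelines)
print(result)
```

In this model, a log whose source matches no filter passes through unchanged, and the order of the processors inside a pipeline matters: `flag_server_errors` only works because `parse_status_code` runs first.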

Pipelines and processors can be applied to any type of log. You don’t need to change logging configuration or deploy changes to any server-side processing rules. Everything can be configured within the pipeline configuration page.

Note: For optimal use of the Log Management solution, Datadog recommends using at most 20 processors per pipeline and 10 parsing rules within a Grok processor. Datadog reserves the right to disable underperforming parsing rules, processors, or pipelines that might impact Datadog’s service performance.
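
If you keep pipeline definitions in version control, a small check can flag configurations that exceed the recommended limits before they are deployed. The following Python sketch assumes a simplified, hypothetical dictionary layout (`name`, `processors`, `type`, `rules`); it is not the Datadog API schema, and the function name `check_pipeline_limits` is made up for illustration.

```python
from typing import Dict, List

# Limits taken from the recommendation above.
MAX_PROCESSORS_PER_PIPELINE = 20
MAX_RULES_PER_GROK_PROCESSOR = 10


def check_pipeline_limits(pipeline: Dict) -> List[str]:
    """Return warnings for a pipeline described in a hypothetical layout."""
    warnings = []
    name = pipeline.get("name", "unnamed pipeline")
    processors = pipeline.get("processors", [])

    if len(processors) > MAX_PROCESSORS_PER_PIPELINE:
        warnings.append(
            f"{name}: {len(processors)} processors "
            f"(recommended maximum is {MAX_PROCESSORS_PER_PIPELINE})"
        )

    for processor in processors:
        rules = processor.get("rules", [])
        # "grok" is a placeholder type label in this hypothetical layout.
        if processor.get("type") == "grok" and len(rules) > MAX_RULES_PER_GROK_PROCESSOR:
            warnings.append(
                f"{name} / {processor.get('name', 'grok processor')}: "
                f"{len(rules)} parsing rules "
                f"(recommended maximum is {MAX_RULES_PER_GROK_PROCESSOR})"
            )
    return warnings


example = {
    "name": "web",
    "processors": [
        {"type": "grok", "name": "access log parser",
         "rules": [f"rule_{i}" for i in range(12)]},
        {"type": "remapper", "name": "status remapper"},
    ],
}

for warning in check_pipeline_limits(example):
    print(warning)
```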

Pipeline permissions

Pipelines use Granular Access Control to manage who can edit pipeline and processor configurations. This means permissions can be assigned to roles, individual users, and teams, ensuring precise control over pipeline resources. Pipelines without any restrictions are considered unrestricted, meaning any user with the logs_write_pipelines permission can modify the pipeline and its processors.

For each pipeline, administrators can choose the following edit scopes:

Editor: Only specified users, teams, or roles can edit pipeline configuration and processors.
Processor Editor: Only the processors (including nested pipelines) can be edited by specified users, teams, or roles. No one can modify the pipeline attributes, such as its filter query or its order in the global pipeline list.

Granting a user access to a pipeline's restriction list does not automatically grant the logs_write_pipelines or logs_write_processors permissions. Administrators must grant those permissions separately.

You can manage these permissions programmatically through the API and Terraform.

Preprocessing

Preprocessing of JSON logs occurs before logs enter pipeline processing. Preprocessing runs a series of operations based on reserved attributes, such as timestamp, status, host, service, and message. If you have different attribute names in your JSON logs, use preprocessing to map your log attribute names to those in the reserved attribute list.

JSON log preprocessing comes with a default configuration that works for standard log forwarders. To adapt this configuration to a custom or specific log forwarding approach:

Navigate to Pipelines in Datadog and select Preprocessing for JSON logs.
Note: Preprocessing JSON logs is the only way to define one of your log attributes as host for your logs.
Change the default mapping for the relevant reserved attribute:

Source attribute

If a JSON formatted log file includes the ddsource attribute, Datadog interprets its value as the log’s source. To use the same source names Datadog uses, see the Integration Pipeline Library.

Note: Logs coming from a containerized environment require the use of an environment variable to override the default source and service values.

Host attribute

Using the Datadog Agent or the RFC5424 format automatically sets the host value on your logs. However, if a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s host:

host
hostname
syslog.hostname

Note: In Kubernetes, if a JSON log ingested by the Datadog Agent contains a host, hostname, or syslog.hostname attribute, that value overrides the default Agent hostname for that log. As a result, the log does not inherit the host-level tags of the host that actually emitted it. In this case, Datadog recommends clearing these attributes so that your logs are attributed to the correct hosts.

Date attribute

By default, Datadog generates a timestamp and appends it as a date attribute when logs are received. However, if a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s official date:

@timestamp
timestamp
_timestamp
Timestamp
eventTime
date
published_date
syslog.timestamp

Specify alternate attributes to use as the source of a log’s date by setting a log date remapper processor.

Note: Datadog rejects a log entry if its official date is more than 18 hours in the past.

The recognized date formats are: ISO8601, UNIX (the millisecond epoch format), and RFC3164.
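
To make the date rules above concrete, the following Python sketch picks a log's official date and applies the 18-hour rule. It is a simplified model, not Datadog's implementation. The names `parse_date`, `official_date`, and `is_accepted` are hypothetical. The sketch assumes that attributes are checked in the order listed above, that unparseable values are skipped, and that timestamps without an offset are UTC; none of these assumptions is confirmed by this page. It handles only ISO8601 strings and millisecond epoch values, and leaves out RFC3164.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default date attributes from the list above (precedence order is assumed).
DATE_ATTRIBUTES = [
    "@timestamp", "timestamp", "_timestamp", "Timestamp",
    "eventTime", "date", "published_date", "syslog.timestamp",
]
MAX_AGE = timedelta(hours=18)


def parse_date(value: object) -> Optional[datetime]:
    """Parse epoch milliseconds or an ISO 8601 string; return None otherwise."""
    if isinstance(value, bool):
        return None  # bool is a subclass of int, but is never a timestamp
    if isinstance(value, (int, float)):
        try:
            return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
        except (OverflowError, OSError, ValueError):
            return None
    if isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
        # Assumption: timestamps without an offset are treated as UTC.
        return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
    return None


def official_date(log: dict, received_at: datetime) -> datetime:
    """Use the first parseable date attribute, else the reception time."""
    for name in DATE_ATTRIBUTES:
        parsed = parse_date(log.get(name))
        if parsed is not None:
            return parsed
    return received_at


def is_accepted(log: dict, received_at: datetime) -> bool:
    """Model the documented rule: reject dates more than 18 hours in the past."""
    return received_at - official_date(log, received_at) <= MAX_AGE


now = datetime.now(timezone.utc)
stale = {"timestamp": (now - timedelta(hours=24)).isoformat(), "message": "late"}
print(is_accepted(stale, now))
```

A log without any of the listed attributes falls back to the reception time in this model, so it is always accepted.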

Message attribute

By default, Datadog ingests the message value as the body of the log entry. That value is then highlighted and displayed in the Log Explorer, where it is indexed for full text search. However, if a JSON formatted log file includes one of the following attributes, Datadog interprets its value as the log’s official message:

message
msg
log

Specify alternate attributes to use as the source of a log’s message by setting a log message remapper processor.

Status attribute

Each log entry may specify a status level, which is made available for faceted search within Datadog. If a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s official status:

status
severity
level
syslog.severity

Specify alternate attributes to use as the source of a log’s status by setting a log status remapper processor.

Service attribute

Using the Datadog Agent or the RFC5424 format automatically sets the service value on your logs. However, if a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s service:

service
syslog.appname
dd.service

Specify alternate attributes to use as the source of a log’s service by setting a log service remapper processor.

Trace ID attribute

By default, Datadog SDKs can automatically inject trace and span IDs into your logs. However, if a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s trace_id:

dd.trace_id
contextMap.dd.trace_id
named_tags.dd.trace_id
trace_id

Specify alternate attributes to use as the source of a log’s trace ID by setting a trace ID remapper processor.

Span ID attribute

By default, Datadog SDKs can automatically inject span IDs into your logs. However, if a JSON-formatted log includes one of the following attributes, Datadog interprets its value as the log’s span_id:

dd.span_id
contextMap.dd.span_id
named_tags.dd.span_id
span_id
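
The reserved-attribute mapping described in the sections above can be summarized as a lookup table. The following Python sketch is a simplified model, not Datadog's implementation. The names `DEFAULT_MAPPING`, `lookup`, and `resolve_reserved` are hypothetical. The sketch assumes that candidates are tried in the order listed and that a dotted name such as syslog.hostname can refer either to a flat key or to a nested object.

```python
from typing import Dict, List, Optional

# Default source attributes for each reserved attribute, as listed above.
DEFAULT_MAPPING: Dict[str, List[str]] = {
    "host": ["host", "hostname", "syslog.hostname"],
    "message": ["message", "msg", "log"],
    "status": ["status", "severity", "level", "syslog.severity"],
    "service": ["service", "syslog.appname", "dd.service"],
    "trace_id": ["dd.trace_id", "contextMap.dd.trace_id",
                 "named_tags.dd.trace_id", "trace_id"],
    "span_id": ["dd.span_id", "contextMap.dd.span_id",
                "named_tags.dd.span_id", "span_id"],
}


def lookup(log: dict, path: str) -> Optional[object]:
    """Find a value by flat dotted key first, then by walking nested objects."""
    if path in log:
        return log[path]
    node: object = log
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def resolve_reserved(log: dict, mapping: Dict[str, List[str]] = DEFAULT_MAPPING) -> dict:
    """Return the reserved attributes found in the log (first candidate wins)."""
    resolved = {}
    for reserved, candidates in mapping.items():
        for candidate in candidates:
            value = lookup(log, candidate)
            if value is not None:
                resolved[reserved] = value
                break
    return resolved


log = {
    "msg": "started",
    "level": "warn",
    "syslog": {"hostname": "web-1"},
    "dd": {"trace_id": "123"},
}
print(resolve_reserved(log))

# A custom forwarder that reports the host under a different name
# ("machine_name" is a made-up attribute) needs an adjusted mapping.
custom = {**DEFAULT_MAPPING, "host": ["machine_name"] + DEFAULT_MAPPING["host"]}
print(resolve_reserved({"machine_name": "db-7", "msg": "ready"}, custom))
```

Changing the mapping for a reserved attribute, as the `custom` dictionary does here, corresponds to editing the default mapping in the preprocessing configuration.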

Create a pipeline

Navigate to Pipelines in Datadog.
Select New Pipeline.
Select a log from the live tail preview to apply a filter, or apply your own filter. Choose a filter from the dropdown menu or create your own filter query by selecting the icon. Filters let you limit what kinds of logs a pipeline applies to.
Note: The pipeline filtering is applied before any of the pipeline’s processors. For this reason, you cannot filter on an attribute that is extracted in the pipeline itself.
Name your pipeline.
(Optional) Add a description and tags to the pipeline to indicate its purpose and ownership. Pipeline tags do not affect logs, but can be used to filter and search within the Pipelines page.
Press Create.
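
The note in the steps above, that a pipeline's filter is applied before its processors, can be illustrated with a short model. The following Python sketch is not Datadog's implementation: `run_pipeline` and `extract_user` are hypothetical names, a Python predicate stands in for the filter query, and the `user=` message format is invented for the example.

```python
from typing import Callable, Dict, List

Log = Dict[str, object]


def run_pipeline(log: Log,
                 matches: Callable[[Log], bool],
                 processors: List[Callable[[Log], Log]]) -> Log:
    """Evaluate the filter on the incoming log, then run the processors."""
    if not matches(log):
        return log  # the processors never run for a non-matching log
    for processor in processors:
        log = processor(log)
    return log


def extract_user(log: Log) -> Log:
    """Toy processor: copy `user=<name>` from the message into a `user` attribute."""
    for token in str(log.get("message", "")).split():
        if token.startswith("user="):
            return {**log, "user": token[len("user="):]}
    return log


raw = {"service": "auth", "message": "login ok user=alice"}

# Filtering on `user` never matches: the attribute does not exist yet
# when the filter is evaluated.
skipped = run_pipeline(raw, lambda log: "user" in log, [extract_user])

# Filtering on an attribute the log already carries works as expected.
processed = run_pipeline(raw, lambda log: log.get("service") == "auth", [extract_user])

print(skipped)
print(processed)
```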

[Image: an example of a log transformed by a pipeline]

Integration pipelines

See the list of supported integrations.

Integration processing pipelines are available for certain sources when they are set up to collect logs. These pipelines are read-only and parse out your logs in ways appropriate for the particular source. For integration logs, an integration pipeline is automatically installed that takes care of parsing your logs and adds the corresponding facet in your Log Explorer.

To view an integration pipeline, navigate to the Pipelines page. To edit an integration pipeline, clone it and then edit the clone.

[Image: ELB logs example]

Note: Integration pipelines cannot be deleted, only disabled.

Integration pipeline library

To see the full list of integration pipelines that Datadog offers, browse the integration pipeline library. The pipeline library shows how Datadog processes different log formats by default.

To use an integration pipeline, Datadog recommends installing the integration by configuring the corresponding log source. After Datadog receives the first log with this source, the installation is automatically triggered and the integration pipeline is added to the processing pipelines list. To configure the log source, see the corresponding integration documentation.

It’s also possible to copy an integration pipeline using the clone button.

Add a processor or nested pipeline

Navigate to Pipelines in Datadog.
Hover over a pipeline and click the arrow next to it to expand processors and nested pipelines.
Select Add Processor or Add Nested Pipeline.

Processors

A processor executes within a pipeline to complete a data-structuring action. See the Processors docs to learn how to add and configure a processor by processor type, within the app or with the API.

See Parsing dates to learn about custom date and time formats and the required timezone parameter for non-UTC timestamps.

Nested pipelines

Nested pipelines are pipelines within a pipeline. Use nested pipelines to split the processing into two steps. For example, first use a high-level filter such as team and then a second level of filtering based on the integration, service, or any other tag or attribute.

A pipeline can contain nested pipelines and processors, whereas a nested pipeline can only contain processors.

Move a pipeline into another pipeline to make it into a nested pipeline:

Hover over the pipeline you want to move, and click on the Move to icon.
Select the pipeline you want to move the original pipeline into. Note: Pipelines containing nested pipelines can only be moved to another top level position. They cannot be moved into another pipeline.
Click Move.
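
The containment and move rules above can be expressed as a small data model. The following Python sketch is illustrative only: the `Pipeline` and `Processor` classes and the `move_into` function are hypothetical, and the filter strings are example queries rather than values taken from a real configuration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Processor:
    name: str


@dataclass
class Pipeline:
    name: str
    filter_query: str
    children: List[Union["Pipeline", Processor]] = field(default_factory=list)

    def has_nested_pipelines(self) -> bool:
        return any(isinstance(child, Pipeline) for child in self.children)


def move_into(top_level: List[Pipeline], pipeline: Pipeline, target: Pipeline) -> None:
    """Turn a top-level pipeline into a nested pipeline of another top-level pipeline."""
    if pipeline is target:
        raise ValueError("a pipeline cannot be moved into itself")
    if not any(candidate is target for candidate in top_level):
        # A nested pipeline can only contain processors.
        raise ValueError("the target must be a top-level pipeline")
    if pipeline.has_nested_pipelines():
        raise ValueError("pipelines containing nested pipelines must stay at the top level")
    top_level[:] = [candidate for candidate in top_level if candidate is not pipeline]
    target.children.append(pipeline)


# First level: filter by team. Second level: filter by service.
team = Pipeline("payments team", "team:payments")
checkout = Pipeline("checkout", "service:checkout", [Processor("grok parser")])
top_level = [team, checkout]

move_into(top_level, checkout, team)
print([pipeline.name for pipeline in top_level])
print([child.name for child in team.children])
```

After the move, `team` holds a nested pipeline, so in this model it can no longer be moved into another pipeline.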

Manage your pipelines

Identify when the last change to a pipeline or processor was made and which user made the change using the modification information on the pipeline. Filter your pipelines using this modification information, as well as other faceted properties such as whether the pipeline is enabled or read-only.

Reorder pipelines precisely with the Move to option in the sliding option panel. In the Move to modal, scroll to and click the exact position where you want the selected pipeline. Pipelines cannot be moved into read-only pipelines. Pipelines containing nested pipelines can only be moved to other top-level positions. They cannot be moved into other pipelines.

Clone pipelines to reuse existing rules and processors without having to start over. When you clone a pipeline, Datadog automatically disables the pipeline you cloned. Click the toggle to enable.

Estimated usage metrics

Estimated usage metrics are displayed for each pipeline. They show the volume and count of logs ingested and modified by that pipeline. Every pipeline includes a link to the out-of-the-box Logs Estimated Usage Dashboard, which offers detailed charts of the pipeline’s usage metrics.

URL: https://docs.datadoghq.com/logs/log_configuration/pipelines/
