Jobs Monitoring for AWS Glue

Jobs Monitoring for AWS Glue is in Preview.

Overview

Data Observability: Jobs Monitoring gives visibility into the performance and reliability of your AWS Glue jobs.

Prerequisites

Before you begin, make sure you have:

An AWS account with Glue jobs you want to monitor.
The Datadog AWS integration configured for the account.
IAM permissions to modify the Datadog role’s policies.

Configure the AWS account

Navigate to Datadog Data Observability > Settings.
Click Configure next to AWS Glue.
Select an existing AWS account that is already connected to Datadog, or add a new one. For help adding a new account, see the AWS Integration documentation.

Add required IAM permissions

The Data Observability crawler requires additional permissions to monitor Glue jobs. Attach the following policy to the Datadog IAM role configured for your AWS integration:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetCatalog",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetJobRun",
        "glue:GetJobRuns",
        "glue:GetJob",
        "glue:GetJobs",
        "glue:GetTable",
        "glue:GetTables",
        "glue:ListJobs",
        "s3:ListBucket",
        "kms:Decrypt",
        "lakeformation:GetDataAccess"
      ],
      "Resource": ["*"]
    },
    {
      "Sid": "AllowIcebergMetadataOnly",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::*/metadata/*"
      ]
    }
  ]
}

Some of these permissions are related to monitoring Iceberg tables in Glue. For more details on dataset-related IAM permissions, see the AWS Glue Data Quality Monitoring documentation.
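
If you manage IAM from a script rather than the AWS console, the policy above could be attached as an inline policy on the Datadog role. The following is a minimal sketch, not an official procedure. It assumes boto3 is installed and that your credentials are allowed to call iam:PutRolePolicy. The role name and policy name in the usage note are hypothetical placeholders.

```python
import json

# Actions copied from the first statement of the policy on this page.
GLUE_MONITORING_ACTIONS = [
    "glue:GetCatalog",
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetJobRun",
    "glue:GetJobRuns",
    "glue:GetJob",
    "glue:GetJobs",
    "glue:GetTable",
    "glue:GetTables",
    "glue:ListJobs",
    "s3:ListBucket",
    "kms:Decrypt",
    "lakeformation:GetDataAccess",
]


def build_policy_document() -> str:
    """Return the policy from this page as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": GLUE_MONITORING_ACTIONS,
                "Resource": ["*"],
            },
            {
                "Sid": "AllowIcebergMetadataOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": ["arn:aws:s3:::*/metadata/*"],
            },
        ],
    }
    return json.dumps(policy)


def attach_policy(role_name: str, policy_name: str) -> None:
    """Attach the policy inline to an existing IAM role (makes a live AWS call)."""
    import boto3  # imported lazily so build_policy_document() works without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=build_policy_document(),
    )
```

For example, attach_policy("DatadogIntegrationRole", "DatadogGlueJobsMonitoring") would attach the policy to a role named DatadogIntegrationRole. Both names are illustrative; use the role configured for your AWS integration.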

Configure the crawler

Select the AWS regions where your Glue jobs are located.
Enable the Job Monitoring toggle.
Click Save.

(Optional) Configure Glue jobs logs

Follow these steps to send AWS logs from CloudWatch to Datadog.
Manually configure triggers in AWS CloudWatch to capture AWS Glue logs. By default, Glue logs are stored in the following log groups:
- /aws-glue/jobs/error
- /aws-glue/jobs/output
- /aws-glue/jobs/logs-v2
Note: After logs are ingested into Datadog, the CloudWatch log group name maps to the host attribute in Datadog Logs.
Create a Log Index that includes logs where the host attribute matches:
- /aws-glue/jobs/error
- /aws-glue/jobs/output
- /aws-glue/jobs/logs-v2

Indexing these logs makes them searchable and available under the Glue tab in Data Observability: Jobs Monitoring.
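
The index filter described above can be written as a single query over the host attribute. The sketch below only builds the query string. It assumes the Datadog log search syntax accepts quoted values joined with OR inside parentheses, so check the result in the Log Explorer before saving the index. The function name is illustrative.

```python
# Default Glue log groups listed on this page.
GLUE_LOG_GROUPS = [
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/logs-v2",
]


def build_index_filter(log_groups):
    """Build a host filter matching any of the given log group names."""
    # Quote each value because log group names contain slashes.
    quoted = " OR ".join(f'"{group}"' for group in log_groups)
    return f"host:({quoted})"


if __name__ == "__main__":
    print(build_index_filter(GLUE_LOG_GROUPS))
```

If your jobs write to custom log groups, pass those names instead of the defaults.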

(Optional) Configure Glue metrics

To collect Glue metrics, enable the Glue Integration tile. The metrics then appear under the Glue job tab in Data Observability: Jobs Monitoring.

(Optional) Enable dataset lineage

Glue jobs that run with the Spark engine can emit OpenLineage events directly to Datadog. This provides dataset-level lineage, showing which datasets your job reads and writes.

Note: AWS Glue includes the Spark OpenLineage connector in its default class path. To use a more recent version, supply the connector JAR through the --extra-jars Glue job parameter and set the --user-jars-first parameter to true so that your JAR takes precedence over the bundled version. For example: --extra-jars s3://<YOUR_BUCKET>/openlineage-spark-<VERSION>.jar and --user-jars-first true.
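
When job parameters are set from code, the two parameters in the note above can be expressed as a dictionary of strings. This is a minimal sketch: the helper name and the validation are illustrative, and the JAR location is whatever S3 URI you uploaded the connector to.

```python
def openlineage_job_arguments(jar_s3_uri: str) -> dict:
    """Return Glue job parameters that load a user-supplied OpenLineage JAR first."""
    if not jar_s3_uri.startswith("s3://") or not jar_s3_uri.endswith(".jar"):
        raise ValueError("expected an S3 URI that points to a .jar file")
    return {
        "--extra-jars": jar_s3_uri,
        # Glue job parameter values are strings, so the flag is "true", not True.
        "--user-jars-first": "true",
    }
```

The returned dictionary has the string-to-string shape that boto3 accepts for the Arguments parameter of the Glue start_job_run call, or for DefaultArguments in a job definition.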

Configure the SparkSession

In your Glue job script, configure the SparkSession with the following settings:

from pyspark.sql import SparkSession

# Register the OpenLineage listener and send its events to Datadog over HTTP.
spark = (
    SparkSession.builder
    .config("spark.extraListeners", "io.openlineage.spark.agent.OpenLineageSparkListener")
    .config("spark.openlineage.transport.type", "http")
    .config("spark.openlineage.transport.url", "<DD_DATA_OBSERVABILITY_INTAKE>")
    .config("spark.openlineage.transport.auth.type", "api_key")
    .config("spark.openlineage.transport.auth.apiKey", "<DATADOG_API_KEY>")
    # Mask configuration values whose keys match this pattern (covers the API key).
    .config("spark.redaction.regex", "(?i)secret|password|token|access[.]key|apikey")
    # Attach the Glue job run ID to the emitted events.
    .config("spark.openlineage.capturedProperties", "spark.glue.JOB_RUN_ID")
    .getOrCreate()
)

Replace <DD_DATA_OBSERVABILITY_INTAKE> with https://data-obs-intake.. Replace <DATADOG_API_KEY> with your Datadog API key. spark.glue.JOB_RUN_ID is the Spark configuration property automatically set by AWS Glue with the current job run ID — use it verbatim.
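
To avoid hardcoding the API key in the job script, the settings above can be assembled from values supplied at runtime. The following is a sketch under stated assumptions: the two helper functions are illustrative, and how you obtain the intake URL and API key (for example, from a secrets store) is outside its scope. The property names and values are the ones shown above.

```python
def openlineage_spark_conf(intake_url: str, api_key: str) -> dict:
    """Return the Spark properties from this page with the placeholders filled in."""
    if not intake_url.startswith("https://"):
        raise ValueError("intake_url must be an https:// URL")
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        "spark.openlineage.transport.type": "http",
        "spark.openlineage.transport.url": intake_url,
        "spark.openlineage.transport.auth.type": "api_key",
        "spark.openlineage.transport.auth.apiKey": api_key,
        "spark.redaction.regex": "(?i)secret|password|token|access[.]key|apikey",
        # Use the property name verbatim; Glue sets its value for each run.
        "spark.openlineage.capturedProperties": "spark.glue.JOB_RUN_ID",
    }


def build_session(conf: dict):
    """Create a SparkSession with the given properties (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily; available inside Glue

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Inside the Glue job, call build_session(openlineage_spark_conf(intake_url, api_key)) once, before any DataFrame operations, so that the listener is registered when the session is created.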

Validate

After enabling OpenLineage, open a job run in Data Observability: Jobs Monitoring. The flame graph should now include additional spans such as spark.application or spark.sql_job. Inspect the payloads of these spans when debugging dataset extraction.

Next steps

After setup, the crawler runs every few minutes. Open the Data Observability: Jobs Monitoring page in Datadog to see the list of your Glue job runs.

URL: https://docs.datadoghq.com/data_observability/jobs_monitoring/glue/