Data Observability: Jobs Monitoring gives visibility into the performance and reliability of your AWS Glue jobs.
Before you begin, make sure you have:
Navigate to Datadog Data Observability > Settings.
Click Configure next to AWS Glue.
Select an existing AWS account that is already connected to Datadog, or add a new one. For help adding a new account, see the AWS Integration documentation.
The Data Observability crawler requires additional permissions to monitor Glue jobs. Attach the following policy to the Datadog IAM role configured for your AWS integration:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetCatalog",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetJobRun",
        "glue:GetJobRuns",
        "glue:GetJob",
        "glue:GetJobs",
        "glue:GetTable",
        "glue:GetTables",
        "glue:ListJobs",
        "s3:ListBucket",
        "kms:Decrypt",
        "lakeformation:GetDataAccess"
      ],
      "Resource": ["*"]
    },
    {
      "Sid": "AllowIcebergMetadataOnly",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::*/metadata/*"
      ]
    }
  ]
}
Some of these permissions are related to monitoring Iceberg tables in Glue. For more details on dataset-related IAM permissions, see the AWS Glue Data Quality Monitoring documentation.
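One possible way to attach the policy is as an inline policy through the IAM API. The following is a minimal sketch, assuming boto3 is installed and AWS credentials with IAM write access are configured. The function name attach_policy, the policy name DatadogDataObservabilityGlue, and the role name in the usage comment are hypothetical placeholders; use the name of the Datadog IAM role from your own AWS integration.

```python
import json

# The policy from the step above, expressed as a Python dict.
DATA_OBSERVABILITY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetCatalog",
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetJobRun",
                "glue:GetJobRuns",
                "glue:GetJob",
                "glue:GetJobs",
                "glue:GetTable",
                "glue:GetTables",
                "glue:ListJobs",
                "s3:ListBucket",
                "kms:Decrypt",
                "lakeformation:GetDataAccess",
            ],
            "Resource": ["*"],
        },
        {
            "Sid": "AllowIcebergMetadataOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": ["arn:aws:s3:::*/metadata/*"],
        },
    ],
}


def attach_policy(role_name, policy_name="DatadogDataObservabilityGlue"):
    """Attach the policy inline to an IAM role (policy name is hypothetical)."""
    import boto3  # imported lazily so the module loads without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(DATA_OBSERVABILITY_POLICY),
    )


# Example call (hypothetical role name):
# attach_policy("DatadogIntegrationRole")
```

Keeping the policy as a dict and serializing it with json.dumps avoids hand-editing a JSON string when actions are added later.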
Select the AWS regions where your Glue jobs are located.
Enable the Job Monitoring toggle.
Click Save.
Follow these steps to send AWS logs from CloudWatch to Datadog.
Manually configure triggers in AWS CloudWatch to capture AWS Glue logs. By default, Glue logs are stored in the following log groups:
/aws-glue/jobs/error
/aws-glue/jobs/output
/aws-glue/jobs/logs-v2
Note: After logs are ingested into Datadog, the CloudWatch log group name maps to the host attribute in Datadog Logs.
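If the triggers are created as CloudWatch Logs subscription filters, the setup can be scripted. This is a sketch under stated assumptions: boto3 is installed, the logs are forwarded by a Lambda function whose ARN you supply, and that function already allows CloudWatch Logs to invoke it. The function names and the filter name datadog-glue-logs are hypothetical.

```python
# Default Glue log groups listed above.
GLUE_LOG_GROUPS = [
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/logs-v2",
]


def subscription_filter_requests(destination_arn, filter_name="datadog-glue-logs"):
    """Build one put_subscription_filter request per default Glue log group."""
    return [
        {
            "logGroupName": group,
            "filterName": filter_name,  # hypothetical filter name
            "filterPattern": "",  # an empty pattern matches every log event
            "destinationArn": destination_arn,
        }
        for group in GLUE_LOG_GROUPS
    ]


def create_subscription_filters(destination_arn):
    """Create the subscription filters in the current AWS region."""
    import boto3  # imported lazily so the module loads without boto3

    logs = boto3.client("logs")
    for request in subscription_filter_requests(destination_arn):
        logs.put_subscription_filter(**request)
```

Separating the request builder from the API call makes it possible to review the exact parameters before anything is changed in AWS.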
Create a Log Index that includes logs where the host attribute matches:
/aws-glue/jobs/error
/aws-glue/jobs/output
/aws-glue/jobs/logs-v2
This helps ensure the logs are searchable and available under the Glue tab in Data Observability: Jobs Monitoring.
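The index filter can be written as a single search query. The helper below is a small sketch that builds such a query, assuming the standard Datadog log search syntax in which attribute values containing special characters such as slashes are wrapped in double quotes. The function name index_filter_query is hypothetical.

```python
# Default Glue log groups, which appear as the host attribute in Datadog Logs.
GLUE_LOG_GROUPS = [
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/logs-v2",
]


def index_filter_query(log_groups=GLUE_LOG_GROUPS):
    """Join one quoted host clause per log group with OR."""
    return " OR ".join(f'host:"{group}"' for group in log_groups)


print(index_filter_query())
```

If your Glue jobs write to custom log groups, pass those names instead of the defaults.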
Enable the Glue Integration tile for Glue metrics collection. Metrics should be available under the Glue job tab in Data Observability: Jobs Monitoring.
Glue jobs that run with the Spark engine can emit OpenLineage events directly to Datadog. This provides dataset-level lineage, showing which datasets your job reads and writes.
Note: AWS Glue includes the Spark OpenLineage connector in its default class path. To use a more recent version, add the connector JAR through the --extra-jars Glue job parameter and set the --user-jars-first parameter to true so that your JAR overrides the bundled version. For example: --extra-jars s3://<YOUR_BUCKET>/openlineage-spark-<VERSION>.jar and --user-jars-first true.
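The two job parameters from the note can also be supplied when starting a run through the AWS API. The following is a minimal sketch, assuming boto3 is installed and that the JAR has been uploaded under the file name pattern shown in the note. The function names, the bucket, and the version are placeholders for illustration only.

```python
def openlineage_jar_arguments(bucket, version):
    """Return Glue job arguments that load a newer OpenLineage connector JAR."""
    return {
        # S3 path follows the pattern from the note above.
        "--extra-jars": f"s3://{bucket}/openlineage-spark-{version}.jar",
        # Glue job argument values are strings, so "true" is passed as text.
        "--user-jars-first": "true",
    }


def start_run_with_connector(job_name, bucket, version):
    """Start a Glue job run with the connector override applied to that run."""
    import boto3  # imported lazily so the module loads without boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=openlineage_jar_arguments(bucket, version),
    )
    return response["JobRunId"]
```

Arguments passed to start_job_run apply to that run only; to make the override permanent, set the same keys in the job's default arguments.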
In your Glue job script, configure the SparkSession with the following settings:
from pyspark.sql import SparkSession

# Register the OpenLineage listener and send its events to Datadog over HTTP.
spark = SparkSession.builder \
    .config("spark.extraListeners", "io.openlineage.spark.agent.OpenLineageSparkListener") \
    .config("spark.openlineage.transport.type", "http") \
    .config("spark.openlineage.transport.url", "<DD_DATA_OBSERVABILITY_INTAKE>") \
    .config("spark.openlineage.transport.auth.type", "api_key") \
    .config("spark.openlineage.transport.auth.apiKey", "<DATADOG_API_KEY>") \
    .config("spark.redaction.regex", "(?i)secret|password|token|access[.]key|apikey") \
    .config("spark.openlineage.capturedProperties", "spark.glue.JOB_RUN_ID") \
    .getOrCreate()
Replace <DD_DATA_OBSERVABILITY_INTAKE> with https://data-obs-intake.. Replace <DATADOG_API_KEY> with your Datadog API key. spark.glue.JOB_RUN_ID is the Spark configuration property automatically set by AWS Glue with the current job run ID — use it verbatim.
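To avoid hard-coding the intake URL and API key in the script, the same settings can be collected in a dictionary and applied in a loop. This is a sketch, not the only way to structure the job: the function names openlineage_conf and build_session are hypothetical, and how the API key is retrieved (for example from a job parameter or a secrets store) is left to the caller.

```python
def openlineage_conf(intake_url, api_key):
    """Return the Spark settings from the snippet above as a dictionary."""
    return {
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        "spark.openlineage.transport.type": "http",
        "spark.openlineage.transport.url": intake_url,
        "spark.openlineage.transport.auth.type": "api_key",
        "spark.openlineage.transport.auth.apiKey": api_key,
        "spark.redaction.regex": "(?i)secret|password|token|access[.]key|apikey",
        # Property set by AWS Glue with the current job run ID; used verbatim.
        "spark.openlineage.capturedProperties": "spark.glue.JOB_RUN_ID",
    }


def build_session(intake_url, api_key):
    """Create a SparkSession with the OpenLineage settings applied."""
    from pyspark.sql import SparkSession  # provided by the Glue Spark runtime

    builder = SparkSession.builder
    for key, value in openlineage_conf(intake_url, api_key).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Because the settings live in one function, they can be inspected or unit tested without starting a Spark session.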
After enabling OpenLineage, open a job run in Data Observability: Jobs Monitoring. In the flame graph, additional spans such as spark.application or spark.sql_job should appear. The payloads of these spans should be helpful when debugging dataset extraction.
The crawler runs every few minutes. In Datadog, view the Data Observability: Jobs Monitoring page to see a list of your Glue job runs after setup.