Data Observability: Jobs Monitoring gives visibility into the performance and reliability of Apache Spark applications on Kubernetes.
Follow these steps to enable Data Observability: Jobs Monitoring for Spark on Kubernetes.
If you have already installed the Datadog Agent on your Kubernetes cluster, make sure you’ve enabled the Datadog Admission Controller. You can then go to the next step, Enable Single Step Instrumentation.
You can install the Datadog Agent using the Datadog Operator or Helm.
To use the Datadog Operator, install it by running the following commands:
helm repo add datadog https://helm.datadoghq.com
helm install my-datadog-operator datadog/datadog-operator
Create a Kubernetes Secret to store your Datadog API key.
kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY>
Replace <DATADOG_API_KEY> with your Datadog API key.
Create a file, datadog-agent.yaml, that contains the following configuration:
kind: DatadogAgent
apiVersion: datadoghq.com/v2alpha1
metadata:
  name: datadog
spec:
  features:
    apm:
      enabled: true
      hostPortConfig:
        enabled: true
        hostPort: 8126
    admissionController:
      enabled: true
      mutateUnlabelled: false
    # (Optional) Uncomment the next three lines to enable logs collection
    # logCollection:
    #   enabled: true
    #   containerCollectAll: true
  global:
    site: <DATADOG_SITE>
    credentials:
      apiSecret:
        secretName: datadog-secret
        keyName: api-key
  override:
    nodeAgent:
      image:
        tag: <DATADOG_AGENT_VERSION>

Replace <DATADOG_SITE> with your Datadog site.
Replace <DATADOG_AGENT_VERSION> with version 7.64.0 or later.
Optional: Uncomment the logCollection section to collect application logs, which are correlated with Spark job run traces. Once enabled, logs are collected from all discovered containers by default. See the Kubernetes log collection documentation for more details on the setup process.
Deploy the Datadog Agent with the above configuration file:
kubectl apply -f /path/to/your/datadog-agent.yaml
To install with Helm instead, create a Kubernetes Secret to store your Datadog API key.
kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY>
Replace <DATADOG_API_KEY> with your Datadog API key.
Create a file, datadog-values.yaml, that contains the following configuration:
datadog:
  apiKeyExistingSecret: datadog-secret
  site: <DATADOG_SITE>
  apm:
    portEnabled: true
    port: 8126
  # (Optional) Uncomment the next three lines to enable logs collection
  # logs:
  #   enabled: true
  #   containerCollectAll: true
agents:
  image:
    tag: <DATADOG_AGENT_VERSION>
clusterAgent:
  admissionController:
    enabled: true
    mutateUnlabelled: false

Replace <DATADOG_SITE> with your Datadog site.
Replace <DATADOG_AGENT_VERSION> with version 7.64.0 or later.
Optional: Uncomment the logs section to collect application logs, which are correlated with Spark job run traces. Once enabled, logs are collected from all discovered containers by default. See the Kubernetes log collection documentation for more details on the setup process.
Run the following command:
helm install <RELEASE_NAME> \
-f datadog-values.yaml \
--set targetSystem=<TARGET_SYSTEM> \
datadog/datadog
Replace <RELEASE_NAME> with your release name. For example, datadog-agent.
Replace <TARGET_SYSTEM> with the name of your OS. For example, linux or windows.
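Whichever install method you used, a quick check can confirm that the Agent pods started and that an admission webhook is registered before you move on. The following is a minimal sketch, not an official verification procedure: the function name check_datadog_install is hypothetical, and it assumes you pass the namespace where you installed the Agent.

```shell
# Sketch: confirm the Agent pods are up and an admission webhook is registered.
# Usage: check_datadog_install <AGENT_NAMESPACE>
check_datadog_install() {
  # Agent and Cluster Agent pods should reach the Running state.
  kubectl get pods -n "$1"
  # The Admission Controller registers a mutating webhook configuration.
  kubectl get mutatingwebhookconfigurations
}

# Example (requires cluster access; "default" is a hypothetical namespace):
# check_datadog_install default
```

If no Datadog entry appears in the webhook list, revisit the admissionController settings in your configuration file.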
Single Step Instrumentation (SSI) injects the Java tracer into your Spark driver and executor pods at startup. It works regardless of whether your Spark driver runs in cluster mode (as a dedicated Kubernetes pod) or client mode (as a process inside your submitter pod; for example, an Airflow scheduler or worker).
Spark automatically sets spark-role: driver on driver pods and spark-role: executor on executor pods. In client mode, replace spark-role: driver with the labels that identify your submitter pod instead. To find those labels, run:
kubectl get pod <SUBMITTER_POD_NAME> -n <NAMESPACE> --show-labels
If you installed the Agent with the Datadog Operator, this step requires Datadog Operator version 1.13.0 or later.
Add the features.apm.instrumentation section to your datadog-agent.yaml and apply it:
features:
  apm:
    instrumentation:
      enabled: true
      enabledNamespaces:
        - <NAMESPACE> # namespace where your Spark jobs run
      targets:
        - name: spark-driver
          podSelector:
            matchLabels:
              spark-role: driver # replace with your submitter pod labels if running in client mode
          ddTraceVersions:
            java: "latest"
          ddTraceConfigs:
            - name: DD_DATA_JOBS_ENABLED
              value: "true"
        - name: spark-executor
          podSelector:
            matchLabels:
              spark-role: executor
          ddTraceVersions:
            java: "latest"
          ddTraceConfigs:
            - name: DD_DATA_JOBS_ENABLED
              value: "true"

kubectl apply -f /path/to/your/datadog-agent.yaml
If you installed the Agent with Helm, add the following to your datadog-values.yaml and apply it:
datadog:
  apm:
    instrumentation:
      enabled: true
      enabledNamespaces:
        - <NAMESPACE> # namespace where your Spark jobs run
      targets:
        - name: spark-driver
          podSelector:
            matchLabels:
              spark-role: driver # replace with your submitter pod labels if running in client mode
          ddTraceVersions:
            java: "latest"
          ddTraceConfigs:
            - name: DD_DATA_JOBS_ENABLED
              value: "true"
        - name: spark-executor
          podSelector:
            matchLabels:
              spark-role: executor
          ddTraceVersions:
            java: "latest"
          ddTraceConfigs:
            - name: DD_DATA_JOBS_ENABLED
              value: "true"

helm upgrade <RELEASE_NAME> datadog/datadog -f datadog-values.yaml
After applying the configuration, restart the targeted pods. SSI injects the init container into each pod on startup.
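Because injection happens only at pod startup, it can help to confirm that a restarted pod actually received an init container. The following is a minimal sketch under stated assumptions: the function name list_init_containers is hypothetical, and the exact init container names depend on your Agent version, so the sketch only lists whatever is present rather than matching a specific name.

```shell
# Sketch: print each matching pod followed by the names of its init containers.
# Usage: list_init_containers <NAMESPACE> <LABEL_SELECTOR>
list_init_containers() {
  kubectl get pods -n "$1" -l "$2" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.initContainers[*].name}{"\n"}{end}'
}

# Example (requires cluster access; "spark-jobs" is a hypothetical namespace):
# list_init_containers spark-jobs spark-role=driver
# list_init_containers spark-jobs spark-role=executor
```

A pod that shows no init containers was likely started before the configuration was applied, or does not match the namespace and labels in your targets.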
In Datadog, view the Data Observability: Jobs Monitoring page to see a list of all your data processing jobs.
To attach service, environment, and version tags to your job traces, pass the following JVM options in your spark-submit configuration or spark-defaults.conf:
spark.driver.extraJavaOptions=-Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION>
spark.executor.extraJavaOptions=-Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION>
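On the command line, the same two properties can be passed to spark-submit with --conf. The following is a minimal sketch: the helper name dd_java_opts and the values my-spark-job, staging, and 1.0.0 are hypothetical, and the spark-submit call is left commented out because the rest of its arguments depend on your job.

```shell
# Sketch: build the -Ddd.* JVM options string from a job name, environment, and version.
# Usage: dd_java_opts <JOB_NAME> <ENV> <VERSION>
dd_java_opts() {
  printf '%s' "-Ddd.service=$1 -Ddd.env=$2 -Ddd.version=$3"
}

# Hypothetical values, for illustration only.
DD_OPTS="$(dd_java_opts my-spark-job staging 1.0.0)"
echo "$DD_OPTS"

# Pass the same options to the driver and the executors (requires spark-submit):
# spark-submit \
#   --conf "spark.driver.extraJavaOptions=$DD_OPTS" \
#   --conf "spark.executor.extraJavaOptions=$DD_OPTS" \
#   <YOUR_EXISTING_ARGUMENTS>
```

Quoting each --conf value keeps the space-separated JVM options together as a single property value.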
You can set tags on Spark spans at runtime. These tags are applied only to spans that start after the tag is added.
// Add a tag to all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", "value")
spark.read.parquet(...)
To remove a runtime tag:
// Remove the tag from all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", null)