Logs
The Databricks (Zerobus) destination is in Preview. Contact your account manager to request access.
Use Observability Pipelines’ Databricks (Zerobus) destination to send logs to a Databricks Unity Catalog table. The destination streams logs to the Zerobus Ingest API and authenticates to Databricks with an OAuth service principal.
Before you configure the Databricks (Zerobus) destination, you must:
The SQL examples in this section use the following placeholders:
| Placeholder | Description | Example |
|---|---|---|
| `<USER>` | The user who creates the schema and table. | `databricks-user@example.com` |
| `<CATALOG_NAME>` | The Unity Catalog name. | `main` |
| `<SCHEMA_NAME>` | The schema name. | `obs_pipelines` |
| `<TABLE_NAME>` | The table name. | `apache_common_logs` |
| `<YOUR_MANAGED_LOCATION>` | (Optional) The managed location URI. | `s3://your-bucket/managed` |
Note: The GRANT commands must be run by a Databricks workspace admin.
In the Databricks workspace:
If you’re not a Databricks workspace admin, have an admin run the following command to grant your user permission to create a schema:

    GRANT CREATE SCHEMA ON CATALOG <CATALOG_NAME> TO <USER>;

Create the schema:

    CREATE SCHEMA IF NOT EXISTS <CATALOG_NAME>.<SCHEMA_NAME> MANAGED LOCATION '<YOUR_MANAGED_LOCATION>';

MANAGED LOCATION is optional. See Databricks’ Create Schemas documentation for more information.

If you’re not an admin user, have an admin run the following command to grant your user permission to create a table on the schema:

    GRANT CREATE TABLE ON SCHEMA <CATALOG_NAME>.<SCHEMA_NAME> TO <USER>;

Run the following command to create the table that Observability Pipelines writes log data to:

    CREATE TABLE <CATALOG_NAME>.<SCHEMA_NAME>.<TABLE_NAME> (
      host STRING,
      message STRING,
      service STRING,
      source_type STRING,
      timestamp TIMESTAMP
    );

The fully qualified table name is catalog.schema.table, for example main.obs_pipelines.apache_common_logs. This is the value you enter for Table Name when you set up the Observability Pipelines Databricks destination.
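As a small illustration of the catalog.schema.table format, the following sketch joins the three name parts and rejects obviously malformed input. The function name `qualified_table_name` is hypothetical, and the validation is a simplification (it does not handle backtick-quoted identifiers); it is not part of Observability Pipelines or Databricks.

```python
def qualified_table_name(catalog: str, schema: str, table: str) -> str:
    """Join the three Unity Catalog name parts into catalog.schema.table.

    Hypothetical helper for illustration only. It rejects empty parts and
    parts that contain a dot, which would make the result ambiguous.
    """
    parts = (catalog, schema, table)
    for part in parts:
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


# Uses the example values from the placeholder table.
table_name = qualified_table_name("main", "obs_pipelines", "apache_common_logs")
print(table_name)
```

The printed value is the string you would enter for Table Name in the destination settings.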
The Databricks Zerobus Ingest API uses OAuth authentication. When you create the service principal, the OAuth client secret is generated and the OAuth client ID is the service principal’s UUID.
To create a service principal:
Run the following commands, replacing <SERVICE_PRINCIPAL_UUID> with the service principal’s application ID from the previous step:

    GRANT USE CATALOG ON CATALOG <CATALOG_NAME> TO <SERVICE_PRINCIPAL_UUID>;
    GRANT USE SCHEMA ON SCHEMA <CATALOG_NAME>.<SCHEMA_NAME> TO <SERVICE_PRINCIPAL_UUID>;
    GRANT SELECT, MODIFY ON TABLE <CATALOG_NAME>.<SCHEMA_NAME>.<TABLE_NAME> TO <SERVICE_PRINCIPAL_UUID>;

See Databricks’ Add service principals to your account and Grant permissions on an object documentation for more information.
Configure the Databricks (Zerobus) destination when you set up a pipeline. You can set up a pipeline in the UI, using the API, or with Terraform. The steps in this section are configured in the UI.
Note: Log fields that are not present in the table schema are dropped. For example, if a log has the fields id, name, and host, and the table schema only contains the columns name and host, then the id field is dropped and not written to the table.
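The field-dropping behavior in the note above can be pictured with a short sketch. This is only an illustration of the described outcome, not the Worker's actual implementation; `drop_unknown_fields` and the sample values are hypothetical.

```python
def drop_unknown_fields(log: dict, table_columns: set) -> dict:
    """Keep only the log fields that have a matching table column.

    Hypothetical helper that mirrors the note: fields with no matching
    column are dropped rather than written to the table.
    """
    return {key: value for key, value in log.items() if key in table_columns}


# The example from the note: the table has only `name` and `host` columns.
table_columns = {"name", "host"}
log = {"id": 42, "name": "checkout", "host": "web-01"}

written = drop_unknown_fields(log, table_columns)
print(written)  # the `id` field is absent from the result
```

If a field must be preserved, add a matching column to the table before sending logs.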
After you select the Databricks (Zerobus) destination in the pipeline UI:
TIMESTAMP type. If your table uses a timestamp column, see Convert string timestamps to timestamp format for more information.

DD_OP_. For example, if you entered PASSWORD_1 for a password identifier, the environment variable for that password is DD_OP_PASSWORD_1.

https://<workspace_id>.zerobus.<region>.cloud.databricks.com. The Worker sends logs to this endpoint.

catalog.schema.table, such as main.obs_pipelines.apache_common_logs.

https://<workspace>.cloud.databricks.com. The Worker uses this endpoint to read the table’s schema.

abcdefgh-1234-5678-abcd-ef0123456789.

Toggle the switch to enable Buffering Options. A configurable buffer on the destination keeps intermittent latency or an outage at the destination from creating immediate backpressure, so events can continue to be ingested from your source. Disk buffers can also increase pipeline durability by writing data to disk, so buffered data persists through a Worker restart. See Destination buffers for more information.
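The `DD_OP_` naming rule mentioned in the steps above (identifier PASSWORD_1 becomes environment variable DD_OP_PASSWORD_1) can be sketched as follows. The helper `env_var_for` is hypothetical and only demonstrates the prefixing; it is not a Datadog API.

```python
ENV_PREFIX = "DD_OP_"


def env_var_for(identifier: str) -> str:
    """Return the environment variable name for a secret identifier.

    Hypothetical helper: it prepends the DD_OP_ prefix to the identifier
    entered in the pipeline UI.
    """
    return ENV_PREFIX + identifier


print(env_var_for("PASSWORD_1"))
print(env_var_for("DESTINATION_DATABRICKS_ZEROBUS_OAUTH_CLIENT_SECRET"))
```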
If your logs have timestamps in string format and your Databricks table has a column declared as a TIMESTAMP type, you must convert the string to timestamp format before sending logs to the Databricks (Zerobus) destination. The destination can only convert values that are already in timestamp format to the Databricks TIMESTAMP type; it does not parse string timestamps.
If you do not convert the string timestamp, the Worker throws an error similar to:
Protobuf encoding failed: Error converting timestamp field: Can't convert '2012-04-23T10[41]15Z' to i64: invalid digit found in string
To convert timestamps in string format to timestamp format:
.timestamp = parse_timestamp!(.timestamp, format: "%+")
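To see why the string must be parsed first, the sketch below mirrors the conversion in Python. The error message above mentions `i64`, which suggests the timestamp is encoded as a 64-bit integer; the choice of microseconds since the Unix epoch is an assumption made for illustration, and `to_epoch_micros` is a hypothetical helper, not part of the Worker.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(value: str) -> int:
    """Parse an ISO 8601 / RFC 3339 string into integer microseconds.

    Assumption: the integer unit is microseconds since the Unix epoch.
    """
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division by a timedelta avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(microseconds=1)


raw = "2012-04-23T10:41:15Z"

# Treating the raw string as an integer fails, much like the Worker error.
try:
    int(raw)
except ValueError as err:
    print(f"cannot encode string directly: {err}")

print(to_epoch_micros(raw))
```

Once the field holds a parsed timestamp rather than a string, it can be represented as an integer without a digit-parsing error.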
These are the defaults used for secret identifiers and environment variables.
DESTINATION_DATABRICKS_ZEROBUS_OAUTH_CLIENT_SECRET.

DD_OP_DESTINATION_DATABRICKS_ZEROBUS_OAUTH_CLIENT_SECRET.

A batch of events is flushed when one of these parameters is met. See event batching for more information.
| Maximum Events | Maximum Size (MB) | Timeout (seconds) |
|---|---|---|
| None | 10 | 1 |
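The "flush when any one limit is met" rule can be sketched with a small batcher that uses the defaults from the table. This is a simplified model rather than the Worker's implementation: the `Batcher` class is hypothetical, 10 MB is assumed to mean 10,000,000 bytes, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Batcher:
    """Hypothetical batcher: flushes when any configured limit is reached."""

    max_events: Optional[int] = None   # "None" in the table: no event limit
    max_bytes: int = 10 * 1000 * 1000  # 10 MB, assuming decimal megabytes
    timeout: float = 1.0               # seconds since the batch was started
    events: List[bytes] = field(default_factory=list)
    size: int = 0
    started_at: Optional[float] = None

    def add(self, event: bytes, now: float) -> Optional[List[bytes]]:
        """Add an event; return the flushed batch if a limit was reached."""
        if self.started_at is None:
            self.started_at = now
        self.events.append(event)
        self.size += len(event)
        return self.flush_if_due(now)

    def flush_if_due(self, now: float) -> Optional[List[bytes]]:
        """Return and reset the batch if any limit is met, else None."""
        if not self.events:
            return None
        too_big = self.size >= self.max_bytes
        too_many = self.max_events is not None and len(self.events) >= self.max_events
        too_old = now - self.started_at >= self.timeout
        if too_big or too_many or too_old:
            batch = self.events
            self.events, self.size, self.started_at = [], 0, None
            return batch
        return None


batcher = Batcher()
batcher.add(b'{"message": "first"}', now=0.0)
batcher.add(b'{"message": "second"}', now=0.4)
# One second after the first event, the timeout triggers a flush.
print(batcher.flush_if_due(now=1.0))
```

With the defaults, a low-volume stream is flushed by the 1-second timeout, while a high-volume stream is flushed earlier by the size limit.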