CData Sync is a standalone application designed to support a variety of replication scenarios, such as replicating both sandbox and production instances into your database. It includes a web-based interface that simplifies the management of multiple Databricks connections. In this article, we demonstrate how to use the web app to replicate multiple Databricks accounts into a single database.
Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:
While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.
Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.
Using CData Sync, you can replicate data from Databricks to any number of databases, both cloud-based and on-premises. In this example, we use SQLite as the replication destination to demonstrate the process. To add it as a destination, navigate to the Connections tab.
For each destination database:
You are now connected to SQLite and can use it as both a source and a destination.
NOTE: You can use the Label feature to add a label for a source or a destination.
Add a label.
You can configure a connection to your Databricks account from the Connections tab.
To connect to a Databricks cluster, set the properties as described below.
Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.
CData Sync enables you to control replication with a point-and-click interface and with SQL queries. For each replication you wish to configure, navigate to the Jobs tab and click Add Job. Select the Source and Destination for your replication.
Select Source and Destination connections for the replication.
To replicate an entire table, navigate to the Task tab in the Job, click Add Tasks, choose the table(s) from the list of Databricks tables you wish to replicate into SQLite, and click Add Tasks again.
Choose the account table to replicate (Salesforce is shown).
The statement below caches and incrementally updates a table of Databricks data:
REPLICATE Customers;
You can specify a file containing the replication queries you want to use to update a particular database. Separate the replication statements with semicolons. The following options are useful when replicating multiple Databricks accounts into the same database:
Use a different table prefix in the REPLICATE SELECT statement:
REPLICATE PROD_Customers SELECT * FROM Customers;
Alternatively, use a different schema:
REPLICATE PROD.Customers SELECT * FROM Customers;
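If you maintain one query file per Databricks account, the statements can be generated rather than typed by hand. The following is a minimal Python sketch, not part of CData Sync itself: the function name build_replication_queries, the labels PROD and SANDBOX, and the Orders table are assumptions made for illustration. It only produces text in the prefix and schema forms shown above; you would still save the output to a file and point the job at it.

```python
def build_replication_queries(label, tables, use_schema=False):
    """Build semicolon-terminated REPLICATE statements for one account.

    label      -- hypothetical account label, e.g. "PROD" or "SANDBOX"
    tables     -- source table names to replicate
    use_schema -- False: PROD_Customers (table prefix)
                  True:  PROD.Customers (separate schema)
    """
    separator = "." if use_schema else "_"
    return "\n".join(
        f"REPLICATE {label}{separator}{table} SELECT * FROM {table};"
        for table in tables
    )


# One block of statements per account; "Orders" is a made-up second table.
for account_label in ("PROD", "SANDBOX"):
    print(build_replication_queries(account_label, ["Customers", "Orders"]))
```

Generating the statements from a single table list keeps the per-account files in step with each other, so the only difference between them is the prefix or schema.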
Select the Overview tab in the Job, and click Configure under Schedule. You can schedule a job to run automatically by configuring it to run at specified intervals, ranging from once every 10 minutes to once every month.
Schedule your job to run automatically.
Once you have configured the replication job, click Save Changes. You can configure any number of jobs to manage the replication of your Databricks data to disparate on-premises, cloud-based, and other databases.
Once all the required configurations are made for the job, select the Databricks table you wish to replicate and click Run. After the replication completes successfully, a notification appears, showing the time taken to run the job and the number of rows replicated.
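After the jobs for each account have run, you may want to confirm that every account's tables landed in the shared SQLite database. The sketch below uses Python's standard sqlite3 module and assumes the table-prefix naming from the REPLICATE examples (PROD_Customers and so on). The helper name row_counts_by_prefix, the SANDBOX label, and the Id and Name columns are hypothetical, and an in-memory database stands in for the real destination file.

```python
import sqlite3


def row_counts_by_prefix(conn, prefixes):
    """Return {prefix: {table_name: row_count}} for tables named <prefix>_<table>."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for prefix in prefixes:
        counts[prefix] = {}
        for name in names:
            # Match in Python: "_" is a wildcard in SQL LIKE patterns.
            if name.startswith(prefix + "_"):
                quoted = '"' + name.replace('"', '""') + '"'
                total = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
                counts[prefix][name] = total
    return counts


# Stand-in for the destination; in practice, connect to the SQLite file
# configured as the replication destination instead of ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROD_Customers (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE SANDBOX_Customers (Id INTEGER, Name TEXT)")
conn.executemany("INSERT INTO PROD_Customers VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.execute("INSERT INTO SANDBOX_Customers VALUES (?, ?)", (1, "Test"))

print(row_counts_by_prefix(conn, ["PROD", "SANDBOX"]))
```

Comparing these counts with the row totals reported in the job notification is a quick sanity check that each account replicated into its own set of tables.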
Run the job.
Now that you've seen how to replicate and configure multiple jobs to manage the replication of your Databricks data to various on-premises, cloud-based, and other databases, visit our CData Sync page to learn more and download a free 30-day trial. Start consolidating your enterprise data today!
As always, our world-class Support Team is ready to answer any questions you may have.