Databricks is a unified data analytics platform that allows organizations to easily process, analyze, and visualize large amounts of data. It combines data engineering, data science, and machine learning capabilities in a single platform, making it easier for teams to collaborate and derive insights from their data.
The CData SSIS Components enhance SQL Server Integration Services by enabling users to easily import and export data from various sources and destinations.
In this article, we explore the data type mapping considerations when exporting to Databricks and walk through how to migrate FTP data to Databricks using the CData SSIS Components for FTP and Databricks.
| Databricks Schema | CData Schema |
|---|---|
| int, integer, int32 | int |
| smallint, short, int16 | smallint |
| double, float, real | float |
| date | date |
| datetime, timestamp | datetime |
| time, timespan | time |
| string, varchar | If length > 4000: nvarchar(max), Otherwise: nvarchar(length) |
| long, int64, bigint | bigint |
| boolean, bool | tinyint |
| decimal, numeric | decimal |
| uuid | nvarchar(length) |
| binary, varbinary, longvarbinary | binary(1000) or varbinary(max) after SQL Server 2000 |
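The mapping in the table can be checked against a source schema before the data flow is built. The following is a minimal Python sketch of that lookup, not part of the CData components: the function name `map_type` and the `legacy_sql_server` flag are hypothetical, and the handling of binary types is one reading of "binary(1000) or varbinary(max) after SQL Server 2000".

```python
from typing import Optional

# Fixed mappings copied from the table: Databricks type name -> CData type.
FIXED_TYPES = {
    "int": "int", "integer": "int", "int32": "int",
    "smallint": "smallint", "short": "smallint", "int16": "smallint",
    "double": "float", "float": "float", "real": "float",
    "date": "date",
    "datetime": "datetime", "timestamp": "datetime",
    "time": "time", "timespan": "time",
    "long": "bigint", "int64": "bigint", "bigint": "bigint",
    "boolean": "tinyint", "bool": "tinyint",
    "decimal": "decimal", "numeric": "decimal",
}

BINARY_TYPES = {"binary", "varbinary", "longvarbinary"}


def map_type(databricks_type: str,
             length: Optional[int] = None,
             legacy_sql_server: bool = False) -> str:
    """Return the mapped type name for a Databricks type (hypothetical helper)."""
    name = databricks_type.strip().lower()
    if name in FIXED_TYPES:
        return FIXED_TYPES[name]
    if name in ("string", "varchar"):
        if length is None:
            raise ValueError("string/varchar mapping needs a column length")
        # Lengths above 4000 fall back to nvarchar(max), per the table.
        return "nvarchar(max)" if length > 4000 else f"nvarchar({length})"
    if name == "uuid":
        if length is None:
            raise ValueError("uuid mapping needs a column length")
        return f"nvarchar({length})"
    if name in BINARY_TYPES:
        # Assumption: binary(1000) applies only to SQL Server 2000 and earlier.
        return "binary(1000)" if legacy_sql_server else "varbinary(max)"
    raise KeyError(f"no mapping listed for type: {databricks_type}")
```

For example, `map_type("varchar", 5000)` falls into the `nvarchar(max)` branch, while `map_type("string", 255)` keeps the declared length.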
Follow the steps below to specify properties required to connect to FTP.
To connect to FTP or SFTP servers, specify at least RemoteHost and FileProtocol. Specify the port with RemotePort.
Set User and Password to perform Basic authentication. Set SSHAuthMode to use SSH authentication. See the Getting Started section of the data provider help documentation for more information on authenticating via SSH.
Set SSLMode and SSLServerCert to secure connections with SSL.
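To keep these settings in one place, the properties named above can be assembled programmatically. The sketch below is illustrative only: the helper `build_ftp_connection_string` is hypothetical, the semicolon-delimited `Name=Value;` layout is an assumption about how the properties are serialized, and the host, protocol value, and credentials in the usage note are placeholders. In SSIS you would normally enter these values in the connection manager dialog instead.

```python
from typing import Optional


def build_ftp_connection_string(remote_host: str,
                                file_protocol: str,
                                remote_port: Optional[int] = None,
                                user: Optional[str] = None,
                                password: Optional[str] = None,
                                ssh_auth_mode: Optional[str] = None,
                                ssl_mode: Optional[str] = None,
                                ssl_server_cert: Optional[str] = None) -> str:
    """Join the FTP connection properties into one string (hypothetical helper)."""
    # RemoteHost and FileProtocol are the minimum the article requires.
    if not remote_host or not file_protocol:
        raise ValueError("RemoteHost and FileProtocol are required")
    props = {
        "RemoteHost": remote_host,
        "FileProtocol": file_protocol,
        "RemotePort": remote_port,
        "User": user,
        "Password": password,
        "SSHAuthMode": ssh_auth_mode,
        "SSLMode": ssl_mode,
        "SSLServerCert": ssl_server_cert,
    }
    # Skip properties that were not supplied; the layout is an assumption.
    return "".join(f"{name}={value};"
                   for name, value in props.items()
                   if value is not None)
```

Calling `build_ftp_connection_string("ftp.example.com", "SFTP", remote_port=22, user="demo", password="secret")` would include only the five supplied properties, in the order they are declared.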
The data provider lists the tables based on the available folders in your FTP server. Set the following connection properties to control the relational view of the file system:
Stored Procedures are available to download files, upload files, and send protocol commands. See the Data Model chapter of the FTP data provider documentation for more information.
Figure: Configure the source connection (Salesforce is shown).

With the FTP Source configured, we can configure the Databricks connection and map the columns.
Note: The needed values can be found in your Databricks instance by navigating to Clusters, selecting the desired cluster, and selecting the JDBC/ODBC tab under Advanced Options.
You can now run the project. After the SSIS Task has finished executing, your FTP data will be exported to the chosen Databricks table.
Download a free trial of the FTP SSIS Component to get started.

Learn more:

Powerful SSIS Source & Destination Components that allow you to easily connect SQL Server with remote files and directories through SSIS Workflows.
Use the FTP Data Flow Components to access all kinds of data. Perfect for data synchronization, local back-ups, reporting, and more!