ChatGPT is an AI chatbot developed by OpenAI, launched in November 2022. Based on large language models (LLMs), it enables users to refine and steer conversations through natural language processing. ChatGPT's developer mode, available to Plus and Pro subscribers, provides full Model Context Protocol (MCP) support for connecting to external data sources and tools.
CData Connect AI offers a dedicated cloud-to-cloud interface for connecting to Azure Data Lake Storage data. The CData Connect AI Remote MCP Server enables secure communication between ChatGPT and Azure Data Lake Storage, so you can ask questions and take actions on your Azure Data Lake Storage data from ChatGPT without replicating the data to a natively supported database. Connect AI pushes all supported SQL operations, including filters and JOINs, directly to Azure Data Lake Storage and uses server-side processing to return the requested data quickly.
Step 1: Configure Azure Data Lake Storage Connectivity for ChatGPT
Connectivity to Azure Data Lake Storage from ChatGPT is made possible through CData Connect AI Remote MCP. To interact with Azure Data Lake Storage data from ChatGPT, we start by creating and configuring an Azure Data Lake Storage connection in CData Connect AI.
- Log in to your Connect AI account, click Sources, and then click Add Connection.
π Adding a Connection
- Select "Azure Data Lake Storage" from the Add Connection panel.
π Selecting a data source
- Enter the necessary authentication properties to connect to Azure Data Lake Storage.
Authenticating to a Gen 1 DataLakeStore Account
Gen 1 uses OAuth 2.0 in Entra ID (formerly Azure AD) for authentication.
For this, an Active Directory web application is required. You can create one as follows:
- Sign in to your Azure Account through the
- Select "Entra ID" (formerly Azure AD).
- Select "App registrations".
- Select "New application registration".
- Provide a name and URL for the application. Select Web app for the type of application you want to create.
- Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
- Select "Key" and generate a new key. Add a description, a duration, and take note of the generated key. You won't be able to see it again.
To authenticate against a Gen 1 DataLakeStore account, the following properties are required:
- Schema: Set this to ADLSGen1.
- Account: Set this to the name of the account.
- OAuthClientId: Set this to the application Id of the app you created.
- OAuthClientSecret: Set this to the key generated for the app you created.
- TenantId: Set this to the tenant Id. See the property for more information on how to acquire this.
- Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
Authenticating to a Gen 2 DataLakeStore Account
To authenticate against a Gen 2 DataLakeStore account, the following properties are required:
- Schema: Set this to ADLSGen2.
- Account: Set this to the name of the account.
- FileSystem: Set this to the file system which will be used for this account.
- AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the property for more information on how to acquire this.
- Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
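Before entering the values in the connection form, it can help to check that nothing required is missing. The following is a minimal sketch of that check, not part of any CData API: the helper name `missing_properties` and the sample values are hypothetical, and Directory is treated as optional because the root directory is used when it is not specified.

```python
# Hypothetical helper for illustration only; it is not a CData function.
# Property names follow the Gen 1 and Gen 2 lists above.
REQUIRED = {
    "ADLSGen1": ("Account", "OAuthClientId", "OAuthClientSecret", "TenantId"),
    "ADLSGen2": ("Account", "FileSystem", "AccessKey"),
}


def missing_properties(props):
    """Return the required property names that are absent or empty."""
    schema = props.get("Schema")
    if schema not in REQUIRED:
        raise ValueError(
            f"Schema must be one of {sorted(REQUIRED)}, got {schema!r}"
        )
    return [name for name in REQUIRED[schema] if not props.get(name)]


# Placeholder values; AccessKey is left out on purpose to show the check.
gen2 = {
    "Schema": "ADLSGen2",
    "Account": "myaccount",
    "FileSystem": "myfilesystem",
}

print(missing_properties(gen2))
```

With the sample above, the helper reports that AccessKey still needs to be supplied.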
π Configuring a connection (Salesforce is shown)
- Click Save & Test.
With the connection configured, we are ready to connect to Azure Data Lake Storage data from ChatGPT.
Step 2: Connect ChatGPT to CData Connect AI
Follow these steps to add a CData Connect AI connection in ChatGPT:
- Sign in to ChatGPT with a Plus or Pro subscription.
- Navigate to Apps from the left panel.
- Select the CData Connect AI app from the list.
π Search for CData Connect AI
- Click Connect to authenticate to your Connect AI account.
π Click Connect to authenticate
- Click Sign in with CData Connect AI to add Connect AI to your ChatGPT account.
π Sign in with CData Connect AI
- After successful authorization, you will be redirected back to ChatGPT.
- Click Start Chat to start a new conversation in ChatGPT with Connect AI connected in the background.
π Start Chat
Step 3: Explore Live Azure Data Lake Storage Data with ChatGPT
- Start a new conversation in ChatGPT.
- Connect AI should be enabled automatically in the chat. If it is not, navigate to -> More -> CData Connect AI under the chat box and enable the connector from the dropdown.
π Enable CData Connect AI in the chat
- You can now start exploring your data with natural language prompts. ChatGPT will use the Connect AI MCP server to query your live Azure Data Lake Storage data. Example prompts:
- "Show me all customers from the last 30 days"
- "What are my top performing products?"
- "Analyze sales trends for this quarter"
- "List all active projects and their current status"
Refer to the CData prompt library for more prompt ideas.
π Using natural language to explore your Azure Data Lake Storage data (Salesforce used here).
- Permit ChatGPT to access your Azure Data Lake Storage data. Click Query Data to continue, or Deny to refuse.
π Give permissions to ChatGPT.
- ChatGPT translates your natural language queries into SQL and executes them against your Azure Data Lake Storage data through the Connect AI MCP server.
π ChatGPT shows the desired results from your Azure Data Lake Storage data based on the prompts
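To make the last step more concrete, here is a rough sketch of the kind of message an MCP client sends when it runs a query. MCP is built on JSON-RPC 2.0 and invokes tools with the `tools/call` method; everything else in the example is an assumption for illustration: the tool name `queryData`, its `query` argument, and the connection, schema, and table names in the SQL are all hypothetical placeholders, and ChatGPT builds and sends the real request for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that uses the MCP "tools/call" method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical SQL that a prompt might be translated into; the
# connection, schema, and table names are placeholders.
sql = "SELECT * FROM [MyConnection].[MySchema].[MyTable] LIMIT 10"

# "queryData" and the "query" argument are assumed names, not confirmed here.
request = build_tool_call(1, "queryData", {"query": sql})

print(json.dumps(request, indent=2))
```

The sketch only constructs and prints the payload; it does not contact Connect AI or require credentials.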
Get CData Connect AI
To get live data access to hundreds of SaaS, Big Data, and NoSQL sources directly from your cloud applications, try CData Connect AI today!