LlamaIndex is a data framework for building LLM applications — agents, RAG pipelines, and structured workflows that reason over external data. By integrating LlamaIndex with CData Connect AI through the built-in MCP Server, your agents can discover and query live Presto data as native tools without writing custom connectors.
CData Connect AI offers a secure, low-code environment to connect Presto and other data sources, removing the need for complex ETL and enabling seamless automation across business applications with live data.
This article outlines how to configure Presto connectivity in CData Connect AI, register the MCP server with LlamaIndex, and build a ReAct agent that queries Presto data in real time.
Accessing and integrating live data from Trino and Presto SQL engines has never been easier with CData. Customers rely on CData connectivity to:
Presto and Trino allow users to access a variety of underlying data sources through a single endpoint. When paired with CData connectivity, users get pure, SQL-92 access to their instances, allowing them to integrate business data with a data warehouse or easily access live data directly from their preferred tools, like Power BI and Tableau.
In many cases, CData's live connectivity surpasses the native import functionality available in tools. One customer was unable to effectively use Power BI due to the size of the datasets needed for reporting. When the company implemented the CData Power BI Connector for Presto they were able to generate reports in real-time using the DirectQuery connection mode.
Before LlamaIndex can access Presto, a Presto connection must be created in CData Connect AI. This connection is then exposed to LlamaIndex through the remote MCP server.
Set the Server and Port connection properties to connect, in addition to any authentication properties that may be required.
To enable TLS/SSL, set UseSSL to true.
In order to authenticate with LDAP, set the following connection properties:
In order to authenticate with KERBEROS, set the following connection properties:
LlamaIndex authenticates to Connect AI using an account email and a Personal Access Token (PAT). Creating separate PATs for each integration is recommended to maintain access control granularity.
With the Presto connection configured and a PAT generated, LlamaIndex is prepared to connect to Presto data through the CData MCP server.
To connect LlamaIndex with the CData Connect AI Remote MCP Server and use OpenAI for reasoning, configure your MCP server endpoint and authentication in a config.py file. These values let LlamaIndex’s MCP tool spec call the MCP server tools, while OpenAI handles the natural language reasoning.
Create two files: config.py and llamaindex_agent.py.
In config.py, define your MCP server URL and your Base64-encoded CData Connect AI email and PAT (obtained in the prerequisites):
class Config:
    MCP_BASE_URL = "https://mcp.cloud.cdata.com/mcp"  # MCP Server URL
    MCP_AUTH = "base64encoded(EMAIL:PAT)"  # Base64 encoded Connect AI Email:PAT
Note: You can create the base64 encoded version of MCP_AUTH using any Base64 encoding tool.
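If you prefer not to paste credentials into an online encoder, the value can also be produced locally. The following is a minimal sketch using Python's standard base64 module; the function name encode_basic_credentials and the placeholder email and PAT are illustrative assumptions, not part of CData's tooling.

```python
import base64


def encode_basic_credentials(email: str, pat: str) -> str:
    """Return the Base64 encoding of 'email:pat', the format expected for MCP_AUTH."""
    raw = f"{email}:{pat}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Placeholder values; substitute your own Connect AI email and PAT.
    print(encode_basic_credentials("user@example.com", "YOUR_PAT"))
```

Paste the printed string into MCP_AUTH in config.py. Encoding locally keeps the PAT off third-party websites.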
In llamaindex_agent.py, wire up the MCP tool spec and a ReAct agent:

"""
Integrates a LlamaIndex ReAct agent with the CData Connect AI MCP server.
The script discovers MCP tools, wraps them as LlamaIndex tools, and runs an
agent loop driven by OpenAI for reasoning.
"""
import asyncio

from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
from llama_index.core.agent.workflow import ReActAgent
from llama_index.llms.openai import OpenAI

from config import Config


async def main():
    # Initialize the MCP client pointed at Connect AI
    mcp_client = BasicMCPClient(
        Config.MCP_BASE_URL,
        headers={"Authorization": f"Basic {Config.MCP_AUTH}"},
    )

    # Discover tools the MCP server exposes (getCatalogs, queryData, etc.)
    tool_spec = McpToolSpec(client=mcp_client)
    tools = await tool_spec.to_tool_list_async()
    print("Discovered MCP tools:", [t.metadata.name for t in tools])

    # Configure the LLM that drives the ReAct loop
    llm = OpenAI(
        model="gpt-4o",
        temperature=0.2,
        api_key="YOUR_OPENAI_API_KEY",  # https://platform.openai.com/
    )

    # Build the agent with the MCP-backed tools
    agent = ReActAgent(tools=tools, llm=llm)

    user_prompt = "How many tables are available in Presto1?"  # Change as needed
    print(f"\nUser prompt: {user_prompt}")

    response = await agent.run(user_prompt)
    print("Agent final response:", response)


if __name__ == "__main__":
    asyncio.run(main())
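The script above passes the OpenAI key as a string literal placeholder. One common alternative is to read it from an environment variable so the key stays out of source control. The sketch below assumes the variable is named OPENAI_API_KEY; the helper name read_openai_key is hypothetical and not part of LlamaIndex or the OpenAI SDK.

```python
import os


def read_openai_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the given mapping (os.environ by default)."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of a later authentication error
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the agent."
        )
    return key
```

With this helper in place, the api_key argument in the script could be written as api_key=read_openai_key() instead of the hard-coded placeholder.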
Since this workflow uses LlamaIndex together with the CData Connect AI MCP server and OpenAI for reasoning, install the required Python packages.
Run the following command in your project terminal:
pip install llama-index llama-index-tools-mcp llama-index-llms-openai
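After installation, a quick check that the modules imported by llamaindex_agent.py can be located may save a failed first run. This is a minimal sketch using the standard importlib module; the helper name missing_modules is an illustrative assumption, and the module list simply mirrors the import statements in the script.

```python
import importlib.util

# Import paths used by llamaindex_agent.py
REQUIRED_MODULES = [
    "llama_index.core",
    "llama_index.tools.mcp",
    "llama_index.llms.openai",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package is absent
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", missing if missing else "none")
```

If any module is reported missing, rerun the pip command in the same virtual environment that you use to run the agent.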
Run python llamaindex_agent.py to execute the script.
The agent discovers the available MCP tools, calls queryData against Presto, and responds with the result.
To get live data access to hundreds of SaaS, Big Data, and NoSQL sources directly from your cloud applications, try CData Connect AI today!
Learn more about CData Connect AI or sign up for free trial access.