LangGraph is a framework for building and visualizing intelligent, graph-based AI workflows that combine reasoning models (LLMs) with tool integrations and data operations. By integrating it with CData Connect AI, you can enable agents to securely access, query, and act on live enterprise data in real time via a standardized toolset.
CData Connect AI is a managed MCP platform that allows you to expose your data sources (such as Amazon Athena) through the Model Context Protocol (MCP). This means your AI agents can work with metadata, catalogs, tables, and SQL-enabled data access from hundreds of data sources, without complex ETL or custom integrations.
This article explores how to register the MCP endpoint in LangGraph, configure data source connectivity via CData Connect AI, and build a workflow that queries and visualizes live data (for example, Amazon Athena objects) on demand. It demonstrates how to use the built-in MCP toolset (getCatalogs, getSchemas, getTables, queryData, etc.) to enable natural-language agents to interact with your enterprise data securely and interactively.
CData provides the easiest way to access and integrate live data from Amazon Athena. Customers use CData connectivity to:
Users frequently integrate Athena with analytics tools like Tableau, Power BI, and Excel for in-depth analytics from their preferred tools.
To learn more about unique Amazon Athena use cases with CData, check out our blog post: https://www.cdata.com/blog/amazon-athena-use-cases.
Before LangGraph can access Amazon Athena, an Amazon Athena connection must be created in CData Connect AI. This connection is then exposed to LangGraph through the remote MCP server.
To authorize Amazon Athena requests, provide the credentials for an administrator account or for an IAM user with custom permissions: Set to the access key Id. Set to the secret access key.
Note: Though you can connect as the AWS account administrator, it is recommended to use IAM user credentials to access AWS services.
To obtain the credentials for an IAM user, follow the steps below:
To obtain the credentials for your AWS root account, follow the steps below:
If you are using the CData Data Provider for Amazon Athena 2018 from an EC2 Instance and have an IAM Role assigned to the instance, you can use the IAM Role to authenticate. To do so, set to true and leave and empty. The CData Data Provider for Amazon Athena 2018 will automatically obtain your IAM Role credentials and authenticate with them.
In many situations it may be preferable to use an IAM role for authentication instead of the direct security credentials of an AWS root user. An AWS role may be used instead by specifying the . This will cause the CData Data Provider for Amazon Athena 2018 to attempt to retrieve credentials for the specified role. If you are connecting to AWS (instead of already being connected such as on an EC2 instance), you must additionally specify the and of an IAM user to assume the role for. Roles may not be used when specifying the and of an AWS root user.
For users and roles that require Multi-factor Authentication, specify the and connection properties. This will cause the CData Data Provider for Amazon Athena 2018 to submit the MFA credentials in a request to retrieve temporary authentication credentials. Note that the duration of the temporary credentials may be controlled via the (default 3600 seconds).
In addition to the and properties, specify , and . Set to the region where your Amazon Athena data is hosted. Set to a folder in S3 where you would like to store the results of queries.
If is not set in the connection, the data provider connects to the default database set in Amazon Athena.
LangGraph authenticates to Connect AI using an account email and a Personal Access Token (PAT). Creating separate PATs for each integration is recommended to maintain granular access control.
With the Amazon Athena connection configured and a PAT generated, LangGraph is prepared to connect to Amazon Athena data through the CData MCP server.
Set up your project directory and install the dependencies required to connect LangGraph with CData Connect AI and to use an OpenAI LLM for reasoning. This setup enables LangGraph to call the Amazon Athena MCP server tools exposed by Connect AI while OpenAI handles the natural language reasoning.
mkdir LangGraph
cd LangGraph
pip install langgraph langchain langchain-openai langchain-mcp-adapters python-dotenv "langgraph-cli[inmem]"
LangGraph uses environment variables to connect to CData Connect AI and to define the API credentials and configuration settings. Store these credentials in a .env file to keep them out of the source code and reusable. The script loads this file at runtime with load_dotenv(), and the LangGraph CLI reads it through the "env" entry in langgraph.json, so the workflow can authenticate and communicate with the MCP server without hardcoding sensitive information.
# LangSmith (Optional)
LANGSMITH_API_KEY=lsv2_pt_xxxx #LangSmith API Key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=LangGraph-Demo
# MCP Configuration
MCP_BASE_URL=https://mcp.cloud.cdata.com/mcp #MCP Server URL
MCP_AUTH=base64encoded(EMAIL:PAT) #Base64 encoded Connect AI Email:PAT
OPENAI_API_KEY=sk-proj-xxxx
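Before running the workflow, it can help to confirm that the variables the script depends on are actually set. The following is a minimal sketch, not part of the CData or LangGraph tooling: the helper name missing_vars and the choice of which variables count as required are assumptions based on the values the script reads with os.getenv.

```python
import os
from typing import Mapping

# Variables the article's script reads; treating all three as required
# is an assumption made for this sketch.
REQUIRED_VARS = ("MCP_BASE_URL", "MCP_AUTH", "OPENAI_API_KEY")


def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

If you run this from the script itself, call it after load_dotenv() so the values from .env are already in the environment.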
Note: You can generate the Base64-encoded authorization string with any Base64 encoding tool. Encode your CData Connect AI email and PAT (obtained in the prerequisites), joined by a colon in the form EMAIL:PAT.
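The encoded value can also be produced locally with Python's standard library, which avoids pasting a PAT into a web page. This is a small sketch; the function name build_mcp_auth and the sample credentials are placeholders chosen for illustration.

```python
import base64


def build_mcp_auth(email: str, pat: str) -> str:
    """Return the Base64 encoding of 'email:pat' for the MCP_AUTH variable."""
    raw = f"{email}:{pat}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Placeholder values; substitute your own Connect AI email and PAT.
    print(build_mcp_auth("user@example.com", "my-pat"))
```

Copy the printed string into MCP_AUTH in the .env file. The script then sends it in the Authorization header with the Basic prefix.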
In this step, create a Python script that connects LangGraph to your CData Connect AI MCP server. The script retrieves the available MCP tools, such as getCatalogs, getSchemas, and queryData, builds a LangGraph workflow, and runs a natural language prompt against your connected Amazon Athena data.
The workflow uses MCP to securely retrieve live Amazon Athena data from Connect AI and uses OpenAI GPT-4o to interpret and reason over that data. The script also exposes the graph so you can visualize it later in LangGraph Studio.
Create a new Python file named test.py inside your LangGraph project folder.
Paste the following script into test.py:
import asyncio
import os
from typing_extensions import TypedDict, Annotated

from dotenv import load_dotenv
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain.agents import create_agent
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI
from langchain_core.messages import BaseMessage, HumanMessage

# Load environment variables from .env
load_dotenv()


# Define the agent state
class AgentState(TypedDict):
    # add_messages merges by message ID, so the messages the agent node
    # returns are not appended a second time (plain operator.add would
    # duplicate the input message).
    messages: Annotated[list[BaseMessage], add_messages]


# Define the async agent logic
async def run_agent(user_prompt: str) -> str:
    MCP_BASE_URL = os.getenv("MCP_BASE_URL")
    MCP_AUTH = os.getenv("MCP_AUTH")
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

    print("Connecting to the MCP server...")
    mcp_client = MultiServerMCPClient(
        connections={
            "default": {
                "transport": "streamable_http",
                "url": MCP_BASE_URL,
                # MCP_AUTH is the Base64-encoded EMAIL:PAT string
                "headers": {"Authorization": f"Basic {MCP_AUTH}"} if MCP_AUTH else {},
            }
        }
    )

    print("Loading available MCP tools...")
    all_mcp_tools = await mcp_client.get_tools()
    print(f"Loaded tools: {[tool.name for tool in all_mcp_tools]}")

    # Initialize the LLM
    llm = ChatOpenAI(model="gpt-4o", temperature=0.2, api_key=OPENAI_API_KEY)

    print("Creating the LangGraph agent...")
    agent = create_agent(
        model=llm,
        tools=all_mcp_tools,
        system_prompt="You are a helpful assistant. Use tools when needed.",
    )

    # Build the workflow graph: START -> agent -> END
    builder = StateGraph(AgentState)
    builder.add_node("agent", agent)
    builder.add_edge(START, "agent")
    builder.add_edge("agent", END)
    graph_instance = builder.compile()

    print(f"Processing user query: {user_prompt}\n")
    initial_state = {"messages": [HumanMessage(content=user_prompt)]}
    result = await graph_instance.ainvoke(initial_state)

    # The last message in the state is the agent's final answer
    final_answer = result["messages"][-1].content
    print(f"Agent Response:\n{final_answer}")
    return final_answer


# Expose the graph for visualization.
# Note: this module-level graph is built with tools=[], so it has the same
# structure as the graph above but no MCP tools attached.
builder = StateGraph(AgentState)
builder.add_node(
    "agent",
    create_agent(
        model=ChatOpenAI(model="gpt-4o", temperature=0.2, api_key=os.getenv("OPENAI_API_KEY")),
        tools=[],
        system_prompt="You are a helpful assistant.",
    ),
)
builder.add_edge(START, "agent")
builder.add_edge("agent", END)
graph = builder.compile()


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--serve", action="store_true", help="Run visualization server")
    args = parser.parse_args()

    if args.serve:
        print("To visualize the graph, run:")
        print("langgraph dev")
    else:
        asyncio.run(run_agent("List the first 2 catalogs available"))
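The script hands every tool returned by the MCP server to the agent. If you would rather pass only a known subset, one option is to filter the loaded tools by name before creating the agent. The sketch below is illustrative only: the allowlist and the select_tools helper are assumptions, not a Connect AI or LangGraph feature, and it relies only on each tool object having a name attribute, as the script's own print statement does.

```python
from types import SimpleNamespace
from typing import Iterable

# Tool names taken from the toolset named in this article; the allowlist
# itself is an illustrative assumption.
ALLOWED_TOOL_NAMES = frozenset({"getCatalogs", "getSchemas", "getTables", "queryData"})


def select_tools(tools: Iterable, allowed: frozenset = ALLOWED_TOOL_NAMES) -> list:
    """Keep only tools whose .name is in the allowlist, preserving order."""
    return [tool for tool in tools if tool.name in allowed]


if __name__ == "__main__":
    # Stand-in objects; in test.py these would come from mcp_client.get_tools().
    # "someOtherTool" is a made-up name used only to show the filtering.
    loaded = [SimpleNamespace(name=n) for n in ("getCatalogs", "someOtherTool", "queryData")]
    print([t.name for t in select_tools(loaded)])  # ['getCatalogs', 'queryData']
```

In test.py, the filtered list would replace all_mcp_tools in the tools argument passed to create_agent.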
Configure the LangGraph project so the CLI recognizes the workflow graph and environment settings. Create a configuration file that registers the graph for use in LangGraph Studio or during local visualization runs.
Create a new file named langgraph.json in your project directory.
Use the content below in the langgraph.json file:
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./test.py:graph"
  },
  "env": ".env"
}
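Each entry under "graphs" uses the form path:variable, pointing at a Python file and the name of the compiled graph inside it. A quick local check can catch a mistyped path before the development server is started. This is a hedged sketch: check_langgraph_config is a hypothetical helper written for this article, and it only confirms that the file exists and that the entry has both parts; it does not import the module.

```python
import json
from pathlib import Path


def check_langgraph_config(config_path: str = "langgraph.json") -> dict[str, tuple[str, str]]:
    """Parse langgraph.json and return {graph name: (file path, variable name)}."""
    config_file = Path(config_path)
    config = json.loads(config_file.read_text(encoding="utf-8"))
    resolved = {}
    for graph_name, spec in config["graphs"].items():
        # Split on the last colon so the file path stays intact
        file_part, sep, variable = spec.rpartition(":")
        if not sep or not file_part or not variable:
            raise ValueError(f"Graph '{graph_name}' should look like path:variable, got {spec!r}")
        if not (config_file.parent / file_part).is_file():
            raise FileNotFoundError(f"Graph '{graph_name}' points at a missing file: {file_part}")
        resolved[graph_name] = (file_part, variable)
    return resolved


if __name__ == "__main__":
    print(check_langgraph_config())
```

Run it from the project directory that contains langgraph.json and test.py.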
Run the LangGraph development server to view and interact with your workflow in LangGraph Studio. This allows direct visualization of how the agent processes prompts, invokes tools, and retrieves Amazon Athena data through the MCP server.
Open a terminal in your project directory and run:
langgraph dev
After the server starts, LangGraph launches a local API and provides a link to the Studio UI:
https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
The link normally opens automatically when the command runs. If it does not, open it in your browser to load the LangGraph Studio dashboard.
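The Studio link is the Studio address with the local API address passed in the baseUrl query parameter. If your local server is listening on a different host or port, the link can be rebuilt by hand or with a few lines of Python. The helper below is a sketch that assumes only the URL pattern shown above; the name studio_url is hypothetical.

```python
from urllib.parse import urlencode

# Studio address as it appears in the link printed by the development server
STUDIO_ROOT = "https://smith.langchain.com/studio/"


def studio_url(base_url: str = "http://127.0.0.1:2024") -> str:
    """Build the Studio link for a locally running LangGraph API server."""
    # Keep ':' and '/' unescaped so the result matches the printed link
    query = urlencode({"baseUrl": base_url}, safe=":/")
    return f"{STUDIO_ROOT}?{query}"


if __name__ == "__main__":
    print(studio_url())
```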
In the Studio interface, enter a natural language prompt such as:
Show all Amazon Athena tables available in my catalog
LangGraph displays a real-time visualization of the agent's reasoning flow, showing how it interprets the prompt, calls the appropriate MCP tools, and retrieves live data from Amazon Athena.
To get live data access to hundreds of SaaS, Big Data, and NoSQL sources directly from your cloud applications, try CData Connect AI today!
Learn more about CData Connect AI or sign up for free trial access.