LangGraph has become the go-to framework for building production-grade AI agents in Python. As of April 10, 2026, the current release is version 1.1.6, and LangGraph has over 126,000 GitHub stars (claim dated April 8, 2026). It gives developers a graph-based approach to orchestrating multi-step, stateful AI workflows that go far beyond simple prompt-response chains.[2] This LangGraph tutorial walks you through building a fully functional autonomous AI agent from scratch in 14 detailed steps, complete with tool integration, memory persistence, human-in-the-loop controls, and deployment-ready patterns.
Last updated: April 10, 2026
Unlike basic LLM wrappers that process a single prompt and return a response, LangGraph lets you define directed graphs where each node is a computation step and edges control the flow of execution. This architecture supports cycles, branching, conditional logic, and persistent state – the essentials for agents that can reason, plan, and act across multiple steps. Whether you are building a research assistant, a customer support bot, or a code generation pipeline, this tutorial gives you the foundation to ship real AI agents in 2026.
Prerequisites and Environment Setup
Before diving into this LangGraph tutorial, make sure your development environment meets the following requirements. LangGraph 1.1.x dropped support for Python 3.9 and added Python 3.14 compatibility, so you need a modern Python installation. Every dependency listed below has been tested together as of April 2026 to ensure compatibility.
| Requirement | Minimum Version | Recommended Version | Purpose |
|---|---|---|---|
| Python | 3.10 | 3.12+ | Runtime environment |
| langgraph | 1.0.0 | 1.1.6 | Graph orchestration framework |
| langchain-core | 0.3.0 | 0.3.29 | Base abstractions and message types |
| langchain-openai | 0.2.0 | 0.3.8 | OpenAI model integration |
| langchain-anthropic | 0.3.0 | 0.3.10 | Claude model integration (optional) |
| langgraph-checkpoint-sqlite | 2.0.0 | 2.0.6 | Persistent state checkpointing |
| python-dotenv | 1.0.0 | 1.0.1 | Environment variable management |
| pip | 23.0 | 24.3+ | Package installer |
You will also need an API key from at least one LLM provider. This tutorial uses OpenAI’s GPT-4o as the default model, but every example can be swapped to use Anthropic Claude or any other LangChain-compatible provider. Estimated cost for completing all tutorial steps: under $2.00 in API calls.
Hardware requirements are minimal. LangGraph runs the orchestration logic locally while offloading inference to cloud APIs, so any machine with 4 GB of RAM and a stable internet connection is sufficient. If you plan to use local models via Ollama instead, you will need at least 16 GB of RAM and a GPU with 8 GB VRAM.
Step 1: Install LangGraph and Create the Project Structure
Start by creating a clean project directory and virtual environment. Isolating your dependencies prevents conflicts with other Python projects and ensures reproducible builds. The following commands work on macOS, Linux, and Windows (WSL).
mkdir langgraph-agent && cd langgraph-agent
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install langgraph==1.1.6 \
  langchain-core==0.3.29 \
  langchain-openai==0.3.8 \
  langgraph-checkpoint-sqlite==2.0.6 \
  python-dotenv==1.0.1
# Verify installation (langgraph is a namespace package, so read the version from package metadata)
python -c "from importlib.metadata import version; print('LangGraph', version('langgraph'), 'installed')"
Expected output:
LangGraph 1.1.6 installed
Next, create the project file structure. Organizing your code into separate modules from the start makes it easier to add nodes, tools, and configuration as your agent grows. This is the structure we will build throughout the tutorial:
langgraph-agent/
├── .env                  # API keys
├── agent/
│   ├── __init__.py
│   ├── graph.py          # Main graph definition
│   ├── nodes.py          # Node functions
│   ├── tools.py          # Tool definitions
│   ├── state.py          # State schema
│   └── prompts.py        # System prompts
├── tests/
│   └── test_agent.py     # Integration tests
└── main.py               # Entry point
Create the directories and the .env file:
mkdir -p agent tests
touch agent/__init__.py
# Create .env file with your API key
echo "OPENAI_API_KEY=sk-your-key-here" > .env
Common Pitfall #1: Do not install langgraph and langchain without pinning versions. LangGraph 1.1.x requires langchain-core>=0.3.0, and mixing major versions causes import errors like ImportError: cannot import name 'BaseMessage' from 'langchain_core.messages'. Always pin your versions as shown above.
Step 2: Define the Agent State Schema
Every LangGraph application starts with a state definition. The state is a typed dictionary or Pydantic model that flows through your graph – each node reads from it and writes updates back to it. Think of state as the shared memory that connects all your agent’s reasoning steps. LangGraph uses reducers to control how state updates merge, which is critical for list fields like message histories.
Create agent/state.py:
"""Agent state schema with typed fields and reducers."""
from typing import Annotated, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages


class AgentState(TypedDict):
    """State that flows through the agent graph.

    Attributes:
        messages: Conversation history with reducer that appends new messages.
        tool_calls_count: Tracks how many tools the agent has called.
        final_answer: The agent's synthesized response to the user.
        iteration: Current reasoning loop iteration (for safety limits).
    """

    messages: Annotated[list[BaseMessage], add_messages]
    tool_calls_count: int
    final_answer: str
    iteration: int
The Annotated[list[BaseMessage], add_messages] pattern is the most important concept in this LangGraph tutorial. The add_messages reducer tells LangGraph to append new messages to the existing list rather than replacing it. Without the reducer, each node would overwrite the entire conversation history. Other fields like tool_calls_count and iteration use the default reducer, which means the latest value wins.
LangGraph 1.1 introduced Pydantic and dataclass coercion, so you can alternatively define state using a Pydantic BaseModel. However, TypedDict remains the recommended approach for most use cases because it has lower overhead and works smoothly with all LangGraph features including checkpointing and streaming.
Common Pitfall #2: Forgetting the add_messages reducer on your messages field is the single most common LangGraph bug. Without it, your agent loses its conversation context after every node execution, leading to responses like “I don’t have any context about what you asked.” Always annotate message list fields with the reducer.
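To make the difference between a reducer field and a plain field concrete, here is a minimal pure-Python sketch of the merge step. It is a simplified model and not LangGraph's implementation: `append_reducer`, `REDUCERS`, and `apply_update` are hypothetical names, and the real add_messages reducer does more (for example, it can replace an existing message that has the same ID).

```python
from typing import Any, Callable


def append_reducer(existing: list, new: list) -> list:
    """Simplified stand-in for add_messages: concatenate instead of replace."""
    return existing + new


# Hypothetical reducer table: fields without an entry are simply overwritten.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {"messages": append_reducer}


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state, field by field."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(state.get(key, []), value) if reducer else value
    return merged


state = {"messages": ["user: hi"], "iteration": 0}
state = apply_update(state, {"messages": ["ai: hello"], "iteration": 1})
print(state)  # {'messages': ['user: hi', 'ai: hello'], 'iteration': 1}
```

Removing the "messages" entry from the reducer table reproduces the bug described above: the second update would leave only the newest message in the list.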
Step 3: Create Tools for the Agent
Tools give your agent the ability to interact with the outside world – searching the web, querying databases, calling APIs, or performing calculations. In LangGraph, tools are standard Python functions decorated with @tool from langchain_core. The LLM decides when to call them based on the function name, docstring, and parameter types.
Create agent/tools.py with three practical tools:
"""Tools that the agent can call during execution."""
import json
import urllib.request
from datetime import datetime

from langchain_core.tools import tool


@tool
def get_current_time() -> str:
    """Get the current date and time in ISO format.

    Useful when the user asks about the current date, time, or when
    something is happening relative to now.
    """
    return datetime.now().isoformat()


@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result.

    Supports basic arithmetic: +, -, *, /, **, %, parentheses.
    Use this when the user needs any calculation performed.

    Args:
        expression: A mathematical expression like '2 + 2' or '(10 * 5) / 3'
    """
    allowed_chars = set("0123456789+-*/.() %")
    if not all(c in allowed_chars for c in expression.replace(" ", "")):
        return "Error: Expression contains invalid characters."
    try:
        result = eval(expression, {"__builtins__": {}})  # noqa: S307
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"


@tool
def search_web(query: str) -> str:
    """Search the web for current information about a topic.

    Use this when you need up-to-date facts, news, or data that
    might not be in your training data.

    Args:
        query: The search query string
    """
    # In production, replace with a real search API (Tavily, SerpAPI, etc.)
    # This is a mock for tutorial purposes
    return json.dumps({
        "query": query,
        "results": [
            {
                "title": f"Latest information about {query}",
                "snippet": f"Here are the most recent findings about {query} as of 2026.",
                "url": f"https://example.com/search?q={query.replace(' ', '+')}"
            }
        ],
        "note": "Replace this mock with Tavily or SerpAPI for production use."
    })


# Collect all tools into a list for the graph
agent_tools = [get_current_time, calculate, search_web]
Each tool’s docstring is critical – it serves as the instruction manual that the LLM reads to decide when and how to call the tool. Write clear, specific docstrings that describe when the tool should be used and what it returns. Vague descriptions lead to incorrect tool selection.
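Common Pitfall #3: Using Python’s built-in eval() for calculation tools without input sanitization is a security vulnerability. The implementation above restricts allowed characters to digits and operators, which blocks names and function calls but still lets an input such as 9**9**9**9 tie up the process. In production, prefer an evaluator that never calls eval(), for example one that parses the expression with Python’s ast module and permits only numeric literals and arithmetic operators. Be careful with drop-in libraries: sympy’s sympify, for instance, relies on eval() internally and is not meant for untrusted input.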
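One way to avoid eval() entirely is to walk the parsed syntax tree and allow only arithmetic nodes. The sketch below uses only the standard library; `safe_eval` and `MAX_EXPONENT` are hypothetical names, and the exponent cap is an illustrative limit rather than a complete defense against expensive inputs.

```python
import ast
import operator

# Whitelisted operators; any other syntax node is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
MAX_EXPONENT = 100  # assumed cap to keep results from exploding


def safe_eval(expression: str) -> float:
    """Evaluate basic arithmetic by walking the AST instead of calling eval()."""

    def _eval(node: ast.AST):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept only plain int/float literals (bool, str, etc. are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            left, right = _eval(node.left), _eval(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
                raise ValueError("exponent too large")
            return _BIN_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval("47 * 89"))  # 4183
```

Inside the calculate tool, the call to eval() could be swapped for safe_eval() while keeping the existing try/except, so that a ValueError, SyntaxError, or ZeroDivisionError still comes back to the model as an error string.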
Step 4: Build the Graph Nodes
Nodes are the core building blocks of a LangGraph application. Each node is a function that takes the current state, performs some computation, and returns a partial state update. LangGraph merges these updates into the graph state using the reducers you defined. The three fundamental node types in most agent graphs are: the reasoning node (calls the LLM), the tool execution node (runs tools), and the output node (formats the final response).
Create agent/nodes.py:
"""Node functions for the agent graph."""
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, ToolMessage
from langgraph.prebuilt import ToolNode

from agent.state import AgentState
from agent.tools import agent_tools

load_dotenv()

# Initialize the LLM with tool binding
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    api_key=os.getenv("OPENAI_API_KEY"),
).bind_tools(agent_tools)

SYSTEM_PROMPT = """You are a helpful AI assistant with access to tools.
Use tools when you need current information, calculations, or web data.
Always explain your reasoning before and after using tools.
If you cannot find an answer, say so honestly rather than making one up.
Limit yourself to 5 tool calls per conversation to stay efficient."""


def reasoning_node(state: AgentState) -> dict:
    """Call the LLM to reason about the next step.

    Reads the current messages and decides whether to:
    1. Call a tool for more information
    2. Provide a final answer to the user
    """
    messages = [SystemMessage(content=SYSTEM_PROMPT)] + state["messages"]
    response = llm.invoke(messages)
    return {
        "messages": [response],
        "iteration": state.get("iteration", 0) + 1,
    }


# Use LangGraph's built-in ToolNode for automatic tool execution
tool_node = ToolNode(agent_tools)


def output_node(state: AgentState) -> dict:
    """Extract the final answer from the last AI message."""
    last_message = state["messages"][-1]
    return {
        "final_answer": last_message.content,
    }
The reasoning_node prepends the system prompt, sends the full message history to the LLM, and appends the response. The ToolNode from langgraph.prebuilt automatically handles tool execution – it reads tool call requests from the last AI message, executes the matching tools, and returns ToolMessage objects with the results. The output_node extracts the final text response for easy access.
Notice that bind_tools() is called on the LLM instance, not on the node. This tells the model about available tools and their schemas so it can generate structured tool call requests in its output. The model does not actually execute the tools – that happens in the separate tool_node.
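The split between requesting a tool and running it can be illustrated without any LLM. The sketch below is a simplified model of what a tool execution node does, not ToolNode's actual code: `dispatch_tool_calls` and the `registry` mapping are hypothetical, while the name, args, and id keys mirror the tool call fields used elsewhere in this tutorial.

```python
from typing import Any, Callable


def dispatch_tool_calls(
    tool_calls: list[dict[str, Any]],
    registry: dict[str, Callable[..., Any]],
) -> list[dict[str, str]]:
    """Run each requested tool and pair the result with its call id."""
    results = []
    for call in tool_calls:
        fn = registry.get(call["name"])
        if fn is None:
            content = f"Error: Unknown tool '{call['name']}'"
        else:
            # The model supplies only names and arguments; execution happens here.
            content = str(fn(**call["args"]))
        results.append({"tool_call_id": call["id"], "content": content})
    return results


requests = [
    {"name": "add", "args": {"a": 2, "b": 3}, "id": "call_1"},
    {"name": "missing", "args": {}, "id": "call_2"},
]
print(dispatch_tool_calls(requests, {"add": lambda a, b: a + b}))
```

Each result carries the id of the call it answers, which is how the model matches tool outputs to its own requests on the next reasoning pass.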
Step 5: Wire the Graph Together with Conditional Edges
This is where the graph-based architecture of LangGraph shines. Unlike linear chains where Step A always leads to Step B, graphs support conditional edges that route execution based on the current state. For an agent, the key routing decision is: did the LLM request a tool call, or did it produce a final answer? This conditional edge creates the agent loop – reason, act, observe, repeat.
Create agent/graph.py:
"""Main graph definition – wires nodes and edges together."""
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.sqlite import SqliteSaver

from agent.state import AgentState
from agent.nodes import reasoning_node, tool_node, output_node

MAX_ITERATIONS = 10


def should_continue(state: AgentState) -> str:
    """Route after the reasoning node.

    Returns:
        'tools' if the LLM wants to call a tool
        'output' if the LLM produced a final answer
        'output' if we hit the iteration safety limit
    """
    last_message = state["messages"][-1]
    # Safety limit: prevent infinite loops
    if state.get("iteration", 0) >= MAX_ITERATIONS:
        return "output"
    # Check if the LLM requested tool calls
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return "output"


def build_agent_graph(checkpointer=None):
    """Construct and compile the agent graph.

    Args:
        checkpointer: Optional checkpointer for state persistence.

    Returns:
        A compiled LangGraph that can be invoked or streamed.
    """
    graph = StateGraph(AgentState)
    # Add nodes
    graph.add_node("reasoning", reasoning_node)
    graph.add_node("tools", tool_node)
    graph.add_node("output", output_node)
    # Define edges
    graph.add_edge(START, "reasoning")
    graph.add_conditional_edges(
        "reasoning",
        should_continue,
        {"tools": "tools", "output": "output"},
    )
    graph.add_edge("tools", "reasoning")  # Loop back after tool execution
    graph.add_edge("output", END)
    # Compile with optional checkpointer
    return graph.compile(checkpointer=checkpointer)


# Default graph instance without persistence
agent = build_agent_graph()
The graph flow is: START → reasoning → (tools → reasoning)* → output → END. The asterisk indicates that the tool-reasoning loop can repeat multiple times. The should_continue function acts as the router, inspecting the LLM’s output to decide the next step. The MAX_ITERATIONS safety limit prevents runaway loops where the agent keeps calling tools without producing an answer – a common problem with unbounded agent architectures.
Common Pitfall #4: Not setting a maximum iteration limit causes your agent to loop indefinitely in edge cases, burning through API credits. The MAX_ITERATIONS = 10 safety valve in should_continue is essential. In production, also add a timeout at the HTTP layer.
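The effect of the safety limit can be checked without calling a model. This sketch simulates the reason/act loop in plain Python with a stand-in "LLM" that never stops asking for tools. `stubborn_reasoning` and `run_loop` are hypothetical helpers, and SimpleNamespace objects stand in for real message classes.

```python
from types import SimpleNamespace

MAX_ITERATIONS = 10


def should_continue(state: dict) -> str:
    """Same routing rule as the tutorial's router, applied to a plain dict."""
    last_message = state["messages"][-1]
    if state.get("iteration", 0) >= MAX_ITERATIONS:
        return "output"
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return "output"


def stubborn_reasoning(state: dict) -> dict:
    """Hypothetical stand-in for an LLM that always requests another tool call."""
    msg = SimpleNamespace(
        content="",
        tool_calls=[{"name": "search_web", "args": {}, "id": "call_x"}],
    )
    return {
        "messages": state["messages"] + [msg],
        "iteration": state["iteration"] + 1,
    }


def run_loop() -> int:
    """Run reasoning until the router sends execution to the output node."""
    state = {
        "messages": [SimpleNamespace(content="hi", tool_calls=[])],
        "iteration": 0,
    }
    while True:
        state = stubborn_reasoning(state)
        if should_continue(state) == "output":
            return state["iteration"]
        # The tools node would run here before looping back to reasoning.


print(run_loop())  # 10
```

Without the iteration check, this loop would never terminate, which is the runaway behavior the limit is there to prevent.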
Step 6: Run the Agent and Handle Streaming Output
With the graph built, you can now invoke the agent and see it in action. LangGraph 1.1 introduced type-safe streaming with version="v2", which produces unified StreamPart output with type, ns, and data keys. This makes it much easier to build real-time UIs that display agent reasoning as it happens. The example in this step sticks to the simpler stream_mode="updates", which yields one dictionary per node execution, keyed by node name.
Create main.py:
"""Entry point – run the agent with streaming output."""
from langchain_core.messages import HumanMessage

from agent.graph import build_agent_graph


def run_agent(user_input: str):
    """Run the agent and stream output to the console."""
    graph = build_agent_graph()
    initial_state = {
        "messages": [HumanMessage(content=user_input)],
        "tool_calls_count": 0,
        "final_answer": "",
        "iteration": 0,
    }
    print(f"\n{'='*60}")
    print(f"User: {user_input}")
    print(f"{'='*60}\n")
    # Stream execution – see each node as it runs
    for event in graph.stream(initial_state, stream_mode="updates"):
        for node_name, node_output in event.items():
            print(f"[Node: {node_name}]")
            if node_name == "reasoning":
                msg = node_output["messages"][-1]
                if hasattr(msg, "tool_calls") and msg.tool_calls:
                    for tc in msg.tool_calls:
                        print(f"  → Calling tool: {tc['name']}({tc['args']})")
                else:
                    print(f"  → Response: {msg.content[:200]}...")
            elif node_name == "tools":
                for msg in node_output.get("messages", []):
                    print(f"  → Tool result: {msg.content[:150]}...")
            elif node_name == "output":
                print(f"  → Final: {node_output.get('final_answer', '')[:300]}...")
            print()
    # Get final state with invoke for the complete answer.
    # Note: this runs the whole graph a second time, repeating the LLM calls.
    result = graph.invoke(initial_state)
    print(f"\n{'='*60}")
    print(f"Final Answer:\n{result['final_answer']}")
    print(f"Iterations: {result['iteration']}")
    print(f"{'='*60}")


if __name__ == "__main__":
    run_agent("What is 47 * 89 and what time is it right now?")
Expected output when running python main.py:
============================================================
User: What is 47 * 89 and what time is it right now?
============================================================
[Node: reasoning]
  → Calling tool: calculate({'expression': '47 * 89'})
[Node: tools]
  → Tool result: 4183...
[Node: reasoning]
  → Calling tool: get_current_time({})
[Node: tools]
  → Tool result: 2026-04-08T14:32:07.123456...
[Node: reasoning]
  → Response: 47 × 89 equals **4,183**, and the current time is...
[Node: output]
  → Final: 47 × 89 equals **4,183**, and the current time is April 8, 2026...
============================================================
Final Answer:
47 × 89 equals **4,183**, and the current time is April 8, 2026 at 2:32 PM.
Iterations: 3
============================================================
The streaming output shows the agent’s reasoning process in real time: it first decides to call the calculator, receives the result, then calls the time tool, and finally synthesizes both results into a coherent answer. Three iterations means three passes through the reasoning node. The exact sequence varies between runs: the model may request both tools in a single turn, in which case you will see two iterations instead of three, and the timestamp reflects your machine’s local clock.
Step 7: Add Persistent Memory with Checkpointing
One of LangGraph’s most powerful features is built-in checkpointing. A checkpointer saves the full graph state after every node execution, enabling conversation memory across sessions, time travel debugging, and fault recovery. LangGraph supports SQLite, PostgreSQL, and in-memory checkpointers. For this tutorial, we use SQLite because it requires zero infrastructure.
Update main.py to use persistent memory:
"""Entry point with persistent memory across conversations."""
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.sqlite import SqliteSaver

from agent.graph import build_agent_graph


def run_with_memory():
    """Run the agent with SQLite-backed conversation memory."""
    # SqliteSaver persists state to a local database file
    with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
        graph = build_agent_graph(checkpointer=checkpointer)
        # Thread ID groups messages into a conversation
        config = {"configurable": {"thread_id": "user-session-001"}}
        # First message. "iteration" is reset on every turn; otherwise the
        # checkpointed counter keeps growing and eventually hits MAX_ITERATIONS.
        result1 = graph.invoke(
            {
                "messages": [HumanMessage(content="My name is Alex and I work at Acme Corp.")],
                "iteration": 0,
            },
            config=config,
        )
        print(f"Agent: {result1['final_answer']}\n")
        # Second message – agent remembers context from first
        result2 = graph.invoke(
            {
                "messages": [HumanMessage(content="What company do I work at?")],
                "iteration": 0,
            },
            config=config,
        )
        print(f"Agent: {result2['final_answer']}\n")
        # Different thread – no memory of previous conversation
        config2 = {"configurable": {"thread_id": "user-session-002"}}
        result3 = graph.invoke(
            {
                "messages": [HumanMessage(content="What company do I work at?")],
                "iteration": 0,
            },
            config=config2,
        )
        print(f"Agent (new thread): {result3['final_answer']}")


if __name__ == "__main__":
    run_with_memory()
Expected output:
Agent: Nice to meet you, Alex! I'll remember that you work at Acme Corp.
Agent: You work at Acme Corp, Alex!
Agent (new thread): I don't have any information about where you work.
Could you tell me?
The thread_id parameter is the key concept. All invocations with the same thread ID share state, creating a continuous conversation. Different thread IDs create isolated sessions. The SQLite checkpointer writes state to checkpoints.db, so conversations survive process restarts. In production, swap SqliteSaver for PostgresSaver for multi-server deployments.
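As a mental model for thread isolation, the following standard-library sketch keeps one message list per thread ID in SQLite. It is a deliberately simplified toy, not SqliteSaver's schema or API: `ToyCheckpointStore` and its table layout are hypothetical, and real checkpoints store the full graph state and its history rather than a single list.

```python
import json
import sqlite3


class ToyCheckpointStore:
    """Hypothetical store: one JSON list of messages per thread_id."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, messages TEXT)"
        )

    def load(self, thread_id: str) -> list[str]:
        """Return the saved messages for a thread, or an empty list."""
        row = self.conn.execute(
            "SELECT messages FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else []

    def append(self, thread_id: str, message: str) -> None:
        """Add a message to one thread without touching any other thread."""
        messages = self.load(thread_id) + [message]
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(messages)),
        )
        self.conn.commit()


store = ToyCheckpointStore()
store.append("user-session-001", "My name is Alex and I work at Acme Corp.")
store.append("user-session-001", "What company do I work at?")
print(store.load("user-session-001"))  # both messages
print(store.load("user-session-002"))  # []
```

Passing a file path instead of ":memory:" makes the rows outlive the process, which mirrors why conversations stored in checkpoints.db survive restarts.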
Common Pitfall #5: Using SqliteSaver without a context manager (with statement) leaves database connections open, causing OperationalError: database is locked after a few hundred invocations. Always use with SqliteSaver.from_conn_string(...) as checkpointer: to ensure proper cleanup.
Step 8: Implement Human-in-the-Loop Approval
For agents that take consequential actions – sending emails, making purchases, modifying databases – you need a way for humans to review and approve actions before they execute. LangGraph provides a built-in interrupt mechanism that pauses graph execution at any point and resumes when a human provides input. This is one of the features that separates LangGraph from simpler agent frameworks.
Add a human approval gate to the graph. Update agent/graph.py:
"""Graph with human-in-the-loop approval for tool calls."""
from typing import Literal

from langchain_core.messages import ToolMessage
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.types import interrupt, Command

from agent.state import AgentState
from agent.nodes import reasoning_node, tool_node, output_node

SENSITIVE_TOOLS = {"search_web"}  # Tools that need human approval
MAX_ITERATIONS = 10


def approval_node(state: AgentState) -> Command[Literal["tools", "reasoning"]]:
    """Pause execution and ask the human to approve tool calls."""
    last_message = state["messages"][-1]
    tool_calls = getattr(last_message, "tool_calls", None) or []
    sensitive_calls = [tc for tc in tool_calls if tc["name"] in SENSITIVE_TOOLS]
    if not sensitive_calls:
        return Command(goto="tools")  # No sensitive tools, proceed normally
    # interrupt() pauses the graph and returns this payload to the caller
    decision = interrupt({
        "question": "The agent wants to search the web. Approve?",
        "tool_calls": [
            {"name": tc["name"], "args": tc["args"]}
            for tc in sensitive_calls
        ],
    })
    if decision.get("approved"):
        return Command(goto="tools")  # Continue to tool execution
    # Denied: skip tool execution and answer every pending tool call with a
    # ToolMessage, because chat models expect each tool call to get a response.
    denials = [
        ToolMessage(content="Tool call denied by user.", tool_call_id=tc["id"])
        for tc in tool_calls
    ]
    return Command(goto="reasoning", update={"messages": denials})


def should_continue(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if state.get("iteration", 0) >= MAX_ITERATIONS:
        return "output"
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        # Check if any tool calls need approval
        sensitive = any(tc["name"] in SENSITIVE_TOOLS for tc in last_message.tool_calls)
        return "approval" if sensitive else "tools"
    return "output"


def build_agent_graph_with_hitl(checkpointer=None):
    """Build graph with human-in-the-loop approval."""
    graph = StateGraph(AgentState)
    graph.add_node("reasoning", reasoning_node)
    graph.add_node("approval", approval_node)
    graph.add_node("tools", tool_node)
    graph.add_node("output", output_node)
    graph.add_edge(START, "reasoning")
    graph.add_conditional_edges(
        "reasoning",
        should_continue,
        {"tools": "tools", "approval": "approval", "output": "output"},
    )
    # No static edge out of "approval": the node routes itself with Command.
    # A static edge to "tools" would still fire alongside the Command's goto,
    # so denied calls would be executed anyway.
    graph.add_edge("tools", "reasoning")
    graph.add_edge("output", END)
    return graph.compile(checkpointer=checkpointer)
When the agent tries to call a sensitive tool like search_web, the graph pauses at the approval_node. The caller receives the interrupt payload, presents it to the user, and resumes the graph with the approval decision by invoking it again with Command(resume=...) on the same thread_id. Interrupts require a checkpointer, because the paused state has to be saved between the two calls. This pattern is essential for building trustworthy AI agents in production environments where autonomous actions carry real consequences.
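The caller's side of that exchange can be sketched as follows. `describe_interrupt` and `run_with_approval` are hypothetical helpers, and the sketch assumes a recent LangGraph release in which invoke() reports pending interrupts under the "__interrupt__" key of its result; check the behavior of your installed version before relying on it.

```python
def describe_interrupt(payload: dict) -> str:
    """Turn the payload passed to interrupt() into a prompt for a reviewer."""
    calls = ", ".join(f"{tc['name']}({tc['args']})" for tc in payload["tool_calls"])
    return f"{payload['question']} Pending calls: {calls}"


def run_with_approval(graph, user_input: str, thread_id: str, approved: bool) -> dict:
    """Invoke the graph and, whenever it pauses, resume with a fixed decision.

    Assumes `graph` was compiled with a checkpointer.
    """
    # Imported here so describe_interrupt() can be used without LangGraph installed.
    from langchain_core.messages import HumanMessage
    from langgraph.types import Command

    config = {"configurable": {"thread_id": thread_id}}
    result = graph.invoke(
        {"messages": [HumanMessage(content=user_input)], "iteration": 0},
        config=config,
    )
    # Assumption: pending interrupts appear under "__interrupt__" in the result.
    while result.get("__interrupt__"):
        pending = result["__interrupt__"][0]
        print(describe_interrupt(pending.value))
        # The resume value becomes the return value of interrupt() inside the node.
        result = graph.invoke(Command(resume={"approved": approved}), config=config)
    return result
```

In a real application the fixed `approved` flag would be replaced by input collected from a user interface, typically in a separate request that reuses the same thread_id.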
Step 9: Add Multi-Agent Collaboration
Real-world AI applications often require multiple specialized agents working together. LangGraph supports multi-agent architectures through subgraphs – you can embed one compiled graph as a node inside another. This creates a supervisor pattern where a coordinator agent delegates tasks to specialist agents and synthesizes their results.
Here is a practical example with a research agent and a writing agent. To keep it short, each agent is a plain node function inside one graph rather than a separately compiled subgraph:
"""Multi-agent system: researcher + writer coordinated by supervisor."""
from typing import Annotated, TypedDict

from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages


class TeamState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    research_notes: str
    draft: str
    task: str


llm = ChatOpenAI(model="gpt-4o", temperature=0)


def researcher_node(state: TeamState) -> dict:
    """Research agent gathers information about the topic."""
    response = llm.invoke([
        SystemMessage(content="You are a research specialist. Gather key facts, "
                              "statistics, and insights about the given topic. "
                              "Output structured research notes."),
        HumanMessage(content=f"Research this topic: {state['task']}"),
    ])
    return {"research_notes": response.content}


def writer_node(state: TeamState) -> dict:
    """Writer agent creates content based on research notes."""
    response = llm.invoke([
        SystemMessage(content="You are an expert technical writer. Use the provided "
                              "research notes to write a clear, engaging draft."),
        HumanMessage(content=f"Write about: {state['task']}\n\n"
                             f"Research notes:\n{state['research_notes']}"),
    ])
    return {"draft": response.content, "messages": [response]}


def supervisor_node(state: TeamState) -> dict:
    """Supervisor reviews the draft and provides the final output."""
    response = llm.invoke([
        SystemMessage(content="You are an editor. Review this draft for accuracy "
                              "and clarity. Output the final polished version."),
        HumanMessage(content=f"Review and finalize:\n\n{state['draft']}"),
    ])
    return {"messages": [response]}


def build_team_graph():
    graph = StateGraph(TeamState)
    graph.add_node("researcher", researcher_node)
    graph.add_node("writer", writer_node)
    graph.add_node("supervisor", supervisor_node)
    graph.add_edge(START, "researcher")
    graph.add_edge("researcher", "writer")
    graph.add_edge("writer", "supervisor")
    graph.add_edge("supervisor", END)
    return graph.compile()


# Usage
team = build_team_graph()
result = team.invoke({
    "messages": [],
    "research_notes": "",
    "draft": "",
    "task": "Explain how quantum error correction works in 2026",
})
print(result["messages"][-1].content)
This linear pipeline (researcher → writer → supervisor) is the simplest multi-agent pattern. For more complex workflows, you can add conditional edges to create review loops where the supervisor sends the draft back to the writer for revisions, or branching patterns where multiple researchers work in parallel on different subtopics.
Step 10: Test the Agent with Integration Tests
Testing AI agents is notoriously difficult because LLM outputs are non-deterministic. However, you can still write meaningful tests by focusing on structural correctness – does the graph compile, do tools execute, does state flow correctly – rather than testing exact text outputs. LangGraph’s deterministic graph execution makes this practical.
Create tests/test_agent.py:
"""Integration tests for the agent graph."""
from datetime import datetime

import pytest
from langchain_core.messages import HumanMessage

from agent.graph import build_agent_graph
from agent.state import AgentState
from agent.tools import calculate, get_current_time


def test_graph_compiles():
    """The graph should compile without errors."""
    graph = build_agent_graph()
    assert graph is not None


def test_calculate_tool():
    """The calculate tool should return correct results."""
    assert calculate.invoke({"expression": "2 + 2"}) == "4"
    assert calculate.invoke({"expression": "10 * 5"}) == "50"
    assert calculate.invoke({"expression": "(100 - 25) / 3"}) == "25.0"


def test_calculate_tool_rejects_invalid():
    """The calculate tool should reject dangerous input."""
    result = calculate.invoke({"expression": "__import__('os').system('ls')"})
    assert "Error" in result or "invalid" in result


def test_time_tool_returns_iso():
    """The time tool should return a parseable ISO 8601 timestamp."""
    result = get_current_time.invoke({})
    # Parsing is sturdier than checking for a hard-coded year
    assert isinstance(datetime.fromisoformat(result), datetime)


def test_state_schema():
    """State should accept valid field types."""
    state: AgentState = {
        "messages": [HumanMessage(content="test")],
        "tool_calls_count": 0,
        "final_answer": "",
        "iteration": 0,
    }
    assert len(state["messages"]) == 1
    assert state["iteration"] == 0


@pytest.mark.integration
def test_agent_responds():
    """The agent should produce a final answer (requires API key)."""
    graph = build_agent_graph()
    result = graph.invoke({
        "messages": [HumanMessage(content="What is 15 + 27?")],
        "tool_calls_count": 0,
        "final_answer": "",
        "iteration": 0,
    })
    assert result["final_answer"] != ""
    assert result["iteration"] >= 1


# Run with: pytest tests/ -v
Run the test suite with pytest tests/ -v. The tool-level tests run without API keys and execute in milliseconds. The integration test (test_agent_responds) requires a valid API key and makes actual LLM calls. In CI/CD pipelines, mark integration tests with @pytest.mark.integration and run them separately to avoid API costs on every commit.
Step 11: Deploy with LangGraph Platform
LangGraph Platform (formerly LangGraph Cloud) provides managed deployment for LangGraph applications. It handles scaling, persistence, and API serving so you can expose your agent as a production REST API without managing infrastructure. As of April 2026, LangGraph Platform supports both self-hosted and cloud-hosted deployment options, as described in LangGraph’s official documentation.
To prepare your agent for deployment, create a langgraph.json configuration file in your project root:
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent/graph.py:agent"
  },
  "env": ".env"
}
And a requirements.txt for the deployment environment:
langgraph==1.1.6
langchain-core==0.3.29
langchain-openai==0.3.8
langgraph-checkpoint-sqlite==2.0.6
python-dotenv==1.0.1
For self-hosted deployment, you can also containerize the agent with Docker:
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "langgraph", "serve", "--host", "0.0.0.0", "--port", "8000"]
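Build and run: docker build -t langgraph-agent . && docker run -p 8000:8000 --env-file .env langgraph-agent. The agent is then accessible at http://localhost:8000 with a standard REST API for invocation, streaming, and thread management. The CMD above assumes your installed version exposes a serve entry point. The LangGraph CLI (the separate langgraph-cli package) documents langgraph dev for a local server and langgraph build for producing a Docker image from langgraph.json, so confirm the command against the CLI reference for your version.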
Step 12: Monitor with LangSmith Observability
Deploying an agent without observability is like driving without a dashboard. LangSmith provides tracing, evaluation, and debugging for LangGraph applications. Every node execution, tool call, LLM invocation, and state transition is captured as a trace that you can inspect in the LangSmith UI. Integration requires only a few environment variables.
Add to your .env file:
LANGSMITH_API_KEY=lsv2-your-key-here
LANGSMITH_PROJECT=langgraph-agent-prod
LANGSMITH_TRACING=true
With these variables set, every graph execution automatically sends traces to LangSmith. No code changes required – the LangChain SDK detects the environment variables and instruments all calls. In the LangSmith dashboard, you can see the full execution tree for each invocation, including token counts, latency per node, tool inputs and outputs, and the complete state at each checkpoint.
Key metrics to monitor in production:
| Metric | Target | Alert Threshold | Why It Matters |
|---|---|---|---|
| Average iterations per query | 2-3 | >5 | High iterations mean the agent is struggling to find answers |
| Tool call success rate | >95% | <90% | Failing tools cause cascading reasoning errors |
| End-to-end latency (P95) | <10s | >20s | Users abandon after 15 seconds of waiting |
| Token cost per query | <$0.05 | >$0.15 | Runaway tool loops burn budget fast |
| Interrupt approval rate | >80% | <50% | Low approval means the agent suggests wrong actions |
| Checkpoint write latency | <50ms | >200ms | Slow checkpoints degrade streaming experience |
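A few of these thresholds can be checked with a short script over exported run records. The sketch below is illustrative only: the record fields (iterations, latency_s, cost_usd) and the function names are hypothetical, not a LangSmith export format, and the percentile uses the simple nearest-rank method.

```python
# Alert thresholds taken from the table above.
ALERT_THRESHOLDS = {
    "avg_iterations": 5,
    "p95_latency_s": 20.0,
    "avg_cost_usd": 0.15,
}


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile, using integer ceiling division."""
    ordered = sorted(samples)
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def breached_alerts(runs: list[dict]) -> list[str]:
    """Return the names of metrics that exceed their alert threshold."""
    count = len(runs)
    metrics = {
        "avg_iterations": sum(r["iterations"] for r in runs) / count,
        "p95_latency_s": p95([r["latency_s"] for r in runs]),
        "avg_cost_usd": sum(r["cost_usd"] for r in runs) / count,
    }
    return [name for name, value in metrics.items() if value > ALERT_THRESHOLDS[name]]


# 18 fast runs and 2 slow ones: only the latency alert should fire.
sample_runs = [{"iterations": 2, "latency_s": 3.0, "cost_usd": 0.02}] * 18
sample_runs += [{"iterations": 2, "latency_s": 30.0, "cost_usd": 0.02}] * 2
print(breached_alerts(sample_runs))  # ['p95_latency_s']
```

Tracking the P95 instead of the mean is what surfaces the two slow runs here; their effect on the average latency would be far less visible.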
Steps 13–14: Advanced Patterns and Production Optimization
Once your basic agent is working, these advanced patterns help you scale to production quality. These techniques are used by teams running LangGraph agents serving thousands of concurrent users in 2026.
Step 13: Parallel Tool Execution
LangGraph’s prebuilt ToolNode already runs multiple tool calls from a single LLM response concurrently. A custom tool node, written for example to add logging or custom error handling, executes them one at a time unless you add concurrency yourself, and for I/O-bound tools that can add noticeable latency. Here is how to implement parallel execution in a custom node:
"""Parallel tool execution for faster multi-tool responses."""
import asyncio

from langchain_core.messages import ToolMessage

from agent.tools import agent_tools

# Build a lookup dict for tools
tool_map = {tool.name: tool for tool in agent_tools}


async def parallel_tool_node(state: dict) -> dict:
    """Execute all pending tool calls concurrently."""
    last_message = state["messages"][-1]
    tool_calls = last_message.tool_calls

    async def run_tool(tc):
        tool_fn = tool_map[tc["name"]]
        # Run the blocking invoke() in a worker thread so calls overlap
        result = await asyncio.to_thread(tool_fn.invoke, tc["args"])
        return ToolMessage(
            content=str(result),
            tool_call_id=tc["id"],
        )

    results = await asyncio.gather(*[run_tool(tc) for tc in tool_calls])
    return {"messages": list(results)}
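To see the effect of the `asyncio.to_thread` plus `asyncio.gather` combination in isolation, the sketch below replaces real tools with a hypothetical `slow_tool` stub that simply sleeps. It uses only the standard library, so it says nothing about LangChain itself; it just illustrates that blocking calls overlap and that results come back in request order.

```python
import asyncio
import time


def slow_tool(name: str, delay: float) -> str:
    """Stand-in for a blocking tool call (hypothetical, not a LangChain tool)."""
    time.sleep(delay)
    return f"{name} done"


async def run_concurrently(calls: list[tuple[str, float]]) -> list[str]:
    # to_thread moves each blocking call onto a worker thread;
    # gather returns results in the same order the calls were passed in.
    return await asyncio.gather(
        *(asyncio.to_thread(slow_tool, name, delay) for name, delay in calls)
    )


def timed_run(calls: list[tuple[str, float]]) -> tuple[list[str], float]:
    """Run all calls concurrently and report the wall-clock time taken."""
    start = time.perf_counter()
    results = asyncio.run(run_concurrently(calls))
    return results, time.perf_counter() - start
```

With three stub calls of 0.2 seconds each, the total should land near 0.2 seconds rather than 0.6, which is the latency saving the step describes.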
Step 14: Error Recovery and Retry Logic
Production agents need to handle failures gracefully. When an API call fails or a tool throws an exception, the agent should retry or route to a fallback rather than crashing the entire graph. LangGraph’s state-based architecture makes this straightforward:
"""Error-resilient tool node with retry logic."""
import time

from langchain_core.messages import ToolMessage

from agent.tools import agent_tools

tool_map = {tool.name: tool for tool in agent_tools}
MAX_RETRIES = 2


def resilient_tool_node(state: dict) -> dict:
    """Execute tools with automatic retry on failure."""
    last_message = state["messages"][-1]
    results = []
    for tc in last_message.tool_calls:
        tool_fn = tool_map.get(tc["name"])
        if not tool_fn:
            # Unknown tool: report it to the LLM instead of raising
            results.append(ToolMessage(
                content=f"Error: Unknown tool '{tc['name']}'",
                tool_call_id=tc["id"],
            ))
            continue
        for attempt in range(MAX_RETRIES + 1):
            try:
                result = tool_fn.invoke(tc["args"])
                results.append(ToolMessage(
                    content=str(result),
                    tool_call_id=tc["id"],
                ))
                break
            except Exception as e:
                if attempt == MAX_RETRIES:
                    # Out of retries: surface the error as a tool result
                    results.append(ToolMessage(
                        content=f"Error after {MAX_RETRIES + 1} attempts: {e}",
                        tool_call_id=tc["id"],
                    ))
                else:
                    time.sleep(2 ** attempt)  # Exponential backoff: 1s, then 2s
    return {"messages": results}
By returning error messages as ToolMessage content instead of raising exceptions, the LLM can read the error and either retry with different parameters or inform the user about the failure. This self-healing pattern is far more robust than try/except blocks that silently swallow errors.
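The retry loop can also be factored into a reusable helper so the backoff schedule is easy to test. The sketch below is one possible shape, not the tutorial's own code: `call_with_retry` and `backoff_delays` are hypothetical names, and the `sleep` parameter exists only so a test can substitute a fake. Unlike the tool node, the helper re-raises after the last attempt, leaving it to the caller to turn that exception into a ToolMessage.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 1.0) -> list[float]:
    """Delay before each retry: base * 2**attempt (1s, 2s, 4s for base=1)."""
    return [base * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 2,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff."""
    delays = backoff_delays(max_retries, base)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # Out of retries: let the caller decide how to report it
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Catching every `Exception` is deliberately broad for illustration; in practice you would likely retry only on errors you know to be transient, such as timeouts or rate-limit responses.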
LangGraph Architecture: How It Compares to Other Agent Frameworks
Understanding where LangGraph sits in the AI agent framework landscape helps you make informed architecture decisions. As of April 2026, three major Python frameworks dominate the agent-building space, each with different design philosophies.
| Feature | LangGraph 1.1 | CrewAI 0.80+ | AutoGen 0.4+ |
|---|---|---|---|
| Architecture | Directed graph (cyclic) | Role-based crews | Conversation-based |
| State management | Built-in typed state + reducers | Shared memory object | Message history |
| Checkpointing | Native (SQLite, PostgreSQL) | Manual implementation | Limited |
| Human-in-the-loop | Native interrupt/resume | Callback-based | Nested chat pattern |
| Streaming | Type-safe v2 streaming | Basic event callbacks | Token streaming only |
| Multi-agent | Subgraphs + supervisor | Sequential/hierarchical crews | Group chat + selectors |
| Persistence | First-class feature | Third-party required | Third-party required |
| Learning curve | Moderate (graph concepts) | Low (role assignment) | Low-moderate |
| Best for | Complex stateful workflows | Simple multi-agent tasks | Research and prototyping |
LangGraph’s graph-based approach requires more upfront design than CrewAI’s “assign roles and go” pattern, but it pays off in production environments where you need fine-grained control over execution flow, state management, and error handling. The native checkpointing and human-in-the-loop support eliminate entire categories of infrastructure code that other frameworks leave to the developer.
Troubleshooting: 10 Common LangGraph Issues and Solutions
Based on thousands of LangGraph deployments in 2026, these are the most frequently encountered issues and their solutions. Bookmark this section – you will likely hit at least three of these during development.
1. “ImportError: cannot import name ‘StateGraph’ from ‘langgraph’”
Cause: StateGraph is exported from langgraph.graph, not from the top-level langgraph package, or an outdated langgraph version is installed.
Fix: Run pip install --upgrade "langgraph>=1.1.0" (the quotes stop the shell from treating >= as a redirect) and update your import to from langgraph.graph import StateGraph.
2. “InvalidUpdateError: Key ‘messages’ requires a reducer”
Cause: Your state defines messages: list[BaseMessage] without the Annotated[..., add_messages] wrapper.
Fix: Change to messages: Annotated[list[BaseMessage], add_messages] and add from langgraph.graph.message import add_messages.
3. Agent loops indefinitely, burning API credits
Cause: No iteration limit in the routing function. The LLM keeps requesting tools without providing a final answer.
Fix: Add a MAX_ITERATIONS check in your should_continue function as shown in Step 5. Also set recursion_limit when invoking: graph.invoke(state, {"recursion_limit": 25}).
4. “ToolException: Tool ‘X’ not found in tool map”
Cause: The tool list passed to bind_tools() doesn’t match the tools available in ToolNode.
Fix: Ensure the same tool list is used for both: llm.bind_tools(agent_tools) and ToolNode(agent_tools).
5. “OperationalError: database is locked” with SqliteSaver
Cause: Multiple processes or connections accessing the same SQLite file concurrently.
Fix: Use a context manager (with SqliteSaver.from_conn_string(...):) and switch to PostgresSaver for multi-process deployments.
6. Streaming returns empty events
Cause: Using stream_mode="values" with nodes that return empty dicts.
Fix: Use stream_mode="updates" to see per-node updates, or ensure every node returns at least one state field.
7. “GraphRecursionError: Recursion limit of 25 reached”
Cause: The agent exceeded the default recursion limit. Each node execution counts as one step.
Fix: Either increase the limit with graph.invoke(state, {"recursion_limit": 50}) or reduce the number of tool calls per query by improving your system prompt.
8. Tool results not appearing in the LLM’s context
Cause: Tool messages are being created with incorrect tool_call_id values that don’t match the LLM’s request.
Fix: Use LangGraph’s built-in ToolNode instead of a custom implementation. It automatically maps tool call IDs correctly.
9. Checkpointer not saving state between invocations
Cause: Missing thread_id in the config, or creating a new checkpointer instance for each invocation.
Fix: Always pass config={"configurable": {"thread_id": "your-id"}} and reuse the same checkpointer instance across invocations.
10. Type errors when using Pydantic state with LangGraph 1.1
Cause: LangGraph 1.1’s Pydantic coercion expects specific field types. Passing raw dicts where the schema expects Pydantic models fails silently.
Fix: Either use TypedDict state (recommended) or ensure all input values match the Pydantic model’s field types exactly. Use version="v2" in stream/invoke for automatic coercion.
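For issue 3 in the list above, the iteration guard can be sketched as follows. This is an illustrative version, not the Step 5 code: the cap value and the returned node names ("tools" and "end") are assumptions, and the function accepts both dict-shaped messages and objects with a `tool_calls` attribute so it can be exercised without LangChain installed.

```python
MAX_ITERATIONS = 5  # Hypothetical cap; tune for your workload


def should_continue(state: dict) -> str:
    """Return the next node name: "tools" to keep going, "end" to stop."""
    # Hard stop once the loop counter reaches the cap, whatever the LLM asked for
    if state.get("iteration", 0) >= MAX_ITERATIONS:
        return "end"
    last = state["messages"][-1]
    if isinstance(last, dict):
        tool_calls = last.get("tool_calls")
    else:
        tool_calls = getattr(last, "tool_calls", None)
    # Continue only when the last message actually requests tools
    return "tools" if tool_calls else "end"
```

The guard assumes some node increments `iteration` on every pass; without that, only the graph-level recursion_limit protects you.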
Advanced Tips for Production LangGraph Agents
These tips come from teams running LangGraph agents at scale in 2026. Each one addresses a real production challenge that you will eventually face as your agent handles more traffic and more complex tasks.
Use LangGraph Studio for visual debugging. LangGraph Studio renders your graph visually and lets you step through execution node by node. It is invaluable for debugging complex conditional flows where print statements are not enough. Install the CLI locally with pip install -U "langgraph-cli[inmem]" (the inmem extra is what the local dev server needs) and run langgraph dev in your project directory.
Implement graceful degradation with fallback models. If your primary model (GPT-4o) is down or rate-limited, fall back to a cheaper model (GPT-4o-mini) automatically. Use LangChain’s with_fallbacks method: llm = ChatOpenAI(model="gpt-4o").with_fallbacks([ChatOpenAI(model="gpt-4o-mini")]).
Cache tool results to reduce latency and cost. If the same search query runs multiple times in a conversation, caching the result avoids redundant API calls. Use Python’s functools.lru_cache for in-memory caching, or Redis for distributed caching across multiple agent instances.
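As a minimal illustration of in-memory caching, the sketch below wraps a hypothetical `cached_search` function in `functools.lru_cache`. The function body is a stand-in for a real search API call, and the counter exists only to show how often the backend is actually hit.

```python
from functools import lru_cache

backend_calls = {"count": 0}  # Illustrative counter, not needed in real code


@lru_cache(maxsize=256)
def cached_search(query: str) -> str:
    """Stand-in for an expensive search call; repeated queries hit the cache."""
    backend_calls["count"] += 1
    return f"results for {query!r}"
```

Two caveats apply: `lru_cache` requires hashable arguments, and it never expires entries by age, so results that go stale (news, prices) need a time-based cache instead. `cached_search.cache_info()` reports hits and misses if you want to measure the benefit.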
Separate your graph definition from your LLM configuration. Hard-coding the model name inside node functions makes it impossible to A/B test different models or swap providers without code changes. Instead, pass the LLM as a parameter to build_agent_graph() and inject it into nodes via closure or dependency injection.
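The closure approach might look like the sketch below. `make_agent_node` and `EchoModel` are hypothetical names introduced for illustration; the only assumption about the model is that it exposes an `invoke(messages)` method, which is the shape LangChain chat models share.

```python
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Anything with an invoke(messages) method can be injected."""

    def invoke(self, messages: list) -> str: ...


def make_agent_node(llm: ChatModel) -> Callable[[dict], dict]:
    """Build a node function that closes over the injected model."""

    def agent_node(state: dict) -> dict:
        reply = llm.invoke(state["messages"])
        return {"messages": [reply]}

    return agent_node


class EchoModel:
    """Fake model for tests and A/B wiring checks."""

    def __init__(self, name: str) -> None:
        self.name = name

    def invoke(self, messages: list) -> str:
        return f"[{self.name}] saw {len(messages)} message(s)"
```

Because the node never names a concrete model, the same graph-building code can be run against a fake in unit tests and against different providers in an A/B test.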
Use structured output for deterministic routing. Instead of parsing free-text LLM responses to decide the next step, use LangChain’s .with_structured_output() to have the LLM return a typed decision object. This removes brittle string parsing and makes routing far more predictable, though you should still handle the occasional validation failure.
Monitor token usage per thread, not just per request. A single thread (conversation) can accumulate thousands of messages over its lifetime, causing context window overflow and costs that climb with every turn, because the full history is resent on each call. Implement a message pruning strategy that keeps the system prompt, the last N messages, and a summary of older messages.
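A pruning strategy along those lines could be sketched as below. It works on plain role/content dicts rather than LangChain message objects, `prune_messages` is a hypothetical helper, and the summary text is assumed to be produced elsewhere (for example by a separate LLM call).

```python
from typing import Optional


def prune_messages(
    messages: list[dict],
    keep_last: int = 6,
    summary: Optional[str] = None,
) -> list[dict]:
    """Keep the system prompt, an optional summary, and the last N messages."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_last:
        return system + rest  # Nothing to prune yet
    pruned = list(system)
    if summary:
        # Older turns are replaced by a single summary message
        pruned.append({
            "role": "system",
            "content": f"Summary of earlier conversation: {summary}",
        })
    return pruned + rest[-keep_last:]
```

One caveat: cutting purely by count can separate a tool result from the assistant message that requested it, which some providers reject, so a production version should cut on turn boundaries.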
Complete Working Project: Full Source Code
Here is the complete, copy-paste-ready source code for the LangGraph agent built in this tutorial. All files are shown in their final form with all features integrated. Clone the structure, add your API key, and run python main.py to see the agent in action.
The complete main.py with all features integrated:
"""Complete LangGraph Agent – Full working example."""
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.sqlite import SqliteSaver

from agent.graph import build_agent_graph


def main():
    """Run the complete agent with memory and streaming."""
    with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
        graph = build_agent_graph(checkpointer=checkpointer)
        # Reusing one thread_id lets the checkpointer carry memory across queries
        config = {"configurable": {"thread_id": "demo-session"}}
        queries = [
            "What is the square root of 144 plus 25?",
            "Search the web for the latest Python release.",
            "What did I just ask you about?",  # Tests memory
        ]
        for query in queries:
            print(f"\nUser: {query}")
            print("-" * 50)
            for event in graph.stream(
                {"messages": [HumanMessage(content=query)],
                 "tool_calls_count": 0, "final_answer": "", "iteration": 0},
                config=config,
                stream_mode="updates",
            ):
                for node_name, output in event.items():
                    if node_name == "output":
                        print(f"Agent: {output.get('final_answer', '')}\n")


if __name__ == "__main__":
    main()
The full source code is available as a complete project on the LangGraph GitHub repository, which includes additional examples for chatbots, RAG agents, and multi-agent systems. To run the tutorial code, you need only three files (state.py, tools.py, graph.py) plus the main.py entry point shown above.
Frequently Asked Questions
What is LangGraph and how is it different from LangChain?
LangGraph is a library built on top of LangChain that adds graph-based orchestration for building stateful, multi-step AI agents. While LangChain provides the building blocks (LLM wrappers, tools, prompts), LangGraph provides the execution engine that controls how those blocks connect and interact. Think of LangChain as the parts catalog and LangGraph as the assembly blueprint. LangGraph’s key additions are cyclic graph execution, built-in state management with reducers, and native checkpointing – features that LangChain’s simple sequential chains do not support.
Do I need an OpenAI API key to use LangGraph?
No. LangGraph is model-agnostic. While this tutorial uses OpenAI’s GPT-4o as the default, you can swap it for any LangChain-compatible model by changing the LLM initialization. Supported providers include Anthropic Claude, Google Gemini, Mistral, Cohere, and local models via Ollama. Just install the corresponding LangChain integration package (e.g., langchain-anthropic) and update the import and initialization.
How much does it cost to run a LangGraph agent?
The cost depends entirely on the LLM provider and the number of tokens processed. A typical agent query with 2-3 tool calls using GPT-4o costs approximately $0.02-0.05. The LangGraph framework itself is free and open source. For the complete tutorial project, expect under $2 in total API costs. Monitor costs in production using LangSmith’s token tracking dashboard.
Can LangGraph agents run locally without cloud APIs?
Yes. You can use local models through Ollama or any OpenAI-compatible local server. Replace ChatOpenAI with ChatOllama from the langchain-ollama package (older releases shipped it in langchain-community) and point it at your local Ollama instance. Models like Llama 3.1 and Mixtral work well for agent tasks. Plan on at least 16 GB RAM for small models, considerably more for a 70B model, and preferably a GPU for acceptable inference speed.
What is the difference between StateGraph and MessageGraph?
StateGraph is the general-purpose graph type that accepts any typed state schema with multiple fields, reducers, and custom types. MessageGraph is a simplified version where the entire state is just a list of messages. Use StateGraph for any non-trivial agent that needs to track more than just conversation messages. MessageGraph is convenient for simple chatbots but limits your ability to add custom state fields later, and recent LangGraph releases mark it as deprecated in favor of StateGraph with a messages field, so prefer StateGraph for new code.
How do I handle rate limits when my agent makes many tool calls?
Implement exponential backoff in your tool node (as shown in Step 14), state a maximum number of tool calls in your system prompt, and set the max_retries and request_timeout parameters on the LLM so that transient rate-limit errors are retried rather than surfaced to the user. Note that these two parameters control retries and timeouts, not throughput. For high-throughput production agents, use a token bucket rate limiter at the application layer and queue requests when approaching API limits.
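A token bucket is small enough to sketch in full. The class below is an illustrative, single-threaded version: `TokenBucket` and its parameter names are hypothetical, and the injectable `clock` exists so the refill logic can be tested without waiting. A production version would also need a lock for concurrent callers.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # Start full so the first burst is allowed
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False when the caller must wait."""
        now = self._clock()
        # Refill in proportion to elapsed time, never exceeding capacity
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

When `try_acquire` returns False, the caller can queue the request or sleep briefly and try again, which is the queueing behavior the answer above describes.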
Is LangGraph production-ready in 2026?
Yes. LangGraph reached version 1.0 in late 2025 and is now at version 1.1.6. It is used in production by companies across financial services, healthcare, customer support, and software development. The LangGraph Platform provides managed deployment with enterprise SLAs. The framework’s graph-based architecture, native persistence, and thorough observability through LangSmith make it suitable for mission-critical applications.
Can I use LangGraph with TypeScript or JavaScript?
Yes. LangGraph has an official TypeScript/JavaScript version (langgraphjs) with feature parity to the Python version. Install it with npm install @langchain/langgraph. The API design is nearly identical, so concepts from this Python tutorial translate directly to the JS version with syntax adjustments.
Sofia Lindström
Sofia Lindström is the Editor-in-Chief at Tech Insider, where she leads editorial strategy and oversees coverage across AI, cybersecurity, and enterprise technology. With over a decade in Swedish tech journalism, she previously served as technology editor at Dagens Industri and covered the Nordic startup ecosystem for Breakit. Sofia holds an MSc in Media Technology from KTH Royal Institute of Technology and is a frequent speaker at Web Summit and Slush. She is passionate about making complex technology accessible to business leaders.