Voozh

April 17, 2026

Dify has quietly become one of the fastest-growing open-source AI platforms on GitHub, surpassing 100,000 stars and climbing. It lets developers build production-grade AI applications–chatbots, RAG pipelines, autonomous agents, and workflow automations–without writing boilerplate LLM orchestration code from scratch. With its visual workflow builder, built-in RAG engine, and support for dozens of LLM providers, Dify bridges the gap between prototyping and production deployment.

This tutorial walks you through building a complete AI-powered customer support assistant using Dify, from local Docker installation to a deployed application with RAG-powered knowledge retrieval, custom tool integrations, and workflow automation. By the end, you will have a working project that answers questions from your own documents, routes complex queries to human agents, and exposes a production-ready API.

Prerequisites and Environment Setup

Before diving into the Dify tutorial, ensure your development machine meets the minimum requirements. Dify runs as a multi-container Docker application, so you need adequate resources for the platform services, the vector database, and the LLM API connections.

Here is what you need installed and configured before starting:

Prerequisite	Minimum Version	Recommended	Purpose
Docker Engine	20.10+	27.x	Container runtime for Dify services
Docker Compose	2.17+	2.29+	Multi-container orchestration
RAM	4 GB free	8 GB+	Running Dify + vector DB + Redis + PostgreSQL
Disk Space	10 GB	20 GB+	Docker images, model caches, document storage
CPU	2 cores	4+ cores	API processing and document indexing
Git	2.30+	Latest	Cloning the Dify repository
OpenAI or Anthropic API Key	–	–	LLM provider for inference
Python (optional)	3.10+	3.12	Custom tool development and API scripting

You also need a working internet connection for pulling Docker images and connecting to LLM APIs. If you plan to use local models via Ollama, add another 8 GB of RAM for running 7B-parameter models. For this tutorial, we will use cloud-hosted LLMs through API keys, which keeps resource requirements lower.

Verify your Docker installation before proceeding:

docker --version
# Expected: Docker version 27.x.x

docker compose version
# Expected: Docker Compose version v2.29.x

# Verify available resources
docker system info | grep -E "CPUs|Total Memory"
# CPUs: 4
# Total Memory: 15.62GiB

If Docker is not installed, follow the official Docker installation guide for your operating system. On macOS and Windows, Docker Desktop includes Docker Compose by default. On Linux, install the Docker Compose plugin separately if needed.
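The version check above can also be scripted. The following is a minimal sketch, assuming the version strings look like the sample output shown earlier; the helper names `parse_version` and `meets_minimum` are illustrative and not part of Docker or Dify.

```python
import re

# Minimum versions taken from the prerequisites table above.
MIN_DOCKER = (20, 10)
MIN_COMPOSE = (2, 17)

def parse_version(text: str) -> tuple:
    """Extract (major, minor) from output such as 'Docker version 27.1.1, build abc'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return int(match.group(1)), int(match.group(2))

def meets_minimum(text: str, minimum: tuple) -> bool:
    """Compare the parsed (major, minor) pair against a minimum pair."""
    return parse_version(text) >= minimum

# Sample strings shaped like the expected output of the two commands.
print(meets_minimum("Docker version 27.1.1, build 6312585", MIN_DOCKER))   # True
print(meets_minimum("Docker Compose version v2.29.1", MIN_COMPOSE))        # True
```

In practice you would feed the real output of `docker --version` and `docker compose version` into `meets_minimum`, for example through `subprocess.run`.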

Step 1: Clone and Launch Dify with Docker Compose

Dify provides an official Docker Compose configuration that spins up all required services: the API server, the web frontend, a PostgreSQL database, Redis for caching, a Weaviate vector database for RAG, and a background worker for async tasks. This single-command deployment is one of the reasons Dify has attracted over 100,000 GitHub stars–it eliminates the complexity of stitching together individual components.

Start by cloning the Dify repository and navigating to the Docker directory:

# Clone the Dify repository
git clone https://github.com/langgenius/dify.git
cd dify/docker

# Copy the environment template
cp .env.example .env

# Launch all services
docker compose up -d

# Verify all containers are running
docker compose ps

The initial pull downloads several Docker images totaling around 3-4 GB. On a typical broadband connection, expect this to take 3-5 minutes. Once complete, you should see the following containers running:

# Expected output from docker compose ps
NAME STATUS
dify-api-1 Up (healthy)
dify-web-1 Up
dify-worker-1 Up
dify-db-1 Up (healthy)
dify-redis-1 Up (healthy)
dify-weaviate-1 Up (healthy)
dify-sandbox-1 Up
dify-nginx-1 Up 0.0.0.0:80->80/tcp

Open your browser and navigate to http://localhost (or http://localhost:80). You will see the Dify setup wizard. Create your admin account by entering an email and password. This account becomes the workspace owner with full permissions to manage models, applications, and team members.

If port 80 is already occupied on your machine, edit the .env file and change the EXPOSE_NGINX_PORT variable to an available port such as 3000. Then restart with docker compose up -d and open http://localhost:3000 instead.
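To pick a replacement port, a quick local probe can help. This is a rough sketch using only the Python standard library; `port_is_free` and `first_free_port` are hypothetical helper names, and the candidate ports are arbitrary examples.

```python
import socket
from typing import List, Optional

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0

def first_free_port(candidates: List[int]) -> Optional[int]:
    """Return the first candidate port with no listener, or None."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None

if __name__ == "__main__":
    # Example candidates only; use whatever ports suit your machine.
    print(first_free_port([80, 3000, 8080]))
```

Whatever port the probe reports can then be written to EXPOSE_NGINX_PORT in the .env file.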

Step 2: Configure LLM Providers

With Dify running, the next step is connecting your LLM providers. Dify supports a wide range of providers out of the box, including OpenAI, Anthropic, Google Gemini, Azure OpenAI, Cohere, Mistral, Zhipu AI, and local model servers like Ollama and LM Studio. This multi-provider flexibility is a core advantage–you can switch models per application without changing code.

Navigate to Settings → Model Provider in the Dify dashboard. Click on your preferred provider to add API credentials. For this tutorial, we will configure OpenAI as the primary provider:

1. Click the OpenAI card in the model provider list.

2. Enter your OpenAI API key in the provided field.

3. Click Save. Dify will validate the key and list all available models (GPT-4o, GPT-4o mini, etc.).

To add Anthropic as a secondary provider, repeat the process with your Anthropic API key. Dify will detect available Claude models automatically. You can then choose which model to use on a per-application basis, making it easy to compare performance and cost across providers.

For teams evaluating cost efficiency, Dify tracks token usage per model in the dashboard. This is especially useful when comparing providers: a GPT-4o call versus a Claude Sonnet call for the same prompt can differ significantly in both latency and cost. Configure at least two providers now so you can experiment with model switching later in this tutorial.

Step 3: Create Your First Chatbot Application

Dify organizes AI functionality into application types: Chatbot, Text Generator, Agent, and Workflow. Each type serves different use cases. For our customer support assistant, we will start with a Chatbot application and progressively enhance it with RAG and workflow capabilities.

From the Dify dashboard, click Create App → Chatbot. Give it a name like “Support Assistant” and an optional description. Select your preferred LLM model (e.g., GPT-4o or Claude Sonnet). You are now in the application editor, which has three main panels: the orchestration panel on the left, the prompt editor in the center, and the preview chat on the right.

Write a system prompt that defines the assistant’s behavior:

You are a helpful customer support assistant for TechCorp.
Your responsibilities:
- Answer product questions accurately using the provided knowledge base
- Escalate billing issues to the billing team
- Be concise and professional
- If you don't know the answer, say so honestly
- Always cite the source document when answering from the knowledge base

Rules:
- Never make up product features or pricing
- Do not discuss competitor products
- For urgent issues, provide the support hotline: 1-800-TECHCORP

Click Publish in the top-right corner to save your application. Test it in the preview panel by asking a question. At this stage, the chatbot responds based on the LLM’s general knowledge and your system prompt. The real power comes in the next steps when we add a knowledge base for RAG.

Notice the Variables section in the orchestration panel. Dify supports input variables that let you dynamically inject user context into prompts–for example, the user’s subscription tier or account ID. We will use this feature later for conditional routing.

Step 4: Build a Knowledge Base for RAG

Retrieval-Augmented Generation (RAG) is what transforms a generic chatbot into a domain-specific assistant. Dify’s built-in knowledge base handles the entire RAG pipeline: document upload, text chunking, embedding generation, vector storage, and retrieval. This eliminates the need for external tools like LangChain or LlamaIndex for basic RAG workflows.

Go to Knowledge in the left sidebar and click Create Knowledge. Name it “Product Documentation” and click Create. Now upload your documents. Dify supports PDF, Word (.docx), Markdown (.md), plain text (.txt), CSV, and HTML files. For this tutorial, create a sample Markdown file with product information:

# sample-docs/product-guide.md

## TechCorp Pro Plan
- Price: $49/month or $470/year
- Features: Unlimited projects, 100 GB storage, priority support
- API access: 10,000 requests/day
- Team members: Up to 25

## TechCorp Enterprise Plan
- Price: Custom pricing, starting at $199/month
- Features: Everything in Pro, plus SSO, audit logs, 99.99% SLA
- API access: Unlimited
- Team members: Unlimited
- Dedicated account manager

## Return Policy
- 30-day money-back guarantee on all plans
- Pro-rated refunds for annual subscriptions
- Contact [email protected] for refund requests

## System Requirements
- Browser: Chrome 90+, Firefox 88+, Safari 15+, Edge 90+
- Internet: 5 Mbps minimum, 25 Mbps recommended
- No desktop installation required (cloud-based)

Upload this file to your knowledge base. Dify will process it through the indexing pipeline: splitting the document into chunks, generating vector embeddings, and storing them in Weaviate. You can configure the chunking strategy under Settings:

Chunking Parameter	Default Value	Recommended for FAQ	Recommended for Long Docs
Chunk size	500 tokens	300 tokens	800 tokens
Chunk overlap	50 tokens	50 tokens	100 tokens
Retrieval mode	Semantic search	Hybrid (semantic + keyword)	Semantic search
Top K results	3	5	3
Score threshold	0.5	0.3	0.6
Reranking	Off	On (recommended)	On

After indexing completes, return to your chatbot application. In the orchestration panel, click Add Context and select your “Product Documentation” knowledge base. Now when users ask questions, Dify retrieves relevant document chunks and injects them into the LLM prompt automatically. Test it by asking “What’s included in the Pro Plan?” in the preview–the response should cite your uploaded document.
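To build intuition for the chunk size and overlap parameters in the table above, here is a simplified sliding-window sketch. It is not Dify's actual splitter: it assumes whitespace-separated words stand in for tokens, and `chunk_tokens` is an illustrative name.

```python
from typing import List

def chunk_tokens(tokens: List[str], chunk_size: int = 500, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks

# Ten pseudo-tokens, windows of four with one shared token between neighbours.
words = [str(i) for i in range(10)]
for chunk in chunk_tokens(words, chunk_size=4, overlap=1):
    print(chunk)
```

With these toy values, each window starts with the last token of the previous one, which is the behaviour that protects answers spanning a chunk boundary.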

Step 5: Build an Agentic Workflow

Dify’s visual workflow builder is where the platform truly differentiates itself from simpler chatbot tools. Workflows let you chain multiple LLM calls, conditional branches, tool invocations, and data transformations into a single automated pipeline. This is the foundation of agentic AI–where the system reasons about which steps to take rather than following a static script.

Create a new application and select Workflow as the type. Name it “Support Ticket Router.” The workflow canvas opens with a Start node. You will build a pipeline that classifies incoming support tickets, retrieves relevant documentation, and routes complex issues to the appropriate team.

Add the following nodes by dragging them from the node palette:

Node 1 – LLM (Ticket Classifier): Configure this node to classify the incoming message into categories: “billing”, “technical”, “general”, or “urgent”. Use a lightweight model like GPT-4o mini to keep costs low. Set the prompt to: “Classify the following customer message into exactly one category: billing, technical, general, or urgent. Respond with only the category name.”

Node 2 – Conditional Branch: Connect it to the classifier output. Create branches for each category. The “billing” branch routes to a billing-specific handler. The “technical” branch routes to the RAG knowledge retrieval. The “urgent” branch triggers an immediate escalation response.

Node 3 – Knowledge Retrieval: For the “technical” branch, add a Knowledge Retrieval node connected to your Product Documentation knowledge base. This node fetches the most relevant document chunks based on the user’s question.

Node 4 – LLM (Response Generator): This node takes the retrieved documents and the original question to generate a detailed, sourced answer. Use a more capable model like GPT-4o or Claude Sonnet here for higher-quality responses.

Node 5 – End: Connect all branches to an End node that returns the final response. Each branch can have its own End node with different output templates.

Click Run to test the workflow with sample inputs. The execution log shows each node’s input, output, token usage, and latency–invaluable for debugging and optimization. Publish the workflow when satisfied with the results.
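The branching logic of this workflow can be sketched in plain Python to make the control flow explicit. This is only an illustration of the classify-then-route idea, not how Dify executes workflows internally; the handler names are hypothetical labels, and the classifier is passed in as a function so a stub can replace the LLM call.

```python
from typing import Callable

CATEGORIES = ("billing", "technical", "general", "urgent")

def normalize_category(raw: str) -> str:
    """Map the classifier's raw text onto a known category, defaulting to 'general'."""
    cleaned = raw.strip().strip(".\"'").lower()
    return cleaned if cleaned in CATEGORIES else "general"

def route_ticket(message: str, classify: Callable[[str], str]) -> str:
    """Return the (hypothetical) name of the branch that should handle the message."""
    category = normalize_category(classify(message))
    branches = {
        "billing": "billing_handler",
        "technical": "knowledge_retrieval",
        "urgent": "escalation",
        "general": "general_response",
    }
    return branches[category]

# A stub classifier stands in for the LLM node.
def fake_classifier(message: str) -> str:
    return "Technical." if "error" in message.lower() else "general"

print(route_ticket("I get an error when uploading files", fake_classifier))  # knowledge_retrieval
```

Normalizing the classifier output before branching matters because an LLM may add punctuation or capitalization even when asked for the category name only.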

Step 6: Add Custom Tools and API Integrations

Dify ships with over 50 built-in tools, but real-world applications often need custom integrations–querying an internal database, calling a ticketing API, or triggering a webhook. Dify supports custom tools through OpenAPI/Swagger specifications, making it straightforward to connect any REST API.

Navigate to Tools in the left sidebar and click Create Custom Tool. You will define the tool using an OpenAPI 3.0 schema. Here is an example that creates a ticket in a hypothetical ticketing system:

{
  "openapi": "3.0.0",
  "info": {
    "title": "Support Ticket API",
    "version": "1.0.0"
  },
  "servers": [
    {
      "url": "https://api.techcorp.example/v1"
    }
  ],
  "paths": {
    "/tickets": {
      "post": {
        "operationId": "createTicket",
        "summary": "Create a new support ticket",
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "type": "object",
                "properties": {
                  "subject": {
                    "type": "string",
                    "description": "Ticket subject line"
                  },
                  "category": {
                    "type": "string",
                    "enum": ["billing", "technical", "general", "urgent"]
                  },
                  "description": {
                    "type": "string",
                    "description": "Detailed issue description"
                  },
                  "priority": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 5
                  }
                },
                "required": ["subject", "category", "description"]
              }
            }
          }
        },
        "responses": {
          "201": {
            "description": "Ticket created successfully",
            "content": {
              "application/json": {
                "schema": {
                  "type": "object",
                  "properties": {
                    "ticket_id": { "type": "string" },
                    "status": { "type": "string" },
                    "created_at": { "type": "string" }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Paste this schema into the tool creation form, configure authentication (API key, Bearer token, or Basic Auth), and save. The tool now appears in the workflow node palette and can be used in Agent applications. When the LLM decides a ticket needs to be created, it invokes this tool with the appropriate parameters automatically.
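Before pointing a tool at a real ticketing backend, it can be useful to check payloads against the constraints in the schema. The sketch below mirrors the required fields, the category enum, and the 1 to 5 priority range from the example schema; `build_ticket` is a hypothetical helper, not something Dify provides.

```python
from typing import Optional

# Mirrors the "enum" of the category property in the example schema.
ALLOWED_CATEGORIES = {"billing", "technical", "general", "urgent"}

def build_ticket(subject: str, category: str, description: str,
                 priority: Optional[int] = None) -> dict:
    """Build a request body for POST /tickets, enforcing the schema's constraints."""
    if not subject or not description:
        raise ValueError("subject and description are required")
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"category must be one of {sorted(ALLOWED_CATEGORIES)}")
    ticket = {"subject": subject, "category": category, "description": description}
    if priority is not None:
        # priority is optional in the schema, but bounded when present.
        if not isinstance(priority, int) or not 1 <= priority <= 5:
            raise ValueError("priority must be an integer between 1 and 5")
        ticket["priority"] = priority
    return ticket

print(build_ticket("Cannot log in", "technical", "Login page returns HTTP 500", priority=2))
```

A check like this is mostly useful in tests for the API behind the tool, since the LLM fills the parameters at runtime and may occasionally produce values outside the schema.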

For Agent-type applications, Dify uses a ReAct (Reasoning + Acting) strategy by default. The agent decides which tools to call, interprets results, and determines next steps. You can also configure function calling mode for OpenAI-compatible models, which tends to be more reliable for structured tool invocations.

Step 7: Expose the Application via API

Every Dify application automatically gets a RESTful API endpoint. This is how you integrate Dify into your production stack–your website, mobile app, Slack bot, or any other client can send requests to the Dify API and receive AI-generated responses.

In your application’s settings, click API Access. Dify generates a unique API key for each application. Copy the key and note the base URL (for self-hosted instances, this is typically http://your-server/v1).

Here is a Python example that sends a message to your chatbot and handles the streaming response:

import requests
import json

DIFY_API_KEY = "app-your-api-key-here"
DIFY_BASE_URL = "http://localhost/v1"

def chat_with_dify(query: str, conversation_id: str = "") -> dict:
    """Send a message to the Dify chatbot API and stream the answer."""
    headers = {
        "Authorization": f"Bearer {DIFY_API_KEY}",
        "Content-Type": "application/json"
    }

    payload = {
        "inputs": {},
        "query": query,
        "response_mode": "streaming",
        "conversation_id": conversation_id,
        "user": "user-123"
    }

    response = requests.post(
        f"{DIFY_BASE_URL}/chat-messages",
        headers=headers,
        json=payload,
        stream=True
    )
    # Surface HTTP errors (for example 401 or 404) before reading the stream
    response.raise_for_status()

    full_response = ""
    current_conversation_id = conversation_id

    # Each Server-Sent Events line carrying data starts with "data: "
    for line in response.iter_lines():
        if line:
            decoded = line.decode("utf-8")
            if decoded.startswith("data: "):
                data = json.loads(decoded[6:])
                event = data.get("event")

                if event == "message":
                    full_response += data.get("answer", "")
                    print(data.get("answer", ""), end="", flush=True)
                elif event == "message_end":
                    current_conversation_id = data.get("conversation_id", "")
                    metadata = data.get("metadata", {})
                    print(f"\n\nTokens used: {metadata.get('usage', {})}")

    return {
        "response": full_response,
        "conversation_id": current_conversation_id
    }

# First message
result = chat_with_dify("What plans do you offer?")

# Follow-up (maintains context)
result = chat_with_dify(
    "How much is the annual Pro plan?",
    conversation_id=result["conversation_id"]
)

The streaming mode delivers tokens as they are generated, providing a responsive user experience. Dify also supports a blocking mode ("response_mode": "blocking") that waits for the complete response before returning–useful for backend processing where streaming is unnecessary.

The API supports conversation history through the conversation_id parameter. Pass it between requests to maintain multi-turn context. Dify stores conversation history in PostgreSQL, so it persists across API restarts.
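The stream-handling part of the client can be isolated into small functions that are testable without a running server. This is a sketch under the assumption that the stream uses the `message` and `message_end` events with `answer` and `conversation_id` fields, as in the example above; `parse_sse_line` and `collect_answer` are illustrative names.

```python
import json
from typing import Iterable, Optional, Tuple

def parse_sse_line(raw: bytes) -> Optional[dict]:
    """Decode one Server-Sent Events line; return the JSON object or None if it carries no data."""
    text = raw.decode("utf-8").strip()
    if not text.startswith("data:"):
        return None  # blank lines, comments, and "event:" lines are skipped
    body = text[len("data:"):].strip()
    if not body:
        return None
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

def collect_answer(lines: Iterable[bytes]) -> Tuple[str, str]:
    """Concatenate streamed answer fragments and capture the conversation ID."""
    parts = []
    conversation_id = ""
    for raw in lines:
        event = parse_sse_line(raw)
        if event is None:
            continue
        if event.get("event") == "message":
            parts.append(event.get("answer", ""))
        elif event.get("event") == "message_end":
            conversation_id = event.get("conversation_id", "")
    return "".join(parts), conversation_id

# Hand-written sample lines; a real stream comes from response.iter_lines().
sample = [
    b'data: {"event": "message", "answer": "Hello"}',
    b"",
    b'data: {"event": "message", "answer": " there"}',
    b'data: {"event": "message_end", "conversation_id": "conv-1"}',
]
print(collect_answer(sample))  # ('Hello there', 'conv-1')
```

Separating parsing from the HTTP call also makes it easier to tolerate malformed or keep-alive lines, which would otherwise raise inside the request loop.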

Step 8: Implement Variables and Conditional Logic

Production applications need dynamic behavior based on user context. Dify’s variable system lets you pass structured data into applications and use it in prompts, conditional branches, and tool parameters. This is how you personalize responses without creating separate applications for each use case.

In your chatbot’s orchestration panel, click Add Variable. Create variables for common user context:

user_plan (Select type): Options are “free”, “pro”, “enterprise”. This determines which features the assistant can discuss.

user_name (Text type): For personalized greetings.

account_id (Text type): Passed to custom tools for account lookup.

Update your system prompt to reference these variables using Jinja2 syntax:

You are a support assistant for TechCorp.
The customer's name is {{user_name}} and they are on the {{user_plan}} plan.
Their account ID is {{account_id}}.

{% if user_plan == "free" %}
When discussing premium features, mention the upgrade path to Pro ($49/month).
Highlight the 30-day money-back guarantee to reduce upgrade hesitation.
{% elif user_plan == "enterprise" %}
This is a high-priority enterprise customer. Provide detailed technical answers.
Offer to schedule a call with their dedicated account manager if needed.
{% endif %}

Always greet the customer by name in your first response.

When calling the API, pass variable values in the inputs field:

payload = {
    "inputs": {
        "user_name": "Sarah",
        "user_plan": "pro",
        "account_id": "ACC-2026-4471"
    },
    "query": "How do I increase my API rate limit?",
    "response_mode": "blocking",
    "user": "user-456"
}

Variables also work in workflow nodes. In a conditional branch, you can route enterprise customers to a priority queue while free users get standard responses. This pattern eliminates the need for application-level routing logic in your backend–Dify handles it internally.
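Because `user_plan` is a Select variable with a fixed set of options, it may be worth validating the value on the client before sending a request. A minimal sketch, assuming the three options defined in this step; `build_chat_payload` is a hypothetical helper.

```python
# Options configured for the user_plan Select variable in this step.
PLAN_OPTIONS = ("free", "pro", "enterprise")

def build_chat_payload(query: str, user: str, user_name: str, user_plan: str,
                       account_id: str, response_mode: str = "blocking") -> dict:
    """Assemble a chat-messages request body with validated input variables."""
    if user_plan not in PLAN_OPTIONS:
        raise ValueError(f"user_plan must be one of {PLAN_OPTIONS}, got {user_plan!r}")
    if response_mode not in ("blocking", "streaming"):
        raise ValueError("response_mode must be 'blocking' or 'streaming'")
    return {
        "inputs": {
            "user_name": user_name,
            "user_plan": user_plan,
            "account_id": account_id,
        },
        "query": query,
        "response_mode": response_mode,
        "user": user,
    }

payload = build_chat_payload(
    query="How do I increase my API rate limit?",
    user="user-456",
    user_name="Sarah",
    user_plan="pro",
    account_id="ACC-2026-4471",
)
print(payload["inputs"]["user_plan"])  # pro
```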

Step 9: Monitor Performance and Debug with Logs

Dify includes built-in observability tools that most DIY LLM stacks lack. The Logs section in each application records every conversation, including the full prompt sent to the LLM, the retrieved documents (for RAG), token counts, latency, and cost. This is critical for debugging hallucinations, identifying retrieval failures, and optimizing token usage.

Navigate to your application’s Logs tab. Each conversation entry shows:

Input/Output: The exact user query and model response.

Retrieved Context: Which document chunks were retrieved and their similarity scores. If the answer seems wrong, check whether the right chunks were retrieved–often the issue is chunking strategy, not the LLM.

Token Usage: Prompt tokens, completion tokens, and total cost per interaction. This helps you project monthly costs at scale.

Latency Breakdown: Time spent on retrieval, LLM inference, and total response time. If latency exceeds user expectations, consider switching to a faster model for the classification step while keeping a more capable model for the final response.
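The token counts from the logs can be turned into a rough monthly projection. The sketch below is plain arithmetic; the per-million-token prices and traffic figures are placeholder assumptions, so substitute your provider's current rates and your own log data.

```python
def monthly_cost(conversations_per_day: int, prompt_tokens: int, completion_tokens: int,
                 input_price_per_million: float, output_price_per_million: float,
                 days: int = 30) -> float:
    """Project monthly spend from average per-conversation token counts."""
    per_conversation = (
        prompt_tokens * input_price_per_million
        + completion_tokens * output_price_per_million
    ) / 1_000_000
    return conversations_per_day * per_conversation * days

# Hypothetical numbers: 10,000 conversations/day, 1,200 prompt and 300 completion tokens each,
# priced at $2.50 (input) and $10.00 (output) per million tokens.
estimate = monthly_cost(10_000, 1_200, 300, 2.50, 10.00)
print(round(estimate, 2))  # 1800.0
```

Under these assumed figures, each conversation costs $0.006, which is $60 per day and about $1,800 per 30-day month.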

For workflow applications, the execution log is even more detailed. Each node shows its individual input, output, duration, and status. Failed nodes are highlighted in red, making it easy to identify exactly where a pipeline broke. You can re-run individual nodes with modified inputs for rapid debugging without restarting the entire workflow.

Dify also supports annotation-based improvement. When you find a response that should have been different, click Annotate and provide the ideal answer. These annotations are stored and can be used to fine-tune retrieval or override specific queries with canned responses–useful for frequently asked questions where you want guaranteed accuracy.

Step 10: Deploy to Production

Moving from a local Docker setup to a production deployment requires addressing security, scalability, and reliability. Here are the key changes you should make to the default Dify configuration for production use.

First, update the .env file with production-grade settings:

# .env - Production settings

# Change the secret key (CRITICAL for security)
SECRET_KEY=your-random-64-char-string-here

# Database configuration
DB_USERNAME=dify_prod
DB_PASSWORD=strong-random-password
DB_HOST=db
DB_PORT=5432
DB_DATABASE=dify_production

# Redis configuration
REDIS_PASSWORD=another-strong-password

# CORS settings (restrict to your domain)
WEB_API_CORS_ALLOW_ORIGINS=https://yourdomain.com
CONSOLE_CORS_ALLOW_ORIGINS=https://admin.yourdomain.com

# File storage (switch to S3 for production)
STORAGE_TYPE=s3
S3_BUCKET_NAME=dify-file-storage
S3_ACCESS_KEY=your-aws-access-key
S3_SECRET_KEY=your-aws-secret-key
S3_REGION=us-east-1

# Rate limiting
API_RATE_LIMIT=100/minute
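The placeholder secrets in the settings above should be replaced with random values. One way to generate them is Python's standard `secrets` module; the helper names below are illustrative, and the password length is an arbitrary choice.

```python
import secrets

def make_secret_key() -> str:
    """Return a 64-character URL-safe random string (48 random bytes, base64-encoded)."""
    return secrets.token_urlsafe(48)

def make_password(nbytes: int = 24) -> str:
    """Return a random URL-safe password; 24 bytes yields 32 characters."""
    return secrets.token_urlsafe(nbytes)

print(f"SECRET_KEY={make_secret_key()}")
print(f"DB_PASSWORD={make_password()}")
print(f"REDIS_PASSWORD={make_password()}")
```

The output can be pasted into the .env file. Generate the values once and store them in a secrets manager, since regenerating SECRET_KEY on a running deployment may invalidate existing sessions or encrypted data.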

For high-traffic deployments, scale the API and worker services horizontally:

# Scale API servers to handle more concurrent requests
docker compose up -d --scale api=3 --scale worker=2

# Verify all instances are healthy
docker compose ps | grep api
# dify-api-1 Up (healthy)
# dify-api-2 Up (healthy)
# dify-api-3 Up (healthy)

Place a reverse proxy (Nginx or Caddy) in front of Dify to handle TLS termination, load balancing, and rate limiting. The built-in Nginx container works for single-server deployments, but a dedicated proxy gives you more control over caching, header management, and connection limits.

For cloud deployments, consider using Dify’s hosted option at cloud.dify.ai for smaller teams that do not want to manage infrastructure. The cloud version includes automatic updates, managed backups, and a free tier for experimentation.

Step 11: Integrate Conversation Memory and User Feedback

Dify maintains conversation history automatically through its conversation management system. Each conversation is stored in PostgreSQL with full message history, metadata, and user identifiers. This enables multi-turn interactions where the assistant remembers previous context–essential for support conversations that span multiple topics.

To implement user feedback collection, use the message feedback API endpoint. This allows users to rate responses as helpful or unhelpful, which feeds into Dify’s annotation and improvement pipeline:

import requests

# Same credentials as in the Step 7 example
DIFY_API_KEY = "app-your-api-key-here"
DIFY_BASE_URL = "http://localhost/v1"

def submit_feedback(message_id: str, rating: str, user: str):
    """Submit feedback for a specific message.

    Args:
        message_id: The ID of the message to rate
        rating: 'like' or 'dislike'
        user: The user identifier
    """
    headers = {
        "Authorization": f"Bearer {DIFY_API_KEY}",
        "Content-Type": "application/json"
    }

    response = requests.post(
        f"{DIFY_BASE_URL}/messages/{message_id}/feedbacks",
        headers=headers,
        json={
            "rating": rating,
            "user": user
        }
    )

    return response.json()

# After receiving a response, let users rate it
submit_feedback(
    message_id="msg-abc123",
    rating="like",
    user="user-123"
)

Monitor feedback trends in the Dify dashboard under Logs → Annotations. A high dislike rate on specific topics indicates gaps in your knowledge base or issues with prompt engineering. Use this data to iteratively improve your application: add missing documents, refine chunking parameters, or adjust the system prompt to handle edge cases better.
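If you export feedback for offline analysis, the dislike rate per topic is a simple aggregation. The sketch below assumes you have already labeled each rated message with a topic yourself; the topic labels and the `dislike_rate_by_topic` helper are hypothetical and not produced by Dify.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def dislike_rate_by_topic(feedback: List[Tuple[str, str]]) -> Dict[str, float]:
    """Given (topic, rating) pairs with rating 'like' or 'dislike', return the dislike share per topic."""
    totals = defaultdict(int)
    dislikes = defaultdict(int)
    for topic, rating in feedback:
        totals[topic] += 1
        if rating == "dislike":
            dislikes[topic] += 1
    return {topic: dislikes[topic] / totals[topic] for topic in totals}

# Made-up sample data for illustration only.
sample_feedback = [
    ("billing", "dislike"),
    ("billing", "dislike"),
    ("billing", "like"),
    ("pricing", "like"),
]
rates = dislike_rate_by_topic(sample_feedback)
print(max(rates, key=rates.get))  # billing
```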

Common Pitfalls and How to Avoid Them

These are the most frequent mistakes developers make when running Dify applications in production, along with how to resolve them before they become critical.

Pitfall 1: Oversized document chunks. Setting chunk size too large (over 1,000 tokens) dilutes retrieval relevance. The LLM receives too much irrelevant context, increasing token costs and reducing answer precision. Start with 500-token chunks and adjust based on retrieval quality observed in the logs.

Pitfall 2: Missing chunk overlap. Setting overlap to zero causes context loss at chunk boundaries. If a critical answer spans two chunks, neither chunk contains the complete information. Use at least 50-token overlap for short documents and 100+ tokens for long-form content.

Pitfall 3: Using the wrong retrieval mode. Pure semantic search fails on exact-match queries like product codes, error numbers, or SKUs. Switch to hybrid retrieval mode (semantic + keyword) when your knowledge base contains structured data with specific identifiers.

Pitfall 4: Not setting a score threshold. Without a minimum similarity threshold, Dify returns the top K results regardless of relevance. This means completely unrelated chunks can be injected into the prompt. Set a threshold of 0.5 or higher and adjust based on retrieval accuracy in logs.

Pitfall 5: Ignoring the system prompt token budget. Long system prompts with detailed instructions consume tokens on every API call. A 500-token system prompt adds up at scale–at 10,000 daily conversations, that is 5 million extra input tokens per day. Keep system prompts under 300 tokens and move detailed instructions to the knowledge base.

Pitfall 6: Hardcoding a single LLM provider. If your application depends entirely on one provider and that provider has an outage, your application goes down. Configure at least two providers in Dify and use the model fallback feature, or implement provider switching in your workflow’s conditional logic.

Pitfall 7: Skipping conversation ID management. Each new API call without a conversation_id creates a new conversation, losing multi-turn context. Store and pass the conversation ID between requests. If the ID expires or becomes invalid, Dify returns a 404–handle this by starting a fresh conversation.
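The effect of the score threshold in Pitfall 4 is easy to see in a toy retrieval step. This is a conceptual sketch, not Dify's retrieval code; the chunk names and similarity scores are invented for the example.

```python
from typing import List, Tuple

def select_chunks(scored: List[Tuple[str, float]], top_k: int = 3,
                  score_threshold: float = 0.0) -> List[Tuple[str, float]]:
    """Keep chunks scoring at or above the threshold, then return the best top_k."""
    kept = [(chunk, score) for chunk, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Invented scores: only the first two chunks are actually relevant.
candidates = [
    ("pro-plan-pricing", 0.82),
    ("return-policy", 0.61),
    ("system-requirements", 0.22),
    ("enterprise-sso", 0.18),
]

print(select_chunks(candidates, top_k=3))                       # includes the 0.22 chunk
print(select_chunks(candidates, top_k=3, score_threshold=0.5))  # only the two relevant chunks
```

Without a threshold, the third slot is filled by a weakly related chunk simply because top K asks for three results.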

Troubleshooting Guide

This section covers the most common issues you will encounter when running Dify and their proven solutions.

Issue 1: Containers fail to start with “port already in use” error. Another process is using port 80 or 5432. Run sudo lsof -i :80 to identify the conflicting process. Either stop that process or change Dify’s ports in the .env file (EXPOSE_NGINX_PORT=3000).

Issue 2: “Model provider not configured” error when creating an application. Navigate to Settings → Model Provider and add at least one LLM provider with a valid API key. Dify validates keys on save–if validation fails, check that your API key has not expired and has sufficient quota.

Issue 3: Knowledge base indexing stuck at 0%. Check the worker container logs: docker compose logs worker. Common causes include insufficient memory (the embedding model needs RAM), network issues connecting to the embedding API, or corrupted document files. Try re-uploading the document in a different format (e.g., convert PDF to Markdown).

Issue 4: RAG returns irrelevant or empty results. Open the application logs and check the retrieved chunks. If chunks are irrelevant, reduce the chunk size and increase the score threshold. If no chunks are returned, the threshold may be too high–lower it to 0.3 temporarily. Also verify that the knowledge base is actually connected to the application in the orchestration panel.

Issue 5: API returns 401 Unauthorized. Verify the API key format. Dify application API keys start with app-. Ensure you are sending it in the Authorization: Bearer header, not as a query parameter. Also check that the API key belongs to the correct application–keys are application-specific.

Issue 6: Streaming responses cut off mid-sentence. This typically indicates a timeout issue. Check your reverse proxy timeout settings–Nginx defaults to 60 seconds, which may be insufficient for complex workflows. Increase proxy_read_timeout to 300 seconds. Also verify that your client library correctly handles Server-Sent Events (SSE).

Issue 7: Docker Compose crashes with out-of-memory error. Dify’s full stack (API, worker, PostgreSQL, Redis, Weaviate) requires at least 4 GB of free RAM. Check available memory with free -h. If running on a constrained machine, reduce Weaviate memory limits in docker-compose.yaml or switch to an external vector database.

Issue 8: Workflow nodes execute but return empty outputs. Check the node configuration for variable mapping errors. A common mistake is referencing a variable from a previous node using the wrong node ID. In the workflow editor, click a node and verify that its input variables correctly reference the output variables of upstream nodes. Dify uses the format {{#nodeId.variableName#}} for inter-node references.

Issue 9: Weaviate vector database fails to start. If you see Weaviate errors referencing gRPC port 50051, ensure that no other service is occupying that port. Dify v1.13.x upgraded the Weaviate client to v4, which requires Weaviate server v1.24.0 or higher and gRPC connectivity. Check the Weaviate container version in your docker-compose.yaml.

Issue 10: Slow response times exceeding 10 seconds. Profile the latency breakdown in application logs. If retrieval is slow, your vector database may need more memory or indexing optimization. If LLM inference is slow, consider using a faster model for non-critical steps (e.g., GPT-4o mini for classification, GPT-4o for final responses). Scaling the API container count also helps with concurrent request handling.

Advanced Tips for Production Dify Deployments

Once your basic application works, these advanced techniques will help you optimize performance, reduce costs, and handle enterprise-grade requirements.

Multi-model routing for cost optimization. Use a lightweight model (GPT-4o mini or Claude Haiku) for the first classification step in workflows, then route to a premium model only when the query requires deep reasoning. How much this saves depends on the share of queries that still need the premium model and on the price gap between the two models; reductions of 60-70% compared to using a premium model for all steps are plausible when most traffic stays on the lightweight model. In Dify's workflow builder, simply assign different models to different LLM nodes.
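A back-of-the-envelope calculation shows how the savings depend on the routing share. The sketch assumes every query pays for one lightweight classification call, premium-routed queries additionally pay for one premium call, and all other queries are answered by the lightweight model at the same per-call cost; the per-call costs are hypothetical.

```python
def routing_savings(premium_share: float, cheap_cost: float, premium_cost: float) -> float:
    """Fraction of cost saved versus sending every query straight to the premium model."""
    if not 0.0 <= premium_share <= 1.0:
        raise ValueError("premium_share must be between 0 and 1")
    routed = (
        cheap_cost                              # classification call for every query
        + premium_share * premium_cost          # premium answers
        + (1.0 - premium_share) * cheap_cost    # lightweight answers
    )
    return 1.0 - routed / premium_cost

# Hypothetical per-call costs: $0.0005 lightweight, $0.01 premium, 30% of queries routed to premium.
print(round(routing_savings(0.30, 0.0005, 0.01), 3))  # 0.615
```

Under these assumptions the saving is about 61%, and it shrinks as the premium share grows, so the classification prompt's accuracy directly affects the bill.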

Knowledge base segmentation. Instead of one massive knowledge base, create separate knowledge bases for different domains: product docs, API references, billing FAQs, and troubleshooting guides. Connect only relevant knowledge bases to each application. This improves retrieval precision and reduces noise in the retrieved context.

Scheduled document sync. For dynamic content that changes frequently (pricing pages, release notes), set up a cron job that exports the latest content to Markdown and uploads it to Dify’s knowledge base via the API. This keeps your RAG pipeline current without manual intervention:

#!/bin/bash
# sync-knowledge.sh – Run daily via cron

DIFY_API_KEY="dataset-your-key"
DIFY_URL="http://localhost/v1"
DATASET_ID="your-dataset-id"

# Export latest docs
python3 export_docs.py --output /tmp/latest-docs.md

# Upload to Dify knowledge base
curl -s -X POST "${DIFY_URL}/datasets/${DATASET_ID}/document/create_by_file" \
  -H "Authorization: Bearer ${DIFY_API_KEY}" \
  -F "file=@/tmp/latest-docs.md" \
  -F 'data={"indexing_technique":"high_quality","process_rule":{"mode":"automatic"}}'

echo "Knowledge base synced at $(date)"
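As written, the script uploads the export on every run, whether or not the content changed. One way to avoid redundant uploads is to compare a hash of the export against the hash of the last uploaded version. The sketch below shows that check in Python; the state file that stores the previous digest is a hypothetical convention for this example, not something Dify provides.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def needs_upload(export_path, state_path):
    """True when the export differs from the last uploaded version.

    state_path is a small text file holding the digest of the last upload
    (a hypothetical convention for this sketch).
    """
    state = Path(state_path)
    previous = state.read_text().strip() if state.exists() else ""
    return file_digest(export_path) != previous


def record_upload(export_path, state_path):
    """Store the current digest; call this only after a successful upload."""
    Path(state_path).write_text(file_digest(export_path) + "\n")
```

A sync job would call needs_upload() before the upload step and record_upload() only after the upload succeeds, so a failed upload is retried on the next run.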

Webhook integrations for real-time escalation. Configure Dify workflow nodes to call webhooks when specific conditions are met–for example, sending a Slack notification when the agent classifies a ticket as “urgent” or creating a Jira issue for bugs. The HTTP Request node in workflows supports any REST endpoint with custom headers and authentication.
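For reference, the request an HTTP Request node would send to a Slack incoming webhook is a JSON POST with a text field. The Python sketch below builds the equivalent request outside Dify so you can inspect its shape; the function name, ticket fields, and message format are assumptions made for illustration.

```python
import json
import urllib.request


def build_slack_alert(webhook_url, ticket_id, priority, summary):
    """Build (but do not send) the POST request for a Slack incoming webhook."""
    payload = {"text": f"[{priority.upper()}] Ticket {ticket_id}: {summary}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL: Slack issues the real one when you create the webhook.
    req = build_slack_alert(
        "https://hooks.slack.com/services/your/webhook/path",
        ticket_id="T-1042",
        priority="urgent",
        summary="Customer cannot log in after password reset",
    )
    # Print the body instead of sending it; pass req to
    # urllib.request.urlopen() to deliver it for real.
    print(req.data.decode("utf-8"))
```

In the workflow itself, the same URL, header, and JSON body go into the HTTP Request node, with the ticket fields filled from upstream node variables.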

Local model fallback with Ollama. For sensitive data that cannot leave your network, configure Ollama as a local model provider in Dify. Run a model like Llama 3 or Qwen locally and use it as a fallback when cloud APIs are unavailable or for processing confidential documents. Add Ollama in Settings → Model Provider with the endpoint http://host.docker.internal:11434 (or your Ollama server address).

Dify vs Alternatives: When to Use What

Dify occupies a specific niche in the AI development ecosystem. Understanding when it is the right choice–and when alternatives serve you better–saves significant development time.

Feature	Dify	LangChain	Flowise	OpenAI Assistants API
Setup complexity	Docker Compose (5 min)	pip install + custom code	npm install (5 min)	API calls only
Visual workflow builder	Yes (full-featured)	No (code only)	Yes (node-based)	No
RAG built-in	Yes (multiple vector DBs)	Yes (requires setup)	Yes (limited)	Yes (file search)
Multi-LLM support	10+ providers	20+ providers	10+ providers	OpenAI only
Self-hosted option	Yes (open-source)	Yes (library)	Yes (open-source)	No (cloud only)
Production observability	Built-in logs + metrics	Requires LangSmith	Basic logs	Dashboard only
Custom tools	OpenAPI spec	Python functions	JavaScript functions	Function calling
Best for	Teams wanting visual + API	Developers wanting full control	Quick prototypes	OpenAI-only apps

Choose Dify when you need a production-ready platform with visual workflow building, built-in RAG, and team collaboration features. Choose LangChain when you need maximum flexibility and are comfortable writing orchestration code. Choose Flowise for rapid prototyping when you do not need the full feature set. Choose OpenAI Assistants API when you are committed to the OpenAI ecosystem and want the simplest possible integration.

Dify’s $180 million valuation and $30 million in funding reflect investor confidence in the platform’s trajectory. The project’s open-source model, combined with a cloud-hosted option, positions it as a viable choice for both startups and enterprises building AI-powered applications in 2026.

Complete Working Project: Customer Support Assistant

Here is the complete project structure that ties together everything from this tutorial. You can clone this setup, add your own documents, and have a working customer support assistant in under 30 minutes.

# Project structure
dify-support-assistant/
├── docker/
│ └── .env # Production environment variables
├── knowledge/
│ ├── product-guide.md # Product documentation
│ ├── billing-faq.md # Billing FAQ
│ └── troubleshooting.md # Troubleshooting guide
├── tools/
│ └── ticket-api-schema.json # Custom tool OpenAPI spec
├── scripts/
│ ├── setup.sh # Automated setup script
│ ├── sync-knowledge.sh # Knowledge base sync cron job
│ └── test-api.py # API integration tests
└── README.md

#!/bin/bash
# setup.sh – One-command deployment
set -e

echo "Cloning Dify..."
git clone https://github.com/langgenius/dify.git
cd dify/docker

echo "Configuring environment..."
cp .env.example .env
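# Generate a random secret key
SECRET=$(openssl rand -hex 32)
# Anchor the match so variables such as S3_SECRET_KEY are left untouched.
# GNU sed syntax; on macOS use: sed -i '' "s/^SECRET_KEY=.*/..." .env
sed -i "s/^SECRET_KEY=.*/SECRET_KEY=${SECRET}/" .env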

echo "Starting Dify..."
docker compose up -d

echo "Waiting for services to be healthy..."
sleep 30
docker compose ps

echo "Dify is running at http://localhost"
echo "Create your admin account to get started."
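The fixed sleep 30 in the script is a guess: on a slow machine the services may need longer, and on a fast one the wait is wasted. A more robust approach is to poll until the web endpoint responds. The following is a minimal Python sketch of that idea; the helper names, attempt count, and delay are assumptions, and treating any HTTP status below 500 as "ready" is a simplification.

```python
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=5):
    """True if the URL answers with any HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: not up yet.
        return False


def wait_until_ready(probe, attempts=30, delay=5.0, sleep=time.sleep):
    """Call probe() until it returns True; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"service not ready after {attempts} attempts")


# Example (performs real network calls):
# wait_until_ready(lambda: http_ready("http://localhost"))
```

Passing the probe and the sleep function as arguments keeps the retry loop testable without a running Dify instance.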

After running the setup script, complete these steps in the Dify dashboard:

1. Create an admin account at http://localhost.

2. Add your LLM provider(s) in Settings → Model Provider.

3. Create a Knowledge Base and upload the documents from the knowledge/ directory.

4. Create a Chatbot application with the system prompt from Step 3.

5. Connect the Knowledge Base to the application.

6. Import the custom tool schema from tools/ticket-api-schema.json.

7. Build the workflow from Step 5 for ticket routing.

8. Copy the API key and run python scripts/test-api.py to verify the integration.
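The contents of scripts/test-api.py are not shown in this section, so here is a minimal sketch of what such a check could look like, using only the Python standard library. It assumes the application is a chat app reachable through the chat-messages endpoint in blocking mode; the DIFY_API_KEY and DIFY_URL environment variable names, the user identifier, and the test question are placeholders chosen for this example.

```python
import json
import os
import urllib.request

BASE_URL = os.environ.get("DIFY_URL", "http://localhost/v1")


def build_chat_request(api_key, query, user="test-user", base_url=BASE_URL):
    """Build a blocking-mode POST to the chat-messages endpoint."""
    body = {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",
        "user": user,
    }
    return urllib.request.Request(
        f"{base_url}/chat-messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(api_key, query):
    """Send one question and return the assistant's answer text."""
    with urllib.request.urlopen(build_chat_request(api_key, query), timeout=60) as resp:
        return json.loads(resp.read())["answer"]


if __name__ == "__main__":
    key = os.environ.get("DIFY_API_KEY")
    if key:
        answer = ask(key, "How do I reset my password?")
        assert answer.strip(), "expected a non-empty answer"
        print(answer)
    else:
        print("Set DIFY_API_KEY to run the live check.")
```

A fuller test suite would also ask a question your documents do not cover and confirm that the assistant escalates rather than inventing an answer.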

Frequently Asked Questions

Is Dify free to use?

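Dify is open-source for self-hosted deployments under the Dify Open Source License, which is based on Apache 2.0 with additional conditions (notably around operating multi-tenant services and removing Dify’s branding). For a typical single-organization deployment, you can run the full platform on your own servers at no cost beyond infrastructure. Dify also offers a cloud-hosted version at cloud.dify.ai with free and paid tiers. Note that you still pay for LLM API usage (OpenAI, Anthropic, etc.) regardless of which Dify deployment option you choose.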

How many LLM providers does Dify support?

Dify supports connections to a broad range of LLM providers, including OpenAI, Anthropic, Google Gemini, Azure OpenAI, Cohere, Mistral, Zhipu AI, Moonshot, and local model servers like Ollama and LM Studio. You can configure multiple providers simultaneously and choose different models for different applications or workflow nodes.

Can I run Dify with local models instead of cloud APIs?

Yes. Dify integrates with Ollama and LM Studio for local model inference. Add Ollama as a model provider in Settings using the endpoint http://host.docker.internal:11434. This keeps all data on your network–no API calls leave your infrastructure. You will need additional RAM (8 GB+ for 7B models, 16 GB+ for 13B models) to run local models alongside Dify’s services.

What vector databases does Dify support for RAG?

Dify supports Weaviate (included in the default Docker Compose), as well as pgvector, Milvus, Qdrant, Chroma, and other vector databases. Weaviate is the default and requires no additional configuration. To switch to a different vector database, update the VECTOR_STORE environment variable in your .env file and configure the corresponding connection parameters.

How does Dify compare to building with LangChain directly?

Dify provides a visual interface, built-in RAG, and production observability out of the box. LangChain offers more granular control through code but requires you to build the UI, observability, and deployment infrastructure yourself. Teams that want rapid iteration with a visual builder choose Dify. Teams that need custom orchestration logic or deep framework integration choose LangChain.

Can multiple team members collaborate on Dify applications?

Yes. Dify includes workspace and team management features. The admin can invite team members with different permission levels (admin, editor, viewer). Multiple developers can work on different applications within the same workspace. The cloud-hosted version includes additional collaboration features like shared API keys and centralized billing.

What file formats does Dify support for knowledge base documents?

Dify accepts PDF, Word (.docx), Markdown (.md), plain text (.txt), CSV, and HTML files for knowledge base ingestion. For best results, use Markdown or plain text–they parse most reliably. PDFs with complex layouts or scanned images may require preprocessing. CSV files are useful for structured data like product catalogs or FAQ lists.

How do I update Dify to the latest version?

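For Docker Compose deployments, navigate to the docker directory inside the repository (dify/docker in this tutorial) and run docker compose down, then git pull origin main, docker compose pull, and docker compose up -d. This stops the running services, pulls the latest configuration and images, and restarts the stack. After pulling, compare your .env against the updated .env.example in case new variables were added. Your data persists in the mounted volumes, so updates do not affect existing applications, knowledge bases, or conversation history. Always back up your PostgreSQL database before major version upgrades.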

Nadia Dubois

AI & Innovation Editor

Nadia Dubois is the AI & Innovation Editor at Tech Insider, where she tracks the rapid evolution of artificial intelligence, from foundation models to real-world enterprise deployment. She previously covered AI and startups for La Tribune and contributed to MIT Technology Review's European coverage. Nadia specializes in generative AI, AI regulation, and the intersection of technology and European industrial policy. She holds a dual degree in Computational Linguistics and Journalism from Sciences Po Paris.

URL: https://tech-insider.org/dify-tutorial-ai-agent-rag-workflow-2026/

How to Build an AI Support Agent with Dify in 11 Steps [2026]