Integrates with Zendesk Help Center to harvest articles, search data, and engagement metrics, enabling content intelligence through semantic search, gap detection, and label management.
KB Intelligence
A local documentation intelligence system for Zendesk Help Centers, built with Claude and Ollama.
It harvests your Help Center content and search data from Zendesk, builds a semantic knowledge graph over it, and gives you a full editorial intelligence layer: content gaps, stale articles, duplicate detection, LLM-powered label taxonomy, and a Claude Desktop integration via MCP.
Why This Exists
Most documentation teams — especially small ones — are flying blind. Your Help Center might have hundreds of articles. You know roughly what's in it. But you don't know:
What your users are actually searching for that you haven't written yet
Which articles are stale and generating support tickets because they're out of date
Which articles are near-duplicates of each other and should be consolidated
Whether your label taxonomy (if you have one) is consistent or riddled with model hallucinations
Which high-traffic articles have negative votes — meaning they exist, users find them, but they don't help
For a solo technical writer or a small team, answering any of these questions manually requires hours of spreadsheet work — if you do it at all. This system answers all of them automatically, continuously, and surfaces them in a dashboard built for editorial decision-making.
Especially useful for small teams
Large documentation teams have dedicated tooling, content strategists, and analytics staff. Small teams don't. A single technical writer supporting hundreds of articles at a growing SaaS company is the exact user this was built for.
The system runs entirely on your machine. There's no SaaS subscription, no data leaves your network, and the only ongoing cost is the Zendesk API calls you were already making. The local LLM stack (Ollama + llama3.1:8b) does the heavy lifting on label suggestions and gap analysis without sending your KB content to a cloud API.
The Claude integration layers on top: once the local database is built, Claude can reason over your entire Help Center with full semantic search, gap detection, and graph traversal — giving you a conversational interface to your own documentation.
What It Does
Pipeline (Python scripts, run once or on a schedule)
Import: Pulls article HTML, metadata, and engagement data from Zendesk via the REST API and Explore CSV exports
Embed: Generates semantic embeddings for every article and search query using Ollama (nomic-embed-text)
Graph: Extracts inbound/outbound links, transclusions, and section relationships to build a knowledge graph
Label: Uses a local LLM to suggest labels for every article based on your approved taxonomy. Corrections and few-shot examples are injected from data/corrections.json to keep suggestions consistent
Gaps: Cross-references zero-click search queries against article embeddings to surface content your users searched for but didn't find
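The Gaps step can be sketched as a nearest-neighbour check: a query counts as a gap when even its best-matching article scores below a similarity cutoff. This is a minimal illustration, not the code in gap_detection.py; the function names, the toy three-dimensional vectors, and the 0.6 threshold are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_gaps(query_vecs, article_vecs, threshold=0.6):
    """Return (query, best_article_id, best_score) for poorly covered queries.

    threshold is a hypothetical cutoff; tune it against your own data.
    """
    gaps = []
    for query, qvec in query_vecs.items():
        best_id, best_score = None, -1.0
        for article_id, avec in article_vecs.items():
            score = cosine(qvec, avec)
            if score > best_score:
                best_id, best_score = article_id, score
        if best_score < threshold:
            gaps.append((query, best_id, best_score))
    # Worst-covered queries first
    return sorted(gaps, key=lambda gap: gap[2])


# Toy embeddings standing in for nomic-embed-text vectors
articles = {101: [1.0, 0.0, 0.0], 102: [0.0, 1.0, 0.0]}
queries = {
    "reset password": [0.9, 0.1, 0.0],
    "export invoices": [0.1, 0.1, 0.9],
}
gaps = find_gaps(queries, articles)
```

Here only "export invoices" is reported, because neither article vector points in a similar direction. A brute-force loop like this is fine for a few hundred articles; a vector index such as sqlite-vec would replace the inner loop at larger scale.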
MCP Server (Claude Desktop integration)
mcp_server.py exposes the local database as an MCP server that Claude Desktop can connect to. Tools available to Claude:
Tool | What it does |
| Semantic search over the full article corpus |
| Retrieve a single article by ID with full metadata |
| Find articles semantically similar to a given one |
| Show which articles lack approved labels |
| Surface search queries with no good article matches |
| List articles by staleness signal (age, votes, ticket rate) |
| Traverse the knowledge graph for a given article |
| Write approved labels back to Zendesk |
| Resolve a title to a numeric article ID |
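As an illustration of the graph traversal tool, the sketch below walks outbound links breadth-first from one article up to a depth limit. It is a hedged example only: the edge-list representation, the traverse name, and the max_depth parameter are assumptions, and mcp_server.py may model the graph differently.

```python
from collections import deque


def traverse(edges, start_id, max_depth=2):
    """Breadth-first walk over directed (source_id, target_id) link edges.

    Returns {article_id: depth} for every article reachable within max_depth hops.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    depths = {start_id: 0}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        if depths[current] >= max_depth:
            continue  # do not expand past the depth limit
        for neighbor in adjacency.get(current, []):
            if neighbor not in depths:
                depths[neighbor] = depths[current] + 1
                queue.append(neighbor)
    return depths


# Hypothetical link edges: article 1 links to 2 and 3, and so on
links = [(1, 2), (2, 3), (3, 4), (1, 3)]
reachable = traverse(links, start_id=1, max_depth=2)
```

Recording the depth alongside each article lets a caller separate direct links from second-hop neighbours when summarising an article's surroundings.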
With this running, you can ask Claude things like:
"Which articles about SSO have the most support tickets after viewing?"
"Find me all articles that link to the getting started guide but don't have an onboarding label."
"What are the most common searches with no article result?"
"Which articles about billing haven't been updated in over a year and have negative votes?"
Dashboard (React + FastAPI)
A panel-based editorial dashboard for reviewing the pipeline output without leaving the browser:
Panel | Purpose |
Overview | Article counts, embedding coverage, label coverage at a glance |
Label Taxonomy | Browse and review your full label schema |
Content Gaps | Search queries with poor or no article coverage |
Article Health | Articles ranked by staleness, votes, and ticket rate |
Explorer | Interactive knowledge graph: click any article, see its connections |
Staleness | Articles sorted by last-updated date with engagement signals |
Label Workflow | Review LLM label suggestions article by article, approve or reject |
Search | Semantic search with similarity scores and metadata |
Consolidation | Clusters of near-duplicate articles to consider merging |
Maintenance | Database refresh controls and pipeline status |
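The staleness ranking behind the Article Health and Staleness panels can be approximated with a plain SQL query. The sketch below assumes a hypothetical articles table with updated_at and vote_sum columns; the real schema lives in schema.sql and may differ, and the real system matches topics semantically rather than with a keyword filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; not the project's actual schema
conn.execute(
    """
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        updated_at TEXT NOT NULL,   -- ISO 8601 date
        vote_sum INTEGER NOT NULL   -- upvotes minus downvotes
    )
    """
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?)",
    [
        (1, "Billing cycles explained", "2022-01-10", -4),
        (2, "Update billing contact", "2024-11-01", -1),
        (3, "Billing FAQ", "2021-06-15", 7),
        (4, "SSO setup", "2020-03-01", -9),
        (5, "Billing address change", "2023-02-01", -2),
    ],
)


def stale_negative(conn, keyword, as_of, max_age_days=365):
    """Articles matching keyword that are older than max_age_days and net-downvoted."""
    sql = """
        SELECT id, title
        FROM articles
        WHERE title LIKE '%' || ? || '%'
          AND julianday(?) - julianday(updated_at) > ?
          AND vote_sum < 0
        ORDER BY updated_at
    """
    return conn.execute(sql, (keyword, as_of, max_age_days)).fetchall()


stale = stale_negative(conn, "billing", as_of="2025-01-01")
```

Passing as_of explicitly, rather than calling date('now') inside the query, keeps the result reproducible in tests.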
Tech Stack
Component | Technology |
Database | SQLite + sqlite-vec for vector search |
Embeddings | Ollama (nomic-embed-text) |
LLM | Ollama (llama3.1:8b) |
Claude integration | |
API | FastAPI + uvicorn |
Dashboard | React 19 + Vite + TanStack Query + Tailwind 4 + D3 |
Tests | pytest (pipeline + API) + Vitest + Playwright (dashboard) |
Everything runs locally. No cloud AI APIs required for the pipeline.
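For a sense of what a local embedding call looks like, the sketch below builds a request against Ollama's /api/embed endpoint using only the standard library. The helper names are hypothetical, and the repository's own scripts may call Ollama through a different endpoint or a client library.

```python
import json
import urllib.request


def build_embed_request(host, model, text):
    """Build (but do not send) a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(host, model, text, timeout=30):
    """Send the request and return the first embedding vector.

    Requires a running Ollama instance with the model already pulled.
    """
    request = build_embed_request(host, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["embeddings"][0]


# Building the request needs no network access
req = build_embed_request(
    "http://localhost:11434", "nomic-embed-text", "How do I reset my password?"
)
```

Separating request construction from sending makes the payload easy to inspect or unit-test without Ollama running.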
Setup
Prerequisites
Python 3.11+
Node.js 20+
Ollama running locally
A Zendesk account with Help Center enabled and API access
Claude Desktop (for the MCP integration)
1. Clone and install
git clone https://github.com/your-github-username/zendesk-kb-intelligence.git
cd zendesk-kb-intelligence
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd dashboard && npm install && cd ..
2. Configure
Copy the example config and fill in your Zendesk credentials:
cp config.example.json config.json
Edit config.json:
{
  "zendesk": {
    "subdomain": "your-subdomain",
    "email": "admin@your-company.com",
    "api_token": "your_api_token_here",
    "locale": "en-us",
    "help_center_domain": "your-subdomain.zendesk.com"
  },
  "database": { "path": "kb.db" },
  "ollama": {
    "host": "http://localhost:11434",
    "embed_model": "nomic-embed-text",
    "eval_model": "llama3.1:8b"
  }
}
Configure the dashboard:
cp dashboard/.env.example dashboard/.env
# Edit dashboard/.env with your Zendesk subdomain
3. Pull Ollama models
ollama pull nomic-embed-text
ollama pull llama3.1:8b
4. Run the pipeline
# Initialize the database
python init_db.py
python init_vec.py
# Pull content from Zendesk
python harvest.py
# Load engagement data (export CSVs from Zendesk Explore first — see below)
python load_csvs.py
# Build embeddings
python embed_articles.py
python embed_queries.py
# Derive label schema from your app structure
python derive_label_schema.py
# Review and approve labels in data/label_schema.json, then:
# Run label suggestions (loops until all articles are processed)
bash run_until_done.sh
# Detect content gaps
python gap_detection.py
# Build the knowledge graph
python extract_links.py
python resolve_transclusions.py
5. Connect Claude Desktop
Add the MCP server to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "kb-intelligence": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/zendesk-kb-intelligence/mcp_server.py"],
      "env": {
        "DB_PATH": "/path/to/kb.db"
      }
    }
  }
}
6. Start the dashboard
# Terminal 1 — API
source .venv/bin/activate
uvicorn api.main:app --reload --port 8765
# Terminal 2 — Dashboard
cd dashboard && npm run dev
Open http://localhost:5173.
Zendesk Explore CSV Exports
load_csvs.py expects these reports exported from Zendesk Explore:
Report | Expected filename pattern |
Article engagement (views, votes) | |
Deflection drilldown | |
Quick answers | |
Search queries overview | |
Search clicks | |
Search no results | |
Export these to data/exports/ (gitignored).
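As a rough picture of what loading one of these exports involves, the sketch below reads a "search no results" style CSV into SQLite and sums repeated queries. The column names (query, searches), the table name, and the sample rows are assumptions for illustration; the real Explore exports and load_csvs.py may use different headers.

```python
import csv
import io
import sqlite3

# Hypothetical sample; real Explore exports may use different column names
SAMPLE_CSV = """query,searches
export invoices,42
sso okta,17
Export Invoices,8
"""


def load_no_result_queries(conn, csv_file):
    """Load query counts from an open CSV file, summing duplicate queries."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS no_result_queries ("
        "query TEXT PRIMARY KEY, searches INTEGER NOT NULL)"
    )
    loaded = 0
    for row in csv.DictReader(csv_file):
        query = row["query"].strip().lower()  # normalise case and whitespace
        if not query:
            continue
        conn.execute(
            "INSERT INTO no_result_queries (query, searches) VALUES (?, ?) "
            "ON CONFLICT(query) DO UPDATE SET searches = searches + excluded.searches",
            (query, int(row["searches"])),
        )
        loaded += 1
    conn.commit()
    return loaded


conn = sqlite3.connect(":memory:")
loaded = load_no_result_queries(conn, io.StringIO(SAMPLE_CSV))
totals = dict(conn.execute("SELECT query, searches FROM no_result_queries"))
```

For a file on disk, open it with open(path, newline="", encoding="utf-8") and pass the handle in place of the StringIO object.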
Label Schema
The label system has two layers:
Layer 1 — Architectural: Derived from your product's navigation structure. These reflect how your product is organized (tabs, sections, feature areas). Defined in LAYER_1_LABELS in derive_label_schema.py — edit this to match your product.
Layer 2 — Intent: Derived by clustering your actual search query embeddings. These reflect how your users talk about your product — often different from how your documentation is organized.
The two layers together form a vocabulary that satisfies both content organization and search discoverability.
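To make the Layer 2 idea concrete, the sketch below groups search queries whose embeddings are close to each other, so each group can be reviewed as a candidate intent label. The source does not say which clustering algorithm derive_label_schema.py uses; this example substitutes simple threshold-based single-linkage grouping with union-find, and the 0.9 threshold and toy vectors are assumptions.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_queries(vectors, threshold=0.9, min_size=2):
    """Group queries linked by pairwise similarity >= threshold (single linkage)."""
    parent = {query: query for query in vectors}

    def find(item):
        # Follow parent pointers to the root, compressing the path as we go
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for query in vectors:
        groups.setdefault(find(query), []).append(query)
    return [sorted(group) for group in groups.values() if len(group) >= min_size]


# Toy two-dimensional vectors standing in for real query embeddings
query_vectors = {
    "reset password": [1.0, 0.0],
    "forgot password": [0.99, 0.05],
    "export invoice": [0.0, 1.0],
    "password reset link": [0.98, 0.1],
}
clusters = cluster_queries(query_vectors)
```

The three password queries end up in one group and the lone invoice query is dropped by min_size. Comparing every pair is quadratic, which is acceptable for a few thousand distinct queries but would need an index beyond that.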
To customize for your product:
Edit LAYER_1_LABELS in derive_label_schema.py with your top-level navigation
Run python derive_label_schema.py
Review data/label_schema.json: set status, label, and description for Layer 2 candidates
Set reviewed_by and reviewed_at in label_schema.json
Commit label_schema.json and run the embedding + suggestion pipeline
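Before committing, a small check can confirm that the review steps above were actually completed. The sketch below assumes a hypothetical file layout, with reviewed_by and reviewed_at at the top level and Layer 2 candidates under a layer_2 key; only the field names come from the steps above, so adjust the structure to match your actual label_schema.json.

```python
import json


def review_problems(schema):
    """Return a list of human-readable problems; an empty list means the review is complete."""
    problems = []
    for field in ("reviewed_by", "reviewed_at"):
        if not schema.get(field):
            problems.append(f"missing {field}")
    # "layer_2" is an assumed key name for the list of intent-label candidates
    for index, candidate in enumerate(schema.get("layer_2", [])):
        for field in ("status", "label", "description"):
            if not candidate.get(field):
                problems.append(f"layer_2[{index}] missing {field}")
    return problems


# Hypothetical schema contents for illustration
schema = json.loads(
    """
    {
      "reviewed_by": "jo",
      "reviewed_at": null,
      "layer_2": [
        {"status": "approved", "label": "billing-invoices", "description": "Invoice questions"},
        {"status": "", "label": "sso", "description": ""}
      ]
    }
    """
)
problems = review_problems(schema)
```

A check like this could run as a pre-commit hook or as the first step of the suggestion pipeline, so unreviewed candidates never reach the LLM prompt.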
Running Tests
# Python pipeline tests
source .venv/bin/activate
pytest tests/ -v
# API tests (requires a populated kb.db)
pytest api/tests/ -v
# Dashboard unit tests
cd dashboard && npm run test:run
# Dashboard e2e tests (requires running dev server)
cd dashboard && npm run test:e2e
Project Structure
.
├── schema.sql # SQLite schema
├── init_db.py # DB initialization
├── harvest.py # Zendesk API → DB
├── load_csvs.py # Zendesk Explore CSVs → DB
├── embed_articles.py # Article embeddings (Ollama)
├── embed_queries.py # Search query embeddings
├── derive_label_schema.py # Label taxonomy derivation
├── suggest_labels.py # LLM label suggestions
├── gap_detection.py # Content gap detection
├── extract_links.py # Knowledge graph: links
├── resolve_transclusions.py # Knowledge graph: transclusions
├── apply_labels.py # Write labels back to Zendesk
├── mcp_server.py # Claude Desktop MCP integration
├── refresh.py # Incremental sync (in progress)
├── config.py # Config loader
├── config.example.json # Config template
├── .env.example # Environment variable template
├── data/
│ ├── label_schema.json # Approved label taxonomy
│ └── corrections.json # LLM correction rules
├── api/ # FastAPI backend
│ ├── main.py
│ ├── routers/
│ └── tests/
├── dashboard/ # React dashboard
│ ├── src/
│ ├── .env.example
│ └── .env.test
└── tests/ # Pipeline test suite
Adapting for Your Help Center
This repo is a working template, not an abstract framework. It was built for a real Help Center and then generalized. To make it yours:
Label schema: The biggest customization. Edit LAYER_1_LABELS in derive_label_schema.py to reflect your product's navigation. Everything else flows from the schema you define.
Few-shot examples: suggest_labels.py contains FEW_SHOT_EXAMPLES, real articles with correct labels that teach the LLM how to apply your taxonomy. Replace or extend these with examples from your own KB for best results.
Corrections: As you run the pipeline, you'll find systematic errors (the LLM applies a label too broadly, invents compound labels, etc.). Log them in data/corrections.json. The pipeline injects these as rules on every run.
Model: The pipeline uses llama3.1:8b by default. If you have a larger machine, swap in llama3.1:70b or any other Ollama-supported model in config.json.
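The way few-shot examples and corrections shape a suggestion can be sketched as prompt assembly followed by a filter that discards anything outside the approved taxonomy. This is an illustrative outline only: the prompt wording, the function names, and the data shapes are assumptions, not the contents of suggest_labels.py or data/corrections.json.

```python
def build_prompt(article_text, approved_labels, few_shot, corrections):
    """Assemble a labelling prompt from the taxonomy, correction rules, and examples."""
    lines = ["Assign labels to the article using ONLY this list:"]
    lines.append(", ".join(sorted(approved_labels)))
    if corrections:
        lines.append("Rules:")
        lines.extend(f"- {rule}" for rule in corrections)
    for example in few_shot:
        lines.append(f"Article: {example['text']}")
        lines.append(f"Labels: {', '.join(example['labels'])}")
    lines.append(f"Article: {article_text}")
    lines.append("Labels:")
    return "\n".join(lines)


def filter_labels(raw_response, approved_labels):
    """Keep only approved labels from a comma-separated model response, without repeats."""
    seen, kept = set(), []
    for part in raw_response.split(","):
        label = part.strip().lower()
        if label in approved_labels and label not in seen:
            seen.add(label)
            kept.append(label)
    return kept


# Hypothetical taxonomy, example, and correction rule
approved = {"billing", "sso", "onboarding"}
prompt = build_prompt(
    "How to download last month's invoice.",
    approved,
    few_shot=[{"text": "Set up Okta login.", "labels": ["sso"]}],
    corrections=["Never combine two labels into one."],
)
# "Billing-SSO" stands in for an invented compound label
labels = filter_labels("billing, Billing-SSO, sso, billing", approved)
```

Filtering after generation is a second line of defence: even with rules in the prompt, a small local model can still emit labels that are not in the schema.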
License
MIT
