Voozh

TL;DR

I built my first Model Context Protocol (MCP) server to give Claude Code read access to a local project knowledge base — and the first version was bad in ways I didn't expect. Here's the minimal TypeScript skeleton that actually works, plus 4 lessons about tool design that I wish someone had told me on day one.

The Problem

I use Claude Code (v2.x) as my daily driver, and most of the time the built-in file and shell tools are enough. But I had one recurring annoyance: a pile of internal docs — design notes, decision logs, runbooks — scattered across a directory that the agent kept re-reading from scratch every session. It would grep, open three files, lose the thread, and grep again.

I wanted the agent to ask one focused question — "what did we decide about retries?" — and get back the relevant note, not a directory listing.

That's exactly what MCP is for. MCP is an open protocol that lets you expose tools (callable functions), resources, and prompts to an LLM client over a standard interface. Claude Code speaks it natively. So instead of teaching the model to navigate my files, I could hand it a search_notes tool and a get_note tool and let it do the obvious thing.

The interesting constraint: the model only uses a tool well if the schema tells it exactly when and how. A tool the model calls with garbage arguments is worse than no tool at all. That's where the real work was.

How I Solved It

The whole thing is a small Node.js server (Node 22.x) using the official @modelcontextprotocol/sdk. Here's the architecture — it's deliberately boring:

flowchart LR
 A[Claude Code] -- stdio / JSON-RPC --> B[MCP Server]
 B --> C[search_notes]
 B --> D[get_note]
 C --> E[(Local notes dir)]
 D --> E

The server talks to Claude Code over stdio (standard in/out), which is the simplest transport — no ports, no auth, no network. The client launches your server as a subprocess and pipes JSON-RPC over the pipe. For a local, single-user tool this is exactly what you want.

The minimal skeleton

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
 CallToolRequestSchema,
 ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

const server = new Server(
 { name: "notes-server", version: "0.3.0" },
 { capabilities: { tools: {} } },
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
 tools: [
 {
 name: "search_notes",
 description:
 "Search the project knowledge base by keyword. Returns up to 5 " +
 "matching note titles with a one-line snippet. Use this FIRST to " +
 "find a note before fetching its full text.",
 inputSchema: {
 type: "object",
 properties: {
 query: { type: "string", description: "Keyword or phrase to search for" },
 },
 required: ["query"],
 },
 },
 {
 name: "get_note",
 description:
 "Fetch the full Markdown of one note by its exact id, as returned " +
 "by search_notes. Do not guess ids.",
 inputSchema: {
 type: "object",
 properties: {
 id: { type: "string", description: "The note id from search_notes" },
 },
 required: ["id"],
 },
 },
 ],
}));

Notice the descriptions read like instructions to a teammate, not API docs. That's the single highest-leverage thing in this file. More on that below.

The call handler

The handler is where I learned to be careful about failure. The first version threw exceptions on a missing file. Bad idea — a thrown error reads to the agent like the tool is broken, so it gives up instead of correcting course.

server.setRequestHandler(CallToolRequestSchema, async (req) => {
 const { name, arguments: args } = req.params;

 try {
 if (name === "search_notes") {
 const hits = await searchNotes(String(args?.query ?? ""));
 if (hits.length === 0) {
 return {
 content: [{
 type: "text",
 text: "No notes matched. Try a broader single keyword.",
 }],
 };
 }
 return { content: [{ type: "text", text: formatHits(hits) }] };
 }

 if (name === "get_note") {
 const note = await loadNote(String(args?.id ?? ""));
 if (!note) {
 return {
 content: [{
 type: "text",
 text: `No note with id "${args?.id}". Call search_notes to get a valid id.`,
 }],
 isError: true,
 };
 }
 return { content: [{ type: "text", text: note.body }] };
 }

 return {
 content: [{ type: "text", text: `Unknown tool: ${name}` }],
 isError: true,
 };
 } catch (err) {
 return {
 content: [{
 type: "text",
 text: `Tool failed: ${err instanceof Error ? err.message : String(err)}`,
 }],
 isError: true,
 };
 }
});

const transport = new StdioServerTransport();
await server.connect(transport);

The key move: errors come back as content with isError: true, with a sentence telling the model what to do next ("call search_notes to get a valid id"). The agent reads that and self-corrects, usually on the very next turn.

Wiring it into Claude Code

One line in the MCP config registers it:

{"mcpServers":{"notes":{"command":"node","args":["/abs/path/to/dist/index.js"]}}}

Restart Claude Code, and search_notes / get_note show up as callable tools. That's the whole loop.

Lessons Learned

1. The tool description IS the prompt — write it for the model, not for humans

My first search_notes description was literally "Searches notes." The model called it with full sentences, paragraph queries, sometimes the entire user message as the query. Garbage in, garbage out.

When I rewrote the description to say what it returns, when to use it, and what to do next ("Use this FIRST… then get_note"), the call quality jumped immediately. Treat every description field as a mini system prompt. The model has no other signal about your intent.

If you find yourself debugging "why won't the agent use my tool right?", the answer is almost always in the description, not the code.

2. Return structured errors, never stack traces

A raw exception bubbling up makes the agent think the tool is dead. A short, actionable error message makes it a recoverable hiccup. isError: true + "here's the valid next step" turned my flakiest tool into the most reliable one. Think of error messages as instructions for recovery, because to the model, that's exactly what they are.

3. Scope tools narrowly — two small tools beat one clever one

I was tempted to build a single notes(action, ...) mega-tool with an action enum. Don't. The model reasons about distinct tools far better than about a polymorphic one with conditional arguments. Two tools with clear names and required fields gave me dramatically more predictable behavior than one tool with five optional params. Narrow tools are also easier to grant or withhold — a real win when you care about what the agent can touch.

4. Version-stamp the server and the schema

I bumped version: "0.3.0" and added a stable id contract to get_note ("ids come from search_notes, don't guess"). When I later changed the search output format, the explicit contract meant I knew exactly what the model depended on. Schemas rot the same way APIs do — the model has memorized your old shape from earlier in the session. A version field and an explicit contract make breaking changes visible instead of mysterious.

A few smaller things that bit me:

stdio means stdout is sacred. Anything you console.log to stdout corrupts the JSON-RPC stream. Log to stderr (console.error) or a file. This cost me an hour of "why is the connection dropping?"
Keep tool count low. Every tool you expose is tokens in the model's context on every turn. I capped myself at the two tools I actually needed, and the agent's reasoning stayed tight because of it.
Make required fields actually required. Marking query and id as required in the JSON Schema stopped the model from calling tools with empty arguments "just to see what happens." The schema is a contract the client enforces before your code ever runs — lean on it.
Test the server by hand before wiring it in. You can pipe a JSON-RPC tools/list request straight into the process over stdin and eyeball the response. Doing that once saved me from chasing a config problem that was actually a serialization bug in my own formatHits.

What's Next

Three directions I'm exploring:

A list_recent_notes resource so the agent can browse without searching first.
Swapping the keyword search for a small embedding index — same two-tool interface, smarter matching underneath.
An HTTP transport version so the same server can back a shared team setup instead of a local subprocess.

The two-tool interface stays stable through all of that, which is the whole point of designing the schema carefully up front.

Wrap-up / CTA

Building an MCP server turned out to be 20% protocol and 80% interface design for a reader who happens to be a language model. The SDK gets you a working server in 40 lines; the lessons are all about making tools the model uses correctly.

If you're already using Claude Code, try writing one tiny MCP server this week — expose one thing you keep re-explaining to the agent and watch how much friction disappears.

💡 If this was useful:

Follow me here on Dev.to for more build logs on agent tooling.
Try Claude Code if you haven't — pairing it with a custom MCP server is where it gets genuinely fun.
Drop a comment with the first tool you'd expose. I'm collecting ideas. 🚀

URL: https://dev.to/yureki_lab/how-i-built-my-first-mcp-server-for-claude-code-4-lessons-293i

⇱ How I Built My First MCP Server for Claude Code (4 Lessons) - DEV Community