A guide for reducing token count in AI requests

Why this server?
Provides AI chat history compression tools through token-based trimming and AI-powered summarization to manage context within token limits.
SlimContext MCP Server
agentailor
License: A, Quality: A, Maintenance: C
Provides AI chat history compression tools through token-based trimming and AI-powered summarization strategies to manage conversation context within token limits.
Last updated 2025-09-17
2
36
5
MIT
Why this server?
Enables 70-90% LLM API cost reduction by compressing conversation history via local models or heuristics, with token counting and pinned facts.
PromptThrift MCP
woling-dev
License: A, Quality: -, Maintenance: B
Enables 70-90% LLM API cost reduction by compressing conversation history via local Gemma 4 models or heuristics, featuring token counting, model routing, and pinned facts for preserving critical context.
Last updated 2026-04-11
1
MIT
Why this server?
Offers context optimization tools including targeted file analysis and intelligent command execution to reduce token usage by extracting only relevant information.
Context Optimizer MCP Server
malaksedarous
License: A, Quality: A, Maintenance: C
Provides AI coding assistants with context optimization tools including targeted file analysis, intelligent terminal command execution with LLM-powered output extraction, and web research capabilities. Helps reduce token usage by extracting only relevant information instead of processing entire files and command outputs.
Last updated 2025-08-30
5
26
60
TypeScript
MIT
Why this server?
Semantic search with 98% token reduction for AI assistants.
th0th
S1LV4
License: A, Quality: -, Maintenance: A
Last updated 2026-06-12
135
MIT
Why this server?
An adaptive tiny-model layer that compresses verbose tool outputs to reduce token usage by up to two orders of magnitude.
PlanckBot
opcastil11
License: A, Quality: -, Maintenance: B
An adaptive tiny-model layer that sits between an LLM and its MCP tools, compressing verbose tool outputs to reduce token usage by up to two orders of magnitude.
Last updated 2026-05-10
1
Apache 2.0
Why this server?
Compresses long text, local files, and MCP catalog descriptions into denser context to reduce token usage without turning input into a summary.
ContextCrumb
Yuchen20
License: A, Quality: -, Maintenance: A
ContextCrumb compresses long text, local files, and MCP catalog descriptions into denser context for LLM agents. It helps agents load more useful information into the context window and reduce token usage without turning the input into a summary.
Last updated 2026-06-03
1
MIT
Why this server?
Provides intelligent code context and analysis through semantic compression, offering 60-80% token reduction while enabling code understanding.
Ambiance MCP Server
sbarron
License: A, Quality: A, Maintenance: D
Provides intelligent code context and analysis through semantic compression, AST parsing, and multi-language support. Offers 60-80% token reduction while enabling AI assistants to understand codebases through local analysis, OpenAI-enhanced insights, and GitHub repository integration.
Last updated 2025-10-19
6
13
3
MIT
Why this server?
Enables efficient AI agent operations through sandboxed code execution with up to 98.7% token reduction by processing data outside context.
Code Execution MCP
marc-shade
License: F, Quality: -, Maintenance: C
Enables efficient AI agent operations through sandboxed Python code execution with progressive tool discovery, PII tokenization, and skills persistence, achieving up to 98.7% token reduction by processing data in a sandbox rather than in context.
Last updated 2026-02-22
Why this server?
Supercharges agents with semantic code intelligence to save tokens and reduce costs.
tokensave
aovestdipaperino
License: A, Quality: -, Maintenance: A
Supercharge your Agent with Semantic Code Intelligence and save 💰 in the process!
Last updated 2026-06-17
222
MIT

URL: https://glama.ai/mcp/servers/search/a-guide-for-reducing-token-count-in-ai-requests
