URL: https://glama.ai/mcp/servers/search/a-guide-for-reducing-token-count-in-ai-requests

A guide for reducing token count in AI requests | Glama


  • Why this server?

    Provides AI chat history compression tools through token-based trimming and AI-powered summarization to manage context within token limits.

    License: A, Quality: A, Maintenance: C
    Provides AI chat history compression tools through token-based trimming and AI-powered summarization strategies to manage conversation context within token limits.
    Last updated
    2
    36
    5
    MIT
  • Why this server?

    Enables 70-90% LLM API cost reduction by compressing conversation history via local models or heuristics, with token counting and pinned facts.

    License: A, Quality: -, Maintenance: B
    Enables 70-90% LLM API cost reduction by compressing conversation history via local Gemma 4 models or heuristics, featuring token counting, model routing, and pinned facts for preserving critical context.
    Last updated
    1
    MIT
  • Why this server?

    Offers context optimization tools including targeted file analysis and intelligent command execution to reduce token usage by extracting only relevant information.

    License: A, Quality: A, Maintenance: C
    Provides AI coding assistants with context optimization tools including targeted file analysis, intelligent terminal command execution with LLM-powered output extraction, and web research capabilities. Helps reduce token usage by extracting only relevant information instead of processing entire files and command outputs.
    Last updated
    5
    26
    60
    TypeScript
    MIT
  • Why this server?

    Semantic search with 98% token reduction for AI assistants.

    License: A, Quality: -, Maintenance: A
    Last updated
    135
    MIT
  • Why this server?

    An adaptive tiny-model layer that compresses verbose tool outputs to reduce token usage by up to two orders of magnitude.

    License: A, Quality: -, Maintenance: B
    An adaptive tiny-model layer that sits between an LLM and its MCP tools, compressing verbose tool outputs to reduce token usage by up to two orders of magnitude.
    Last updated
    1
    Apache 2.0
  • Why this server?

    Compresses long text, local files, and MCP catalog descriptions into denser context to reduce token usage without turning input into a summary.

    License: A, Quality: -, Maintenance: A
    ContextCrumb compresses long text, local files, and MCP catalog descriptions into denser context for LLM agents. It helps agents load more useful information into the context window and reduce token usage without turning the input into a summary.
    Last updated
    1
    MIT
  • Why this server?

    Provides intelligent code context and analysis through semantic compression, offering 60-80% token reduction while enabling code understanding.

    License: A, Quality: A, Maintenance: D
    Provides intelligent code context and analysis through semantic compression, AST parsing, and multi-language support. Offers 60-80% token reduction while enabling AI assistants to understand codebases through local analysis, OpenAI-enhanced insights, and GitHub repository integration.
    Last updated
    6
    13
    3
    MIT
  • Why this server?

    Enables efficient AI agent operations through sandboxed code execution with up to 98.7% token reduction by processing data outside context.

    License: F, Quality: -, Maintenance: C
    Enables efficient AI agent operations through sandboxed Python code execution with progressive tool discovery, PII tokenization, and skills persistence, achieving up to 98.7% token reduction by processing data in a sandbox rather than in context.
    Last updated
  • Why this server?

    Supercharges agents with semantic code intelligence to save tokens and reduce costs.

    License: A, Quality: -, Maintenance: A
    Supercharge your Agent with Semantic Code Intelligence and save 💰 in the process!
    Last updated
    222
    MIT