MCP Whisper Transcription Server
An MCP (Model Context Protocol) server for audio/video transcription using MLX-optimized Whisper models. Optimized for Apple Silicon devices with ultra-fast performance.
Features
MLX-Optimized: Leverages Apple Silicon for blazing-fast transcription (up to 10x faster)
Multiple Formats: Supports txt, md, srt, and json output formats
Video Support: Automatically extracts audio from video files (MP4, MOV, AVI, MKV)
Batch Processing: Process multiple files in parallel with configurable workers
MCP Integration: Full MCP protocol support with tools and resources
Performance Tracking: Built-in performance monitoring and reporting
Flexible Models: Choose from 6 different Whisper models (tiny to large-v3-turbo)
Error Handling: Robust error handling and validation
Concurrent Processing: Thread-safe concurrent transcription support
Voice Activity Detection: Optional VAD to remove silence and speed up processing
Hallucination Prevention: Advanced filtering to remove common transcription artifacts
Performance
Speed: Up to 10x realtime transcription on Apple Silicon
Memory: Optimized memory usage (< 500MB for most files)
Concurrent: Handle multiple transcriptions simultaneously
Scalable: Batch process hundreds of files efficiently
Quick Start
Prerequisites
Apple Silicon Mac (M1, M2, M3, or later)
Python 3.10+
FFmpeg (for video support)
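As a rough sketch, the prerequisites above can be checked from Python before installing. The function name `check_prerequisites` is hypothetical and not part of this project. The Apple Silicon test relies on `platform.machine()` reporting `arm64`, which may not hold for a Python interpreter running under Rosetta.

```python
import platform
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which of the listed prerequisites this machine appears to meet."""
    return {
        # Apple Silicon Macs report Darwin / arm64 for a native interpreter.
        "apple_silicon": platform.system() == "Darwin"
        and platform.machine() == "arm64",
        "python_3_10_plus": sys.version_info >= (3, 10),
        # FFmpeg is only looked up on PATH; its version is not inspected.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```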
Installation
Install FFmpeg (if not already installed):
brew install ffmpeg
Clone the repository:
git clone https://github.com/galacoder/mcp-whisper-transcription.git
cd mcp-whisper-transcription
Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
Install dependencies:
poetry install
Test the installation:
poetry run python src/whisper_mcp_server.py --help
Configuration
Environment Variables
Create a .env file to customize settings:
# Model Configuration
DEFAULT_MODEL=mlx-community/whisper-large-v3-turbo
OUTPUT_FORMATS=txt,md,srt,json
# Performance Settings
MAX_WORKERS=4
TEMP_DIR=./temp
# Optional: API Keys for future cloud features
# OPENAI_API_KEY=your_key_here
Available Models
Model | Size | Speed | Memory | Best For |
tiny | 39M | ~10x | ~150MB | Quick drafts |
base | 74M | ~7x | ~250MB | Balanced performance |
small | 244M | ~5x | ~600MB | High quality |
medium | 769M | ~3x | ~1.5GB | Professional use |
large-v3 | 1550M | ~2x | ~3GB | Maximum accuracy |
large-v3-turbo | 809M | ~4x | ~1.6GB | Recommended |
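The server's own settings loader is not shown in this README. As a minimal sketch, the .env values from the Environment Variables section could be read and validated like this. The function names `parse_env` and `load_settings` are hypothetical, and falling back to the sample values as defaults is an assumption.

```python
# Output formats named in the Features list of this README.
SUPPORTED_FORMATS = {"txt", "md", "srt", "json"}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def load_settings(text: str) -> dict:
    """Convert raw .env text into typed settings (defaults are assumptions)."""
    env = parse_env(text)
    formats = [
        f.strip()
        for f in env.get("OUTPUT_FORMATS", "txt,md,srt,json").split(",")
        if f.strip()
    ]
    unknown = sorted(set(formats) - SUPPORTED_FORMATS)
    if unknown:
        raise ValueError(f"unsupported output formats: {unknown}")
    return {
        "default_model": env.get(
            "DEFAULT_MODEL", "mlx-community/whisper-large-v3-turbo"
        ),
        "output_formats": formats,
        "max_workers": int(env.get("MAX_WORKERS", "4")),
        "temp_dir": env.get("TEMP_DIR", "./temp"),
    }
```

Calling `load_settings(open(".env").read())` would return a dictionary with a list of formats and an integer worker count, and would raise `ValueError` for a format outside the four listed ones.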
Usage
Claude Desktop Integration
Add to your Claude Desktop configuration file:
{
  "mcpServers": {
    "whisper-transcription": {
      "command": "poetry",
      "args": ["run", "python", "src/whisper_mcp_server.py"],
      "cwd": "/absolute/path/to/mcp-whisper-transcription"
    }
  }
}
Configuration File Locations:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
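Editing the configuration by hand risks overwriting other servers already listed there. The following is a sketch of merging the entry above into an existing file while leaving other keys intact. The helper name `add_server_entry` is hypothetical, and the entry simply mirrors the JSON shown above.

```python
import json
from pathlib import Path


def add_server_entry(config_path: Path, repo_dir: str) -> dict:
    """Merge the whisper-transcription entry into a Claude Desktop config file."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file as an empty configuration.
        config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["whisper-transcription"] = {
        "command": "poetry",
        "args": ["run", "python", "src/whisper_mcp_server.py"],
        "cwd": repo_dir,
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS the first argument would be `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"`, and the second the absolute path of the cloned repository.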
Standalone Usage
# Run the MCP server directly
poetry run python src/whisper_mcp_server.py
# Or use the development server
poetry run python -m src.whisper_mcp_server
Available Tools & Resources
MCP Tools
Tool | Description | Key Parameters |
transcribe_file | Transcribe a single audio/video file | file_path, output_formats, model, use_vad |
batch_transcribe | Process multiple files in a directory | directory, pattern, max_workers |
| Show available Whisper models | None |
| Get details about a specific model | |
| Clear model cache | |
| Estimate transcription time | |
| Check file compatibility | |
get_supported_formats | List supported input/output formats | None |
MCP Resources
Resource | Description | Data Provided |
| Recent transcriptions | List of all transcriptions |
| Specific transcription details | Full transcription metadata |
| Available models | Model specifications and status |
| Current configuration | Server settings and environment |
| Supported formats | Input/output format details |
| Performance statistics | Speed, memory, and uptime metrics |
Quick Examples
# The calls below assume `client` is an already-connected MCP client
# and that they run inside an async function.

# Single file transcription
result = await client.call_tool("transcribe_file", {
    "file_path": "interview.mp4",
    "output_formats": "txt,srt",
    "model": "mlx-community/whisper-large-v3-turbo"
})

# Transcription with Voice Activity Detection
result = await client.call_tool("transcribe_file", {
    "file_path": "long_interview.mp4",
    "output_formats": "txt,srt",
    "use_vad": True  # Remove silence for faster processing
})

# Batch processing
result = await client.call_tool("batch_transcribe", {
    "directory": "./podcasts",
    "pattern": "*.mp3",
    "max_workers": 4
})

# Check supported formats
formats = await client.call_tool("get_supported_formats", {})
Development
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=src --cov-report=html
# Run specific test file
poetry run pytest tests/test_mcp_tools.py -v
Code Quality
# Format code
poetry run black .
poetry run isort .
# Type checking (optional)
poetry run mypy src/
# Lint code
poetry run flake8 src/
Project Structure
mcp-whisper-transcription/
├── src/
│   └── whisper_mcp_server.py    # Main MCP server
├── tests/                       # Comprehensive test suite
├── examples/                    # Usage examples and test files
├── transcribe_mlx.py            # MLX Whisper integration
├── whisper_utils.py             # Utility functions
└── pyproject.toml               # Project configuration
Performance Benchmarks
Test Results (Apple M3 Max)
Model | Audio Duration | Processing Time | Speed | Memory |
tiny | 10 minutes | 1.2 minutes | 8.3x | 150MB |
base | 10 minutes | 1.8 minutes | 5.6x | 250MB |
small | 10 minutes | 2.5 minutes | 4.0x | 600MB |
medium | 10 minutes | 4.2 minutes | 2.4x | 1.5GB |
large-v3 | 10 minutes | 5.8 minutes | 1.7x | 3GB |
large-v3-turbo | 10 minutes | 3.1 minutes | 3.2x | 1.6GB |
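The Speed column is the audio duration divided by the processing time. The sketch below recomputes it from the table and scales a benchmark to another audio length. The names `BENCHMARKS`, `speed_factor`, and `estimate_minutes` are illustrative, and treating processing time as linear in audio length is an assumption rather than a measured result.

```python
# (audio minutes, processing minutes) per model, copied from the table above.
BENCHMARKS = {
    "tiny": (10, 1.2),
    "base": (10, 1.8),
    "small": (10, 2.5),
    "medium": (10, 4.2),
    "large-v3": (10, 5.8),
    "large-v3-turbo": (10, 3.1),
}


def speed_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Realtime factor: minutes of audio handled per minute of processing."""
    return audio_minutes / processing_minutes


def estimate_minutes(model: str, audio_minutes: float) -> float:
    """Scale a benchmark row linearly to another audio length (an assumption)."""
    bench_audio, bench_processing = BENCHMARKS[model]
    return audio_minutes / speed_factor(bench_audio, bench_processing)


for model, (audio, processing) in BENCHMARKS.items():
    print(f"{model}: {speed_factor(audio, processing):.1f}x")
```

For example, `estimate_minutes("large-v3-turbo", 60)` gives about 18.6 minutes for an hour of audio under that linear assumption.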
Troubleshooting
Common Issues
FFmpeg not found
brew install ffmpeg
Model download slow
Models are cached in ~/.cache/huggingface/
First download can be slow but subsequent runs are fast
Memory issues
Use smaller models (tiny/base) for large files
Reduce MAX_WORKERS for concurrent processing
Permission errors
Ensure proper file permissions
Check output directory write access
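To narrow down a permission error before starting a transcription, a small check along these lines may help. It is a sketch only: `check_paths` is a hypothetical helper, not one of the server's tools, and `os.access` reports what the current user is allowed to do at the time of the call.

```python
import os
from pathlib import Path


def check_paths(input_file: str, output_dir: str) -> list[str]:
    """Return human-readable permission problems (empty list if none found)."""
    problems = []
    src = Path(input_file)
    out = Path(output_dir)
    if not src.is_file():
        problems.append(f"input file not found: {src}")
    elif not os.access(src, os.R_OK):
        problems.append(f"input file not readable: {src}")
    if not out.is_dir():
        problems.append(f"output directory not found: {out}")
    elif not os.access(out, os.W_OK | os.X_OK):
        # Creating files in a directory needs both write and search permission.
        problems.append(f"output directory not writable: {out}")
    return problems
```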
See TROUBLESHOOTING.md for detailed solutions.
Requirements
Python 3.10+
Apple Silicon Mac (M1, M2, M3, or later)
FFmpeg (for video file support)
4GB+ RAM (8GB+ recommended for large models)
2GB+ free disk space (for model cache)
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Acknowledgments
Built with FastMCP - Modern MCP server framework
Powered by MLX Whisper - Apple Silicon optimization
Original Whisper by OpenAI - Revolutionary speech recognition
Thanks to the MLX team at Apple for the incredible performance optimizations
