URL: https://glama.ai/mcp/servers/blueman82/recall

Recall

Long-term memory system for MCP-compatible AI assistants with semantic search and relationship tracking.

Features

  • Persistent Memory Storage: Store preferences, decisions, patterns, and session context

  • Semantic Search: Find relevant memories using natural language queries via ChromaDB vectors

  • MLX Hybrid Embeddings: Native Apple Silicon support via MLX for ~5-10x faster embeddings (automatic fallback to Ollama)

  • Memory Relationships: Create edges between memories (supersedes, relates_to, caused_by, contradicts)

  • Namespace Isolation: Global memories vs project-scoped memories

  • Context Generation: Auto-format memories for session context injection

  • Deduplication: Content-hash based duplicate detection
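As an illustration of the content-hash deduplication listed above, here is a minimal sketch. The whitespace and case normalization, the choice of SHA-256, the 16-character truncation, and the `DedupIndex` class are all assumptions made for the example, not Recall's confirmed implementation.

```python
import hashlib


def content_hash(content: str) -> str:
    """Hash normalized content. Normalization and truncation are assumptions."""
    normalized = " ".join(content.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class DedupIndex:
    """Hypothetical in-memory index mapping content hashes to memory IDs."""

    def __init__(self) -> None:
        self._by_hash: dict[str, str] = {}

    def add(self, memory_id: str, content: str) -> tuple[str, bool]:
        """Return (memory_id, created). Duplicate content reuses the first ID."""
        digest = content_hash(content)
        existing = self._by_hash.get(digest)
        if existing is not None:
            return existing, False
        self._by_hash[digest] = memory_id
        return memory_id, True
```

Storing the same text twice through `DedupIndex.add` returns the ID of the first copy with `created` set to `False`, so the caller can skip re-embedding.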

Installation

# Clone the repository
git clone https://github.com/yourorg/recall.git
cd recall

# Install with uv
uv sync

# On Apple Silicon: MLX embeddings work automatically (fastest option)
# On other platforms or as fallback: ensure Ollama is running
ollama pull mxbai-embed-large # Required if not using MLX
ollama pull llama3.2 # Optional: session summarization for auto-capture hook
ollama serve

Usage

Run as MCP Server

uv run python -m recall

CLI Options

uv run python -m recall --help

Options:
  --sqlite-path PATH      SQLite database path (default: ~/.recall/recall.db)
  --chroma-path PATH      ChromaDB storage path (default: ~/.recall/chroma_db)
  --collection NAME       ChromaDB collection name (default: memories)
  --ollama-host HOST      Ollama server URL (default: http://localhost:11434)
  --ollama-model MODEL    Embedding model (default: mxbai-embed-large)
  --ollama-timeout SECS   Request timeout (default: 30)
  --log-level LEVEL       DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)

meta-mcp Configuration

Add Recall to your meta-mcp servers.json:

{
  "recall": {
    "command": "uv",
    "args": [
      "run",
      "--directory",
      "/path/to/recall",
      "python",
      "-m",
      "recall"
    ],
    "env": {
      "RECALL_LOG_LEVEL": "INFO",
      "RECALL_OLLAMA_HOST": "http://localhost:11434",
      "RECALL_OLLAMA_MODEL": "mxbai-embed-large"
    },
    "description": "Long-term memory system with semantic search",
    "tags": ["memory", "context", "semantic-search"]
  }
}

Or for Claude Code / other MCP clients (claude.json):

{
  "mcpServers": {
    "recall": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/recall",
        "python",
        "-m",
        "recall"
      ],
      "env": {
        "RECALL_LOG_LEVEL": "INFO"
      }
    }
  }
}

Environment Variables

Variable                      Default                              Description
RECALL_SQLITE_PATH            ~/.recall/recall.db                  SQLite database file path
RECALL_CHROMA_PATH            ~/.recall/chroma_db                  ChromaDB persistent storage directory
RECALL_COLLECTION_NAME        memories                             ChromaDB collection name
RECALL_EMBEDDING_BACKEND      ollama                               Embedding backend: mlx (Apple Silicon) or ollama
RECALL_MLX_MODEL              mlx-community/mxbai-embed-large-v1   MLX embedding model identifier
RECALL_OLLAMA_HOST            http://localhost:11434               Ollama server URL
RECALL_OLLAMA_MODEL           mxbai-embed-large                    Ollama embedding model name
RECALL_OLLAMA_TIMEOUT         30                                   Ollama request timeout in seconds
RECALL_LOG_LEVEL              INFO                                 Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
RECALL_DEFAULT_NAMESPACE      global                               Default namespace for memories
RECALL_DEFAULT_IMPORTANCE     0.5                                  Default importance score (0.0-1.0)
RECALL_DEFAULT_TOKEN_BUDGET   4000                                 Default token budget for context

MCP Tool Examples

memory_store_tool

Store a new memory with semantic indexing. The tool uses the fast daemon path (<10ms) when the daemon is available and falls back to synchronous embedding otherwise.

{
 "content": "User prefers dark mode in all applications",
 "memory_type": "preference",
 "namespace": "global",
 "importance": 0.8,
 "metadata": {"source": "explicit_request"}
}

Response (fast path via daemon):

{
 "success": true,
 "queued": true,
 "queue_id": 42,
 "namespace": "global"
}

Response (sync path fallback):

{
 "success": true,
 "queued": false,
 "id": "550e8400-e29b-41d4-a716-446655440000",
 "content_hash": "a1b2c3d4e5f67890"
}
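The two response shapes differ in which identifier they carry: the fast path returns a queue_id, while the sync path returns the memory id and content_hash. A client might normalize them along these lines. `StoreResult` and `parse_store_response` are hypothetical helper names, and the error handling is an assumption.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StoreResult:
    """Hypothetical normalized view of a memory_store_tool response."""

    queued: bool
    reference: str  # queue ID on the fast path, memory ID on the sync path


def parse_store_response(response: dict[str, Any]) -> StoreResult:
    """Distinguish the queued (daemon) response from the synchronous one."""
    if not response.get("success"):
        raise RuntimeError(f"memory_store_tool failed: {response!r}")
    if response.get("queued"):
        # Fast path: no memory ID is present yet, only a queue position.
        return StoreResult(queued=True, reference=str(response["queue_id"]))
    return StoreResult(queued=False, reference=response["id"])
```

Judging from the examples above, a queued response carries no memory ID, so code that needs the ID immediately (for instance to create a relationship) cannot rely on the fast-path reply alone.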

daemon_status_tool

Check if the recall daemon is running:

{}

Response:

{
  "running": true,
  "status": {
    "pid": 12345,
    "store_queue": {"pending_count": 5},
    "embed_worker_running": true
  }
}

memory_recall_tool

Search memories by semantic similarity:

{
 "query": "user interface preferences",
 "n_results": 5,
 "namespace": "global",
 "memory_type": "preference",
 "min_importance": 0.5,
 "include_related": true
}

Response:

{
  "success": true,
  "memories": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "User prefers dark mode in all applications",
      "type": "preference",
      "namespace": "global",
      "importance": 0.8,
      "created_at": "2024-01-15T10:30:00",
      "accessed_at": "2024-01-15T14:22:00",
      "access_count": 3
    }
  ],
  "total": 1,
  "score": 0.92
}
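For intuition about what "semantic similarity" ranking with n_results and min_importance involves, here is a toy sketch that scores hand-made vectors by cosine similarity. It does not use ChromaDB, and the `Memory` class, the `recall` function, and the tiny two-dimensional embeddings are all illustrative assumptions; the real server delegates vector search to ChromaDB.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Memory:
    """Hypothetical record with a precomputed embedding vector."""

    content: str
    importance: float
    embedding: tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(
    query_embedding: Sequence[float],
    memories: Sequence[Memory],
    n_results: int = 5,
    min_importance: float = 0.0,
) -> list[tuple[float, Memory]]:
    """Filter by importance, then return the top matches by similarity."""
    candidates = [m for m in memories if m.importance >= min_importance]
    scored = [(cosine(query_embedding, m.embedding), m) for m in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n_results]
```

The importance filter is applied before ranking here, so a highly similar but low-importance memory never appears in the results; whether Recall filters before or after the vector query is not stated in this document.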

memory_relate_tool

Create a relationship between memories:

{
 "source_id": "mem_new_123",
 "target_id": "mem_old_456",
 "relation": "supersedes",
 "weight": 1.0
}

Response:

{
 "success": true,
 "edge_id": 42
}
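The relationship model can be pictured as a small directed, weighted edge list restricted to the four relation types named in the feature list. The sketch below is an in-memory stand-in: `MemoryGraph`, its methods, the self-edge check, and the sequential edge IDs are assumptions for illustration, not the server's storage schema.

```python
from dataclasses import dataclass, field

# Relation types listed in the Features section.
RELATIONS = frozenset({"supersedes", "relates_to", "caused_by", "contradicts"})


@dataclass
class MemoryGraph:
    """Hypothetical in-memory edge store: (source, target, relation, weight)."""

    edges: list[tuple[str, str, str, float]] = field(default_factory=list)

    def relate(self, source_id: str, target_id: str, relation: str, weight: float = 1.0) -> int:
        """Add a directed edge and return a 1-based edge ID."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation!r}")
        if source_id == target_id:
            raise ValueError("a memory cannot relate to itself")  # assumed rule
        self.edges.append((source_id, target_id, relation, weight))
        return len(self.edges)

    def superseded_ids(self) -> set[str]:
        """IDs that some other memory supersedes, e.g. to hide stale entries."""
        return {target for _, target, relation, _ in self.edges if relation == "supersedes"}
```

With the request shown above, `relate("mem_new_123", "mem_old_456", "supersedes")` would mark `mem_old_456` as superseded, which a recall step could use to prefer the newer memory.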

memory_context_tool

Generate formatted context for session injection:

{
 "query": "coding style preferences",
 "project": "myproject",
 "token_budget": 4000
}

Response:

{
 "success": true,
 "context": "# Memory Context\n\n## Preferences\n\n- User prefers dark mode [global]\n- Use 2-space indentation [project:myproject]\n\n## Recent Decisions\n\n- Decided to use FastAPI for the backend [project:myproject]\n",
 "token_estimate": 125
}
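One way to produce output of this shape under a token budget is to add memories in descending importance until the estimate would exceed the budget. The sketch below assumes roughly four characters per token, derives section headings from the memory type, and uses hypothetical function names; the actual formatting and token estimation in Recall may differ.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return max(1, len(text) // 4)


def heading_for(memory_type: str) -> str:
    """Derive a section heading from the type, e.g. 'preference' -> '## Preferences'."""
    return f"## {memory_type.title()}s"


def build_context(memories: list[dict], token_budget: int = 4000) -> tuple[str, int]:
    """Return (context, token_estimate) for dicts with content/type/namespace/importance."""
    title = "# Memory Context"
    used = estimate_tokens(title)
    lines_by_type: dict[str, list[str]] = {}
    for memory in sorted(memories, key=lambda m: m["importance"], reverse=True):
        line = f"- {memory['content']} [{memory['namespace']}]"
        cost = estimate_tokens(line)
        if memory["type"] not in lines_by_type:
            cost += estimate_tokens(heading_for(memory["type"]))  # new section heading
        if used + cost > token_budget:
            continue  # skip what does not fit; a shorter entry may still fit later
        lines_by_type.setdefault(memory["type"], []).append(line)
        used += cost
    blocks = [title]
    for memory_type, lines in lines_by_type.items():
        blocks.append(heading_for(memory_type) + "\n\n" + "\n".join(lines))
    return "\n\n".join(blocks) + "\n", used
```

Sorting by importance first means that a tight budget drops the least important memories rather than whichever happened to be stored last.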

memory_forget_tool

Delete memories by ID or semantic search:

{
 "memory_id": "550e8400-e29b-41d4-a716-446655440000",
 "confirm": true
}

Or delete by search:

{
 "query": "outdated preferences",
 "namespace": "project:oldproject",
 "n_results": 10,
 "confirm": true
}

Response:

{
 "success": true,
 "deleted_ids": ["550e8400-e29b-41d4-a716-446655440000"],
 "deleted_count": 1
}

Architecture

┌──────────────────────────────────────────────────────────────┐
│                     MCP Server (FastMCP)                     │
│ memory_store │ memory_recall │ memory_relate │ memory_forget │
└───────────────────────────┬──────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │     FAST PATH     │       │     SYNC PATH     │
    │       <10ms       │       │    MLX: <100ms    │
    └─────────┬─────────┘       │  Ollama: 10-60s   │
              │                 └─────────┬─────────┘
    ┌─────────▼─────────┐                 │
    │   recall-daemon   │       ┌─────────▼─────────┐
    │   (Unix socket)   │       │    HybridStore    │
    │                   │       └─────────┬─────────┘
    │  ┌─────────────┐  │                 │
    │  │ StoreQueue  │  │     ┌───────────┼───────────┐
    │  │ EmbedWorker │  │     │           │           │
    │  └─────────────┘  │     │           │           │
    └─────────┬─────────┘   ┌─▼─────┐ ┌───▼───┐ ┌─────▼─────┐
              │             │SQLite │ │Chroma │ │ Embedding │
              └─────────────►Store  │ │ Store │ │  Factory  │
                            └───────┘ └───────┘ └─────┬─────┘
                                                      │
                                          ┌───────────┴───────────┐
                                          │                       │
                                    ┌─────▼─────┐           ┌─────▼─────┐
                                    │    MLX    │           │  Ollama   │
                                    │  (Apple)  │           │ (Fallback)│
                                    └───────────┘           └───────────┘

The daemon provides fast (<10ms) memory storage by queueing operations and processing embeddings asynchronously. When the daemon is unavailable, the MCP server falls back to synchronous embedding via MLX (~100ms on Apple Silicon) or Ollama (10-60s on other platforms).
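The fast path described above amounts to a producer/consumer queue: the caller gets a queue ID back immediately while a background worker computes embeddings. The sketch below illustrates the pattern with the standard library only. `QueueingStore` and its methods are hypothetical, and the `embed` callable stands in for MLX or Ollama.

```python
import queue
import threading
from typing import Callable

Embedder = Callable[[str], list[float]]


class QueueingStore:
    """Hypothetical fast-path store: enqueue now, embed in a worker thread."""

    def __init__(self, embed: Embedder) -> None:
        self._embed = embed
        self._pending: queue.Queue[tuple[int, str]] = queue.Queue()
        self._lock = threading.Lock()
        self._next_id = 0
        self.embedded: dict[int, list[float]] = {}
        threading.Thread(target=self._run, daemon=True).start()

    def store(self, content: str) -> int:
        """Return a queue ID without waiting for the embedding."""
        with self._lock:
            self._next_id += 1
            queue_id = self._next_id
        self._pending.put((queue_id, content))
        return queue_id

    def _run(self) -> None:
        """Worker loop: embed queued items one at a time."""
        while True:
            queue_id, content = self._pending.get()
            try:
                self.embedded[queue_id] = self._embed(content)
            finally:
                self._pending.task_done()

    def wait_until_idle(self) -> None:
        """Block until every queued item has been processed."""
        self._pending.join()
```

The trade-off is visible in the API: `store` is cheap, but the embedding (and therefore searchability) only exists once the worker has caught up.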

Daemon Setup (macOS)

The recall daemon provides fast (<10ms) memory storage by processing embeddings asynchronously. Without the daemon, each store operation blocks until the embedding is computed: roughly 100ms with MLX on Apple Silicon, or 10-60 seconds when falling back to Ollama.

Quick Install

# From the recall directory
./hooks/install-daemon.sh

This will:

  1. Copy hook scripts to ~/.claude/hooks/

  2. Install the launchd plist to ~/Library/LaunchAgents/

  3. Start the daemon automatically

Manual Install

# 1. Copy hook scripts
cp hooks/recall*.py ~/.claude/hooks/
chmod +x ~/.claude/hooks/recall*.py

# 2. Create logs directory
mkdir -p ~/.claude/hooks/logs

# 3. Install plist with path substitution
sed "s|{{HOME}}|$HOME|g; s|{{RECALL_DIR}}|$(pwd)|g" \
 hooks/com.recall.daemon.plist.template > ~/Library/LaunchAgents/com.recall.daemon.plist

# 4. Load the daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

Daemon Commands

# Check status
echo '{"cmd": "status"}' | nc -U /tmp/recall-daemon.sock | jq

# Stop daemon
launchctl unload ~/Library/LaunchAgents/com.recall.daemon.plist

# Start daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

# View logs
tail -f ~/.claude/hooks/logs/recall-daemon.log
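The status check can also be issued from Python. This sketch mirrors the `nc -U` command above; it assumes the daemon answers a single newline-terminated JSON request with one JSON document and then closes the connection, which is not confirmed by this document. `daemon_request` is a hypothetical helper name.

```python
import json
import socket


def daemon_request(
    command: dict,
    socket_path: str = "/tmp/recall-daemon.sock",
    timeout: float = 2.0,
) -> dict:
    """Send one JSON command over the Unix socket and decode the JSON reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        client.connect(socket_path)
        client.sendall(json.dumps(command).encode("utf-8") + b"\n")
        client.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        # Assumption: the daemon closes the connection after replying.
        while chunk := client.recv(4096):
            chunks.append(chunk)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

Calling `daemon_request({"cmd": "status"})` raises `FileNotFoundError` or `ConnectionRefusedError` when the daemon is not running, which a caller can treat as the signal to use the sync path.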

Hooks Configuration

Add recall hooks to your Claude Code settings (~/.claude/settings.json). See hooks/settings.example.json for the full configuration.

Development

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=recall --cov-report=html

# Type checking
uv run mypy src/recall

# Run specific integration tests
uv run pytest tests/integration/test_mcp_server.py -v

Requirements

  • Python 3.13+

  • For Apple Silicon (recommended): MLX embeddings work automatically with the mlx-embeddings package

  • For other platforms: Ollama with:

    • mxbai-embed-large model (required for semantic search)

    • llama3.2 model (optional, for session auto-capture hook)

  • ~500MB disk space for ChromaDB indices

License

MIT
