
URL: https://mcp.so/server/attestor/bolnet

Attestor MCP Server


Attestor

@bolnet

Audit-grade memory backbone for agent teams. Bi-temporal facts (event time + transaction time, with recall(as_of=...) replay), 6-step deterministic retrieval (no LLM in the critical path), conversation ingest with speaker-locked dual-pass extraction, per-tenant Postgres row-level security, and Ed25519-signed provenance. Postgres + pgvector + Neo4j defaults; ArangoDB / DynamoDB / Cosmos / AlloyDB also supported. Listed on the official MCP Registry as io.github.bolnet/attestor.
Overview

Attestor

The memory layer for agent teams. Self-hosted, deterministic retrieval, zero LLM in the critical path.


pip install attestor
Version: 4.0.0a1 (alpha; greenfield rebuild, no v3 migration path)
PyPI: attestor
Import: attestor
Live site: https://attestor.dev/
Repo: https://github.com/bolnet/attestor
License: MIT

What it is

Attestor is a memory store for agent teams that need a shared, tenant-isolated memory with bi-temporal replay, deterministic retrieval, and an auditable supersession chain. It runs as a Python library, a Starlette REST service, or an MCP server, with the same API in all three.

It is built around three claims, each grounded in code:

  1. Bi-temporal: replay any past state. Every memory has both event time (valid_from / valid_until) and transaction time (t_created / t_expired). Nothing is deleted; everything is queryable forever (attestor/temporal/manager.py:43-73, core.py:888-890).
  2. Semantic-first retrieval, no LLM in the hot path. A six-step deterministic pipeline. Same query → same ranking. Unit-testable (attestor/retrieval/orchestrator.py:1-14).
  3. Conversation ingest with auditable conflict resolution. Two-pass speaker-locked extraction, then a four-decision (ADD / UPDATE / INVALIDATE / NOOP) resolver per fact. Every supersession carries an evidence_episode_id (attestor/extraction/conflict_resolver.py:98).

Designed for

  • Multi-agent products where many LLMs write to the same memory store
  • Regulated chat systems that need point-in-time reconstruction (compliance, audit, FOIA-style queries)
  • Self-hosted deployments: your VPC, your Postgres, your Neo4j

Not designed for

  • A general-purpose vector database
  • A RAG framework with built-in chunking, reranking, and orchestration
  • An LLM agent runtime. Attestor is the memory backend; the agent loop is yours

Quick start

1. Install

pip install attestor # or: pipx install attestor

2. Bring up local Postgres + Neo4j

attestor setup local # writes attestor/infra/local/docker-compose.yml
docker compose -f attestor/infra/local/docker-compose.yml up -d

Postgres 16 ships with pgvector (document + vector roles). Neo4j 5 ships with GDS (graph role: PageRank, BFS, Leiden).

3. Pull the default embedder

ollama pull bge-m3 # 1024-D, 8K context, local-first default

The provider chain in attestor/store/embeddings.py checks http://localhost:11434 first; cloud providers are fallbacks. Override via ATTESTOR_EMBEDDING_PROVIDER / ATTESTOR_EMBEDDING_MODEL.

4. Verify (mandatory)

attestor doctor

All four checks must be green for the default install: Document Store, Vector Store, Graph Store, Retrieval Pipeline. Graph (Neo4j) is required because the 6-step retrieval pipeline narrows on graph neighborhoods and the conversation ingest path writes typed edges (uses, authored-by, supersedes). The only hard dependency that cannot be down is the document store (Postgres); transient vector-probe failures are surfaced in the response trace rather than swallowed (retrieval/orchestrator.py, vector_error field).

5. Use it

from attestor import AgentMemory, AgentContext, AgentRole

mem = AgentMemory()  # picks up env / ~/.attestor.toml automatically
ctx = AgentContext(
    agent_id="researcher-1",
    role=AgentRole.RESEARCHER,
    namespace="acme-prod",
)
mem.add(
    content="Alice is the engineering manager",
    entity="alice",
    category="role",
    context=ctx,
)
results = mem.recall(query="who runs engineering?", context=ctx)
for r in results:
    print(r.score, r.memory.content)

SOLO mode (zero-config). In v4, AgentMemory().add('foo') auto-provisions a singleton local user, an Inbox project (metadata.is_inbox=true), and a daily session, so the snippet above works on a fresh database without configuring identity (core.py:179-209). For multi-tenant production use, pass an explicit AgentContext with a real namespace.

6. Run a smoke benchmark (optional)

Verify your install end-to-end against a tiny LongMemEval slice. Defaults match the canonical benchmark stack: openai/gpt-5.2 answerer, dual judges (openai/gpt-5.2 + anthropic/claude-sonnet-4.6), openai/gpt-5.2 distiller, OpenAI text-embedding-3-large truncated to 1024-D.

export OPENAI_API_KEY=...
.venv/bin/python scripts/lme_smoke_local.py --n 2

Every model and parameter is overridable via env var or CLI flag. See --help for the full table.


Architecture

Bi-temporal: replay any past state

Every memory carries two time axes:

Axis | Columns | Meaning
Event time | valid_from, valid_until | When the fact is true in the world
Transaction time | t_created, t_expired | When the row landed in the store

Plus a superseded_by chain. Old facts are never deleted; they remain queryable forever (attestor/temporal/manager.py:30-66).

# What did we believe on March 1?
mem.recall(query="who runs engineering?", as_of="2026-03-01T00:00:00Z", context=ctx)
# Show me everything we knew about Alice between Feb and Apr
mem.recall(query="alice", time_window=("2026-02-01", "2026-04-01"), context=ctx)

as_of and time_window propagate end-to-end through the orchestrator and document store. Auto-supersession on write is wired into core.py:add() (core.py:762, 784-785): on every add, the temporal manager finds active rows with the same (entity, category, namespace) and different content, marks them superseded, sets valid_until=now, and links superseded_by=<new_id>. Detection is rule-based string equality today.
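
The supersession rule and the as_of filter described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Attestor's code: the Row class and both helper functions are hypothetical, only the event-time axis is modeled, and valid_until is set to the new row's valid_from (rather than "now") so the example is deterministic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Row:  # hypothetical stand-in for a stored memory
    id: int
    entity: str
    category: str
    namespace: str
    content: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means "still true"
    superseded_by: Optional[int] = None


def add(rows: List[Row], new: Row) -> None:
    """Append `new`, superseding active rows with the same key but different content."""
    key = (new.entity, new.category, new.namespace)
    for old in rows:
        same_key = (old.entity, old.category, old.namespace) == key
        if same_key and old.valid_until is None and old.content != new.content:
            old.valid_until = new.valid_from  # close the old validity window
            old.superseded_by = new.id        # link the chain; nothing is deleted
    rows.append(new)


def as_of(rows: List[Row], when: datetime) -> List[Row]:
    """Return rows whose event-time window contains `when`."""
    return [
        r for r in rows
        if r.valid_from <= when and (r.valid_until is None or when < r.valid_until)
    ]


utc = timezone.utc
rows: List[Row] = []
add(rows, Row(1, "alice", "role", "acme-prod", "Alice is an engineer",
              datetime(2026, 1, 1, tzinfo=utc)))
add(rows, Row(2, "alice", "role", "acme-prod", "Alice is the engineering manager",
              datetime(2026, 4, 1, tzinfo=utc)))

march = as_of(rows, datetime(2026, 3, 1, tzinfo=utc))
print([r.content for r in march])  # ['Alice is an engineer']
```

The older row stays in the list with superseded_by=2, which is what makes the March query answerable after the April write.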

Tenant isolation β€” Postgres Row-Level Security

Every tenant table (users, projects, sessions, episodes, memories, user_quotas, deletion_audit) carries a tenant_isolation_* policy keyed off the attestor.current_user_id session variable. An empty / unset value fails closed: no rows are visible (attestor/store/schema.sql:311-327).

Honest disclosure. Enforcement lives in Postgres, not Python. The AgentRole enum in attestor/context.py:49-56 is metadata that flows onto memories for provenance; it does not gate operations in Python. RLS is what actually controls access. This is correct architecture for a memory backend, but worth knowing if you read the Python alone.

The retrieval pipeline: semantic-first, six steps

attestor/retrieval/orchestrator.py runs the same six steps for every query:

  1. Vector top-K: pgvector cosine, k=50
  2. Graph narrow: Neo4j BFS depth ≤ 2 from each candidate's entity to the question entities; affinity bonus per hop (0-hop=+0.30, 1-hop=+0.20, 2-hop=+0.10; unreachable=−0.05). Discrete, not "soft".
  3. Triples inject: typed-edge facts (uses, authored-by, supersedes) injected as synthetic memories
  4. MMR rerank: λ=0.7
  5. Confidence decay + temporal boost: recency lifts; stale, low-confidence rows fall
  6. Budget fit: greedy monotonic-by-score pack into the caller's token budget
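
Steps 2 and 4 are easy to reason about in isolation. The sketch below applies the per-hop bonus values quoted in step 2 and then runs a textbook greedy MMR with λ=0.7. It is an illustration only: the function names are hypothetical, and using cosine similarity between embeddings as the redundancy term is an assumption, since the text does not say how Attestor measures redundancy.

```python
import math
from typing import Dict, List, Optional, Sequence

HOP_BONUS = {0: 0.30, 1: 0.20, 2: 0.10}  # values quoted in step 2
UNREACHABLE_PENALTY = -0.05


def graph_adjust(score: float, hops: Optional[int]) -> float:
    """Add the discrete per-hop bonus; None or more than 2 hops counts as unreachable."""
    return score + HOP_BONUS.get(hops, UNREACHABLE_PENALTY)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr(relevance: Dict[str, float], vectors: Dict[str, Sequence[float]],
        lam: float = 0.7, k: int = 3) -> List[str]:
    """Greedy MMR: repeatedly pick argmax of lam*relevance - (1-lam)*max similarity to picks."""
    selected: List[str] = []
    remaining = sorted(relevance)  # sorted ids make tie-breaking deterministic

    def mmr_score(i: str) -> float:
        redundancy = max((cosine(vectors[i], vectors[j]) for j in selected), default=0.0)
        return lam * relevance[i] - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Vector-lane scores plus graph distance (in hops) for three candidates.
base = {"a": (0.60, 0), "b": (0.65, 1), "c": (0.55, None)}
relevance = {i: graph_adjust(s, hops) for i, (s, hops) in base.items()}
vectors = {"a": (1.0, 0.0), "b": (1.0, 0.05), "c": (0.0, 1.0)}  # b nearly duplicates a

order = mmr(relevance, vectors)
print(order)  # ['a', 'c', 'b']
```

Candidate b scores higher than c on relevance alone, but it is almost identical to a, so MMR promotes the dissimilar c ahead of it. Because every input is a plain number and ties break on id, the same query yields the same ranking, which is the property the pipeline relies on.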

Every call writes a JSONL trace to logs/attestor_trace.jsonl (disable via ATTESTOR_TRACE=0).

Three storage roles

Role | Purpose | Default | Alternatives
Document | Source of truth (content, tags, entity, ts, provenance, confidence) | Postgres 16 | AlloyDB, ArangoDB, DynamoDB, Cosmos DB
Vector | Dense embedding per memory | pgvector | AlloyDB ScaNN, ArangoDB, OpenSearch Serverless, Cosmos DiskANN
Graph | Entity nodes + typed edges | Neo4j 5 + GDS | Apache AGE on AlloyDB, ArangoDB, Neptune, NetworkX (Azure)

Postgres is the source of truth. Neo4j is derived state, rebuildable from Postgres, but it's required for the canonical install: graph expansion is step 2 of the retrieval pipeline and conversation ingest writes typed edges. The only role that cannot be down is the document store; the orchestrator records transient vector-probe failures in the response trace (vector_error) instead of swallowing them.

Optional BM25 / FTS lane

A trigger-maintained content_tsv tsvector + GIN index lifts queries that embeddings under-recall (acronyms, IDs, rare proper nouns). Enabled when v4 schema is detected; fuses with the vector lane via Reciprocal Rank Fusion (RRF, k=60). Graceful no-op on backends without the column (core.py:122-130).
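
Reciprocal Rank Fusion itself is small enough to show in full. The sketch below is the standard formula with k=60 applied to two ranked id lists; the function name and the sample ids are illustrative, and how Attestor wires the lanes together internally is not shown here.

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank), rank from 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, then by id, so ties resolve deterministically.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_lane = ["m1", "m2", "m3"]  # embedding ranking
fts_lane = ["m7", "m1"]           # full-text ranking; m7 is an exact-ID hit

fused = rrf([vector_lane, fts_lane])
print(fused)  # ['m1', 'm7', 'm2', 'm3']
```

m7 never appears in the vector lane, yet its top full-text rank lifts it to second place, which is the "under-recalled acronym or ID" case the lexical lane exists for.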


Conversation ingest

The heavyweight write path that turns conversation turns into auditable memories. core.py:ingest_round(turn) orchestrates four passes:

turn → extract_user_facts(user_turn)       ┐
       extract_agent_facts(assistant_turn) ┘ → resolve_conflicts → apply

Two-pass speaker-locked extraction

attestor/extraction/round_extractor.py:216, 258: separate prompts for user vs assistant turns. The user-turn extractor only emits facts attributable to the user; the assistant-turn extractor only emits facts the assistant introduced. This stops cross-attribution. The "+53.6 over Mem0" delta in our LongMemEval scores comes from this split.

Four-decision conflict resolver

attestor/extraction/conflict_resolver.py:40, 98: for each newly-extracted fact, an LLM call against existing similar memories returns one of:

Decision | Effect
ADD | New info, no existing match: write fresh memory
UPDATE | Same entity + predicate, refined value: keep existing id
INVALIDATE | Old memory contradicted: mark superseded (timeline replays)
NOOP | Already represented: skip

Each Decision carries evidence_episode_id. Every supersession is auditable. Failsafe: parse failure on a single fact yields ADD-by-default, since a duplicate-ish row is better than a silent drop.
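
The ADD-by-default failsafe is worth seeing concretely. The sketch below parses one resolver reply into a decision and falls back to ADD on anything malformed. The JSON reply shape, the Op enum, and the Decision fields other than evidence_episode_id are assumptions made for illustration; the real contract lives in conflict_resolver.py.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Op(Enum):
    ADD = "ADD"
    UPDATE = "UPDATE"
    INVALIDATE = "INVALIDATE"
    NOOP = "NOOP"


@dataclass(frozen=True)
class Decision:  # hypothetical shape
    op: Op
    evidence_episode_id: str
    target_id: Optional[str] = None  # existing memory affected, if any


def parse_decision(raw: str, episode_id: str) -> Decision:
    """Parse one resolver reply; fall back to ADD on any malformed input."""
    try:
        data = json.loads(raw)
        op = Op(data["op"])
        target = data.get("target_id")
        if op in (Op.UPDATE, Op.INVALIDATE) and not target:
            raise ValueError("UPDATE / INVALIDATE need a target memory")
        return Decision(op, episode_id, target)
    except (ValueError, KeyError, TypeError, AttributeError):
        # Better a duplicate-ish row than a silently dropped fact.
        return Decision(Op.ADD, episode_id)


good = parse_decision('{"op": "INVALIDATE", "target_id": "m-17"}', "ep-9")
bad = parse_decision("Sure! I think this is an update.", "ep-9")
print(good.op.value, good.target_id)  # INVALIDATE m-17
print(bad.op.value, bad.target_id)    # ADD None
```

Both results carry the episode id, so even the fallback row can be traced to the conversation turn that produced it.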

Two write paths, two contracts. mem.add(...) runs the lightweight rule-based supersession (§Bi-temporal). mem.ingest_round(turn) runs the full four-decision pipeline. Pick ingest_round for conversational data; pick add for structured writes where you've already done the conflict reasoning.

Sleep-time consolidation

mem.consolidate() (core.py:526) re-extracts and synthesizes facts from recent episodes with a stronger model. Currently a Python-API-only call with no CLI command. Schedule it from your application (cron, systemd timer, ECS scheduled task) when you want fresher facts than the streaming extractor produces.

Reflection engine

attestor/consolidation/reflection.py runs periodic synthesis across N episodes for one user. Outputs:

  • stable_preferences: patterns appearing in 3+ episodes
  • stable_constraints: rules the user repeatedly invokes
  • changed_beliefs: preferences that shifted (old → new, with explicit invalidate)
  • contradictions_for_review: flagged for HUMAN REVIEW, not auto-resolved

The "do not auto-resolve" stance is the load-bearing piece for regulated chat systems. The prompt is explicit (reflection.py:35-66): "Do NOT auto-resolve contradictions. Flag them for human review."

Chain-of-Note reading

pack = mem.recall_as_pack(query="who runs engineering?", context=ctx)
# pack.memories : list of {id, content, validity_window, confidence, source_episode_id}
# pack.prompt : default Chain-of-Note prompt with NOTES → SYNTHESIS → CITE → ABSTAIN → CONFLICT structure

The default prompt has explicit ABSTAIN and CONFLICT clauses; every frontier model defaults to confabulation otherwise.


Multi-agent primitives

Six roles

AgentRole: ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR (attestor/context.py:49-56). The role flows onto every memory's metadata for provenance. Access enforcement happens at the Postgres RLS layer (see §Tenant isolation).

AgentContext: handoff, scratchpad, trail

orchestrator = AgentContext.from_env(agent_id="orchestrator", namespace="project:acme")
planner = orchestrator.as_agent("planner", role=AgentRole.PLANNER)
executor = planner.as_agent("executor", role=AgentRole.EXECUTOR)
# Each child carries parent_agent_id + accumulating agent_trail.
# All three share the same scratchpad: Dict[str, Any] for typed handoff data.

as_agent() creates a child context with parent_agent_id, full agent_trail, and a shared scratchpad. The trail accumulates, which is useful for proving "this answer came from agent X who got it from agent Y."
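
To make the handoff semantics concrete, here is a minimal stand-in for the pattern. The Ctx class is hypothetical and omits roles; whether Attestor's trail includes the current agent or only its ancestors is not stated above, so this sketch assumes ancestors only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Ctx:  # hypothetical stand-in for AgentContext
    agent_id: str
    namespace: str
    parent_agent_id: Optional[str] = None
    agent_trail: Tuple[str, ...] = ()
    scratchpad: Dict[str, Any] = field(default_factory=dict)

    def as_agent(self, agent_id: str) -> "Ctx":
        """Derive a child: same namespace and scratchpad object, extended trail."""
        return Ctx(
            agent_id=agent_id,
            namespace=self.namespace,
            parent_agent_id=self.agent_id,
            agent_trail=self.agent_trail + (self.agent_id,),
            scratchpad=self.scratchpad,  # shared by reference, not copied
        )


orchestrator = Ctx("orchestrator", "project:acme")
planner = orchestrator.as_agent("planner")
executor = planner.as_agent("executor")

planner.scratchpad["plan"] = ["fetch", "summarize"]  # written by the planner...
print(executor.scratchpad["plan"])                   # ...visible to the executor
print(executor.parent_agent_id, executor.agent_trail)
```

Passing the same dict object down the chain is what makes the scratchpad shared, while the immutable tuple gives each child its own record of how it was reached.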

Per-agent token budgets

AgentContext.token_budget (default 20 000) is enforced: recall() packs results greedily until the budget is exhausted (scorer.py:fit_to_budget). token_budget_used accumulates across calls in a session.
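
A greedy pack of this kind can be sketched as follows. The function name is hypothetical, whitespace splitting is a crude stand-in for a real tokenizer, and stopping at the first result that does not fit (rather than skipping it and trying smaller ones) is an assumption about what "monotonic-by-score" means.

```python
from typing import List, Tuple


def pack_greedy(scored: List[Tuple[float, str]], token_budget: int) -> Tuple[List[str], int]:
    """Pack texts in descending score order; stop at the first one that does not fit."""
    packed: List[str] = []
    used = 0
    for _score, text in sorted(scored, key=lambda pair: -pair[0]):
        cost = len(text.split())  # crude whitespace token count (assumption)
        if used + cost > token_budget:
            break  # stop rather than skip, so the packed order stays monotonic
        packed.append(text)
        used += cost
    return packed, used


results = [
    (0.9, "Alice is the engineering manager"),
    (0.4, "Bob joined in March"),
    (0.7, "Alice reports to the CTO"),
]
packed, used = pack_greedy(results, token_budget=10)
print(packed, used)
```

With a budget of 10 the two highest-scoring results (5 tokens each) fit exactly and the third is left out. A caller tracking a session total would add `used` to its running token_budget_used.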

Optional write quotas

mem.set_quota(user_id, daily_writes=...) → enforced on add against the v4 user_quotas table (core.py:592-621). Optional; unset means unlimited.


Security & Compliance

Row-Level Security

Cross-link to §Tenant isolation. RLS policies are the access-control surface; the Python layer trusts them. Set attestor.current_user_id per connection.

Provenance on every memory

Every memory carries agent_id, session_id, source_episode_id. The supersession chain (superseded_by) is preserved forever. Conversation episodes are stored verbatim, separate from the memories extracted from them, so you can always reconstruct which conversation turn produced which fact.

Deletion audit log

Hard deletes (e.g., GDPR purges) write a row to deletion_audit before the cascade: what was deleted, when, why, by whom. This is the carve-out for the otherwise-immutable schema.
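
The "audit first, then delete, atomically" pattern can be demonstrated with the standard library. The sketch below uses an in-memory SQLite database purely so it runs anywhere; Attestor uses Postgres, and the table columns and the purge_with_audit helper here are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def purge_with_audit(db: sqlite3.Connection, user_id: str, reason: str, actor: str) -> int:
    """Record the purge in deletion_audit, then delete, inside one transaction."""
    with db:  # commits on success, rolls back if anything raises
        count = db.execute(
            "SELECT COUNT(*) FROM memories WHERE user_id = ?", (user_id,)
        ).fetchone()[0]
        db.execute(
            "INSERT INTO deletion_audit (user_id, rows_deleted, reason, actor, deleted_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, count, reason, actor, datetime.now(timezone.utc).isoformat()),
        )
        db.execute("DELETE FROM memories WHERE user_id = ?", (user_id,))
    return count


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT)")
db.execute(
    "CREATE TABLE deletion_audit (user_id TEXT, rows_deleted INTEGER, "
    "reason TEXT, actor TEXT, deleted_at TEXT)"
)
db.executemany(
    "INSERT INTO memories (user_id, content) VALUES (?, ?)",
    [("user-42", "likes tea"), ("user-42", "lives in Oslo"), ("user-7", "likes coffee")],
)

purged = purge_with_audit(db, "user-42", "right-to-erasure request", "dpo@example.com")
remaining = db.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
audit = db.execute("SELECT user_id, rows_deleted, reason FROM deletion_audit").fetchall()
print(purged, remaining, audit)
```

Because the audit insert and the delete share one transaction, a failure part-way through leaves neither an orphaned audit row nor an unaudited deletion.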

GDPR β€” export and purge

mem.export_user(external_id="user-42") # full data export (memories + episodes + sessions + projects)
mem.purge_user(external_id="user-42", # cascading hard delete with audit trail
 reason="GDPR right-to-erasure request 2026-04-27")
mem.deletion_audit_log(limit=100) # forensic readback

core.py:557-590. v4 only. Returns / writes everything Subject Access requires for Art. 15 / Art. 17.

Optional: Ed25519 provenance signing

Enable via config (signing.enabled = true). On every add, attestor signs the canonical payload id || agent_id || t_created || content_hash with an Ed25519 key. mem.verify_memory(memory_id) returns bool (core.py:623-640). Optional, off by default β€” turn on for adversarial-write contexts where you need cryptographic non-repudiation.
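
A signature is only as good as the bytes it covers, so the payload has to be assembled the same way at signing and verification time. The sketch below builds one possible encoding of id || agent_id || t_created || content_hash. SHA-256 for the content hash and the 4-byte length prefixes are assumptions made for illustration; the source does not specify either, and the helper name is hypothetical.

```python
import hashlib


def canonical_payload(memory_id: str, agent_id: str, t_created: str, content: str) -> bytes:
    """Build id || agent_id || t_created || content_hash as bytes ready for signing."""
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    fields = [memory_id, agent_id, t_created, content_hash]
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    return b"".join(
        len(f.encode("utf-8")).to_bytes(4, "big") + f.encode("utf-8") for f in fields
    )


original = canonical_payload(
    "m-1", "researcher-1", "2026-04-27T00:00:00Z", "Alice is the engineering manager"
)
tampered = canonical_payload(
    "m-1", "researcher-1", "2026-04-27T00:00:00Z", "Alice is the CEO"
)
print(original == tampered)  # False
```

The resulting bytes would then be passed to an Ed25519 signing routine (for example Ed25519PrivateKey.sign in the cryptography package) and the signature stored next to the memory. Verification rebuilds the payload from the stored row and checks it against the signature, so any edit to the content changes the hash and fails the check.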


Runtime topologies

Same API across all three. Only configuration changes.

Mode | Shape | When to use
A: Embedded library | AgentMemory(config) in-process; talks directly to Postgres + Neo4j | Single-process agents, scripts, notebooks
B: Sidecar | attestor api on localhost:8080; language-agnostic HTTP client shares the same Postgres + Neo4j | Polyglot agents on one box (Python + TS + Go)
C: Shared service | One Attestor service in front of an agent mesh (App Runner / Cloud Run / Container Apps) backed by managed Postgres + Neo4j | Production multi-agent platforms

attestor api --port 8080 # Mode B / C: Starlette ASGI REST (HTTP)
attestor mcp --path ~/.attestor # MCP stdio server (zero-config; for Claude Desktop / Cursor / Windsurf)
attestor serve ~/.attestor # MCP stdio server (positional-path variant; equivalent transport)

Backends

Backend | Document | Vector | Graph | Status
Postgres + Neo4j (default) | ✓ | pgvector | Neo4j + GDS | Production-ready
ArangoDB | ✓ | ✓ | ✓ | Production-ready (one engine, all 3 roles)
AWS | DynamoDB | OpenSearch Serverless | Neptune | Backend code + Terraform shipped
Azure | Cosmos DB | Cosmos DiskANN | NetworkX (in-process) | Backend code shipped, Terraform forthcoming
GCP | AlloyDB | AlloyDB ScaNN | AGE on AlloyDB | Backend code shipped, Terraform forthcoming

Override the default via config:

# ~/.attestor.toml
backend = "postgres+neo4j" # or "arangodb" | "aws" | "azure" | "gcp"

Reference Terraform lives under attestor/infra/.


Embeddings

Provider auto-detect (attestor/store/embeddings.py:get_embedding_provider), in this order:

  1. Local Ollama bge-m3 (1024-D, 8K context): used when http://localhost:11434 is reachable
  2. Cloud-native: Bedrock Titan / Vertex / Azure OpenAI when their SDK + creds are present
  3. OpenAI text-embedding-3-large (3072-D native; pin OPENAI_EMBEDDING_DIMENSIONS=1024 for schema compat)
  4. OpenRouter: for federated runs

Local-first by design. Override:

export ATTESTOR_DISABLE_LOCAL_EMBED=1 # skip the Ollama probe entirely
export ATTESTOR_EMBEDDING_PROVIDER=openai
export ATTESTOR_EMBEDDING_MODEL=text-embedding-3-large
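
The detection order and the two override variables can be modeled as a small pure function. This is a sketch of the described behavior, not the code in embeddings.py: the provider labels and the `available` probe callback are hypothetical, and letting an explicit ATTESTOR_EMBEDDING_PROVIDER bypass the availability check is an assumption.

```python
from typing import Callable, Mapping, Optional

PROVIDER_ORDER = ("ollama", "cloud-native", "openai", "openrouter")  # illustrative labels


def choose_provider(env: Mapping[str, str],
                    available: Callable[[str], bool]) -> Optional[str]:
    """Return the first usable provider, honoring the override variables."""
    explicit = env.get("ATTESTOR_EMBEDDING_PROVIDER")
    if explicit:
        return explicit  # explicit override wins (assumed to skip probing)
    for name in PROVIDER_ORDER:
        if name == "ollama" and env.get("ATTESTOR_DISABLE_LOCAL_EMBED") == "1":
            continue  # skip the local probe entirely
        if available(name):
            return name
    return None


reachable = {"ollama", "openai"}.__contains__  # pretend only these two respond

print(choose_provider({}, reachable))                                     # ollama
print(choose_provider({"ATTESTOR_DISABLE_LOCAL_EMBED": "1"}, reachable))  # openai
print(choose_provider({"ATTESTOR_EMBEDDING_PROVIDER": "openai"}, reachable))  # openai
```

Injecting the probe as a callback keeps the selection logic testable without a running Ollama instance or cloud credentials.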

CLI

attestor --help lists everything. The most useful commands:

Command | Purpose
attestor init | Create a starter config
attestor setup local | Generate Docker Compose for Postgres + Neo4j
attestor doctor | Health-check every store + the retrieval pipeline
attestor add / recall / search / list | CRUD-ish memory ops
attestor timeline | Entity timeline (uses bi-temporal manager)
attestor stats | Store statistics
attestor export / import | JSON dump / restore
attestor compact | Remove archived memories
attestor update / forget | Mutate / archive a memory
attestor inspect | Inspect raw database state
attestor api | Start the Starlette REST API
attestor serve <path> | Start MCP stdio server (positional-path variant)
attestor mcp [--path …] | Start MCP stdio server (zero-config; default for Claude Desktop / Cursor / Windsurf)
attestor ui | Read-only browser UI for the store
attestor hook {session-start, post-tool-use, stop} | Run a Claude Code lifecycle hook
attestor lme / locomo / mab | Built-in benchmark runners (see §Evaluation)

MCP server

attestor mcp (or attestor serve <path>) exposes an MCP stdio server with eight tools:

Tool | Purpose
memory_add | Write a memory with provenance
memory_get | Fetch one memory by id
memory_recall | Run the full retrieval pipeline
memory_search | Filtered list (entity / category / time / namespace)
memory_forget | Archive a memory by id
memory_timeline | Chronology for an entity
memory_stats | Store statistics
memory_health | Per-role health snapshot; call this first when integrating

Plus MCP resources (memory listings) and prompts (canned recall prompts for IDE assistants).


Hooks (Claude Code)

Three lifecycle hooks ship in attestor/hooks/:

  • session_start: injects relevant memories into the session context based on cwd / repo
  • post_tool_use: auto-captures useful artifacts from Write / Edit / Bash
  • stop: writes a session summary on exit

Wire them up via the installer (next section) or by hand in ~/.claude/settings.json.


Install for Claude Code

Single instruction users can give Claude Code:

install attestor

(Or run /install-attestor.) The installer interviews you on:

  1. Scope: global (~/.claude/.mcp.json) vs project (.mcp.json)
  2. Postgres connection: local Docker, Neon, RDS, etc.
  3. Neo4j connection: local Docker, AuraDB, etc.
  4. Backend override: default postgres+neo4j, or arangodb / aws / azure / gcp
  5. Embedding provider: local Ollama (default), OpenAI, or cloud-native
  6. Hooks: whether to wire session-start / post-tool-use / stop
  7. Namespace + default token budget

Then it installs attestor via pipx, writes the MCP config, optionally writes settings.json hooks, and runs attestor doctor to verify.


Evaluation

Boundary statement. The dual-LLM judge stack is a benchmarking mechanism, not the runtime contract. Recall in production is single-pipeline and deterministic. Multiple judges score answers in evaluation only, never in user-facing reads.

Runner | Source | Measures
attestor lme | LongMemEval (long-term memory benchmark for chat assistants) | answer accuracy under long history, distillation, dual-judge cross-family
attestor locomo | LoCoMo | conversational long-memory consistency
attestor mab | MultiAgentBench | multi-agent coordination
AbstentionBench (CI gate) | internal | when not to answer (known unknowns)
scripts/lme_smoke_local.py | dual-LLM smoke | quick install verification (see Quick Start §6)

The smoke driver mirrors the canonical published-benchmark stack exactly. See --help for the full env-var / CLI-flag override matrix.


Project layout

attestor/
    core.py -- AgentMemory (main public API)
    client.py -- MemoryClient (HTTP drop-in for remote Attestor)
    context.py -- AgentContext, AgentRole, Visibility
    models.py -- Memory, RetrievalResult, ContextPack
    cli.py -- attestor CLI entry point
    api.py -- Starlette ASGI REST API
    longmemeval.py -- LongMemEval benchmark runner (dual-judge)
    locomo.py -- LoCoMo runner
    doctor_v4.py -- v4 schema + invariant validator
    init_wizard.py -- interactive install flow
    store/
        base.py -- DocumentStore / VectorStore / GraphStore protocols
        registry.py -- backend selection
        connection.py -- config layering / env resolution
        embeddings.py -- provider auto-detect (Ollama / OpenAI / Bedrock / Vertex / Azure)
        postgres_backend.py -- pgvector (document + vector roles)
        neo4j_backend.py -- Neo4j + GDS (graph role)
        arango_backend.py -- all 3 roles in one
        aws_backend.py -- DynamoDB + OpenSearch Serverless + Neptune
        azure_backend.py -- Cosmos DB DiskANN + NetworkX
        gcp_backend.py -- AlloyDB pgvector + AGE + ScaNN
        schema.sql -- v4 Postgres schema (RLS, bi-temporal columns, content_tsv)
    conversation/
        ingest.py -- ingest_round() pipeline
    extraction/
        round_extractor.py -- 2-pass speaker-locked extraction
        conflict_resolver.py -- 4-decision contract (ADD/UPDATE/INVALIDATE/NOOP)
        rule_based.py -- deterministic fact extraction (no LLM)
        prompts.py -- shared prompt templates
    consolidation/
        consolidator.py -- sleep-time re-extraction
        reflection.py -- cross-thread synthesis (stable patterns + flagged contradictions)
    graph/
        extractor.py -- entity / relation extraction
    retrieval/
        orchestrator.py -- 6-step semantic-first pipeline
        tag_matcher.py
        scorer.py -- MMR, confidence decay, entity boost, fit-to-budget
        trace.py -- JSONL trace writer
    temporal/
        manager.py -- timelines, supersession, contradiction detection, as_of replay
    identity/
        signing.py -- Ed25519 provenance signing (optional)
        defaults.py -- SOLO mode auto-provisioning
    mcp/
        server.py -- MCP server (tools, resources, prompts)
    hooks/
        session_start.py
        post_tool_use.py
        stop.py
    ui/
        app.py -- Starlette read-only viewer
        static/, templates/ -- Evidence Board UI
    utils/
        config.py, tokens.py
    infra/
        local/ -- Docker Compose (Postgres + Neo4j)
        aws_arango/ -- Reference Terraform
tests/ -- Unit tests; live cloud tests env-gated
evals/ -- LongMemEval / LoCoMo / MultiAgentBench / AbstentionBench harnesses
docs/ -- Architecture notes, ADRs
commands/ -- /install-attestor, etc.
scripts/ -- lme_smoke_local.py, etc.

Development

poetry install
poetry run pytest tests/ -q # unit tests, no external services needed
ATTESTOR_LIVE_PG=1 poetry run pytest tests/live -q # live integration (env-gated)

Style: black formatting, isort imports, ruff lint, mypy types. PEP 8, type-annotated signatures, dataclasses for DTOs. Many small files (200–400 lines typical, 800 max).

Conventions worth knowing:

  • Postgres is the source of truth. Neo4j is derived; rebuild it from Postgres if it drifts.
  • Non-fatal errors in vector / graph paths are caught and logged. The document path never silently breaks.
  • Configuration layering: env vars → ~/.attestor.toml → in-code overrides.
  • Two write paths: add() for structured (lightweight rule-based supersession), ingest_round() for conversational (full 2-pass + 4-decision contract).

Health check

Always call this first when integrating:

attestor doctor # CLI

mem = AgentMemory()
print(mem.health()) # Python API

// MCP
{ "tool": "memory_health" }

It probes Document Store (Postgres), Vector Store (pgvector), Graph Store (Neo4j), and the retrieval pipeline. All four are required for the default topology, because graph expansion is step 2 of the canonical pipeline, not an optional accelerator. Transient vector-probe failures surface in the recall() trace (vector_error) so callers can distinguish a degraded result from a clean one.


Status & versioning

  • Version: 4.0.0a4 (alpha), published to PyPI and the MCP Registry as io.github.bolnet/attestor
  • v3 → v4: greenfield rebuild on a v4-native Postgres schema with hard tenant isolation, bi-temporal facts, and a no-LLM retrieval critical path. There is no automated migration. v3 was alpha-only with no production users; drop your v3 DB and reinstall.
  • See CHANGELOG.md for the full track-by-track changelog.

License

MIT. See LICENSE.

Server Config

{
  "mcpServers": {
    "attestor": {
      "command": "attestor",
      "args": [
        "mcp"
      ],
      "env": {
        "ATTESTOR_DISABLE_LOCAL_EMBED": "1"
      }
    }
  }
}