# mnemon-mcp

Persistent layered memory for AI agents — 4-layer model, FTS5 search, fact versioning, EN+RU stemming. Local-first, zero-cloud, single SQLite file.


Landing Page · npm · GitHub

Your AI agent forgets everything after each session. Mnemon fixes that.

It gives any MCP-compatible client — OpenClaw, Claude Code, Cursor, Windsurf, or your own — a structured long-term memory backed by a single SQLite database on your machine. No API keys, no cloud, no telemetry. Just `npm install` and your agent remembers.

<p align="center"> <img src="demo/mnemon-demo.gif" alt="mnemon-mcp demo — memory_add, memory_search, memory_inspect, memory_update" width="800"> </p>

## Why Layered Memory?

Flat key-value stores treat "what happened yesterday" the same as "never commit without tests." That's wrong — different kinds of knowledge have different lifetimes and access patterns.

Mnemon organizes memories into four layers:

| Layer | What it stores | How it's accessed | Lifetime |
| --- | --- | --- | --- |
| Episodic | Events, sessions, journal entries | By date or period | Decays (30-day half-life) |
| Semantic | Facts, preferences, relationships | By topic or entity | Stable |
| Procedural | Rules, workflows, conventions | Loaded at startup | Rarely changes |
| Resource | Reference material, book notes | On demand | Decays slowly (90 days) |

A journal entry from last Tuesday and a coding rule that never changes live in different layers — because they should.
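The "decays" column describes an exponential half-life. As a rough TypeScript sketch (illustrative only, not the server's actual implementation):

```typescript
// Illustrative half-life decay: 1.0 at day 0, 0.5 after one half-life.
// Episodic memories use a 30-day half-life; resource memories decay over ~90 days.
function decayFactor(daysSinceCreated: number, halfLifeDays: number): number {
  return Math.pow(0.5, daysSinceCreated / halfLifeDays);
}
```

Under this curve a 30-day-old episodic memory ranks at half weight, while a 30-day-old resource memory still keeps about 80% of its weight.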

## Quick Start

### Install

```bash
npm install -g mnemon-mcp
```

Or from source:

```bash
git clone https://github.com/nikitacometa/mnemon-mcp.git
cd mnemon-mcp && npm install && npm run build
```

### Configure Your MCP Client

<details open> <summary><strong>OpenClaw</strong></summary>

```bash
openclaw mcp register mnemon-mcp --command="mnemon-mcp"
```

Or add to `~/.openclaw/mcp_config.json`:

```json
{
  "mnemon-mcp": {
    "command": "mnemon-mcp"
  }
}
```
</details> <details> <summary><strong>Claude Code</strong></summary>

Add to `~/.claude/mcp.json`:

```json
{
  "mcpServers": {
    "mnemon-mcp": {
      "command": "mnemon-mcp"
    }
  }
}
```
</details> <details> <summary><strong>Cursor / Windsurf / Other MCP clients</strong></summary>

Add to your client's MCP config:

```json
{
  "mcpServers": {
    "mnemon-mcp": {
      "command": "mnemon-mcp"
    }
  }
}
```
</details> <details> <summary><strong>Running from source?</strong></summary>

Use the full path to the compiled entry point:

```json
{
  "mnemon-mcp": {
    "command": "node",
    "args": ["/absolute/path/to/mnemon-mcp/dist/index.js"]
  }
}
```
</details>

### Verify

```bash
echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | mnemon-mcp
```

You should see 10 tools in the response. The database (`~/.mnemon-mcp/memory.db`) is created automatically on first run.

That's it. Your agent now has persistent memory.

## What It Can Do

### 10 MCP Tools

| Tool | What it does |
| --- | --- |
| `memory_add` | Store a memory with layer, entity, confidence, importance, and optional TTL |
| `memory_search` | Full-text or exact search with filters by layer, entity, date, scope, confidence |
| `memory_update` | Update in-place or create a versioned replacement (superseding chain) |
| `memory_delete` | Delete a memory; re-activates its predecessor if any |
| `memory_inspect` | Get layer statistics or trace a single memory's version history |
| `memory_export` | Export to JSON, Markdown, or Claude-md format with filters |
| `memory_health` | Run diagnostics: expired entries, orphaned chains, stale memories; optionally GC |
| `memory_session_start` | Start an agent session — returns a session ID for grouping memories |
| `memory_session_end` | End a session with an optional summary; returns duration and memory count |
| `memory_session_list` | List sessions with filters by client, project, or active status |
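As an illustration of how a client invokes one of these, a `memory_add` call travels as a standard MCP `tools/call` request (the argument values below are made up):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "memory_add",
    "arguments": {
      "content": "Team uses React 19",
      "layer": "semantic",
      "entity_type": "project",
      "entity_name": "frontend",
      "importance": 0.8
    }
  }
}
```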

### MCP Resources & Prompts

**Resources** — live data your agent can read:

| URI | Returns |
| --- | --- |
| `memory://stats` | Aggregate stats per layer |
| `memory://recent` | Memories created/updated in the last 24h |
| `memory://layer/{layer}` | All active memories in a layer |
| `memory://entity/{name}` | All active memories about an entity |

**Prompts** — pre-built workflows:

| Prompt | Purpose |
| --- | --- |
| `recall` | "Tell me everything you know about X" |
| `context-load` | Load relevant context before starting a task |
| `journal` | Create a structured journal entry |

### Search

Two modes, both supporting layer / entity / scope / date / confidence filters:

**FTS mode** (default) — tokenized full-text search with BM25 ranking. Multi-word queries use AND; if that yields too few results, OR matches supplement them with a score penalty. Progressive AND relaxation tries the top-3 most specific terms before falling back to full OR.

Scores: `bm25 × (0.3 + 0.7 × importance) × decay(layer) × recency`

Recency boost: `1 / (1 + daysSince / 365)` — gently rewards recently created memories without penalizing old ones.

**Exact mode** — `LIKE` substring match for precise phrase lookups.
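The ranking formula above can be sketched directly in TypeScript (illustrative; `bm25` and `layerDecay` are assumed to be already-computed inputs, not real server internals):

```typescript
// Illustrative sketch of the documented score formula, not the server's code.
// bm25: relevance from FTS5; layerDecay: half-life factor for the memory's layer.
function rankScore(
  bm25: number,
  importance: number, // 0.0–1.0
  layerDecay: number, // 1.0 = no decay
  daysSince: number   // days since the memory was created
): number {
  const recency = 1 / (1 + daysSince / 365);
  return bm25 * (0.3 + 0.7 * importance) * layerDecay * recency;
}
```

Even a zero-importance memory keeps 30% of its BM25 score, and a year-old memory is only halved by the recency factor.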

### Stemming

Snowball stemmer applied at both index time and query time for English and Russian. This means "running" matches "runs", and "книги" matches "книга". Stop words are filtered from queries to improve precision.

### Fact Versioning

Knowledge evolves. Mnemon doesn't delete old facts — it chains them:

```
v1: "Team uses React 17"  →  superseded_by: v2
v2: "Team uses React 19"  →  supersedes: v1 (active)
```

Search returns only the latest version. `memory_inspect` with `include_history: true` reveals the full chain. `memory_delete` re-activates the predecessor — nothing is lost.
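A minimal sketch of how such a chain behaves (illustrative data model; the real schema and field names may differ):

```typescript
// Illustrative superseding chain: supersede() deactivates the old version,
// remove() re-activates the predecessor, mirroring the behavior described above.
interface Memory {
  id: string;
  content: string;
  active: boolean;
  supersedes?: string;   // id of the older version
  supersededBy?: string; // id of the newer version
}

function supersede(store: Map<string, Memory>, oldId: string, replacement: Memory): void {
  const old = store.get(oldId);
  if (!old) throw new Error(`unknown memory: ${oldId}`);
  old.active = false;
  old.supersededBy = replacement.id;
  replacement.supersedes = oldId;
  replacement.active = true;
  store.set(replacement.id, replacement);
}

function remove(store: Map<string, Memory>, id: string): void {
  const mem = store.get(id);
  if (!mem) return;
  store.delete(id);
  if (mem.supersedes) {
    const prev = store.get(mem.supersedes);
    if (prev) {
      prev.active = true;
      prev.supersededBy = undefined;
    }
  }
}
```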

## Vector Search (Optional, BYOK)

Enable semantic-similarity search by providing your own embedding API:

```bash
# OpenAI
MNEMON_EMBEDDING_PROVIDER=openai MNEMON_EMBEDDING_API_KEY=sk-... mnemon-mcp

# Ollama (local, free)
MNEMON_EMBEDDING_PROVIDER=ollama mnemon-mcp
```

This unlocks two additional search modes:

- `mode: "vector"` — pure cosine-similarity search
- `mode: "hybrid"` — FTS5 + vector combined via Reciprocal Rank Fusion

Requires `sqlite-vec` (installed as an optional dependency). New memories are embedded on add; existing ones can be backfilled.
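Reciprocal Rank Fusion itself is simple enough to sketch (illustrative; the server's constants may differ):

```typescript
// Illustrative RRF: each list contributes 1 / (k + rank) per id, so ids ranked
// highly in both the FTS and vector lists float to the top. k = 60 is the
// constant commonly used in the RRF literature.
function rrfMerge(ftsRanked: string[], vectorRanked: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [ftsRanked, vectorRanked]) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```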

<details> <summary>Embedding configuration</summary>

| Variable | Default | Description |
| --- | --- | --- |
| `MNEMON_EMBEDDING_PROVIDER` | — | `openai` or `ollama` (unset = disabled) |
| `MNEMON_EMBEDDING_API_KEY` | — | API key (required for OpenAI) |
| `MNEMON_EMBEDDING_MODEL` | `text-embedding-3-small` / `nomic-embed-text` | Model name |
| `MNEMON_EMBEDDING_DIMENSIONS` | `1024` / `768` | Vector dimensions |
| `MNEMON_OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint |
</details>

## Importing a Knowledge Base

Got a folder of Markdown files? Import them in bulk:

```bash
cp config.example.json ~/.mnemon-mcp/config.json   # edit this first
npm run import:kb -- --kb-path /path/to/your/kb    # incremental (skips unchanged files)
```

The config maps glob patterns to memory layers:

```json
{
  "owner_name": "your-name",
  "extra_stop_words": [],
  "mappings": [
    {
      "glob": "journal/*.md",
      "layer": "episodic",
      "entity_type": "user",
      "entity_name": "$owner",
      "importance": 0.6,
      "split": "h2"
    },
    {
      "glob": "people/*.md",
      "layer": "semantic",
      "entity_type": "person",
      "entity_name": "from-heading",
      "importance": 0.8,
      "split": "h3"
    }
  ]
}
```

### Config Fields

| Field | Type | Description |
| --- | --- | --- |
| `owner_name` | string | Your name — used for `$owner` substitution in `entity_name` |
| `extra_stop_words` | string[] | Words to filter from FTS queries (e.g., forms of your name) |
| `glob` | string | File pattern to match |
| `layer` | string | Target memory layer |
| `entity_type` | string | `user` / `person` / `project` / `concept` / `file` / `rule` / `tool` |
| `entity_name` | string | Literal name, `"$owner"`, or `"from-heading"` (extract from H2/H3) |
| `split` | string | `"whole"` (one memory per file), `"h2"`, or `"h3"` (split on headings) |
| `importance` | number | 0.0–1.0, affects search ranking |
| `confidence` | number | 0.0–1.0, filterable in search |
| `scope` | string | Optional namespace |
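For intuition, a `split: "h2"` mapping chunks each file on its `## ` headings, roughly like this sketch (illustrative; the real importer may differ in edge cases):

```typescript
// Illustrative h2 splitter: one { heading, body } chunk per "## " section.
// Text before the first h2 is dropped here; the real importer may keep it.
function splitByH2(markdown: string): { heading: string; body: string }[] {
  const sections: { heading: string; body: string }[] = [];
  let current: { heading: string; body: string } | null = null;
  for (const line of markdown.split("\n")) {
    const match = line.match(/^## (.+)$/);
    if (match) {
      if (current) sections.push(current);
      current = { heading: match[1], body: "" };
    } else if (current) {
      current.body += line + "\n";
    }
  }
  if (current) sections.push(current);
  return sections;
}
```

Each resulting chunk becomes one memory; with `entity_name: "from-heading"`, the chunk's heading text supplies the entity name.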

## HTTP Transport

For remote or multi-client setups:

```bash
MNEMON_AUTH_TOKEN=your-secret MNEMON_PORT=3000 npm run start:http
```

| Endpoint | Description |
| --- | --- |
| `POST /mcp` | MCP JSON-RPC (Bearer auth if token set) |
| `GET /health` | `{"status":"ok","version":"..."}` |

Rate limiting (100 req/min/IP by default), CORS headers, a 1 MB body limit, timing-safe auth, and graceful shutdown on SIGTERM.
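"Timing-safe auth" means token comparison takes the same time whether the first or last character differs. A sketch using Node's `crypto.timingSafeEqual` (illustrative, not the server's exact code):

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare tokens in constant time so response timing doesn't leak which
// prefix of the token matched. Length is checked first because
// timingSafeEqual throws on unequal lengths; this leaks only the length.
function tokenMatches(provided: string, expected: string): boolean {
  const a = Buffer.from(provided);
  const b = Buffer.from(expected);
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```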

## Configuration Reference

| Variable | Default | Description |
| --- | --- | --- |
| `MNEMON_DB_PATH` | `~/.mnemon-mcp/memory.db` | Database path |
| `MNEMON_KB_PATH` | `.` | Knowledge-base root for import |
| `MNEMON_CONFIG_PATH` | `~/.mnemon-mcp/config.json` | Import config path |
| `MNEMON_AUTH_TOKEN` | — | Bearer token for HTTP transport |
| `MNEMON_PORT` | `3000` | HTTP transport port |
| `MNEMON_CORS_ORIGIN` | `*` | CORS `Access-Control-Allow-Origin` |
| `MNEMON_RATE_LIMIT` | `100` | Max requests per minute per IP (`0` = off) |

## Tool Reference

<details> <summary><code>memory_add</code> — full parameter list</summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `content` | string | Yes | Memory text (max 100K chars) |
| `layer` | string | Yes | `episodic` / `semantic` / `procedural` / `resource` |
| `title` | string | No | Short title (max 500 chars) |
| `entity_type` | string | No | `user` / `project` / `person` / `concept` / `file` / `rule` / `tool` |
| `entity_name` | string | No | Entity name for filtering |
| `confidence` | number | No | 0.0–1.0 (default 0.8) |
| `importance` | number | No | 0.0–1.0 (default 0.5) |
| `scope` | string | No | Namespace (default `global`) |
| `source_file` | string | No | Source file path — triggers auto-supersede of matching entries |
| `ttl_days` | number | No | Auto-expire after N days |
| `valid_from` / `valid_until` | string | No | Temporal fact window (ISO 8601) |
</details> <details> <summary><code>memory_search</code> — full parameter list</summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | string | Yes | Search text |
| `mode` | string | No | `fts` (default), `exact`, `vector`, `hybrid` |
| `layers` | string[] | No | Filter by layers |
| `entity_name` | string | No | Filter by entity (supports aliases) |
| `scope` | string | No | Filter by scope |
| `date_from` / `date_to` | string | No | Date range (ISO 8601) |
| `as_of` | string | No | Temporal fact filter — facts valid at this date |
| `min_confidence` | number | No | Minimum confidence |
| `min_importance` | number | No | Minimum importance |
| `limit` | number | No | Max results (default 10, max 100) |
| `offset` | number | No | Pagination offset |
</details> <details> <summary><code>memory_update</code> — full parameter list</summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `id` | string | Yes | Memory ID |
| `content` | string | No | New content |
| `title` | string | No | New title |
| `confidence` | number | No | New confidence |
| `importance` | number | No | New importance |
| `supersede` | boolean | No | `true` = versioned replacement; `false` (default) = in-place |
| `new_content` | string | No | Content for the superseding entry |
</details> <details> <summary><code>memory_delete</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `id` | string | Yes | Memory ID. Re-activates the predecessor if part of a superseding chain |
</details> <details> <summary><code>memory_inspect</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `id` | string | No | Memory ID (omit for aggregate stats) |
| `layer` | string | No | Filter stats by layer |
| `entity_name` | string | No | Filter stats by entity |
| `include_history` | boolean | No | Show the superseding chain |
</details> <details> <summary><code>memory_export</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `format` | string | Yes | `json` / `markdown` / `claude-md` |
| `layers` | string[] | No | Filter by layers |
| `scope` | string | No | Filter by scope |
| `date_from` / `date_to` | string | No | Date range |
| `limit` | number | No | Max entries (default all, max 10K) |
</details> <details> <summary><code>memory_health</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `cleanup` | boolean | No | `true` = garbage-collect expired entries (default: report only) |

Returns: `status` (`healthy` / `warning` / `degraded`), per-layer stats, expired entries, orphaned chains, stale/low-confidence counts, and a cleaned count when `cleanup=true`.

</details> <details> <summary><code>memory_session_start</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `client` | string | Yes | Client identifier (e.g. `claude-code`, `cursor`, `api`) |
| `project` | string | No | Project scope for this session |
| `meta` | object | No | Additional session metadata |

Returns: `id` (session UUID), `started_at` (ISO 8601).

</details> <details> <summary><code>memory_session_end</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `id` | string | Yes | Session ID to end |
| `summary` | string | No | Summary of what was accomplished (max 10K chars) |

Returns: `id`, `ended_at`, `duration_minutes`, `memories_count`.

</details> <details> <summary><code>memory_session_list</code></summary>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `limit` | number | No | Max sessions (default 20, max 100) |
| `client` | string | No | Filter by client |
| `project` | string | No | Filter by project |
| `active_only` | boolean | No | Only return sessions that haven't ended (default `false`) |

Returns: an array of sessions with `id`, `client`, `project`, `started_at`, `ended_at`, `summary`, `memories_count`.

</details>

## How It Compares

| | mnemon-mcp | mem0 | basic-memory | Anthropic KG |
| --- | --- | --- | --- | --- |
| Architecture | SQLite FTS5 | Cloud API + Qdrant | Markdown + vector | JSON file |
| Memory structure | 4 typed layers | Flat | Flat | Graph |
| Fact versioning | Superseding chains | Partial | No | No |
| Stemming | EN + RU (Snowball) | EN only | EN only | None |
| OpenClaw support | Native MCP | No | No | No |
| Dependencies | 0 required | Qdrant, Neo4j, Ollama | FastEmbed, Python 3.12 | None |
| Cloud required | No | Yes | No (SaaS optional) | No |
| Cost | Free | $19–249/mo | Free + SaaS | Free |
| Setup | `npm install -g` | Docker + API keys | pip + deps | Built-in |
| License | MIT | Apache 2.0 | AGPL | MIT |

## Development

```bash
npm run dev        # run via tsx (no build step)
npm run build      # TypeScript → dist/
npm test           # vitest (194 tests)
npm run bench      # performance benchmarks
npm run db:backup  # backup database
```

Stack: TypeScript 5.9 (strict mode), better-sqlite3, @modelcontextprotocol/sdk, Snowball stemmer, Zod, vitest.

See CONTRIBUTING.md for code guidelines.

## Design Principles

- **Air-gapped** — zero network calls, zero telemetry. Your memories stay on your machine.
- **Single file** — one SQLite database, zero ops, instant backup via file copy.
- **Deterministic search** — FTS5, not embeddings, is the default. Interpretable, reproducible, no GPU needed.
- **Structured over flat** — layers encode access patterns; superseding chains encode time.
- **Minimal** — 4 production dependencies. Works everywhere Node runs.

## License

MIT

## Related Servers