Mnemory MCP Server

Un sistema de memoria autogestionado, seguro y rico en funciones para agentes y asistentes de IA. Proporciona extracción inteligente de hechos y deduplicación, con un almacén de artefactos para contenido detallado.

Documentación

mnemory banner

mnemory

Give your AI agents persistent memory. mnemory is a self-hosted MCP server that adds personalization and long-term memory to any AI assistant — Claude Code, ChatGPT, Open WebUI, Cursor, or any MCP-compatible client.

Plug and play. Connect mnemory and your agent immediately starts remembering user preferences, facts, decisions, and context across conversations. No system prompt changes needed.

Self-hosted and secure. Your data stays on your infrastructure. No cloud dependencies, no third-party access to your memories.

Intelligent. Uses a unified LLM pipeline for fact extraction, deduplication, and contradiction resolution in a single call. Memories are semantically searchable, automatically categorized, and expire naturally when no longer relevant.

Features

  • Zero configuvx mnemory, connect your MCP client, done. Works out of the box with any OpenAI-compatible API.
  • Intelligent extraction — A single LLM call extracts facts, classifies metadata, and deduplicates against existing memories.
  • Contradiction resolution — "I drive a Skoda" + later "I bought a Tesla" = automatic update, not a duplicate.
  • Two-tier memory — Fast searchable summaries in a vector store + detailed artifact storage (reports, code, research) retrieved on demand.
  • AI-powered search — Multi-query semantic search with temporal awareness. Ask "What did I decide last week about the database?" and it finds the right memories.
  • Memory health checks — Built-in three-phase consistency checker (fsck) detects duplicates, contradictions, quality issues, and prompt injection. Run manually or on a schedule with auto-fix.
  • 10+ client support — Claude Code, ChatGPT, Open WebUI, OpenClaw, Cursor, Windsurf, Cline, OpenCode, and more. Native plugins available for automatic recall/remember.
  • Built-in management UI — Dashboard, semantic search, memory browser with full CRUD, relationship graph visualization, and health check interface. No extra tools needed.
  • Production ready — Qdrant for vectors, S3/MinIO for artifacts, API key or Cognis JWT authentication, per-user isolation, Kubernetes-friendly stateless HTTP.
  • Secure by default — API key or Cognis JWT authentication with session-level identity binding, per-user memory isolation, anti-injection safeguards in extraction prompts.
  • REST API + MCP — Dual interface with the same backend. 16 MCP tools + full REST API with OpenAPI spec. Build plugins, integrations, or use directly.
  • Prometheus monitoring — Built-in /metrics endpoint with operation counters and memory gauges. Pre-built Grafana dashboard included.

Quick Start

mnemory needs an OpenAI-compatible API key for LLM and embeddings. It picks up OPENAI_API_KEY from your environment automatically.

uvx mnemory

That's it. mnemory starts on http://localhost:8050/mcp, stores data in ~/.mnemory/.

Now connect your client — for Claude Code, add to your MCP config:

{
  "mcpServers": {
    "mnemory": {
      "type": "streamable-http",
      "url": "http://localhost:8050/mcp",
      "headers": {
        "X-Agent-Id": "claude-code"
      }
    }
  }
}

Start a new conversation. Memory works automatically.

Also available via Docker, pip, or production setup with Qdrant + S3. See the full quick start guide for more clients and options.

Screenshots

Dashboard — memory totals, breakdowns, and operation counts
Dashboard with memory breakdowns by type, category, and role

Search — semantic and AI-powered search with filters
Semantic search and AI-powered find with filters

Graph — D3.js force-directed memory relationship visualization
Memory relationship graph visualization

See all screenshots and UI features including memory browser, health checks, and artifact management.

Supported Clients

mnemory works with any MCP-compatible client. Some clients also have dedicated plugins for automatic recall/remember.

ClientMCPPluginSetup Guide
Claude CodeYesYes (hooks)Guide
ChatGPTYes (MCP connector)--Guide
Claude DesktopYes--Guide
Hermes AgentYesYes (plugin)Guide
Open WebUIYesYes (filter)Guide
OpenCodeYesYes (plugin)Guide
OpenClawYesYes (plugin)Guide
CursorYes--Guide
WindsurfYes--Guide
ClineYes--Guide
Continue.devYes--Guide
Codex CLIYes--Guide

MCP = works via Model Context Protocol (LLM-driven tool calls). Plugin = dedicated integration with automatic recall/remember (no LLM tool-calling needed).

How It Works

Storing: You share information naturally. mnemory extracts individual facts, classifies them (type, category, importance), checks for duplicates and contradictions against existing memories, and stores them as searchable vectors — all in a single LLM call.

Searching: Ask a question and mnemory generates multiple search queries covering different angles and associations, runs them in parallel, and reranks results by relevance. Temporal-aware — "what did I decide last week?" just works.

Recalling: At conversation start, your agent loads pinned memories (core facts, preferences, identity) plus recent context. During conversation, relevant memories are found automatically based on what you're discussing.

Maintaining: Memories have configurable TTL — context expires in 7 days, episodic memories in 90. Frequently accessed memories stay alive (reinforcement). The built-in health checker detects and fixes duplicates, contradictions, and quality issues.

Learn more in the architecture docs.

Benchmark

Evaluated on the LoCoMo benchmark — 10 multi-session dialogues with 1540 QA questions across 4 categories:

Systemsingle_hopmulti_hoptemporalopen_domainOverall
mnemory63.153.174.878.273.2
mnemory (gpt-oss-120b)66.359.468.573.870.5
Memobase70.952.185.077.275.8
Mem0-Graph65.747.258.175.768.4
Mem067.151.255.572.966.9
Zep61.741.449.376.666.0
LangMem62.247.923.471.158.1

Configuration: gpt-5-mini for extraction, text-embedding-3-small for vectors. gpt-oss-120b via Groq is a budget alternative at ~5x lower cost with comparable quality. See configuration docs for model options and benchmarks/ for reproduction.

Documentation

DocumentDescription
Quick StartGet running in 5 minutes with any client
ConfigurationAll environment variables — LLM, storage, server, memory behavior
Memory ModelTypes, categories, importance, TTL, roles, scoping, sub-agents
MCP Tools16 MCP tools — memory CRUD, search, artifacts
REST APIFull REST API, fsck pipeline, recall/remember endpoints
ArchitectureSystem diagram, detailed flows for storing/searching/recalling
Management UIScreenshots, features, access, UI development
MonitoringPrometheus metrics, Grafana dashboard
DeploymentProduction setup, Docker, authentication, Kubernetes
DevelopmentBuilding, testing, linting, contributing
Client GuidesPer-client setup instructions (10 clients)
System PromptsTemplates for personality agents and custom setups

License

Apache 2.0