MemClaw MCP Server

MemClaw

Fleet memory for AI agents — governed, shared, self-improving.

MemClaw is open-source memory for multi-tenant, multi-agent AI fleets. Your agents store what they learn, find what the fleet knows, and get smarter with every interaction — learning from each other instead of repeating mistakes.

Agents write plain text. MemClaw turns it into searchable, governed, self-improving memory.

One loop, three pillars: write, recall, compound — every interaction makes the next one smarter.

Built for fleets, not single agents. Public agent-memory benchmarks (LoCoMo, LongMemEval) measure one agent, one user, one long conversation — the single-chatbot shape. The deployment shape we see in production is the opposite: dozens or thousands of agents working on behalf of a company, sharing what they learn under governance. MemClaw is architected around that shape from day one — scoped memory, cross-agent outcome propagation, fleet-wide trust tiers — and competes on the axes that compound with agent count: latency, token efficiency, and governance. See Performance for the numbers, or read the benchmarks write-up.

In production at eToro (NASDAQ: ETOR): 300+ AI agents on one governed memory — 26,500+ memories, 1,372 shared skills, 23 ms p50 search. Architecture deep-dive →

Quick Start

Try it locally — no API key, no signup

The fastest way to see MemClaw work. Standalone mode runs single-tenant with auth bypassed — write and recall a memory in four commands. (It boots with dummy embeddings so there's nothing to configure; add an AI provider key for semantic search — see Self-Hosted below.)

git clone https://github.com/caura-ai/caura-memclaw.git
cd caura-memclaw
cp .env.example .env && echo "IS_STANDALONE=true" >> .env   # single-tenant, no API key
docker compose up -d                                        # Postgres + pgvector + Redis + API (~30s)

# Write a memory — no API key needed
curl -X POST http://localhost:8000/api/v1/memories \
  -H "X-API-Key: standalone" -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "content": "Our auth service uses JWT with 15-minute expiry."}'

# Search for it
curl -X POST http://localhost:8000/api/v1/search \
  -H "X-API-Key: standalone" -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "query": "authentication token lifetime"}'

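The write response carries an LLM-inferred memory_type, title, summary, tags, status, and weight, all derived from a single content field. On the default fast-write path these fields are filled in asynchronously, so the immediate response may be marked enrichment_pending.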
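For readers who would rather script this than use curl, here is a minimal Python sketch of the same two calls using only the standard library. It assumes the standalone stack from the commands above is listening on localhost:8000; the helper names build_post and send are illustrative and not part of any MemClaw client.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # local standalone stack from the steps above


def build_post(path: str, body: dict, api_key: str = "standalone") -> urllib.request.Request:
    """Build (without sending) a JSON POST request mirroring the curl calls."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON reply (requires the stack to be up)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


write_request = build_post(
    "/memories",
    {"tenant_id": "default", "content": "Our auth service uses JWT with 15-minute expiry."},
)
search_request = build_post(
    "/search",
    {"tenant_id": "default", "query": "authentication token lifetime"},
)
```

With the stack running, send(write_request) followed by send(search_request) performs the same round trip as the curl commands.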

Ready for semantic recall, multi-tenant, a managed host, or an OpenClaw fleet? Pick a path below.


Three paths — pick the one that matches your setup:

| Path | When | Time to first memory |
| --- | --- | --- |
| Managed platform | Quickest. We host the DB + scaling. | ~2 min |
| Self-hosted (Docker) | Privacy / on-prem / air-gapped. | ~5 min |
| OpenClaw plugin | You already run an OpenClaw fleet and want to install MemClaw as a plugin against either of the above. | ~3 min |

Managed Platform

Get up and running in minutes — no infrastructure, automatic updates, usage analytics, and enterprise-grade security included.

  1. Sign up free on memclaw.net
  2. Grab your API key from the dashboard
  3. Connect via MCP or REST:
{
  "mcpServers": {
    "memclaw": {
      "url": "https://memclaw.net/mcp",
      "headers": { "X-API-Key": "mc_your_api_key_here" }
    }
  }
}

Production / team use: the quickstart key above is a tenant-scoped credential — fine for personal use, but a fleet of agents should bind each one to its own agent-scoped credential for trust gating, fleet membership, and per-agent keystones. Provision agent-scoped credentials atomically via POST /api/v1/admin/agent-keys/provision, or through the dashboard at /settings/organization/api-credentials. Both kinds use the mc_ prefix on the wire — scope is bound at mint time on the credential itself. The MCP server accepts the credential on either X-API-Key: mc_… or Authorization: Bearer mc_…. (Pre-existing mca_… and mci_… keys continue to authenticate via back-compat.)

Using a tenant-scoped credential? Pass an explicit agent_id on every MCP tool call — the gateway refuses the reserved default (mcp-agent) on the tenant-scoped path.
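The agent_id rule above can be enforced on the client side before a tool call ever reaches the gateway. The sketch below is an illustration only: with_agent_id is a hypothetical helper, and the "query" argument name is an assumption used for the example.

```python
RESERVED_AGENT_ID = "mcp-agent"  # reserved default named in the text above


def with_agent_id(arguments: dict, agent_id: str) -> dict:
    """Return a copy of the tool-call arguments with an explicit agent_id."""
    if not agent_id or agent_id == RESERVED_AGENT_ID:
        raise ValueError("tenant-scoped credentials need an explicit, non-reserved agent_id")
    return {**arguments, "agent_id": agent_id}


# "query" and "billing-agent" are hypothetical values used only for illustration.
call_args = with_agent_id({"query": "authentication token lifetime"}, "billing-agent")
```

Failing early on the client keeps the error close to the code that forgot the agent_id, instead of surfacing as a gateway refusal.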

Self-Hosted (Open Source)

The fastest path is Docker Compose — one command brings up Postgres + pgvector + Redis + the API.

Prefer not to use Docker? Skip to Manual deployment (Python + Postgres) below for the bare-Python path.

No cloud API key, no external calls? v2.0+ supports a self-hosted local embedder (BAAI/bge-m3 via HuggingFace TEI) — see docs/local-embedder.md. The setup below walks through the OpenAI default; the local-embedder doc walks through the alternative.

Prerequisites

  • Docker Engine 24+ (Linux) or Docker Desktop (macOS / Windows). Confirm with docker --version.
  • Docker Compose v2 (built into modern Docker). Confirm with docker compose version.
  • Git for cloning.
  • ~2 GB free disk for images + Postgres data volume.

1. Clone and configure

git clone https://github.com/caura-ai/caura-memclaw.git
cd caura-memclaw
cp .env.example .env

Set your AI provider in .env — minimal setup with OpenAI:

EMBEDDING_PROVIDER=openai
ENTITY_EXTRACTION_PROVIDER=openai
USE_LLM_FOR_MEMORY_CREATION=true
OPENAI_API_KEY=sk-...

Without any AI keys the stack still starts — dummy providers return non-semantic embeddings, useful for testing the API surface.

💡 Want zero cloud API calls? v2.0+ ships a self-hosted embedder profile (BAAI/bge-m3 on a HuggingFace TEI sidecar). Bring up the stack with docker compose --profile embed-local up -d and set the four OPENAI_EMBEDDING_* envs from .env.example — see docs/local-embedder.md for the full setup. Combined with IS_STANDALONE=true (below) this is a fully self-contained deployment with no external API calls.

Other providers (Gemini, Anthropic, OpenRouter, self-hosted)

| Provider | .env settings | Required key |
| --- | --- | --- |
| OpenAI (default) | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=openai | OPENAI_API_KEY |
| Google Gemini | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=gemini | GEMINI_API_KEY + OPENAI_API_KEY |
| Anthropic | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=anthropic | ANTHROPIC_API_KEY + OPENAI_API_KEY |
| OpenRouter | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=openrouter | OPENROUTER_API_KEY + OPENAI_API_KEY |
| Self-hosted (TEI / bge-m3) | --profile embed-local + OPENAI_EMBEDDING_BASE_URL=http://tei:80/v1 + OPENAI_EMBEDDING_MODEL=BAAI/bge-m3 + OPENAI_EMBEDDING_SEND_DIMENSIONS=false | none (runs locally) |

Anthropic, Gemini, and OpenRouter don't offer embedding APIs here — pair them with OpenAI (or with TEI) for embeddings. You can mix providers freely. Gemini uses the Google AI Studio key-auth Developer API (no GCP project/ADC required). The self-hosted TEI row keeps EMBEDDING_PROVIDER=openai because TEI speaks the same OpenAI-compatible API; see docs/local-embedder.md for hardware sizing, GPU setup, and model swapping.
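The pairing rules in the provider table can be checked before starting the stack. The sketch below is a hypothetical pre-flight helper, not something MemClaw ships; it assumes that setting OPENAI_EMBEDDING_BASE_URL means embeddings go to a local TEI endpoint and therefore need no OpenAI key.

```python
# Required key per extraction provider, transcribed from the provider table above.
EXTRACTION_KEYS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the API-key variables that the given .env mapping still needs."""
    needed = set()
    # Assumption: a custom embedding base URL means a local TEI sidecar, so no cloud key.
    if env.get("EMBEDDING_PROVIDER") == "openai" and not env.get("OPENAI_EMBEDDING_BASE_URL"):
        needed.add("OPENAI_API_KEY")
    extraction = env.get("ENTITY_EXTRACTION_PROVIDER")
    if extraction in EXTRACTION_KEYS:
        needed.add(EXTRACTION_KEYS[extraction])
    return sorted(key for key in needed if not env.get(key))
```

For example, a Gemini extraction setup with only GEMINI_API_KEY set reports OPENAI_API_KEY as missing, which matches the "pair them with OpenAI (or with TEI)" note above.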

2. Start the stack

docker compose up -d

By default this pulls the multi-arch images from ghcr.io (linux/amd64 + linux/arm64) on first run, which takes ~30 seconds. Subsequent up commands re-use the cached image (no registry round-trip, works offline). To pin a specific version, set MEMCLAW_VERSION=v1.2.3 in your .env. To build from local source instead (e.g. when iterating on a fork), run docker compose up --build --pull never.

To upgrade to a newer image at the same tag (e.g. :latest after we cut a new release), run docker compose pull && docker compose up -d. Without an explicit pull, the local cache wins — there's no silent version drift.

Offline / air-gapped operation: depending on whether the image is already cached locally:

  • Image cached, no network: docker compose up -d works as-is, because pull_policy: missing doesn't try to pull when the image is present. Use docker compose up --pull never if you want to be explicit.
  • No local image, no network: docker compose up --build --pull never (build from source, don't try to pull).
  • Strict no-network guarantee (e.g. an air-gapped pipeline that should never reach ghcr.io): drop a docker-compose.override.yml setting pull_policy: never for both services — Compose then fails fast if the image is absent rather than attempting a pull.

| Service | URL |
| --- | --- |
| Core API (REST + MCP) | http://localhost:8000 |
| Core Storage API | http://localhost:8002 |
| PostgreSQL (pgvector) | localhost:5432 |
| Redis | localhost:6379 |

3. Verify

curl http://localhost:8000/api/v1/health
# {"status":"ok","storage":"connected","redis":"connected","event_bus":"ok"}
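A deploy script can turn that health payload into a pass/fail signal. The following is a minimal sketch; it assumes that any component value other than "ok" or "connected" indicates a problem, which this document does not spell out.

```python
import json

HEALTHY_STATES = {"ok", "connected"}  # the only states shown in the sample response


def unhealthy_components(health_body: str) -> list[str]:
    """Return the names of components whose state is not a known-healthy value."""
    report = json.loads(health_body)
    return [name for name, state in report.items() if state not in HEALTHY_STATES]


sample = '{"status":"ok","storage":"connected","redis":"connected","event_bus":"ok"}'
problems = unhealthy_components(sample)
```

An empty list means every reported component looked healthy under that assumption.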

4. Write and search

# Write a memory (standalone mode — no API key needed)
curl -X POST http://localhost:8000/api/v1/memories \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "content": "Our auth service uses JWT with 15-minute expiry."}'

# Search for it
curl -X POST http://localhost:8000/api/v1/search \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "query": "authentication token lifetime"}'

The write response carries an LLM-inferred memory_type, title, summary, tags, status, and a weight (the importance score) — all derived from a single content field. On the default fast-write path, enrichment is applied asynchronously: the immediate response is marked enrichment_pending and the inferred fields populate within moments.

POST /search returns matches under an items array, each entry the full memory plus a similarity score:

{
  "items": [
    {
      "id": "…",
      "agent_id": "mcp-agent",
      "memory_type": "fact",
      "title": "Auth service uses JWT with 15-minute expiry",
      "similarity": 0.47,
      "visibility": "scope_team",
      "status": "active"
    }
  ]
}

Embedding is asynchronous too, so a just-written memory may not surface in semantic search for a moment after the write returns (watch metadata.embedding_pending); the non-semantic GET /memories list shows it immediately.
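To make the response shape concrete, here is a small sketch that pulls titles and scores out of a /search reply and applies a client-side similarity cutoff. The function name top_matches and the cutoff values are illustrative; only the items, title, and similarity fields come from the sample above.

```python
def top_matches(response: dict, min_similarity: float = 0.0) -> list[tuple[str, float]]:
    """Return (title, similarity) pairs at or above the cutoff, best match first."""
    items = response.get("items", [])
    hits = [(m["title"], m["similarity"]) for m in items if m["similarity"] >= min_similarity]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Trimmed copy of the sample response shown above.
sample_response = {
    "items": [
        {
            "agent_id": "mcp-agent",
            "memory_type": "fact",
            "title": "Auth service uses JWT with 15-minute expiry",
            "similarity": 0.47,
            "visibility": "scope_team",
            "status": "active",
        }
    ]
}
```

Because embedding is asynchronous, an empty items list right after a write is not necessarily an error; retrying the search after a short pause is a reasonable client strategy.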


If MemClaw just worked for you, star the repo — it's how other fleet builders find us, and it shapes how much time we can invest in the OSS edition.


Auth modes

OSS supports three auth paths. Pick one and add it to your .env, then docker compose up -d to restart.

Standalone — single-tenant (tenant_id="default"), simplest for local / self-install:

IS_STANDALONE=true

No API key required for REST. MCP still expects a non-empty X-API-Key header — any value works.

Pair Standalone mode with --profile embed-local (see docs/local-embedder.md) for a fully self-contained deployment: no admin keys, no external API calls, all embeddings computed locally. Useful for offline / air-gapped environments and personal-laptop installs.

Admin key — multi-tenant with full access:

ADMIN_API_KEY=your-long-random-admin-key

Pass X-API-Key: your-long-random-admin-key and include tenant_id in request bodies / query params.

Shared gate — for network-exposed OSS deployments:

MEMCLAW_API_KEY=your-shared-key

Clients send X-API-Key: your-shared-key plus X-Tenant-ID: <tenant>.

See AGENT-INSTALL.md for the full agent self-install walkthrough.
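The three auth modes differ only in which headers a client sends. A minimal sketch of that mapping follows; auth_headers is a hypothetical helper, and the mode names "standalone", "admin", and "shared" are labels chosen for this example.

```python
def auth_headers(mode: str, key: str = "", tenant: str = "") -> dict[str, str]:
    """Build request headers for one of the three OSS auth modes described above."""
    if mode == "standalone":
        # REST needs no key; MCP expects a non-empty X-API-Key, and any value works.
        return {"X-API-Key": key or "standalone"}
    if mode == "admin":
        # tenant_id travels in the request body or query params, not in a header.
        if not key:
            raise ValueError("admin mode needs ADMIN_API_KEY")
        return {"X-API-Key": key}
    if mode == "shared":
        if not key or not tenant:
            raise ValueError("shared-gate mode needs MEMCLAW_API_KEY and a tenant")
        return {"X-API-Key": key, "X-Tenant-ID": tenant}
    raise ValueError(f"unknown auth mode: {mode}")
```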

Running tests

# Unit tests (no DB needed)
pytest tests/ -m "unit"

# All tests (requires PostgreSQL)
docker compose up -d db
pytest tests/ -m "not benchmark"

# Smoke test against live API (~30s, auto-cleanup)
python scripts/smoke_test.py --url http://localhost:8000 --api-key <admin-key>

OpenClaw Plugin

Already running an OpenClaw fleet? Install MemClaw as a plugin against either the managed platform or your self-hosted stack:

# Point at whichever URL hosts your MemClaw API
export MEMCLAW_URL=https://memclaw.net          # managed
# or:  export MEMCLAW_URL=http://localhost:8000  # self-hosted
export MEMCLAW_KEY=your-key                      # `standalone` works in self-hosted standalone mode
export MEMCLAW_FLEET=my-fleet

curl -sf -H "X-API-Key: $MEMCLAW_KEY" \
  "$MEMCLAW_URL/api/v1/install-plugin?fleet_id=$MEMCLAW_FLEET&api_url=$MEMCLAW_URL" | bash

# Restart the gateway to load the plugin
openclaw gateway restart

The plugin claims the OpenClaw memory slot (replacing memory-core) and exposes the same 12 MCP tools. Full setup, agent prompts, and trust levels: static/docs/integration-guide.md.

Python client

Talk to any MemClaw deployment (managed or self-hosted) from Python:

pip install memclaw-client

from memclaw_client import MemClaw

mc = MemClaw("mc_xxx", tenant_id="my-team", agent_id="my-agent")
mc.write("Q3 revenue target is $4M, set on 2026-04-15.")
print(mc.recall("Q3 revenue target").summary)

A thin wrapper over the REST API — see clients/python/ for the full client.

TypeScript client

Same, from TypeScript / JavaScript (Node 18+, zero dependencies):

npm install @caura/memclaw-client

import { MemClaw } from "@caura/memclaw-client";

const mc = new MemClaw("mc_xxx", { tenantId: "my-team", agentId: "my-agent" });
await mc.write("Q3 revenue target is $4M, set on 2026-04-15.");
console.log((await mc.recall("Q3 revenue target")).summary);

See clients/typescript/ for the full client.


Features

Governance

  • Tenant isolation — row-level database separation per tenant; PII auto-detected and flagged on every write (surfaced in memory metadata as contains_pii/pii_types)
  • Visibility scopes — every memory is stamped at write time: scope_agent (private), scope_team (fleet-wide, default), or scope_org (cross-fleet). Cross-fleet recall is permissioned, not open
  • Agent trust tiers — four levels control cross-fleet reads, writes, and deletes. Agents are either provisioned atomically via POST /admin/agent-keys/provision (recommended — mints key + row + trust + fleet in one call) or auto-registered on first write (legacy fallback)
  • Full audit log — every write, delete, and transition logged with tenant and scope context

Memory Pipeline

  • Single-pass LLM enrichment — every write auto-classifies into one of 14 memory types, generates title/summary/tags, scores importance, flags PII, and extracts entities — from a single content field
  • Hybrid search — pgvector semantic similarity + full-text keyword matching + knowledge graph expansion (up to 2 hops), ranked by composite score of similarity, importance, freshness, and graph boost
  • Live knowledge graph — people, orgs, locations, and concepts extracted into entities and relations on every write. Semantic entity resolution (>0.85 cosine) auto-merges duplicates
  • Contradiction detection — RDF triple comparison + LLM semantic analysis detects conflicting memories and automatically supersedes them, with full contradiction chain tracking
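To illustrate what a composite ranking over similarity, importance, freshness, and graph boost could look like, here is a toy sketch. Everything numeric in it is an assumption made for the example: the blend weights, the 30-day freshness half-life, and the hop-based boost are not MemClaw's actual formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    title: str
    similarity: float          # semantic similarity, 0..1
    weight: float              # importance score, scaled to 0..1 here
    age_days: float            # time since the memory was written
    graph_hops: Optional[int]  # None if not reached through graph expansion


def composite_score(c: Candidate, half_life_days: float = 30.0) -> float:
    """Blend the four signals with illustrative (assumed) weights."""
    freshness = 0.5 ** (c.age_days / half_life_days)  # halves every half_life_days
    graph_boost = 0.0 if c.graph_hops is None else 1.0 / (1 + c.graph_hops)
    return 0.6 * c.similarity + 0.2 * c.weight + 0.1 * freshness + 0.1 * graph_boost


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates by descending composite score."""
    return sorted(candidates, key=composite_score, reverse=True)


fresh = Candidate("fresh fact", 0.47, 0.5, 0.0, None)
stale = Candidate("stale fact", 0.47, 0.5, 30.0, None)
ranked = rank([stale, fresh])
```

With identical similarity and importance, the fresher memory ranks first, which is the behavior the freshness term is meant to capture.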

Self-Improving Memory

  • Outcome-based learning (Karpathy Loop) — agents report success/failure after acting on recalled memories; the system reinforces what works and auto-generates preventive rule-type memories on failure
  • Crystallization — LLM merges near-duplicate memories into canonical atomic facts with full provenance; 8-status lifecycle automation retires stale data
  • Per-agent retrieval tuning — each agent optimizes its own retrieval profile (top_k, min_similarity, graph_max_hops, blend weights) from feedback, so search quality compounds with every interaction

Integrations

  • MCP server — built-in Model Context Protocol at /mcp (Streamable HTTP). Connect Claude Desktop, Claude Code, Cursor, Windsurf, or any MCP client with a URL and API key
  • Multi-provider LLM — primary + fallback provider chain per tenant (OpenAI, Gemini, Anthropic, OpenRouter) with platform defaults for zero-config tenants
  • Document store — structured JSONB collections alongside semantic memories for exact-field lookups (customer records, config, task lists)

How MemClaw compares

Accuracy benchmarks cluster the leading tools in a narrow band (see Performance). Where the field actually diverges is fleet capability and governance:

CapabilityMemClawMem0ZepLetta
Multi-fleet support
Agent trust tiers + keystone policies
Cross-vendor memory sharing
Contradiction detection + supersession
Per-agent retrieval tuning
PII detection & flagging
Audit trail / provenance⚠️ partial
Knowledge graph (auto-extracted)⚠️
MCP-native⚠️
OSS licenseApache 2.0Apache 2.0Apache 2.0Apache 2.0

Mem0, Zep, and Letta are solid projects for single-agent memory. MemClaw's lane is governed memory across agent fleets — multiple agents, teams, and vendors on one auditable memory plane. Comparison reflects our reading of public docs as of June 2026 — corrections welcome via issue or PR.


Performance

Benchmarked against the two most-cited public agent-memory benchmarks. Full results, methodology, and how to reproduce them live in BENCHMARKS.md; operator-scale context is in docs/performance.md; the full write-up is on the blog.

| | LoCoMo | LongMemEval | Search latency |
| --- | --- | --- | --- |
| Accuracy (LLM-judge) | 77.6% | 72.5% | |
| Token savings vs full context | 96.6% | 98.2% | |
| Latency | | | 23 ms p50 · 27 ms p95 |

Accuracy sits inside the leading cluster across the field (Mem0, Zep, and MemClaw score within a narrow band of each other). The axes we push hardest are latency and token efficiency, because those are the ones that compound as agent count grows: a few hundred ms of search latency is hidden behind a single LLM call, but across a fleet that cost is paid millions of times a day.
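A quick back-of-the-envelope calculation shows why these two axes compound. The 100,000-token context and the one million searches per day below are hypothetical inputs chosen for illustration; only the percentages and the 23 ms figure come from the table above.

```python
def tokens_sent(full_context_tokens: int, savings_pct: float) -> int:
    """Tokens still sent to the LLM after the stated percentage saving."""
    return round(full_context_tokens * (1 - savings_pct / 100))


def search_seconds_per_day(latency_ms: float, searches_per_day: int) -> float:
    """Total time spent in search across a fleet, in seconds per day."""
    return latency_ms * searches_per_day / 1000


locomo_tokens = tokens_sent(100_000, 96.6)          # 3400 tokens instead of 100,000
fast_fleet = search_seconds_per_day(23, 1_000_000)   # 23000.0 s, about 6.4 hours
slow_fleet = search_seconds_per_day(300, 1_000_000)  # 300000.0 s, about 83 hours
```

At fleet scale the gap between a 23 ms and a 300 ms search is the difference between hours and days of cumulative wait per day.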

Single-agent benchmarks can't measure cross-agent recall, outcome propagation between agents, fleet-scoped visibility, or governance-aware retrieval. Those are the questions that decide whether a memory system is deployable inside a company. See docs/performance.md.

Source: Fast, Token-Efficient, and Built for Fleets (2026-04-19).


MCP (Model Context Protocol)

Add MemClaw to any MCP client with one config block.

Self-hosted (localhost):

{
  "mcpServers": {
    "memclaw": {
      "url": "http://localhost:8000/mcp",
      "headers": { "X-API-Key": "standalone" }
    }
  }
}

Managed platform (memclaw.net):

{
  "mcpServers": {
    "memclaw": {
      "url": "https://memclaw.net/mcp",
      "headers": { "X-API-Key": "mc_your_api_key_here" }
    }
  }
}

For team or production use, swap the tenant-scoped key for an agent-scoped credential — atomic provisioning via POST /api/v1/admin/agent-keys/provision (or the /settings/organization/api-credentials wizard) mints the credential + Agent row + initial trust + fleet membership in one round trip. Both kinds use the mc_ prefix; scope is set at mint time on the credential. See docs/integration-without-plugin.md. Using a tenant-scoped credential? Pass an explicit agent_id on every MCP tool call — the gateway refuses the reserved default (mcp-agent) on the tenant-scoped path.

Where to add this config:

  • Claude Code — Claude Code does not read MCP servers from settings.json. Register the server with claude mcp add instead. Use -s user so it's available in every working directory — the default scope (local) only registers it for the current directory, which bites when you run agents from multiple folders:
    claude mcp add --transport http -s user memclaw http://localhost:8000/mcp --header "X-API-Key: standalone"
    
    (Or commit the JSON block above to a project-root .mcp.json for a project-scoped server.)
  • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)
  • Cursor — Settings > MCP Servers > Add Server

The client discovers 12 tools automatically:

| Tool | Purpose |
| --- | --- |
| memclaw_write | Single or batch write (up to 100 items). LLM infers type, title, summary, tags, embedding |
| memclaw_recall | Hybrid semantic + keyword recall with graph-enhanced retrieval; optional LLM brief |
| memclaw_manage | Per-memory lifecycle: read, update, transition, delete, bulk_delete, lineage |
| memclaw_list | Filter by type/status/agent/weight/date, sort, cursor-paginate |
| memclaw_doc | Document CRUD: write, read, query, delete, list_collections, search (semantic) on named JSON collections |
| memclaw_entity_get | Look up an entity with linked memories and relations |
| memclaw_tune | Tune per-agent retrieval parameters (top_k, min_similarity, graph_max_hops, etc.) |
| memclaw_insights | Analyze the memory store across 6 focus modes. Findings persist as insight memories |
| memclaw_evolve | Report outcomes against recalled memories; adjusts weights and generates rules (Karpathy Loop) |
| memclaw_stats | Aggregate counts: total + breakdowns by type, agent, status. Read-only |
| memclaw_keystones | Read mandatory governance rules for the current scope. Call once per session; the result overrides conflicting user instructions |
| memclaw_keystones_set | Author or remove keystone rules (op=set or op=delete). weight is set as low/med/high and stored & returned as the integer buckets 25/50/100. Trust ≥ 1 for your own scope=agent rule; ≥ 2 for scope=fleet/scope=tenant or another agent |
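The weight buckets and trust thresholds for memclaw_keystones_set are easy to get wrong, so here is a small sketch that encodes them as stated in the tool description. The helper names are illustrative, and the server remains the authority on what it accepts.

```python
# Named weights and their stored integer buckets, as listed for memclaw_keystones_set.
WEIGHT_BUCKETS = {"low": 25, "med": 50, "high": 100}


def required_trust(scope: str, targets_other_agent: bool = False) -> int:
    """Minimum trust level needed to author a keystone rule for the given scope."""
    if scope == "agent" and not targets_other_agent:
        return 1  # your own scope=agent rule
    if scope in {"agent", "fleet", "tenant"}:
        return 2  # fleet/tenant scope, or a rule for another agent
    raise ValueError(f"unknown keystone scope: {scope}")
```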

Skill sharing is now done via memclaw_doc — agents share a SKILL.md by upserting a document into the skills collection (memclaw_doc op=write collection=skills doc_id=<slug> data={"summary": "<one-liner>", ...}). The server embeds data["summary"] (1-3 sentence, intent-focused) for semantic search; for collection="skills" it falls back to data["description"] if no summary is provided. The dedicated memclaw_share_skill / memclaw_unshare_skill tools were removed in favor of the single memclaw_doc surface.
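The summary/description fallback described above can be sketched as follows. This mirrors the rule as worded in this section rather than the server's actual code; skill_write_args is a hypothetical helper, and the "body" field is an invented example of extra data.

```python
from typing import Optional


def embed_source(collection: str, data: dict) -> Optional[str]:
    """Pick the text that would be embedded for a document, per the rule above."""
    if data.get("summary"):
        return data["summary"]
    if collection == "skills":
        return data.get("description")  # fallback applies to the skills collection only
    return None


def skill_write_args(slug: str, summary: str, **extra) -> dict:
    """Build memclaw_doc arguments for sharing a skill (extra fields are up to you)."""
    return {
        "op": "write",
        "collection": "skills",
        "doc_id": slug,
        "data": {"summary": summary, **extra},
    }


args = skill_write_args("deploy-checklist", "Steps to verify before a production deploy.", body="...")
```

Providing a short, intent-focused summary is the safer choice, since it is what semantic search over skills will match against.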

Install the skill (Claude Code & Codex)

Install MemClaw's usage guide as a skill so your agent knows when and how to use the 12 tools — the memory/doc mental model, the three rules (recall, write, supersede), trust levels, common patterns, and anti-patterns. The skill is loaded on-demand (not per-turn), so it costs nothing until the agent reaches for MemClaw.

Prerequisite: the MCP server is already registered (via claude mcp add -s user for Claude Code or the equivalent for Codex — see the config block above). Confirm with claude mcp list — you should see memclaw: ... ✓ Connected.

Option A — one-liner (fastest)

Self-hosted (localhost):

curl -s "http://localhost:8000/api/v1/install-skill" | bash

Managed platform:

curl -s "https://memclaw.net/api/v1/install-skill" | bash

Option B — download, inspect, run (recommended for agents)

Automated agents (Claude Code, Codex) may refuse curl | bash for safety. Two-step install lets them audit the script first:

curl -s "http://localhost:8000/api/v1/install-skill" > /tmp/install-memclaw-skill.sh
less /tmp/install-memclaw-skill.sh      # review — it only does mkdir + curl + write
bash /tmp/install-memclaw-skill.sh

Options

| Query param | Effect |
| --- | --- |
| (none) | Install for both Claude Code and Codex (default) |
| ?agent=claude-code | Only Claude Code → ~/.claude/skills/memclaw/SKILL.md |
| ?agent=codex | Only Codex → ~/.agents/skills/memclaw/SKILL.md |

Verify

ls -la ~/.claude/skills/memclaw/SKILL.md       # Claude Code
ls -la ~/.agents/skills/memclaw/SKILL.md       # Codex

Restart your agent after installing — skills are loaded at startup. Re-run the installer any time to pull the latest version.

OpenClaw-plugin users get the skill automatically when the plugin installs; skip this step.


Deployment

The recommended way to run MemClaw is via Docker Compose (see Quick Start). This gives you a production-ready PostgreSQL + pgvector + Redis + API stack with a single command.

Published container images

Each release publishes multi-arch (linux/amd64, linux/arm64) images to GitHub Container Registry:

ghcr.io/caura-ai/caura-memclaw-core-api:v2.5.0
ghcr.io/caura-ai/caura-memclaw-core-storage-api:v2.5.0

Tags follow SemVer with floating aliases — :v1, :v1.0, :v1.0.0, plus :latest for the latest stable release. Pull them in your own compose file or Kubernetes manifests instead of building from source.
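As an illustration of the floating-alias scheme, the sketch below expands one release version into the tags it would be reachable under. It assumes the vMAJOR.MINOR.PATCH format shown above; whether a given release also carries :latest depends on it being the newest stable release, so that is left as a parameter.

```python
def floating_tags(version: str, is_latest_stable: bool = False) -> list[str]:
    """Expand a release like 'v2.5.0' into its SemVer floating aliases."""
    major, minor, patch = version.removeprefix("v").split(".")
    tags = [f"v{major}", f"v{major}.{minor}", f"v{major}.{minor}.{patch}"]
    if is_latest_stable:
        tags.append("latest")
    return tags


image = "ghcr.io/caura-ai/caura-memclaw-core-api"
references = [f"{image}:{tag}" for tag in floating_tags("v2.5.0", is_latest_stable=True)]
```

Pinning to the full vMAJOR.MINOR.PATCH tag gives reproducible deploys, while the shorter aliases trade that for automatic patch or minor updates.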

Manual deployment (without Docker)

The core-api/ service is a standard FastAPI app that runs under any ASGI server (uvicorn, hypercorn). Requirements:

  • Python 3.12+
  • PostgreSQL 16+ with the pgvector extension
  • Redis (optional; falls back to in-memory cache if unavailable)

uvicorn core_api.app:app --host 0.0.0.0 --port 8000 --workers 2

Deployment topologies

MemClaw ships with two operational modes for the storage layer. Single-node (default) is what you get from Docker Compose, pip install, or any fresh deploy — one core-storage-api instance serves both reads and writes. This is the right choice for any deployment that isn't seeing sustained 100+ writes/sec.

The reader/writer split is an opt-in topology for high-write-rate deploys that want to scale reads independently of writes — e.g. by pointing read traffic at a Postgres streaming replica. Enabling it means running two core-storage-api services with different roles and pointing core-api at both:

  • Set CORE_STORAGE_ROLE=writer on the write-serving instance; =reader on the read-serving instance(s).
  • Set CORE_STORAGE_READ_URL on core-api to the reader service URL. Leave CORE_STORAGE_API_URL pointing at the writer.
  • READ_DATABASE_URL on each core-storage-api can point at a read replica if you have one.

Defaults: CORE_STORAGE_ROLE=hybrid and CORE_STORAGE_READ_URL="" — both null-safe, so single-node deploys need zero configuration to get the legacy single-service behavior.
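One way to picture the null-safe defaults is as a small URL-resolution step on the core-api side. This is a sketch of the behavior described above, not the actual implementation, and the reader hostname in the comment is hypothetical.

```python
def storage_targets(env: dict[str, str]) -> dict[str, str]:
    """Resolve where reads and writes go, falling back to single-node behavior."""
    write_url = env["CORE_STORAGE_API_URL"]
    # An empty or missing CORE_STORAGE_READ_URL means reads share the writer service.
    read_url = env.get("CORE_STORAGE_READ_URL", "") or write_url
    return {"write": write_url, "read": read_url}


single_node = storage_targets({"CORE_STORAGE_API_URL": "http://localhost:8002"})
split = storage_targets({
    "CORE_STORAGE_API_URL": "http://localhost:8002",
    "CORE_STORAGE_READ_URL": "http://storage-reader:8002",  # hypothetical reader service
})
```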


Upgrading from v1.x

⚠️ v2.0.0 ships a destructive schema migration. If your installation is on v1.x and has any memories already stored, follow this procedure carefully — the migration NULLs every existing embedding to widen the pgvector column from 768 → 1024 dim. The application is designed to refuse the migration automatically; you must opt in.

What changes

  • Default embedder model: BAAI/bge-m3 (was: OpenAI text-embedding-3-small). Self-hosted via the new tei profile in docker-compose; documented in docs/local-embedder.md.
  • pgvector schema dim: vector(1024) (was: vector(768)).
  • Existing embeddings on memories.embedding, entities.name_embedding, and documents.embedding are NULLed by alembic revision 012_vector_dim_1024. Re-embedding is required; until rows are re-embedded, semantic search returns no results for those rows.

Procedure (OSS, docker-compose)

  1. Stop the stack so no writes happen during migration:

    docker compose down
    
  2. Snapshot the database. A pg_dump is the safest fallback. Replace <container> with the running PostgreSQL container name (typically caura-memclaw-db-1):

    docker compose up -d db    # bring just the DB back
    docker exec <container> pg_dump -U memclaw memclaw > backup-pre-v2.sql
    docker compose down
    
  3. Pull the new image and start with the migration opt-in env set. The gate enforces an explicit opt-in because the migration is destructive on a populated DB:

    docker compose pull
    MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS=true docker compose up -d
    

    The core-storage-api container will run alembic upgrade head on startup. The migration runs in seconds-to-minutes for typical OSS workloads.

  4. Verify migration completed. Look for the line Database initialization complete in the core-storage-api logs:

    docker compose logs core-storage-api | grep -i "alembic\|migration\|initialization complete"
    
  5. Re-embed your data. Three paths:

    • Lazy (zero action): the application re-embeds rows on next read or write that touches them. Search will return empty results for cold rows until they are touched. Acceptable for low-traffic personal deployments.
    • Eager (recommended): run the bundled backfill CLI. It walks every memory and entity with a NULL embedding and re-embeds via the configured provider. Idempotent — safe to re-run. First do a dry-run to estimate scope:
      docker compose run --rm core-storage-api \
        python -m core_storage_api.scripts.backfill_embeddings --dry-run
      
      Then the real run:
      docker compose run --rm core-storage-api \
        python -m core_storage_api.scripts.backfill_embeddings
      
      Optional knobs: --tenant-id <id> (per-tenant cutover), --batch-size N, --max-inflight N, --only-table memories|entities. Documents are NOT covered (their embed-source field is per-row JSON, not a fixed column); re-write any embedded documents to refresh them.
    • Eager (event-driven, recommended for multi-tenant production): if you run the core-worker service, drive the existing EMBED_REQUESTED consumer instead. The CLI scans WHERE embedding IS NULL and publishes one event per row, inheriting per-tenant concurrency + retry + DLQ:
      docker compose run --rm core-worker \
        python -m core_worker.cli backfill-embeddings --dry-run
      docker compose run --rm core-worker \
        python -m core_worker.cli backfill-embeddings
      
      Same knobs as the standalone script (--tenant-id, --batch-size, --max-inflight, --dry-run). Currently covers memories only.
  6. Once stable, unset MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS so subsequent up commands don't carry the opt-in:

    unset MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS  # if exported in the shell
    # or remove the line from your .env file
    

What if I skip the opt-in?

core-storage-api will refuse to start, with a clear error message reporting how many rows would be NULLed. The container exits non-zero; the rest of the stack stays healthy. No data is touched. Set the env var and retry.

Rolling back

The migration has a symmetric downgrade(). To revert, set the env var and explicitly downgrade:

docker compose run --rm \
  -e MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS=true \
  core-storage-api alembic downgrade 011

This NULLs any 1024-dim embeddings written since the upgrade and narrows the columns back to vector(768). The same data-loss tradeoff applies in reverse. For untouched-since-upgrade installations, the simpler recovery is to restore the snapshot from step 2.

v1.x → v2.x compatibility for client code

No public API changes. Code that reads memory embeddings via the search/recall endpoints is unaffected. Client code that hardcodes 768 for vector lengths should be updated to read VECTOR_DIM from common.constants.
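For client code that handles raw vectors, a single named constant avoids scattering the dimension through the codebase. The sketch below uses a local stand-in for common.constants.VECTOR_DIM so that it runs on its own; in a real checkout you would import the constant instead.

```python
VECTOR_DIM = 1024  # local stand-in for common.constants.VECTOR_DIM in v2.x (was 768 in v1.x)


def validate_embedding(vector: list[float]) -> list[float]:
    """Reject vectors whose length does not match the schema dimension."""
    if len(vector) != VECTOR_DIM:
        raise ValueError(f"expected {VECTOR_DIM} dimensions, got {len(vector)}")
    return vector


ok = validate_embedding([0.0] * VECTOR_DIM)
```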


API Reference

All routes are versioned under /api/v1/. Interactive Swagger docs at /api/docs.

Memory endpoints
| Endpoint | Method | Description |
| --- | --- | --- |
| /memories | POST | Write a memory. LLM enrichment + embedding + entity extraction + contradiction detection. "persist": false for extract-only preview |
| /memories/bulk | POST | Write up to 100 memories. Batches embeddings, parallelizes enrichment, single transaction. Requires X-Bulk-Attempt-Id header (per-attempt idempotency); a retry with the same id resolves committed rows as duplicate_attempt instead of duplicating. Returns 200 (clean / all-error) or 207 Multi-Status (mixed); read the per-item status |
| /memories | GET | List memories (filter by type, status, agent; paginate) |
| /memories/{id} | GET | Full memory detail (embedding stats, entity links, RDF triple, temporal bounds) |
| /memories/{id} | PATCH | Update content or metadata. Re-embeds if content changes |
| /memories/{id} | DELETE | Soft delete (sets status to deleted) |
| /memories/{id}/status | PATCH | Update lifecycle status |
| /memories/{id}/contradictions | GET | View contradiction chain |
| /memories | DELETE | Bulk soft-delete |
| /memories/stats | GET | Counts by type, agent, and status |
| /search | POST | Hybrid semantic + keyword search with graph-enhanced retrieval |
| /recall | POST | Search + LLM summarization; returns context paragraph + source memories |
| /ingest/preview | POST | Extract 5-20 atomic facts from a URL or text (no writes) |
| /ingest/commit | POST | Write previewed facts as memories |
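The /memories/bulk idempotency contract suggests a simple client pattern: split the input into batches of at most 100, give each batch one attempt id, and reuse that id on retries. The sketch below shows only that planning step. Using a UUID for the attempt id is an assumption, and the request body shape is not specified in this document, so it is left out.

```python
import uuid
from typing import Iterator

MAX_BULK_ITEMS = 100  # per-request limit stated in the table above


def chunked(items: list, size: int = MAX_BULK_ITEMS) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_bulk_writes(memories: list[dict]) -> list[tuple[str, list[dict]]]:
    """Pair every chunk with one attempt id; reuse that id when retrying the chunk."""
    return [(str(uuid.uuid4()), chunk) for chunk in chunked(memories)]


def bulk_headers(api_key: str, attempt_id: str) -> dict[str, str]:
    """Headers for one bulk attempt, including the required idempotency header."""
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
        "X-Bulk-Attempt-Id": attempt_id,
    }


plan = plan_bulk_writes([{"tenant_id": "default", "content": f"fact {i}"} for i in range(250)])
```

Since a 200 can mean either all-clean or all-error, a client should read the per-item status on both 200 and 207 responses.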

Knowledge graph endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /entities | GET | List entities (filter by type, search) |
| /entities/upsert | POST | Create or update entity |
| /entities/{id} | GET | Entity detail with relations and linked memories |
| /relations/upsert | POST | Create or update relation |
| /graph | GET | Full knowledge graph (entities + relations) |

Evolve, Insights, Agents, Crystallizer, Documents, Fleet, Admin

Karpathy Loop / Evolve

| Endpoint | Method | Description |
| --- | --- | --- |
| /evolve/report | POST | Report an outcome (success/failure/partial) against recalled memories |

Insights

| Endpoint | Method | Description |
| --- | --- | --- |
| /insights/generate | POST | LLM-powered analysis. Focus: contradictions, failures, stale, divergence, patterns, discover |

Agents

| Endpoint | Method | Description |
| --- | --- | --- |
| /agents | GET | List registered agents with trust levels |
| /agents/{id} | GET | Single agent detail |
| /agents/{id}/trust | PATCH | Set trust level (0-3) |

Memory Crystallizer

| Endpoint | Method | Description |
| --- | --- | --- |
| /crystallize | POST | Trigger crystallization for a tenant |
| /crystallize/all | POST | Trigger for all tenants (admin key, nightly) |
| /crystallize/reports | GET | List crystallization reports |
| /crystallize/latest | GET | Most recent completed report |

Documents

| Endpoint | Method | Description |
| --- | --- | --- |
| /documents | POST | Store or update a structured JSON document |
| /documents/{id} | GET | Retrieve document by ID |
| /documents/query | POST | Query by field equality filters |
| /documents/{id} | DELETE | Delete a document |

Fleet

| Endpoint | Method | Description |
| --- | --- | --- |
| /fleet/heartbeat | POST | Plugin heartbeat: upserts node status, returns pending commands |
| /fleet/nodes | GET | List fleet nodes with status (online/stale/offline) |
| /fleet/commands | POST | Queue a command for a node |
| /fleet/commands | GET | List command history |

Admin + System

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Liveness check |
| /version | GET | Current version |
| /tool-descriptions | GET | Canonical MCP tool descriptions |
| /admin/tenants | GET | List all tenants (admin key) |
| /admin/fleets | GET | List fleets across all tenants (admin key) |
| /admin/memories | GET | List memories across all tenants with filters (admin key) |
| /admin/memories/stats | GET | Memory counts by tenant/type/status (admin key) |
| /settings | GET / PUT | Per-tenant configuration |
| /audit-log | GET | Audit log entries |
| /mcp | POST | MCP Streamable HTTP endpoint (mounted at app root, NOT under /api/v1) |

Auth: X-API-Key header for all endpoints. Admin endpoints require the admin key. Public (no auth): /api/v1/health, /api/v1/version, /api/v1/tool-descriptions.
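
As a small illustration of the auth rule above, a client can attach X-API-Key everywhere except the three public paths. The helper name `auth_headers` is hypothetical; only the header name and the public paths come from this section.

```python
# Paths this section lists as public (no auth required).
PUBLIC_PATHS = frozenset({
    "/api/v1/health",
    "/api/v1/version",
    "/api/v1/tool-descriptions",
})


def auth_headers(path, api_key=None):
    """Return the headers a client should attach for the given path."""
    if path in PUBLIC_PATHS:
        return {}  # public endpoints need no key
    if not api_key:
        raise ValueError("an API key is required for " + path)
    return {"X-API-Key": api_key}
```

Admin routes use the same header; the difference is that the value must be the admin key.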

Gateway-injected headers (trusted only behind the enterprise gateway):

| Header | Effect |
| --- | --- |
| X-Agent-ID | Scopes the request to this agent |
| X-Org-Read-Only: true | Read-only mode: creates/updates return 403 |
| X-Tenant-ID | Tenant identity when using the shared MEMCLAW_API_KEY gate |

These headers are honored unconditionally — core-api must not be network-exposed without a gateway that strips them from untrusted callers.
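
Because core-api trusts these headers as-is, whatever sits in front of it has to drop caller-supplied copies before adding its own. The sketch below shows that idea on a plain dict of headers; it is not the enterprise gateway's code, and the function and parameter names are hypothetical.

```python
# Header names from the table above, lowercased for case-insensitive matching.
# X-Memclaw-Gateway is also listed as gateway-injected under "Internal".
GATEWAY_HEADERS = frozenset({
    "x-agent-id",
    "x-org-read-only",
    "x-tenant-id",
    "x-memclaw-gateway",
})


def sanitize_inbound_headers(headers, tenant_id=None, agent_id=None, read_only=False):
    """Strip spoofable headers, then inject values the gateway resolved itself."""
    clean = {k: v for k, v in headers.items() if k.lower() not in GATEWAY_HEADERS}
    if tenant_id is not None:
        clean["X-Tenant-ID"] = tenant_id
    if agent_id is not None:
        clean["X-Agent-ID"] = agent_id
    if read_only:
        clean["X-Org-Read-Only"] = "true"
    return clean
```

The order matters: stripping first means a caller can never override a value the gateway derived from its own authentication step.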

Rate limiting (managed platform)

These limits apply to the managed platform at memclaw.net. In the OSS edition, rate limiting is a no-op — see the Rate limiting section below.

| Scope | Limit |
| --- | --- |
| Memory writes | 60 req/min per API key |
| Memory searches | 120 req/min per API key |
| General reads | 300 req/min per API key |
| Auth endpoints | 10 req/min per IP |
| Global DDoS floor | 1000 req/min per IP |

Exceeded limits return HTTP 429 with a Retry-After header.
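
A client can honor the 429 + Retry-After contract with a small retry loop. The sketch below is transport-agnostic: `send` is a hypothetical callable returning `(status, headers, body)`, and the fallback delay is an assumption for the case where Retry-After is missing or given as an HTTP date rather than seconds.

```python
import time


def retry_delay_seconds(status_code, headers, default=1.0):
    """Return how long to wait before retrying, or None if no retry is needed."""
    if status_code != 429:
        return None
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        # Header absent, or in HTTP-date form: fall back to a fixed delay.
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it is not rate-limited or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        delay = retry_delay_seconds(status, headers)
        if delay is None or attempt == max_attempts:
            return status, body
        sleep(delay)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.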


Configuration

All configuration is via environment variables or .env. See .env.example for the full list.

Migrating from a pre-1.0 deploy? The legacy ALLOYDB_* env var names are still accepted as aliases — POSTGRES_HOST falls back to ALLOYDB_HOST, etc. Aliases will be dropped in a future major release.
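
The alias fallback can be pictured as a lookup that prefers the new name and only then tries the legacy one. This is a sketch of the behavior described above, not the code in config.py; it assumes every alias follows the same POSTGRES_ to ALLOYDB_ prefix swap as the POSTGRES_HOST example.

```python
import os


def env_with_alias(name, default=None, environ=None):
    """Read a setting, falling back to its legacy ALLOYDB_* alias."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    if name.startswith("POSTGRES_"):
        # Assumption: aliases differ from the new names only by prefix.
        legacy = "ALLOYDB_" + name[len("POSTGRES_"):]
        if legacy in environ:
            return environ[legacy]
    return default
```

For example, `env_with_alias("POSTGRES_HOST", "127.0.0.1")` returns the ALLOYDB_HOST value on a deploy that has not yet renamed its variables.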

| Variable | Default | Description |
| --- | --- | --- |
| POSTGRES_HOST | 127.0.0.1 | Database host |
| POSTGRES_PORT | 5432 | Database port |
| POSTGRES_USER | memclaw | Database user |
| POSTGRES_PASSWORD | changeme | Database password |
| POSTGRES_DB | memclaw | Database name |
| POSTGRES_USE_IAM_AUTH | false | Use GCP IAM for DB auth (managed Postgres on GCP only) |
| ADMIN_API_KEY | (empty) | Admin API key; bypasses tenant enforcement |
| EMBEDDING_PROVIDER | openai | openai, local, or fake |
| ENTITY_EXTRACTION_PROVIDER | openai | openai, gemini, anthropic, openrouter, fake, or none |
| ENTITY_EXTRACTION_MODEL | gpt-5.4-nano | LLM model for enrichment and entity extraction |
| OPENAI_API_KEY | | Required for OpenAI embeddings and enrichment |
| USE_LLM_FOR_MEMORY_CREATION | true | LLM auto-classifies type, weight, title, summary, tags on write |
| ANTHROPIC_API_KEY | | Required for Anthropic |
| OPENROUTER_API_KEY | | Required for OpenRouter |
| GEMINI_API_KEY | | Required for Gemini (Developer API, from AI Studio) |
| CORS_ORIGINS | http://localhost:3000 | Comma-separated allowed CORS origins |
| ENVIRONMENT | development | development or production |
| SETTINGS_ENCRYPTION_KEY | | Fernet key for encrypting tenant settings. Required in production |
| PLATFORM_LLM_PROVIDER | (empty) | Platform-default LLM: openai, vertex, or empty to disable |
| PLATFORM_LLM_MODEL | (empty) | Model override (e.g. gpt-5.4-nano, gemini-3.1-flash-lite-preview) |
| PLATFORM_LLM_API_KEY | | OpenAI API key for the platform LLM singleton |
| PLATFORM_LLM_GCP_PROJECT_ID | | GCP project for platform Vertex LLM |
| PLATFORM_LLM_GCP_LOCATION | us-central1 | GCP region for platform Vertex LLM |
| PLATFORM_EMBEDDING_PROVIDER | (empty) | Platform-default embeddings: openai or empty to disable |
| PLATFORM_EMBEDDING_MODEL | (empty) | Embedding model override (e.g. text-embedding-3-small) |
| PLATFORM_EMBEDDING_API_KEY | | OpenAI API key for platform embeddings |

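
SETTINGS_ENCRYPTION_KEY expects a Fernet key, which is 32 random bytes in URL-safe base64. One way to produce such a value with only the standard library is sketched below; the helper name is hypothetical, and if the `cryptography` package is installed, `Fernet.generate_key()` yields a key in the same format.

```python
import base64
import os


def generate_fernet_key():
    """Return a new Fernet-format key: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into .env as SETTINGS_ENCRYPTION_KEY=...
    print(generate_fernet_key())
```

Generate the key once and store it with your other secrets; rotating it would presumably make previously encrypted tenant settings unreadable.
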
Project structure
memclaw/
├── core-api/                      # Main FastAPI service
│   └── src/core_api/
│       ├── app.py                 # FastAPI app, lifespan, middleware
│       ├── mcp_server.py          # MCP server (Streamable HTTP, 12 tools)
│       ├── constants.py           # Tool descriptions, limits, ranking params
│       ├── config.py              # Settings (env vars)
│       ├── auth.py                # API key + JWT auth, tenant enforcement
│       ├── routes/                # Route handlers
│       ├── services/              # Business logic
│       ├── providers/             # LLM/embedding abstraction + fallback
│       ├── pipeline/              # Composable write/search pipelines
│       └── tools/                 # MCP tool implementations
│
├── core-storage-api/              # PostgreSQL CRUD microservice
│   └── src/core_storage_api/
│       ├── routers/               # Memory, entity, document, fleet CRUD
│       ├── services/              # ORM operations
│       └── database/              # SQLAlchemy models, Alembic migrations
│
├── plugin/                        # OpenClaw plugin (TypeScript)
│   └── src/
│       ├── tools.ts               # Tool implementations
│       ├── agent-auth.ts          # Per-agent credentials (agent-scoped mc_ keys)
│       ├── context-engine.ts      # Auto-read/write lifecycle
│       ├── heartbeat.ts           # 60s heartbeat → MemClaw API
│       └── educate.ts             # Agent education delivery
│
├── common/                        # Shared SQLAlchemy ORM models and constants
├── tests/                         # Test suite
├── scripts/                       # Smoke tests, benchmarks, export tools
├── docker-compose.yml             # Production-like stack
├── docker-compose.dev.yml         # Dev stack
└── .env.example                   # Full configuration reference
Latency benchmarks

Typical results on a single-instance deployment (OpenAI embeddings + GPT-5.4 Nano):

| Operation | Mean | P50 | P95 |
| --- | --- | --- | --- |
| memclaw_write | ~2000ms | ~2000ms | ~2300ms |
| memclaw_recall | ~650ms | ~640ms | ~670ms |
| memclaw_recall (with include_brief=true) | ~1300ms | ~1200ms | ~2100ms |

Write latency is dominated by LLM enrichment; recall latency is dominated by the embedding API call.

python scripts/latency_test.py --url http://localhost:8000 --api-key <admin-key> --runs 20
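
To compare your own timings against the Mean / P50 / P95 columns, the statistics can be computed from raw samples as sketched here. This uses the nearest-rank percentile definition, which is an assumption; scripts/latency_test.py may compute percentiles differently.

```python
import statistics


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100 avoids floating-point rounding surprises.
    rank = (p * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def summarize_latencies(samples_ms):
    """Return mean, P50 and P95 for a list of latencies in milliseconds."""
    return {
        "mean": statistics.mean(samples_ms),
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
    }
```

With only 20 runs, P95 is simply the second-slowest sample, so expect it to move noticeably between runs.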

Public API & Stability

MemClaw v1.x follows SemVer 2.0.0. The surfaces below are stable; everything else is internal and may change in any release.

Stable surfaces

MCP tools (12)

The MCP server is mounted at /mcp. Tool names, parameter names, and the documented op-dispatch values are stable.

| Tool | Purpose |
| --- | --- |
| memclaw_recall | Hybrid semantic + keyword search over memories, with optional LLM-summarised brief. |
| memclaw_write | Single or batch (≤100) memory write; auto-enriched with type, title, summary, tags. |
| memclaw_manage | Per-memory lifecycle, op-dispatched: read, update, transition, delete, bulk_delete, lineage. |
| memclaw_list | Non-semantic enumeration with filters, sort, cursor pagination. |
| memclaw_doc | Structured-document CRUD, op-dispatched: write, read, query, delete, list_collections, search. |
| memclaw_entity_get | Look up a knowledge-graph entity by UUID. |
| memclaw_tune | Read/update an agent's per-search profile (top_k, fts_weight, freshness, blend, …). |
| memclaw_insights | Karpathy-Loop reflection: contradictions, failures, stale, divergence, patterns, discover. |
| memclaw_evolve | Karpathy-Loop feedback: record an outcome (success, failure, or partial) against memories. |
| memclaw_stats | Aggregate counts: total + breakdowns by type / agent / status. Read-only. |
| memclaw_keystones | Read mandatory governance rules for the current scope (tenant + fleet + agent merged). Call once per session. |
| memclaw_keystones_set | Author/remove keystone rules, op-dispatched: set, delete. Trust ≥ 1 for self-authored scope=agent; ≥ 2 otherwise. |

Skill sharing uses the generic memclaw_doc surface — write/read/query/search/delete on collection="skills". The server validates the slug and embeds data["summary"] for semantic discovery (with a back-compat fallback to data["description"] for skills).
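
For orientation, an MCP tool invocation is a JSON-RPC 2.0 `tools/call` message POSTed to /mcp. The sketch below only builds that message for a skill search through memclaw_doc. The argument keys `op` and `query` are assumptions (check /api/v1/tool-descriptions for the real parameter names); `collection="skills"` comes from the paragraph above. A real client would also perform the MCP initialize handshake first, which is omitted here.

```python
import json


def mcp_tool_call(tool, arguments, request_id=1):
    """Build a JSON-RPC 2.0 tools/call message for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments: "op" and "query" are illustrative key names.
message = mcp_tool_call(
    "memclaw_doc",
    {"op": "search", "collection": "skills", "query": "rollback a failed deploy"},
)
body = json.dumps(message)  # what would be sent as the POST body
```

In practice most MCP clients (Claude Desktop, Cursor, and similar) build these messages for you; the raw form is mainly useful for debugging.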

REST endpoints

All paths are prefixed with /api/v1 unless noted. Request and response shapes documented in the OpenAPI schema at /openapi.json are part of the contract.

| Area | Endpoints |
| --- | --- |
| Memory | GET/POST /memories, PATCH /memories/{id}, DELETE /memories/{id}, PATCH /memories/{id}/status, POST /memories/bulk, POST /memories/bulk-delete, GET /memories/stats, GET /memories/{id}, GET /memories/{id}/contradictions, POST /search, POST /recall, POST /ingest/preview, POST /ingest/commit |
| Knowledge graph | GET /entities, GET /entities/{id}, POST /entities/upsert, GET /graph, POST /relations/upsert |
| Documents | POST /documents, GET /documents, GET /documents/{id}, POST /documents/query, DELETE /documents/{id} |
| Keystones | GET /memclaw/keystones, POST /memclaw/keystones, DELETE /memclaw/keystones/{doc_id} |
| Fleet | POST /fleet/heartbeat, GET /fleet/nodes, POST /fleet/commands, GET /fleet/commands |
| Agents | GET /agents, GET /agents/{id}, PATCH /agents/{id}/trust, POST /admin/agent-keys/provision (atomic key + row + trust + fleet), GET /whoami (identity probe) |
| Insights | POST /insights/generate |
| Evolve | POST /evolve/report |
| Crystallizer | POST /crystallize, POST /crystallize/all, GET /crystallize/reports, GET /crystallize/latest |
| Settings | GET/PUT /settings |
| System | GET /health, GET /version, GET /tool-descriptions, GET /audit-log |
| MCP | POST /mcp (Streamable HTTP transport, mounted at app root) |
| Bootstrap (plugin) | GET /plugin-source, GET /plugin-source-hash, GET/POST /install-plugin, GET /install-skill, GET /skill/memclaw. Aliased under /api (no /v1) for one-line installers. |
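
The prefix rule (everything under /api/v1 except /mcp, which sits at the app root) is easy to get wrong in a hand-written client, so here is a small sketch. It builds, without sending, an extract-only preview write; the `"content"` field name is an assumption, while `"persist": false` comes from the Memory endpoints table. Helper names are hypothetical.

```python
import json
import urllib.request

API_PREFIX = "/api/v1"
ROOT_MOUNTED = frozenset({"/mcp"})  # mounted at the app root, not under /api/v1


def endpoint_url(base_url, path):
    """Resolve a documented path to a full URL."""
    prefix = "" if path in ROOT_MOUNTED else API_PREFIX
    return base_url.rstrip("/") + prefix + path


def preview_write_request(base_url, api_key, text):
    """Build a POST /memories request that previews enrichment without persisting."""
    payload = {"content": text, "persist": False}  # "content" is an assumed field name
    return urllib.request.Request(
        endpoint_url(base_url, "/memories"),
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )
```

Sending the request is a single `urllib.request.urlopen(request)` call once a server is running; the OpenAPI schema at /openapi.json is the authority on the actual body fields.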

Plugin environment variables

Read by the OpenClaw plugin. The plugin's published name (memclaw) and these variables are the public contract; the plugin's TypeScript module structure is internal.

| Var | Purpose |
| --- | --- |
| MEMCLAW_API_URL | Base URL of the core-api server. |
| MEMCLAW_API_KEY | Tenant or admin API key sent in X-API-Key. |
| MEMCLAW_TENANT_ID | Optional pre-resolved tenant id; bypasses lookup. |
| MEMCLAW_FLEET_ID | Default fleet id for writes/heartbeat. |
| MEMCLAW_NODE_NAME | Fleet node identifier reported on heartbeat. |
| MEMCLAW_AUTO_WRITE_TURNS | Auto-write turn summaries (default true). |

Server environment variables

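These mirror the Configuration table above, which lists the defaults. Variables that appear here but not there (POSTGRES_REQUIRE_SSL, MEMCLAW_API_KEY, IS_STANDALONE, REDIS_URL) are covered in .env.example.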

| Group | Vars |
| --- | --- |
| Database | POSTGRES_HOST, POSTGRES_PORT, POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB, POSTGRES_USE_IAM_AUTH, POSTGRES_REQUIRE_SSL |
| Auth | ADMIN_API_KEY, MEMCLAW_API_KEY, IS_STANDALONE |
| Providers | EMBEDDING_PROVIDER, ENTITY_EXTRACTION_PROVIDER, OPENAI_API_KEY, ANTHROPIC_API_KEY, OPENROUTER_API_KEY, GEMINI_API_KEY, USE_LLM_FOR_MEMORY_CREATION |
| Runtime | CORS_ORIGINS, ENVIRONMENT, SETTINGS_ENCRYPTION_KEY, REDIS_URL |

Auth modes

| Mode | Activated by | Use case |
| --- | --- | --- |
| Standalone | IS_STANDALONE=true | Single-tenant self-host; auth bypassed. |
| Multi-tenant admin | ADMIN_API_KEY=… | Operator key for multi-tenant deployments. |
| Shared gate | MEMCLAW_API_KEY=… | Optional shared secret required on every non-admin request. |

See AGENT-INSTALL.md for installation flows that exercise each mode.
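
As a quick sanity check before deploying, the table above can be turned into a helper that reports which modes a given environment would switch on. This is a sketch under assumptions: how the server parses IS_STANDALONE and how it resolves several modes being set at once are not specified here, so the helper simply lists every mode whose variable is set.

```python
def active_auth_modes(env):
    """List the auth modes that the given environment mapping would activate."""
    modes = []
    # Assumption: the flag is enabled by the literal string "true".
    if env.get("IS_STANDALONE", "").strip().lower() == "true":
        modes.append("standalone")
    if env.get("ADMIN_API_KEY"):
        modes.append("multi-tenant admin")
    if env.get("MEMCLAW_API_KEY"):
        modes.append("shared gate")
    return modes
```

For example, calling `active_auth_modes(dict(os.environ))` in a deploy script makes it easy to fail fast if standalone mode is on in an environment that should enforce auth.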

Internal (not covered by SemVer)

Anything not listed above is internal and may change in any release without a major version bump:

  • Python module layout (core_api.middleware.*, core_api.providers.*, core_api.pipeline.*, core_api.services.*, common/*)
  • Database schema, table names, migration paths
  • Gateway-injected HTTP headers (X-Memclaw-Gateway, X-Tenant-ID, X-Agent-ID, X-Org-Read-Only)
  • Most /api/v1/admin/* and all /api/v1/testing/* routes (the documented exception is POST /admin/agent-keys/provision, which is part of the stable identity-bootstrap surface — see the Agents row above)
  • The core-storage-api microservice (internal, not user-facing)
  • The plugin's TypeScript module structure
  • API-key prefix formats — currently unified on mc_… (with legacy mca_… / mci_… aliases still accepted via back-compat); formats may continue to evolve

Reporting breaking changes

Contributors who introduce a breaking change to a stable surface must:

  • Add a BREAKING CHANGE: trailer to the commit message describing the impact and any migration steps.
  • Apply the kind/breaking label to the pull request.

Reviewers will block merges to dev that touch a stable surface without these markers. If you are unsure whether a change is breaking, open the PR with the label and let review decide — better to over-label than ship a silent break.
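
The two markers can also be checked mechanically, for instance in a pre-merge script. The sketch below is a hypothetical helper, not part of the repository's CI; it only tests for the trailer text and the label name given above.

```python
BREAKING_TRAILER = "BREAKING CHANGE:"
BREAKING_LABEL = "kind/breaking"


def has_breaking_trailer(commit_message):
    """True if any line of the commit message starts with the trailer."""
    return any(
        line.startswith(BREAKING_TRAILER) for line in commit_message.splitlines()
    )


def breaking_change_is_marked(commit_message, pr_labels):
    """True only when both the commit trailer and the PR label are present."""
    return has_breaking_trailer(commit_message) and BREAKING_LABEL in pr_labels
```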


Telemetry and error tracking

MemClaw supports optional Sentry integration for error tracking and performance monitoring:

  • Opt-in only — set the SENTRY_DSN environment variable to enable. No errors are reported unless you explicitly configure a DSN.
  • No usage analytics — MemClaw does not collect usage statistics, feature flags, or behavioral data.
  • No phone-home — the application makes zero outbound calls unless you configure a Sentry DSN or an LLM/embedding provider.

OpenClaw Plugin

See static/docs/integration-guide.md for full plugin setup, agent system prompts, and usage examples.


Rate limiting

Rate limiting in the OSS edition is a no-op — all rate-limit decorators are identity functions that accept every request without throttling. For production deployments exposed to the internet, add rate limiting at your reverse proxy (nginx, Caddy, Cloudflare) or implement application-level limiting in core-api/src/core_api/middleware/rate_limit.py.
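
To make the distinction concrete, the sketch below contrasts an identity decorator (the no-op shape described above) with a minimal in-process sliding-window limiter you could adapt. Both are illustrations under assumptions: the decorator signature in rate_limit.py is not shown in this document, and all names here are hypothetical. An in-process limiter only covers a single worker, so a reverse proxy remains the simpler option for multi-instance deployments.

```python
import time
from collections import deque


def no_op_limit(_rule):
    """Identity decorator: returns the wrapped function unchanged."""
    def decorator(func):
        return func
    return decorator


class SlidingWindowLimiter:
    """Allow at most max_calls per key within any window_seconds span."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self.calls = {}  # key -> deque of accepted call timestamps

    def allow(self, key):
        now = self.clock()
        timestamps = self.calls.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_calls:
            return False  # caller should answer 429 with Retry-After
        timestamps.append(now)
        return True
```

A limiter such as `SlidingWindowLimiter(60, 60)` keyed by API key would approximate the managed platform's 60 req/min write limit.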

Telemetry

MemClaw does not phone home by default. No usage data, analytics, or tracking of any kind.

If you set the SENTRY_DSN environment variable, Sentry error tracking is enabled — crash reports and performance traces are sent to your configured Sentry project. When SENTRY_DSN is empty (the default), Sentry is not initialized and no data leaves the server.

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines, development setup, and how to submit PRs.


FAQ

What is MemClaw? MemClaw is open-source governed shared memory for AI agent fleets: cross-agent, cross-fleet recall with visibility scopes, trust tiers, keystone policies, audit trails, and tenant isolation enforced on every operation — plus self-improving retrieval through outcome-based learning.

How is MemClaw different from a vector database? MemClaw uses pgvector under the hood but is not a vector DB wrapper. On top of hybrid search it adds fleet orchestration, per-agent retrieval tuning, contradiction detection, an 8-status lifecycle, an auto-extracted knowledge graph, LLM enrichment on every write, row-level tenant isolation, and audit trails on every operation.

How is MemClaw different from Mem0 or Zep? Mem0 and Zep focus on memory for individual agents; accuracy benchmarks cluster all three tools in a narrow band. MemClaw is built for fleets: multiple agents across teams and vendors sharing one governed memory plane, with trust tiers, keystone policies, and cross-fleet permissions those tools don't address. See How MemClaw compares.

Does MemClaw work with Claude Desktop, Claude Code, Cursor, or Windsurf? Yes — MemClaw is MCP-native. Paste a JSON config with a URL and API key into any MCP client and 12 tools appear immediately.

Can agents from different vendors share memory? Yes — that's the point. An Anthropic agent recalls what an OpenAI agent wrote, under the same governance rules — with trust tiers and visibility scopes deciding what crosses fleet boundaries.

Is MemClaw really free? The full engine — storage, 12 MCP tools, plugin, audit trail — is Apache 2.0. Run it yourself forever. The managed platform at memclaw.net adds hosting, scaling, and enterprise governance for teams that don't want to operate infrastructure.

Who runs MemClaw in production? eToro (NASDAQ: ETOR) runs 300+ agents on MemClaw — 26,500+ memories, 1,372 shared skills, 23 ms p50 search. Case study →


License

MemClaw is licensed under the Apache License, Version 2.0.

See NOTICE for copyright and third-party attributions.

Trademarks

"MemClaw" and "Caura" are trademarks of Caura. The Apache License 2.0 grants permission to use the source code but does not grant permission to use these names, logos, or branding in a way that suggests endorsement of, or affiliation with, any derivative work. See Apache License 2.0 §6 for the full legal terms.