context-mem

Context optimization for AI coding assistants — 99% token savings, zero configuration, no LLM dependency.


AI coding assistants waste 60–80% of their context window on raw tool outputs — full npm logs, verbose test results, uncompressed JSON. This means shorter sessions, lost context, and repeated work.

context-mem captures tool outputs via hooks, compresses them using 14 content-aware summarizers, stores everything in local SQLite with full-text search, and serves compressed context back through the MCP protocol. No LLM calls, no cloud, no cost.

How It Compares

| | context-mem | claude-mem | context-mode | Context7 |
|---|---|---|---|---|
| Approach | 14 specialized summarizers | LLM-based compression | Sandbox + intent filter | External docs injection |
| Token Savings | 99% (benchmarked) | ~95% (claimed) | 98% (claimed) | N/A |
| Search | BM25 + Trigram + Fuzzy + Vector | Basic recall | BM25 + Trigram + Fuzzy | Doc lookup |
| Semantic Search | Local embeddings (free) | LLM-based ($$$) | No | No |
| LLM Calls | None (free, deterministic) | Every observation (~$57/mo) | None | None |
| Activity Journal | File edits, commands, reads | No | No | No |
| Cross-Session Memory | Journal + snapshots + DB | LLM summaries | Yes | No |
| Knowledge Base | 5 categories, auto-extraction, relevance decay | No | No | No |
| Budget Management | Configurable limits + overflow | No | Basic throttling | No |
| Event Tracking | P1–P4, error-fix detection | No | Session events only | No |
| Dashboard | Real-time web UI | Basic view | No | No |
| Session Continuity | Snapshot save/restore | Partial | Yes | No |
| Content Types | 14 specialized detectors | Generic LLM | Generic sandbox | Docs only |
| Model Lock-in | None (MCP protocol) | Claude-only | Claude-only | Any |
| Privacy | Fully local, tag stripping | Local | Local | Cloud |
| License | MIT | AGPL-3.0 | Elastic v2 | Open |

Quick Start

Claude Code (recommended):

```
/plugin marketplace add JubaKitiashvili/context-mem
/plugin install context-mem@context-mem
```

npm (manual):

```bash
npm install -g context-mem
cd your-project
context-mem init
context-mem serve
```
More platforms — Cursor, Windsurf, Copilot, Cline, Roo Code, Gemini CLI, Goose, OpenClaw, CrewAI, LangChain

Cursor — `.cursor/mcp.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

Windsurf — `.windsurf/mcp.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

GitHub Copilot — `.vscode/mcp.json`:

```json
{ "servers": { "context-mem": { "type": "stdio", "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

Cline — add to MCP settings:

{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"], "disabled": false } } }

Roo Code — same as Cline format above.

Gemini CLI — `.gemini/settings.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

Goose — add to profile extensions:

```yaml
extensions:
  context-mem:
    type: stdio
    cmd: npx
    args: ["-y", "context-mem", "serve"]
```

OpenClaw — add to MCP config:

{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }

CrewAI / LangChain — see configs/ for Python integration examples.

Runtime Context Optimization (benchmark-verified)

| Mechanism | How it works | Savings |
|---|---|---|
| Content summarizer | Auto-detects 14 content types, produces statistical summaries | 97–100% per output |
| Index + Search | FTS5 BM25 retrieval returns only relevant chunks, code preserved exactly | 80% per search |
| Smart truncation | 4-tier fallback: JSON schema → Pattern → Head/Tail → Binary hash | 83–100% per output |
| Session snapshots | Captures full session state in <8 KB | ~50% vs log replay |
| Budget enforcement | Throttling at 80% prevents runaway token consumption | Prevents overflow |

Result: In a full coding session, 99% of tool output tokens are eliminated — leaving 99.6% of your context window free for actual problem solving. See BENCHMARK.md for complete results.
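Of these mechanisms, the smart-truncation tiers are the easiest to make concrete. Below is a minimal TypeScript sketch of a 4-tier fallback in that spirit; the tier order follows the table above, but every function name and threshold is illustrative, not context-mem's actual implementation.

```typescript
import { createHash } from "node:crypto";

// Illustrative 4-tier fallback: JSON schema -> pattern collapse -> head/tail -> binary hash.
function smartTruncate(output: string, maxChars = 2000): string {
  if (output.length <= maxChars) return output;

  // Tier 1: valid JSON collapses to keys + types, dropping all values.
  try {
    const parsed = JSON.parse(output);
    return JSON.stringify(
      Object.fromEntries(Object.entries(parsed).map(([k, v]) => [k, typeof v])),
    );
  } catch {
    // not JSON, fall through
  }

  // Tier 2: highly repetitive logs collapse to "pattern xN" lines.
  const lines = output.split("\n");
  const counts = new Map<string, number>();
  for (const line of lines) {
    const key = line.replace(/\d+/g, "#"); // crude pattern key: digits wildcarded
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  if (counts.size < lines.length / 4) {
    return [...counts].map(([p, n]) => (n > 1 ? `${p}  x${n}` : p)).join("\n");
  }

  // Tier 3: ordinary text keeps its head and tail.
  if (!/[\x00-\x08\x0e-\x1f]/.test(output)) {
    const half = Math.floor(maxChars / 2);
    return output.slice(0, half) + "\n[...truncated...]\n" + output.slice(-half);
  }

  // Tier 4: binary content reduces to a hash plus byte count.
  const digest = createHash("sha256").update(output).digest("hex");
  return `binary sha256=${digest} (${output.length} bytes)`;
}
```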

Headline Numbers

| Scenario | Raw | Compressed | Savings |
|---|---|---|---|
| Full coding session (50 tools) | 365.5 KB | 3.2 KB | 99% |
| 14 content types | 555.9 KB | 5.6 KB | 99% |
| Index + Search (6 scenarios) | 38.9 KB | 8.0 KB | 80% |

Search latency: BM25 0.3ms avg (3,342 ops/s); trigram 0.008ms avg (120,122 ops/s).

Verified on Apple M3 Pro, Node.js v22.22.0, 555.9 KB real-world test data across 21 scenarios.

What Gets Compressed

14 summarizers detect content type automatically and apply the optimal compression:

| Content Type | Example | Strategy |
|---|---|---|
| Shell output | npm install, build logs | Command + exit code + error extraction |
| JSON | API responses, configs | Schema extraction (keys + types, no values) |
| Errors | Stack traces, crashes | Error type + message + top frames |
| Test results | Jest, Vitest | Pass/fail/skip counts + failure details |
| TypeScript errors | error TS2345: | Error count by file + top error codes |
| Build output | Webpack, Vite, Next.js | Routes + bundle sizes + warnings |
| Git log | Commits, diffs | Commit count + authors + date range |
| CSV/TSV | Data files, analytics | Row/column count + headers + aggregation |
| Markdown | Docs, READMEs | Heading tree + code blocks + links |
| HTML | Web pages | Title + nav + headings + forms |
| Network | HTTP logs, access logs | Method/status distribution |
| Code | Source files | Function/class signatures |
| Log files | App logs, access logs | Level distribution + error extraction |
| Binary | Images, compiled files | SHA256 hash + byte count |
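The JSON strategy deserves a concrete example, since schema extraction is what lets a large API response compress to a few hundred bytes. Here is a minimal TypeScript sketch of the keys-and-types idea; `schemaOf` is a hypothetical name for illustration, not a context-mem export.

```typescript
// Collapse a parsed JSON value to its shape: keys and types survive, values do not.
type Schema = string | Schema[] | { [key: string]: Schema };

function schemaOf(value: unknown): Schema {
  if (Array.isArray(value)) {
    // Sample the first element; a 10,000-row array compresses to a single entry.
    return value.length > 0 ? [schemaOf(value[0])] : [];
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, schemaOf(v)]),
    );
  }
  return value === null ? "null" : typeof value;
}

// { user: { id: 42, name: "Ada" }, tags: ["a", "b"] }
//   -> { user: { id: "number", name: "string" }, tags: ["string"] }
```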

Features

Search — 4-layer hybrid: BM25 full-text → trigram fuzzy → Levenshtein typo-tolerant → optional vector/semantic search. Sub-millisecond latency with intent classification. Semantic search finds "auth problem" when stored as "login token expired" — local embeddings via all-MiniLM-L6-v2, no cloud, no cost.
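As an illustration of the semantic layer, the snippet below computes all-MiniLM-L6-v2 embeddings locally and compares the two phrases from the example above. The @xenova/transformers package is an assumption chosen for this sketch; the README does not state which embedding runtime context-mem ships with.

```typescript
import { pipeline } from "@xenova/transformers";

// Load the sentence-embedding model named above; downloads once, then runs fully locally.
const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

async function vector(text: string): Promise<Float32Array> {
  const out = await embed(text, { pooling: "mean", normalize: true });
  return out.data as Float32Array;
}

// With normalized vectors, cosine similarity reduces to a dot product.
const a = await vector("auth problem");
const b = await vector("login token expired");
const similarity = a.reduce((sum, v, i) => sum + v * b[i], 0);
console.log(similarity.toFixed(3)); // high score despite zero shared keywords
```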

Activity Journal — Every file edit, bash command, and file read is logged to .context-mem/journal.md in human-readable format. Cross-session memory injects journal entries on startup — Claude knows exactly what changed in previous sessions without LLM calls.

Plugin Commands — /context-mem:status (stats + dashboard link), /context-mem:search <query> (search observations), /context-mem:journal (show activity log).

Knowledge Base — Save and search patterns, decisions, errors, APIs, components. Time-decay relevance scoring with automatic archival. Auto-extraction — decisions, errors, commits, and frequently-accessed files are automatically saved to the knowledge base without manual intervention.
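Time-decay relevance can be pictured as exponential decay on entry age, with archival once the score falls below a floor. A sketch with invented constants (context-mem's real half-life and threshold are not documented here):

```typescript
// Relevance decays with age; stale entries eventually fall below an archival floor.
const HALF_LIFE_DAYS = 30;      // assumed constant, for illustration only
const ARCHIVE_THRESHOLD = 0.1;  // assumed constant, for illustration only

function relevance(baseScore: number, ageDays: number): number {
  return baseScore * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

function shouldArchive(baseScore: number, ageDays: number): boolean {
  return relevance(baseScore, ageDays) < ARCHIVE_THRESHOLD;
}

console.log(relevance(1.0, 0));       // 1.0, a fresh decision at full weight
console.log(relevance(1.0, 90));      // 0.125, three half-lives old
console.log(shouldArchive(1.0, 120)); // true, archived
```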

Export/Import — Transfer knowledge between machines: context-mem export dumps knowledge, snapshots, and events as JSON; context-mem import restores them in another project. Merge or replace modes.

Budget Management — Session token limits with three overflow strategies: aggressive truncation, warn, hard stop.
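A sketch of how the three strategies might dispatch; the strategy names mirror the README, everything else is illustrative:

```typescript
// Illustrative dispatch over the three overflow strategies named above.
type OverflowStrategy = "truncate" | "warn" | "stop";

function onBudgetExceeded(strategy: OverflowStrategy, output: string, remainingTokens: number): string {
  switch (strategy) {
    case "truncate": // aggressive truncation: keep only what still fits
      return output.slice(0, Math.max(0, remainingTokens) * 4); // ~4 chars/token heuristic
    case "warn":     // pass through, but flag the overrun
      console.warn("session token budget exceeded");
      return output;
    case "stop":     // hard stop: refuse further context growth
      throw new Error("session token budget exhausted");
  }
}
```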

Event Tracking — P1–P4 priority events with automatic error→fix detection.

Session Snapshots — Save/restore session state across restarts with progressive trimming.

Dashboard — Real-time web UI at http://localhost:51893 — auto-starts with serve, supports multi-project aggregation. Token economics, observations, search, knowledge base, events, system health. Switch between projects or see everything at once.

(Dashboard screenshots: token economics and observation stats; event stream, session snapshots, and activity.)

VS Code Extension — Sidebar dashboard, status bar with live savings, command palette (start/stop/search/stats). Install from marketplace: context-mem.

Auto-Detection — context-mem init detects your editor (Cursor, Windsurf, VS Code, Cline, Roo Code) and creates MCP config + AI rules automatically. First serve run also triggers lightweight auto-setup (.gitignore, rules) — zero manual config needed.

OpenClaw Native Plugin — Full ContextEngine integration with lifecycle hooks (bootstrap, ingest, assemble, compact, afterTurn, dispose). See openclaw-plugin/.

Privacy — Everything local. <private> tag stripping, custom regex redaction. No telemetry, no cloud.

Architecture

```
Tool Output → Hook Capture → HTTP Bridge (:51894) → Pipeline → Summarizer (14 types) → SQLite + FTS5
                                    ↓                    ↓                                      ↓
                              ObserveQueue         SHA256 Dedup                          3-Layer Search
                           (burst protection)            ↓                                      ↓
                                                4-Tier Truncation                  Progressive Disclosure
                                                         ↓                                      ↓
                                                 Auto-Extract KB                 AI Assistant ← MCP Server
```
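The SHA256 dedup stage is simple to sketch with Node's built-in crypto: hash each captured output and skip storage on a repeat. In context-mem this presumably lives behind SQLite; the in-memory set below is only for illustration.

```typescript
import { createHash } from "node:crypto";

const seen = new Set<string>(); // stand-in for a SQLite unique index on the digest

// Returns true if the output is new and should be stored, false if it is a duplicate.
function admit(output: string): boolean {
  const digest = createHash("sha256").update(output).digest("hex");
  if (seen.has(digest)) return false; // identical tool output already captured: skip it
  seen.add(digest);
  return true;
}

admit("npm install output");              // true, stored once
console.log(admit("npm install output")); // false, duplicate skipped
```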

MCP Tools

17 tools are available via the MCP protocol:

| Tool | Description |
|---|---|
| observe | Store an observation with auto-summarization |
| search | Hybrid search across all observations |
| get | Retrieve full observation by ID |
| timeline | Reverse-chronological observation list |
| stats | Token economics for current session |
| summarize | Summarize content without storing |
| configure | Update runtime configuration |
| execute | Run code snippets (JS/Python) |
| index_content | Index content with code-aware chunking |
| search_content | Search indexed content chunks |
| save_knowledge | Save to knowledge base |
| search_knowledge | Search knowledge base |
| budget_status | Current budget usage |
| budget_configure | Set budget limits |
| restore_session | Restore session from snapshot |
| emit_event | Emit a context event |
| query_events | Query events with filters |
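For programmatic use outside an editor, any MCP client can call these tools. Here is a minimal sketch with the official TypeScript SDK; the tool names come from the table above, but the argument shapes (`content`, `source`, `query`) are assumptions, since the README does not spell out each tool's schema.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn context-mem as a stdio MCP server, the same command the editor configs use.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "context-mem", "serve"],
});
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Store a noisy tool output; context-mem summarizes it on ingest.
await client.callTool({
  name: "observe",
  arguments: { content: "npm ERR! ERESOLVE unable to resolve dependency tree ...", source: "shell" },
});

// Later, retrieve only the relevant compressed chunk.
const hits = await client.callTool({ name: "search", arguments: { query: "ERESOLVE" } });
console.log(hits);
```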

CLI Commands

```bash
context-mem init        # Initialize in current project
context-mem serve       # Start MCP server (stdio)
context-mem status      # Show database stats
context-mem doctor      # Run health checks
context-mem dashboard   # Open web dashboard
context-mem export      # Export knowledge, snapshots, events as JSON
context-mem import      # Import data from JSON export file
```

Configuration

`.context-mem.json`:

```json
{
  "storage": "auto",
  "plugins": {
    "summarizers": ["shell", "json", "error", "log", "code"],
    "search": ["bm25", "trigram", "vector"],
    "runtimes": ["javascript", "python"]
  },
  "privacy": {
    "strip_tags": true,
    "redact_patterns": []
  },
  "token_economics": true,
  "lifecycle": {
    "ttl_days": 30,
    "max_db_size_mb": 500,
    "max_observations": 50000,
    "cleanup_schedule": "on_startup",
    "preserve_types": ["decision", "commit"]
  },
  "port": 51893,
  "db_path": ".context-mem/store.db"
}
```

Documentation

| Doc | Description |
|---|---|
| Benchmark Results | Full benchmark suite — 21 scenarios, 7 parts |
| Configuration Guide | All config options with defaults |

Platform Support

| Platform | MCP Config | AI Rules | Auto-Setup |
|---|---|---|---|
| Claude Code | CLAUDE.md | Appends to CLAUDE.md | init + serve |
| Cursor | mcp.json | .cursor/rules/context-mem.mdc | init + serve |
| Windsurf | mcp_config.json | .windsurf/rules/context-mem.md | init + serve |
| GitHub Copilot | mcp.json | .github/copilot-instructions.md | init + serve |
| Cline | cline_mcp_settings.json | .clinerules/context-mem.md | init + serve |
| Roo Code | mcp_settings.json | .roo/rules/context-mem.md | init + serve |
| Gemini CLI | GEMINI.md | Appends to GEMINI.md | init + serve |
| Antigravity | GEMINI.md | Appends to GEMINI.md | serve |
| Goose | recipe.yaml | – | Manual |
| OpenClaw | mcp_config.json | – | Manual |
| CrewAI | example.py | – | Manual |
| LangChain | example.py | – | Manual |

AI Rules teach the AI when and how to use context-mem tools automatically — calling observe after large outputs, restore_session on startup, search before re-reading files.

Available On

  • npm — npm install -g context-mem
  • VS Code Marketplace — Context Mem
  • Claude Code Plugin — /plugin marketplace add JubaKitiashvili/context-mem

License

MIT — use it however you want.

Author

Juba Kitiashvili


context-mem — 99% less noise, 100% more context
Star this repo · Fork it · Report an issue
