# context-mem

Context optimization for AI coding assistants — 99% token savings via 14 content-aware summarizers, 3-layer search, and progressive disclosure. No LLM dependency.


AI coding assistants waste 60–80% of their context window on raw tool outputs — full npm logs, verbose test results, uncompressed JSON. This means shorter sessions, lost context, and repeated work.

context-mem captures tool outputs via hooks, compresses them using 14 content-aware summarizers, stores everything in local SQLite with full-text search, and serves compressed context back through the MCP protocol. No LLM calls, no cloud, no cost.
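The key to the no-LLM claim is that summarization is deterministic: detect the content type, then route the output to a rule-based summarizer for that type. A minimal sketch of the dispatch idea, with invented detection rules and names (not context-mem's actual internals):

```typescript
// Illustrative content-aware summarizer dispatch. The detection heuristics
// and summary formats here are assumptions made for this sketch.
type Summarizer = (raw: string) => string;

const summarizers: Record<string, Summarizer> = {
  json: (raw) => {
    // Schema extraction: keep keys and value types, drop the values.
    const obj = JSON.parse(raw);
    const schema = Object.fromEntries(
      Object.entries(obj).map(([k, v]) => [k, typeof v])
    );
    return `json keys=${JSON.stringify(schema)}`;
  },
  shell: (raw) => {
    // Keep only error-looking lines plus a line count.
    const lines = raw.split("\n");
    const errors = lines.filter((l) => /error/i.test(l));
    return `shell lines=${lines.length} errors=${errors.join("; ")}`;
  },
  text: (raw) => raw.slice(0, 80), // fallback: head truncation
};

function detectType(raw: string): string {
  const t = raw.trim();
  if (t.startsWith("{") || t.startsWith("[")) {
    try { JSON.parse(t); return "json"; } catch { /* not valid JSON */ }
  }
  if (/^\$ |npm |exit code/m.test(t)) return "shell";
  return "text";
}

function summarize(raw: string): string {
  return summarizers[detectType(raw)](raw);
}
```

Because each summarizer is a pure function over the raw text, compression costs no API calls and produces identical results on every run.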

## How It Compares

|  | context-mem | claude-mem | context-mode | Context7 |
| --- | --- | --- | --- | --- |
| Approach | 14 specialized summarizers | LLM-based compression | Sandbox + intent filter | External docs injection |
| Token Savings | 99% (benchmarked) | ~95% (claimed) | 98% (claimed) | N/A |
| Search | BM25 + trigram + fuzzy | Basic recall | BM25 + trigram + fuzzy | Doc lookup |
| LLM Calls | None (free, deterministic) | Every observation ($$$) | None | None |
| Knowledge Base | 5 categories, relevance decay | No | No | No |
| Budget Management | Configurable limits + overflow | No | Basic throttling | No |
| Event Tracking | P1–P4, error-fix detection | No | Session events only | No |
| Dashboard | Real-time web UI | No | No | No |
| Session Continuity | Snapshot save/restore | Partial | Yes | No |
| Content Types | 14 specialized detectors | Generic LLM | Generic sandbox | Docs only |
| Privacy | Fully local, tag stripping | Local | Local | Cloud |
| License | MIT | AGPL-3.0 | Elastic v2 | Open |

## Quick Start

**Claude Code (recommended):**

```
/plugin marketplace add JubaKitiashvili/context-mem
/plugin install context-mem@context-mem
```

**npm (manual):**

```sh
npm install -g context-mem
cd your-project
context-mem init
context-mem serve
```

<details> <summary>More platforms — Cursor, Windsurf, Copilot, Cline, Roo Code, Gemini CLI, Goose, OpenClaw, CrewAI, LangChain</summary>

**Cursor** — `.cursor/mcp.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

**Windsurf** — `.windsurf/mcp.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

**GitHub Copilot** — `.vscode/mcp.json`:

```json
{ "servers": { "context-mem": { "type": "stdio", "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

**Cline** — add to MCP settings:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"], "disabled": false } } }
```

**Roo Code** — same format as Cline above.

**Gemini CLI** — `.gemini/settings.json`:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

**Goose** — add to profile extensions:

```yaml
extensions:
  context-mem:
    type: stdio
    cmd: npx
    args: ["-y", "context-mem", "serve"]
```

**OpenClaw** — add to MCP config:

```json
{ "mcpServers": { "context-mem": { "command": "npx", "args": ["-y", "context-mem", "serve"] } } }
```

**CrewAI / LangChain** — see `configs/` for Python integration examples.

</details>

## Runtime Context Optimization (benchmark-verified)

| Mechanism | How it works | Savings |
| --- | --- | --- |
| Content summarizer | Auto-detects 14 content types, produces statistical summaries | 97–100% per output |
| Index + search | FTS5 BM25 retrieval returns only relevant chunks, code preserved exactly | 80% per search |
| Smart truncation | 4-tier fallback: JSON schema → pattern → head/tail → binary hash | 83–100% per output |
| Session snapshots | Captures full session state in <2 KB | ~50% vs log replay |
| Budget enforcement | Throttling at 80% prevents runaway token consumption | Prevents overflow |
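The smart-truncation row can be sketched as a chain of progressively cruder strategies. Everything below is illustrative; only the tier order (JSON schema → pattern → head/tail → binary hash) comes from the table:

```typescript
// Hypothetical 4-tier truncation fallback, not context-mem's actual code:
// try each strategy in order and fall through when it doesn't apply or fit.
import { createHash } from "node:crypto";

function truncate(raw: string, maxChars: number): string {
  // Tier 1: JSON schema — keep only the key names.
  try {
    const keys = Object.keys(JSON.parse(raw));
    const out = `json{${keys.join(",")}}`;
    if (out.length <= maxChars) return out;
  } catch { /* not JSON, fall through */ }

  // Tier 2: pattern — keep only error-looking lines.
  const errs = raw.split("\n").filter((l) => /error|fail/i.test(l)).join("\n");
  if (errs.length > 0 && errs.length <= maxChars) return errs;

  // Tier 3: head/tail — first and last slice around an ellipsis marker.
  const half = Math.floor((maxChars - 5) / 2);
  if (half > 0) return raw.slice(0, half) + "\n...\n" + raw.slice(-half);

  // Tier 4: binary hash — content is opaque; keep only a fingerprint.
  return createHash("sha256").update(raw).digest("hex").slice(0, 16);
}
```

Each tier trades more information for more compression, so structured content degrades gracefully while opaque blobs collapse to a hash.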

**Result:** In a full coding session, 99% of tool-output tokens are eliminated, leaving 99.6% of the context window free for actual problem solving. See [BENCHMARK.md](BENCHMARK.md) for complete results.

## Headline Numbers

| Scenario | Raw | Compressed | Savings |
| --- | --- | --- | --- |
| Full coding session (50 tools) | 365.5 KB | 3.2 KB | 99% |
| 14 content types | 555.9 KB | 5.6 KB | 99% |
| Index + search (6 scenarios) | 38.9 KB | 8.0 KB | 80% |

| Search layer | Latency | Throughput |
| --- | --- | --- |
| BM25 | 0.3 ms avg | 3,342 ops/s |
| Trigram | 0.008 ms avg | 120,122 ops/s |

<sup>Verified on Apple M3 Pro, Node.js v22.22.0, 555.9 KB real-world test data across 21 scenarios.</sup>

## What Gets Compressed

14 summarizers detect the content type automatically and apply the optimal compression:

| Content Type | Example | Strategy |
| --- | --- | --- |
| Shell output | `npm install`, build logs | Command + exit code + error extraction |
| JSON | API responses, configs | Schema extraction (keys + types, no values) |
| Errors | Stack traces, crashes | Error type + message + top frames |
| Test results | Jest, Vitest | Pass/fail/skip counts + failure details |
| TypeScript errors | `error TS2345:` | Error count by file + top error codes |
| Build output | Webpack, Vite, Next.js | Routes + bundle sizes + warnings |
| Git log | Commits, diffs | Commit count + authors + date range |
| CSV/TSV | Data files, analytics | Row/column count + headers + aggregation |
| Markdown | Docs, READMEs | Heading tree + code blocks + links |
| HTML | Web pages | Title + nav + headings + forms |
| Network | HTTP logs, access logs | Method/status distribution |
| Code | Source files | Function/class signatures |
| Log files | App logs, access logs | Level distribution + error extraction |
| Binary | Images, compiled files | SHA-256 hash + byte count |
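To make one row of the table concrete, here is a sketch of a test-results summarizer that keeps pass/fail/skip counts plus failing test names. The line-parsing rules are invented for this example, not context-mem's real ones:

```typescript
// Illustrative test-output summarizer: collapse a runner's log to counts
// plus the names of failing tests. The log format matched here is assumed.
interface TestSummary {
  passed: number;
  failed: number;
  skipped: number;
  failures: string[];
}

function summarizeTestOutput(raw: string): TestSummary {
  const summary: TestSummary = { passed: 0, failed: 0, skipped: 0, failures: [] };
  for (const line of raw.split("\n")) {
    // Match lines like "PASS adds two numbers" or "FAIL divides by zero".
    const m = line.match(/^\s*(PASS|FAIL|SKIP)\s+(.*)$/);
    if (!m) continue;
    if (m[1] === "PASS") summary.passed++;
    else if (m[1] === "SKIP") summary.skipped++;
    else {
      summary.failed++;
      summary.failures.push(m[2]); // keep failure names, the useful detail
    }
  }
  return summary;
}
```

The point of the strategy column: for each content type, there is a small structured core (counts, names, codes) that carries nearly all the signal of the raw output.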

## Features

**Search** — 3-layer hybrid: BM25 full-text → trigram fuzzy → Levenshtein typo-tolerant. Sub-millisecond latency with intent classification.
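The layered-fallback idea can be sketched as follows. This is illustrative only: the real first layer is SQLite FTS5 BM25, which a plain substring match stands in for here, and the similarity threshold is invented:

```typescript
// Illustrative 2-of-3 search layers: exact match first, trigram fuzzy second.
function trigrams(s: string): Set<string> {
  const out = new Set<string>();
  const t = `  ${s.toLowerCase()} `; // pad so word boundaries produce trigrams
  for (let i = 0; i <= t.length - 3; i++) out.add(t.slice(i, i + 3));
  return out;
}

function trigramScore(query: string, doc: string): number {
  const tq = trigrams(query);
  const td = trigrams(doc);
  let shared = 0;
  for (const g of tq) if (td.has(g)) shared++;
  return shared / tq.size; // fraction of query trigrams found in the doc
}

function search(query: string, docs: string[]): string[] {
  // Layer 1: exact substring match (stands in for FTS5 BM25 here).
  const exact = docs.filter((d) => d.toLowerCase().includes(query.toLowerCase()));
  if (exact.length > 0) return exact;
  // Layer 2: trigram fuzzy match tolerates small spelling differences.
  return docs.filter((d) => trigramScore(query, d) > 0.5); // assumed threshold
}
```

A misspelled query like `sqlte` still finds documents about `sqlite` because most of its trigrams survive the typo, which is the property the fuzzy layer exploits.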

**Knowledge Base** — Save and search patterns, decisions, errors, APIs, and components. Time-decay relevance scoring with automatic archival.

**Budget Management** — Session token limits with three overflow strategies: aggressive truncation, warn, hard stop.
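The three overflow strategies amount to three answers to "what happens when an observation would exceed the limit?". A hypothetical sketch (the `admit` API and 80% warn point are assumptions based on the feature descriptions):

```typescript
// Illustrative budget enforcement; not context-mem's actual implementation.
type OverflowStrategy = "truncate" | "warn" | "stop";

interface Budget {
  limit: number; // session token limit
  used: number;
  strategy: OverflowStrategy;
}

function admit(budget: Budget, tokens: number): { accepted: number; warning?: string } {
  const remaining = budget.limit - budget.used;
  if (tokens <= remaining) {
    budget.used += tokens;
    // Surface a throttling warning once 80% of the budget is consumed.
    const warning = budget.used >= budget.limit * 0.8 ? "80% of budget used" : undefined;
    return { accepted: tokens, warning };
  }
  switch (budget.strategy) {
    case "truncate": // aggressive truncation: admit only what still fits
      budget.used = budget.limit;
      return { accepted: remaining, warning: "truncated to fit budget" };
    case "warn": // allow the overflow but flag it
      budget.used += tokens;
      return { accepted: tokens, warning: "budget exceeded" };
    case "stop": // hard stop: refuse further content
      throw new Error("token budget exhausted");
  }
}
```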

**Event Tracking** — P1–P4 priority events with automatic error→fix detection.

**Session Snapshots** — Save and restore session state across restarts with progressive trimming.

**Dashboard** — Real-time web UI at http://localhost:51893. Auto-starts with `serve` and supports multi-project aggregation: token economics, observations, search, knowledge base, events, and system health. Switch between projects or see everything at once.

<p align="center"> <img src="docs/screenshots/dashboard-overview.png" width="600" alt="Dashboard — token economics and observation stats" /> </p> <p align="center"> <img src="docs/screenshots/dashboard-middle.png" width="600" alt="Dashboard — event stream, session snapshots, activity" /> </p>

**VS Code Extension** — Sidebar dashboard, status bar with live savings, command palette (start/stop/search/stats). Install `context-mem` from the marketplace.

**Auto-Detection** — `context-mem init` detects your editor (Cursor, Windsurf, VS Code, Cline, Roo Code) and creates the MCP config automatically.

**OpenClaw Native Plugin** — Full ContextEngine integration with lifecycle hooks (`bootstrap`, `ingest`, `assemble`, `compact`, `afterTurn`, `dispose`). See `openclaw-plugin/`.

**Privacy** — Everything stays local. `<private>` tag stripping, custom regex redaction. No telemetry, no cloud.
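A minimal sketch of the two redaction mechanisms named above: tag stripping removes entire `<private>` spans, while custom patterns mask matches in place. The patterns shown are examples; context-mem's actual redaction rules may differ:

```typescript
// Illustrative redaction pass: strip <private>…</private> spans, then apply
// user-supplied regex patterns. Not context-mem's actual implementation.
function redact(text: string, patterns: RegExp[]): string {
  // Remove private spans entirely, including the tags themselves.
  let out = text.replace(/<private>[\s\S]*?<\/private>/g, "");
  // Mask anything matching a custom pattern instead of deleting it.
  for (const p of patterns) out = out.replace(p, "[REDACTED]");
  return out;
}
```

Because redaction runs before anything is written to SQLite, sensitive values never reach storage in the first place.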

## Architecture

```
Tool Output → Hook Capture → Pipeline → Summarizer (14 types) → SQLite + FTS5
                                ↓                                      ↓
                          SHA256 Dedup                          3-Layer Search
                                ↓                                      ↓
                        4-Tier Truncation              Progressive Disclosure
                                                               ↓
                                                AI Assistant ← MCP Server
```

## MCP Tools

<details> <summary>17 tools available via MCP protocol</summary>

| Tool | Description |
| --- | --- |
| `observe` | Store an observation with auto-summarization |
| `search` | Hybrid search across all observations |
| `get` | Retrieve a full observation by ID |
| `timeline` | Reverse-chronological observation list |
| `stats` | Token economics for the current session |
| `summarize` | Summarize content without storing |
| `configure` | Update runtime configuration |
| `execute` | Run code snippets (JS/Python) |
| `index_content` | Index content with code-aware chunking |
| `search_content` | Search indexed content chunks |
| `save_knowledge` | Save to the knowledge base |
| `search_knowledge` | Search the knowledge base |
| `budget_status` | Current budget usage |
| `budget_configure` | Set budget limits |
| `restore_session` | Restore a session from a snapshot |
| `emit_event` | Emit a context event |
| `query_events` | Query events with filters |

</details>
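As an orientation aid, a `tools/call` request for the `search` tool might look like the following. The envelope follows the standard MCP JSON-RPC shape; the argument names are assumptions, since the exact parameter schema isn't shown here:

```typescript
// Hypothetical MCP tools/call request targeting context-mem's `search` tool.
// `query` and `limit` are assumed parameter names for illustration only.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "search",
    arguments: { query: "failing vitest run", limit: 5 },
  },
};
console.log(JSON.stringify(request, null, 2));
```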

## CLI Commands

```sh
context-mem init        # Initialize in the current project
context-mem serve       # Start the MCP server (stdio)
context-mem status      # Show database stats
context-mem doctor      # Run health checks
context-mem dashboard   # Open the web dashboard
```

## Configuration

<details> <summary>.context-mem.json</summary>

```json
{
  "storage": "auto",
  "plugins": {
    "summarizers": ["shell", "json", "error", "log", "code"],
    "search": ["bm25", "trigram"],
    "runtimes": ["javascript", "python"]
  },
  "privacy": {
    "strip_tags": true,
    "redact_patterns": []
  },
  "token_economics": true,
  "lifecycle": {
    "ttl_days": 30,
    "max_db_size_mb": 500,
    "max_observations": 50000,
    "cleanup_schedule": "on_startup",
    "preserve_types": ["decision", "commit"]
  },
  "port": 51893,
  "db_path": ".context-mem/store.db"
}
```

</details>

## Documentation

| Doc | Description |
| --- | --- |
| Benchmark Results | Full benchmark suite — 21 scenarios, 7 parts |
| Configuration Guide | All config options with defaults |

## Platform Support

| Platform | Integration | Config |
| --- | --- | --- |
| Claude Code | Plugin marketplace | `configs/claude-code/` |
| Cursor | MCP native | `configs/cursor/` |
| Windsurf | MCP native | `configs/windsurf/` |
| GitHub Copilot | Agent Mode MCP | `configs/copilot/` |
| Cline / Roo Code | MCP native | `configs/cline/` |
| Gemini CLI | MCP + GEMINI.md | `configs/gemini-cli/` |
| Goose | Recipe YAML | `configs/goose/` |
| OpenClaw | MCP config | `configs/openclaw/` |
| Antigravity | GEMINI.md routing | `configs/antigravity/` |
| CrewAI | Python MCP adapter | `configs/crewai/` |
| LangChain | langchain-mcp-adapters | `configs/langchain/` |

## Available On

- npm: `npm install -g context-mem`

## License

MIT — use it however you want.

## Author

Juba Kitiashvili


<p align="center"> <b>context-mem — 99% less noise, 100% more context</b><br/> <a href="https://github.com/JubaKitiashvili/context-mem">Star this repo</a> · <a href="https://github.com/JubaKitiashvili/context-mem/fork">Fork it</a> · <a href="https://github.com/JubaKitiashvili/context-mem/issues">Report an issue</a> </p>
