Knowledge Graph

A knowledge graph-driven persistent memory layer for coding agents and LLM workflows.

Knowledge Graph for Claude Code and Codex

Persistent, git-native memory that makes your AI coding agent actually remember. Zero databases, zero services — just bash, jq, and your own commits.


Claude Code and other AI coding agents forget everything between sessions — you end up re-explaining the same project context every time. Knowledge Graph fixes that by turning your file operations and git history into a lightweight, evidence-based memory layer that lives inside your repo.

First-class support for:

  • Claude Code — auto-tracks reads and writes via hooks, injects a work snapshot on every session start, rebuilds context after /clear and /compact
  • Codex / Cursor / Windsurf / any MCP client — 7 tools and 20+ resources exposed by the bundled MCP stdio server (kg_read_node, kg_query, kg_recent_work, kg_blind_spots, …)

No embeddings. No vector stores. No external services. Works on macOS, Linux, and Windows.


Who this is for

  • Vibecoders — you describe intent, the agent writes code. Knowledge Graph gives the agent the project context you never had to learn, so one-line requests turn into working changes instead of destructive rewrites. The maintainer (a vibecoder himself) reports that goal completion and the "actually what I wanted" rate jumped at least 10× after installing it — "10× is the floor."
  • Senior developers — you want structured, auditable context that your AI agent respects. Every rule traces back to a commit hash or a recorded error event. No hallucinated conventions.
  • Teams — rules live in canonical CLAUDE.md nodes right next to the code they govern. Codex reads the same nodes through MCP, so teams avoid split-brain knowledge. Share via git push.

Quick Start

macOS / Linux / WSL

bash <(curl -fsSL https://raw.githubusercontent.com/hilyfux/knowledge-graph/main/standalone/install.sh) /path/to/your-project

Windows (PowerShell + Git Bash)

git clone https://github.com/hilyfux/knowledge-graph.git
cd knowledge-graph
.\standalone\install.ps1 C:\path\to\your-project

Then:

  1. Restart Claude Code so hooks activate, or connect your MCP-aware agent.
  2. For Codex, read the installed AGENTS.md notes and use the knowledge-graph MCP server from .mcp.json.
  3. Run /knowledge-graph init in Claude Code, or use MCP tools such as kg_status, kg_query, and kg_read_node from Codex.

From that point on: silent tracking in Claude Code, distributed knowledge nodes per module, and cross-session memory readable by Codex or any MCP-aware agent.


vs Alternatives

|                                | Knowledge Graph            | mcp-knowledge-graph      | Memento          | Caveman           |
| ------------------------------ | -------------------------- | ------------------------ | ---------------- | ----------------- |
| Storage                        | Plain files in your repo   | Neo4j database           | Vector database  | N/A (stateless)   |
| Dependencies                   | jq only                    | Neo4j + Node.js + Docker | Python + ChromaDB | Python (optional) |
| Learns over time               | ✅ Inference engine         | —                        | —                | —                 |
| Predicts context               | ✅ Co-change analysis       | —                        | —                | —                 |
| Survives clear / compact       | ✅ Snapshot + @include      | N/A                      | N/A              | N/A               |
| LLM cost                       | Near zero (bash computes)  | Every query              | Embedding costs  | Zero              |
| Team sharing                   | git push                   | Manual DB export         | Manual DB export | N/A               |
| Multi-agent (Codex / MCP)      | ✅ 7 tools + resources      | Partial                  | Partial          | —                 |
| Windows (PowerShell installer) | ✅                          | —                        | —                | —                 |

What You Get

  • Cross-agent memory — works natively in Claude Code (hooks); works in Codex / Cursor / Windsurf / any MCP client through the bundled server (7 tools + 22 resources auto-exposed)
  • Session-to-session continuity — snapshot survives clear and compact; includes uncommitted changes from git status so the agent knows what's still in progress, not just what was committed
  • Predict errors before they happen — co-change prediction preloads related-module prohibitions on first access; Read size-guard warns before a 25K-token Read hits its ceiling, so the agent knows to Grep + partial-read instead of burning a round-trip
  • Auto-discovered dependencies from real co-change patterns — observe work, infer patterns, promote only evidence-backed rules
  • Zero-interrupt workflow — heavy analysis mostly runs at session boundaries; long sessions get a throttled background refresh so graph-analysis.json does not go stale
  • Named event channels + schema — parallel streams for domain-specific trackers ({channel}-events.jsonl) with formal event shape and corrupt-line tolerance. See events-schema.md.
  • Zero dependencies beyond jq — no Docker, no Neo4j, no Python, no services, no daemon. Inspectable. Versionable. No lock-in.
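The Read size-guard mentioned above can be approximated in a few lines of shell. This is an illustrative sketch, not the installed hook: it uses a rough 4-bytes-per-token heuristic and a hypothetical 25K ceiling, and fakes a large source file so the snippet is self-contained.

```shell
# Illustrative size-guard sketch (not the installed hook).
# Assumption: ~4 bytes per token is a crude estimate for source code.
file=$(mktemp)
head -c 120000 /dev/zero | tr '\0' 'x' > "$file"   # fake 120 KB source file

limit=25000                                        # hypothetical token ceiling
bytes=$(wc -c < "$file")
tokens=$((bytes / 4))

msg=""
if [ "$tokens" -gt "$limit" ]; then
  msg="warn: $file ~${tokens} tokens (> ${limit}); prefer Grep + offset/limit Read"
fi
printf '%s\n' "$msg"
```

The real guard presumably runs inside the PreToolUse hook path; the point is only that a byte count plus one division is enough to warn before a doomed Read burns a round-trip.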

Token Budget

| Component                      | Tokens       | When loaded                |
| ------------------------------ | ------------ | -------------------------- |
| Knowledge index (pointer tags) | ~300-500     | Always (@include)          |
| Work snapshot                  | ~200-400     | SessionStart / PostCompact |
| Predicted prohibitions         | ~100/module  | First access to new module |
| Module CLAUDE.md               | ~200/module  | On file access (lazy)      |
| Total baseline                 | ~500-900     | <0.5% of 200K context      |
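The "<0.5% of 200K context" figure checks out against the worst-case baseline:

```shell
# Worst-case baseline (~900 tokens) as a share of a 200K-token context window.
pct=$(awk 'BEGIN { printf "%.3f", 900 * 100 / 200000 }')
echo "baseline worst case: ${pct}% of a 200K window"
```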

How It Works (briefly)

Hooks fire silently during your normal Claude Code workflow:

  • Read / Write → events recorded in ~3ms; first access to a module triggers a co-change prediction that pre-loads related module prohibitions; long write-heavy sessions also trigger a throttled background refresh of graph-analysis.json
  • SessionStart / PostCompact → injects the last work snapshot so the agent picks up where it left off
  • Stop → saves the snapshot, rotates the event log, runs background analysis
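The event log these hooks append to is plain JSONL, one event per line. Here is a sketch of reading such a log with corrupt-line tolerance; the field names (ts, tool, path) are assumptions for illustration, not the real schema (see events-schema.md):

```shell
# Illustrative JSONL event log; field names are assumptions, not the real schema.
log=$(mktemp)
cat > "$log" <<'EOF'
{"ts":"2024-01-01T10:00:00Z","tool":"Read","path":"auth/login.ts"}
{corrupt line that should be skipped, not crash the reader
{"ts":"2024-01-01T10:00:05Z","tool":"Write","path":"auth/token.ts"}
EOF

# jq -e exits non-zero on invalid JSON, so corrupt lines are skipped silently.
good=0
while IFS= read -r line; do
  printf '%s\n' "$line" | jq -e . >/dev/null 2>&1 || continue
  good=$((good + 1))
  printf '%s\n' "$line" | jq -r '[.tool, .path] | @tsv'
done < "$log"
```

Skipping rather than aborting on a bad line is what lets a half-written or truncated log survive a crashed session.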

Pure bash + jq mines patterns from the event log and git history; the LLM is only involved when a knowledge node actually needs to be (re)written. Everything else is zero-token.
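A minimal sketch of what that mining can look like (this is not the real infer.sh): count how often two top-level directories change in the same commit, using only git and awk. A toy repo is built inline so the pipeline has history to mine; the directory names are illustrative.

```shell
# Build a tiny throwaway repo so the pipeline below has history to mine.
repo=$(mktemp -d) && cd "$repo" && git init -q
for i in 1 2 3; do
  mkdir -p auth middleware
  echo "$i" > auth/token.ts
  echo "$i" > middleware/session.ts
  git add . && git -c user.email=kg@demo -c user.name=kg commit -qm "change $i"
done

# Hypothetical co-change miner (not the real infer.sh):
# one "@@<hash>" marker per commit, followed by the files it touched.
pairs=$(git log --pretty=format:'@@%H' --name-only | awk '
  function flush(   i, j, n, list, pair, d) {
    n = 0
    for (d in dirs) list[++n] = d
    for (i = 1; i <= n; i++)
      for (j = i + 1; j <= n; j++) {
        pair = (list[i] < list[j]) ? list[i] "," list[j] : list[j] "," list[i]
        count[pair]++
      }
    for (d in dirs) delete dirs[d]
  }
  /^@@/ { flush(); next }                              # commit boundary
  NF    { split($0, p, "/"); if (p[2] != "") dirs[p[1]] = 1 }
  END   { flush(); for (pair in count) print count[pair], pair }
' | sort -rn)
printf '%s\n' "$pairs"
```

Three commits each touch auth/ and middleware/, so the pair surfaces with count 3. Thresholding such counts is how evidence-backed @ references can be promoted without ever calling the LLM.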

Deep dive with full hook table, pipeline diagram, and context-survival matrix: docs/architecture-notes.md.

For non-Claude agents: the same canonical CLAUDE.md nodes, work snapshot, and co-change pairs are accessible via the MCP server.


Commands

| Command                          | Purpose                                                         |
| -------------------------------- | --------------------------------------------------------------- |
| /knowledge-graph init            | Full project scan. Generates canonical CLAUDE.md for every module. |
| /knowledge-graph update          | Incremental refresh + inference engine.                         |
| /knowledge-graph status          | Coverage, health, blind spots, activity heatmap.                |
| /knowledge-graph query <question> | Search the graph; get sourced answers.                          |

What Gets Generated

Each module directory gets a compact canonical CLAUDE.md node (≤20 lines, maximum information density). Codex consumes the same node through MCP instead of maintaining a duplicate AGENTS.md.

# auth

## Prohibitions
- Raw token in localStorage → XSS (a3f21b)
- Skip refresh in test mock → flaky CI (8c4e01)

## When Changing
- Token flow → @middleware/CLAUDE.md
- User model → @api/users/CLAUDE.md

## Conventions
- Auth errors: 401 + {code, message}
- Refresh tokens: httpOnly cookies only

@ references form the dependency graph. The inference engine discovers and adds them from co-change patterns automatically.
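Promotion of an @ reference can be pictured as a small idempotent append. This is a hypothetical sketch, not the engine's actual logic: the threshold, file contents, and rule wording are all illustrative.

```shell
# Hypothetical promotion step; threshold, layout, and wording are illustrative.
cd "$(mktemp -d)" && mkdir -p auth
printf '# auth\n\n## When Changing\n' > auth/CLAUDE.md

count=7        # times auth/ and middleware/ changed in the same commit
threshold=5    # evidence bar before a rule is promoted

# grep guard keeps the append idempotent across repeated analysis runs.
if [ "$count" -ge "$threshold" ] && ! grep -q '@middleware/CLAUDE.md' auth/CLAUDE.md; then
  printf -- '- Session flow → @middleware/CLAUDE.md\n' >> auth/CLAUDE.md
fi
cat auth/CLAUDE.md
```

Running the promotion a second time changes nothing, which matters when analysis fires at every session boundary.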


MCP Server

7 tools and a resources channel exposed via MCP, usable from any MCP-aware agent (Codex, Cursor, Windsurf, Claude Desktop, custom clients):

| Tool           | Description                                                                              |
| -------------- | ---------------------------------------------------------------------------------------- |
| kg_status      | Coverage, pending events, blind-spot count, hot zones, recent failures                   |
| kg_query       | Full-text search across every canonical CLAUDE.md / SKILL.md body — returns path:line:excerpt |
| kg_read_node   | Fetch the full knowledge node for a specific module                                      |
| kg_recent_work | Current work snapshot — active modules, uncommitted changes, recent commits              |
| kg_predict     | Predict related modules for a file path (co-change history)                              |
| kg_cochange    | Top co-change directory pairs — implicit dependencies                                    |
| kg_blind_spots | Modules with activity but no knowledge node                                              |

Plus Resources: every canonical CLAUDE.md / SKILL.md is exposed through kg://node/<path>, kg://claude/<path>, or kg://skill/<path>. The knowledge index is at kg://index; the work snapshot at kg://snapshot.

Auto-registered in .mcp.json during installation.


Design Principles

  1. Zero interrupts. Never blocks your coding. Analysis runs at session boundaries.
  2. Bash computes, LLM decides. Pattern mining is pure bash (~3ms/event); LLM only writes prose.
  3. Evidence-based only. Every rule traces back to a commit, error, or analysis. No evidence, no rule.
  4. Predict, don't react. Pre-load related knowledge before errors, based on co-change history.
  5. Survive everything. clear, compact, long sessions — working state persists through snapshots.
  6. Minimal token footprint. ≤20 line knowledge nodes, pointer-style index, lazy loading.
  7. Agent-agnostic outputs. Hooks are Claude Code-specific; canonical CLAUDE.md nodes, MCP tools, and resources are consumable by Codex and other agents.

Requirements

  • bash — macOS / Linux: native. Windows: Git Bash (winget install Git.Git) or WSL.
  • jq — brew install jq / apt install jq / winget install jqlang.jq
  • git (optional, recommended) — enhances dependency analysis and evidence tracing
  • An MCP-aware AI agent: Claude Code natively, or Codex / Cursor / Windsurf / Claude Desktop via the bundled MCP server

Learn More

  • Installation — platform-specific setup (macOS / Linux / Windows / WSL)
  • Configuration — env vars and tuning
  • Architecture — hook flow, prediction engine, pipeline diagram, installed layout
  • Events Schema — channel concept + event shape + tolerance guarantees
  • FAQ — common questions
  • Changelog — release history

Contributing

Contributions welcome. See CONTRIBUTING.md.

High-impact areas:

  • New pattern types in infer.sh
  • Large-monorepo performance (1000+ modules)
  • Prediction accuracy measurement and feedback loops
  • Integration tests for non-Claude MCP clients
  • Additional agent integrations beyond MCP

License

MIT
