better-code-review-graph MCP Server

A knowledge graph for token-efficient code review, with Tree-sitter parsing, dual-mode embeddings (ONNX + LiteLLM), and blast-radius analysis exposed through MCP tools.

Better Code Review Graph

mcp-name: io.github.n24q02m/better-code-review-graph

Knowledge graph for token-efficient code reviews -- semantic search and call-graph resolution across your codebase.

Sister projects from n24q02m

| Project | Tagline | Tag |
| --- | --- | --- |
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- semantic search and call-... | MCP |
| better-email-mcp | IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g... | MCP |
| better-notion-mcp | Markdown-first Notion for AI agents -- pages, databases, blocks, and comments... | MCP |
| better-telegram-mcp | Telegram for AI agents -- messages, chats, media, and contacts across both bo... | MCP |
| claude-plugins | Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea... | Marketplace |
| imagine-mcp | Image and video understanding + generation for AI agents -- across Gemini, Op... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Shared foundation for building MCP servers -- Streamable HTTP transport, OAut... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| tacet | TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn... | Tooling |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP server for AI agents: web search, content extraction, and lib... | MCP |

An MCP server that parses your codebase with Tree-sitter, builds a structural graph of functions/classes/imports, and gives Claude (or any MCP client) precise context so it reads only what matters instead of the whole tree. Semantic search runs on a local ONNX embedding model by default (zero config, no API key), with an optional cloud embedding chain. Fork of code-review-graph with fixed multi-word search, qualified call resolution, dual-mode embeddings, output pagination, and production CI/CD.

v2.0 migration (BREAKING)

v2.0 adds temporal columns (valid_from_sha / valid_to_sha on every node + edge) and an opt-in security scanner. The schema migration is auto-applied on first GraphStore open, and a backup of the pre-2.0 DB is saved to <graph_db>.pre-2.0.bak so you can roll back. See BREAKING_CHANGES.md for the full schema-change list, behavior changes, environment requirements, and the downgrade procedure (CRG_DOWNGRADE_TO_1_X=1 uv run better-code-review-graph).

Install

The server runs over stdio by default and works with any MCP client. The recommended launcher is uvx (no install step -- it fetches and runs the published package in an isolated environment):

{
  "mcpServers": {
    "better-code-review-graph": {
      "command": "uvx",
      "args": ["--python", "3.13", "better-code-review-graph"],
      "env": { "MCP_TRANSPORT": "stdio" }
    }
  }
}

Or install it as a Python package:

uvx better-code-review-graph        # run without installing
pip install better-code-review-graph

The optional Semgrep engine for deeper security scans is a separate extra:

pip install 'better-code-review-graph[security]'

Install with an AI agent -- paste this to your AI coding agent:

Install MCP server better-code-review-graph following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/better-code-review-graph/setup-with-agent.md

Full per-client setup (Claude Code, Codex, Gemini CLI, Cursor, Windsurf, raw mcp.json) is at mcp.n24q02m.com/servers/better-code-review-graph/setup/.

Configuration

Everything works out of the box with zero configuration -- semantic search uses a local qwen3-embed ONNX model (Qwen3-Embedding-0.6B, ~570 MB downloaded on first graph embed). All environment variables below are optional and only needed for cloud embeddings or LLM summaries.

Model chains

Embeddings and summaries are each driven by an ordered model chain -- a CSV of provider/model entries where the order is the litellm fallback order (first entry is the active model). The provider is inferred from the model prefix, so the matching <PROVIDER>_API_KEY is all you need to add.

| Variable | Purpose | Empty (default) |
| --- | --- | --- |
| EMBEDDING_MODELS | Cloud embedding chain, e.g. jina_ai/jina-embeddings-v5-text-small,gemini/gemini-embedding-001 | Local ONNX (qwen3-embed) |
| SUMMARY_MODELS | Summarizer chain for graph(action="summarize"), e.g. gemini/gemini-2.5-flash,openai/gpt-4o-mini | Summaries disabled |

All vectors are stored at a fixed 768 dimensions (MRL truncation), so the embeddings table schema stays valid across providers. Switching embedding model changes the vector space; embeddings are tracked per provider and a provider switch triggers re-embedding rather than mixing incomparable vectors.
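
As an illustration of the fixed-dimension storage described above, here is a minimal sketch of MRL-style truncation: keep the leading components of a wider vector and renormalize. The function name and the 1024-dimensional input are assumptions made for the example; the server's actual code may differ.

```python
import math

TARGET_DIM = 768  # fixed storage dimension stated in this README


def mrl_truncate(vector: list[float], dim: int = TARGET_DIM) -> list[float]:
    """Keep the first `dim` components and L2-renormalize them.

    Hypothetical helper: it only illustrates the idea of MRL truncation.
    """
    if len(vector) < dim:
        raise ValueError(f"need at least {dim} dimensions, got {len(vector)}")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # a zero vector cannot be normalized
    return [x / norm for x in head]


if __name__ == "__main__":
    wide = [1.0] * 1024  # stand-in for a wider model output
    stored = mrl_truncate(wide)
    print(len(stored))  # 768
```

Renormalizing after truncation keeps cosine similarity well behaved, which is why a single 768-wide column can serve vectors from different providers.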

Provider API keys

Cloud models need the provider key for whatever prefixes appear in your chains. Without any cloud key the server stays on local ONNX. Summarizers must expose a chat-completion API (so Jina and Cohere are embedding-only).

| Model prefix | API key env var | Get a key |
| --- | --- | --- |
| jina_ai/ | JINA_AI_API_KEY | https://jina.ai/api-key |
| gemini/ | GEMINI_API_KEY (or GOOGLE_API_KEY) | https://aistudio.google.com/apikey |
| openai/ (or bare text-embedding-*) | OPENAI_API_KEY | https://platform.openai.com/api-keys |
| cohere/ | COHERE_API_KEY | https://dashboard.cohere.com/api-keys |

Any other litellm provider works via its standard <PROVIDER>_API_KEY.
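
To check a chain before starting the server, the prefix-to-key mapping above can be sketched as follows. This is a rough illustration rather than the server's own resolution logic: the helper names are hypothetical, and only the prefixes and variable names come from the table.

```python
# Prefix -> accepted env var names, copied from the provider table.
KNOWN_KEYS = {
    "jina_ai": ["JINA_AI_API_KEY"],
    "gemini": ["GEMINI_API_KEY", "GOOGLE_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "cohere": ["COHERE_API_KEY"],
}


def parse_chain(csv: str) -> list[str]:
    """Split a CSV chain into ordered model names; the first is the active model."""
    return [entry.strip() for entry in csv.split(",") if entry.strip()]


def provider_of(model: str) -> str:
    """Infer the provider from the model prefix."""
    if "/" in model:
        return model.split("/", 1)[0]
    if model.startswith("text-embedding-"):
        return "openai"  # bare text-embedding-* names count as OpenAI
    raise ValueError(f"cannot infer provider for {model!r}")


def key_vars(model: str) -> list[str]:
    """Env vars that would satisfy this model; unknown providers use <PROVIDER>_API_KEY."""
    provider = provider_of(model)
    return KNOWN_KEYS.get(provider, [f"{provider.upper()}_API_KEY"])


def missing_keys(csv: str, env: dict[str, str]) -> list[str]:
    """Models in the chain for which none of the accepted env vars is set."""
    return [m for m in parse_chain(csv) if not any(env.get(v) for v in key_vars(m))]


if __name__ == "__main__":
    import os

    chain = "jina_ai/jina-embeddings-v5-text-small,gemini/gemini-embedding-001"
    print(missing_keys(chain, dict(os.environ)))
```

An empty list means every entry in the chain has a usable key, so the fallback order can run end to end.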

Advanced

| Variable | Purpose |
| --- | --- |
| EMBEDDING_API_BASE | Custom OpenAI-compatible base URL for cloud embedding (SSRF-guarded) |
| LLM_API_BASE | Custom OpenAI-compatible base URL for the summarizer (SSRF-guarded) |
| DISABLE_LOCAL_EMBED | Skip the local ONNX download; embedding is unavailable unless a cloud chain is configured |
| CRG_DATA_DIR | Override the per-user data directory (default ~/.crg) used for per-user graphs and credentials in HTTP multi-user mode |
| EMBEDDING_BACKEND / EMBEDDING_MODEL / SUMMARY_MODEL | Deprecated singular vars, honored for one release with a warning -- migrate to the *_MODELS chains |

Example -- cloud embeddings + summaries

{
  "mcpServers": {
    "better-code-review-graph": {
      "command": "uvx",
      "args": ["--python", "3.13", "better-code-review-graph"],
      "env": {
        "MCP_TRANSPORT": "stdio",
        "EMBEDDING_MODELS": "jina_ai/jina-embeddings-v5-text-small,gemini/gemini-embedding-001",
        "SUMMARY_MODELS": "gemini/gemini-2.5-flash",
        "JINA_AI_API_KEY": "jina_...",
        "GEMINI_API_KEY": "AIza..."
      }
    }
  }
}

You can also configure cloud keys interactively in HTTP mode via the relay setup form (config(action="setup_start") returns the browser URL). See the modes overview and multi-user setup.

Tools

Seven tools, each grouping related actions to keep the tool surface small.

graph -- Graph lifecycle

Actions: build | update | stats | embed | export | summarize

| Action | Description |
| --- | --- |
| build | Full or incremental graph build. Set full_rebuild=true to re-parse all files; pass roots to federate extra repo directories into one graph. |
| update | Alias for build with full_rebuild=false (incremental). |
| stats | Graph size, languages, node/edge breakdown, embedding count. |
| embed | Compute vector embeddings for semantic search. Dual-mode: local ONNX or cloud chain. |
| export | Export the graph as graphml / json-ld / dot / cypher. Inline or to output_path. |
| summarize | LLM-generated one-paragraph docstrings for Function nodes (via the SUMMARY_MODELS chain; no-op when no provider key is set). Cost-capped via max_nodes. |
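
For clients that speak MCP directly, a graph action travels as a JSON-RPC tools/call request. The sketch below only builds the request body. The argument names action, full_rebuild, and roots come from the table above, while the helper name, the value type of roots (a list of paths), and the example path are assumptions.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP tools/call request for the named tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Full rebuild of the current repo, federating one extra directory.
    # "../shared-lib" is a made-up path used only for illustration.
    request = tool_call("graph", action="build", full_rebuild=True, roots=["../shared-lib"])
    print(json.dumps(request, indent=2))
```

Most MCP clients build this envelope for you; it is shown here only to make the action and argument layout concrete.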

query -- Graph queries

Actions: query | search | impact | large_functions | spot_check | renamed_in_diff | diff

| Action | Description |
| --- | --- |
| query | Predefined patterns: callers_of, callees_of, imports_of, importers_of, children_of, tests_for, inheritors_of, file_summary. |
| search | Search code entities by name/keyword or semantic similarity. |
| impact | Blast radius of changed files. Auto-detects from git diff. Paginated with max_results. |
| large_functions | Find functions/classes exceeding a line-count threshold. |
| spot_check | Random callsite snippets from the last callers_of/callees_of/inheritors_of/importers_of result. |
| renamed_in_diff | Symbols whose callsite line shifted versus a base ref. |
| diff | Nodes added/removed/modified between two commit SHAs (from_sha, to_sha). |

Most read actions accept as_of=<sha> for temporal (point-in-time) snapshots and repo=<repo_id> to scope a federated multi-repo graph.
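
The idea behind impact can be sketched as a breadth-first walk over reverse dependency edges, capped by max_results. This is a simplified illustration under stated assumptions, not the server's implementation: the in-memory callers map and the total and results field names are hypothetical, and only max_results and the truncated flag are named in this README.

```python
from collections import deque


def blast_radius(callers: dict[str, list[str]], changed: list[str], max_results: int = 50) -> dict:
    """Collect every node that transitively depends on a changed node.

    `callers` maps a node to the nodes that call or import it (reverse edges).
    """
    seen = set(changed)
    queue = deque(changed)
    impacted: list[str] = []
    while queue:
        node = queue.popleft()
        for dependent in callers.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                impacted.append(dependent)
                queue.append(dependent)
    return {
        "total": len(impacted),
        "results": impacted[:max_results],          # one page of output
        "truncated": len(impacted) > max_results,   # more nodes exist than were returned
    }


if __name__ == "__main__":
    callers = {"parse": ["build", "update"], "build": ["main"], "update": ["main"]}
    print(blast_radius(callers, ["parse"], max_results=2))
```

With max_results=2 the example returns build and update and sets truncated, which tells the caller that the full radius is larger than the page it received.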

review -- Code review context

Actions: context (default) | delta

Token-optimized review context with structural summary, impacted nodes, source snippets, and review guidance. context auto-detects changed files from the git diff; delta (with from_sha/to_sha, optional show_line_shifts) surfaces refactor moves between two commits.

config -- Server configuration and credential setup

Actions: status | set | cache_clear | setup_status | setup_start | setup_skip | setup_reset | setup_complete

| Action | Description |
| --- | --- |
| status | Server info: version, graph path, node/edge counts, embedding backend, embeddings count. |
| set | Update a runtime setting (key=log_level). |
| cache_clear | Remove all computed embeddings. |
| setup_status | Show current credential state, providers configured, and setup URL. |
| setup_start | Start relay setup to configure API keys via browser (HTTP mode). |
| setup_skip | Set local mode (skip relay permanently, use ONNX only). |
| setup_reset | Clear credentials and reset state. |
| setup_complete | Re-resolve credentials from environment variables. |

security -- Security scanning

Actions: scan | report | suppress | rule_list

| Action | Description |
| --- | --- |
| scan | Run a security scan with engine='heuristic' (the default, 5 regex rules) or engine='semgrep'. Findings persist on nodes.security_tags. |
| report | Re-emit cached findings as JSON (format='json') or SARIF v2.1.0 (format='sarif'). |
| suppress | Suppress a finding by rule_id (or remove=true to un-suppress). |
| rule_list | List available rules for an engine. |

The semgrep engine requires the [security] extra and runs Semgrep's p/auto registry pack plus a 3-rule curated overlay.
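
For readers unfamiliar with SARIF, the sketch below shows roughly what a minimal v2.1.0 document looks like when built from a list of findings. The finding fields other than rule_id (path, line, message, level) are hypothetical, and the exact output of report(format='sarif') may carry more detail.

```python
import json


def to_sarif(findings: list[dict], tool_name: str = "better-code-review-graph") -> dict:
    """Wrap findings in a minimal SARIF v2.1.0 log with a single run."""
    results = [
        {
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),  # SARIF levels: none, note, warning, error
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["path"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


if __name__ == "__main__":
    # Made-up finding used only to show the shape of the output.
    finding = {"rule_id": "example-rule", "path": "app/db.py", "line": 42, "message": "Example finding"}
    print(json.dumps(to_sarif([finding]), indent=2))
```

SARIF is the format that code-scanning dashboards typically ingest, which makes it the natural choice when findings need to leave the MCP session.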

help -- Full documentation

Topics: graph | query | review | config | security | recipes

Returns complete documentation for each tool. Use when the compressed descriptions above are insufficient.

config__open_relay -- Re-trigger the relay setup form

Registered automatically from mcp-core. In HTTP mode it returns <PUBLIC_URL>/authorize so the agent can re-open the browser setup form (e.g. after credential expiry); in stdio mode it returns status: 'stdio_unsupported'.

Features

What this fork fixes versus the upstream code-review-graph:

| Feature | code-review-graph | better-code-review-graph |
| --- | --- | --- |
| Multi-word search | Broken (literal substring) | AND-logic word splitting |
| callers_of/callees_of | Empty results (bare name targets) | Qualified name resolution + bare fallback |
| Embedding | sentence-transformers + torch (1.1 GB) | qwen3-embed ONNX + cloud (200 MB), dual-mode |
| Output size | Unbounded (500K+ chars) | Paginated (max_results, truncated flag) |
| Tool design | 9 individual tools | 7 grouped tools: graph + query + review + config + security + help + config__open_relay |
| Plugin hooks | Invalid PostEdit/PostGit | Valid PostToolUse |
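
The first two rows can be made concrete with a small sketch. It assumes entity names are plain strings and that qualified names look like path::Class.method; that format, and both function names, are assumptions for illustration rather than the fork's actual code.

```python
def and_search(names: list[str], query: str) -> list[str]:
    """Match names that contain every word of the query, in any order."""
    words = query.lower().split()
    if not words:
        return []
    return [n for n in names if all(w in n.lower() for w in words)]


def bare_name(qualified: str) -> str:
    """'store.py::GraphStore.open' -> 'open' (assumed qualified-name format)."""
    return qualified.rsplit("::", 1)[-1].rsplit(".", 1)[-1]


def callers_of(edges: list[tuple[str, str]], target: str) -> list[str]:
    """Resolve callers by exact qualified name, falling back to the bare name.

    `edges` holds (caller, callee) pairs.
    """
    exact = [caller for caller, callee in edges if callee == target]
    if exact:
        return exact
    wanted = bare_name(target)
    return [caller for caller, callee in edges if bare_name(callee) == wanted]


if __name__ == "__main__":
    names = ["GraphStore.open", "open_graph_store", "close_store"]
    # A literal substring search for "graph store" matches none of these names.
    print(and_search(names, "graph store"))

    edges = [("a.py::main", "store.py::GraphStore.open"), ("b.py::run", "open")]
    print(callers_of(edges, "store.py::GraphStore.open"))
    print(callers_of(edges, "other.py::open"))
```

Splitting the query into words is what lets "graph store" find GraphStore.open, and the bare-name fallback is what keeps callers_of from returning nothing when an edge was recorded without its qualifier.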

Comparison

How better-code-review-graph stacks up against direct competitors in each pillar:

| Capability | better-code-review-graph | Greptile | Sourcegraph (Cody / MCP) | CodeGraph (colbymchenry) |
| --- | --- | --- | --- | --- |
| Codebase knowledge graph | Yes (Tree-sitter, 14 langs, SQLite) | Yes (functions/classes/deps) | Yes (precise code indexing) | Yes (Tree-sitter, 20+ langs, SQLite) |
| Persistent incremental updates | Yes (git-diff + file-hash re-parse) | ? | Yes (continuous indexing) | Yes (OS file-watcher debounced) |
| Qualified call resolution (callers/callees) | Yes (same-file bare-call resolution + fallback) | ? | Yes (go-to-def / find-references) | Yes (callers / callees / impact) |
| Semantic search / embeddings | Yes (qwen3 ONNX local + cloud Jina/Gemini/OpenAI/Cohere) | ? | Yes (semantic + keyword + regex) | No (FTS5 full-text only) |
| Token-optimized review context | Yes (review tool, git-diff scoped) | Yes (PR review comments) | No (code-context assistant) | No (context layer, not review) |
| Security scanning | Yes (Semgrep p/auto + 3-rule overlay, SARIF) | ? | ? | No |
| Self-hostable | Yes (stdio default, machine-bound) | Yes (Docker / K8s / air-gapped) | Yes (self-hosted instance) | Yes (100% local, no API keys) |
| Free / open source | Yes (MIT) | No (proprietary SaaS; free OSS tier) | No (Enterprise license, source private) | Yes (MIT) |

Sources: Greptile · Greptile pricing · Sourcegraph MCP · CodeGraph. Cells marked ? are capabilities the competitor does not publicly document, not confirmed absences.

Security

  • Graceful fallbacks -- Cloud embedding failure falls back to local ONNX.
  • Error handling -- Tools return error strings with fix suggestions, never crash.
  • Read-only mount -- Docker mode mounts the repo as :ro (read-only).
  • SSRF-guarded endpoints -- Custom EMBEDDING_API_BASE / LLM_API_BASE URLs are validated before any outbound call.
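
This README does not spell out which checks the SSRF guard performs. As a generic illustration of the technique only, a base-URL validator might reject non-HTTP schemes and literal addresses that are loopback, private, or link-local. The function name and the specific rules below are assumptions, not the server's code.

```python
import ipaddress
from urllib.parse import urlsplit


def is_safe_base_url(url: str) -> bool:
    """Reject URLs that obviously point at internal targets (illustrative only)."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name. A real guard would also resolve it and check every
        # returned address; this sketch stops at the literal host.
        return True
    return ip.is_global  # False for loopback, private, and link-local ranges


if __name__ == "__main__":
    for candidate in (
        "https://api.openai.com/v1",
        "http://127.0.0.1:8080/v1",
        "http://169.254.169.254/latest",
        "file:///etc/passwd",
    ):
        print(candidate, is_safe_base_url(candidate))
```

Checking only the literal host is not sufficient on its own, since a public DNS name can resolve to an internal address; treat this as a starting point for understanding the guard, not a replacement for it.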

To report a vulnerability, see SECURITY.md.

Build from source

git clone https://github.com/n24q02m/better-code-review-graph
cd better-code-review-graph
uv sync --group dev
uv run pytest
uv run better-code-review-graph

Requirements: Python 3.13, uv.

Trust model

This plugin implements TC-Local (machine-bound, single trust principal). See the mcp-core trust model for full classification.

| Mode | Graph DB | Cloud credentials | Who can read your data? |
| --- | --- | --- | --- |
| stdio (default) | <repo>/.code-review-graph/graph.db (git-ignored) | ~/.better-code-review-graph-mcp/config.json (AES-GCM, machine-bound key) | Only your OS user |
| HTTP self-host (multi-user) | Per-user ~/.crg/subs/<sub>/graph.db | Per-user ~/.crg/subs/<sub>/config.json | Only the authenticated user |

Migration & changelog

The v2.0 release added temporal columns (valid_from_sha / valid_to_sha on every node and edge) plus an opt-in security scanner. The schema migration is auto-applied on first GraphStore open, and a backup of the pre-2.0 DB is written to <graph_db>.pre-2.0.bak. To downgrade and restore it:

CRG_DOWNGRADE_TO_1_X=1 uvx better-code-review-graph

Full schema-change list, behavior changes, and rollback procedure: BREAKING_CHANGES.md. Release-by-release history: CHANGELOG.md.

Documentation

Full docs at mcp.n24q02m.com/servers/better-code-review-graph/setup/:

  • Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
  • Modes overview -- stdio / local-relay / remote-relay / remote-oauth
  • Multi-user setup -- per-JWT-sub credential model

Use the help tool from any MCP client for inline per-tool reference.

License

MIT -- See LICENSE.