jCodeMunch-MCP
Token-efficient MCP server for GitHub source code exploration via tree-sitter AST parsing
Quickstart - https://github.com/jgravelle/jcodemunch-mcp/blob/main/QUICKSTART.md
FREE FOR PERSONAL USE
Use it to make money, and Uncle J. gets a taste. Fair enough? See the license details below.
Cut code-reading token costs by up to 99%
Most AI agents explore repositories the expensive way: open entire files → skim thousands of irrelevant lines → repeat.
jCodeMunch indexes a codebase once and lets agents retrieve only the exact symbols they need — functions, classes, methods, constants — with byte-level precision.
| Task | Traditional approach | With jCodeMunch |
|---|---|---|
| Find a function | ~40,000 tokens | ~200 tokens |
| Understand module API | ~15,000 tokens | ~800 tokens |
| Explore repo structure | ~200,000 tokens | ~2,000 tokens |
Index once. Query cheaply forever.
Precision context beats brute-force context.
Commercial licenses
jCodeMunch-MCP is free for non-commercial use.
Commercial use requires a paid license.
jCodeMunch-only licenses
- Builder — $79 — 1 developer
- Studio — $349 — up to 5 developers
- Platform — $1,999 — org-wide internal deployment
Want both code and docs retrieval?
Stop dumping files into context windows. Start retrieving exactly what the agent needs.
jCodeMunch indexes a codebase once using tree-sitter AST parsing, then allows MCP-compatible agents (Claude Desktop, VS Code, Google Antigravity, and others) to discover and retrieve code by symbol instead of brute-reading files.
Every symbol stores:
- Signature
- Kind
- Qualified name
- One-line summary
- Byte offsets into the original file
Full source is retrieved on demand using O(1) byte-offset seeking.
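The byte-offset idea can be sketched in a few lines: because each symbol's start and end offsets are stored at index time, retrieval is a single seek plus a bounded read, with no scanning. This is an illustrative sketch, not the actual jCodeMunch internals; the function name is hypothetical.

```python
# Hypothetical sketch of O(1) byte-offset retrieval: seek straight to the
# symbol's recorded start offset and read only its bytes. Not the real
# jCodeMunch API; for illustration only.
def read_symbol(path: str, start: int, end: int) -> str:
    with open(path, "rb") as f:
        f.seek(start)                      # jump directly to the symbol
        return f.read(end - start).decode("utf-8")
```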
Proof: Token savings in the wild
Repo: geekcomputers/Python
Size: 338 files, 1,422 symbols indexed
Task: Locate calculator / math implementations
| Approach | Tokens | What the agent had to do |
|---|---|---|
| Raw file approach | ~7,500 | Open multiple files and scan manually |
| jCodeMunch MCP | ~1,449 | search_symbols() → get_symbol() |
Result: ~80% fewer tokens (~5× more efficient)
Cost scales with tokens.
Latency scales with irrelevant context.
jCodeMunch turns search into navigation.
Why agents need this
Agents waste money when they:
- Open entire files to find one function
- Re-read the same code repeatedly
- Consume imports, boilerplate, and unrelated helpers
jCodeMunch provides precision context access:
- Search symbols by name, kind, or language
- Outline files without loading full contents
- Retrieve exact symbol implementations only
- Fall back to full-text search when necessary
Agents do not need larger context windows.
They need structured retrieval.
How it works
jCodeMunch implements jMRI-Full — the open specification for structured retrieval MCP servers. jMRI-Full covers the full stack: discover, search, retrieve, and metadata operations with batch retrieval, hash-based drift detection, byte-offset addressing, and a complete _meta envelope on every call.
- Discovery — GitHub API or local directory walk
- Security filtering — traversal protection, secret exclusion, binary detection
- Parsing — tree-sitter AST extraction
- Context enrichment — auto-detected ecosystem providers (dbt, etc.) inject business metadata
- Storage — JSON index + raw files stored locally (`~/.code-index/`)
- Retrieval — O(1) byte-offset seeking via stable symbol IDs
Stable Symbol IDs
{file_path}::{qualified_name}#{kind}
Examples:
- src/main.py::UserService.login#method
- src/utils.py::authenticate#function
IDs remain stable across re-indexing when path, qualified name, and kind are unchanged.
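The ID scheme above is simple enough to build and split mechanically. A minimal sketch of both directions, using only the format documented here (the helper names are illustrative):

```python
# Construct and parse the documented {file_path}::{qualified_name}#{kind}
# symbol-ID scheme. Helper names are illustrative, not the jCodeMunch API.
def make_symbol_id(file_path: str, qualified_name: str, kind: str) -> str:
    return f"{file_path}::{qualified_name}#{kind}"

def parse_symbol_id(symbol_id: str) -> tuple[str, str, str]:
    # split on the first "::" and the last "#", so qualified names
    # containing dots (e.g. UserService.login) survive intact
    file_path, rest = symbol_id.split("::", 1)
    qualified_name, kind = rest.rsplit("#", 1)
    return file_path, qualified_name, kind
```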
Installation
New here? See QUICKSTART.md for a focused 3-step setup guide.
Prerequisites
- Python 3.10+
- pip
Install
pip install jcodemunch-mcp
Verify:
jcodemunch-mcp --help
Configure MCP Client
PATH note: MCP clients often run with a limited environment where `jcodemunch-mcp` may not be found even if it works in your terminal. Using `uvx` is the recommended approach — it resolves the package on demand without requiring anything to be on your system PATH. If you prefer `pip install`, use the absolute path to the executable instead:
- Linux: `/home/<username>/.local/bin/jcodemunch-mcp`
- macOS: `/Users/<username>/.local/bin/jcodemunch-mcp`
- Windows: `C:\Users\<username>\AppData\Roaming\Python\Python3xx\Scripts\jcodemunch-mcp.exe`
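If you are unsure which absolute path applies on your machine, Python's standard library can resolve it for you. A small cross-platform helper (not part of jCodeMunch itself):

```python
import shutil

def resolve_executable(name: str):
    """Return the absolute path of `name` on PATH, or None if not found.
    Useful for filling in the "command" field of an MCP client config."""
    return shutil.which(name)

# e.g. resolve_executable("jcodemunch-mcp") after `pip install jcodemunch-mcp`
```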
Claude Code
The fastest way to add jCodeMunch to Claude Code is a single command:
claude mcp add jcodemunch uvx jcodemunch-mcp
This registers the server at user scope (~/.claude.json) so it is available in every project. To add it to a specific project only, pass --scope project:
claude mcp add --scope project jcodemunch uvx jcodemunch-mcp
To include optional environment variables (e.g. GITHUB_TOKEN or ANTHROPIC_API_KEY):
claude mcp add jcodemunch uvx jcodemunch-mcp \
-e GITHUB_TOKEN=ghp_... \
-e ANTHROPIC_API_KEY=sk-ant-...
Restart Claude Code after adding the server.
Manual config — if you prefer to edit the config file directly, the relevant files are:
| Scope | Path |
|---|---|
| User (global) | ~/.claude.json |
| Project | .claude/settings.json (in the project root) |
{
"mcpServers": {
"jcodemunch": {
"command": "uvx",
"args": ["jcodemunch-mcp"]
}
}
}
Claude Desktop
Config file location:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Linux | ~/.config/claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
Minimal config (no API keys needed):
{
"mcpServers": {
"jcodemunch": {
"command": "uvx",
"args": ["jcodemunch-mcp"]
}
}
}
With optional AI summaries and GitHub auth:
{
"mcpServers": {
"jcodemunch": {
"command": "uvx",
"args": ["jcodemunch-mcp"],
"env": {
"GITHUB_TOKEN": "ghp_...",
"ANTHROPIC_API_KEY": "sk-ant-..."
}
}
}
}
With debug logging (useful when diagnosing why files are not indexed):
{
"mcpServers": {
"jcodemunch": {
"command": "uvx",
"args": [
"jcodemunch-mcp",
"--log-level", "DEBUG",
"--log-file", "/tmp/jcodemunch.log"
]
}
}
}
Logging flags can also be set via the env vars `JCODEMUNCH_LOG_LEVEL` and `JCODEMUNCH_LOG_FILE`. Always use `--log-file` (or the env var) when debugging — writing logs to stderr can corrupt the MCP stdio stream in some clients.
After saving the config, restart Claude Desktop for the server to appear.
Google Antigravity
- Open the Agent pane → click the `⋯` menu → MCP Servers → Manage MCP Servers
- Click View raw config to open `mcp_config.json`
- Add the entry below, save, then restart the MCP server from the Manage MCPs pane
{
"mcpServers": {
"jcodemunch": {
"command": "uvx",
"args": ["jcodemunch-mcp"]
}
}
}
Environment variables are optional:
| Variable | Purpose |
|---|---|
| GITHUB_TOKEN | Higher GitHub API limits / private access |
| ANTHROPIC_API_KEY | AI-generated summaries via Claude Haiku (takes priority) |
| ANTHROPIC_BASE_URL | Third-party Anthropic-compatible endpoints (e.g. z.ai) |
| GOOGLE_API_KEY | AI-generated summaries via Gemini Flash |
Step 3: Tell Claude to actually use it
This step is not optional.
Installing the MCP server makes the tools available — but Claude will not use them automatically. Without instructions, Claude defaults to its built-in file tools (read, grep, etc.) and never touches jCodeMunch. This is the single most common reason users install the server and see no difference.
Create a CLAUDE.md file that instructs Claude to use jCodeMunch for all code lookups.
Global (applies to every project)
Create or edit ~/.claude/CLAUDE.md:
Use jcodemunch-mcp for all code lookups. Never read full files when MCP is available.
1. Call `list_repos` first — if the project is not indexed, call `index_folder` with the current directory.
2. Use `search_symbols` / `get_symbol` to find and retrieve code by symbol name.
3. Use `get_repo_outline` or `get_file_outline` to explore structure.
4. Fall back to direct file reads only when editing or when MCP is unavailable.
Project-level only
Create CLAUDE.md in your project root with the same content. Claude Code merges project-level and global instructions automatically.
Verify it's working
Ask Claude: "What repos do you have indexed?" — it should call list_repos. If it responds without calling any tool, re-check that CLAUDE.md exists and that the MCP server appears in /mcp (Claude Code) or the server list in Claude Desktop.
Usage Examples
index_folder: { "path": "/path/to/project" }
index_repo: { "url": "owner/repo" }
get_repo_outline: { "repo": "owner/repo" }
get_file_outline: { "repo": "owner/repo", "file_path": "src/main.py" }
get_file_content: { "repo": "owner/repo", "file_path": "src/main.py", "start_line": 10, "end_line": 25 }
search_symbols: { "repo": "owner/repo", "query": "authenticate" }
get_symbol: { "repo": "owner/repo", "symbol_id": "src/main.py::MyClass.login#method" }
get_context_bundle: { "repo": "owner/repo", "symbol_id": "src/main.py::MyClass.login#method" }
search_text: { "repo": "owner/repo", "query": "TODO", "context_lines": 1 }
search_columns: { "repo": "owner/repo", "query": "customer_id", "model_pattern": "fact_*" }
Local folder indexes are stored with stable hashed repo ids. Use list_repos to inspect the exact id, or the bare display name when it is unique.
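The docs say local folder indexes get "stable hashed repo ids" without specifying the scheme, so a sketch of what a stable path-derived id could look like is necessarily an assumption; use `list_repos` to see the real ids.

```python
import hashlib

# Assumption: the exact id format below is illustrative, not the shipped
# scheme. The point is stability: hashing the absolute folder path gives
# the same id on every re-index of the same folder.
def repo_id(folder_path: str) -> str:
    digest = hashlib.sha256(folder_path.encode("utf-8")).hexdigest()
    return digest[:12]  # same path always hashes to the same id
```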
Tools (14)
| Tool | Purpose |
|---|---|
| index_repo | Index a GitHub repository |
| index_folder | Index a local folder |
| list_repos | List indexed repositories |
| get_file_tree | Repository file structure |
| get_file_outline | Symbol hierarchy for a file |
| get_file_content | Retrieve cached file content |
| get_symbol | Retrieve full symbol source |
| get_symbols | Batch retrieve symbols |
| get_context_bundle | Symbol source + file imports in one call |
| search_symbols | Search symbols with filters |
| search_text | Full-text search with context |
| search_columns | Search column metadata across models |
| get_repo_outline | High-level repo overview |
| invalidate_cache | Remove cached index |
Every tool response includes a _meta envelope with timing, token savings, and cost avoided:
"_meta": {
"timing_ms": 4.3,
"tokens_saved": 48153,
"total_tokens_saved": 1280837,
"cost_avoided": { "claude_opus": 1.2038, "gpt5_latest": 0.4815 },
"total_cost_avoided": { "claude_opus": 32.02, "gpt5_latest": 12.81 }
}
total_tokens_saved and total_cost_avoided accumulate across all tool calls and persist to ~/.code-index/_savings.json.
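How a per-call delta rolls up into those totals can be sketched directly. This is illustrative, not the shipped implementation; the per-million-token prices are the figures quoted in the changelog ($25/1M Claude Opus, $10/1M GPT-5).

```python
# Illustrative rollup of a tokens_saved delta into the _meta savings fields.
# Pricing is taken from the changelog; the real code may differ in detail.
PRICE_PER_MTOK = {"claude_opus": 25.0, "gpt5_latest": 10.0}

def meta_envelope(tokens_saved: int, running_total: int) -> dict:
    """Build the savings portion of a _meta envelope for one tool call."""
    total = running_total + tokens_saved
    return {
        "tokens_saved": tokens_saved,
        "total_tokens_saved": total,
        "cost_avoided": {m: round(tokens_saved * p / 1_000_000, 4)
                         for m, p in PRICE_PER_MTOK.items()},
        "total_cost_avoided": {m: round(total * p / 1_000_000, 2)
                               for m, p in PRICE_PER_MTOK.items()},
    }
```

With `tokens_saved=48153` this reproduces the `cost_avoided` figures in the example envelope above.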
Recent Updates
v1.4.1 — CLI interface (cli/cli.py) for terminal/pipeline use; "Tell Claude to use it" setup section in README
v0.2.10 — Pin mcp<1.10.0 to prevent Windows win32api DLL crash on startup
v0.2.9 — Community savings meter: anonymous token savings shared to a live global counter at j.gravelle.us (opt-out via JCODEMUNCH_SHARE_SAVINGS=0); updated model pricing (Opus $25/1M, GPT-5 $10/1M)
v0.2.8 — Estimated cost avoided added to every _meta response (cost_avoided, total_cost_avoided)
v0.2.7 — Security fix: .claude/ excluded from sdist; structural CI guardrails prevent credential bundling
v0.2.5 — Path traversal hardening in IndexStore; jcodemunch-mcp --help now works
v0.2.4 — Live token savings counter (tokens_saved, total_tokens_saved in every _meta)
v0.2.3 — Google Gemini Flash support (GOOGLE_API_KEY); auto-selects between Anthropic and Gemini
v0.2.2 — PHP language support
Supported Languages
| Language | Extensions | Symbol Types |
|---|---|---|
| Python | .py | function, class, method, constant, type |
| JavaScript | .js, .jsx | function, class, method, constant |
| TypeScript | .ts, .tsx | function, class, method, constant, type |
| Go | .go | function, method, type, constant |
| Rust | .rs | function, type, impl, constant |
| Java | .java | method, class, type, constant |
| PHP | .php | function, class, method, type, constant |
| Dart | .dart | function, class, method, type |
| C# | .cs | class, method, type, record |
| C | .c | function, type, constant |
| C++ | .cpp, .cc, .cxx, .hpp, .hh, .hxx, .h* | function, class, method, type, constant |
| Elixir | .ex, .exs | class (module/impl), type (protocol/@type/@callback), method, function |
| Ruby | .rb, .rake | class, type (module), method, function |
| SQL | .sql | function (CREATE FUNCTION, CTE, dbt macro/test/materialization), type (CREATE TABLE/VIEW/SCHEMA/INDEX, dbt snapshot) |
| XML/XUL | .xml, .xul | type (root element), constant (id attributes), function (script refs) |
* .h is parsed as C++ first, then falls back to C when no C++ symbols are extracted.
See LANGUAGE_SUPPORT.md for full semantics.
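The `.h` fallback rule amounts to extension-based routing with an ordered parser list. A sketch under stated assumptions (the table here is deliberately partial, and the function name is illustrative):

```python
from pathlib import Path

# Extension-based parser routing with the documented .h behavior:
# try C++ first, fall back to C when no C++ symbols are extracted.
# Partial mapping for illustration only.
def languages_to_try(filename: str) -> list[str]:
    ext = Path(filename).suffix.lower()
    if ext == ".h":
        return ["cpp", "c"]        # ordered fallback, per the table above
    table = {".py": ["python"], ".rs": ["rust"], ".cc": ["cpp"], ".c": ["c"]}
    return table.get(ext, [])      # unknown extensions are skipped
```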
Context Providers
When indexing local folders, jCodeMunch automatically detects ecosystem tools and enriches the index with business context — descriptions, tags, and metadata from project configuration files.
| Provider | Detects | Enriches With |
|---|---|---|
| dbt | dbt_project.yml | Model descriptions, tags, column names/descriptions |
Context enrichment is automatic — no configuration needed. When a provider detects its tool, it injects metadata into AI summarization prompts, file summaries, and search keywords.
Example: a dbt model with a schema.yml description produces file summaries like:
This table summarizes account ledger. Tags: nightly, agg, intraday. 70 properties
Instead of the default:
Contains 2 functions: source, renamed
The provider system is extensible — adding support for Terraform, OpenAPI, Django, or any other tool requires implementing a single ContextProvider class.
See CONTEXT_PROVIDERS.md for the full architecture, dbt details, and guide to writing new providers.
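To make the extensibility claim concrete, here is a hedged sketch of what a hypothetical Terraform provider might look like. The base-class name `ContextProvider` comes from the text above; the method names (`detect`, `enrich`) are assumptions for illustration only, so consult CONTEXT_PROVIDERS.md for the real interface.

```python
from pathlib import Path

# Hypothetical provider sketch; method names are assumptions, not the
# actual jCodeMunch ContextProvider interface.
class TerraformProvider:
    """Would enrich symbols in repos that contain Terraform configuration."""

    def detect(self, root: Path) -> bool:
        # Fires when the project root contains any .tf file
        return any(root.glob("*.tf"))

    def enrich(self, file_path: str) -> dict:
        # Metadata that would be merged into summaries and search keywords
        return {"tags": ["terraform", "infrastructure"]}
```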
Contributing
PRs welcome! All contributors must sign the Contributor License Agreement before their PR can be merged — CLA Assistant will prompt you automatically. See CONTRIBUTING.md for details.
Security
Built-in protections:
- Path traversal prevention (owner/name sanitization + `_safe_content_path` enforcement)
- Symlink escape protection
- Secret file exclusion (`.env`, `*.pem`, etc.)
- Binary detection
- Configurable file size limits
See SECURITY.md for details.
Best Use Cases
- Large multi-module repositories
- Agent-driven refactors
- Architecture exploration
- Faster onboarding
- Token-efficient multi-agent workflows
Not Intended For
- LSP diagnostics or completions
- Editing workflows
- Real-time file watching
- Cross-repository global indexing
- Semantic program analysis
Local LLMs (Ollama / LM Studio)
You can use local, privacy-preserving AI models to generate summaries by providing an OpenAI-compatible endpoint.
For Ollama, run a model locally, then configure the MCP server:
"env": {
"OPENAI_API_BASE": "http://localhost:11434/v1",
"OPENAI_MODEL": "qwen3-coder"
}
For LM Studio, ensure the Local Server is running (usually on port 1234):
"env": {
"OPENAI_API_BASE": "http://127.0.0.1:1234/v1",
"OPENAI_MODEL": "openai/gpt-oss-20b"
}
[!TIP] Performance Note: Local models can be slow to load into memory on their first request, potentially causing the MCP server to time out and fall back to generic signature summaries. It is highly recommended to pre-load the model in Ollama or LM Studio before starting the server, or to increase the `OPENAI_TIMEOUT` environment variable (e.g., to "120.0") to allow more time for generation.
Environment Variables
| Variable | Purpose | Required |
|---|---|---|
| GITHUB_TOKEN | GitHub API auth | No |
| ANTHROPIC_API_KEY | Symbol summaries via Claude Haiku (takes priority) | No |
| ANTHROPIC_BASE_URL | Third-party Anthropic-compatible endpoints (e.g. z.ai) | No |
| ANTHROPIC_MODEL | Model name for Claude summaries (default: claude-haiku-4-5-20251001) | No |
| GOOGLE_API_KEY | Symbol summaries via Gemini Flash | No |
| GOOGLE_MODEL | Model name for Gemini summaries (default: gemini-2.5-flash-lite) | No |
| OPENAI_API_BASE | Base URL for local LLMs (e.g. http://localhost:11434/v1) | No |
| OPENAI_API_KEY | API key for local LLMs (default: local-llm) | No |
| OPENAI_MODEL | Model name for local LLMs (default: qwen3-coder) | No |
| OPENAI_TIMEOUT | Timeout in seconds for local requests (default: 60.0) | No |
| OPENAI_BATCH_SIZE | Symbols per summarization request (default: 10) | No |
| OPENAI_CONCURRENCY | Max parallel batch requests (default: 1) | No |
| OPENAI_MAX_TOKENS | Max output tokens per batch response (default: 500) | No |
| CODE_INDEX_PATH | Custom cache path | No |
| JCODEMUNCH_MAX_INDEX_FILES | Maximum files to index per repo/folder (default: 10000) | No |
| JCODEMUNCH_CONTEXT_PROVIDERS | Set to 0 to disable context providers (dbt, etc.) during indexing | No |
| JCODEMUNCH_SHARE_SAVINGS | Set to 0 to disable anonymous community token savings reporting | No |
| JCODEMUNCH_LOG_LEVEL | Log level: DEBUG, INFO, WARNING, ERROR (default: WARNING) | No |
| JCODEMUNCH_LOG_FILE | Path to log file. If unset, logs go to stderr. Use a file to avoid polluting MCP stdio. | No |
Community Savings Meter
Each tool call contributes an anonymous delta to a live global counter at j.gravelle.us. Only two values are ever sent: the tokens saved (a number) and a random anonymous install ID — never code, paths, repo names, or anything identifying. The anon ID is generated once and stored in ~/.code-index/_savings.json.
To disable, set JCODEMUNCH_SHARE_SAVINGS=0 in your MCP server env.
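The opt-out check described above boils down to one environment-variable test: reporting is on by default and disabled when the variable is 0. A minimal sketch mirroring the documented behavior (the real implementation may differ in detail):

```python
import os

# Sharing is on by default; JCODEMUNCH_SHARE_SAVINGS=0 opts out, per the docs.
def should_share() -> bool:
    return os.environ.get("JCODEMUNCH_SHARE_SAVINGS", "1") != "0"
```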
Documentation
- USER_GUIDE.md
- ARCHITECTURE.md
- SPEC.md
- SECURITY.md
- LANGUAGE_SUPPORT.md
- CONTEXT_PROVIDERS.md
Star History
<a href="https://www.star-history.com/#jgravelle/jcodemunch-mcp&type=date&legend=top-left"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=jgravelle/jcodemunch-mcp&type=date&theme=dark&legend=top-left" /> <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=jgravelle/jcodemunch-mcp&type=date&legend=top-left" /> <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=jgravelle/jcodemunch-mcp&type=date&legend=top-left" /> </picture> </a>

License (Dual Use)
This repository is free for non-commercial use under the terms below.
Commercial use requires a paid commercial license.
Copyright and License Text
Copyright (c) 2026 J. Gravelle
1. Non-Commercial License Grant (Free)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to use, copy, modify, merge, publish, and distribute the Software for personal, educational, research, hobby, or other non-commercial purposes, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
- Any modifications made to the Software must clearly indicate that they are derived from the original work, and the name of the original author (J. Gravelle) must remain intact. He's kinda full of himself.
- Redistributions of the Software in source code form must include a prominent notice describing any modifications from the original version.
2. Commercial Use
Commercial use of the Software requires a separate paid commercial license from the author.
“Commercial use” includes, but is not limited to:
- Use of the Software in a business environment
- Internal use within a for-profit organization
- Incorporation into a product or service offered for sale
- Use in connection with revenue generation, consulting, SaaS, hosting, or fee-based services
For commercial licensing inquiries, contact:
[email protected] | https://j.gravelle.us
Until a commercial license is obtained, commercial use is not permitted.
3. Disclaimer of Warranty
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHOR OR COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.