ChunkHound
A local-first semantic code search tool with vector and regex capabilities, designed for AI assistants.
Your AI assistant searches code but doesn't understand it. ChunkHound researches your codebase—extracting architecture, patterns, and institutional knowledge at any scale. Integrates via MCP.
Features
- cAST Algorithm - Research-backed semantic code chunking
- Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
- Semantic search - Natural language queries like "find authentication code"
- Regex search - Pattern matching without API keys
- Local-first - Your code stays on your machine
- 30 languages with structured parsing
  - Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Haskell, Swift, Bash, MATLAB, Makefile, Objective-C, PHP, Vue, Svelte, Zig
  - Configuration: JSON, YAML, TOML, HCL, Markdown
  - Text-based (custom parsers): Text files, PDF
- MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, and other MCP clients
- Real-time indexing - Automatic file watching, smart diffs, seamless branch switching
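
As a rough illustration of what syntax-aware chunking means, the sketch below splits Python source into one chunk per top-level function or class using the standard library ast module. It is not the cAST algorithm and not ChunkHound's implementation (ChunkHound parses with Tree-sitter, per the language list above). The function name chunk_by_definitions and the chunk dictionary layout are assumptions made for this example only.

```python
import ast


def chunk_by_definitions(source: str) -> list[dict]:
    """Return one chunk per top-level function or class in `source`.

    Hypothetical helper for illustration; not part of ChunkHound.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                {
                    "name": node.name,
                    "start": node.lineno,
                    "end": node.end_lineno,
                    "text": text,
                }
            )
    return chunks


SAMPLE = '''\
def login(user, password):
    return check(user, password)

class Session:
    def close(self):
        pass
'''

if __name__ == "__main__":
    for chunk in chunk_by_definitions(SAMPLE):
        print(chunk["name"], chunk["start"], chunk["end"])
```

Chunks that follow definition boundaries keep a signature and its body together, which is the property that fixed line-count or character-count splitting tends to lose.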
Documentation
Visit chunkhound.github.io for complete guides.
Requirements
- Python 3.10+
- uv package manager
- API keys (optional - regex search works without any keys)
  - Embeddings: VoyageAI (recommended) | OpenAI | Local with Ollama
  - LLM (for Code Research): Claude Code CLI or Codex CLI (no API key needed) | Anthropic | OpenAI
Installation
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install ChunkHound
uv tool install chunkhound
Quick Start
- Create .chunkhound.json in project root
{
  "embedding": {
    "provider": "voyageai",
    "api_key": "your-voyageai-key"
  },
  "llm": {
    "provider": "claude-code-cli"
  }
}
Note: Use "codex-cli" instead if you prefer Codex. Both work equally well and require no API key.
- Index your codebase
chunkhound index
For configuration, IDE setup, and advanced usage, see the documentation.
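
The Quick Start configuration file can also be generated from a script, for example when setting up several repositories. The sketch below is one possible way to do that in Python. It uses only the keys shown in the Quick Start ("embedding.provider", "embedding.api_key", "llm.provider"); the helper names build_config and write_config are hypothetical and not part of ChunkHound. The demo writes to a temporary directory so that running it does not overwrite a real config.

```python
import json
import tempfile
from pathlib import Path


def build_config(api_key: str, llm_provider: str = "claude-code-cli") -> dict:
    """Build the Quick Start config as a dict (hypothetical helper).

    `llm_provider` is passed through unchanged; the Quick Start names
    "claude-code-cli" and "codex-cli" as values that need no API key.
    """
    return {
        "embedding": {"provider": "voyageai", "api_key": api_key},
        "llm": {"provider": llm_provider},
    }


def write_config(project_root: Path, config: dict) -> Path:
    """Write `config` to .chunkhound.json under `project_root`."""
    path = project_root / ".chunkhound.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Demo only: a temporary directory stands in for a project root.
    with tempfile.TemporaryDirectory() as tmp:
        written = write_config(Path(tmp), build_config("your-voyageai-key", "codex-cli"))
        print(written.read_text(encoding="utf-8"))
```

In real use, pass the repository root to write_config and then run chunkhound index from that directory.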
Why ChunkHound?
| Approach | Capability | Scale | Maintenance |
|---|---|---|---|
| Keyword Search | Exact matching | Fast | None |
| Traditional RAG | Semantic search | Scales | Re-index files |
| Knowledge Graphs | Relationship queries | Expensive | Continuous sync |
| ChunkHound | Semantic + Regex + Code Research | Automatic | Incremental + realtime |
Ideal for:
- Large monorepos with cross-team dependencies
- Security-sensitive codebases (local-only, no cloud)
- Multi-language projects needing consistent search
- Offline/air-gapped development environments
Stop recreating code. Start with deep understanding.
License
MIT