ChunkHound
A local-first semantic code search tool with vector and regex capabilities, designed for AI assistants.
Local-first codebase intelligence
Your AI assistant searches code but doesn't understand it. ChunkHound researches your codebase—extracting architecture, patterns, and institutional knowledge at any scale. Integrates via MCP.
Features
- cAST Algorithm - Research-backed semantic code chunking
- Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
- Semantic search - Natural language queries like "find authentication code"
- Regex search - Pattern matching without API keys
- Local-first - Your code stays on your machine
- 32 languages with structured parsing
  - Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Haskell, Swift, Bash, MATLAB, Makefile, Objective-C, PHP, Dart, Lua, Vue, Svelte, Zig
  - Configuration: JSON, YAML, TOML, HCL, Markdown
  - Text-based (custom parsers): Text files, PDF
- MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, etc.
- Real-time indexing - Automatic file watching, smart diffs, seamless branch switching, and explicit backend selection (watchdog, watchman, polling)
Documentation
Visit chunkhound.github.io for complete guides.
Requirements
- Python 3.10+
- uv package manager
- API keys (optional - regex search works without any keys)
  - Embeddings: VoyageAI (recommended) | OpenAI | Local with Ollama
  - LLM (for Code Research): Claude Code CLI or Codex CLI (no API key needed) | Anthropic | OpenAI | Grok (xAI)
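The two hard requirements above (Python 3.10+ and uv) can be checked before installing. The following is a minimal sketch, not part of ChunkHound: the function name `check_requirements` and its injectable parameters are hypothetical, and it assumes only that uv is invoked as `uv` on the PATH.

```python
import shutil
import sys


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means the basics are present.

    Both parameters are hypothetical hooks so the check can be exercised
    without depending on the machine it runs on.
    """
    problems = []
    # The README asks for Python 3.10 or newer.
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {version_info[0]}.{version_info[1]}"
        )
    # Assumption: the uv package manager is available as `uv` on the PATH.
    if which("uv") is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

API keys are deliberately not checked here, since regex search works without any.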
Installation
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install ChunkHound
uv tool install chunkhound
Quick Start
- Create .chunkhound.json in project root
{
  "embedding": {
    "provider": "voyageai",
    "api_key": "your-voyageai-key"
  },
  "llm": {
    "provider": "claude-code-cli"
  }
}
Note: Use "codex-cli" instead if you prefer Codex. Both work equally well and require no API key.
- Index your codebase
chunkhound index
For configuration, IDE setup, and advanced usage, see the documentation.
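If you set up many repositories, the first Quick Start step can be scripted. This is a rough sketch that writes the same .chunkhound.json shown above; the helper names `build_config` and `write_config` are hypothetical, and it covers only the two LLM provider values named in the Quick Start. Other providers are configured as described in the documentation.

```python
import json
from pathlib import Path

# Only the two provider strings that appear in the Quick Start are accepted here.
CLI_LLM_PROVIDERS = ("claude-code-cli", "codex-cli")


def build_config(voyageai_key, llm_provider="claude-code-cli"):
    """Build the config dictionary from the Quick Start example."""
    if llm_provider not in CLI_LLM_PROVIDERS:
        raise ValueError(f"this sketch only covers {CLI_LLM_PROVIDERS}")
    return {
        "embedding": {"provider": "voyageai", "api_key": voyageai_key},
        "llm": {"provider": llm_provider},
    }


def write_config(project_root, config):
    """Write the config as .chunkhound.json in the project root; return its path."""
    path = Path(project_root) / ".chunkhound.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder key, as in the Quick Start; replace it with a real one.
    print(write_config(".", build_config("your-voyageai-key")))
```

Because the file holds an API key, it is probably worth keeping out of version control.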
Why ChunkHound?
| Approach | Capability | Scale | Maintenance |
|---|---|---|---|
| Keyword Search | Exact matching | Fast | None |
| Traditional RAG | Semantic search | Scales | Re-index files |
| Knowledge Graphs | Relationship queries | Expensive | Continuous sync |
| ChunkHound | Semantic + Regex + Code Research | Automatic | Incremental + realtime |
Ideal for:
- Large monorepos with cross-team dependencies
- Security-sensitive codebases (local-only, no cloud)
- Multi-language projects needing consistent search
- Offline/air-gapped development environments
License
MIT