ChunkHound
A local-first semantic code search tool with vector and regex capabilities, designed for AI assistants.
Local-first codebase intelligence
Your AI assistant searches code but doesn't understand it. ChunkHound researches your codebase—extracting architecture, patterns, and institutional knowledge at any scale. Integrates via MCP.
Features
- cAST Algorithm - Research-backed semantic code chunking
- Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
- Semantic search - Natural language queries like "find authentication code"
- Regex search - Pattern matching without API keys
- Local-first - Your code stays on your machine
- 32 languages with structured parsing
  - Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Haskell, Swift, Bash, MATLAB, Makefile, Objective-C, PHP, Dart, Lua, Vue, Svelte, Zig
  - Configuration: JSON, YAML, TOML, HCL, Markdown
  - Text-based (custom parsers): Text files, PDF
- MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, etc.
- Real-time indexing - Automatic file watching, smart diffs, seamless branch switching, and explicit backend selection (`watchdog`, `watchman`, `polling`)
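
To make the idea behind a polling backend and incremental re-indexing concrete, here is a minimal sketch of content-hash change detection. It is an illustration only and not ChunkHound's implementation: the function names `snapshot` and `diff_snapshots`, the `.py`-only suffix filter, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path, suffixes: tuple[str, ...] = (".py",)) -> dict[str, str]:
    """Map each matching file (path relative to root) to a SHA-256 digest of its bytes."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and report which files would need re-indexing."""
    return {
        # Present now, absent before
        "added": sorted(new.keys() - old.keys()),
        # Present before, absent now
        "removed": sorted(old.keys() - new.keys()),
        # Present in both, but the content digest differs
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

A polling loop would call `snapshot` on an interval and pass the previous and current results to `diff_snapshots`, so only added or modified files are re-processed. Event-based backends avoid the repeated directory walk by reacting to file-system notifications instead.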
Documentation
Visit chunkhound.github.io for complete guides.
Requirements
- Python 3.10+
- uv package manager
- API keys (optional - regex search works without any keys)
  - Embeddings: VoyageAI (recommended) | OpenAI | Local with Ollama
  - LLM (for Code Research): Claude Code CLI or Codex CLI (no API key needed) | Anthropic | OpenAI | Grok (xAI)
Installation
```bash
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install ChunkHound
uv tool install chunkhound
```
Quick Start
- Create `.chunkhound.json` in project root

  ```json
  {
    "embedding": {
      "provider": "voyageai",
      "api_key": "your-voyageai-key"
    },
    "llm": {
      "provider": "claude-code-cli"
    }
  }
  ```

  Note: Use `"codex-cli"` instead if you prefer Codex. Both work equally well and require no API key.

- Index your codebase

  ```bash
  chunkhound index
  ```
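
The `.chunkhound.json` file from the first step can also be generated by a script, which is handy when setting up several repositories. The sketch below reproduces the structure shown above using only the Python standard library. The helper names `build_config` and `write_config` are hypothetical, and the `VOYAGE_API_KEY` environment variable is an assumed name for this example, not a documented ChunkHound setting.

```python
import json
import os
from pathlib import Path

# Only the two LLM provider identifiers that appear in the Quick Start are accepted here.
KNOWN_LLM_PROVIDERS = {"claude-code-cli", "codex-cli"}


def build_config(llm_provider: str = "claude-code-cli") -> dict:
    """Return a dict with the same shape as the Quick Start example."""
    if llm_provider not in KNOWN_LLM_PROVIDERS:
        raise ValueError(f"unsupported provider in this sketch: {llm_provider!r}")
    return {
        "embedding": {
            "provider": "voyageai",
            # Assumed variable name; falls back to the placeholder from the example.
            "api_key": os.environ.get("VOYAGE_API_KEY", "your-voyageai-key"),
        },
        "llm": {"provider": llm_provider},
    }


def write_config(project_root: Path, llm_provider: str = "claude-code-cli") -> Path:
    """Write .chunkhound.json into project_root and return its path."""
    path = project_root / ".chunkhound.json"
    path.write_text(json.dumps(build_config(llm_provider), indent=2) + "\n", encoding="utf-8")
    return path
```

Calling `write_config(Path.cwd(), "codex-cli")` writes the file into the current directory with the Codex variant selected. After that, `chunkhound index` can be run as usual.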
For configuration, IDE setup, and advanced usage, see the documentation.
Why ChunkHound?
| Approach | Capability | Scale | Maintenance |
|---|---|---|---|
| Keyword Search | Exact matching | Fast | None |
| Traditional RAG | Semantic search | Scales | Re-index files |
| Knowledge Graphs | Relationship queries | Expensive | Continuous sync |
| ChunkHound | Semantic + Regex + Code Research | Automatic | Incremental + realtime |
Ideal for:
- Large monorepos with cross-team dependencies
- Security-sensitive codebases (local-only, no cloud)
- Multi-language projects needing consistent search
- Offline/air-gapped development environments
License
MIT