# Autodev Codebase

A platform-agnostic code analysis library with semantic search capabilities and MCP server support.

`@autodev/codebase`
```text
╭─ ~/workspace/autodev-codebase
╰─❯ codebase --demo --search="user manage"

Found 3 results in 2 files for: "user manage"
==================================================
File: "hello.js"
==================================================
< class UserManager > (L7-20)

class UserManager {
  constructor() {
    this.users = [];
  }

  addUser(user) {
    this.users.push(user);
    console.log('User added:', user.name);
  }

  getUsers() {
    return this.users;
  }
}

==================================================
File: "README.md" | 2 snippets
==================================================
< md_h1 Demo Project > md_h2 Usage > md_h3 JavaScript Functions > (L16-20)

### JavaScript Functions
- greetUser(name) - Greets a user by name
- UserManager - Class for managing user data
─────
< md_h1 Demo Project > md_h2 Search Examples > (L27-38)

## Search Examples
Try searching for:
- "greet user"
- "process data"
- "user management"
- "batch processing"
- "YOLO model"
- "computer vision"
- "object detection"
- "model training"
```
A vector-embedding-based semantic code search tool with an MCP server and multi-model integration. It can be used as a pure CLI tool, and it supports Ollama for fully local embedding and reranking, enabling complete offline operation and privacy protection for your code repository.
## 🚀 Features
- 🔍 Semantic Code Search: Vector-based search using advanced embedding models
- 🌐 MCP Server: HTTP-based MCP server with SSE and stdio adapters
- 💻 Pure CLI Tool: Standalone command-line interface without GUI dependencies
- ⚙️ Layered Configuration: CLI, project, and global config management
- 🎯 Advanced Path Filtering: Glob patterns with brace expansion and exclusions
- 🌲 Tree-sitter Parsing: Support for 40+ programming languages
- 💾 Qdrant Integration: High-performance vector database
- 🔄 Multiple Providers: OpenAI, Ollama, Jina, Gemini, Mistral, OpenRouter, Vercel
- 📊 Real-time Watching: Automatic index updates
- ⚡ Batch Processing: Efficient parallel processing
- 📝 Code Outline Extraction: Generate structured code outlines with AI summaries
## 📦 Installation

### 1. Dependencies

```bash
brew install ollama ripgrep
ollama serve
ollama pull nomic-embed-text
```
### 2. Qdrant

```bash
docker run -d -p 6333:6333 -p 6334:6334 --name qdrant qdrant/qdrant
```
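The command above keeps the index inside the container's filesystem, so it disappears when the container is removed. To persist vectors across restarts, mount a host directory at Qdrant's storage path (standard usage for the `qdrant/qdrant` image):

```bash
# Same as above, but with a host directory mounted for persistent storage.
docker run -d -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  --name qdrant qdrant/qdrant
```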
### 3. Install

```bash
npm install -g @autodev/codebase
codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
```
## 🛠️ Quick Start

```bash
# Demo mode (recommended for first-time use)
# Creates a demo directory in the current working directory for testing

# Index & search
codebase --demo --index
codebase --demo --search="user greet"

# MCP server
codebase --demo --serve
```
## 📋 Commands
### 📝 AI-Powered Code Outlines

Generate intelligent code summaries with one command:

```bash
codebase --outline "src/**/*.ts" --summarize
```
Output Example:

```text
# src/cli.ts (1902 lines)
└─ Implements a simplified CLI for @autodev/codebase using Node.js native parseArgs.
   Manages codebase indexing, searching, and MCP server operations.

  27--35 | function initGlobalLogger
  └─ Initializes a global logger instance with specified log level and timestamps.

  45--54 | interface SearchResult
  └─ Defines the structure for search result payloads, including file path, code chunk,
     and relevance score.

... (full outline with AI summaries)
```
Benefits:
- 🧠 Understand code fast - Get function-level summaries without reading every line
- 💾 Smart caching - Only summarizes changed code blocks
- 🌐 Multi-language - English/Chinese summaries supported
- ⚡ Batch processing - Efficiently handles large codebases
Quick Setup:

```bash
# Configure Ollama (recommended for free, local AI)
codebase --set-config summarizerProvider=ollama,summarizerOllamaModelId=qwen3-vl:4b-instruct

# Or use DeepSeek (cost-effective API)
codebase --set-config summarizerProvider=openai-compatible,summarizerOpenAiCompatibleBaseUrl=https://api.deepseek.com/v1,summarizerOpenAiCompatibleModelId=deepseek-chat,summarizerOpenAiCompatibleApiKey=sk-your-key
```
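If you go the Ollama route, make sure the configured summarizer model has been pulled before running `--summarize`:

```bash
# Make the summarizer model configured above available locally.
ollama pull qwen3-vl:4b-instruct
```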
### Indexing & Search

```bash
# Index the codebase
codebase --index --path=/my/project --force

# Search with filters
codebase --search="error handling" --path-filters="src/**/*.ts"

# Search with custom limit and minimum score
codebase --search="authentication" --limit=20 --min-score=0.7
codebase --search="API" -l 30 -S 0.5

# Search in JSON format
codebase --search="authentication" --json

# Clear index data
codebase --clear --path=/my/project
```
### MCP Server

```bash
# HTTP mode (recommended)
codebase --serve --port=3001 --path=/my/project

# Stdio adapter
codebase --stdio-adapter --server-url=http://localhost:3001/mcp
```
### Configuration

```bash
# View config
codebase --get-config
codebase --get-config embedderProvider --json

# Set config
codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
codebase --set-config --global qdrantUrl=http://localhost:6333
```
## Advanced Features
### 🔍 LLM-Powered Search Reranking

Enable LLM reranking to dramatically improve search relevance:

```bash
# Enable reranking with Ollama (recommended)
codebase --set-config rerankerEnabled=true,rerankerProvider=ollama,rerankerOllamaModelId=qwen3-vl:4b-instruct

# Or use OpenAI-compatible providers
codebase --set-config rerankerEnabled=true,rerankerProvider=openai-compatible,rerankerOpenAiCompatibleModelId=deepseek-chat

# Search with automatic reranking
codebase --search="user authentication"  # Results are automatically reranked by LLM
```
Benefits:
- 🎯 Higher precision: LLM understands semantic relevance beyond vector similarity
- 📊 Smart scoring: Results are reranked on a 0-10 scale based on query relevance
- ⚡ Batch processing: Efficiently handles large result sets with configurable batch sizes
- 🎛️ Threshold control: Filter results with `rerankerMinScore` to keep only high-quality matches
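These reranker options can also live in the project config file. A minimal sketch, assuming the Ollama setup above (the `rerankerMinScore` value of 5 on the 0-10 scale is illustrative, not a recommended default):

```json
{
  "rerankerEnabled": true,
  "rerankerProvider": "ollama",
  "rerankerOllamaModelId": "qwen3-vl:4b-instruct",
  "rerankerMinScore": 5
}
```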
### Path Filtering & Export

```bash
# Path filtering with brace expansion and exclusions
codebase --search="API" --path-filters="src/**/*.ts,lib/**/*.js"
codebase --search="utils" --path-filters="{src,test}/**/*.ts"

# Export results in JSON format for scripts
codebase --search="auth" --json
```
### 📝 AI-Powered Code Outlines

Generate intelligent code summaries and outlines:

```bash
# Extract code structure
codebase --outline src/index.ts

# With AI summaries (recommended)
codebase --outline "src/**/*.ts" --summarize

# Preview before processing
codebase --outline "src/**/*.ts" --dry-run

# Clear cache and regenerate
codebase --outline src/index.ts --summarize --clear-summarize-cache
```
Key Benefits:
- 🎯 Function-level summaries: Understand code purpose at a glance
- 💾 Smart caching: Avoid redundant LLM calls
- 🌐 Multi-language: English / Chinese support
- ⚡ Batch processing: Efficiently handle large codebases
## ⚙️ Configuration
### Config Layers (Priority Order)

1. CLI Arguments - Runtime parameters (`--path`, `--config`, `--log-level`, `--force`, etc.)
2. Project Config - `./autodev-config.json` (or a custom path via `--config`)
3. Global Config - `~/.autodev-cache/autodev-config.json`
4. Built-in Defaults - Fallback values

Note: CLI arguments provide runtime overrides for paths, logging, and operational behavior. For persistent configuration (embedderProvider, API keys, search parameters), use `--set-config` to save to config files.
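For example, the layers compose like this: a global value applies everywhere, a project-level value overrides it inside that project, and a CLI argument wins for a single run:

```bash
codebase --set-config --global embedderProvider=openai   # global default
codebase --set-config embedderProvider=ollama            # project override (wins over global)
codebase --index --path=/my/project --log-level=info     # runtime-only overrides (win over both)
```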
### Common Config Examples

Ollama:

```json
{
  "embedderProvider": "ollama",
  "embedderModelId": "nomic-embed-text",
  "qdrantUrl": "http://localhost:6333"
}
```

OpenAI:

```json
{
  "embedderProvider": "openai",
  "embedderModelId": "text-embedding-3-small",
  "embedderOpenAiApiKey": "sk-your-key",
  "qdrantUrl": "http://localhost:6333"
}
```

OpenAI-Compatible:

```json
{
  "embedderProvider": "openai-compatible",
  "embedderModelId": "text-embedding-3-small",
  "embedderOpenAiCompatibleApiKey": "sk-your-key",
  "embedderOpenAiCompatibleBaseUrl": "https://api.openai.com/v1"
}
```
### Key Configuration Options

| Category | Options | Description |
|---|---|---|
| Embedding | `embedderProvider`, `embedderModelId`, `embedderModelDimension` | Provider and model settings |
| API Keys | `embedderOpenAiApiKey`, `embedderOpenAiCompatibleApiKey` | Authentication |
| Vector Store | `qdrantUrl`, `qdrantApiKey` | Qdrant connection |
| Search | `vectorSearchMinScore`, `vectorSearchMaxResults` | Search behavior |
| Reranker | `rerankerEnabled`, `rerankerProvider` | Result reranking |
| Summarizer | `summarizerProvider`, `summarizerLanguage`, `summarizerBatchSize` | AI summary generation |
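Putting several categories together, a project-level `./autodev-config.json` might look like the sketch below. The option names come from the table above; the values are illustrative, not recommended defaults:

```json
{
  "embedderProvider": "ollama",
  "embedderModelId": "nomic-embed-text",
  "qdrantUrl": "http://localhost:6333",
  "vectorSearchMinScore": 0.4,
  "vectorSearchMaxResults": 20,
  "rerankerEnabled": false,
  "summarizerProvider": "ollama"
}
```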
Key CLI Arguments:
- `--serve` / `--index` / `--search` - Core operations
- `--outline <pattern>` - Extract code outlines (supports glob patterns)
- `--summarize` - Generate AI summaries for code outlines
- `--dry-run` - Preview operations before execution
- `--title` - Show only file-level summaries
- `--clear-summarize-cache` - Clear all summary caches
- `--get-config` / `--set-config` - Configuration management
- `--path`, `--demo`, `--force` - Common options
- `--limit` / `-l <number>` - Maximum number of search results (default: from config, max 50)
- `--min-score` / `-S <number>` - Minimum similarity score for search results (0-1, default: from config)
- `--help` - Show all available options
For complete CLI reference, see CONFIG.md.
Configuration Commands:

```bash
# View config
codebase --get-config
codebase --get-config --json

# Set config (saves to file)
codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
codebase --set-config --global embedderProvider=openai,embedderOpenAiApiKey=sk-xxx

# Use custom config file
codebase --config=/path/to/config.json --get-config
codebase --config=/path/to/config.json --set-config embedderProvider=ollama

# Runtime override (paths, logging, etc.)
codebase --index --path=/my/project --log-level=info --force
```
For complete configuration reference, see CONFIG.md.
## 🔌 MCP Integration

### HTTP Streamable Mode (Recommended)

```bash
codebase --serve --port=3001
```
IDE Config:

```json
{
  "mcpServers": {
    "codebase": {
      "url": "http://localhost:3001/mcp"
    }
  }
}
```
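Any MCP-capable client can talk to this endpoint, not just IDEs. Below is a minimal sketch using the official TypeScript SDK (`@modelcontextprotocol/sdk`); the tool name `search_codebase` and its arguments are assumptions for illustration, so call `listTools()` to see what the server actually exposes:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to the server started with `codebase --serve --port=3001`.
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(new StreamableHTTPClientTransport(new URL("http://localhost:3001/mcp")));

// Discover the tools the codebase server exposes.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

// Hypothetical call; replace the name/arguments with what listTools() reports.
const result = await client.callTool({
  name: "search_codebase",
  arguments: { query: "user authentication" },
});
console.log(result);

await client.close();
```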
### Stdio Adapter

```bash
# First, start the MCP server in one terminal
codebase --serve --port=3001

# Then connect via the stdio adapter in another terminal (for IDEs that require stdio)
codebase --stdio-adapter --server-url=http://localhost:3001/mcp
```
IDE Config:

```json
{
  "mcpServers": {
    "codebase": {
      "command": "codebase",
      "args": ["--stdio-adapter", "--server-url=http://localhost:3001/mcp"]
    }
  }
}
```
## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue on GitHub.

## 📄 License

This project is licensed under the MIT License.

## 🙏 Acknowledgments

This project is a fork and derivative work based on Roo Code. We've built upon their excellent foundation to create this specialized codebase analysis tool with enhanced features and MCP server capabilities.

🌟 If you find this tool helpful, please give us a star on GitHub!

Made with ❤️ for the developer community