CGM MCP Server
A Model Context Protocol (MCP) server implementation of CodeFuse-CGM (Code Graph Model), providing graph-integrated large language model capabilities for repository-level software engineering tasks.
Features
Two Deployment Modes
1. Full CGM Pipeline (with LLM integration)
- Repository-level Code Analysis: Analyze entire codebases using graph-based representations
- Issue Resolution: Automatically generate code patches to fix bugs and implement features
- Four-Stage Pipeline: Rewriter → Retriever → Reranker → Reader architecture
- Multi-LLM Support: Works with OpenAI, Anthropic, Ollama, LM Studio
2. Model-agnostic Tools (pure analysis, no LLM required)
- Pure Code Analysis: Extract code structure without LLM dependencies
- Universal Integration: Works with ANY AI model or IDE
- No API Keys Required: Zero external dependencies
- High Performance: Cached analysis results for speed
Performance & GPU Acceleration
Multi-Platform GPU Support
- Apple Silicon (M1/M2/M3): Native MPS acceleration (42x cache speedup!)
- NVIDIA GPU: Full CUDA support with cuPy integration
- AMD GPU: ROCm (Linux) and DirectML (Windows) support
- CPU Fallback: Automatic fallback ensures universal compatibility
Advanced Caching System
- Multi-level Caching: TTL cache (1hr) + LRU cache (500 entries) + AST cache
- Smart Cache Keys: MD5-based cache keys for efficient lookups
- Memory Management: Real-time monitoring with automatic cleanup
- Performance Stats: Detailed hit/miss ratios and timing metrics
Concurrent Processing
- Async File I/O: Non-blocking file operations with aiofiles
- Batch Processing: Concurrent analysis of multiple files
- GPU-Accelerated: Entity matching and text processing on GPU
- Intelligent Scheduling: Semaphore-controlled concurrency limits
Common Features
- MCP Integration: Compatible with Claude Desktop, VS Code, Cursor, and other MCP clients
- Graph-based Context: Leverages code structure and relationships for better understanding
- Multiple Output Formats: Structured JSON, Markdown, and Prompt formats
- Real-time Monitoring: GPU usage, memory consumption, and performance metrics
Table of Contents
- Installation
- Quick Start
- Performance & GPU Setup
- Configuration
- Usage
- Architecture
- API Reference
- Examples
- Performance Monitoring
- Contributing
- License
Installation
Prerequisites
- Python 3.8+
- pip or conda
Install from Source
# Clone the repository
git clone https://github.com/your-org/cgm-mcp.git
cd cgm-mcp
# Install dependencies
pip install -r requirements.txt
# Or install in development mode
pip install -e .
GPU Acceleration Setup (Optional)
Apple Silicon (M1/M2/M3) - Automatic
# No additional setup needed!
# MPS (Metal Performance Shaders) is automatically detected and enabled
pip install torch torchvision torchaudio # Usually already installed
NVIDIA GPU
# Install CUDA-enabled PyTorch
pip install torch --index-url https://download.pytorch.org/whl/cu118
# Optional: Enhanced GPU features
pip install cupy-cuda11x # For CUDA 11.x
# or
pip install cupy-cuda12x # For CUDA 12.x
AMD GPU
# Linux (ROCm)
pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
# Windows (DirectML)
pip install torch-directml
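Whichever platform you use, you can confirm which PyTorch backend will be picked up with a quick check like the one below (a minimal sketch; the bundled `check_gpu_dependencies.py` script performs a fuller diagnosis):

```python
# Minimal backend check (illustrative; not the server's actual detection code).
import torch

if torch.backends.mps.is_available():      # Apple Silicon (Metal Performance Shaders)
    device = torch.device("mps")
elif torch.cuda.is_available():            # NVIDIA CUDA (ROCm builds also report "cuda")
    device = torch.device("cuda")
else:                                      # universal CPU fallback
    device = torch.device("cpu")

print(f"Selected device: {device}")
```

Note that DirectML on Windows is provided by the separate `torch-directml` package and is not covered by this check.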
Install from PyPI (Coming Soon)
pip install cgm-mcp
Quick Start
1. Setup Environment
# Run setup script
./scripts/setup.sh
# Copy example environment file
cp .env.example .env
2. Choose Your Model Provider
Option A: Use Cloud Models (OpenAI/Anthropic)
# Edit .env with your API keys
export CGM_LLM_PROVIDER=openai
export CGM_LLM_API_KEY=your-openai-api-key
export CGM_LLM_MODEL=gpt-4
Option B: Use Local Models (Recommended)
# Install and start Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve
# Download recommended model
ollama pull deepseek-coder:6.7b
# Start with local model
./scripts/start_local.sh --provider ollama --model deepseek-coder:6.7b
Option C: Use LM Studio
# Download and start LM Studio
# Load deepseek-coder-6.7b-instruct model
# Start local server
# Start CGM with LM Studio
./scripts/start_local.sh --provider lmstudio
3. Start the Server
# Start MCP server (cloud models)
python main.py
# Start with local models
./scripts/start_local.sh
# Or with custom config
python main.py --config config.local.json --log-level DEBUG
4. Test with Example
# Run example usage
python examples/example_usage.py
# Check GPU acceleration status
python check_gpu_dependencies.py
Performance & GPU Setup
Check Your GPU Status
# Run the GPU dependency checker
python check_gpu_dependencies.py
Expected output for Apple Silicon:
OPTIMAL: Apple Silicon GPU acceleration is active!
  • MPS backend enabled
  • No additional dependencies needed
  • CuPy warnings can be ignored
Performance Benchmarks
| Platform | Entity Matching | Text Processing | Cache Hit Rate |
|---|---|---|---|
| Apple Silicon (MPS) | 42x speedup (cached) | ~0.001s (200 files) | 95%+ |
| NVIDIA CUDA | 5-10x speedup | 3-5x speedup | 90%+ |
| AMD ROCm | 3-8x speedup | 2-4x speedup | 90%+ |
| CPU Fallback | Baseline | Baseline | 85%+ |
Performance Features
Smart Caching System
- TTL Cache: 1-hour expiration for analysis results
- LRU Cache: 500 most recent files kept in memory
- AST Cache: 200 parsed syntax trees cached
- Embedding Cache: GPU-accelerated similarity vectors
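The cache layers listed above can be composed from standard building blocks. Here is a minimal sketch using `cachetools` (the names, sizes, and MD5 key scheme mirror the values above but are illustrative, not the server's actual internals):

```python
import hashlib
from cachetools import TTLCache, LRUCache

analysis_cache = TTLCache(maxsize=100, ttl=3600)  # analysis results expire after 1 hour
file_cache = LRUCache(maxsize=500)                # 500 most recently used files
ast_cache = LRUCache(maxsize=200)                 # 200 parsed syntax trees

def cache_key(repo_path: str, query: str) -> str:
    """MD5-based key: long paths and queries collapse to a fixed-size lookup key."""
    return hashlib.md5(f"{repo_path}:{query}".encode()).hexdigest()

def cached_analysis(repo_path: str, query: str, compute):
    key = cache_key(repo_path, query)
    if key in analysis_cache:            # hit: reuse the previous result
        return analysis_cache[key]
    result = compute(repo_path, query)   # miss: run the analysis and store it
    analysis_cache[key] = result
    return result
```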
Memory Management
- Real-time Monitoring: Track GPU and system memory usage
- Automatic Cleanup: Clear caches when memory usage > 80%
- Unified Memory: Apple Silicon's shared CPU/GPU memory
- Memory Pools: Efficient GPU memory allocation
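A sketch of how such a cleanup trigger can work, assuming `psutil` for system memory and cache objects like those in the previous sketch (the threshold mirrors `CGM_MEMORY_CLEANUP_THRESHOLD`; the real implementation may differ):

```python
import psutil

MEMORY_CLEANUP_THRESHOLD = 80  # percent of system memory

def maybe_cleanup(caches):
    """Clear caches (and GPU allocator caches) when memory pressure is high."""
    if psutil.virtual_memory().percent < MEMORY_CLEANUP_THRESHOLD:
        return
    for cache in caches:
        cache.clear()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # release cached CUDA allocations
    except ImportError:
        pass  # GPU support is optional
```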
Concurrent Processing
- Async File I/O: Non-blocking file operations
- Batch Processing: Process multiple files simultaneously
- Semaphore Control: Limit concurrent operations (default: 10)
- GPU Queuing: Intelligent GPU task scheduling
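The basic pattern looks like the sketch below: `aiofiles` for non-blocking reads, `asyncio.gather` for batching, and a semaphore capping concurrency at the default of 10 (names are illustrative):

```python
import asyncio
import aiofiles

semaphore = asyncio.Semaphore(10)  # default concurrency limit

async def read_file(path: str) -> str:
    async with semaphore:  # at most 10 files are read at once
        async with aiofiles.open(path, "r", encoding="utf-8", errors="ignore") as f:
            return await f.read()

async def read_batch(paths: list[str]) -> list[str]:
    # schedule all reads concurrently; the semaphore enforces the limit
    return await asyncio.gather(*(read_file(p) for p in paths))

# contents = asyncio.run(read_batch(["main.py", "setup.py"]))
```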
Performance Tuning
Environment Variables
# GPU Configuration
export CGM_USE_GPU=true # Enable GPU acceleration
export CGM_GPU_BATCH_SIZE=1024 # Batch size for GPU operations
export CGM_SIMILARITY_THRESHOLD=0.1 # Entity similarity threshold
export CGM_CACHE_EMBEDDINGS=true # Cache embedding vectors
# Memory Management
export CGM_MAX_CACHE_SIZE=500 # Maximum cached files
export CGM_MEMORY_CLEANUP_THRESHOLD=80 # Memory cleanup trigger (%)
export CGM_GPU_MEMORY_FRACTION=0.8 # GPU memory usage limit
Configuration File
{
"gpu": {
"use_gpu": true,
"batch_size": 1024,
"max_sequence_length": 512,
"similarity_threshold": 0.1,
"cache_embeddings": true,
"gpu_memory_fraction": 0.8
},
"performance": {
"max_concurrent_files": 10,
"cache_ttl_seconds": 3600,
"max_file_cache_size": 500,
"memory_cleanup_threshold": 80
}
}
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| CGM_LLM_PROVIDER | LLM provider (openai, anthropic, ollama, lmstudio, mock) | openai |
| CGM_LLM_API_KEY | API key for LLM provider (not needed for local models) | Required for cloud |
| CGM_LLM_MODEL | Model name | gpt-4 |
| CGM_LLM_API_BASE | Custom API base URL (for local models) | Provider default |
| CGM_LLM_TEMPERATURE | Generation temperature | 0.1 |
| CGM_LOG_LEVEL | Logging level | INFO |
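As a concrete example, a `.env` for the local Ollama setup described in Quick Start could contain (values illustrative):

```bash
CGM_LLM_PROVIDER=ollama
CGM_LLM_MODEL=deepseek-coder:6.7b
CGM_LLM_API_BASE=http://localhost:11434
CGM_LLM_TEMPERATURE=0.1
CGM_LOG_LEVEL=INFO
```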
Configuration File
Create a `config.json` file:
Cloud Models Configuration
{
"llm": {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.1,
"max_tokens": 4000
}
}
Local Models Configuration
{
"llm": {
"provider": "ollama",
"model": "deepseek-coder:6.7b",
"api_base": "http://localhost:11434",
"temperature": 0.1,
"max_tokens": 4000
},
"graph": {
"max_nodes": 5000,
"max_edges": 25000,
"cache_enabled": true
},
"server": {
"log_level": "INFO",
"max_concurrent_tasks": 3
}
}
Usage
MCP Tools
The server provides the following MCP tools:
Analysis Tools
cgm_analyze_repository
Analyze repository structure and extract code entities with GPU acceleration.
Parameters:
- `repository_path`: Path to the repository
- `query`: Search query for relevant code
- `analysis_scope`: Scope of analysis (`full`, `focused`, `minimal`)
- `max_files`: Maximum number of files to analyze
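Example arguments (values are illustrative):

```json
{
  "repository_path": "/path/to/repository",
  "query": "authentication logic",
  "analysis_scope": "focused",
  "max_files": 50
}
```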
cgm_get_file_content
Get detailed file content and analysis with concurrent processing.
Parameters:
- `repository_path`: Path to the repository
- `file_paths`: List of file paths to analyze
cgm_find_related_code
Find code entities related to a specific entity using GPU-accelerated similarity matching.
Parameters:
- `repository_path`: Path to the repository
- `entity_name`: Name of the entity to find relations for
- `relation_types`: Types of relations to include (optional)
cgm_extract_context
Extract structured context for external model consumption.
Parameters:
- `repository_path`: Path to the repository
- `query`: Query for context extraction
- `format`: Output format (`structured`, `markdown`, `prompt`)
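Example arguments (values are illustrative):

```json
{
  "repository_path": "/path/to/repository",
  "query": "How is user authentication implemented?",
  "format": "markdown"
}
```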
Performance Tools
clear_gpu_cache
Clear GPU caches to free memory.
Parameters: None
Legacy Tools
cgm_process_issue
Process a repository issue using the CGM framework.
Parameters:
- `task_type`: Type of task (`issue_resolution`, `code_analysis`, `bug_fixing`, `feature_implementation`)
- `repository_name`: Name of the repository
- `issue_description`: Description of the issue
- `repository_context`: Optional repository context
Example:
{
"task_type": "issue_resolution",
"repository_name": "my-project",
"issue_description": "Authentication fails with special characters in password",
"repository_context": {
"path": "/path/to/repository",
"language": "Python",
"framework": "Django"
}
}
cgm_get_task_status
Get the status of a running task.
Parameters:
- `task_id`: Task ID to check
cgm_health_check
Check server health status.
MCP Resources
System Resources
- `cgm://health`: Server health information
- `cgm://tasks`: List of active tasks
Performance Resources
- `cgm://cache`: Cache statistics and hit/miss ratios
- `cgm://performance`: Server performance and memory usage metrics
- `cgm://gpu`: GPU acceleration status and memory usage
Example Resource Access
These are MCP resource URIs, read through your MCP client rather than fetched over HTTP:
# Check GPU status
cgm://gpu
# Monitor cache performance
cgm://cache
# View performance metrics
cgm://performance
Architecture
CGM follows a four-stage pipeline:
graph LR
A[Issue] --> B[Rewriter]
B --> C[Retriever]
C --> D[Reranker]
D --> E[Reader]
E --> F[Code Patches]
G[Code Graph] --> C
G --> D
G --> E
Components
- Rewriter: Analyzes issues and extracts relevant entities and keywords
- Retriever: Locates relevant code subgraphs based on extracted information
- Reranker: Ranks files by relevance to focus analysis
- Reader: Generates specific code patches to resolve issues
Graph Builder
Constructs repository-level code graphs by analyzing:
- File structure and dependencies
- Class and function definitions
- Import relationships
- Code semantics and documentation
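A stripped-down sketch of this idea for Python sources, using the standard `ast` module and `networkx` (the actual Graph Builder is richer and language-aware):

```python
import ast
from pathlib import Path
import networkx as nx

def build_code_graph(repo_path: str) -> nx.DiGraph:
    graph = nx.DiGraph()
    for py_file in Path(repo_path).rglob("*.py"):
        module = str(py_file.relative_to(repo_path))
        graph.add_node(module, type="file")
        try:
            tree = ast.parse(py_file.read_text(encoding="utf-8", errors="ignore"))
        except SyntaxError:
            continue  # skip files that do not parse
        for node in ast.walk(tree):
            if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                # classes and functions become nodes linked to their defining file
                entity = f"{module}::{node.name}"
                graph.add_node(entity, type=type(node).__name__)
                graph.add_edge(module, entity, relation="defines")
            elif isinstance(node, (ast.Import, ast.ImportFrom)):
                # import statements become edges between modules
                target = getattr(node, "module", None) or node.names[0].name
                graph.add_edge(module, target, relation="imports")
    return graph
```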
API Reference
Core Models
CGMRequest
class CGMRequest(BaseModel):
task_type: TaskType
repository_name: str
issue_description: str
repository_context: Optional[Dict[str, Any]] = None
CGMResponse
class CGMResponse(BaseModel):
task_id: str
status: str
rewriter_result: Optional[RewriterResponse]
retriever_result: Optional[RetrieverResponse]
reranker_result: Optional[RerankerResponse]
reader_result: Optional[ReaderResponse]
processing_time: float
CodePatch
class CodePatch(BaseModel):
file_path: str
original_code: str
modified_code: str
line_start: int
line_end: int
explanation: str
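A patch like this can be consumed by a simple applier. The sketch below assumes `line_start`/`line_end` are 1-based and inclusive (an assumption for illustration, not documented behavior):

```python
from pathlib import Path

def apply_patch(patch: "CodePatch", repo_root: str = ".") -> None:
    """Replace the patched line range in the target file with the modified code."""
    target = Path(repo_root) / patch.file_path
    lines = target.read_text(encoding="utf-8").splitlines(keepends=True)
    replacement = patch.modified_code.splitlines(keepends=True)
    lines[patch.line_start - 1 : patch.line_end] = replacement
    target.write_text("".join(lines), encoding="utf-8")
```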
Examples
Basic Issue Resolution
import asyncio

from cgm_mcp.server import CGMServer
from cgm_mcp.models import CGMRequest, TaskType

async def resolve_issue():
    server = CGMServer(config)  # config: your server configuration (see Configuration above)
    request = CGMRequest(
        task_type=TaskType.ISSUE_RESOLUTION,
        repository_name="my-app",
        issue_description="Login fails with special characters",
        repository_context={"path": "./my-app"}
    )
    response = await server._process_issue(request.dict())
    for patch in response.reader_result.patches:
        print(f"File: {patch.file_path}")
        print(f"Changes: {patch.explanation}")

asyncio.run(resolve_issue())
Integration with Claude Desktop
Add to your Claude Desktop MCP configuration:
{
"mcpServers": {
"cgm": {
"command": "python",
"args": ["/path/to/cgm-mcp/main.py"],
"env": {
"CGM_LLM_API_KEY": "your-api-key"
}
}
}
}
Performance Monitoring
Real-time Metrics
GPU Statistics
{
"memory": {
"gpu_available": true,
"platform": "Apple Silicon",
"backend": "Metal Performance Shaders",
"gpu_memory_allocated": 0.6
},
"performance": {
"gpu_entity_matches": 15,
"cache_hit_rate": 94.2
}
}
Cache Performance
{
"analysis_cache": {"size": 45, "maxsize": 100},
"file_cache": {"size": 234, "maxsize": 500},
"stats": {"hits": 156, "misses": 23, "hit_rate": 87.2}
}
Performance Testing
# Check GPU acceleration status
python check_gpu_dependencies.py
# Run performance tests
python gpu_verification.py
python test_multiplatform_gpu.py
Testing
# Run tests
pytest tests/
# Run with coverage
pytest tests/ --cov=cgm_mcp
# Run specific test
pytest tests/test_components.py::TestRewriterComponent
# Performance tests
python test_gpu_acceleration.py
python gpu_verification.py
Performance Features Summary
GPU Acceleration
- Apple Silicon: Native MPS support with 42x cache speedup
- NVIDIA GPU: Full CUDA support with cuPy integration
- AMD GPU: ROCm (Linux) and DirectML (Windows) support
- Auto-detection: Intelligent platform detection and fallback
Smart Caching
- Multi-level: TTL (1hr) + LRU (500 files) + AST (200 trees)
- Intelligent: MD5-based cache keys with hit rate monitoring
- Memory-aware: Automatic cleanup at 80% memory usage
- Performance: 85-95% cache hit rates in production
Concurrent Processing
- Async I/O: Non-blocking file operations with aiofiles
- Batch Processing: Concurrent analysis of multiple files
- Semaphore Control: Configurable concurrency limits (default: 10)
- GPU Queuing: Intelligent GPU task scheduling
Real-time Monitoring
- GPU Stats: Memory usage, platform detection, performance metrics
- Cache Analytics: Hit/miss ratios, size monitoring, cleanup events
- System Metrics: Memory usage, CPU utilization, processing times
- MCP Resources: `cgm://gpu`, `cgm://cache`, `cgm://performance`
Contributing
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- CodeFuse-CGM - Original CGM implementation
- PocketFlow - Framework inspiration
- Model Context Protocol - MCP specification
Support
- Email: cgm-mcp@example.com
- Issues: GitHub Issues
- Discussions: GitHub Discussions
CGM MCP Server - Bringing graph-integrated code intelligence to your development workflow!