CGM MCP Server
A server for CodeFuse-CGM, a graph-integrated large language model designed for repository-level software engineering tasks.
CGM MCP Server
A Model Context Protocol (MCP) server implementation of CodeFuse-CGM (Code Graph Model), providing graph-integrated large language model capabilities for repository-level software engineering tasks.
๐ Features
๐ฏ Two Deployment Modes
1. Full CGM Pipeline (with LLM integration)
- Repository-level Code Analysis: Analyze entire codebases using graph-based representations
- Issue Resolution: Automatically generate code patches to fix bugs and implement features
- Four-Stage Pipeline: Rewriter โ Retriever โ Reranker โ Reader architecture
- Multi-LLM Support: Works with OpenAI, Anthropic, Ollama, Ollama Cloud, LM Studio
2. Model-agnostic Tools (pure analysis, no LLM required) โญ
- Pure Code Analysis: Extract code structure without LLM dependencies
- Universal Integration: Works with ANY AI model or IDE
- No API Keys Required: Zero external dependencies
- High Performance: Cached analysis results for speed
โก Performance & GPU Acceleration
Multi-Platform GPU Support ๐ฏ
- Apple Silicon (M1/M2/M3): Native MPS acceleration (42x cache speedup!)
- NVIDIA GPU: Full CUDA support with cuPy integration
- AMD GPU: ROCm (Linux) and DirectML (Windows) support
- CPU Fallback: Automatic fallback ensures universal compatibility
โ๏ธ Ollama Cloud Support
Ollama Cloud Provider ๐
- Cloud-based Ollama Models: Run Ollama-compatible models in the cloud
- API Compatibility: Compatible with Ollama API format
- Easy Configuration: Works with standard Ollama model names
- Secure Access: Supports API key authentication
Advanced Caching System ๐๏ธ
- Multi-level Caching: TTL cache (1hr) + LRU cache (500 entries) + AST cache
- Smart Cache Keys: MD5-based cache keys for efficient lookups
- Memory Management: Real-time monitoring with automatic cleanup
- Performance Stats: Detailed hit/miss ratios and timing metrics
Concurrent Processing ๐
- Async File I/O: Non-blocking file operations with aiofiles
- Batch Processing: Concurrent analysis of multiple files
- GPU-Accelerated: Entity matching and text processing on GPU
- Intelligent Scheduling: Semaphore-controlled concurrency limits
๐ง Common Features
- MCP Integration: Compatible with Claude Desktop, VS Code, Cursor, and other MCP clients
- Graph-based Context: Leverages code structure and relationships for better understanding
- Multiple Output Formats: Structured JSON, Markdown, and Prompt formats
- Real-time Monitoring: GPU usage, memory consumption, and performance metrics
๐ Table of Contents
- Installation
- Quick Start
- Performance & GPU Setup
- Configuration
- Usage
- Architecture
- API Reference
- Examples
- Performance Monitoring
- Contributing
- License
๐ Installation
Prerequisites
- Python 3.8+
- pip or conda
Install from Source
# Clone the repository
git clone https://github.com/your-org/cgm-mcp.git
cd cgm-mcp
# Install dependencies
pip install -r requirements.txt
# Or install in development mode
pip install -e .
GPU Acceleration Setup (Optional)
๐ Apple Silicon (M1/M2/M3) - Automatic
# No additional setup needed!
# MPS (Metal Performance Shaders) is automatically detected and enabled
pip install torch torchvision torchaudio # Usually already installed
๐ข NVIDIA GPU
# Install CUDA-enabled PyTorch
pip install torch --index-url https://download.pytorch.org/whl/cu118
# Optional: Enhanced GPU features
pip install cupy-cuda11x # For CUDA 11.x
# or
pip install cupy-cuda12x # For CUDA 12.x
๐ด AMD GPU
# Linux (ROCm)
pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
# Windows (DirectML)
pip install torch-directml
Install from PyPI (Coming Soon)
pip install cgm-mcp
โก Quick Start
1. Setup Environment
# Run setup script
./scripts/setup.sh
# Copy example environment file
cp .env.example .env
2. Choose Your Model Provider
Option A: Use Cloud Models (OpenAI/Anthropic)
# Edit .env with your API keys
export CGM_LLM_PROVIDER=openai
export CGM_LLM_API_KEY=your-openai-api-key
export CGM_LLM_MODEL=gpt-4
Option B: Use Local Models (Recommended)
# Install and start Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve
# Download recommended model
ollama pull deepseek-coder:6.7b
# Start with local model
./scripts/start_local.sh --provider ollama --model deepseek-coder:6.7b
Option C: Use LM Studio
# Download and start LM Studio
# Load deepseek-coder-6.7b-instruct model
# Start local server
# Start CGM with LM Studio
./scripts/start_local.sh --provider lmstudio
3. Start the Server
# Start MCP server (cloud models)
python main.py
# Start with local models
./scripts/start_local.sh
# Or with custom config
python main.py --config config.local.json --log-level DEBUG
4. Test with Example
# Run example usage
python examples/example_usage.py
# Check GPU acceleration status
python check_gpu_dependencies.py
โก Performance & GPU Setup
๐ Check Your GPU Status
# Run the GPU dependency checker
python check_gpu_dependencies.py
Expected output for Apple Silicon:
๐ OPTIMAL: Apple Silicon GPU acceleration is active!
โข MPS backend enabled
โข No additional dependencies needed
โข CuPy warnings can be ignored
๐ Performance Benchmarks
| Platform | Entity Matching | Text Processing | Cache Hit Rate |
|---|---|---|---|
| Apple Silicon (MPS) | 42x speedup (cached) | ~0.001s (200 files) | 95%+ |
| NVIDIA CUDA | 5-10x speedup | 3-5x speedup | 90%+ |
| AMD ROCm | 3-8x speedup | 2-4x speedup | 90%+ |
| CPU Fallback | Baseline | Baseline | 85%+ |
๐ ๏ธ Performance Features
Smart Caching System
- TTL Cache: 1-hour expiration for analysis results
- LRU Cache: 500 most recent files kept in memory
- AST Cache: 200 parsed syntax trees cached
- Embedding Cache: GPU-accelerated similarity vectors
Memory Management
- Real-time Monitoring: Track GPU and system memory usage
- Automatic Cleanup: Clear caches when memory usage > 80%
- Unified Memory: Apple Silicon's shared CPU/GPU memory
- Memory Pools: Efficient GPU memory allocation
Concurrent Processing
- Async File I/O: Non-blocking file operations
- Batch Processing: Process multiple files simultaneously
- Semaphore Control: Limit concurrent operations (default: 10)
- GPU Queuing: Intelligent GPU task scheduling
๐ง Performance Tuning
Environment Variables
# GPU Configuration
export CGM_USE_GPU=true # Enable GPU acceleration
export CGM_GPU_BATCH_SIZE=1024 # Batch size for GPU operations
export CGM_SIMILARITY_THRESHOLD=0.1 # Entity similarity threshold
export CGM_CACHE_EMBEDDINGS=true # Cache embedding vectors
# Memory Management
export CGM_MAX_CACHE_SIZE=500 # Maximum cached files
export CGM_MEMORY_CLEANUP_THRESHOLD=80 # Memory cleanup trigger (%)
export CGM_GPU_MEMORY_FRACTION=0.8 # GPU memory usage limit
Configuration File
{
"gpu": {
"use_gpu": true,
"batch_size": 1024,
"max_sequence_length": 512,
"similarity_threshold": 0.1,
"cache_embeddings": true,
"gpu_memory_fraction": 0.8
},
"performance": {
"max_concurrent_files": 10,
"cache_ttl_seconds": 3600,
"max_file_cache_size": 500,
"memory_cleanup_threshold": 80
}
}
โ๏ธ Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
CGM_LLM_PROVIDER | LLM provider (openai, anthropic, ollama, ollama_cloud, lmstudio, mock) | openai |
CGM_LLM_API_KEY | API key for LLM provider (not needed for local models) | Required for cloud |
CGM_LLM_MODEL | Model name | gpt-4 |
CGM_LLM_API_BASE | Custom API base URL (for local models) | Provider default |
CGM_LLM_TEMPERATURE | Generation temperature | 0.1 |
CGM_LOG_LEVEL | Logging level | INFO |
Configuration File
Create a config.json file:
Cloud Models Configuration
{
"llm": {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.1,
"max_tokens": 4000
}
}
Local Models Configuration
{
"llm": {
"provider": "ollama",
"model": "deepseek-coder:6.7b",
"api_base": "http://localhost:11434",
"temperature": 0.1,
"max_tokens": 4000
},
"graph": {
"max_nodes": 5000,
"max_edges": 25000,
"cache_enabled": true
},
"server": {
"log_level": "INFO",
"max_concurrent_tasks": 3
}
}
๐ Usage
MCP Tools
The server provides the following MCP tools:
Analysis Tools
cgm_analyze_repository
Analyze repository structure and extract code entities with GPU acceleration.
Parameters:
repository_path: Path to the repositoryquery: Search query for relevant codeanalysis_scope: Scope of analysis (full,focused,minimal)max_files: Maximum number of files to analyze
cgm_get_file_content
Get detailed file content and analysis with concurrent processing.
Parameters:
repository_path: Path to the repositoryfile_paths: List of file paths to analyze
cgm_find_related_code
Find code entities related to a specific entity using GPU-accelerated similarity matching.
Parameters:
repository_path: Path to the repositoryentity_name: Name of the entity to find relations forrelation_types: Types of relations to include (optional)
cgm_extract_context
Extract structured context for external model consumption.
Parameters:
repository_path: Path to the repositoryquery: Query for context extractionformat: Output format (structured,markdown,prompt)
Performance Tools
clear_gpu_cache
Clear GPU caches to free memory.
Parameters: None
Legacy Tools
cgm_process_issue
Process a repository issue using the CGM framework.
Parameters:
task_type: Type of task (issue_resolution,code_analysis,bug_fixing,feature_implementation)repository_name: Name of the repositoryissue_description: Description of the issuerepository_context: Optional repository context
Example:
{
"task_type": "issue_resolution",
"repository_name": "my-project",
"issue_description": "Authentication fails with special characters in password",
"repository_context": {
"path": "/path/to/repository",
"language": "Python",
"framework": "Django"
}
}
cgm_get_task_status
Get the status of a running task.
Parameters:
task_id: Task ID to check
cgm_health_check
Check server health status.
MCP Resources
System Resources
cgm://health: Server health informationcgm://tasks: List of active tasks
Performance Resources
cgm://cache: Cache statistics and hit/miss ratioscgm://performance: Server performance and memory usage metricscgm://gpu: GPU acceleration status and memory usage
Example Resource Access
# Check GPU status
curl "cgm://gpu"
# Monitor cache performance
curl "cgm://cache"
# View performance metrics
curl "cgm://performance"
๐ Architecture
CGM follows a four-stage pipeline:
graph LR
A[Issue] --> B[Rewriter]
B --> C[Retriever]
C --> D[Reranker]
D --> E[Reader]
E --> F[Code Patches]
G[Code Graph] --> C
G --> D
G --> E
Components
- Rewriter: Analyzes issues and extracts relevant entities and keywords
- Retriever: Locates relevant code subgraphs based on extracted information
- Reranker: Ranks files by relevance to focus analysis
- Reader: Generates specific code patches to resolve issues
Graph Builder
Constructs repository-level code graphs by analyzing:
- File structure and dependencies
- Class and function definitions
- Import relationships
- Code semantics and documentation
๐ API Reference
Core Models
CGMRequest
class CGMRequest(BaseModel):
task_type: TaskType
repository_name: str
issue_description: str
repository_context: Optional[Dict[str, Any]] = None
CGMResponse
class CGMResponse(BaseModel):
task_id: str
status: str
rewriter_result: Optional[RewriterResponse]
retriever_result: Optional[RetrieverResponse]
reranker_result: Optional[RerankerResponse]
reader_result: Optional[ReaderResponse]
processing_time: float
CodePatch
class CodePatch(BaseModel):
file_path: str
original_code: str
modified_code: str
line_start: int
line_end: int
explanation: str
๐ก Examples
Basic Issue Resolution
import asyncio
from cgm_mcp.server import CGMServer
from cgm_mcp.models import CGMRequest, TaskType
async def resolve_issue():
server = CGMServer(config)
request = CGMRequest(
task_type=TaskType.ISSUE_RESOLUTION,
repository_name="my-app",
issue_description="Login fails with special characters",
repository_context={"path": "./my-app"}
)
response = await server._process_issue(request.dict())
for patch in response.reader_result.patches:
print(f"File: {patch.file_path}")
print(f"Changes: {patch.explanation}")
Integration with Claude Desktop
Add to your Claude Desktop MCP configuration:
{
"mcpServers": {
"cgm": {
"command": "python",
"args": ["/path/to/cgm-mcp/main.py"],
"env": {
"CGM_LLM_API_KEY": "your-api-key"
}
}
}
}
๐ Performance Monitoring
Real-time Metrics
GPU Statistics
{
"memory": {
"gpu_available": true,
"platform": "Apple Silicon",
"backend": "Metal Performance Shaders",
"gpu_memory_allocated": 0.6
},
"performance": {
"gpu_entity_matches": 15,
"cache_hit_rate": 94.2
}
}
Cache Performance
{
"analysis_cache": {"size": 45, "maxsize": 100},
"file_cache": {"size": 234, "maxsize": 500},
"stats": {"hits": 156, "misses": 23, "hit_rate": 87.2}
}
Performance Testing
# Check GPU acceleration status
python check_gpu_dependencies.py
# Run performance tests
python gpu_verification.py
python test_multiplatform_gpu.py
๐งช Testing
# Run tests
pytest tests/
# Run with coverage
pytest tests/ --cov=cgm_mcp
# Run specific test
pytest tests/test_components.py::TestRewriterComponent
# Performance tests
python test_gpu_acceleration.py
python gpu_verification.py
๐ Performance Features Summary
โก GPU Acceleration
- Apple Silicon: Native MPS support with 42x cache speedup
- NVIDIA GPU: Full CUDA support with cuPy integration
- AMD GPU: ROCm (Linux) and DirectML (Windows) support
- Auto-detection: Intelligent platform detection and fallback
๐๏ธ Smart Caching
- Multi-level: TTL (1hr) + LRU (500 files) + AST (200 trees)
- Intelligent: MD5-based cache keys with hit rate monitoring
- Memory-aware: Automatic cleanup at 80% memory usage
- Performance: 85-95% cache hit rates in production
๐ Concurrent Processing
- Async I/O: Non-blocking file operations with aiofiles
- Batch Processing: Concurrent analysis of multiple files
- Semaphore Control: Configurable concurrency limits (default: 10)
- GPU Queuing: Intelligent GPU task scheduling
๐ Real-time Monitoring
- GPU Stats: Memory usage, platform detection, performance metrics
- Cache Analytics: Hit/miss ratios, size monitoring, cleanup events
- System Metrics: Memory usage, CPU utilization, processing times
- MCP Resources:
cgm://gpu,cgm://cache,cgm://performance
๐ค Contributing
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
๐ License
This project is licensed under the MIT License - see the LICENSE file for details.
๐ Acknowledgments
- CodeFuse-CGM - Original CGM implementation
- PocketFlow - Framework inspiration
- Model Context Protocol - MCP specification
๐ Support
- ๐ง Email: [email protected]
- ๐ Issues: GitHub Issues
- ๐ฌ Discussions: GitHub Discussions
CGM MCP Server - Bringing graph-integrated code intelligence to your development workflow! ๐
Server Terkait
Alpha Vantage MCP Server
sponsorAccess financial market data: realtime & historical stock, ETF, options, forex, crypto, commodities, fundamentals, technical indicators, & more
pyATS
Interact with network devices using Cisco's pyATS and Genie libraries for model-driven automation.
MCP Gemini CLI
Integrate with Google Gemini through its command-line interface (CLI).
Dev Manager
A development management tool for project planning, task management, and development workflows.
ADT MCP Server
An MCP server for ABAP Development Tools (ADT) that exposes various ABAP repository-read tools over a standardized interface.
Sensei MCP
Expert guidance for Dojo and Cairo development on Starknet, specializing in the Dojo ECS framework for building onchain worlds.
Hackle
Query A/B test data using the Hackle API.
CDK API MCP Server
Provides an offline AWS CDK API reference.
Lokalise MCP Tool
Add translation keys to Lokalise projects. Requires a Lokalise API key.
MCP Gemini CLI
A command-line interface wrapper for the Google Gemini API, enabling interaction with Gemini's Search and Chat tools.
OpenAPI to MCP Server
A server that converts OpenAPI specifications into the Model Context Protocol (MCP).