MCP LLM Integration Server

An MCP server for integrating local Large Language Models with MCP-compatible clients.

Documentation

MCP LLM Integration Server

This is a Model Context Protocol (MCP) server that allows you to integrate local LLM capabilities with MCP-compatible clients.

Features

  • llm_predict: Process text prompts through a local LLM
  • echo: Echo back text for testing purposes

Setup

  1. Install dependencies:

    uv venv                      # creates .venv; skip if it already exists
    source .venv/bin/activate
    uv pip install mcp
    
  2. Test the server:

    python -c "
    import asyncio
    from main import server, list_tools, call_tool
    
    async def test():
        tools = await list_tools()
        print(f'Available tools: {[t.name for t in tools]}')
        result = await call_tool('echo', {'text': 'Hello!'})
        print(f'Result: {result[0].text}')
    
    asyncio.run(test())
    "
    

Integration with LLM Clients

For Claude Desktop

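Add this to your Claude Desktop configuration (~/.config/claude-desktop/claude_desktop_config.json; on macOS the file is typically ~/Library/Application Support/Claude/claude_desktop_config.json), replacing the example paths with the absolute paths to your own virtual environment and main.py: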

{
  "mcpServers": {
    "llm-integration": {
      "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
      "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
    }
  }
}

For Continue.dev

Add this to your Continue configuration (~/.continue/config.json):

{
  "mcpServers": [
    {
      "name": "llm-integration",
      "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
      "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
    }
  ]
}

For Cline

Add this to your Cline MCP settings:

{
  "llm-integration": {
    "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
    "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
  }
}
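
The three client configurations above differ only in their outer shape; each points at the same virtual-environment interpreter and the same main.py. As a convenience, the entries can be generated from a project directory instead of being edited by hand. The following is a minimal sketch, not part of the project: the helper names (server_entry, claude_desktop_config, continue_config, cline_config) are hypothetical, and it assumes the .venv/bin/python layout used in the examples above.

```python
import json
from pathlib import Path


def server_entry(project_dir: str) -> dict:
    """Build the command/args pair for the project's venv interpreter and main.py."""
    root = Path(project_dir).expanduser()
    return {
        "command": str(root / ".venv" / "bin" / "python"),
        "args": [str(root / "main.py")],
    }


def claude_desktop_config(project_dir: str, name: str = "llm-integration") -> dict:
    # Claude Desktop: servers keyed by name under "mcpServers".
    return {"mcpServers": {name: server_entry(project_dir)}}


def continue_config(project_dir: str, name: str = "llm-integration") -> dict:
    # Continue: a list of servers, each carrying its own "name" field.
    return {"mcpServers": [{"name": name, **server_entry(project_dir)}]}


def cline_config(project_dir: str, name: str = "llm-integration") -> dict:
    # Cline: the bare name-to-server mapping shown above.
    return {name: server_entry(project_dir)}


if __name__ == "__main__":
    # Replace the placeholder with the absolute path to your checkout.
    print(json.dumps(claude_desktop_config("/path/to/mcp-server"), indent=2))
```

The printed JSON can then be merged into the relevant configuration file; the sketch does not write to any file itself.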

Customizing the LLM Integration

To integrate your own local LLM, modify the perform_llm_inference function in main.py:

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Example: Using transformers
    # (max_new_tokens limits only the generated tokens; max_length would
    # also count the prompt)
    # from transformers import pipeline
    # generator = pipeline('text-generation', model='your-model')
    # result = generator(prompt, max_new_tokens=max_tokens)
    # return result[0]['generated_text']

    # Example: Using llama.cpp python bindings
    # from llama_cpp import Llama
    # llm = Llama(model_path="path/to/your/model.gguf")
    # output = llm(prompt, max_tokens=max_tokens)
    # return output['choices'][0]['text']

    # Current placeholder implementation
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"
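
Both example backends above make synchronous, potentially slow calls. Calling them directly inside an async function would block the event loop for the duration of the inference. One common way around this is to hand the blocking call to a worker thread. The following is a minimal sketch, assuming Python 3.9+ for asyncio.to_thread; blocking_generate is a hypothetical stand-in for the real model call and simply reproduces the placeholder output.

```python
import asyncio
import time


def blocking_generate(prompt: str, max_tokens: int) -> str:
    """Hypothetical stand-in for a synchronous model call."""
    time.sleep(0.05)  # simulate slow inference
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"


async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Run the blocking call in a worker thread so the event loop can keep
    # servicing other requests while the model is busy.
    return await asyncio.to_thread(blocking_generate, prompt, max_tokens)


async def main() -> None:
    # Two requests overlap instead of running strictly one after another.
    results = await asyncio.gather(
        perform_llm_inference("first", max_tokens=5),
        perform_llm_inference("second", max_tokens=5),
    )
    for line in results:
        print(line)


if __name__ == "__main__":
    asyncio.run(main())
```

With a real backend, it is usually worth creating the model object once at module level rather than inside the function, so that the weights are not reloaded on every request.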

Testing

Run the server directly to test JSON-RPC communication:

source .venv/bin/activate
python main.py

Then send JSON-RPC requests via stdin, one JSON message per line, starting with an initialize request:

{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test-client", "version": "1.0.0"}}}
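
A session does not end with the initialize request. In the MCP lifecycle the client follows the server's initialize response with a notifications/initialized notification, and can then call methods such as tools/list and tools/call. The sketch below builds those messages as single-line JSON strings suitable for pasting into the running server. The helper names (jsonrpc_request, jsonrpc_notification, session_messages) are hypothetical; the tool name and arguments are taken from the echo example in the Setup section, and the responses depend on what main.py implements.

```python
import json
from typing import Any, Dict, List, Optional


def jsonrpc_request(msg_id: int, method: str,
                    params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request (has an id, expects a response)."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def jsonrpc_notification(method: str) -> str:
    """Serialize a JSON-RPC 2.0 notification (no id, no response expected)."""
    return json.dumps({"jsonrpc": "2.0", "method": method})


def session_messages() -> List[str]:
    """Messages for a short manual session, in the order they should be sent."""
    return [
        jsonrpc_request(1, "initialize", {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "test-client", "version": "1.0.0"},
        }),
        jsonrpc_notification("notifications/initialized"),
        jsonrpc_request(2, "tools/list"),
        jsonrpc_request(3, "tools/call",
                        {"name": "echo", "arguments": {"text": "Hello!"}}),
    ]


if __name__ == "__main__":
    for line in session_messages():
        print(line)
```

Send the lines one at a time and wait for the initialize response before sending the rest, since the server is not expected to handle tool requests before initialization completes.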