MCP LLM Integration Server

An MCP server for integrating local Large Language Models with MCP-compatible clients.

It implements the Model Context Protocol (MCP) and exposes local LLM capabilities as tools that any MCP-compatible client can call.

Features

llm_predict: Process text prompts through a local LLM
echo: Echo back text for testing purposes
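
Both features are exposed as MCP tools, so a client invokes them with a `tools/call` JSON-RPC request. The sketch below builds such requests. The `text` argument for echo is the one used in the Setup test; the `prompt` and `max_tokens` argument names for llm_predict are an assumption based on the signature of perform_llm_inference, and `tool_call_request` is a hypothetical helper, not part of the server.

```python
import json


def tool_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# echo takes a "text" argument (as used in the Setup test).
echo_request = tool_call_request(2, "echo", {"text": "Hello!"})

# Assumed argument names, mirroring perform_llm_inference(prompt, max_tokens).
predict_request = tool_call_request(
    3, "llm_predict", {"prompt": "Summarize MCP in one sentence.", "max_tokens": 100}
)

print(echo_request)
print(predict_request)
```

Each printed line can be sent to the server's stdin once the MCP handshake has completed.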

Setup

Create a virtual environment and install the dependencies:

uv venv
source .venv/bin/activate
uv pip install mcp

Test the server from the project directory:

python - <<'EOF'
import asyncio
from main import list_tools, call_tool

async def test():
    # List the registered tools, then call echo as a smoke test.
    tools = await list_tools()
    print(f'Available tools: {[t.name for t in tools]}')
    result = await call_tool('echo', {'text': 'Hello!'})
    print(f'Result: {result[0].text}')

asyncio.run(test())
EOF

Integration with LLM Clients

The examples below use absolute paths from the original development machine. Replace them with the absolute path to your own virtual environment's Python interpreter and to your copy of main.py.

For Claude Desktop

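Add this to your Claude Desktop configuration file. The path used here is ~/.config/claude-desktop/claude_desktop_config.json; on macOS the file is ~/Library/Application Support/Claude/claude_desktop_config.json, and on Windows it is %APPDATA%\Claude\claude_desktop_config.json: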

{
  "mcpServers": {
    "llm-integration": {
      "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
      "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
    }
  }
}

For Continue.dev

Add this to your Continue configuration (~/.continue/config.json):

{
  "mcpServers": [
    {
      "name": "llm-integration",
      "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
      "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
    }
  ]
}

For Cline

Add this to your Cline MCP settings:

{
  "llm-integration": {
    "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
    "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
  }
}
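
The three snippets above hard-code one machine's paths. As a sketch of how the same entries could be generated for your own checkout, the helpers below build each client's shape from a project directory. The function names are illustrative, and the code assumes a Linux or macOS layout with the virtual environment in `.venv` under the project root; none of this is part of the server itself.

```python
import json
from pathlib import Path

SERVER_NAME = "llm-integration"


def server_entry(project_dir: Path) -> dict:
    """Command and args for launching main.py with the project's venv Python."""
    project_dir = project_dir.expanduser().resolve()
    return {
        # Assumed layout: <project>/.venv/bin/python and <project>/main.py
        "command": str(project_dir / ".venv" / "bin" / "python"),
        "args": [str(project_dir / "main.py")],
    }


def claude_desktop_config(project_dir: Path) -> dict:
    """Shape shown in the Claude Desktop section: servers keyed by name."""
    return {"mcpServers": {SERVER_NAME: server_entry(project_dir)}}


def continue_config(project_dir: Path) -> dict:
    """Shape shown in the Continue.dev section: a list of named servers."""
    return {"mcpServers": [{"name": SERVER_NAME, **server_entry(project_dir)}]}


def cline_config(project_dir: Path) -> dict:
    """Shape shown in the Cline section: a bare name-to-server mapping."""
    return {SERVER_NAME: server_entry(project_dir)}


# Print the Claude Desktop entry for the current directory.
print(json.dumps(claude_desktop_config(Path.cwd()), indent=2))
```

Run it from the project root and merge the printed JSON into the client's existing configuration rather than overwriting the file, since other servers may already be registered there.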

Customizing the LLM Integration

To integrate your own local LLM, modify the perform_llm_inference function in main.py:

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Example: Using transformers
    # from transformers import pipeline
    # generator = pipeline('text-generation', model='your-model')
    # result = generator(prompt, max_new_tokens=max_tokens)  # max_length would also count prompt tokens
    # return result[0]['generated_text']

    # Example: Using llama.cpp python bindings
    # from llama_cpp import Llama
    # llm = Llama(model_path="path/to/your/model.gguf")
    # output = llm(prompt, max_tokens=max_tokens)
    # return output['choices'][0]['text']

    # Current placeholder implementation
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"
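
Both example backends make blocking calls, and constructing the model on every request would be slow. One possible way to structure a real implementation is to load the model once and run inference in a worker thread so the server's event loop stays responsive. The sketch below uses `asyncio.to_thread` (Python 3.9+); `FakeModel` and `load_model` are hypothetical stand-ins for whichever backend you choose.

```python
import asyncio
from functools import lru_cache


class FakeModel:
    """Stand-in for a real backend such as a transformers pipeline or llama_cpp.Llama."""

    def generate(self, prompt: str, max_tokens: int) -> str:
        # A real model would run blocking inference here.
        return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"


@lru_cache(maxsize=1)
def load_model() -> FakeModel:
    """Construct the model on first use; later calls reuse the cached instance."""
    return FakeModel()


async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Note: the first call also pays the (blocking) model load cost.
    model = load_model()
    # Run the blocking call in a worker thread so the event loop is not stalled.
    return await asyncio.to_thread(model.generate, prompt, max_tokens)


print(asyncio.run(perform_llm_inference("Hello", max_tokens=16)))
```

With a large model, it may be preferable to call `load_model()` once at server startup so that the first request does not absorb the load time.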

Testing

Run the server directly to test JSON-RPC communication:

source .venv/bin/activate
python main.py

Then send JSON-RPC requests via stdin, one message per line, starting with initialize:

{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test-client", "version": "1.0.0"}}}
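
The initialize request is only the first step of the MCP handshake: once the server has responded, the client sends a `notifications/initialized` notification before issuing other requests such as `tools/list`. A minimal sketch of driving that sequence from Python over stdio follows. The helper names and the command passed in are illustrative assumptions, and the code assumes the server writes exactly one response line per request to stdout.

```python
import json
import subprocess

INITIALIZE = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "test-client", "version": "1.0.0"},
    },
}
# Notifications carry no "id" and receive no response.
INITIALIZED = {"jsonrpc": "2.0", "method": "notifications/initialized"}
LIST_TOOLS = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}


def frame(message: dict) -> str:
    """Encode one message as a single newline-terminated line of JSON."""
    return json.dumps(message) + "\n"


def handshake_and_list_tools(command: list[str]) -> dict:
    """Start the server, perform the handshake, and return the tools/list response."""
    with subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    ) as proc:
        proc.stdin.write(frame(INITIALIZE))
        proc.stdin.flush()
        json.loads(proc.stdout.readline())  # read and discard the initialize response

        proc.stdin.write(frame(INITIALIZED))
        proc.stdin.write(frame(LIST_TOOLS))
        proc.stdin.flush()
        # Leaving the with-block closes stdin, which lets the server exit.
        return json.loads(proc.stdout.readline())
```

From the project root with the virtual environment created, `handshake_and_list_tools([".venv/bin/python", "main.py"])` should return a response whose result lists the server's tools.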
