MCP Platform
Local MCP runtime with multi-agent orchestration, distributed tool servers, and ML-powered media recommendations.
⚠️ Experimental — intended for personal and experimental use only, not for production deployment.
Table of Contents
- Prerequisites
- 1. Quick Start
- 2. Using MCP Servers with Other Clients
- 3. Client Configuration
- 4. Adding Custom Tools
- 5. Distributed Mode (A2A Protocol)
- 6. Testing
- 7. Architecture
- 8. Intent Patterns & Troubleshooting
- License
Prerequisites
- Python 3.12+
- 16GB+ RAM recommended
- Ollama installed
1. Quick Start
Get the client running in 3 steps:
Install Dependencies
Clone the repository, then:
cd mcp-platform
# Create virtual environment
python -m venv .venv
# Activate (Linux/macOS)
source .venv/bin/activate
# Activate (Windows PowerShell)
.venv\Scripts\activate
# Install requirements - this will take a while
pip install -r requirements.txt
LLM Backend
Ollama
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Start Ollama server
ollama serve
# Download a model (use 14B+ for best results)
ollama pull qwen2.5:14b-instruct-q4_K_M
Optional: GGUF (local model files)
# Download a GGUF model (example)
wget https://huggingface.co/TheRains/Qwen2.5-14B-Instruct-Q4_K_M-GGUF/resolve/main/qwen2.5-14b-instruct-q4_k_m.gguf
# Register the model
# (After starting the client, use the `:gguf add` command with the path to the downloaded file)
Start the Client
python client.py
Access web UI at: http://localhost:9000/client/ui/index.html
That's it! The client auto-discovers all MCP servers and tools.
2. Using MCP Servers with Other Clients
Use these MCP servers with Claude Desktop, Cline, or any MCP-compatible client.
Example Configuration
Add to your MCP client config (e.g., claude_desktop_config.json):
{
"mcpServers": {
"code_assistant": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/code_assistant/server.py"]
},
"code_review": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/code_review/server.py"]
},
"code_runner": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/code_runner/server.py"]
},
"github": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/github/server.py"]
},
"google": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/google/server.py"]
},
"image": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/image/server.py"]
},
"location": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/location/server.py"]
},
"plex": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/plex/server.py"]
},
"rag": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/rag/server.py"]
},
"system": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/system_tools/server.py"]
},
"text": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/text/server.py"]
},
"trilium": {
"command": "/path/to/mcp-platform/.venv/bin/python",
"args": ["/path/to/mcp-platform/servers/trilium/server.py"]
}
}
}
Windows paths:
"command": "C:\\path\\to\\mcp-platform\\.venv\\Scripts\\python.exe"
Available servers:
- code_assistant - AI-powered code analysis, generation, and refactoring (12 tools)
- code_review - Code review, search, and bug fixing (3 tools)
- code_runner - Python/bash execution sandbox (4 tools)
- github - GitHub repo clone, browse, and cleanup ⚠️ Requires GITHUB_TOKEN for private repos; public repos work without it
- google - Gmail + Google Calendar (9 tools) ⚠️ Requires one-time OAuth setup — see Google Setup
- image - Image search, analysis, and AI generation (6 tools) ⚠️ Requires SERPER_API_KEY for web image search; AI generation is free with no API key
- location - Weather, time, location (3 tools)
- plex - Media library + ML recommendations (18 tools) ⚠️ Requires PLEX_URL, PLEX_TOKEN
- rag - Vector search and management (7 tools) ⚠️ Requires Ollama + bge-large
- system - System info and processes (3 tools)
- text - Text processing and web search (8 tools)
- trilium - Trilium notes integration (11 tools) ⚠️ Requires TRILIUM_URL, TRILIUM_TOKEN
3. Client Configuration
Environment Variables
Create .env in project root:
# === LLM Backend ===
OLLAMA_BASE_URL=http://127.0.0.1:11434 # Ollama server URL (default: http://127.0.0.1:11434)
OLLAMA_VISION_MODEL=qwen3-vl:8b-instruct
MAX_MESSAGE_HISTORY=30 # Chat history limit (default: 20)
LLM_TEMPERATURE=0.3 # Model temperature 0 to 1 (default: 0.3)
# === GGUF Configuration (if using GGUF backend) ===
GGUF_GPU_LAYERS=-1 # -1 = all GPU, 0 = CPU only, N = N layers on GPU
GGUF_CONTEXT_SIZE=4096 # Context window size
GGUF_BATCH_SIZE=512 # Batch size for processing
# === API Keys (optional services) ===
PLEX_URL=http://localhost:32400 # Plex server URL
PLEX_TOKEN=your_token_here # Get from Plex account settings
TRILIUM_URL=http://localhost:8888
TRILIUM_TOKEN=your_token_here
SHASHIN_BASE_URL=http://localhost:6624/
SHASHIN_API_KEY=your_key_here
SERPER_API_KEY=your_key_here # Serper image search (https://serper.dev/api-keys)
OLLAMA_TOKEN=your_token_here # Ollama API key (https://ollama.com/settings/keys)
# === A2A Protocol (optional distributed mode) ===
A2A_ENDPOINTS=http://localhost:8010 # Comma-separated endpoints
A2A_EXPOSED_TOOLS= # Tool categories to expose (empty = all)
# === Performance Tuning (optional) ===
CONCURRENT_LIMIT=3 # Parallel ingestion jobs (default: 1)
EMBEDDING_BATCH_SIZE=50 # Embeddings per batch (default: 20)
DB_FLUSH_BATCH_SIZE=50 # DB inserts per batch (default: 30)
# === Tool Control (optional) ===
DISABLED_TOOLS=plex:* # Disable specific tools/categories
# === Location Default - uses your IP location otherwise (optional) ===
DEFAULT_CITY=Vancouver
DEFAULT_STATE=BC
DEFAULT_COUNTRY=Canada
Recommended Setup
Use Ollama for easy setup. Download and install Ollama at https://ollama.com/download and run:
ollama serve
Recommended LLM
ollama pull qwen2.5:14b-instruct-q4_K_M
RAG requires Ollama + bge-large: If bge-large has not been pulled from Ollama, RAG ingestion and semantic search will not work.
ollama pull bge-large
Image tools require Ollama + a vision model: If a vision model has not been pulled from Ollama, image tools will not work.
ollama pull qwen3-vl:8b-instruct
A minimal .env to get started with the core features:
# === Vision ===
OLLAMA_VISION_MODEL=qwen3-vl:8b-instruct
# === Disable unused servers ===
DISABLED_TOOLS=plex:*,image_tools:shashin_analyze,shashin_random,shashin_search
# === API Keys ===
OLLAMA_TOKEN=<token> # Free at https://ollama.com — required for web_search_tool
SERPER_API_KEY=<key> # Required for web_image_search_tool (https://serper.dev)
Configuration Details
LLM Backend:
- ollama: Uses the Ollama server (requires ollama serve running)
- gguf: Uses local GGUF model files (GPU recommended)
GGUF GPU Layers:
- -1: All layers on GPU (fastest; requires the model to fit in VRAM)
- 0: CPU only (slow but works with any model size)
- 20: 20 layers on GPU (a balance for large models on limited VRAM)
Performance Tuning:
- EMBEDDING_BATCH_SIZE=50 + DB_FLUSH_BATCH_SIZE=50 = ~6x faster RAG ingestion
- With 12GB VRAM, both can be increased to 100 for even faster processing
- CONCURRENT_LIMIT=2 enables parallel media ingestion
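For intuition, batched ingestion simply slices the work into fixed-size groups before each embedding call. The sketch below is illustrative only; the helper name and the way the platform consumes EMBEDDING_BATCH_SIZE are assumptions, not the project's actual code.

```python
import os
from itertools import islice

# Illustrative sketch (not the platform's real ingestion code):
# EMBEDDING_BATCH_SIZE controls how many chunks go to the embedder at once.
EMBEDDING_BATCH_SIZE = int(os.getenv("EMBEDDING_BATCH_SIZE", "20"))

def batched(items, size):
    """Yield successive fixed-size batches from any iterable."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch

chunks = [f"chunk-{i}" for i in range(45)]
sizes = [len(b) for b in batched(chunks, EMBEDDING_BATCH_SIZE)]
# With the default batch size of 20, 45 chunks split into batches of 20, 20, and 5.
```

Larger batches mean fewer round trips to the embedder, which is where the speedup claimed above comes from.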
Disabled Tools:
- Format: category:tool_name or category:*
- Example: DISABLED_TOOLS=todo:delete_all_todo_items,system:*
- Disabled tools are hidden from the :tools list and return an error if called
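As a sketch of the semantics, category:* patterns behave like glob matches against category:tool_name keys. The function below is a hypothetical illustration of that rule, not the platform's actual implementation.

```python
import fnmatch

# Hypothetical sketch of DISABLED_TOOLS matching; the real code may differ.
def is_disabled(category: str, tool: str, disabled_spec: str) -> bool:
    """Return True if category:tool matches any comma-separated pattern."""
    patterns = [p.strip() for p in disabled_spec.split(",") if p.strip()]
    return any(fnmatch.fnmatch(f"{category}:{tool}", p) for p in patterns)

spec = "todo:delete_all_todo_items,system:*"
print(is_disabled("system", "get_processes", spec))  # True (system:* matches)
print(is_disabled("todo", "list_todo_items", spec))  # False (only delete_all is listed)
```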
Feature Requirements
Some features require additional setup before they will function. The table below summarizes what's needed:
| Feature | Required env vars | Additional setup |
|---|---|---|
| Gmail + Google Calendar | — | One-time Google OAuth setup — see Google Setup |
| RAG ingestion & search | — | Ollama running + bge-large pulled |
| RAG reranking (optional) | — | cross-encoder/ms-marco-MiniLM-L-6-v2 pulled — improves result ranking, falls back to cosine if absent |
| Plex media library | PLEX_URL, PLEX_TOKEN | Plex Media Server running |
| Plex ingestion & recommendations | PLEX_URL, PLEX_TOKEN | Ollama running + bge-large pulled |
| Ollama web search | OLLAMA_TOKEN | Ollama account + API key |
| Image search | SERPER_API_KEY | Serper account + API key (https://serper.dev) |
| AI image generation | — | Free via Pollinations.ai — no account or API key required |
| Trilium notes | TRILIUM_URL, TRILIUM_TOKEN | Trilium server running |
| Shashin photo gallery | SHASHIN_BASE_URL, SHASHIN_API_KEY | Shashin server running |
| A2A distributed mode | A2A_ENDPOINTS | Remote A2A server running |
Available Commands
These work in both CLI and web UI:
:commands - List all available commands
:clear sessions - Clear all chat history
:clear session <id> - Clear session
:sessions - List all sessions
:stop - Stop current operation
:stats - Show performance metrics
:tools - List available tools (hides disabled)
:tools --all - Show all tools including disabled
:tool <n> - Get tool description
:model - List all available models
:model <n> - Switch to a model (auto-detects backend)
:models - List models (legacy)
:gguf add <path> - Register a GGUF model
:gguf remove <alias> - Remove a GGUF model
:gguf list - List registered GGUF models
:a2a on - Enable agent-to-agent mode
:a2a off - Disable agent-to-agent mode
:a2a status - Check A2A system status
:health - Health overview of all servers and tools
:env - Show environment configuration
API Setup
AI Image Generation:
Uses Pollinations.ai — completely free, no account or API key required. Works out of the box.
use generate_image_tool: prompt="A cat holding a sign that says hello" [model=""] [width=""] [height=""] [seed=""]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
prompt | string | required | Text description of the image to generate |
model | string | flux | Model to use — see table below |
width | integer | 1024 | Image width in pixels |
height | integer | 1024 | Image height in pixels |
seed | integer | random | Fixed seed for reproducible results — same seed + same prompt = same image every time. Useful when you like a result and want to reproduce it exactly, or tweak the prompt while keeping the same overall composition. Any whole number works e.g. 42, 12345 |
Available models (free — no API key needed):
| Model | Description |
|---|---|
flux | Flux Schnell — fast, high-quality (default) |
zimage | Z-Image Turbo — fast Flux with 2x upscaling |
gptimage | GPT Image 1 Mini — OpenAI image model |
gptimage-large | GPT Image 1.5 — OpenAI advanced model, highest fidelity |
kontext | FLUX.1 Kontext — in-context editing, supports image input |
klein | FLUX.2 Klein 4B — fast generation and editing |
qwen-image | Qwen Image Plus — Alibaba text-to-image |
wan-image | Wan 2.7 Image — Alibaba text-to-image up to 2K |
⚠️ All other models (nanobanana, grok-imagine, seedream, nova-canvas, etc.) are paid-only and require Pollen credits at enter.pollinations.ai.
Ollama Search API (web search):
- Sign up at https://ollama.com/
- Get API key from https://ollama.com/settings/keys
- Add to .env: OLLAMA_TOKEN=your_key
Plex Media Server:
- Open Plex web interface
- Settings → Network → Show Advanced
- Copy server URL (e.g., http://192.168.1.100:32400)
- Get token: Settings → Account → Show XML → Copy authToken
- Add to .env:
PLEX_URL=http://your_server_ip:32400
PLEX_TOKEN=your_token
⚠️ Without PLEX_URL and PLEX_TOKEN, all Plex tools (library browsing, ingestion, ML recommendations) will be unavailable. The server will load but calls will return a configuration error.
Google (Gmail + Calendar): See Google Setup below.
Google Setup
One-time setup. After completing these steps the server runs headlessly — the refresh token does not expire.
Step 1 — Google Cloud Console
- Go to https://console.cloud.google.com/
- Create a new project — name it anything (e.g. mcp-platform)
- APIs & Services → Library — search for and enable both:
- Gmail API
- Google Calendar API
- Google Auth Platform → Get started (left menu)
- Fill in app name and support email → Save
- User type: External
- Google Auth Platform → Audience
- Click Publish App to set publishing status to In Production
- This prevents refresh tokens from expiring every 7 days
- Google Auth Platform → Clients → Create Client
- Application type: Desktop app → Create
- Click Download JSON on the new client
- Save the file as credentials.json
Step 2 — Place credentials.json
servers/google/credentials.json
Step 3 — Install dependencies
pip install google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client
Step 4 — Authenticate
Make sure mcp-platform is stopped, then run from the project root:
.venv/bin/python auth_google.py
auth_google.py validates credentials.json, opens a browser for OAuth consent, and writes servers/google/token.json. If the token is invalid when the client starts, auth_google.py is launched automatically; no manual intervention is needed.
token.json is written to servers/google/token.json. Because the app is published to production, this token will not expire unless unused for 6 months or your Google password changes.
Step 5 — Restart the client
python client.py
The google server is auto-discovered. Gmail and Calendar tools are now available.
⚠️ If the token becomes invalid — delete servers/google/token.json and re-run the Step 4 script.
⚠️ Refresh tokens can still expire if you don't use Google tools for 6 months, or if you change your Google password.
4. Adding Custom Tools
Step 1: Create the server file
mkdir servers/my_tool
touch servers/my_tool/server.py
Step 2: Implement the tool with @tool_meta
This is the only step that requires your input. Everything else — routing, the tools panel, the capability registry — is automatic.
# servers/my_tool/server.py
import sys
import json
import logging
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent.parent
sys.path.insert(0, str(PROJECT_ROOT))

from dotenv import load_dotenv
load_dotenv(PROJECT_ROOT / ".env", override=True)

from mcp.server.fastmcp import FastMCP
from tools.tool_control import check_tool_enabled

try:
    from client.tool_meta import tool_meta
except Exception:
    # Fallback no-op decorator so the server can still run standalone
    def tool_meta(**kwargs):
        def decorator(fn):
            return fn
        return decorator

LOG_DIR = PROJECT_ROOT / "logs"
LOG_DIR.mkdir(exist_ok=True)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("mcp_my_tool_server")

mcp = FastMCP("my-tool-server")

@mcp.tool()
@check_tool_enabled(category="my_tool")
@tool_meta(
    tags=["read", "search"],               # what the tool does — see tag vocabulary below
    triggers=["my keyword", "my phrase"],  # natural language that routes to this tool
    example='use my_function: arg1=""',    # pre-fill text shown in the tools panel
    text_fields=["content"],               # which response field contains the main text
)
def my_function(arg1: str, arg2: int = 0) -> str:
    """
    Short description of what this tool does.

    Args:
        arg1 (str): Description of arg1
        arg2 (int, optional): Description of arg2

    Returns:
        JSON string with results.
    """
    logger.info(f"🛠 my_function called: {arg1}")
    result = {"content": f"Processed {arg1} with {arg2}"}
    return json.dumps(result, indent=2)

if __name__ == "__main__":
    mcp.run(transport="stdio")
That's it. Restart the client and the tool is live — routed, badged, and registered automatically.
@tool_meta field reference
| Field | Required | Description |
|---|---|---|
tags | ✅ | Capability tags — see vocabulary below |
triggers | ✅ | Natural language words/phrases that route to this tool |
example | recommended | Pre-fill text shown in the tools panel UI |
text_fields | if needed | Response fields containing real text content (e.g. ["content", "preview"]). Only needed if your tool returns a list where the text is in an unusually named field |
rate_limit | no | "100/hour", "10/day", "ollama", or None |
idempotent | no | True if calling twice with the same args has no side effects (default: True) |
intent_category | no | Override routing group name — useful when multiple tools with the same tags need separate routing (e.g. "shashin_search" vs "shashin_analyze" both tagged ["media"]) |
Tag vocabulary
Tags serve two purposes:
- Routing — when a query matches an intent, the router calls capability_registry.filter_by_tags([...]) to find all tools in that group. Any new tool with the right tags is automatically included — no hardcoded tool name lists.
- Tools panel — tags render as coloured badges on each tool card in the UI.
| Tag | Meaning |
|---|---|
read | Tool only reads data, never writes |
write | Tool creates or modifies data |
destructive | Tool deletes or irreversibly changes data |
search | Primary purpose is search or query |
external | Calls an external API or service |
vision | Processes image or visual input |
media | Operates on audio/video/image files |
calendar | Interacts with calendar data |
email | Interacts with email |
notes | Interacts with note-taking systems |
code | Operates on source code |
system | Interacts with the OS or hardware |
rag | Interacts with the RAG vector store |
ai | Calls an LLM or ML model |
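Tag-based routing can be sketched in a few lines. The registry contents and the function shape below are illustrative assumptions, not the actual capability_registry API.

```python
# Hypothetical registry mapping tool names to their @tool_meta tags.
REGISTRY = {
    "my_function":      {"read", "search"},
    "web_image_search": {"search", "external", "media"},
    "generate_image":   {"write", "ai", "media"},
}

def filter_by_tags(required: list[str]) -> list[str]:
    """Return every registered tool that carries all of the required tags."""
    want = set(required)
    return sorted(name for name, tags in REGISTRY.items() if want <= tags)

print(filter_by_tags(["media"]))            # both media tools
print(filter_by_tags(["search", "media"]))  # only web_image_search
```

Because lookup is by tag set rather than by name, a newly added tool with matching tags joins the routing group with no code changes, which is the point made above.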
Step 3: Create skill documentation (optional)
mkdir -p servers/my_tool/skills
touch servers/my_tool/skills/my_feature.md
Step 4: Add external MCP servers (optional)
To connect external or third-party MCP servers, create mcp-platform/external_servers.json.
The client auto-discovers this file on startup — no code changes needed.
SSE transport (remote HTTP event stream):
{
"external_servers": {
"deepwiki": {
"transport": "sse",
"url": "https://mcp.deepwiki.com/mcp",
"enabled": true
}
}
}
HTTP transport (streamable HTTP, e.g. authenticated APIs):
{
"external_servers": {
"neon": {
"transport": "http",
"url": "https://mcp.neon.tech/mcp",
"enabled": true,
"headers": { "Authorization": "Bearer <$TOKEN>" }
}
}
}
Header authentication uses the ES_{SERVER_NAME}_{PLACEHOLDER} convention in .env:
# Server "mcpserver" with <$TOKEN> → ES_MCPSERVER_TOKEN
# Server "mcpserver" with <$API_KEY> → ES_MCPSERVER_API_KEY
ES_MCPSERVER_TOKEN=your_token_here
ES_MCPSERVER_API_KEY=your_api_key_here
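The substitution rule can be sketched as follows; the resolver below is a hypothetical illustration of the ES_{SERVER_NAME}_{PLACEHOLDER} convention, not the client's actual code.

```python
import re

# Hypothetical resolver: replace <$NAME> in header values with the
# matching ES_{SERVER}_{NAME} environment variable.
def resolve_headers(server: str, headers: dict[str, str],
                    env: dict[str, str]) -> dict[str, str]:
    def sub(value: str) -> str:
        return re.sub(
            r"<\$([A-Z0-9_]+)>",
            lambda m: env.get(f"ES_{server.upper()}_{m.group(1)}", m.group(0)),
            value,
        )
    return {k: sub(v) for k, v in headers.items()}

env = {"ES_NEON_TOKEN": "abc123"}
resolved = resolve_headers("neon", {"Authorization": "Bearer <$TOKEN>"}, env)
# resolved["Authorization"] == "Bearer abc123"
```

An unresolved placeholder is left intact in this sketch, which makes missing env vars easy to spot.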
Stdio transport (local process servers):
{
"external_servers": {
"pycharm": {
"transport": "stdio",
"command": "/usr/lib/jvm/jdk-17/bin/java",
"args": ["-classpath", "/path/to/mcpserver.jar", "com.intellij.mcpserver.stdio.McpStdioRunnerKt"],
"env": { "IJ_MCP_SERVER_PORT": "64342" },
"enabled": true
}
}
}
Field reference:
| Field | Required | Description |
|---|---|---|
transport | ✅ | "sse", "http", or "stdio" |
url | SSE/HTTP only | Full URL to the endpoint |
headers | No | Request headers — use <$PLACEHOLDER> for secrets |
command | stdio only | Path to the executable |
args | stdio only | Command-line arguments |
env | No | Environment variables passed to the process |
cwd | No | Working directory (defaults to project root) |
enabled | No | false skips without removing (default: true) |
notes | No | Human-readable description, ignored by client |
WSL2 note: For stdio servers bridging to Windows, set IJ_MCP_SERVER_HOST in env to the Windows host IP (cat /etc/resolv.conf | grep nameserver).
Step 5: Test & Deploy
python client.py # restart to auto-discover new server
5. Distributed Mode (A2A Protocol)
Run tools on remote servers and expose them via HTTP.
Setup A2A Server
# Terminal 1
python a2a_server.py # starts on http://localhost:8010
# Terminal 2
python client.py # auto-connects to A2A endpoints in .env
Control Exposed Tools
# Expose specific categories (comma-separated)
A2A_EXPOSED_TOOLS=plex,location,text
# Expose everything (default)
A2A_EXPOSED_TOOLS=
Security: Exclude plex to protect personal data.
Multi-Endpoint Support
A2A_ENDPOINTS=http://localhost:8010,http://gpu-server:8020
6. Testing
Running Tests
pytest # all tests
pytest -m unit # fast unit tests only
pytest -m integration # integration tests
pytest -m e2e # end-to-end tests
pytest -c tests/pytest.coverage.ini # with coverage
Test Structure
tests/
├── conftest.py
├── pytest.ini
├── unit/
│ ├── test_session_manager.py
│ ├── test_models.py
│ ├── test_context_tracker.py
│ ├── test_intent_patterns.py
│ └── test_code_review_tools.py
├── integration/
│ ├── test_websocket_flow.py
│ └── test_langgraph_agent.py
├── e2e/
│ └── test_full_conversation.py
└── results/
├── junit.xml
├── coverage.xml
├── test-report.html
└── coverage-report.html
CI/CD Integration
GitHub Actions:
- name: Run tests
  run: pytest
- name: Upload coverage
  uses: codecov/codecov-action@v3
  with:
    files: tests/results/coverage.xml
7. Architecture
Multi-Server Design
servers/
├── code_assistant/ 12 tools - AI-powered code analysis, generation, and refactoring
├── code_review/ 3 tools - Code review, search, and bug fixing
├── code_runner/ 4 tools - Python/bash execution sandbox
├── github/ 4 tools - GitHub repo clone, browse, and cleanup
├── google/ 9 tools - Gmail + Google Calendar [requires one-time OAuth setup]
├── image/ 6 tools - Image search, analysis, AI generation [requires SERPER_API_KEY for search; generation is free]
├── location/ 3 tools - Weather, time, location
├── plex/ 18 tools - Media + ML recommendations [requires PLEX_URL + PLEX_TOKEN]
├── rag/ 7 tools - Vector search and management [requires Ollama + bge-large]
├── system/ 3 tools - System info and processes
├── text/ 8 tools - Text processing and web search
└── trilium/ 11 tools - Trilium notes integration [requires TRILIUM_URL + TRILIUM_TOKEN]
Total: 88 tools across 12 servers
Directory Structure
mcp-platform/
├── servers/
├── a2a_server.py
├── client.py
├── client/
│ ├── ui/
│ │ ├── index.html
│ │ └── dashboard.html
│ ├── capability_registry.py ← Tool capability index (auto-populated from @tool_meta)
│ ├── langgraph.py
│ ├── query_patterns.py ← Intent routing (auto-populated from @tool_meta triggers)
│ ├── session_state.py ← Per-session context store
│ ├── tool_meta.py ← @tool_meta decorator — single source of truth for tool metadata
│ ├── search_client.py ← Ollama web search & fetch
│ ├── websocket.py
│ └── ...
└── tools/
8. Intent Patterns & Troubleshooting
Intent Patterns
The client routes queries to the right tools without sending all 88 tools to the LLM on every message. Routing is driven by the triggers you define in each tool's @tool_meta decorator — no manual pattern editing required.
Each intent has a priority — lower number wins when multiple patterns match. The static entries below cover the built-in servers. Any tool you add with @tool_meta(triggers=[...]) is automatically included on startup.
Overriding intent routing
Prefix your message with Using <tool_name>, to bypass pattern matching entirely and force a specific tool:
Using shashin_search_tool, find photos of Noah
Using web_image_search_tool, show me a picture of a red panda
Conversational bypass
Queries that start with personal statements ("I like…", "My favourite…"), filler words ("yes", "thanks"), creative tasks ("write me a poem"), or pronoun follow-ups ("what did he do?", "tell me more about them") bypass routing entirely — no tools are bound and the LLM answers from context.
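A minimal sketch of such a bypass check follows; the prefix list is illustrative, not the platform's actual pattern set.

```python
# Illustrative prefixes only; the real bypass patterns are more extensive.
BYPASS_PREFIXES = ("i like", "my favourite", "yes", "thanks",
                   "write me", "what did he", "tell me more")

def bypasses_routing(query: str) -> bool:
    """True if the query should skip tool routing and go straight to the LLM."""
    return query.lower().strip().startswith(BYPASS_PREFIXES)

print(bypasses_routing("Write me a poem about rain"))  # True (creative task)
print(bypasses_routing("find photos of Noah"))         # False (routed to tools)
```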
Query not routing to the right tool?
- Use explicit phrasing: "Using shashin_search_tool, find photos of Noah" bypasses pattern matching entirely
- Check what triggers are registered by looking at @tool_meta(triggers=[...]) on each tool
- Add more specific trigger phrases to the @tool_meta decorator on your tool
Troubleshooting
AI image generation not working:
- No API key required — uses Pollinations.ai which is free
- If you get a rate limit error: wait a moment and try again
- If generation times out: try again or use model="turbo" for faster results
Google tools not working:
- Confirm servers/google/token.json exists — if not, re-run the auth script in Google Setup
- If the token is expired or invalid: delete token.json and re-run the auth script
- Confirm the OAuth app is published to In Production in Google Auth Platform → Audience
Ollama not reachable / connection refused:
- By default Ollama binds to 127.0.0.1 only — set OLLAMA_BASE_URL=http://127.0.0.1:11434 in .env and use 127.0.0.1 rather than a LAN IP
- If running Ollama in WSL and you need LAN access, start Ollama with OLLAMA_HOST=0.0.0.0 ollama serve and add a Windows port proxy: netsh interface portproxy add v4tov4 listenport=11434 listenaddress=0.0.0.0 connectport=11434 connectaddress=<WSL_IP>
- Get the WSL IP with: ip addr show eth0 | grep 'inet '
Ollama models not appearing:
ollama serve
ollama list
python client.py
RAG not working / embedding errors:
- Ensure Ollama is running: ollama serve
- Confirm bge-large is available: ollama list
- If missing, pull it: ollama pull bge-large
- RAG requires Ollama for embeddings regardless of which LLM backend (Ollama or GGUF) you use for chat
Plex tools returning errors:
- Confirm PLEX_URL and PLEX_TOKEN are set in .env
- Verify the Plex server is reachable: curl "$PLEX_URL/identity?X-Plex-Token=$PLEX_TOKEN"
- See API Setup for how to locate your token
GGUF model won't load:
- Check model size vs VRAM (use models <7GB for 12GB VRAM)
- Reduce GPU layers: export GGUF_GPU_LAYERS=20
- CPU only: export GGUF_GPU_LAYERS=0
Web UI won't load:
netstat -an | grep LISTEN # check ports 8765, 8766, 9000
A2A server not connecting:
curl http://localhost:8010/.well-known/agent-card.json
Ollama Search not working:
- Verify OLLAMA_TOKEN in .env
- System falls back to LLM knowledge if unavailable
RAG search returns wrong results:
- RAG uses semantic similarity — returns closest matches even if not exact
- Check what's in the database: > show rag stats
- Content is only stored after researching URLs or manually adding via rag_add_tool
RAG ingestion is slow:
- Normal: ~2.5s for 16 chunks (10,000 characters)
- If slower, check Ollama is running: ollama list
Conversation history not working:
- Smaller models (≤7B) often refuse to answer questions about conversation history
- Switch to a larger model: :model qwen2.5:14b-instruct-q4_K_M
- Models with good instruction following: qwen2.5:14b (80-95%), llama3.1:8b (~70%), mistral-nemo (~70%)
- Avoid for this use case: qwen2.5:3b, qwen2.5:7b (~10-30%)
Query not routing to the right tool:
- Use explicit phrasing: "Using shashin_search_tool, find photos of Noah" bypasses pattern matching entirely
- Check the triggers=[...] in @tool_meta on the relevant tool and add more specific phrases
- Restart the client after changing triggers — they are registered at startup
Tools not appearing:
:tools --all # check if disabled
# check DISABLED_TOOLS in .env
python client.py # restart
License
MIT License