openrouter-mcp-multimodal
MCP server for OpenRouter: 300+ LLMs with vision, image gen, audio in/out, and video analysis + generation (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Structured errors, IPv6 SSRF guards, path sandbox.
OpenRouter MCP Multimodal Server
The only MCP server that does text + image + audio + video analysis AND generation in one package.
Connect Claude Desktop, Cursor, Kiro, VS Code, Windsurf, or Cline to 300+ LLMs via OpenRouter.
Install
npx -y @stabgan/openrouter-mcp-multimodal # that's it — needs OPENROUTER_API_KEY env var
Get a free API key → openrouter.ai/keys
One-Click Install
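| Client | Install method |
|---|---|
| Kiro | One-click install badge |
| Cursor | One-click install badge |
| VS Code | One-click install badge |
| VS Code Insiders | One-click install badge |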
| Claude Desktop | Manual config — Add to claude_desktop_config.json |
| Windsurf | Manual config — Add to ~/.codeium/windsurf/mcp_config.json |
| Cline | Manual config — Add via Cline MCP settings |
| Smithery | npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude |
After clicking, the target client opens a confirmation prompt. Paste your `OPENROUTER_API_KEY` there; the deeplink ships a placeholder so no secrets end up in shared links.
Manual Config
npx (recommended)
{
"mcpServers": {
"openrouter": {
"command": "npx",
"args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-..."
}
}
}
}
Docker
{
"mcpServers": {
"openrouter": {
"command": "docker",
"args": [
"run", "--rm", "-i",
"-e", "OPENROUTER_API_KEY=sk-or-v1-...",
"stabgan/openrouter-mcp-multimodal:latest"
]
}
}
}
Global install
npm install -g @stabgan/openrouter-mcp-multimodal
{
"mcpServers": {
"openrouter": {
"command": "openrouter-multimodal",
"env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
}
}
}
Why This One?
| Capability | This server | Others |
|---|---|---|
| Text chat with 300+ models | ✅ | ✅ |
| Image analysis (vision) | ✅ sharp-optimized | some |
| Audio analysis + generation | ✅ | ❌ |
| Video understanding (mp4/mov/webm) | ✅ | ❌ |
| Video generation (Veo 3.1, Sora 2 Pro) | ✅ | ❌ |
| Response caching (zero tokens on hit) | ✅ | ❌ |
| Web search, rerank, health check | ✅ | ❌ |
| MCP 2025-06-18 spec (structured outputs, progress) | ✅ | ❌ |
Tools
| Tool | What it does |
|---|---|
chat_completion | Send messages to any model. Supports provider routing, model suffixes (:nitro, :floor, :exacto), response caching, reasoning passthrough, and web search. |
analyze_image | Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp. |
analyze_audio | Transcribe/analyze audio (WAV, MP3, FLAC, OGG) from files, URLs, or data URIs. |
analyze_video | Analyze video (mp4, mpeg, mov, webm) from files, URLs, or data URIs. |
generate_image | Generate images with aspect ratio control and optional path-sandboxed disk save. |
generate_audio | Generate speech or music. Auto-detects format, wraps raw PCM in WAV. |
generate_video | Generate video via async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan) with MCP progress notifications. |
generate_video_from_image | Image-to-video. Narrower schema than generate_video for higher tool-call accuracy. |
get_video_status | Resume polling a video generation job by ID. |
rerank_documents | Rerank documents against a query (Cohere, Fireworks). |
search_models | Search/filter models by name, provider, or modality. Paginated. |
get_model_info | Get pricing, context length, and capabilities for any model. |
validate_model | Check if a model ID exists on OpenRouter. |
health_check | Verify API key, OpenRouter reachability, server + protocol versions. |
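
The `generate_audio` row mentions wrapping raw PCM in WAV. As a rough illustration of what that involves, the sketch below prepends the standard 44-byte RIFF/WAVE header to a PCM buffer. The function name `wrapPcmInWav` and the default sample rate, channel count, and bit depth are assumptions made for this example, not values taken from the server.

```typescript
// Hypothetical helper: wrap raw linear PCM in a minimal 44-byte WAV header.
// The defaults (24 kHz, mono, 16-bit) are assumptions for illustration only.
function wrapPcmInWav(
  pcm: Uint8Array,
  sampleRate: number = 24000,
  channels: number = 1,
  bitsPerSample: number = 16,
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) out[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // RIFF chunk size = file size - 8
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // data chunk size in bytes
  out.set(pcm, 44);
  return out;
}
```

All multi-byte fields are little-endian, which is why every `DataView` write passes `true` as the last argument.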
All errors carry `_meta.code` from a closed taxonomy: `INVALID_INPUT` · `UNSAFE_PATH` · `UPSTREAM_HTTP` · `UPSTREAM_TIMEOUT` · `UPSTREAM_REFUSED` · `UNSUPPORTED_FORMAT` · `RESOURCE_TOO_LARGE` · `ZDR_INCOMPATIBLE` · `MODEL_NOT_FOUND` · `JOB_FAILED` · `JOB_STILL_RUNNING` · `INTERNAL`
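
A client can mirror this closed taxonomy as a TypeScript union so that unknown codes are caught early. The sketch below is one possible client-side shape; `ToolResultLike` and `errorCodeOf` are hypothetical names, and only the `_meta.code` field itself is documented here.

```typescript
// The twelve codes listed above, kept as a readonly tuple so the union type
// and the runtime check stay in sync.
const ERROR_CODES = [
  "INVALID_INPUT", "UNSAFE_PATH", "UPSTREAM_HTTP", "UPSTREAM_TIMEOUT",
  "UPSTREAM_REFUSED", "UNSUPPORTED_FORMAT", "RESOURCE_TOO_LARGE",
  "ZDR_INCOMPATIBLE", "MODEL_NOT_FOUND", "JOB_FAILED", "JOB_STILL_RUNNING",
  "INTERNAL",
] as const;

type ErrorCode = (typeof ERROR_CODES)[number];

// Assumed result shape: anything beyond `_meta.code` is a guess for this sketch.
interface ToolResultLike {
  isError?: boolean;
  _meta?: { code?: unknown };
}

// Returns the code if it belongs to the taxonomy, otherwise undefined.
function errorCodeOf(result: ToolResultLike): ErrorCode | undefined {
  const code = result._meta?.code;
  if (typeof code !== "string") return undefined;
  const known = ERROR_CODES as readonly string[];
  return known.indexOf(code) !== -1 ? (code as ErrorCode) : undefined;
}
```

Because `ErrorCode` is a union of string literals, a `switch` over it can be checked for exhaustiveness by the compiler.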
Usage Examples
Chat with provider routing:
{
"tool": "chat_completion",
"arguments": {
"model": "anthropic/claude-sonnet-4",
"messages": [{ "role": "user", "content": "Summarize this document" }],
"provider": { "sort": "price", "ignore": ["openai"], "data_collection": "deny" }
}
}
Generate video from Claude Desktop:
{
"tool": "generate_video",
"arguments": {
"model": "google/veo-3.1",
"prompt": "a calm river at sunrise, cinematic drone shot",
"duration": 4,
"save_path": "./river.mp4"
}
}
Analyze an image:
{
"tool": "analyze_image",
"arguments": {
"image": "/path/to/photo.jpg",
"prompt": "Describe what you see in detail"
}
}
Chat with caching + reasoning (v4.5):
{
"tool": "chat_completion",
"arguments": {
"model": "deepseek/deepseek-r1",
"messages": [{ "role": "user", "content": "Prove sqrt(2) is irrational" }],
"cache": true,
"include_reasoning": true
}
}
Web search:
{
"tool": "chat_completion",
"arguments": {
"model": "openai/gpt-4o",
"messages": [{ "role": "user", "content": "What shipped in OpenRouter last week?" }],
"online": true
}
}
Rerank documents:
{
"tool": "rerank_documents",
"arguments": {
"query": "best practices for MCP server auth",
"documents": ["doc A text...", "doc B text...", "doc C text..."],
"top_n": 3
}
}
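
The examples above are written as `tool` / `arguments` pairs. On the wire, an MCP client sends a tool invocation as a JSON-RPC 2.0 `tools/call` request. The sketch below shows one way to map the pairs above onto that envelope; `ToolExample` and `toToolsCallRequest` are hypothetical names, and treating the pairs as shorthand for `tools/call` is an assumption about how these examples are meant to be read.

```typescript
// Shape of the examples in this README (assumed to be shorthand).
interface ToolExample {
  tool: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope for the MCP `tools/call` method.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: convert a README-style example into a wire request.
function toToolsCallRequest(example: ToolExample, id: number): ToolsCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: example.tool, arguments: example.arguments },
  };
}

const request = toToolsCallRequest(
  {
    tool: "rerank_documents",
    arguments: { query: "best practices for MCP server auth", documents: ["doc A text..."], top_n: 1 },
  },
  1,
);
console.log(JSON.stringify(request, null, 2));
```

In practice most MCP clients build this envelope themselves, so the helper is mainly useful for debugging or for scripting against the server directly.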
Configuration
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
OPENROUTER_API_KEY | Yes | — | Your OpenRouter API key |
OPENROUTER_DEFAULT_MODEL | No | nvidia/nemotron-nano-12b-v2-vl:free | Default model for chat + analyze tools |
DEFAULT_MODEL | No | — | Alias for above |
OPENROUTER_MAX_TOKENS | No | — | Default max_tokens when not set per-request |
OPENROUTER_PROVIDER_QUANTIZATIONS | No | — | CSV. Filter by quantization (e.g. fp16,int8) |
OPENROUTER_PROVIDER_IGNORE | No | — | CSV. Exclude provider slugs |
OPENROUTER_PROVIDER_SORT | No | — | price / throughput / latency |
OPENROUTER_PROVIDER_ORDER | No | — | JSON array or CSV of provider IDs |
OPENROUTER_PROVIDER_REQUIRE_PARAMETERS | No | — | true / false |
OPENROUTER_PROVIDER_DATA_COLLECTION | No | — | allow / deny |
OPENROUTER_PROVIDER_ALLOW_FALLBACKS | No | — | true / false |
OPENROUTER_CACHE_RESPONSES | No | — | 1 / true. Enable response caching server-wide |
OPENROUTER_INCLUDE_REASONING | No | — | 1 / true. Enable reasoning passthrough server-wide |
OPENROUTER_MODEL_CACHE_TTL_MS | No | 3600000 | Model cache TTL (ms) |
OPENROUTER_IMAGE_MAX_DIMENSION | No | 800 | Longest edge for resize (px) |
OPENROUTER_IMAGE_JPEG_QUALITY | No | 80 | JPEG quality (1–100) |
OPENROUTER_IMAGE_FETCH_TIMEOUT_MS | No | 30000 | Image URL timeout |
OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES | No | 26214400 | Image URL size cap (~25 MB) |
OPENROUTER_IMAGE_MAX_REDIRECTS | No | 8 | Image URL redirect cap |
OPENROUTER_IMAGE_MAX_DATA_URL_BYTES | No | 20971520 | Image data URL size cap (~20 MB) |
OPENROUTER_AUDIO_FETCH_TIMEOUT_MS | No | 30000 | Audio URL timeout |
OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES | No | 26214400 | Audio URL size cap (~25 MB) |
OPENROUTER_AUDIO_MAX_REDIRECTS | No | 8 | Audio URL redirect cap |
OPENROUTER_AUDIO_MAX_DATA_URL_BYTES | No | 20971520 | Audio data URL size cap |
OPENROUTER_DEFAULT_VIDEO_MODEL | No | google/gemini-2.5-flash | Default for analyze_video |
OPENROUTER_DEFAULT_VIDEO_GEN_MODEL | No | google/veo-3.1 | Default for generate_video |
OPENROUTER_VIDEO_FETCH_TIMEOUT_MS | No | 60000 | Video URL timeout |
OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES | No | 104857600 | Video URL size cap (~100 MB) |
OPENROUTER_VIDEO_MAX_REDIRECTS | No | 8 | Video URL redirect cap |
OPENROUTER_VIDEO_MAX_DATA_URL_BYTES | No | 104857600 | Video data URL size cap |
OPENROUTER_VIDEO_POLL_INTERVAL_MS | No | 15000 | Async video poll cadence |
OPENROUTER_VIDEO_MAX_WAIT_MS | No | 600000 | Max wait before returning a resumable handle |
OPENROUTER_VIDEO_GEN_MAX_BYTES | No | 268435456 | Generated video download cap (~256 MB) |
OPENROUTER_VIDEO_INLINE_MAX_BYTES | No | 10485760 | Inline video ceiling (~10 MB) |
OPENROUTER_OUTPUT_DIR | No | process.cwd() | Sandbox root for save_path |
OPENROUTER_ALLOW_UNSAFE_PATHS | No | — | 1 disables the sandbox |
OPENROUTER_LOG_LEVEL | No | info | error / warn / info / debug |
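
The table mixes three value styles: `1 / true` flags, integers with defaults, and CSV lists. The sketch below shows one plausible way to read them. The helper names are hypothetical, and the exact parsing rules (case-insensitive flags, falling back to the default on invalid numbers) are assumptions rather than documented server behavior.

```typescript
type Env = Record<string, string | undefined>;

// Flags documented as "1 / true". Case-insensitive matching is an assumption.
function readFlag(env: Env, name: string): boolean {
  const raw = (env[name] ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true";
}

// Integer settings: fall back to the documented default when unset or invalid.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// CSV settings: split on commas, trim whitespace, drop empty entries.
function readCsv(env: Env, name: string): string[] {
  return (env[name] ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Example with an explicit env object; a real server would pass process.env.
const sampleEnv: Env = {
  OPENROUTER_CACHE_RESPONSES: "true",
  OPENROUTER_PROVIDER_QUANTIZATIONS: "fp16, int8",
};
const cacheEnabled = readFlag(sampleEnv, "OPENROUTER_CACHE_RESPONSES");
const pollMs = readPositiveInt(sampleEnv, "OPENROUTER_VIDEO_POLL_INTERVAL_MS", 15000);
const quantizations = readCsv(sampleEnv, "OPENROUTER_PROVIDER_QUANTIZATIONS");
```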
Security
- SSRF protection: URL fetches block private/link-local/reserved IPv4 and IPv6 targets (loopback, mapped, compat, multicast, 6to4, Teredo, ORCHID).
- Path sandbox: `save_path` is resolved against `OPENROUTER_OUTPUT_DIR`; traversal attempts are rejected. Override: `OPENROUTER_ALLOW_UNSAFE_PATHS=1`.
- No credential leakage: API key is never echoed in logs, responses, or errors. Audit logging captures every paid-op invocation.
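
The path sandbox idea can be illustrated with a short containment check: resolve the requested path against the sandbox root, then reject anything whose relative path climbs out of it. This is a minimal sketch under stated assumptions, not the server's actual `path-safety.ts`; the function name `resolveInsideSandbox` is hypothetical, and the sketch does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve `savePath` inside `outputDir`, or throw.
// Symlink resolution is deliberately out of scope for this sketch.
function resolveInsideSandbox(outputDir: string, savePath: string): string {
  const root = path.resolve(outputDir);
  const target = path.resolve(root, savePath);
  const relative = path.relative(root, target);

  // A target outside the root yields a relative path that starts with ".."
  // (or an absolute path, e.g. across drives on Windows).
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);

  if (escapes) {
    throw new Error(`UNSAFE_PATH: ${savePath} resolves outside ${root}`);
  }
  return target;
}
```

Comparing via `path.relative` rather than a string prefix avoids false matches such as `/data/out-evil` being accepted for a root of `/data/out`.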
Architecture
src/
├── index.ts # Entry, env validation, graceful shutdown
├── tool-handlers.ts # 14 tools (annotated) + dispatch
├── model-cache.ts # TTL + in-flight coalescing
├── openrouter-api.ts # REST client (chat + /videos)
├── errors.ts # Closed ErrorCode enum
├── logger.ts # JSON-line structured logger
└── tool-handlers/
├── fetch-utils.ts # SSRF, bounded fetch, data-URL parser
├── openrouter-errors.ts # SDK/HTTP → ErrorCode classifier
├── completion-utils.ts # Reasoning-model cutoff detection
├── path-safety.ts # save_path sandbox
├── chat-completion.ts # Text + multimodal chat
├── analyze-image.ts # Vision analysis
├── analyze-audio.ts # Audio transcription
├── analyze-video.ts # Video understanding
├── generate-image.ts # Image generation
├── generate-audio.ts # Audio generation + streaming
├── generate-video.ts # Video generation (async)
├── image-utils.ts # Sharp optimization, MIME sniffing
├── audio-utils.ts # Audio format detection
├── video-utils.ts # Video format detection
├── search-models.ts # Model search
├── get-model-info.ts # Model detail lookup
└── validate-model.ts # Model existence check
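
The tree describes `model-cache.ts` as "TTL + in-flight coalescing". As an illustration of that pattern, the sketch below caches one value for a fixed TTL and lets concurrent callers share a single pending load. The class name `TtlCache` and its constructor parameters are hypothetical; this is a sketch of the general technique, not the file's actual contents.

```typescript
// Hypothetical single-value cache with a TTL and in-flight request coalescing.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private inFlight: Promise<T> | undefined = undefined;
  private readonly loader: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(loader: () => Promise<T>, ttlMs: number, now: () => number = Date.now) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Deliberately not `async`, so concurrent callers receive the same promise.
  get(): Promise<T> {
    if (this.value !== undefined && this.now() < this.expiresAt) {
      return Promise.resolve(this.value); // fresh hit
    }
    if (this.inFlight !== undefined) {
      return this.inFlight; // coalesce: reuse the pending load
    }
    const pending = this.loader().then(
      (fresh) => {
        this.value = fresh;
        this.expiresAt = this.now() + this.ttlMs;
        this.inFlight = undefined;
        return fresh;
      },
      (err) => {
        this.inFlight = undefined; // allow a retry after a failed load
        throw err;
      },
    );
    this.inFlight = pending;
    return pending;
  }
}
```

With a one-hour TTL (the documented default for `OPENROUTER_MODEL_CACHE_TTL_MS`), a burst of simultaneous lookups on a cold cache would trigger only one upstream fetch in this design.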
Design Principles & Research
v4.5's design draws from MCP best practices and academic research:
- Outcomes, not operations: Tools encapsulate whole workflows (fetch → validate → invoke → save) rather than exposing raw API primitives. Follows Phil Schmid's MCP production guide.
- Flattened arguments: Top-level primitives with enums reduce tool-call failure rates. Backed by Fu et al. (2025) showing success drops with schema complexity.
- Failure-mode documentation: Every tool description includes "Fails when:" and "Works with:" sections, improving selection accuracy per Schlapbach (2026).
- Untrusted content tagging: Analyze tools mark output `_meta.content_is_untrusted: true` to mitigate indirect prompt injection (Zhao et al., ClawGuard).
- Structured errors with retry hints: Closed `_meta.code` taxonomy + `retry_after_seconds` beats raw error strings. Per Apigene's 12 Rules.
- MCP 2025-06-18 compliance: Structured outputs (`outputSchema`), progress notifications, tool annotations (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`).
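
To show how a client might act on structured errors with retry hints, the sketch below turns `_meta.code` and `retry_after_seconds` into a retry delay. Which codes count as retryable, the exponential backoff fallback, and the names `ErrorMeta` and `retryDelayMs` are all assumptions made for this example; the README only states that the two fields exist.

```typescript
// Assumed shape of the error metadata; only the two field names come from the README.
interface ErrorMeta {
  code?: string;
  retry_after_seconds?: number;
}

// Which codes are worth retrying is an assumption made for this sketch.
const RETRYABLE_CODES = ["UPSTREAM_TIMEOUT", "UPSTREAM_HTTP", "JOB_STILL_RUNNING"];

// Returns a delay in milliseconds, or null when the error should not be retried.
function retryDelayMs(meta: ErrorMeta, attempt: number, maxMs: number = 60000): number | null {
  if (meta.code === undefined || RETRYABLE_CODES.indexOf(meta.code) === -1) {
    return null;
  }
  const hint = meta.retry_after_seconds;
  if (typeof hint === "number" && hint >= 0) {
    return Math.min(hint * 1000, maxMs); // prefer the server's hint
  }
  return Math.min(1000 * Math.pow(2, attempt), maxMs); // exponential backoff fallback
}
```

A caller would typically loop: invoke the tool, and on error sleep for the returned delay before trying again, stopping when the function returns `null` or an attempt budget runs out.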
OpenRouter platform features surfaced: Response caching · Web search · Reasoning tokens · Auto Exacto · Rerank · Prompt caching
Upgrading from v2
v3+ is additive — no tool schemas or env vars were removed.
- New tools: `analyze_video`, `generate_video`, `generate_video_from_image`, `get_video_status`, `rerank_documents`, `health_check`
- Structured `_meta.code` on every error response
- `save_path` sandboxed by default; set `OPENROUTER_OUTPUT_DIR` or `OPENROUTER_ALLOW_UNSAFE_PATHS=1`
Development
git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install && cp .env.example .env # Add your API key
npm run build && npm start
npm test # 288 unit tests, <1s
npm run test:integration # Live API tests (16 scenarios)
npm run lint
node scripts/live-e2e.mjs # 16 live E2E scenarios
Compatibility
Works with any MCP-compatible client, including Kiro, Claude Desktop, Cursor, Windsurf, and Cline.
License
Apache 2.0 — see LICENSE.
Contributing
Issues and PRs welcome. Please open an issue first for major changes.