openrouter-mcp-multimodal

MCP server for OpenRouter: 300+ LLMs with vision, image gen, audio in/out, and video analysis + generation (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Structured errors, IPv6 SSRF guards, path sandbox.

OpenRouter MCP Multimodal Server

The only MCP server that does text + image + audio + video analysis AND generation in one package.
Connect Claude Desktop, Cursor, Kiro, VS Code, Windsurf, or Cline to 300+ LLMs via OpenRouter.




Install

npx -y @stabgan/openrouter-mcp-multimodal  # that's it — needs OPENROUTER_API_KEY env var

Get a free API key → openrouter.ai/keys

One-Click Install

| Client | Install |
| --- | --- |
| Kiro | Add to Kiro |
| Cursor | Add to Cursor |
| VS Code | Add to VS Code |
| VS Code Insiders | Add to VS Code Insiders |
| Claude Desktop | Manual config: add to claude_desktop_config.json |
| Windsurf | Manual config: add to ~/.codeium/windsurf/mcp_config.json |
| Cline | Manual config: add via Cline MCP settings |
| Smithery | npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude |

After clicking, the target client opens a confirmation prompt. Paste your OPENROUTER_API_KEY — the deeplink ships a placeholder so no secrets end up in shared links.

Manual Config

npx (recommended)

{
  "mcpServers": {
    "openrouter": {
      "command": "npx",
      "args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-..."
      }
    }
  }
}

Docker

{
  "mcpServers": {
    "openrouter": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "OPENROUTER_API_KEY=sk-or-v1-...",
        "stabgan/openrouter-mcp-multimodal:latest"
      ]
    }
  }
}

Global install

npm install -g @stabgan/openrouter-mcp-multimodal

{
  "mcpServers": {
    "openrouter": {
      "command": "openrouter-multimodal",
      "env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
    }
  }
}

Why This One?

Capabilities provided by this server:

  • Text chat with 300+ models
  • Image analysis (vision), sharp-optimized (only some other servers offer vision)
  • Audio analysis + generation
  • Video understanding (mp4/mov/webm)
  • Video generation (Veo 3.1, Sora 2 Pro)
  • Response caching (zero tokens on hit)
  • Web search, rerank, health check
  • MCP 2025-06-18 spec (structured outputs, progress)

Tools

| Tool | What it does |
| --- | --- |
| chat_completion | Send messages to any model. Supports provider routing, model suffixes (:nitro, :floor, :exacto), response caching, reasoning passthrough, and web search. |
| analyze_image | Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp. |
| analyze_audio | Transcribe/analyze audio (WAV, MP3, FLAC, OGG) from files, URLs, or data URIs. |
| analyze_video | Analyze video (mp4, mpeg, mov, webm) from files, URLs, or data URIs. |
| generate_image | Generate images with aspect ratio control and optional path-sandboxed disk save. |
| generate_audio | Generate speech or music. Auto-detects format, wraps raw PCM in WAV. |
| generate_video | Generate video via async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan) with MCP progress notifications. |
| generate_video_from_image | Image-to-video. Narrower schema than generate_video for higher tool-call accuracy. |
| get_video_status | Resume polling a video generation job by ID. |
| rerank_documents | Rerank documents against a query (Cohere, Fireworks). |
| search_models | Search/filter models by name, provider, or modality. Paginated. |
| get_model_info | Get pricing, context length, and capabilities for any model. |
| validate_model | Check if a model ID exists on OpenRouter. |
| health_check | Verify API key, OpenRouter reachability, server + protocol versions. |
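
The generate_audio entry mentions wrapping raw PCM in WAV. As a rough illustration of what that step involves, here is a minimal sketch that prepends a standard 44-byte RIFF/WAVE header to raw samples. The function name `wrapPcmInWav` and the default parameters (24 kHz, mono, 16-bit little-endian) are assumptions for this example, not values taken from the server's source.

```typescript
// Hypothetical helper, not the server's implementation.
// Prepends a 44-byte RIFF/WAVE header to raw linear PCM samples.
function wrapPcmInWav(
  pcm: Uint8Array,
  sampleRate = 24000, // assumed default for illustration
  channels = 1,
  bitsPerSample = 16,
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const ascii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) out[offset + i] = text.charCodeAt(i);
  };

  ascii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  ascii(8, "WAVE");
  ascii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  ascii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data
  out.set(pcm, 44);
  return out;
}

const wavBytes = wrapPcmInWav(new Uint8Array([1, 2, 3, 4]));
console.log(wavBytes.length);
```

The header only describes the samples; if the sample rate or bit depth passed in does not match the actual PCM data, players will decode it at the wrong speed or as noise.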

All errors carry _meta.code from a closed taxonomy: INVALID_INPUT · UNSAFE_PATH · UPSTREAM_HTTP · UPSTREAM_TIMEOUT · UPSTREAM_REFUSED · UNSUPPORTED_FORMAT · RESOURCE_TOO_LARGE · ZDR_INCOMPATIBLE · MODEL_NOT_FOUND · JOB_FAILED · JOB_STILL_RUNNING · INTERNAL
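
Because the taxonomy is closed, a client can branch on the code instead of parsing error strings. The sketch below models the twelve codes listed above as a TypeScript union. The result shape (`isError` plus `_meta.code`) and the `nextAction` policy mapping codes to actions are assumptions made for illustration; the README does not prescribe how clients should react to each code.

```typescript
// The twelve codes listed above, modeled as a closed union.
const ERROR_CODES = [
  "INVALID_INPUT", "UNSAFE_PATH", "UPSTREAM_HTTP", "UPSTREAM_TIMEOUT",
  "UPSTREAM_REFUSED", "UNSUPPORTED_FORMAT", "RESOURCE_TOO_LARGE",
  "ZDR_INCOMPATIBLE", "MODEL_NOT_FOUND", "JOB_FAILED", "JOB_STILL_RUNNING",
  "INTERNAL",
] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

// Assumed result shape: only `_meta.code` is stated in the README.
interface ToolErrorResult {
  isError: true;
  _meta: { code: string };
}

function isErrorCode(value: string): value is ErrorCode {
  return (ERROR_CODES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical client-side policy, not part of the server.
type Action = "retry" | "poll" | "fix_request" | "give_up";

function nextAction(result: ToolErrorResult): Action {
  const code = result._meta.code;
  if (!isErrorCode(code)) return "give_up"; // unknown code: do not guess
  switch (code) {
    case "UPSTREAM_TIMEOUT":
    case "UPSTREAM_HTTP":
      return "retry";
    case "JOB_STILL_RUNNING":
      return "poll"; // e.g. call get_video_status with the job ID
    case "INVALID_INPUT":
    case "UNSAFE_PATH":
    case "UNSUPPORTED_FORMAT":
    case "RESOURCE_TOO_LARGE":
    case "MODEL_NOT_FOUND":
      return "fix_request";
    default:
      return "give_up";
  }
}

console.log(nextAction({ isError: true, _meta: { code: "JOB_STILL_RUNNING" } }));
```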

Usage Examples

Chat with provider routing:

{
  "tool": "chat_completion",
  "arguments": {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{ "role": "user", "content": "Summarize this document" }],
    "provider": { "sort": "price", "ignore": ["openai"], "data_collection": "deny" }
  }
}

Generate video from Claude Desktop:

{
  "tool": "generate_video",
  "arguments": {
    "model": "google/veo-3.1",
    "prompt": "a calm river at sunrise, cinematic drone shot",
    "duration": 4,
    "save_path": "./river.mp4"
  }
}

Analyze an image:

{
  "tool": "analyze_image",
  "arguments": {
    "image": "/path/to/photo.jpg",
    "prompt": "Describe what you see in detail"
  }
}

Chat with caching + reasoning (v4.5):

{
  "tool": "chat_completion",
  "arguments": {
    "model": "deepseek/deepseek-r1",
    "messages": [{ "role": "user", "content": "Prove sqrt(2) is irrational" }],
    "cache": true,
    "include_reasoning": true
  }
}

Web search:

{
  "tool": "chat_completion",
  "arguments": {
    "model": "openai/gpt-4o",
    "messages": [{ "role": "user", "content": "What shipped in OpenRouter last week?" }],
    "online": true
  }
}

Rerank documents:

{
  "tool": "rerank_documents",
  "arguments": {
    "query": "best practices for MCP server auth",
    "documents": ["doc A text...", "doc B text...", "doc C text..."],
    "top_n": 3
  }
}
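
The examples above use a compact tool/arguments shorthand. On the wire, an MCP client sends each call as a JSON-RPC 2.0 `tools/call` request. The sketch below shows that mapping; `ToolExample` and `toToolsCallRequest` are hypothetical names for this illustration, and in practice the MCP client (Claude Desktop, Cursor, and so on) builds the envelope for you.

```typescript
// Shape of the shorthand used in the examples above.
interface ToolExample {
  tool: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope for an MCP tools/call request.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: convert the shorthand into the wire format.
function toToolsCallRequest(example: ToolExample, id: number): ToolsCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: example.tool, arguments: example.arguments },
  };
}

const rerankRequest = toToolsCallRequest(
  {
    tool: "rerank_documents",
    arguments: {
      query: "best practices for MCP server auth",
      documents: ["doc A text...", "doc B text...", "doc C text..."],
      top_n: 3,
    },
  },
  1,
);
console.log(JSON.stringify(rerankRequest, null, 2));
```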

Configuration

Environment variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENROUTER_API_KEY | Yes | | Your OpenRouter API key |
| OPENROUTER_DEFAULT_MODEL | No | nvidia/nemotron-nano-12b-v2-vl:free | Default model for chat + analyze tools |
| DEFAULT_MODEL | No | | Alias for above |
| OPENROUTER_MAX_TOKENS | No | | Default max_tokens when not set per-request |
| OPENROUTER_PROVIDER_QUANTIZATIONS | No | | CSV. Filter by quantization (e.g. fp16,int8) |
| OPENROUTER_PROVIDER_IGNORE | No | | CSV. Exclude provider slugs |
| OPENROUTER_PROVIDER_SORT | No | | price / throughput / latency |
| OPENROUTER_PROVIDER_ORDER | No | | JSON array or CSV of provider IDs |
| OPENROUTER_PROVIDER_REQUIRE_PARAMETERS | No | | true / false |
| OPENROUTER_PROVIDER_DATA_COLLECTION | No | | allow / deny |
| OPENROUTER_PROVIDER_ALLOW_FALLBACKS | No | | true / false |
| OPENROUTER_CACHE_RESPONSES | No | | 1 / true. Enable response caching server-wide |
| OPENROUTER_INCLUDE_REASONING | No | | 1 / true. Enable reasoning passthrough server-wide |
| OPENROUTER_MODEL_CACHE_TTL_MS | No | 3600000 | Model cache TTL (ms) |
| OPENROUTER_IMAGE_MAX_DIMENSION | No | 800 | Longest edge for resize (px) |
| OPENROUTER_IMAGE_JPEG_QUALITY | No | 80 | JPEG quality (1–100) |
| OPENROUTER_IMAGE_FETCH_TIMEOUT_MS | No | 30000 | Image URL timeout |
| OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES | No | 26214400 | Image URL size cap (~25 MB) |
| OPENROUTER_IMAGE_MAX_REDIRECTS | No | 8 | Image URL redirect cap |
| OPENROUTER_IMAGE_MAX_DATA_URL_BYTES | No | 20971520 | Image data URL size cap (~20 MB) |
| OPENROUTER_AUDIO_FETCH_TIMEOUT_MS | No | 30000 | Audio URL timeout |
| OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES | No | 26214400 | Audio URL size cap (~25 MB) |
| OPENROUTER_AUDIO_MAX_REDIRECTS | No | 8 | Audio URL redirect cap |
| OPENROUTER_AUDIO_MAX_DATA_URL_BYTES | No | 20971520 | Audio data URL size cap |
| OPENROUTER_DEFAULT_VIDEO_MODEL | No | google/gemini-2.5-flash | Default for analyze_video |
| OPENROUTER_DEFAULT_VIDEO_GEN_MODEL | No | google/veo-3.1 | Default for generate_video |
| OPENROUTER_VIDEO_FETCH_TIMEOUT_MS | No | 60000 | Video URL timeout |
| OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES | No | 104857600 | Video URL size cap (~100 MB) |
| OPENROUTER_VIDEO_MAX_REDIRECTS | No | 8 | Video URL redirect cap |
| OPENROUTER_VIDEO_MAX_DATA_URL_BYTES | No | 104857600 | Video data URL size cap |
| OPENROUTER_VIDEO_POLL_INTERVAL_MS | No | 15000 | Async video poll cadence |
| OPENROUTER_VIDEO_MAX_WAIT_MS | No | 600000 | Max wait before returning a resumable handle |
| OPENROUTER_VIDEO_GEN_MAX_BYTES | No | 268435456 | Generated video download cap (~256 MB) |
| OPENROUTER_VIDEO_INLINE_MAX_BYTES | No | 10485760 | Inline video ceiling (~10 MB) |
| OPENROUTER_OUTPUT_DIR | No | process.cwd() | Sandbox root for save_path |
| OPENROUTER_ALLOW_UNSAFE_PATHS | No | | 1 disables the sandbox |
| OPENROUTER_LOG_LEVEL | No | info | error / warn / info / debug |
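
The variables above come in a few value formats: integers with defaults, flags written as 1 or true, and lists given as CSV or a JSON array. The sketch below shows one way such values could be parsed. The helper names (`readInt`, `readFlag`, `readList`) are hypothetical, and falling back to the default on a malformed integer is an assumed policy; the server's actual handling of invalid values is not documented here.

```typescript
type Env = Record<string, string | undefined>;

// Integer setting with a default. Falls back when the value is unset
// or not a positive integer (assumed policy).
function readInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Flags documented as "1 / true".
function readFlag(env: Env, key: string): boolean {
  const raw = (env[key] ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true";
}

// "JSON array or CSV": try JSON when the value looks like an array,
// otherwise split on commas.
function readList(env: Env, key: string): string[] {
  const raw = (env[key] ?? "").trim();
  if (raw === "") return [];
  if (raw.charAt(0) === "[") {
    try {
      const parsed: unknown = JSON.parse(raw);
      if (Array.isArray(parsed)) {
        return parsed.map((item) => String(item).trim()).filter((item) => item !== "");
      }
    } catch {
      // Not valid JSON: fall through to CSV parsing.
    }
  }
  return raw.split(",").map((item) => item.trim()).filter((item) => item !== "");
}

// Sample values for illustration only.
const sampleEnv: Env = {
  OPENROUTER_VIDEO_POLL_INTERVAL_MS: "5000",
  OPENROUTER_CACHE_RESPONSES: "true",
  OPENROUTER_PROVIDER_ORDER: '["anthropic", "google"]',
  OPENROUTER_PROVIDER_IGNORE: "openai, azure",
};

console.log(readInt(sampleEnv, "OPENROUTER_VIDEO_POLL_INTERVAL_MS", 15000));
console.log(readInt(sampleEnv, "OPENROUTER_VIDEO_MAX_WAIT_MS", 600000));
console.log(readList(sampleEnv, "OPENROUTER_PROVIDER_ORDER"));
```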

Security

  • SSRF protection: URL fetches block private/link-local/reserved IPv4 and IPv6 targets (loopback, mapped, compat, multicast, 6to4, Teredo, ORCHID).
  • Path sandbox: save_path is resolved against OPENROUTER_OUTPUT_DIR; traversal attempts are rejected. Override: OPENROUTER_ALLOW_UNSAFE_PATHS=1.
  • No credential leakage: API key is never echoed in logs, responses, or errors. Audit logging captures every paid-op invocation.
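
To make the SSRF bullet concrete, here is a minimal sketch of the IPv4 half of such a guard: it classifies a dotted-quad literal against common private, loopback, link-local, and reserved ranges. The names `parseIpv4` and `isBlockedIpv4` are hypothetical and the range list is not exhaustive. A real guard would also have to resolve hostnames, check every resolved address (including the IPv6 forms listed above), and repeat the check after each redirect.

```typescript
type Ipv4 = [number, number, number, number];

// Hypothetical helper: parse a dotted-quad literal, or return null.
function parseIpv4(host: string): Ipv4 | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  if (!parts.every((part) => /^\d{1,3}$/.test(part))) return null;
  const octets = parts.map(Number);
  if (octets.some((octet) => octet > 255)) return null;
  return octets as Ipv4;
}

// Returns true when the literal falls in a range that should not be fetched.
// Returns false for non-IPv4 input, which only means "this check does not apply".
function isBlockedIpv4(host: string): boolean {
  const ip = parseIpv4(host);
  if (ip === null) return false;
  const [a, b] = ip;
  return (
    a === 0 || // 0.0.0.0/8, "this network"
    a === 10 || // 10.0.0.0/8, private
    a === 127 || // 127.0.0.0/8, loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10, carrier-grade NAT
    (a === 169 && b === 254) || // 169.254.0.0/16, link-local (cloud metadata lives here)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12, private
    (a === 192 && b === 168) || // 192.168.0.0/16, private
    a >= 224 // 224.0.0.0/4 multicast and 240.0.0.0/4 reserved
  );
}

console.log(isBlockedIpv4("169.254.169.254"));
console.log(isBlockedIpv4("8.8.8.8"));
```
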

Architecture

src/
├── index.ts                    # Entry, env validation, graceful shutdown
├── tool-handlers.ts            # 14 tools (annotated) + dispatch
├── model-cache.ts              # TTL + in-flight coalescing
├── openrouter-api.ts           # REST client (chat + /videos)
├── errors.ts                   # Closed ErrorCode enum
├── logger.ts                   # JSON-line structured logger
└── tool-handlers/
    ├── fetch-utils.ts          # SSRF, bounded fetch, data-URL parser
    ├── openrouter-errors.ts    # SDK/HTTP → ErrorCode classifier
    ├── completion-utils.ts     # Reasoning-model cutoff detection
    ├── path-safety.ts          # save_path sandbox
    ├── chat-completion.ts      # Text + multimodal chat
    ├── analyze-image.ts        # Vision analysis
    ├── analyze-audio.ts        # Audio transcription
    ├── analyze-video.ts        # Video understanding
    ├── generate-image.ts       # Image generation
    ├── generate-audio.ts       # Audio generation + streaming
    ├── generate-video.ts       # Video generation (async)
    ├── image-utils.ts          # Sharp optimization, MIME sniffing
    ├── audio-utils.ts          # Audio format detection
    ├── video-utils.ts          # Video format detection
    ├── search-models.ts        # Model search
    ├── get-model-info.ts       # Model detail lookup
    └── validate-model.ts       # Model existence check
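
The tree describes model-cache.ts as "TTL + in-flight coalescing". A minimal sketch of that pattern follows, assuming a single cached value and an injectable clock. The class name `CoalescingTtlCache` and its shape are hypothetical and not taken from the repository. The idea is that concurrent callers share one pending load instead of each triggering a request, and the result is reused until the TTL expires.

```typescript
// Hypothetical illustration of TTL caching with in-flight coalescing.
class CoalescingTtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private inFlight: Promise<T> | null = null;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => Promise<T>, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): Promise<T> {
    // Fresh value: serve it without calling the loader.
    if (this.value !== undefined && this.now() < this.expiresAt) {
      return Promise.resolve(this.value);
    }
    // A load is already pending: join it instead of starting another.
    if (this.inFlight) return this.inFlight;
    const pending = this.load().then(
      (fresh) => {
        this.value = fresh;
        this.expiresAt = this.now() + this.ttlMs;
        this.inFlight = null;
        return fresh;
      },
      (error: unknown) => {
        this.inFlight = null; // allow a later retry after a failed load
        throw error;
      },
    );
    this.inFlight = pending;
    return pending;
  }
}

let loadCalls = 0;
const modelListCache = new CoalescingTtlCache(async () => {
  loadCalls += 1;
  return ["model-a", "model-b"]; // placeholder data, not real model IDs
}, 3600000);

const firstLookup = modelListCache.get();
const secondLookup = modelListCache.get(); // joins the pending load
console.log(firstLookup === secondLookup, loadCalls);
```
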

Design Principles & Research

v4.5's design draws from MCP best practices and academic research:

  • Outcomes, not operations — Tools encapsulate whole workflows (fetch → validate → invoke → save) rather than exposing raw API primitives. Follows Phil Schmid's MCP production guide.
  • Flattened arguments — Top-level primitives with enums reduce tool-call failure rates. Backed by Fu et al. (2025) showing success drops with schema complexity.
  • Failure-mode documentation — Every tool description includes "Fails when:" and "Works with:" sections, improving selection accuracy per Schlapbach (2026).
  • Untrusted content tagging — Analyze tools mark output _meta.content_is_untrusted: true to mitigate indirect prompt injection (Zhao et al., ClawGuard).
  • Structured errors with retry hints — Closed _meta.code taxonomy + retry_after_seconds beats raw error strings. Per Apigene's 12 Rules.
  • MCP 2025-06-18 compliance — Structured outputs (outputSchema), progress notifications, tool annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint).
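
Several of the principles above show up directly in a tool definition. The sketch below is an illustrative definition for validate_model. The field names (name, description, inputSchema, outputSchema, annotations and its four hints) follow the MCP tool definition. The description wording, the `model` parameter, and the `valid` output field are assumptions made for this example and are not copied from the server.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
  annotations?: ToolAnnotations;
}

// Illustrative only: schema fields and wording are assumed.
const validateModelTool: ToolDefinition = {
  name: "validate_model",
  description: [
    "Check if a model ID exists on OpenRouter.",
    'Works with: any model ID string such as "google/veo-3.1".',
    "Fails when: the API key is missing or OpenRouter is unreachable.",
  ].join("\n"),
  inputSchema: {
    type: "object",
    properties: { model: { type: "string" } }, // flat, top-level primitive
    required: ["model"],
  },
  outputSchema: {
    type: "object",
    properties: { valid: { type: "boolean" } },
    required: ["valid"],
  },
  annotations: {
    readOnlyHint: true, // does not modify anything
    destructiveHint: false,
    idempotentHint: true, // repeating the call has no extra effect
    openWorldHint: true, // talks to an external service
  },
};

console.log(validateModelTool.description);
```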

OpenRouter platform features surfaced: Response caching · Web search · Reasoning tokens · Auto Exacto · Rerank · Prompt caching

Upgrading from v2

v3+ is additive — no tool schemas or env vars were removed.

  • New tools: analyze_video, generate_video, generate_video_from_image, get_video_status, rerank_documents, health_check
  • Structured _meta.code on every error response
  • save_path sandboxed by default — set OPENROUTER_OUTPUT_DIR or OPENROUTER_ALLOW_UNSAFE_PATHS=1
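
For readers adjusting to the sandbox, the following sketch shows the general technique: resolve save_path against a root directory and reject anything that lands outside it. The function `resolveInSandbox` is hypothetical, the `/srv/out` root is a made-up example, and the error text simply reuses the UNSAFE_PATH code name. It uses `path.posix` so the example behaves the same on every platform; real code would use the platform-native `path` module and would also need to consider symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper, not the server's implementation.
function resolveInSandbox(root: string, savePath: string): string {
  const resolved = path.posix.resolve(root, savePath);
  const relative = path.posix.relative(root, resolved);
  // A path escapes the root when the relative route climbs out of it.
  const escapes =
    relative === ".." ||
    relative.indexOf("../") === 0 ||
    path.posix.isAbsolute(relative);
  if (escapes) {
    throw new Error(`UNSAFE_PATH: ${savePath} resolves outside ${root}`);
  }
  return resolved;
}

console.log(resolveInSandbox("/srv/out", "./river.mp4"));

try {
  resolveInSandbox("/srv/out", "../etc/passwd");
} catch (error) {
  console.log((error as Error).message);
}
```

Comparing the relative path rather than string prefixes avoids the classic mistake of accepting `/srv/output-evil` because it starts with `/srv/out`.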

Development

git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install && cp .env.example .env  # Add your API key
npm run build && npm start
npm test                    # 288 unit tests, <1s
npm run test:integration    # Live API tests (16 scenarios)
npm run lint
node scripts/live-e2e.mjs  # 16 live E2E scenarios

Compatibility

Works with any MCP-compatible client, including Kiro, Claude Desktop, Cursor, Windsurf, and Cline.

License

Apache 2.0 — see LICENSE.

Contributing

Issues and PRs welcome. Please open an issue first for major changes.
