OpenRouter MCP Multimodal Server
The only MCP server that does text + image + audio + video analysis AND generation in one package.
Connect Claude Desktop, Cursor, Kiro, VS Code, Windsurf, or Cline to 300+ LLMs via OpenRouter.
Install · Tools · Examples · Config · Changelog
Install
```bash
npx -y @stabgan/openrouter-mcp-multimodal   # that's it — needs OPENROUTER_API_KEY env var
```
Get a free API key → openrouter.ai/keys
One-Click Install
| Client | Install |
|---|---|
| Kiro | One-click install |
| Cursor | One-click install |
| VS Code | One-click install |
| VS Code Insiders | One-click install |
| Claude Desktop | Manual config — add to `claude_desktop_config.json` |
| Windsurf | Manual config — add to `~/.codeium/windsurf/mcp_config.json` |
| Cline | Manual config — add via Cline MCP settings |
| Smithery | `npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude` |
After clicking, the target client opens a confirmation prompt. Paste your `OPENROUTER_API_KEY` — the deeplink ships a placeholder, so no secrets end up in shared links.
Manual Config
npx (recommended)
```json
{
  "mcpServers": {
    "openrouter": {
      "command": "npx",
      "args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-..."
      }
    }
  }
}
```
Docker
```json
{
  "mcpServers": {
    "openrouter": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "OPENROUTER_API_KEY=sk-or-v1-...",
        "stabgan/openrouter-mcp-multimodal:latest"
      ]
    }
  }
}
```
Global install
```bash
npm install -g @stabgan/openrouter-mcp-multimodal
```

```json
{
  "mcpServers": {
    "openrouter": {
      "command": "openrouter-multimodal",
      "env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
    }
  }
}
```
Why This One?
| Capability | This server | Others |
|---|---|---|
| Text chat with 300+ models | ✅ | ✅ |
| Image analysis (vision) | ✅ sharp-optimized | some |
| Audio analysis + generation | ✅ | ❌ |
| Video understanding (mp4/mov/webm) | ✅ | ❌ |
| Video generation (Veo 3.1, Sora 2 Pro) | ✅ | ❌ |
| Response caching (zero tokens on hit) | ✅ | ❌ |
| Web search, rerank, health check | ✅ | ❌ |
| MCP 2025-06-18 spec (structured outputs, progress) | ✅ | ❌ |
Tools
| Tool | What it does |
|---|---|
| `chat_completion` | Send messages to any model. Supports provider routing, model suffixes (`:nitro`, `:floor`, `:exacto`), response caching, reasoning passthrough, and web search. |
| `analyze_image` | Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp. |
| `analyze_audio` | Transcribe/analyze audio (WAV, MP3, FLAC, OGG) from files, URLs, or data URIs. |
| `analyze_video` | Analyze video (mp4, mpeg, mov, webm) from files, URLs, or data URIs. |
| `generate_image` | Generate images with aspect-ratio control and optional path-sandboxed disk save. |
| `generate_audio` | Generate speech or music. Auto-detects format, wraps raw PCM in WAV. |
| `generate_video` | Generate video via the async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan) with MCP progress notifications. |
| `generate_video_from_image` | Image-to-video. Narrower schema than `generate_video` for higher tool-call accuracy. |
| `get_video_status` | Resume polling a video-generation job by ID. |
| `rerank_documents` | Rerank documents against a query (Cohere, Fireworks). |
| `search_models` | Search/filter models by name, provider, or modality. Paginated. |
| `get_model_info` | Get pricing, context length, and capabilities for any model. |
| `validate_model` | Check whether a model ID exists on OpenRouter. |
| `health_check` | Verify API key, OpenRouter reachability, and server + protocol versions. |
All errors carry `_meta.code` from a closed taxonomy: `INVALID_INPUT` · `UNSAFE_PATH` · `UPSTREAM_HTTP` · `UPSTREAM_TIMEOUT` · `UPSTREAM_REFUSED` · `UNSUPPORTED_FORMAT` · `RESOURCE_TOO_LARGE` · `ZDR_INCOMPATIBLE` · `MODEL_NOT_FOUND` · `JOB_FAILED` · `JOB_STILL_RUNNING` · `INTERNAL`
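Because the taxonomy is closed, a consuming client can branch on `_meta.code` instead of parsing error strings. A minimal sketch of that idea — the helper name and retry policy below are illustrative, not part of the server's API:

```typescript
// The twelve codes come from the taxonomy above.
type ErrorCode =
  | "INVALID_INPUT" | "UNSAFE_PATH" | "UPSTREAM_HTTP" | "UPSTREAM_TIMEOUT"
  | "UPSTREAM_REFUSED" | "UNSUPPORTED_FORMAT" | "RESOURCE_TOO_LARGE"
  | "ZDR_INCOMPATIBLE" | "MODEL_NOT_FOUND" | "JOB_FAILED"
  | "JOB_STILL_RUNNING" | "INTERNAL";

// Illustrative policy: transient upstream conditions are worth retrying;
// everything else needs a changed request or operator attention.
const RETRYABLE: ReadonlySet<ErrorCode> = new Set<ErrorCode>([
  "UPSTREAM_TIMEOUT", "UPSTREAM_HTTP", "JOB_STILL_RUNNING",
]);

function isRetryable(code: ErrorCode): boolean {
  return RETRYABLE.has(code);
}
```

A client could combine this with the `retry_after_seconds` hint mentioned in the design notes to schedule backoff.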
Usage Examples
Chat with provider routing:
```json
{
  "tool": "chat_completion",
  "arguments": {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{ "role": "user", "content": "Summarize this document" }],
    "provider": { "sort": "price", "ignore": ["openai"], "data_collection": "deny" }
  }
}
```
Generate video from Claude Desktop:
```json
{
  "tool": "generate_video",
  "arguments": {
    "model": "google/veo-3.1",
    "prompt": "a calm river at sunrise, cinematic drone shot",
    "duration": 4,
    "save_path": "./river.mp4"
  }
}
```
Analyze an image:
```json
{
  "tool": "analyze_image",
  "arguments": {
    "image": "/path/to/photo.jpg",
    "prompt": "Describe what you see in detail"
  }
}
```
Chat with caching + reasoning (v4.5):
```json
{
  "tool": "chat_completion",
  "arguments": {
    "model": "deepseek/deepseek-r1",
    "messages": [{ "role": "user", "content": "Prove sqrt(2) is irrational" }],
    "cache": true,
    "include_reasoning": true
  }
}
```
Web search:
```json
{
  "tool": "chat_completion",
  "arguments": {
    "model": "openai/gpt-4o",
    "messages": [{ "role": "user", "content": "What shipped in OpenRouter last week?" }],
    "online": true
  }
}
```
Rerank documents:
```json
{
  "tool": "rerank_documents",
  "arguments": {
    "query": "best practices for MCP server auth",
    "documents": ["doc A text...", "doc B text...", "doc C text..."],
    "top_n": 3
  }
}
```
Configuration
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENROUTER_API_KEY` | Yes | — | Your OpenRouter API key |
| `OPENROUTER_DEFAULT_MODEL` | No | `nvidia/nemotron-nano-12b-v2-vl:free` | Default model for chat + analyze tools |
| `DEFAULT_MODEL` | No | — | Alias for `OPENROUTER_DEFAULT_MODEL` |
| `OPENROUTER_MAX_TOKENS` | No | — | Default `max_tokens` when not set per-request |
| `OPENROUTER_PROVIDER_QUANTIZATIONS` | No | — | CSV. Filter by quantization (e.g. `fp16,int8`) |
| `OPENROUTER_PROVIDER_IGNORE` | No | — | CSV. Exclude provider slugs |
| `OPENROUTER_PROVIDER_SORT` | No | — | `price` / `throughput` / `latency` |
| `OPENROUTER_PROVIDER_ORDER` | No | — | JSON array or CSV of provider IDs |
| `OPENROUTER_PROVIDER_REQUIRE_PARAMETERS` | No | — | `true` / `false` |
| `OPENROUTER_PROVIDER_DATA_COLLECTION` | No | — | `allow` / `deny` |
| `OPENROUTER_PROVIDER_ALLOW_FALLBACKS` | No | — | `true` / `false` |
| `OPENROUTER_CACHE_RESPONSES` | No | — | `1` / `true`. Enable response caching server-wide |
| `OPENROUTER_INCLUDE_REASONING` | No | — | `1` / `true`. Enable reasoning passthrough server-wide |
| `OPENROUTER_MODEL_CACHE_TTL_MS` | No | `3600000` | Model cache TTL (ms) |
| `OPENROUTER_IMAGE_MAX_DIMENSION` | No | `800` | Longest edge for resize (px) |
| `OPENROUTER_IMAGE_JPEG_QUALITY` | No | `80` | JPEG quality (1–100) |
| `OPENROUTER_IMAGE_FETCH_TIMEOUT_MS` | No | `30000` | Image URL timeout (ms) |
| `OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES` | No | `26214400` | Image URL size cap (~25 MB) |
| `OPENROUTER_IMAGE_MAX_REDIRECTS` | No | `8` | Image URL redirect cap |
| `OPENROUTER_IMAGE_MAX_DATA_URL_BYTES` | No | `20971520` | Image data URL size cap (~20 MB) |
| `OPENROUTER_AUDIO_FETCH_TIMEOUT_MS` | No | `30000` | Audio URL timeout (ms) |
| `OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES` | No | `26214400` | Audio URL size cap (~25 MB) |
| `OPENROUTER_AUDIO_MAX_REDIRECTS` | No | `8` | Audio URL redirect cap |
| `OPENROUTER_AUDIO_MAX_DATA_URL_BYTES` | No | `20971520` | Audio data URL size cap (~20 MB) |
| `OPENROUTER_DEFAULT_VIDEO_MODEL` | No | `google/gemini-2.5-flash` | Default for `analyze_video` |
| `OPENROUTER_DEFAULT_VIDEO_GEN_MODEL` | No | `google/veo-3.1` | Default for `generate_video` |
| `OPENROUTER_VIDEO_FETCH_TIMEOUT_MS` | No | `60000` | Video URL timeout (ms) |
| `OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES` | No | `104857600` | Video URL size cap (~100 MB) |
| `OPENROUTER_VIDEO_MAX_REDIRECTS` | No | `8` | Video URL redirect cap |
| `OPENROUTER_VIDEO_MAX_DATA_URL_BYTES` | No | `104857600` | Video data URL size cap (~100 MB) |
| `OPENROUTER_VIDEO_POLL_INTERVAL_MS` | No | `15000` | Async video poll cadence (ms) |
| `OPENROUTER_VIDEO_MAX_WAIT_MS` | No | `600000` | Max wait before returning a resumable handle (ms) |
| `OPENROUTER_VIDEO_GEN_MAX_BYTES` | No | `268435456` | Generated video download cap (~256 MB) |
| `OPENROUTER_VIDEO_INLINE_MAX_BYTES` | No | `10485760` | Inline video ceiling (~10 MB) |
| `OPENROUTER_OUTPUT_DIR` | No | `process.cwd()` | Sandbox root for `save_path` |
| `OPENROUTER_ALLOW_UNSAFE_PATHS` | No | — | `1` disables the sandbox |
| `OPENROUTER_LOG_LEVEL` | No | `info` | `error` / `warn` / `info` / `debug` |
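Flag-style variables accept `1` or `true`; the numeric ones fall back to the defaults above when unset. As an illustration of how such values could be interpreted, here is a hypothetical parsing helper (a sketch, not the server's actual env-handling code):

```typescript
// Hypothetical helpers for env vars like OPENROUTER_CACHE_RESPONSES ("1" / "true")
// and OPENROUTER_MODEL_CACHE_TTL_MS (integer with a default).

// Treat "1" or any casing of "true" as enabled; anything else (including
// undefined) as disabled.
function envFlag(value: string | undefined): boolean {
  return value === "1" || value?.toLowerCase() === "true";
}

// Parse a numeric variable, falling back to the documented default when the
// variable is unset, empty, or not a finite number.
function envInt(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  return Number.isFinite(n) ? n : fallback;
}
```

For example, `envInt(process.env.OPENROUTER_MODEL_CACHE_TTL_MS, 3600000)` would yield the documented one-hour default when the variable is absent.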
Security
- SSRF protection — URL fetches block private/link-local/reserved IPv4 and IPv6 targets (loopback, mapped, compat, multicast, 6to4, Teredo, ORCHID).
- Path sandbox — `save_path` is resolved against `OPENROUTER_OUTPUT_DIR`; traversal attempts are rejected. Override: `OPENROUTER_ALLOW_UNSAFE_PATHS=1`.
- No credential leakage — API key is never echoed in logs, responses, or errors. Audit logging captures every paid-op invocation.
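The path sandbox can be pictured as a resolve-then-prefix check: resolve the requested path against the root, then refuse anything that escapes it. A minimal sketch under that assumption — the server's real logic lives in `path-safety.ts`, and the names here are illustrative:

```typescript
import * as path from "node:path";

// Resolve `requested` against the sandbox root and reject traversal.
// Sketch only: the actual server may normalize symlinks, check platform
// separators, etc.
function resolveSandboxed(root: string, requested: string): string {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, requested);
  // Allow the root itself, or anything strictly inside it.
  if (resolved !== resolvedRoot && !resolved.startsWith(resolvedRoot + path.sep)) {
    throw new Error("UNSAFE_PATH: save_path escapes the sandbox root");
  }
  return resolved;
}
```

Under this scheme a relative `save_path` like `./river.mp4` lands inside the sandbox, while `../../etc/passwd` resolves outside the root and is rejected.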
Architecture
```
src/
├── index.ts               # Entry, env validation, graceful shutdown
├── tool-handlers.ts       # 14 tools (annotated) + dispatch
├── model-cache.ts         # TTL + in-flight coalescing
├── openrouter-api.ts      # REST client (chat + /videos)
├── errors.ts              # Closed ErrorCode enum
├── logger.ts              # JSON-line structured logger
└── tool-handlers/
    ├── fetch-utils.ts        # SSRF, bounded fetch, data-URL parser
    ├── openrouter-errors.ts  # SDK/HTTP → ErrorCode classifier
    ├── completion-utils.ts   # Reasoning-model cutoff detection
    ├── path-safety.ts        # save_path sandbox
    ├── chat-completion.ts    # Text + multimodal chat
    ├── analyze-image.ts      # Vision analysis
    ├── analyze-audio.ts      # Audio transcription
    ├── analyze-video.ts      # Video understanding
    ├── generate-image.ts     # Image generation
    ├── generate-audio.ts     # Audio generation + streaming
    ├── generate-video.ts     # Video generation (async)
    ├── image-utils.ts        # Sharp optimization, MIME sniffing
    ├── audio-utils.ts        # Audio format detection
    ├── video-utils.ts        # Video format detection
    ├── search-models.ts      # Model search
    ├── get-model-info.ts     # Model detail lookup
    └── validate-model.ts     # Model existence check
```
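The `model-cache.ts` entry above combines two ideas: TTL expiry and coalescing of concurrent fetches so that only one upstream request is ever in flight. A generic sketch of that pattern — illustrative, not the module's actual code:

```typescript
// Generic TTL cache with in-flight coalescing. While a fetch is pending,
// all callers share the same promise instead of issuing duplicate requests.
class TtlCache<T> {
  private value: T | undefined;
  private expiresAt = 0;
  private inflight: Promise<T> | null = null;

  constructor(private ttlMs: number, private fetcher: () => Promise<T>) {}

  async get(): Promise<T> {
    // Fresh cached value: return without touching the network.
    if (this.value !== undefined && Date.now() < this.expiresAt) {
      return this.value;
    }
    // A fetch is already running: coalesce onto it.
    if (this.inflight) return this.inflight;
    this.inflight = this.fetcher()
      .then((v) => {
        this.value = v;
        this.expiresAt = Date.now() + this.ttlMs;
        return v;
      })
      .finally(() => {
        this.inflight = null; // allow a retry after failure or expiry
      });
    return this.inflight;
  }
}
```

With a one-hour TTL (the `OPENROUTER_MODEL_CACHE_TTL_MS` default), repeated model lookups within that window would hit the cache, and a burst of simultaneous first calls would trigger a single upstream fetch.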
Design Principles & Research
v4.5's design draws from MCP best practices and academic research:
- Outcomes, not operations — Tools encapsulate whole workflows (fetch → validate → invoke → save) rather than exposing raw API primitives. Follows Phil Schmid's MCP production guide.
- Flattened arguments — Top-level primitives with enums reduce tool-call failure rates. Backed by Fu et al. (2025) showing success drops with schema complexity.
- Failure-mode documentation — Every tool description includes "Fails when:" and "Works with:" sections, improving selection accuracy per Schlapbach (2026).
- Untrusted content tagging — Analyze tools mark output `_meta.content_is_untrusted: true` to mitigate indirect prompt injection (Zhao et al., ClawGuard).
- Structured errors with retry hints — A closed `_meta.code` taxonomy plus `retry_after_seconds` beats raw error strings. Per Apigene's 12 Rules.
- MCP 2025-06-18 compliance — Structured outputs (`outputSchema`), progress notifications, tool annotations (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`).
OpenRouter platform features surfaced: Response caching · Web search · Reasoning tokens · Auto Exacto · Rerank · Prompt caching
Upgrading from v2
v3+ is additive — no tool schemas or env vars were removed.
- New tools: `analyze_video`, `generate_video`, `generate_video_from_image`, `get_video_status`, `rerank_documents`, `health_check`
- Structured `_meta.code` on every error response
- `save_path` sandboxed by default — set `OPENROUTER_OUTPUT_DIR` or `OPENROUTER_ALLOW_UNSAFE_PATHS=1`
Development
```bash
git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install && cp .env.example .env   # Add your API key
npm run build && npm start

npm test                      # 288 unit tests, <1s
npm run test:integration      # Live API tests (16 scenarios)
npm run lint
node scripts/live-e2e.mjs     # 16 live E2E scenarios
```
Compatibility
Works with any MCP-compatible client, including Kiro, Claude Desktop, Cursor, VS Code, Windsurf, and Cline.
License
Apache 2.0 — see LICENSE.
Contributing
Issues and PRs welcome. Please open an issue first for major changes.