https://github.com/stabgan/openrouter-mcp-multimodal

MCP server for OpenRouter: 300+ LLMs with vision, image gen, audio in/out, and video analysis + generation (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Structured errors, IPv6 SSRF guards, path sandbox.
# OpenRouter MCP Multimodal Server


The all-in-one MCP server for 300+ LLMs: text, vision, audio, and video in a single package.




3,800+ installs across npm + Docker Hub · ~950 npm installs/month and accelerating


Install · Tools · Quick Start · Config · Examples · Architecture · Changelog

---
[![Verified on MseeP](https://mseep.ai/badge.svg)](https://mseep.ai/app/8f27d6d4-0877-4b86-b377-8a33f451e755)

Access 300+ LLMs through [OpenRouter](https://openrouter.ai) via the [Model Context Protocol](https://modelcontextprotocol.io). Analyze images, audio, and video. Generate images, audio, and video. Chat with any model. Every tool returns structured `_meta.code` errors so MCP clients can switch on failure modes without parsing strings.

## One-Click Install

| Client | Install |
| :--- | :--- |
| Kiro | Add to Kiro |
| Cursor | Add to Cursor |
| VS Code | Add to VS Code |
| VS Code Insiders | Add to VS Code Insiders |
| Claude Desktop | Install Guide: add to `claude_desktop_config.json` |
| Windsurf | Install Guide: add to `~/.codeium/windsurf/mcp_config.json` |
| Cline | Install Guide: add via Cline MCP settings |
| Smithery | `npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude` |

> After clicking, the target client opens a confirmation prompt. You'll need to paste your `OPENROUTER_API_KEY`; the deeplink ships a placeholder so no secrets end up in shared links.

## Why This One?

| Feature | Status |
| :--- | :--- |
| Text chat with 300+ models | ✅ |
| Image analysis (vision) | ✅ Native with sharp optimization |
| Audio analysis | ✅ Transcription + analysis, base64 auto-encoded |
| Audio generation | ✅ Conversational, speech, and music with format auto-detection |
| Image generation | ✅ Path-sandboxed disk output |
| **Video understanding** | ✅ **v3**: mp4, mpeg, mov, webm from files, URLs, or data URLs |
| **Video generation** | ✅ **v3**: Veo 3.1 / Sora 2 Pro / Seedance / Wan via async API with progress notifications |
| Auto image resize + compress | ✅ Configurable (defaults 800px max, JPEG 80%) |
| Model search + validation | ✅ Filter by vision / audio / video modality |
| Free model support | ✅ Default: free Nemotron VL |
| Docker support | ✅ Multi-arch (amd64 + arm64), ~345 MB Alpine |
| Retry-After + jitter | ✅ Honors `Retry-After` header, avoids thundering herd |
| IPv4 + IPv6 SSRF blocklist | ✅ Covers mapped, compat, multicast, 6to4, Teredo, ORCHID |
| Structured error taxonomy | ✅ Closed `_meta.code` so clients can switch on failure modes |
| Reasoning-model awareness | ✅ Detects `max_tokens` cutoff during CoT, guides the caller |
| MCP 2025 tool annotations | ✅ `readOnlyHint` / `destructiveHint` / `idempotentHint` on every tool |
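
To make the "Retry-After + jitter" row concrete, here is a minimal sketch of how a retry delay could be chosen. It is not taken from this project's source: `retryDelayMs`, its parameters, and the 500 ms base and 30 s cap are all assumptions made for illustration. The idea is to obey the server's `Retry-After` value when one is present, and otherwise fall back to exponential backoff with a random component so that many clients do not retry at the same instant.

```typescript
// Hypothetical helper (not the project's code): pick a delay before retry
// number `attempt` (0-based). `random` is injectable so it can be tested.
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | null,
  random: () => number = Math.random,
  baseMs = 500, // assumed base delay
  capMs = 30000, // assumed upper bound for backoff
): number {
  if (retryAfterHeader !== null) {
    const trimmed = retryAfterHeader.trim();
    // Retry-After is either a whole number of seconds...
    if (/^\d+$/.test(trimmed)) {
      return Number(trimmed) * 1000;
    }
    // ...or an HTTP date naming the earliest retry time.
    const at = Date.parse(trimmed);
    if (!Number.isNaN(at)) {
      return Math.max(0, at - Date.now());
    }
  }
  // No usable header: exponential backoff with "full jitter", i.e. a
  // uniformly random delay between 0 and the current backoff ceiling.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Randomizing the whole interval, rather than adding a small offset to a fixed delay, spreads retries more evenly when many callers fail together.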

## Tools

| Tool | Description |
| :--- | :--- |
| `chat_completion` | Send messages to any OpenRouter model. Detects reasoning-model cutoffs. |
| `analyze_image` | Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp. |
| `analyze_audio` | Analyze/transcribe audio (WAV, MP3, FLAC, OGG, etc.) from files, URLs, or data URIs. |
| `analyze_video` | Analyze/transcribe video (mp4, mpeg, mov, webm) from files, URLs, or data URIs. |
| `generate_image` | Generate images from text prompts. Optional path-sandboxed disk save. |
| `generate_audio` | Generate audio from text. Auto-detects format, wraps raw PCM in WAV. |
| `generate_video` | Generate video via OpenRouter's async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Submits, polls, downloads, saves. |
| `get_video_status` | Resume polling a `generate_video` job by id. Download + save when complete. |
| `search_models` | Search/filter models by name, provider, or capabilities (vision / audio / video). |
| `get_model_info` | Get pricing, context length, and capabilities for any model. |
| `validate_model` | Check if a model ID exists on OpenRouter. |
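
The `generate_video` / `get_video_status` pair follows a submit, poll, and resume pattern. The sketch below shows one way such a loop could look. It is an illustration under stated assumptions, not the project's implementation: `pollVideoJob`, `JobStatus`, and `PollResult` are hypothetical names, `checkStatus` stands in for a real status request, and the two defaults simply mirror the documented `OPENROUTER_VIDEO_POLL_INTERVAL_MS` and `OPENROUTER_VIDEO_MAX_WAIT_MS` values.

```typescript
// Hypothetical types for illustration only.
type JobStatus = "pending" | "completed" | "failed";

type PollResult =
  | { done: true; status: "completed" | "failed" }
  | { done: false; videoId: string }; // resumable handle

async function pollVideoJob(
  videoId: string,
  checkStatus: (id: string) => Promise<JobStatus>,
  pollIntervalMs = 15000,
  maxWaitMs = 600000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<PollResult> {
  let waitedMs = 0; // counts time spent sleeping, not time spent in requests
  while (true) {
    const status = await checkStatus(videoId);
    if (status !== "pending") {
      return { done: true, status };
    }
    if (waitedMs + pollIntervalMs > maxWaitMs) {
      // Wait budget exhausted: return the id so the caller can resume later.
      return { done: false, videoId };
    }
    await sleep(pollIntervalMs);
    waitedMs += pollIntervalMs;
  }
}
```

Returning a handle instead of throwing when the budget runs out keeps a long render from being treated as a failure; the caller can pass the same id to a later status check.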

> All error responses carry `_meta.code` from a closed taxonomy: `INVALID_INPUT` · `UNSAFE_PATH` · `UPSTREAM_HTTP` · `UPSTREAM_TIMEOUT` · `UPSTREAM_REFUSED` · `UNSUPPORTED_FORMAT` · `RESOURCE_TOO_LARGE` · `ZDR_INCOMPATIBLE` · `MODEL_NOT_FOUND` · `JOB_FAILED` · `JOB_STILL_RUNNING` · `INTERNAL`
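
Because the taxonomy is closed, a client can branch on the code instead of matching message text. The following is a rough sketch of that idea. The twelve code names come from the list above; everything else is an assumption: `ToolErrorResult` is a simplified result shape, and the mapping from codes to `NextStep` values is an illustrative policy, not something this server prescribes.

```typescript
// The twelve documented codes as a closed union.
type ErrorCode =
  | "INVALID_INPUT" | "UNSAFE_PATH" | "UPSTREAM_HTTP" | "UPSTREAM_TIMEOUT"
  | "UPSTREAM_REFUSED" | "UNSUPPORTED_FORMAT" | "RESOURCE_TOO_LARGE"
  | "ZDR_INCOMPATIBLE" | "MODEL_NOT_FOUND" | "JOB_FAILED"
  | "JOB_STILL_RUNNING" | "INTERNAL";

// Hypothetical client-side policy, for illustration.
type NextStep = "retry" | "resume" | "fix_request" | "give_up";

// Assumed, simplified result shape; only `_meta.code` is documented above.
interface ToolErrorResult {
  isError?: boolean;
  _meta?: { code?: ErrorCode };
}

function nextStepFor(result: ToolErrorResult): NextStep {
  switch (result._meta?.code) {
    case "UPSTREAM_HTTP":
    case "UPSTREAM_TIMEOUT":
      return "retry"; // possibly transient upstream trouble
    case "JOB_STILL_RUNNING":
      return "resume"; // e.g. call get_video_status with the job id
    case "INVALID_INPUT":
    case "UNSAFE_PATH":
    case "UNSUPPORTED_FORMAT":
    case "RESOURCE_TOO_LARGE":
    case "MODEL_NOT_FOUND":
    case "ZDR_INCOMPATIBLE":
      return "fix_request"; // the caller has to change something
    default:
      // UPSTREAM_REFUSED, JOB_FAILED, INTERNAL, or no code at all.
      return "give_up";
  }
}
```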

## Quick Start

### Prerequisites

Get a free API key from [openrouter.ai/keys](https://openrouter.ai/keys).

### Option 1: npx (no install)

```json
{
  "mcpServers": {
    "openrouter": {
      "command": "npx",
      "args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-..."
      }
    }
  }
}
```

### Option 2: Docker

```json
{
  "mcpServers": {
    "openrouter": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "OPENROUTER_API_KEY=sk-or-v1-...",
        "stabgan/openrouter-mcp-multimodal:latest"
      ]
    }
  }
}
```

### Option 3: Global install

```bash
npm install -g @stabgan/openrouter-mcp-multimodal
```

```json
{
  "mcpServers": {
    "openrouter": {
      "command": "openrouter-multimodal",
      "env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
    }
  }
}
```

### Option 4: Smithery

```bash
npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude
```

## Configuration

Environment variables:

| Variable | Required | Default | Description |
| :--- | :---: | :--- | :--- |
| `OPENROUTER_API_KEY` | Yes | (none) | Your OpenRouter API key |
| `OPENROUTER_DEFAULT_MODEL` | No | `nvidia/nemotron-nano-12b-v2-vl:free` | Default model for chat + analyze tools |
| `DEFAULT_MODEL` | No | (none) | Alias for `OPENROUTER_DEFAULT_MODEL` |
| `OPENROUTER_MODEL_CACHE_TTL_MS` | No | `3600000` | Model cache TTL (ms) |
| `OPENROUTER_IMAGE_MAX_DIMENSION` | No | `800` | Longest edge for resize (px) |
| `OPENROUTER_IMAGE_JPEG_QUALITY` | No | `80` | JPEG quality (1–100) |
| `OPENROUTER_IMAGE_FETCH_TIMEOUT_MS` | No | `30000` | Image URL timeout |
| `OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES` | No | `26214400` | Image URL size cap (~25 MB) |
| `OPENROUTER_IMAGE_MAX_REDIRECTS` | No | `8` | Image URL redirect cap |
| `OPENROUTER_IMAGE_MAX_DATA_URL_BYTES` | No | `20971520` | Image data URL size cap (~20 MB) |
| `OPENROUTER_AUDIO_FETCH_TIMEOUT_MS` | No | `30000` | Audio URL timeout |
| `OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES` | No | `26214400` | Audio URL size cap (~25 MB) |
| `OPENROUTER_AUDIO_MAX_REDIRECTS` | No | `8` | Audio URL redirect cap |
| `OPENROUTER_AUDIO_MAX_DATA_URL_BYTES` | No | `20971520` | Audio data URL size cap (~20 MB) |
| `OPENROUTER_DEFAULT_VIDEO_MODEL` | No | `google/gemini-2.5-flash` | Default for `analyze_video` |
| `OPENROUTER_DEFAULT_VIDEO_GEN_MODEL` | No | `google/veo-3.1` | Default for `generate_video` |
| `OPENROUTER_VIDEO_FETCH_TIMEOUT_MS` | No | `60000` | Video URL timeout |
| `OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES` | No | `104857600` | Video URL size cap (~100 MB) |
| `OPENROUTER_VIDEO_MAX_REDIRECTS` | No | `8` | Video URL redirect cap |
| `OPENROUTER_VIDEO_MAX_DATA_URL_BYTES` | No | `104857600` | Video data URL size cap (~100 MB) |
| `OPENROUTER_VIDEO_POLL_INTERVAL_MS` | No | `15000` | Async video poll cadence |
| `OPENROUTER_VIDEO_MAX_WAIT_MS` | No | `600000` | Max wait before returning a resumable handle |
| `OPENROUTER_VIDEO_GEN_MAX_BYTES` | No | `268435456` | Generated video download cap (~256 MB) |
| `OPENROUTER_VIDEO_INLINE_MAX_BYTES` | No | `10485760` | Inline video ceiling (~10 MB) |
| `OPENROUTER_OUTPUT_DIR` | No | `process.cwd()` | Sandbox root for `save_path` |
| `OPENROUTER_ALLOW_UNSAFE_PATHS` | No | (none) | `1` disables the sandbox |
| `OPENROUTER_LOG_LEVEL` | No | `info` | `error` / `warn` / `info` / `debug` |
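
As a sketch of how numeric settings like these might be read with their defaults, consider the helper below. It is not the project's actual config loader: `intFromEnv` and `loadImageLimits` are hypothetical names, and falling back to the default on invalid input (rather than failing at startup) is an assumed design choice. The variable names and default values are the ones from the table.

```typescript
// Hypothetical helper: read a positive integer from an env-like object,
// falling back to `fallback` when the value is missing or invalid.
function intFromEnv(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  // Reject NaN, fractions, and non-positive values instead of guessing.
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

// Hypothetical grouping of the image-related settings.
function loadImageLimits(env: Record<string, string | undefined>) {
  return {
    maxDimension: intFromEnv(env, "OPENROUTER_IMAGE_MAX_DIMENSION", 800),
    // Clamp to the documented upper bound of the JPEG quality range.
    jpegQuality: Math.min(100, intFromEnv(env, "OPENROUTER_IMAGE_JPEG_QUALITY", 80)),
    fetchTimeoutMs: intFromEnv(env, "OPENROUTER_IMAGE_FETCH_TIMEOUT_MS", 30000),
    maxDownloadBytes: intFromEnv(env, "OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES", 26214400),
  };
}
```

In a Node.js process this would typically be called as `loadImageLimits(process.env)`.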

### Security notes

- **Analyze tools** can read local files and fetch HTTP(S) URLs. URL fetches block private/link-local/reserved IPv4 and IPv6 targets (SSRF mitigation) and cap response size.
- **Generate tools** write to disk through a path sandbox: `save_path` is resolved against `OPENROUTER_OUTPUT_DIR` and any traversal attempt is rejected. Override with `OPENROUTER_ALLOW_UNSAFE_PATHS=1`.
- **IPv6 SSRF blocklist** covers loopback, unspecified, IPv4-mapped, IPv4-compatible, link-local, site-local, ULA, multicast, documentation, Teredo, ORCHID, and 6to4 of private IPv4.
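
The path sandbox described above can be illustrated with a short sketch. It shows the general technique (resolve the requested path against a root, then verify the result did not escape it) and makes no claim about how this project's `path-safety.ts` is written. `resolveInSandbox` is a hypothetical name, and the error message format is an assumption. Note that this sketch does not follow symlinks, which a hardened implementation would also need to consider.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve `savePath` inside `root`, or throw if the
// result would land outside it.
function resolveInSandbox(root: string, savePath: string): string {
  const absoluteRoot = path.resolve(root);
  // An absolute `savePath` replaces the root here; the check below catches it.
  const candidate = path.resolve(absoluteRoot, savePath);
  const relative = path.relative(absoluteRoot, candidate);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`UNSAFE_PATH: ${savePath} resolves outside ${absoluteRoot}`);
  }
  return candidate;
}
```

Comparing via `path.relative` rather than a plain string prefix avoids accepting a sibling directory such as `/out-evil` when the root is `/out`.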

## Usage Examples

```
# Chat
Use chat_completion to explain quantum computing in simple terms.

# Vision
Use analyze_image on /path/to/photo.jpg and tell me what you see.

# Audio transcription
Use analyze_audio on /path/to/recording.mp3 to transcribe it.

# Video understanding
Use analyze_video on /path/to/clip.mp4 and tell me what happens at 00:15.

# Generate audio
Use generate_audio with prompt "Explain neural networks" and voice "alloy", save to ./response.wav

# Generate music
Use generate_audio with model "google/lyria-3-clip-preview" and prompt "upbeat jazz piano trio"

# Generate image
Use generate_image with prompt "a cat astronaut on mars" and save to ./cat.png

# Generate video
Use generate_video with model "google/veo-3.1", prompt "a calm river at sunrise",
resolution 720p, duration 4, save to ./river.mp4

# Resume a video job
Use get_video_status with video_id "vid_abc123" and save_path "./river.mp4"
```
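
The prompts above are what you type into an MCP client; the client turns them into `tools/call` requests. As a rough illustration of what travels over the wire, the sketch below builds such a request for the "resume a video job" example. The `video_id` and `save_path` argument names are taken from that example; `toolCall` and `ToolCallRequest` are hypothetical names, and a real client library would normally build and send this message for you.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper for building the request object.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const resumeRequest = toolCall(1, "get_video_status", {
  video_id: "vid_abc123",
  save_path: "./river.mp4",
});

// Serialized form, as it could be written to the server's stdin.
const resumeLine = JSON.stringify(resumeRequest);
```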

## Architecture

```
src/
├── index.ts                  # Entry, env validation, graceful shutdown
├── tool-handlers.ts          # 11 tools (annotated) + dispatch
├── model-cache.ts            # TTL + in-flight coalescing
├── openrouter-api.ts         # REST client (chat + /videos)
├── errors.ts                 # Closed ErrorCode enum
├── logger.ts                 # JSON-line structured logger
└── tool-handlers/
    ├── fetch-utils.ts        # SSRF, bounded fetch, data-URL parser
    ├── openrouter-errors.ts  # SDK/HTTP → ErrorCode classifier
    ├── completion-utils.ts   # Reasoning-model cutoff detection
    ├── path-safety.ts        # save_path sandbox
    ├── chat-completion.ts    # Text + multimodal chat
    ├── analyze-image.ts      # Vision analysis
    ├── analyze-audio.ts      # Audio transcription
    ├── analyze-video.ts      # Video understanding
    ├── generate-image.ts     # Image generation
    ├── generate-audio.ts     # Audio generation + streaming
    ├── generate-video.ts     # Video generation (async)
    ├── image-utils.ts        # Sharp optimization, MIME sniffing
    ├── audio-utils.ts        # Audio format detection
    ├── video-utils.ts        # Video format detection
    ├── search-models.ts      # Model search
    ├── get-model-info.ts     # Model detail lookup
    └── validate-model.ts     # Model existence check
```
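
The "TTL + in-flight coalescing" note on `model-cache.ts` names a pattern that is easy to get subtly wrong, so here is a minimal sketch of it. This is a generic illustration, not the contents of that file: `CoalescingCache` and its members are hypothetical. The cache serves a stored value until it expires, and while a refresh is running, every caller shares the same pending promise so the loader is invoked only once.

```typescript
// Hypothetical sketch of a single-value cache with TTL and coalescing.
class CoalescingCache<T> {
  private value: T | undefined;
  private expiresAt = 0;
  private inFlight: Promise<T> | undefined;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => Promise<T>, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Deliberately not `async`: concurrent callers receive the same promise.
  get(): Promise<T> {
    if (this.value !== undefined && this.now() < this.expiresAt) {
      return Promise.resolve(this.value);
    }
    if (this.inFlight === undefined) {
      this.inFlight = this.load().then(
        (fresh) => {
          this.value = fresh;
          this.expiresAt = this.now() + this.ttlMs;
          this.inFlight = undefined;
          return fresh;
        },
        (err) => {
          // Clear the slot so the next call can retry after a failure.
          this.inFlight = undefined;
          throw err;
        },
      );
    }
    return this.inFlight;
  }
}
```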

## Development

```bash
git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install
cp .env.example .env # Add your API key
npm run build
npm start
```

```bash
npm test # 163 unit tests, <1s
npm run test:integration # Live API tests
npm run lint
node scripts/live-e2e.mjs # 16 live E2E scenarios
```

## Upgrading from v2

v3 is **additive**: no tool schemas or env vars were removed.

- Three new tools: `analyze_video`, `generate_video`, `get_video_status`
- Structured `_meta.code` on every error response (text messages preserved)
- `save_path` is sandboxed by default; set `OPENROUTER_OUTPUT_DIR` or `OPENROUTER_ALLOW_UNSAFE_PATHS=1`
- Reasoning-model awareness: `content: null` + `finish_reason: length` now returns `INVALID_INPUT` with a preview instead of an empty string
- IPv6 SSRF coverage extended to mapped, compat, multicast, 6to4, Teredo, ORCHID
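
The reasoning-model change in the list above can be sketched as follows. This is an illustration of the documented behavior, not the project's `completion-utils.ts`: `interpretChoice`, `ChatChoice`, and `Outcome` are hypothetical names, the `reasoning` field on the message and the idea that the preview is drawn from it are assumptions, and the hint wording is invented.

```typescript
// Minimal subset of an OpenAI-style chat completion choice.
// The optional `reasoning` field is an assumption for this sketch.
interface ChatChoice {
  finish_reason: string | null;
  message: { content: string | null; reasoning?: string | null };
}

type Outcome =
  | { ok: true; text: string }
  | { ok: false; code: "INVALID_INPUT"; hint: string; preview: string };

function interpretChoice(choice: ChatChoice, previewChars = 200): Outcome {
  const { content } = choice.message;
  // No answer text and the model stopped because it hit the token limit:
  // the budget was most likely spent on reasoning.
  if (content === null && choice.finish_reason === "length") {
    const reasoning = choice.message.reasoning ?? "";
    return {
      ok: false,
      code: "INVALID_INPUT",
      hint: "Token budget ran out before an answer was produced; try a higher max_tokens.",
      preview: reasoning.slice(0, previewChars),
    };
  }
  return { ok: true, text: content ?? "" };
}
```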

## Compatibility

Works with any MCP-compatible client, including [Kiro](https://kiro.dev) · [Claude Desktop](https://claude.ai/download) · [Cursor](https://cursor.sh) · [Windsurf](https://codeium.com/windsurf) · [Cline](https://github.com/cline/cline).

## License

MIT

## Contributing

Issues and PRs welcome. Please open an issue first for major changes.