AVCLabs Media MCP
AVCLabs MCP integrates AI-powered video upscaling, quality enhancement, and SAM3 image segmentation into MCP workflows. It enhances low-resolution videos, cleans noisy footage, and extracts target objects through text prompts.
media-mcp (Node.js)
A video enhancement and image segmentation service built on the MCP protocol. It runs as an MCP server for your Agent and acts as an HTTP client to the backend servers.
Features
Provides the following MCP Tools:
Video Enhancement
- create_task - Create a video enhancement task (supports URL or local file upload)
- get_task_status - Query task status
- enhance_video_sync - Synchronously enhance video (blocking wait, truncated at ~50s by default)
Image Segmentation (SAM3)
- sam3_predict - SAM3 image segmentation (supports local path, URL, or Base64 image)
Prerequisites
- Node.js >= 18 (check: node --version)
- API Key (required for authentication)
Lazy Install (Recommended)
If your AI Agent has a known MCP config path, just copy the line below and send it to your AI:
Install the npm package @avclabs.ai/media-mcp as an MCP server. My API Key is: sk-xxxxxxxx.
The AI will automatically:
- Detect your MCP client
- Find the config file path
- Write the correct configuration
- Prompt you to restart the client
Manual Install
No installation needed. Use npx directly in your MCP client config.
1. Claude Code (CLI)
Run in Claude Code:
/mcp
Check the output for the "User MCPs" section to find the config file path, then edit that file.
Common paths (if /mcp is unavailable):
- Windows: %USERPROFILE%\.claude.json
- macOS: ~/.claude.json
- Linux: ~/.claude.json
- Legacy/Alternative: ~/.claude/mcp.json
Paste this (replace your-api-key):
{
"mcpServers": {
"video-enhancement": {
"command": "npx",
"args": ["-y", "@avclabs.ai/media-mcp@latest"],
"env": {
"API_KEY": "your-api-key"
}
}
}
}
Save and run /mcp to verify it's loaded.
2. Cursor
Go to Settings > Tools & MCPs > Add New MCP Server:
- Name: video-enhancement
- Type: command
- Command: env API_KEY=your-api-key npx -y @avclabs.ai/media-mcp@latest
Or edit ~/.cursor/mcp.json:
{
"mcpServers": {
"video-enhancement": {
"command": "npx",
"args": ["-y", "@avclabs.ai/media-mcp@latest"],
"env": {
"API_KEY": "your-api-key"
}
}
}
}
Verify Installation
After restarting your client, check that the tools are available:
- Ask: "What tools do you have available?"
- You should see: create_task, get_task_status, enhance_video_sync, sam3_predict
Configuration Options
| Variable | Required | Default | Description |
|---|---|---|---|
| API_KEY | Yes | - | API authentication key (shared by video enhancement and SAM3) |
| HTTP_API_BASE_URL | No | https://mcp.avc.ai/enhance | Video enhancement service endpoint |
| SAM3_API_BASE_URL | No | https://mcp.avc.ai/sam | SAM3 service endpoint |
| SAM3_POLL_INTERVAL | No | 2000 | SAM3 polling interval (milliseconds) |
| SAM3_POLL_MAX_ATTEMPTS | No | 25 | SAM3 maximum polling attempts |
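To make the defaults in the table concrete, here is a minimal sketch of how these variables could be resolved. The helper name resolveConfig and the returned field names are hypothetical and are not taken from the package; only the variable names and default values come from the table.

```javascript
// Defaults copied from the configuration table above.
const DEFAULTS = {
  HTTP_API_BASE_URL: "https://mcp.avc.ai/enhance",
  SAM3_API_BASE_URL: "https://mcp.avc.ai/sam",
  SAM3_POLL_INTERVAL: 2000, // milliseconds
  SAM3_POLL_MAX_ATTEMPTS: 25,
};

// Hypothetical helper: merge an env-like object with the documented defaults.
function resolveConfig(env) {
  if (!env.API_KEY) {
    throw new Error("API_KEY is required");
  }
  // Parse a positive integer, falling back to the default when unset or invalid.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value ?? "", 10);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    apiKey: env.API_KEY,
    httpApiBaseUrl: env.HTTP_API_BASE_URL || DEFAULTS.HTTP_API_BASE_URL,
    sam3ApiBaseUrl: env.SAM3_API_BASE_URL || DEFAULTS.SAM3_API_BASE_URL,
    sam3PollIntervalMs: toInt(env.SAM3_POLL_INTERVAL, DEFAULTS.SAM3_POLL_INTERVAL),
    sam3PollMaxAttempts: toInt(env.SAM3_POLL_MAX_ATTEMPTS, DEFAULTS.SAM3_POLL_MAX_ATTEMPTS),
  };
}

const exampleConfig = resolveConfig({ API_KEY: "your-api-key", SAM3_POLL_MAX_ATTEMPTS: "10" });
console.log(exampleConfig.sam3ApiBaseUrl);      // https://mcp.avc.ai/sam
console.log(exampleConfig.sam3PollMaxAttempts); // 10
```

In a real process you would pass process.env instead of a literal object.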
Custom Endpoint
{
"env": {
"HTTP_API_BASE_URL": "https://your-endpoint.com",
"API_KEY": "your-api-key",
"SAM3_API_BASE_URL": "https://your-sam3-endpoint.com"
}
}
Or via CLI args:
npx -y @avclabs.ai/media-mcp@latest --base-url https://your-endpoint.com --api-key your-api-key --sam3-base-url https://your-sam3-endpoint.com
Recommended Workflow
This project provides both synchronous and asynchronous modes.
Because MCP Agents typically enforce a ~60-second timeout per tool call, asynchronous mode is strongly recommended for tasks with longer processing times, such as video enhancement:
Asynchronous Mode (Recommended)
Video Enhancement:
1. Call create_task to create a task → immediately get task_id
2. Wait a few seconds, then call get_task_status to query the status
3. If status is processing, continue waiting and repeat step 2
4. If status is completed, the task is done and the result contains video_url
5. If status is failed, the task failed and the result contains error_message
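If you drive the tools from a script rather than from a chat Agent, the steps above can be sketched roughly as follows. The function pollUntilDone and the callTool(name, args) callback are assumptions for illustration: callTool stands in for whatever your MCP client uses to invoke a tool and return its parsed JSON result. The demo uses a fake tool, so nothing is sent over the network.

```javascript
// Hypothetical helper: create a task, then poll get_task_status until it settles.
async function pollUntilDone(callTool, createArgs, { intervalMs = 3000, maxPolls = 100 } = {}) {
  const created = await callTool("create_task", createArgs);
  for (let i = 0; i < maxPolls; i++) {
    // Wait before each status query, as the workflow recommends.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    const result = await callTool("get_task_status", { task_id: created.task_id });
    if (result.status === "completed") return { ok: true, videoUrl: result.video_url };
    if (result.status === "failed") return { ok: false, error: result.error_message };
    // Otherwise status is "processing": loop and query again.
  }
  return { ok: false, error: "gave up after " + maxPolls + " polls" };
}

// Demo with a fake tool that reports "completed" on the third status query.
let demoStatusCalls = 0;
async function fakeCallTool(name, args) {
  if (name === "create_task") return { success: true, task_id: "demo", status: "processing" };
  demoStatusCalls += 1;
  return demoStatusCalls < 3
    ? { success: true, task_id: args.task_id, status: "processing" }
    : { success: true, task_id: args.task_id, status: "completed", video_url: "https://example.com/out.mp4" };
}

pollUntilDone(fakeCallTool, { video_source: "https://example.com/video.mp4" }, { intervalMs: 10 })
  .then((outcome) => console.log(outcome));
```

Each iteration is a separate short tool call, which is why this pattern stays under the per-call timeout.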
Synchronous Mode (Simple Scenarios)
Video Enhancement:
- Call enhance_video_sync → the server polls internally
- Defaults to a maximum wait of 50 seconds
- If completed within 50 seconds, returns the result directly
- If not completed within 50 seconds, returns task_id and instructions for the Agent to switch to get_task_status
Image Segmentation (SAM3):
- Call sam3_predict → the server polls internally
- Defaults to a maximum wait of 50 seconds (25 attempts × 2-second polling interval)
- If completed within 50 seconds, returns the segmentation result directly
- If not completed within 50 seconds, returns a truncation notice indicating the task is still processing
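The truncation behaviour described above can be pictured with a small sketch. This is not the package's actual code: waitWithTruncation, checkStatus, and the injectable sleep option are hypothetical names, and the 50-second budget with a 5-second interval simply mirrors the documented enhance_video_sync defaults.

```javascript
// Hypothetical sketch: poll internally, but return early once the wait budget is spent.
async function waitWithTruncation(taskId, checkStatus, { timeoutS = 50, pollIntervalS = 5, sleep } = {}) {
  const pause = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let waitedS = 0;
  while (waitedS < timeoutS) {
    const state = await checkStatus(taskId);
    if (state.status !== "processing") return state; // completed or failed
    await pause(pollIntervalS * 1000);
    waitedS += pollIntervalS;
  }
  // Budget exhausted: hand the task_id back instead of blocking any longer.
  return {
    success: true,
    status: "processing",
    task_id: taskId,
    message: `Task is still processing (waited ${waitedS} seconds). Please use get_task_status to continue polling.`,
  };
}

// Demo: a task that never finishes, with sleeping stubbed out so it runs instantly.
waitWithTruncation("demo", async () => ({ status: "processing" }), { sleep: async () => {} })
  .then((result) => console.log(result.status, result.task_id)); // processing demo
```

Keeping the budget below the Agent's ~60-second limit is what lets the tool answer with a task_id rather than being cut off mid-call.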
Usage Examples
Once configured, ask your AI agent naturally:
"Enhance this video to 1080p: https://example.com/video.mp4"
"Improve the quality of /Users/me/Desktop/video.mp4 to 2k"
"Analyze this image and find all objects: C:\Users\xxx\photo.png"
"Use SAM3 to segment this image, prompt: 'find all cars'"
The agent will automatically choose sync or async tools based on task complexity.
Provided Tools
Video Enhancement
create_task
Create an asynchronous video enhancement task.
Recommended for most use cases. Ideal for longer videos (over 10 seconds) to avoid timeouts and blocking the connection.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_source | string | Yes | - | Video URL or local file path (URL must be publicly accessible, links requiring login or signatures are not supported) |
| type | string | No | url | url or local |
| resolution | string | No | 720p | 480p, 540p, 720p, 1080p, 2k |
Returns:
{
"success": true,
"task_id": "xxx",
"status": "processing"
}
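As an illustration of the parameter table, the sketch below checks create_task arguments before they are sent. buildCreateTaskArgs is a hypothetical helper, and the http(s) prefix check for type url is an added assumption; the allowed values and defaults come from the table.

```javascript
const RESOLUTIONS = ["480p", "540p", "720p", "1080p", "2k"];
const SOURCE_TYPES = ["url", "local"];

// Hypothetical helper: apply the documented defaults and reject unknown values.
function buildCreateTaskArgs({ video_source, type = "url", resolution = "720p" }) {
  if (typeof video_source !== "string" || video_source.length === 0) {
    throw new Error("video_source is required");
  }
  if (!SOURCE_TYPES.includes(type)) {
    throw new Error(`type must be one of: ${SOURCE_TYPES.join(", ")}`);
  }
  if (!RESOLUTIONS.includes(resolution)) {
    throw new Error(`resolution must be one of: ${RESOLUTIONS.join(", ")}`);
  }
  // Assumption for this sketch: a "url" source should look like an http(s) link.
  if (type === "url" && !/^https?:\/\//i.test(video_source)) {
    throw new Error("type is url but video_source is not an http(s) URL");
  }
  return { video_source, type, resolution };
}

console.log(buildCreateTaskArgs({ video_source: "https://example.com/video.mp4", resolution: "1080p" }));
```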
get_task_status
Query video enhancement task status.
The returned status field can be: processing, completed, or failed. If status is processing, you need to wait a few seconds and call this tool again.
| Parameter | Type | Required |
|---|---|---|
| task_id | string | Yes |
Returns:
{
"success": true,
"task_id": "xxx",
"status": "completed",
"progress": 100,
"video_url": "https://...",
"message": "Task is still processing, please check again later"
}
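The message field only appears when status is processing, prompting the Agent to continue waiting. The example above lists every possible field together; a completed response would not carry that message.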
enhance_video_sync
Synchronously enhance video (blocks until completion).
Best for short videos (estimated processing time < 1 minute). If the task is not completed within 50 seconds, the tool returns early with a task_id, and you need to use get_task_status to continue querying.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_source | string | Yes | - | Video URL or local file path |
| type | string | No | url | url or local |
| resolution | string | No | 720p | Target resolution |
| poll_interval | number | No | 5 | Poll interval (seconds) |
| timeout | number | No | 50 | Sync wait timeout (seconds), returns early when exceeded |
Truncated return example (not completed within 50s):
{
"success": true,
"status": "processing",
"task_id": "xxx",
"message": "Task is still processing (waited 50 seconds). Please use get_task_status to continue polling.",
"note": "The synchronous wait for this long-running task has been truncated. Switch to get_task_status polling."
}
Image Segmentation (SAM3)
sam3_predict
Analyze an image using the SAM3 segmentation API to generate inference results (masks, boxes, scores).
Parameters:
Image input (choose one, must provide exactly one):
- imagePath (string): Absolute path of a local image file. Supports common formats (PNG, JPG, JPEG).
  - Example: "C:\\Users\\xxx\\photo.png", "/home/user/images/cat.jpg"
  - Use when: The user explicitly provides a local file path
- imageUrl (string): Publicly accessible URL of the image.
  - Example: "https://example.com/photo.jpg"
  - Use when: The image is already online and the user provides a link
  - Note: The URL must be publicly accessible. Links requiring login or signatures are not supported
- imageBase64 (string): Base64-encoded image data.
  - Example: "iVBORw0KGgoAAAANSUhEUgAA..."
  - Use when: The user drags or uploads an image attachment, and the Agent encodes it as base64
  - Note: Large images will produce very large base64 strings, which may slow transmission
Other parameters:
- prompt (string, required): English text prompt specifying the target object to segment. Since the SAM3 model only accepts English prompts, provide an English description. If the user provides Chinese or other non-English text, the Agent will automatically translate it before calling the tool.
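The "exactly one image input" rule can be checked on the caller side before invoking the tool. The sketch below is only an illustration: buildSam3Args is a hypothetical helper, while the parameter names are the ones listed above.

```javascript
const IMAGE_KEYS = ["imagePath", "imageUrl", "imageBase64"];

// Hypothetical helper: enforce exactly one image input plus a non-empty prompt.
function buildSam3Args(input) {
  const provided = IMAGE_KEYS.filter(
    (key) => typeof input[key] === "string" && input[key].length > 0
  );
  if (provided.length !== 1) {
    throw new Error(`provide exactly one of ${IMAGE_KEYS.join(", ")} (got ${provided.length})`);
  }
  if (typeof input.prompt !== "string" || input.prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  // Keep only the chosen image field and the trimmed prompt.
  return { [provided[0]]: input[provided[0]], prompt: input.prompt.trim() };
}

console.log(buildSam3Args({ imageUrl: "https://example.com/photo.jpg", prompt: "car" }));
```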
Normal completion return:
After inference completes, returns a JSON string containing three fields:
- masks: 2D array. Each element is a binary mask (values 0 or 1) with the same dimensions as the input image, marking the pixel-level location of detected objects. The i-th mask corresponds to the i-th detected object instance.
- boxes: 2D array. Each element is a bounding box in [x1, y1, x2, y2] format, representing the rectangular region of the detected object. x1, y1 are the top-left coordinates; x2, y2 are the bottom-right coordinates.
  Coordinate system: The top-left corner of the image is the origin (0, 0). The x-axis increases to the right, and the y-axis increases downward, in pixels. For example, [120, 80, 300, 450] means the region starts 120px from the left edge and 80px from the top edge, extending to 300px from the left and 450px from the top. Width = x2 - x1 = 180px, Height = y2 - y1 = 370px.
- scores: 1D array. Each element is a confidence score for the corresponding detection result, ranging from 0 to 1. Higher scores indicate greater model confidence.
Example result JSON:
{
"masks": [
[[0, 0, 1, ...], [0, 1, 1, ...], ...],
[[0, 0, 0, ...], [0, 0, 1, ...], ...]
],
"boxes": [
[120, 80, 300, 450],
[400, 200, 600, 500]
],
"scores": [0.95, 0.87]
}
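A short sketch of how a caller might read this result: parse the JSON string, derive each box's width and height, count the mask pixels, and keep only confident detections. summarizeDetections and the 0.5 default threshold are assumptions for illustration, and the tiny masks in the demo are toy data that do not match the box sizes.

```javascript
// Hypothetical helper: turn a parsed SAM3 result into per-object summaries.
function summarizeDetections(result, minScore = 0.5) {
  return result.boxes
    .map(([x1, y1, x2, y2], i) => ({
      index: i,
      score: result.scores[i],
      width: x2 - x1,
      height: y2 - y1,
      // Count the 1-valued pixels in the i-th mask.
      maskPixels: result.masks[i].reduce(
        (sum, row) => sum + row.reduce((a, b) => a + b, 0),
        0
      ),
    }))
    .filter((detection) => detection.score >= minScore);
}

// The tool returns a JSON string, so parse it first (toy 2x3 masks here).
const sampleResult = JSON.parse(
  '{"masks":[[[0,1,1],[0,1,0]],[[0,0,0],[0,0,1]]],' +
  '"boxes":[[120,80,300,450],[400,200,600,500]],' +
  '"scores":[0.95,0.87]}'
);

// With a 0.9 threshold only the first object remains: width 180, height 370, maskPixels 3.
console.log(summarizeDetections(sampleResult, 0.9));
```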
Truncated return example (not completed within 50s):
{
"success": true,
"status": "processing",
"task_id": "xxx",
"message": "Task is still processing (waited about 50 seconds). Please retry later or record this task_id for manual follow-up.",
"note": "The synchronous wait for this long-running task has been truncated."
}
FAQ
Agent reports timeout when calling tools?
This is the primary issue this project addresses. MCP Agents (such as Claude, Cursor) typically enforce a ~60-second timeout per tool call. If task processing exceeds this limit, the Agent will error and disconnect.
Solutions:
- Prefer asynchronous tools: For video enhancement and other time-consuming tasks, always use create_task + get_task_status. These tools return instantly on each call and will not trigger timeouts.
- Sync tool truncation mechanism: enhance_video_sync has an internal 50-second truncation limit. If the task is not completed within 50 seconds, the tool proactively returns a task_id and instructs the Agent to use get_task_status to follow up.
- SAM3 truncation mechanism: sam3_predict defaults to 25 polling attempts (~50 seconds). If the task is not completed, it returns a truncation notice indicating the task is still processing.
- Adjust SAM3 polling parameters (advanced): If you are confident that SAM3 tasks are usually fast (e.g., under 10 seconds), you can increase polling attempts via environment variable: SAM3_POLL_MAX_ATTEMPTS=60. But ensure the total wait time does not exceed your Agent's timeout limit.
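Before changing the polling variables, it helps to work out the total wait they imply. The sketch below multiplies interval by attempts and compares the result with an assumed 60-second Agent limit; sam3WaitBudget is a hypothetical helper, and the estimate ignores the time each request itself takes.

```javascript
// Hypothetical helper: total SAM3 wait = polling interval x maximum attempts.
function sam3WaitBudget({ pollIntervalMs = 2000, maxAttempts = 25, agentTimeoutMs = 60000 } = {}) {
  const totalWaitMs = pollIntervalMs * maxAttempts;
  return { totalWaitMs, fitsAgentTimeout: totalWaitMs < agentTimeoutMs };
}

console.log(sam3WaitBudget());                    // { totalWaitMs: 50000, fitsAgentTimeout: true }
console.log(sam3WaitBudget({ maxAttempts: 60 })); // { totalWaitMs: 120000, fitsAgentTimeout: false }
```

With the default 2000 ms interval, 60 attempts add up to 120 seconds, so raising SAM3_POLL_MAX_ATTEMPTS that far only stays within a ~60-second limit if SAM3_POLL_INTERVAL is lowered as well.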
Drag-and-drop attachment says file not found?
This is a known limitation of stdio MCP. When dragging or uploading an attachment through the Agent interface, the file path is usually not automatically passed to the MCP Server.
Solutions:
- Provide the path simultaneously (recommended): After dragging the image, add the local absolute path in your message: "Please analyze this image D:\\photos\\cat.jpg and find the cat"
- Wait for auto-encoding: Claude may automatically encode the image as base64. If successful, no extra action is needed.
- Reply to path inquiry: If Claude asks for the image path, simply reply with the local absolute path.
Is there a priority among the three input methods?
There is no strict priority. Claude will automatically choose the most appropriate method based on conversation context:
- You provided a local path → uses imagePath
- You provided a web link → uses imageUrl
- You dragged an attachment without a path → tries imageBase64
What image formats are supported?
Common formats: PNG, JPG, JPEG, BMP, WebP, etc. PNG or JPG is recommended.
What if URL image download fails?
Ensure the URL is publicly accessible, requiring no login, cookies, or signatures. If the image is on a service requiring authentication (e.g., private S3 Bucket, login-required image host), download it locally first and use imagePath.
What if the base64 image is too large?
If the image is very large (e.g., 4K resolution), the base64-encoded data will be very large and may slow transmission. Suggestions:
- Use imagePath instead
- Or compress the image before encoding
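To get a feel for the overhead, standard Base64 turns every 3 bytes into 4 characters, so the encoded string is roughly a third larger than the file. The helper name base64Length below is just for illustration.

```javascript
// Length of the standard (padded) Base64 encoding of byteCount bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const fiveMiB = 5 * 1024 * 1024;    // 5242880 bytes
console.log(base64Length(fiveMiB)); // 6990508
```

A 5 MiB image therefore becomes a string of about 7 million characters, which is why imagePath is the better choice for large files.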
File Upload Notes
When type is "local":
- File is read locally by the MCP Server
- Uploaded directly to TOS object storage via pre-signed URL
- Max file size: 100MB
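A local pre-check can catch oversized files before an upload is attempted. This is a minimal sketch: checkUploadSize is a hypothetical helper, and reading "100MB" as 100 × 1024 × 1024 bytes is an assumption, since the exact unit is not stated here.

```javascript
// Assumption: "100MB" is interpreted as 100 MiB; the service may count differently.
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

// Hypothetical helper: report whether a file of the given size is within the limit.
function checkUploadSize(sizeBytes) {
  if (!Number.isInteger(sizeBytes) || sizeBytes < 0) {
    throw new Error("sizeBytes must be a non-negative integer");
  }
  if (sizeBytes <= MAX_UPLOAD_BYTES) {
    return { ok: true };
  }
  const sizeMB = (sizeBytes / 1024 / 1024).toFixed(1);
  return { ok: false, reason: `file is ${sizeMB} MB, limit is 100 MB` };
}

console.log(checkUploadSize(20 * 1024 * 1024));  // { ok: true }
console.log(checkUploadSize(150 * 1024 * 1024)); // ok is false, with a reason string
```

In Node.js the size can be obtained with fs.statSync(path).size before calling the tool with type local.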
Troubleshooting
"command not found: npx"
Install Node.js >= 18: https://nodejs.org/
"Error: --api-key argument or API_KEY environment variable is required"
Your API Key is missing. Double-check the env.API_KEY in your config.
MCP Server shows red/error in client
Check logs:
- Claude Desktop macOS: ~/Library/Logs/Claude/mcp*.log
- Claude Desktop Windows: %APPDATA%\Claude\logs\mcp*.log
- Cursor: Output panel > MCP
"TOS upload failed"
Usually a signature mismatch. Ensure your HTTP_API_BASE_URL and API_KEY are correct and active.
Global Install (Alternative)
If you prefer not to use npx every time:
npm install -g @avclabs.ai/media-mcp
Then use "command": "media-mcp" with "args": ["--api-key", "your-api-key"] in your config.
License
MIT License - See LICENSE file for details