AVCLabs Media MCP Server
AVCLabs MCP integrates AI-powered video upscaling, quality enhancement, and SAM3 image segmentation into MCP workflows. It enhances low-resolution videos, cleans noisy footage, and extracts target objects through text prompts.
Documentation
media-mcp (Node.js)
A video enhancement and image segmentation service built on the MCP protocol. It runs as an MCP server toward the Agent and as an HTTP client toward the backend servers.
Features
Provides the following MCP Tools:
Video Enhancement
- create_task - Create a video enhancement task (supports URL or local file upload)
- get_task_status - Query task status
- enhance_video_sync - Synchronously enhance video (blocking wait, truncated at ~50s by default)
Image Segmentation (SAM3)
- sam3_predict - SAM3 image segmentation (supports local path, URL, or Base64 image)
Prerequisites
- Node.js >= 18 (check: node --version)
- API Key (required for authentication)
Lazy Install (Recommended)
If your AI Agent has a known MCP config path, just copy the line below and send it to your AI:
Install the npm package @avclabs.ai/media-mcp as an MCP server. My API Key is: sk-xxxxxxxx.
The AI will automatically:
- Detect your MCP client
- Find the config file path
- Write the correct configuration
- Prompt you to restart the client
Manual Install
No installation needed. Use npx directly in your MCP client config.
1. Claude Code (CLI)
Run in Claude Code:
/mcp
Check the output for the "User MCPs" section to find the config file path, then edit that file.
Common paths (if /mcp is unavailable):
- Windows: %USERPROFILE%\.claude.json
- macOS: ~/.claude.json
- Linux: ~/.claude.json
- Legacy/Alternative: ~/.claude/mcp.json
Paste this (replace your-api-key):
{
  "mcpServers": {
    "video-enhancement": {
      "command": "npx",
      "args": ["-y", "@avclabs.ai/media-mcp@latest"],
      "env": {
        "API_KEY": "your-api-key"
      }
    }
  }
}
Save and run /mcp to verify it's loaded.
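If the config file already lists other servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of this package: addMediaMcpServer is a hypothetical helper, the "other" server is made up, and reading or writing the actual file is left to the caller.

```javascript
// Hypothetical helper: returns a new config object with the media-mcp entry added.
// It does not touch the file system and does not mutate the input object.
function addMediaMcpServer(config, apiKey, name = "video-enhancement") {
  if (!apiKey) {
    throw new Error("apiKey is required");
  }
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      [name]: {
        command: "npx",
        args: ["-y", "@avclabs.ai/media-mcp@latest"],
        env: { API_KEY: apiKey },
      },
    },
  };
}

// Example: an existing config that already contains another (made-up) server.
const existing = { mcpServers: { other: { command: "node", args: ["other.js"] } } };
const updated = addMediaMcpServer(existing, "your-api-key");
console.log(JSON.stringify(updated, null, 2));
```

Returning a new object keeps the original config available in case the write fails and you want to roll back.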
2. Cursor
Go to Settings > Tools & MCPs > Add New MCP Server:
- Name: video-enhancement
- Type: command
- Command: env API_KEY=your-api-key npx -y @avclabs.ai/media-mcp@latest
Or edit ~/.cursor/mcp.json:
{
  "mcpServers": {
    "video-enhancement": {
      "command": "npx",
      "args": ["-y", "@avclabs.ai/media-mcp@latest"],
      "env": {
        "API_KEY": "your-api-key"
      }
    }
  }
}
Verify Installation
After restarting your client, check if the tools are available:
- Ask: "What tools do you have available?"
- You should see: create_task, get_task_status, enhance_video_sync, sam3_predict
Configuration Options
| Variable | Required | Default | Description |
|---|---|---|---|
| API_KEY | Yes | - | API authentication key (shared by video enhancement and SAM3) |
| HTTP_API_BASE_URL | No | https://mcp.avc.ai/enhance | Video enhancement service endpoint |
| SAM3_API_BASE_URL | No | https://mcp.avc.ai/sam | SAM3 service endpoint |
| SAM3_POLL_INTERVAL | No | 2000 | SAM3 polling interval (milliseconds) |
| SAM3_POLL_MAX_ATTEMPTS | No | 25 | SAM3 maximum polling attempts |
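To make the defaults in the table concrete, here is a rough sketch of how such variables could be resolved. resolveConfig and the camelCase field names are hypothetical; the actual server may parse and validate its environment differently.

```javascript
// Defaults copied from the table above.
const DEFAULTS = {
  HTTP_API_BASE_URL: "https://mcp.avc.ai/enhance",
  SAM3_API_BASE_URL: "https://mcp.avc.ai/sam",
  SAM3_POLL_INTERVAL: 2000,
  SAM3_POLL_MAX_ATTEMPTS: 25,
};

// Hypothetical resolver: applies defaults and rejects a missing API key.
function resolveConfig(env) {
  if (!env.API_KEY) {
    throw new Error("API_KEY is required");
  }
  // Environment values are strings; fall back when they are absent or not positive integers.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value ?? "", 10);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    apiKey: env.API_KEY,
    httpApiBaseUrl: env.HTTP_API_BASE_URL || DEFAULTS.HTTP_API_BASE_URL,
    sam3ApiBaseUrl: env.SAM3_API_BASE_URL || DEFAULTS.SAM3_API_BASE_URL,
    sam3PollIntervalMs: toInt(env.SAM3_POLL_INTERVAL, DEFAULTS.SAM3_POLL_INTERVAL),
    sam3PollMaxAttempts: toInt(env.SAM3_POLL_MAX_ATTEMPTS, DEFAULTS.SAM3_POLL_MAX_ATTEMPTS),
  };
}

const resolved = resolveConfig({ API_KEY: "your-api-key", SAM3_POLL_INTERVAL: "1000" });
console.log(resolved.sam3PollIntervalMs, resolved.sam3PollMaxAttempts); // 1000 25
```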
Custom Endpoint
{
  "env": {
    "HTTP_API_BASE_URL": "https://your-endpoint.com",
    "API_KEY": "your-api-key",
    "SAM3_API_BASE_URL": "https://your-sam3-endpoint.com"
  }
}
Or via CLI args:
npx -y @avclabs.ai/media-mcp@latest --base-url https://your-endpoint.com --api-key your-api-key --sam3-base-url https://your-sam3-endpoint.com
Recommended Workflow
This project provides both synchronous and asynchronous modes.
Because MCP Agents typically enforce a ~60-second timeout per tool call, asynchronous mode is strongly recommended for tasks with longer processing times, such as video enhancement.
Asynchronous Mode (Recommended)
Video Enhancement:
1. Call create_task to create a task → immediately get task_id
2. Wait a few seconds, then call get_task_status to query the status
3. If status is processing, continue waiting and repeat step 2
4. If status is completed, the task is done and the result contains video_url
5. If status is failed, the task failed and the result contains error_message
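The steps above amount to a simple polling loop. The sketch below illustrates it under assumptions: waitForTask is a hypothetical helper, getTaskStatus stands in for however your client invokes the get_task_status tool, and the fake status function exists only so the example runs without a server.

```javascript
// Hypothetical polling helper. `getTaskStatus` must return a promise of an object
// shaped like the get_task_status result: { status, video_url?, error_message? }.
async function waitForTask(getTaskStatus, taskId, { intervalMs = 3000, maxAttempts = 100 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await getTaskStatus(taskId);
    if (result.status === "completed") return result.video_url;
    if (result.status === "failed") throw new Error(result.error_message ?? "task failed");
    // status is "processing": wait, then query again (step 2).
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`task ${taskId} still processing after ${maxAttempts} attempts`);
}

// Fake status function that reports "processing" twice, then "completed".
let calls = 0;
const fakeGetTaskStatus = async (taskId) => {
  calls += 1;
  return calls < 3
    ? { task_id: taskId, status: "processing" }
    : { task_id: taskId, status: "completed", video_url: "https://example.com/out.mp4" };
};

const done = waitForTask(fakeGetTaskStatus, "xxx", { intervalMs: 10 });
done.then((url) => console.log(url));
```

Each individual status query returns quickly, so no single tool call comes near the Agent's timeout, however long the whole task takes.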
Synchronous Mode (Simple Scenarios)
Video Enhancement:
- Call enhance_video_sync → the server polls internally
- Defaults to a maximum wait of 50 seconds
- If completed within 50 seconds, returns the result directly
- If not completed within 50 seconds, returns task_id and instructions for the Agent to switch to get_task_status
Image Segmentation (SAM3):
- Call sam3_predict → the server polls internally
- Defaults to a maximum wait of 50 seconds (25 attempts × 2-second polling interval)
- If completed within 50 seconds, returns the segmentation result directly
- If not completed within 50 seconds, returns a truncation notice indicating the task is still processing
Usage Examples
Once configured, ask your AI agent naturally:
"Enhance this video to 1080p: https://example.com/video.mp4"
"Improve the quality of /Users/me/Desktop/video.mp4 to 2k"
"Analyze this image and find all objects: C:\Users\xxx\photo.png"
"Use SAM3 to segment this image, prompt: 'find all cars'"
The agent will automatically choose sync or async tools based on task complexity.
Provided Tools
Video Enhancement
create_task
Create an asynchronous video enhancement task.
Recommended for most use cases. Ideal for longer videos (over 10 seconds) to avoid timeouts and blocking the connection.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_source | string | Yes | - | Video URL or local file path (URL must be publicly accessible; links requiring login or signatures are not supported) |
| type | string | No | url | url or local |
| resolution | string | No | 720p | 480p, 540p, 720p, 1080p, 2k |
Returns:
{
  "success": true,
  "task_id": "xxx",
  "status": "processing"
}
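As an illustration of the parameters above, this sketch assembles an argument object for create_task. buildCreateTaskArgs is a hypothetical helper, and inferring type from an http(s) prefix is an assumption made for the example, not documented behavior of the tool.

```javascript
// Resolutions listed in the parameter table.
const RESOLUTIONS = ["480p", "540p", "720p", "1080p", "2k"];

// Hypothetical helper: validates the resolution and guesses `type` from the source string.
function buildCreateTaskArgs(videoSource, resolution = "720p") {
  if (typeof videoSource !== "string" || videoSource === "") {
    throw new Error("video_source is required");
  }
  if (!RESOLUTIONS.includes(resolution)) {
    throw new Error(`unsupported resolution: ${resolution}`);
  }
  // Assumption: anything that does not start with http:// or https:// is a local path.
  const type = /^https?:\/\//i.test(videoSource) ? "url" : "local";
  return { video_source: videoSource, type, resolution };
}

const remoteArgs = buildCreateTaskArgs("https://example.com/video.mp4", "1080p");
const localArgs = buildCreateTaskArgs("/Users/me/Desktop/video.mp4");
console.log(remoteArgs, localArgs);
```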
get_task_status
Query video enhancement task status.
The returned status field can be: processing, completed, or failed. If status is processing, you need to wait a few seconds and call this tool again.
| Parameter | Type | Required |
|---|---|---|
| task_id | string | Yes |
Returns:
{
  "success": true,
  "task_id": "xxx",
  "status": "completed",
  "progress": 100,
  "video_url": "https://..."
}
The message field (for example, "Task is still processing, please check again later") only appears when status is processing, prompting the Agent to continue waiting. It is omitted from a completed response like the one above.
enhance_video_sync
Synchronously enhance video (blocks until completion).
Best for short videos (estimated processing time < 1 minute). If the task is not completed within 50 seconds, the tool returns early with a task_id, and you need to use get_task_status to continue querying.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_source | string | Yes | - | Video URL or local file path |
| type | string | No | url | url or local |
| resolution | string | No | 720p | Target resolution |
| poll_interval | number | No | 5 | Poll interval (seconds) |
| timeout | number | No | 50 | Sync wait timeout (seconds), returns early when exceeded |
Truncated return example (not completed within 50s):
{
  "success": true,
  "status": "processing",
  "task_id": "xxx",
  "message": "Task is still processing (waited 50 seconds). Please use get_task_status to continue polling.",
  "note": "The synchronous wait for this long-running task has been truncated. Switch to get_task_status polling."
}
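A caller therefore has to distinguish a finished result from a truncated one. The sketch below shows one way to branch on the returned object. nextActionForSyncResult and the action names are hypothetical, and it assumes a completed sync result carries video_url at the top level, as the get_task_status example does.

```javascript
// Hypothetical helper: decide what to do with the value returned by enhance_video_sync.
function nextActionForSyncResult(result) {
  switch (result.status) {
    case "completed":
      // Assumption: the finished result exposes video_url like get_task_status does.
      return { action: "done", videoUrl: result.video_url };
    case "processing":
      // Truncated: the sync wait ran out, so continue with get_task_status.
      return { action: "poll", tool: "get_task_status", args: { task_id: result.task_id } };
    case "failed":
      return { action: "error", message: result.error_message ?? "task failed" };
    default:
      return { action: "error", message: `unexpected status: ${result.status}` };
  }
}

// The truncated example above, reduced to the fields the helper reads.
const truncated = { success: true, status: "processing", task_id: "xxx" };
const next = nextActionForSyncResult(truncated);
console.log(next);
```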
Image Segmentation (SAM3)
sam3_predict
Analyze an image using the SAM3 segmentation API to generate inference results (masks, boxes, scores).
Parameters:
Image input (provide exactly one):
- imagePath (string): Absolute path of a local image file. Supports common formats (PNG, JPG, JPEG).
  - Example: "C:\\Users\\xxx\\photo.png", "/home/user/images/cat.jpg"
  - Use when: The user explicitly provides a local file path
- imageUrl (string): Publicly accessible URL of the image.
  - Example: "https://example.com/photo.jpg"
  - Use when: The image is already online and the user provides a link
  - Note: The URL must be publicly accessible. Links requiring login or signatures are not supported
- imageBase64 (string): Base64-encoded image data.
  - Example: "iVBORw0KGgoAAAANSUhEUgAA..."
  - Use when: The user drags or uploads an image attachment, and the Agent encodes it as base64
  - Note: Large images will produce very large base64 strings, which may slow transmission
Other parameters:
- prompt (string, required): English text prompt specifying the target object to segment. Since the SAM3 model only accepts English prompts, provide an English description. If the user provides Chinese or other non-English text, the Agent will automatically translate it before calling the tool.
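The "exactly one image input" rule can be checked before the tool is called. This is a minimal sketch under that reading of the parameters. buildSam3Args is a hypothetical helper, and the real tool may report violations differently.

```javascript
// Hypothetical helper: enforces "exactly one of imagePath, imageUrl, imageBase64" plus a prompt.
function buildSam3Args({ imagePath, imageUrl, imageBase64, prompt }) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  const inputs = { imagePath, imageUrl, imageBase64 };
  const provided = Object.entries(inputs).filter(
    ([, value]) => value !== undefined && value !== ""
  );
  if (provided.length !== 1) {
    throw new Error(`expected exactly one image input, got ${provided.length}`);
  }
  const [[key, value]] = provided;
  return { [key]: value, prompt };
}

const sam3Args = buildSam3Args({
  imageUrl: "https://example.com/photo.jpg",
  prompt: "find all cars",
});
console.log(sam3Args);
```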
Normal completion return:
After inference completes, returns a JSON string containing three fields:
- masks: Array of 2D binary masks (values 0 or 1), one per detected object instance. Each mask has the same dimensions as the input image and marks the pixel-level location of the detected object. The i-th mask corresponds to the i-th detected object instance.
- boxes: 2D array. Each element is a bounding box in [x1, y1, x2, y2] format, representing the rectangular region of the detected object. x1, y1 are the top-left coordinates; x2, y2 are the bottom-right coordinates.
  Coordinate system: The top-left corner of the image is the origin (0, 0). The x-axis increases to the right, and the y-axis increases downward, in pixels. For example, [120, 80, 300, 450] means the region starts 120px from the left edge and 80px from the top edge, extending to 300px from the left and 450px from the top. Width = x2 - x1 = 180px, Height = y2 - y1 = 370px.
- scores: 1D array. Each element is a confidence score for the corresponding detection result, ranging from 0 to 1. Higher scores indicate greater model confidence.
Example result JSON:
{
  "masks": [
    [[0, 0, 1, ...], [0, 1, 1, ...], ...],
    [[0, 0, 0, ...], [0, 0, 1, ...], ...]
  ],
  "boxes": [
    [120, 80, 300, 450],
    [400, 200, 600, 500]
  ],
  "scores": [0.95, 0.87]
}
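Because the three arrays are index-aligned, they are easiest to consume one detection at a time. The sketch below derives box width and height (x2 - x1, y2 - y1) and the mask pixel count per instance, then drops low-confidence detections. summarizeDetections, the 0.5 threshold, and the tiny sample data are made up for illustration; a real result arrives as a JSON string and would first go through JSON.parse.

```javascript
// Hypothetical post-processing helper for a parsed sam3_predict result.
function summarizeDetections(result, minScore = 0.5) {
  return result.boxes
    .map(([x1, y1, x2, y2], i) => ({
      index: i,
      score: result.scores[i],
      width: x2 - x1,
      height: y2 - y1,
      // Number of pixels set to 1 in this instance's mask.
      maskArea: result.masks[i].reduce(
        (total, row) => total + row.reduce((rowSum, pixel) => rowSum + pixel, 0),
        0
      ),
    }))
    .filter((detection) => detection.score >= minScore);
}

// Made-up result for a 4x3 pixel image with two detections.
const sample = {
  masks: [
    [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
  ],
  boxes: [[1, 0, 3, 2], [3, 0, 4, 1]],
  scores: [0.95, 0.4],
};
const detections = summarizeDetections(sample);
console.log(detections);
```

With the sample data, only the first detection passes the threshold, with width 2, height 2, and a mask area of 4 pixels.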
Truncated return example (not completed within 50s):
{
  "success": true,
  "status": "processing",
  "task_id": "xxx",
  "message": "Task is still processing (waited about 50 seconds). Please retry later or record this task_id for manual follow-up.",
  "note": "The synchronous wait for this long-running task has been truncated."
}
FAQ
Agent reports timeout when calling tools?
This is the primary issue this project addresses. MCP Agents (such as Claude, Cursor) typically enforce a ~60-second timeout per tool call. If task processing exceeds this limit, the Agent will error and disconnect.
Solutions:
- Prefer asynchronous tools: For video enhancement and other time-consuming tasks, always use create_task + get_task_status. These tools return instantly on each call and will not trigger timeouts.
- Sync tool truncation mechanism: enhance_video_sync has an internal 50-second truncation limit. If the task is not completed within 50 seconds, the tool proactively returns a task_id and instructs the Agent to use get_task_status to follow up.
- SAM3 truncation mechanism: sam3_predict defaults to 25 polling attempts (~50 seconds). If the task is not completed, it returns a truncation notice indicating the task is still processing.
- Adjust SAM3 polling parameters (advanced): If you are confident that SAM3 tasks are usually fast (e.g., under 10 seconds), you can increase polling attempts via environment variable: SAM3_POLL_MAX_ATTEMPTS=60. But ensure the total wait time does not exceed your Agent's timeout limit.
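The total SAM3 wait can be estimated as polling interval × maximum attempts, which is how the 50-second default (2000 ms × 25) comes about. The sketch below applies that arithmetic; both function names are hypothetical, and the estimate ignores the time each status request itself takes.

```javascript
// Estimated upper bound on the SAM3 wait, in seconds (request latency not included).
function sam3MaxWaitSeconds(pollIntervalMs, maxAttempts) {
  return (pollIntervalMs * maxAttempts) / 1000;
}

// Largest attempt count that keeps the estimated wait within a budget.
function maxAttemptsWithin(budgetSeconds, pollIntervalMs) {
  return Math.floor((budgetSeconds * 1000) / pollIntervalMs);
}

console.log(sam3MaxWaitSeconds(2000, 25)); // 50  (defaults)
console.log(sam3MaxWaitSeconds(2000, 60)); // 120 (60 attempts at the default interval)
console.log(maxAttemptsWithin(50, 500));   // 100 (attempts that fit in 50 s at 500 ms)
```

By this estimate, SAM3_POLL_MAX_ATTEMPTS=60 at the default 2000 ms interval waits about 120 seconds, which is above a ~60-second Agent timeout, so raising the attempt count usually goes together with a shorter SAM3_POLL_INTERVAL.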
Drag-and-drop attachment says file not found?
This is a known limitation of stdio MCP. When dragging or uploading an attachment through the Agent interface, the file path is usually not automatically passed to the MCP Server.
Solutions:
- Provide the path as well (recommended): After dragging the image, add the local absolute path in your message: "Please analyze this image D:\\photos\\cat.jpg and find the cat"
- Wait for auto-encoding: Claude may automatically encode the image as base64. If successful, no extra action is needed.
- Reply to path inquiry: If Claude asks for the image path, simply reply with the local absolute path.
Is there a priority among the three input methods?
There is no strict priority. Claude will automatically choose the most appropriate method based on conversation context:
- You provided a local path → uses imagePath
- You provided a web link → uses imageUrl
- You dragged an attachment without a path → tries imageBase64
What image formats are supported?
Common formats: PNG, JPG, JPEG, BMP, WebP, etc. PNG or JPG is recommended.
What if URL image download fails?
Ensure the URL is publicly accessible, requiring no login, cookies, or signatures. If the image is on a service requiring authentication (e.g., private S3 Bucket, login-required image host), download it locally first and use imagePath.
What if the base64 image is too large?
If the image is very large (e.g., 4K resolution), the base64-encoded data will be very large and may slow transmission. Suggestions:
- Use imagePath instead
- Or compress the image before encoding
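For a sense of scale, base64 turns every 3 bytes into 4 characters, so the encoded string is roughly a third larger than the file. A small sketch of that arithmetic, checked against Node's built-in Buffer (base64Length is a hypothetical helper name):

```javascript
// Length of the padded base64 string for a given number of raw bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Compare the formula with what Node actually produces for 1000 zero bytes.
const raw = Buffer.alloc(1000);
const encodedLength = raw.toString("base64").length;
console.log(base64Length(1000), encodedLength); // 1336 1336

// A 12 MB image would need about 16 million base64 characters.
console.log(base64Length(12 * 1000 * 1000)); // 16000000
```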
File Upload Notes
When type is "local":
- File is read locally by the MCP Server
- Uploaded directly to TOS object storage via pre-signed URL
- Max file size: 100MB
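A size check before upload avoids a failed transfer. The sketch below is illustrative only: checkUploadSize is a hypothetical helper, and it assumes "100MB" means 100 × 1024 × 1024 bytes, which this README does not specify.

```javascript
// Assumption: 1 MB = 1024 * 1024 bytes; the server's exact limit may be defined differently.
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

// Hypothetical pre-upload check on a file size in bytes.
function checkUploadSize(sizeInBytes) {
  if (!Number.isInteger(sizeInBytes) || sizeInBytes < 0) {
    throw new TypeError("sizeInBytes must be a non-negative integer");
  }
  return {
    ok: sizeInBytes <= MAX_UPLOAD_BYTES,
    sizeInBytes,
    limitBytes: MAX_UPLOAD_BYTES,
  };
}

console.log(checkUploadSize(5 * 1024 * 1024).ok);   // true
console.log(checkUploadSize(150 * 1024 * 1024).ok); // false
```

In practice the size would come from something like fs.statSync(path).size before the local file is handed to create_task.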
Troubleshooting
"command not found: npx"
Install Node.js >= 18: https://nodejs.org/
"Error: --api-key argument or API_KEY environment variable is required"
Your API Key is missing. Double-check the env.API_KEY in your config.
MCP Server shows red/error in client
Check logs:
- Claude Desktop macOS: ~/Library/Logs/Claude/mcp*.log
- Claude Desktop Windows: %APPDATA%\Claude\logs\mcp*.log
- Cursor: Output panel > MCP
"TOS upload failed"
Usually a signature mismatch. Ensure your HTTP_API_BASE_URL and API_KEY are correct and active.
Global Install (Alternative)
If you prefer not to use npx every time:
npm install -g @avclabs.ai/media-mcp
Then use "command": "media-mcp" with "args": ["--api-key", "your-api-key"] in your config.
License
MIT License - See LICENSE file for details