DINO-X

chính thức

Advanced computer vision and object detection MCP server powered by Dino-X, enabling AI agents to analyze images, detect objects, identify keypoints, and perform visual understanding tasks.

DINO-X MCP Server

License npm version npm downloads PRs Welcome MCP Badge GitHub stars

English | 中文

DINO-X Official MCP Server — powered by the DINO-X and Grounding DINO models — brings fine-grained object detection and image understanding to your multimodal applications.

Your browser does not support the video tag.

Why DINO-X MCP?

With DINO-X MCP, you can:

  • Fine-Grained Understanding: Full image detection, object detection, and region-level descriptions.

  • Structured Outputs: Get object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.

  • Composable: Works seamlessly with other MCP servers to build end-to-end visual agents or automation pipelines.

Transport Modes

DINO-X MCP supports two transport modes:

FeatureSTDIO (default)Streamable HTTP
RuntimeLocalLocal or Cloud
TransportStandard I/OHTTP (streaming responses)
Input sourcefile:// and https://https:// only
VisualizationSupported (saves annotated images locally)Not supported (for now)

Quick Start

1. Prepare an MCP client

Any MCP-compatible client works, e.g.:

2. Get your API key

Apply on the DINO-X platform: Request API Key (new users get free quota).

3. Configure MCP

Option A: Official Hosted Streamable HTTP (Recommended)

Add to your MCP client config and replace with your API key:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
    }
  }
}

Option B: Use the NPM package locally (STDIO)

Install Node.js first

  • Download the installer from nodejs.org

  • Or use command:

# macOS / Linux
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# or
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

# load nvm into current shell (choose the one you use)
source ~/.bashrc || true
source ~/.zshrc  || true

# install and use LTS Node.js
nvm install --lts
nvm use --lts

# Windows (one of the following)
winget install OpenJS.NodeJS.LTS
# or with Chocolatey (in admin PowerShell)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

Note: Replace your-api-key-here with your real key.

Option C: Run from source locally

Make sure Node.js is installed (see Option B), then:

# clone
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP

# install deps
npm install

# build
npm run build

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

CLI Flags & Environment Variables

  • Common flags

    • --http: start in Streamable HTTP mode (otherwise STDIO by default)
    • --stdio: force STDIO mode
    • --dinox-api-key=...: set API key
    • --enable-client-key: allow API key via URL ?key= (Streamable HTTP only)
    • --port=8080: HTTP port (default 3020)
  • Environment variables

    • DINOX_API_KEY (required/conditionally required): DINO-X platform API key
    • IMAGE_STORAGE_DIRECTORY (optional, STDIO): directory to save annotated images
    • AUTH_TOKEN (optional, HTTP): if set, client must send Authorization: Bearer <token>

    Examples:

# STDIO (local)
node build/index.js --dinox-api-key=your-api-key

# Streamable HTTP (server provides a shared API key)
node build/index.js --http --dinox-api-key=your-api-key

# Streamable HTTP (custom port)
node build/index.js --http --dinox-api-key=your-api-key --port=8080

# Streamable HTTP (require client-provided API key via URL)
node build/index.js --http --enable-client-key

Client config when using ?key=:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "http://localhost:3020/mcp?key=your-api-key"
    }
  }
}

Using AUTH_TOKEN with a gateway that injects Authorization: Bearer <token>:

AUTH_TOKEN=my-token node build/index.js --http --enable-client-key

Client example with supergateway:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "supergateway",
        "--streamableHttp",
        "http://localhost:3020/mcp?key=your-api-key",
        "--oauth2Bearer",
        "my-token"
      ]
    }
  }
}

Tools

CapabilityTool IDTransportInputOutput
Full-scene object detectiondetect-all-objectsSTDIO / HTTPImage URLCategory + bbox + (optional) captions
Text-prompted object detectiondetect-objects-by-textSTDIO / HTTPImage URL + English nouns (dot-separated for multiple, e.g., person.car)Target object bbox + (optional) captions
Human pose estimationdetect-human-pose-keypointsSTDIO / HTTPImage URL17 keypoints + bbox + (optional) captions
Visualizationvisualize-detection-resultSTDIO onlyImage URL + detection results arrayLocal path to annotated image

🎬 Use Cases

🎯 Scenario📝 Input✨ Output
Detection & Localization💬 Prompt:
Detect and visualize the
fire areas in the forest

🖼️ Input Image:
1-1
1-2
Object Counting💬 Prompt:
Please analyze this
warehouse image, detect
all the cardboard boxes,
count the total number

🖼️ Input Image:
2-1
2-2
Feature Detection💬 Prompt:
Find all red cars
in the image

🖼️ Input Image:
4-1
4-2
Attribute Reasoning💬 Prompt:
Find the tallest person
in the image, describe
their clothing

🖼️ Input Image:
5-1
5-2
Full Scene Detection💬 Prompt:
Find the fruit with
the highest vitamin C
content in the image

🖼️ Input Image:
6-1
6-3

Answer: Kiwi fruit (93mg/100g)
Pose Analysis💬 Prompt:
Please analyze what
yoga pose this is

🖼️ Input Image:
3-1
3-3

FAQ

  • Supported image sources?
    • STDIO: file:// and https://
    • Streamable HTTP: https:// only
  • Supported image formats?
    • jpg, jpeg, webp, png

Development & Debugging

Use watch mode to auto-rebuild during development:

npm run watch

Use MCP Inspector for debugging:

npm run inspector

License

Apache License 2.0

Máy chủ liên quan

NotebookLM Web Importer

Nhập trang web và video YouTube vào NotebookLM chỉ với một cú nhấp. Được tin dùng bởi hơn 200.000 người dùng.

Cài đặt tiện ích Chrome