DINO-X
Advanced computer vision and object detection MCP server powered by DINO-X, enabling AI agents to analyze images, detect objects, identify keypoints, and perform visual understanding tasks.
DINO-X MCP Server
DINO-X Official MCP Server, powered by the DINO-X and Grounding DINO models, brings fine-grained object detection and image understanding to your multimodal applications.
Why DINO-X MCP?
With DINO-X MCP, you can:
- Fine-Grained Understanding: Full image detection, object detection, and region-level descriptions.
- Structured Outputs: Get object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.
- Composable: Works seamlessly with other MCP servers to build end-to-end visual agents or automation pipelines.
Transport Modes
DINO-X MCP supports two transport modes:
| Feature | STDIO (default) | Streamable HTTP |
|---|---|---|
| Runtime | Local | Local or Cloud |
| Transport | Standard I/O | HTTP (streaming responses) |
| Input source | file:// and https:// | https:// only |
| Visualization | Supported (saves annotated images locally) | Not supported (for now) |
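The input-source row above can be checked on the client side before an image URL is sent to the server. The following is a minimal sketch, not part of the server itself; the function name `is_supported_source` and the transport labels `"stdio"` and `"http"` are illustrative assumptions.

```python
from urllib.parse import urlparse

# URL schemes accepted per transport mode, taken from the table above.
# The keys "stdio" and "http" are labels chosen for this sketch.
ALLOWED_SCHEMES = {
    "stdio": {"file", "https"},
    "http": {"https"},
}


def is_supported_source(image_url: str, transport: str = "stdio") -> bool:
    """Return True if the URL scheme is accepted by the given transport mode."""
    scheme = urlparse(image_url).scheme.lower()
    return scheme in ALLOWED_SCHEMES[transport]


if __name__ == "__main__":
    print(is_supported_source("file:///tmp/cat.png", "stdio"))
    print(is_supported_source("file:///tmp/cat.png", "http"))
```

A local file therefore needs the STDIO transport, or has to be uploaded somewhere reachable over https:// before it is used with Streamable HTTP.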
Quick Start
1. Prepare an MCP client
Any MCP-compatible client works.
2. Get your API key
Apply on the DINO-X platform: Request API Key (new users get free quota).
3. Configure MCP
Option A: Official Hosted Streamable HTTP (Recommended)
Add the following to your MCP client config, replacing your-api-key with your own API key:
{
"mcpServers": {
"dinox-mcp": {
"url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
}
}
}
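If the config is generated by a script, the key should be URL-encoded before it is appended to the endpoint. A minimal sketch, assuming the config shape shown above; the helper name `hosted_config` is hypothetical.

```python
import json
from urllib.parse import urlencode

# Hosted endpoint, copied from the config above.
HOSTED_ENDPOINT = "https://mcp.deepdataspace.com/mcp"


def hosted_config(api_key: str, server_name: str = "dinox-mcp") -> dict:
    """Build the mcpServers entry for the hosted Streamable HTTP server."""
    query = urlencode({"key": api_key})  # escapes characters unsafe in a query string
    return {"mcpServers": {server_name: {"url": f"{HOSTED_ENDPOINT}?{query}"}}}


if __name__ == "__main__":
    print(json.dumps(hosted_config("your-api-key"), indent=2))
```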
Option B: Use the NPM package locally (STDIO)
Install Node.js first
- Download the installer from nodejs.org
- Or install from the command line:
# macOS / Linux
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# or
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# load nvm into current shell (choose the one you use)
source ~/.bashrc || true
source ~/.zshrc || true
# install and use LTS Node.js
nvm install --lts
nvm use --lts
# Windows (one of the following)
winget install OpenJS.NodeJS.LTS
# or with Chocolatey (in admin PowerShell)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y
Configure your MCP client:
{
"mcpServers": {
"dinox-mcp": {
"command": "npx",
"args": ["-y", "@deepdataspace/dinox-mcp"],
"env": {
"DINOX_API_KEY": "your-api-key-here",
"IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
}
}
}
}
Note: Replace your-api-key-here with your real key.
Option C: Run from source locally
Make sure Node.js is installed (see Option B), then:
# clone
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP
# install deps
npm install
# build
npm run build
Configure your MCP client:
{
"mcpServers": {
"dinox-mcp": {
"command": "node",
"args": ["/path/to/DINO-X-MCP/build/index.js"],
"env": {
"DINOX_API_KEY": "your-api-key-here",
"IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
}
}
}
}
CLI Flags & Environment Variables
- Common flags
  - --http: start in Streamable HTTP mode (otherwise STDIO by default)
  - --stdio: force STDIO mode
  - --dinox-api-key=...: set API key
  - --enable-client-key: allow API key via URL ?key= (Streamable HTTP only)
  - --port=8080: HTTP port (default 3020)
- Environment variables
  - DINOX_API_KEY (required/conditionally required): DINO-X platform API key
  - IMAGE_STORAGE_DIRECTORY (optional, STDIO): directory to save annotated images
  - AUTH_TOKEN (optional, HTTP): if set, client must send Authorization: Bearer <token>
Examples:
# STDIO (local)
node build/index.js --dinox-api-key=your-api-key
# Streamable HTTP (server provides a shared API key)
node build/index.js --http --dinox-api-key=your-api-key
# Streamable HTTP (custom port)
node build/index.js --http --dinox-api-key=your-api-key --port=8080
# Streamable HTTP (require client-provided API key via URL)
node build/index.js --http --enable-client-key
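The way these flags and variables combine can be modeled in a few lines. The sketch below is a Python model of the behavior described in this section, not the server's actual implementation. It assumes that --stdio wins over --http, that the flag takes precedence over DINOX_API_KEY, and that "conditionally required" means the key may be omitted only in HTTP mode with --enable-client-key. The function name `resolve_settings` is hypothetical.

```python
import argparse
import os
from typing import Optional, Sequence


def resolve_settings(argv: Sequence[str], env: Optional[dict] = None) -> dict:
    """Model how the documented flags and environment variables might combine."""
    env = dict(os.environ) if env is None else env
    parser = argparse.ArgumentParser(prog="dinox-mcp-sketch")
    parser.add_argument("--http", action="store_true")
    parser.add_argument("--stdio", action="store_true")
    parser.add_argument("--dinox-api-key", default=None)
    parser.add_argument("--enable-client-key", action="store_true")
    parser.add_argument("--port", type=int, default=3020)  # documented default
    args = parser.parse_args(argv)

    # Assumption: --stdio forces STDIO even if --http is also given.
    mode = "http" if args.http and not args.stdio else "stdio"
    # Assumption: the command-line flag overrides the environment variable.
    api_key = args.dinox_api_key or env.get("DINOX_API_KEY")
    # Assumption: the key is optional only when clients supply it via ?key=.
    key_optional = mode == "http" and args.enable_client_key
    if api_key is None and not key_optional:
        raise ValueError("an API key is required for this configuration")
    return {
        "mode": mode,
        "port": args.port,
        "api_key": api_key,
        "client_key": args.enable_client_key,
    }


if __name__ == "__main__":
    print(resolve_settings(["--http", "--dinox-api-key=your-api-key", "--port=8080"], env={}))
```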
Client config when using ?key=:
{
"mcpServers": {
"dinox-mcp": {
"url": "http://localhost:3020/mcp?key=your-api-key"
}
}
}
Using AUTH_TOKEN with a gateway that injects Authorization: Bearer <token>:
AUTH_TOKEN=my-token node build/index.js --http --enable-client-key
Client example with supergateway:
{
"mcpServers": {
"dinox-mcp": {
"command": "npx",
"args": [
"-y",
"supergateway",
"--streamableHttp",
"http://localhost:3020/mcp?key=your-api-key",
"--oauth2Bearer",
"my-token"
]
}
}
}
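For clients that talk to the HTTP endpoint directly instead of going through a gateway, the same two credentials appear as a ?key= query parameter and an Authorization header. The sketch below only constructs the request and does not send it. It assumes a server running locally on the default port, and the JSON-RPC `initialize` body follows the MCP specification as I understand it (protocol version 2025-03-26). The function and client names are hypothetical.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def build_initialize_request(
    base_url: str,
    api_key: Optional[str] = None,
    auth_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request over Streamable HTTP."""
    url = base_url if api_key is None else f"{base_url}?{urlencode({'key': api_key})}"
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON body or an event stream.
        "Accept": "application/json, text/event-stream",
    }
    if auth_token is not None:
        # Needed only when the server was started with AUTH_TOKEN set.
        headers["Authorization"] = f"Bearer {auth_token}"
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_initialize_request(
        "http://localhost:3020/mcp", api_key="your-api-key", auth_token="my-token"
    )
    print(req.full_url)
    print(req.get_header("Authorization"))
```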
Tools
| Capability | Tool ID | Transport | Input | Output |
|---|---|---|---|---|
| Full-scene object detection | detect-all-objects | STDIO / HTTP | Image URL | Category + bbox + (optional) captions |
| Text-prompted object detection | detect-objects-by-text | STDIO / HTTP | Image URL + English nouns (dot-separated for multiple, e.g., person.car) | Target object bbox + (optional) captions |
| Human pose estimation | detect-human-pose-keypoints | STDIO / HTTP | Image URL | 17 keypoints + bbox + (optional) captions |
| Visualization | visualize-detection-result | STDIO only | Image URL + detection results array | Local path to annotated image |
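The dot-separated prompt format for detect-objects-by-text is easy to get wrong when the nouns come from user input. The sketch below joins a list of nouns into that format and wraps it in an MCP `tools/call` message. The tool ID comes from the table above, but the argument names `image_url` and `text_prompt` are placeholders chosen for this example; the real names should be read from the tool's input schema (for example via `tools/list`).

```python
import json
from typing import Iterable


def build_text_prompt(nouns: Iterable[str]) -> str:
    """Join nouns into the dot-separated form, e.g. person.car."""
    cleaned = [n.strip() for n in nouns if n.strip()]
    if not cleaned:
        raise ValueError("at least one noun is required")
    for noun in cleaned:
        if "." in noun:
            raise ValueError(f"noun must not contain the separator: {noun!r}")
    return ".".join(cleaned)


def tool_call(tool_id: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap a tool invocation in a JSON-RPC tools/call message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_id, "arguments": arguments},
    }


if __name__ == "__main__":
    message = tool_call(
        "detect-objects-by-text",
        {
            # Hypothetical argument names; check the tool's input schema.
            "image_url": "https://example.com/street.jpg",
            "text_prompt": build_text_prompt(["person", "car"]),
        },
    )
    print(json.dumps(message, indent=2))
```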
Use Cases
| Scenario | Input | Output |
|---|---|---|
| Detection & Localization | Prompt (with an input image): Detect and visualize the fire areas in the forest | |
| Object Counting | Prompt (with an input image): Please analyze this warehouse image, detect all the cardboard boxes, count the total number | |
| Feature Detection | Prompt (with an input image): Find all red cars in the image | |
| Attribute Reasoning | Prompt (with an input image): Find the tallest person in the image, describe their clothing | |
| Full Scene Detection | Prompt (with an input image): Find the fruit with the highest vitamin C content in the image | Answer: Kiwi fruit (93mg/100g) |
| Pose Analysis | Prompt (with an input image): Please analyze what yoga pose this is | |
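Scenarios such as object counting and finding the tallest person reduce to simple post-processing once the detections are available as structured data. The sketch below assumes each detection is a dictionary with a `category` string and a `bbox` in `[x_min, y_min, x_max, y_max]` pixel order. That shape and the sample values are illustrative assumptions, not the server's documented output format.

```python
from collections import Counter
from typing import Optional

# Hypothetical detections; the real response structure may differ.
detections = [
    {"category": "cardboard box", "bbox": [10, 20, 110, 140]},
    {"category": "cardboard box", "bbox": [130, 25, 220, 150]},
    {"category": "person", "bbox": [300, 40, 360, 260]},
    {"category": "person", "bbox": [400, 10, 470, 290]},
]


def count_by_category(dets: list) -> Counter:
    """Count detections per category label."""
    return Counter(d["category"] for d in dets)


def tallest(dets: list, category: str) -> Optional[dict]:
    """Return the detection of the given category with the largest bbox height."""
    candidates = [d for d in dets if d["category"] == category]
    return max(candidates, key=lambda d: d["bbox"][3] - d["bbox"][1], default=None)


if __name__ == "__main__":
    print(count_by_category(detections)["cardboard box"])
    print(tallest(detections, "person"))
```

Bounding-box height in pixels is only a proxy for real-world height, since it ignores perspective and distance from the camera.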
FAQ
- Supported image sources?
  - STDIO: file:// and https://
  - Streamable HTTP: https:// only
- Supported image formats?
  - jpg, jpeg, webp, png
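When working with local files in STDIO mode, a path has to be turned into a file:// URI, and it can help to reject unsupported formats early. The sketch below judges the format by file extension only, which is an assumption; the server may inspect the image content instead. Both helper names are hypothetical.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# Extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".webp", ".png"}


def has_supported_format(path_or_url: str) -> bool:
    """Check the extension of a path or URL, ignoring any query string."""
    path = urlparse(path_or_url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


def local_path_to_file_uri(path: str) -> str:
    """Convert a local image path to an absolute file:// URI for STDIO mode."""
    if not has_supported_format(path):
        raise ValueError(f"unsupported image format: {path}")
    return Path(path).resolve().as_uri()


if __name__ == "__main__":
    print(has_supported_format("https://example.com/a/cat.JPG?size=large"))
    print(local_path_to_file_uri("photos/cat.png"))
```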
Development & Debugging
Use watch mode to auto-rebuild during development:
npm run watch
Use MCP Inspector for debugging:
npm run inspector
License
Apache License 2.0