DINO-X
Advanced computer vision and object detection MCP server powered by DINO-X, enabling AI agents to analyze images, detect objects, identify keypoints, and perform visual understanding tasks.
DINO-X MCP
Enables large language models to perform fine-grained object detection and image understanding, powered by the DINO-X and Grounding DINO 1.6 APIs.
Why DINO-X MCP?
Although multimodal models can understand and describe images, they often lack precise localization and high-quality structured outputs for visual content.
With DINO-X MCP, you can:
- Achieve fine-grained image understanding: both full-scene recognition and targeted detection based on natural language.
- Accurately obtain object count, position, and attributes, enabling tasks such as visual question answering.
- Integrate with other MCP servers to build multi-step visual workflows.
- Build natural language-driven visual agents for real-world automation scenarios.
Use Cases
Scenario | Input | Output |
---|---|---|
Detection & Localization | Prompt: Detect and visualize the fire areas in the forest. Input Image: (image omitted) | |
Object Counting | Prompt: Please analyze this warehouse image, detect all the cardboard boxes, count the total number. Input Image: (image omitted) | |
Feature Detection | Prompt: Find all red cars in the image. Input Image: (image omitted) | |
Attribute Reasoning | Prompt: Find the tallest person in the image, describe their clothing. Input Image: (image omitted) | |
Full Scene Detection | Prompt: Find the fruit with the highest vitamin C content in the image. Input Image: (image omitted) | |
Pose Analysis | Prompt: Please analyze what yoga pose this is. Input Image: (image omitted) | |
Quick Start
1. Prerequisites
You can install Node.js using one of the following methods:
Option A: Command Line
# For macOS or Linux
# 1. Install nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# OR
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# 2. Add these lines to your profile (~/.bash_profile, ~/.zshrc, ~/.profile, or ~/.bashrc)
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"
# 3. Activate nvm in current shell
source ~/.bashrc
# Or
source ~/.zshrc
# 4. Verify nvm installation
command -v nvm
# 5. Install and use LTS version of Node.js
nvm install --lts
nvm use --lts
# For Windows
winget install OpenJS.NodeJS.LTS
# Or using PowerShell (Administrator)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y
Option B: Manual Installation
Download the installer from nodejs.org
You will also need an AI assistant or application that supports the MCP client protocol.
2. Configure MCP Server
You can use DINO-X MCP server in two ways:
Option A: Using NPM Package
Add the following configuration in your MCP client:
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
Option B: Using Local Project
First, clone and build the project:
# Clone the project
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP
# Install dependencies
pnpm install
# Build the project
pnpm run build
Then configure your MCP client:
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
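The two client configurations above differ only in `command` and `args`. As a rough illustration, the sketch below generates either variant from one function. `buildDinoxConfig` and `McpServerEntry` are hypothetical names introduced here, not part of DINO-X MCP; only the JSON keys and values come from the configurations shown above.

```typescript
// Shape of one entry under "mcpServers", mirroring the JSON shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical helper (not part of DINO-X MCP): builds the client config
// for the NPM package, or for a local build when localBuildPath is given.
function buildDinoxConfig(
  apiKey: string,
  imageDir: string,
  localBuildPath?: string,
): { mcpServers: Record<string, McpServerEntry> } {
  const env = { DINOX_API_KEY: apiKey, IMAGE_STORAGE_DIRECTORY: imageDir };
  const entry: McpServerEntry = localBuildPath
    ? { command: "node", args: [localBuildPath], env }
    : { command: "npx", args: ["-y", "@deepdataspace/dinox-mcp"], env };
  return { mcpServers: { "dinox-mcp": entry } };
}

// Print the NPM-package variant, ready to paste into an MCP client config.
console.log(
  JSON.stringify(
    buildDinoxConfig("your-api-key-here", "/path/to/your/image/directory"),
    null,
    2,
  ),
);
```

Passing a third argument such as `/path/to/DINO-X-MCP/build/index.js` yields the local-project variant instead.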
3. Get API Key
Get your API key from the DINO-X Platform (a free quota is available for new users).
Replace your-api-key-here in the configuration above with your actual API key.
4. Environment Variables
The DINO-X MCP server supports the following environment variables:
Variable Name | Description | Required | Default Value | Example |
---|---|---|---|---|
DINOX_API_KEY | Your DINO-X API key for authentication | Required | - | your-api-key-here |
IMAGE_STORAGE_DIRECTORY | Directory where generated visualization images will be saved | Optional | macOS/Linux: /tmp/dinox-mcp; Windows: %TEMP%\dinox-mcp | /Users/admin/Downloads/dinox-images |
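To make the required/optional split and the platform defaults in the table concrete, here is a minimal sketch of how these variables could be resolved. It is an assumption about the behavior, not the server's actual code: `resolveSettings` and `DinoxSettings` are hypothetical names, and the environment and platform are passed in as plain values so the function stays pure.

```typescript
interface DinoxSettings {
  apiKey: string;
  imageStorageDirectory: string;
}

// Sketch only: the real server's resolution logic is not shown in this README.
function resolveSettings(
  env: Record<string, string | undefined>,
  platform: string,
): DinoxSettings {
  const apiKey = env.DINOX_API_KEY;
  if (!apiKey) {
    // DINOX_API_KEY is marked "Required" in the table.
    throw new Error("DINOX_API_KEY is required");
  }
  // Defaults from the table: /tmp/dinox-mcp or %TEMP%\dinox-mcp.
  // If TEMP is unset, the literal "%TEMP%" placeholder is left unexpanded.
  const defaultDir =
    platform === "win32"
      ? `${env.TEMP || "%TEMP%"}\\dinox-mcp`
      : "/tmp/dinox-mcp";
  return {
    apiKey,
    // An empty string is treated the same as an unset variable.
    imageStorageDirectory: env.IMAGE_STORAGE_DIRECTORY || defaultDir,
  };
}

console.log(resolveSettings({ DINOX_API_KEY: "your-api-key-here" }, "linux"));
```

In a Node.js process this would typically be called as `resolveSettings(process.env, process.platform)`.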
5. Available Tools
Restart your MCP client, and you should be able to use the following tools:
Method Name | Description | Input | Output |
---|---|---|---|
detect-all-objects | Detects and localizes all recognizable objects in an image. | Image | Category names + bounding boxes + captions |
object-detection-by-text | Detects and localizes objects in an image based on a natural language prompt. | Image + Text prompt | Bounding boxes + object captions |
detect-human-pose-keypoints | Detects 17 human body keypoints per person in an image for pose estimation. | Image | Keypoint coordinates and captions |
visualize-detections | Visualizes detection results by drawing bounding boxes and labels on the image. | Image + Detection results | Annotated image saved to storage directory |
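MCP clients normally invoke these tools for you, but it can help to see what a call looks like on the wire. The sketch below builds a JSON-RPC `tools/call` request, the method the Model Context Protocol defines for tool invocation. The tool names come from the table above; the argument names `imageFileUri` and `textPrompt` are assumptions made for illustration, so check the tool schemas your client reports before relying on them.

```typescript
// Tool names taken from the table above.
type DinoxTool =
  | "detect-all-objects"
  | "object-detection-by-text"
  | "detect-human-pose-keypoints"
  | "visualize-detections";

// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: DinoxTool; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: DinoxTool,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "object-detection-by-text", {
  imageFileUri: "https://example.com/warehouse.jpg", // assumed argument name
  textPrompt: "cardboard box", // assumed argument name
});
console.log(JSON.stringify(request, null, 2));
```

Restricting `name` to a union of the four tool names lets the compiler catch a misspelled tool before the request is sent.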
Usage
Supported Image Formats
- Remote URLs starting with https://
- Local file paths (starting with file://)
- Common image formats: jpg, jpeg, png, webp
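The rules above can be checked on the client side before an image is handed to a tool. The following is a minimal sketch of such a pre-check, with `isSupportedImageUri` as a hypothetical helper name. It assumes the format is recognizable from the file extension, which this README does not state; a remote URL without an extension may still be accepted by the server.

```typescript
const SUPPORTED_EXTENSIONS = ["jpg", "jpeg", "png", "webp"];

// Hypothetical pre-check: accepts https:// or file:// URIs whose path ends
// in one of the listed image extensions (case-insensitive).
function isSupportedImageUri(uri: string): boolean {
  const lower = uri.toLowerCase();
  if (!lower.startsWith("https://") && !lower.startsWith("file://")) {
    return false;
  }
  // Ignore any query string or fragment before checking the extension.
  const pathPart = lower.split(/[?#]/)[0] ?? lower;
  const dot = pathPart.lastIndexOf(".");
  if (dot === -1) {
    return false;
  }
  return SUPPORTED_EXTENSIONS.includes(pathPart.slice(dot + 1));
}

console.log(isSupportedImageUri("https://example.com/photo.png?size=large"));
console.log(isSupportedImageUri("file:///Users/admin/Downloads/scan.tiff"));
```

Plain `http://` URLs are rejected here because only `https://` is listed as supported.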
API Docs
Please refer to the DINO-X Platform for API usage limits and pricing information.
Development
Watch Mode
During development, you can use watch mode for automatic rebuilding:
pnpm run watch
Debugging
Use MCP Inspector to debug the server:
pnpm run inspector
License
Apache License 2.0