AI Image MCP Server
AI-powered image analysis using OpenAI's Vision API.
A comprehensive Model Context Protocol (MCP) server that provides both AI-powered image analysis and AI image generation capabilities using OpenAI's Vision API and image generation models.
System Requirements
Tested on:
- macOS 14.3.0 (Darwin 23.3.0, ARM64)
- Python 3.13.0
- uv 0.7.13
- OpenAI API access
Features
Image Analysis & Description
- Smart Image Analysis: Analyze images using OpenAI's GPT-4o Vision model
- Targeted Analysis: Analyze specific aspects (objects, text, colors, composition, emotions)
- Image Comparisons: Compare two images and highlight similarities/differences
- Metadata Extraction: Get technical information about image files
- Intelligent Caching: Cache analysis results to avoid repeated API calls
- Multiple Formats: Support for PNG, JPEG, GIF, and WebP formats
Image Generation & Editing
- Text-to-Image Generation: Create images from text prompts using DALL-E 2, DALL-E 3, or GPT-Image-1
- Image Editing: Edit existing images with text prompts using GPT-Image-1 or DALL-E 2
- Image Variations: Create variations of existing images using DALL-E 2
- Flexible Output: Save generated images locally with custom naming and directories
- Model Support: Full support for all OpenAI image generation models with their specific features
MCP Tools
- describe_image(image_path, prompt) - Get detailed image descriptions
- analyze_image_content(image_path, analysis_type) - Analyze specific aspects
- compare_images(image1_path, image2_path, comparison_focus) - Compare two images
- get_image_metadata(image_path) - Extract technical metadata
- get_cache_info() - View cache statistics
- clear_image_cache() - Clear cached results
Installation
- Install dependencies:
curl -LsSf https://astral.sh/uv/install.sh | sh
uv add "mcp[cli]" openai pillow requests
- Set your OpenAI API key:
export OPENAI_API_KEY="your-api-key-here"
- Run the server:
uv run main.py
Running the Server
uv run main.py
MCP Integration
Claude Desktop
{
  "mcpServers": {
    "ai-image-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/ai-image-mcp",
        "run",
        "main.py"
      ],
      "env": {
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Cursor
Configure MCP in Cursor settings:
{
  "servers": {
    "ai-image-mcp": {
      "command": "uv",
      "args": ["run", "main.py"],
      "cwd": "/absolute/path/to/ai-image-mcp",
      "env": {
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Analysis Types
- general: Overall image description
- objects: Object detection and identification
- text: Text extraction and OCR
- colors: Color analysis and palette
- composition: Visual composition and layout
- emotions: Emotional content and mood
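A minimal sketch of how an analysis type could be validated and mapped to a specialized prompt. The prompt texts and the `build_analysis_prompt` helper are hypothetical illustrations; the server's real prompts are not shown in this README and may differ.

```python
# Hypothetical prompt templates keyed by the analysis types listed above.
# The wording is illustrative only, not the server's actual prompts.
ANALYSIS_PROMPTS = {
    "general": "Describe this image in detail.",
    "objects": "List and identify the objects visible in this image.",
    "text": "Extract all text visible in this image.",
    "colors": "Describe the dominant colors and overall palette of this image.",
    "composition": "Describe the visual composition and layout of this image.",
    "emotions": "Describe the emotional content and mood of this image.",
}


def build_analysis_prompt(analysis_type: str = "general") -> str:
    """Return the prompt for an analysis type, rejecting unknown types."""
    key = analysis_type.strip().lower()
    if key not in ANALYSIS_PROMPTS:
        valid = ", ".join(sorted(ANALYSIS_PROMPTS))
        raise ValueError(
            f"Unknown analysis_type {analysis_type!r}; expected one of: {valid}"
        )
    return ANALYSIS_PROMPTS[key]
```

Rejecting unknown types before any API call keeps a typo from costing a request.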
Project Structure
ai-image-mcp/
├── test_data/ # Sample images (gitignored)
├── tools/ # MCP tool definitions
├── utils/ # Utilities (caching, OpenAI client)
├── main.py # Server entry point
└── server.py # MCP server instance
Caching
- Automatic file change detection via SHA-256 hashes
- 30-day cache expiration
- Separate cache entries for different prompts/analysis types
- Significant performance improvements (1000x+ faster than API calls)
Available Tools
Image Analysis Tools
describe_image
Analyze an image and provide a detailed description.
- Parameters:
  - image_path (str): Path to the image file
  - prompt (str, optional): Custom analysis prompt
- Supports: PNG, JPEG, GIF, WebP
- Features: Caching, file validation, comprehensive error handling
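OpenAI's vision-capable chat models accept images as base64 data URLs. The sketch below shows one way file validation and encoding could be combined for the four supported formats. The `image_to_data_url` helper and the extension-based check are assumptions for illustration, not the server's actual code.

```python
import base64
from pathlib import Path

# Supported formats from the list above, mapped to their MIME types.
SUPPORTED_MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_to_data_url(image_path: str) -> str:
    """Validate the file by extension and return it as a base64 data URL."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {image_path}")
    mime = SUPPORTED_MIME_TYPES.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"Unsupported image format: {path.suffix or '(none)'}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A check on the extension alone is cheap but can be fooled by a misnamed file; inspecting the file header (for example with Pillow) would be stricter.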
analyze_image_content
Perform targeted analysis of specific image aspects.
- Parameters:
  - image_path (str): Path to the image file
  - analysis_type (str): Type of analysis - "general", "objects", "text", "colors", "composition", "emotions"
- Features: Specialized prompts for different analysis types
compare_images
Compare two images and highlight similarities and differences.
- Parameters:
  - image1_path (str): Path to first image
  - image2_path (str): Path to second image
  - comparison_focus (str): What to focus on in comparison
get_image_metadata
Get technical metadata about an image file.
- Returns: File size, dimensions, format, color mode, aspect ratio, etc.
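To illustrate where fields such as dimensions and aspect ratio come from, here is a standard-library-only sketch that reads them from a PNG header. The server itself most likely relies on Pillow (it is a listed dependency) and handles every supported format; `png_metadata` is a hypothetical helper that covers PNG only.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_metadata(data: bytes) -> dict:
    """Read basic metadata from the raw bytes of a PNG file."""
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("Not a valid PNG file")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    if width == 0 or height == 0:
        raise ValueError("PNG reports a zero dimension")
    return {
        "format": "PNG",
        "file_size_bytes": len(data),
        "width": width,
        "height": height,
        "aspect_ratio": round(width / height, 3),
    }
```

Usage would look like `png_metadata(Path("image.png").read_bytes())`, with `Path` imported from `pathlib`.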
Image Generation Tools
generate_image
Generate images from text prompts using OpenAI's image generation models.
- Parameters:
  - prompt (str): Text description of desired image
  - model (str): "dall-e-2", "dall-e-3", or "gpt-image-1" (default: dall-e-3)
  - size (str, optional): Image dimensions (varies by model)
  - quality (str, optional): Quality setting (varies by model)
  - style (str, optional): "vivid" or "natural" (DALL-E 3 only)
  - n (int, optional): Number of images (1-10, DALL-E 3 only supports 1)
  - output_dir (str): Directory to save images (default: "./generated_images")
  - filename_prefix (str): Prefix for filenames (default: "generated")
Model-Specific Features:
- DALL-E 2: Basic generation, sizes: 256x256, 512x512, 1024x1024
- DALL-E 3: High quality, styles (vivid/natural), sizes: 1024x1024, 1792x1024, 1024x1792
- GPT-Image-1: Advanced features, transparency support, compression control
edit_image
Edit existing images using text prompts.
- Parameters:
  - image_path (str): Path to image to edit
  - prompt (str): Description of desired edit
  - mask_path (str, optional): Path to mask image (PNG with transparent edit areas)
  - model (str): "gpt-image-1" or "dall-e-2" (default: gpt-image-1)
  - size, quality, n: Model-specific options
  - output_dir, filename_prefix: Output configuration
Supported Models: GPT-Image-1 (up to 16 images, 50MB each) and DALL-E 2 (1 square PNG, 4MB max)
create_image_variations
Create variations of existing images using DALL-E 2.
- Parameters:
  - image_path (str): Path to source image (must be square PNG, <4MB)
  - n (int): Number of variations (1-10, default: 2)
  - size (str): Variation size - "256x256", "512x512", "1024x1024"
  - output_dir, filename_prefix: Output configuration
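The constraints above (square PNG under 4MB, 1-10 variations, three allowed sizes) can be checked before any API call. The following is a sketch under the assumption that the caller already knows the source image's dimensions and size; `check_variation_request` is a hypothetical helper, not part of the server's documented interface.

```python
VARIATION_SIZES = {"256x256", "512x512", "1024x1024"}
MAX_SOURCE_BYTES = 4 * 1024 * 1024  # assumes "4MB" means 4 MiB


def check_variation_request(width: int, height: int, file_size: int,
                            is_png: bool, n: int = 2,
                            size: str = "1024x1024") -> list:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not is_png:
        problems.append("source image must be a PNG")
    if width != height:
        problems.append(f"source image must be square, got {width}x{height}")
    if file_size >= MAX_SOURCE_BYTES:
        problems.append("source image must be smaller than 4MB")
    if not 1 <= n <= 10:
        problems.append("n must be between 1 and 10")
    if size not in VARIATION_SIZES:
        problems.append(f"size must be one of {sorted(VARIATION_SIZES)}")
    return problems
```

Collecting all problems instead of raising on the first lets a client fix every issue in one round trip.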
list_generated_images
List all generated images in a directory with metadata.
- Parameters:
  - directory (str): Directory to scan (default: "./generated_images")
- Returns: File listing with sizes, dimensions, modification dates
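A rough sketch of such a directory listing using only the standard library. `list_images` is a hypothetical name, and dimensions are left out here because reading them reliably across formats would need an imaging library such as Pillow.

```python
from datetime import datetime, timezone
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def list_images(directory: str = "./generated_images") -> list:
    """Return name, size, and modification time for each image file."""
    root = Path(directory)
    if not root.is_dir():
        return []
    entries = []
    # Sort by path so the listing order is stable between calls.
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            entries.append({
                "name": path.name,
                "size_bytes": stat.st_size,
                "modified": modified.isoformat(),
            })
    return entries
```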
Cache Management Tools
get_cache_info
Get information about the analysis cache (file count, size, location).
clear_image_cache
Clear all cached analysis results.
Model Comparison
Feature | DALL-E 2 | DALL-E 3 | GPT-Image-1 |
---|---|---|---|
Generation | ✅ Basic | ✅ High Quality | ✅ Advanced |
Editing | ✅ Limited | ❌ | ✅ Advanced |
Variations | ✅ | ❌ | ❌ |
Max Images | 10 | 1 | 10 |
Sizes | 256x256, 512x512, 1024x1024 | 1024x1024, 1792x1024, 1024x1792 | 1024x1024, 1536x1024, 1024x1536 |
Styles | ❌ | vivid, natural | ❌ |
Quality | standard | standard, hd | auto, high, medium, low |
Transparency | ❌ | ❌ | ✅ |
Max Prompt | 1000 chars | 4000 chars | 32000 chars |
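The table lends itself to a data-driven validation step. The sketch below encodes the sizes, quality values, image counts, and prompt limits from the table and rejects requests that fall outside them. `ModelSpec` and `validate_generation_request` are hypothetical names; how the server actually validates parameters is not shown in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    sizes: frozenset
    qualities: frozenset
    max_images: int
    max_prompt_chars: int
    supports_style: bool


# Values copied from the comparison table above.
MODEL_SPECS = {
    "dall-e-2": ModelSpec(
        frozenset({"256x256", "512x512", "1024x1024"}),
        frozenset({"standard"}), 10, 1000, False),
    "dall-e-3": ModelSpec(
        frozenset({"1024x1024", "1792x1024", "1024x1792"}),
        frozenset({"standard", "hd"}), 1, 4000, True),
    "gpt-image-1": ModelSpec(
        frozenset({"1024x1024", "1536x1024", "1024x1536"}),
        frozenset({"auto", "high", "medium", "low"}), 10, 32000, False),
}


def validate_generation_request(prompt, model="dall-e-3", size=None,
                                quality=None, style=None, n=1):
    """Raise ValueError if the request violates the model's limits."""
    spec = MODEL_SPECS.get(model)
    if spec is None:
        raise ValueError(f"Unknown model: {model!r}")
    if not prompt or len(prompt) > spec.max_prompt_chars:
        raise ValueError(f"Prompt must be 1-{spec.max_prompt_chars} characters")
    if size is not None and size not in spec.sizes:
        raise ValueError(f"{model} does not support size {size!r}")
    if quality is not None and quality not in spec.qualities:
        raise ValueError(f"{model} does not support quality {quality!r}")
    if style is not None:
        if not spec.supports_style:
            raise ValueError(f"{model} does not support the style parameter")
        if style not in {"vivid", "natural"}:
            raise ValueError("style must be 'vivid' or 'natural'")
    if not 1 <= n <= spec.max_images:
        raise ValueError(f"{model} supports 1-{spec.max_images} images per request")
```

Keeping the limits in one table-like structure means a new model only needs a new `MODEL_SPECS` entry.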
Usage Examples
Generate a Basic Image
# Generate an image with DALL-E 3
generate_image(
    prompt="A serene mountain landscape at sunset with a crystal clear lake",
    model="dall-e-3",
    size="1792x1024",
    quality="hd",
    style="natural"
)
Edit an Existing Image
# Add elements to an image
edit_image(
    image_path="./photos/room.png",
    prompt="Add a beautiful bookshelf filled with colorful books to the left wall",
    model="gpt-image-1",
    quality="high"
)
Create Image Variations
# Create variations of a logo
create_image_variations(
    image_path="./logos/logo.png",
    n=5,
    size="1024x1024"
)
Analyze Generated Images
# Analyze a generated image
describe_image(
    image_path="./generated_images/generated_1234567890_1.png",
    prompt="Describe the artistic style and composition of this generated image"
)
File Organization
Generated images are automatically organized in separate directories:
- ./generated_images/ - Text-to-image generations
- ./edited_images/ - Image edits
- ./image_variations/ - Image variations
Files are named with timestamps to avoid conflicts:
generated_1234567890_1.png
edited_1234567890_1.png
variation_1234567890_1.png
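The naming pattern in these examples could be produced as follows. This sketch assumes the middle number is a Unix timestamp in seconds and the trailing number is a 1-based index within one request, which is inferred from the example names rather than stated; `build_output_paths` is a hypothetical helper.

```python
import time
from pathlib import Path


def build_output_paths(output_dir: str = "./generated_images",
                       filename_prefix: str = "generated",
                       count: int = 1,
                       timestamp=None) -> list:
    """Build <prefix>_<timestamp>_<index>.png paths for one request."""
    # Assumption: one timestamp per request, shared by all of its images.
    ts = int(time.time()) if timestamp is None else int(timestamp)
    directory = Path(output_dir)
    return [directory / f"{filename_prefix}_{ts}_{i}.png"
            for i in range(1, count + 1)]
```

Two requests with the same prefix in the same second would still collide under this scheme, so a real implementation may want an extra uniqueness check.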
Error Handling
The server includes comprehensive error handling for:
- Invalid image formats and file paths
- Model-specific parameter validation
- File size and dimension limits
- API quota and rate limiting
- Network connectivity issues
- Malformed prompts and parameters
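For rate limiting in particular, a common pattern is to retry with exponential backoff. The README does not say whether the server retries, so the sketch below is a general illustration only; `RateLimited` is a stand-in exception and `call_with_backoff` is a hypothetical helper.

```python
import random
import time


class RateLimited(Exception):
    """Stand-in for a provider's rate-limit error (hypothetical)."""


def call_with_backoff(func, max_attempts: int = 4, base_delay: float = 1.0,
                      sleep=time.sleep):
    """Call func(), retrying on RateLimited with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts:
                raise  # out of retries: surface the error to the caller
            # Double the wait on each attempt and add jitter so that
            # concurrent clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so the waiting can be replaced in tests.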
Cache System
The analysis tools use an intelligent caching system:
- File Change Detection: Uses SHA-256 hashes to detect file changes
- 30-Day Expiration: Automatically expires old cache entries
- Safe Operation: Cache failures don't affect main functionality
- Efficient Storage: Uses MD5 hashes for safe cache key generation
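The points above can be sketched roughly as follows: a SHA-256 content hash detects file changes, an MD5 digest over path, prompt, and content hash yields a filesystem-safe key, and entries older than 30 days are treated as expired. The function names and the exact inputs to the key are assumptions; the server's real cache layout is not documented here.

```python
import hashlib
import time
from typing import Optional

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day expiration


def file_fingerprint(data: bytes) -> str:
    """SHA-256 of the file contents; changes whenever the file changes."""
    return hashlib.sha256(data).hexdigest()


def cache_key(image_path: str, prompt: str, fingerprint: str) -> str:
    """MD5 over path, prompt, and content hash, used only as a key."""
    # Assumption: these three inputs identify an analysis result.
    raw = "\n".join((image_path, prompt, fingerprint))
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def is_expired(created_at: float, now: Optional[float] = None) -> bool:
    """Return True once an entry is older than the 30-day TTL."""
    current = time.time() if now is None else now
    return current - created_at > CACHE_TTL_SECONDS
```

Because the prompt is part of the key, the same image analyzed with a different prompt or analysis type gets its own cache entry. MD5 is acceptable here because the key is not used for security.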
Requirements
- Python 3.13+
- OpenAI API key with access to Vision API and Image Generation
- Required packages: mcp[cli]>=1.9.4, openai>=1.90.0, pillow>=11.2.1, requests>=2.32.4
License
This project is licensed under the MIT License - see the LICENSE file for details.