Image Generator
Image generation and editing with advanced features like multi-image blending and character consistency
🍌 MCP Image Generator
Powered by Gemini 2.5 Flash Image - Nano Banana 🍌
A powerful MCP (Model Context Protocol) server that enables AI assistants to generate and edit images using Google's Gemini 2.5 Flash Image (Nano Banana 🍌). Seamlessly integrate advanced image generation capabilities into Codex, Cursor, Claude Code, and other MCP-compatible AI tools.
✨ Features
- AI-Powered Image Generation: Create images from text prompts using Gemini 2.5 Flash Image (Nano Banana)
- Intelligent Prompt Enhancement: Automatically optimizes your prompts using Gemini 2.0 Flash for superior image quality
  - Adds photographic and artistic details
  - Enriches lighting, composition, and atmosphere descriptions
  - Preserves your intent while maximizing generation quality
- Image Editing: Transform existing images with natural language instructions
  - Context-aware editing that preserves original style
  - Maintains visual consistency with source image
- Advanced Options:
  - Multi-image blending for composite scenes
  - Character consistency across generations
  - World knowledge integration for accurate context
- Multiple Output Formats: PNG, JPEG, WebP support
- File Output: Images are saved as files for easy access and integration
🔧 Prerequisites
- Node.js 20 or higher
- Gemini API Key - Get yours at Google AI Studio
- Codex, Cursor, or Claude Code (file I/O capable AI tools)
- Basic terminal/command line knowledge
🚀 Quick Start
1. Get Your Gemini API Key
Get your API key from Google AI Studio
2. MCP Configuration
For Codex
Add to ~/.codex/config.toml:
[mcp_servers.mcp-image]
command = "npx"
args = ["-y", "mcp-image"]
[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"
For Cursor
Add to your Cursor settings:
- Global (all projects): ~/.cursor/mcp.json
- Project-specific: .cursor/mcp.json in your project root
{
  "mcpServers": {
    "mcp-image": {
      "command": "npx",
      "args": ["-y", "mcp-image"],
      "env": {
        "GEMINI_API_KEY": "your_gemini_api_key_here",
        "IMAGE_OUTPUT_DIR": "/absolute/path/to/images"
      }
    }
  }
}
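If you would rather add this entry with a script than edit the file by hand, the merge could look like the sketch below. The function name `add_mcp_image_server` is hypothetical and not part of this package; only the `mcpServers` layout and the `mcp-image` entry come from the JSON above. The sketch assumes the file, if present, contains valid JSON, and it keeps any other servers already configured.

```python
import json
from pathlib import Path


def add_mcp_image_server(config_path: Path, api_key: str, output_dir: str) -> dict:
    """Merge an mcp-image entry into a Cursor-style mcp.json and return the result."""
    # Start from the existing file so other configured servers are preserved.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["mcp-image"] = {
        "command": "npx",
        "args": ["-y", "mcp-image"],
        "env": {"GEMINI_API_KEY": api_key, "IMAGE_OUTPUT_DIR": output_dir},
    }
    # Create the parent directory (e.g. .cursor/) if it does not exist yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For a project-level setup you might call `add_mcp_image_server(Path(".cursor/mcp.json"), key, "/absolute/path/to/images")`. The resulting file contains your API key, so keep it out of version control.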
For Claude Code
Run in your project directory to enable for that project:
cd /path/to/your/project
claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image
Or add globally for all projects:
claude mcp add mcp-image --scope user --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image
⚠️ Security Note: Never commit your API key to version control. Keep it secure and use environment-specific configuration.
📁 Path Requirements:
- IMAGE_OUTPUT_DIR must be an absolute path (e.g., /Users/username/images, not ./images)
- Defaults to ./output in the current working directory if not specified
- Directory will be created automatically if it doesn't exist
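The path rules above can be illustrated with a short sketch. This is not the server's actual code: `resolve_output_dir` is a hypothetical helper, and raising `ValueError` for a relative path is an assumption about how a violation might be reported.

```python
from pathlib import Path


def resolve_output_dir(env: dict[str, str], cwd: Path) -> Path:
    """Return the directory images would be written to, creating it if needed."""
    configured = env.get("IMAGE_OUTPUT_DIR")
    if configured:
        path = Path(configured)
        if not path.is_absolute():
            raise ValueError(
                f"IMAGE_OUTPUT_DIR must be an absolute path, got {configured!r}"
            )
    else:
        # Documented default: ./output under the current working directory.
        path = cwd / "output"
    # The directory is created automatically if it does not exist.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `resolve_output_dir(dict(os.environ), Path.cwd())` (with `import os`) would mirror the behavior described for a running server under these assumptions.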
Optional: Skip Prompt Enhancement
Set SKIP_PROMPT_ENHANCEMENT=true to disable automatic prompt optimization and send your prompts directly to the image generator. Useful when you need full control over the exact prompt wording.
Codex:
[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
SKIP_PROMPT_ENHANCEMENT = "true"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"
Cursor:
Add "SKIP_PROMPT_ENHANCEMENT": "true" to the env section in your config.
Claude Code:
claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env SKIP_PROMPT_ENHANCEMENT=true --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image
📖 Usage Examples
Once configured, your AI assistant can generate images using natural language:
Basic Image Generation
"Generate a serene mountain landscape at sunset with a lake reflection"
The system automatically enhances this to include rich details about lighting, materials, composition, and atmosphere for optimal results.
Image Editing
"Edit this image to make the person face right"
(with inputImagePath: "/path/to/image.jpg")
Advanced Features
"Generate a portrait of a medieval knight, maintaining character consistency for future variations"
(with maintainCharacterConsistency: true)
🔧 API Reference
generate_image Tool
The MCP server exposes a single tool for all image operations. Internally, it uses a two-stage process:
- Prompt Optimization: Gemini 2.0 Flash analyzes and enriches your prompt
- Image Generation: Gemini 2.5 Flash Image creates the final image
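The control flow of the two stages, including the SKIP_PROMPT_ENHANCEMENT switch from the Quick Start, can be sketched as follows. Both stages are passed in as plain callables because the server's internal function names are not documented here; `run_pipeline`, `optimize`, and `generate` are hypothetical names, and only the literal value "true" shown in this README is treated as enabling the skip.

```python
from typing import Callable, Mapping


def run_pipeline(
    prompt: str,
    env: Mapping[str, str],
    optimize: Callable[[str], str],
    generate: Callable[[str], bytes],
) -> bytes:
    """Optionally enrich the prompt, then hand it to the image stage."""
    # Assumption: only the exact string "true" disables prompt optimization.
    skip = env.get("SKIP_PROMPT_ENHANCEMENT") == "true"
    final_prompt = prompt if skip else optimize(prompt)
    return generate(final_prompt)
```

Keeping the stages as parameters makes the skip behavior easy to check with stand-in functions before any API call is involved.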
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | ✅ | Text description or editing instruction |
| inputImagePath | string | - | Absolute path to input image for editing |
| fileName | string | - | Custom filename for output (auto-generated if not specified) |
| aspectRatio | string | - | Aspect ratio for the generated image. Supported values: 1:1 (square, default), 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
| blendImages | boolean | - | Enable multi-image blending |
| maintainCharacterConsistency | boolean | - | Maintain character appearance across generations |
| useWorldKnowledge | boolean | - | Use real-world knowledge for context |
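Your MCP client normally builds the tool call for you, but it can help to see what the parameters look like on the wire. The sketch below wraps them in MCP's JSON-RPC 2.0 `tools/call` request and applies the checks implied by the table. The helper name `build_generate_image_call` and the validation behavior are illustrative assumptions, not part of the server.

```python
import json

# Values copied from the aspectRatio row of the parameters table.
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}
OPTIONAL_PARAMS = {
    "inputImagePath", "fileName", "aspectRatio",
    "blendImages", "maintainCharacterConsistency", "useWorldKnowledge",
}


def build_generate_image_call(request_id: int, prompt: str, **options) -> str:
    """Serialize a tools/call request for generate_image after basic checks."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    ratio = options.get("aspectRatio", "1:1")
    if ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspectRatio: {ratio}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "generate_image",
            "arguments": {"prompt": prompt, **options},
        },
    }
    return json.dumps(message)
```

For example, `build_generate_image_call(1, "a medieval knight", maintainCharacterConsistency=True)` yields a request whose `arguments` match the Advanced Features example above.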
Response
{
  "type": "resource",
  "resource": {
    "uri": "file:///path/to/generated/image.png",
    "name": "image-filename.png",
    "mimeType": "image/png"
  },
  "metadata": {
    "model": "gemini-2.5-flash-image",
    "processingTime": 5000,
    "timestamp": "2024-01-01T12:00:00.000Z"
  }
}
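If you post-process results in your own tooling, the saved file's location can be recovered from the `uri` field. The following is a minimal sketch assuming the response shape shown above and POSIX-style paths; `image_path_from_response` is a hypothetical helper name, and Windows drive-letter URIs would need extra handling.

```python
from urllib.parse import unquote, urlparse


def image_path_from_response(response: dict) -> str:
    """Extract the local file path from a resource-type response."""
    if response.get("type") != "resource":
        raise ValueError("expected a response of type 'resource'")
    parsed = urlparse(response["resource"]["uri"])
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got scheme {parsed.scheme!r}")
    # Percent-decoding restores spaces and other escaped characters in the path.
    return unquote(parsed.path)
```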
🛠️ Troubleshooting
Common Issues
"API key not found"
- Ensure GEMINI_API_KEY is set in your environment
- Verify the API key is valid and has image generation permissions
"Input image file not found"
- Use absolute file paths, not relative paths
- Ensure the file exists and is accessible
- Supported formats: PNG, JPEG, WebP (max 10MB)
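A quick preflight check along these lines can catch most of the causes listed above before the tool is called. This is a sketch, not the server's own validation: `check_input_image` is a hypothetical helper, the format check looks only at the file extension, and "10MB" is assumed to mean 10 × 1024 × 1024 bytes.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is read as 10 MiB


def check_input_image(path_str: str) -> list[str]:
    """Return a list of problems with an input image path; empty means it looks fine."""
    path = Path(path_str)
    problems = []
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.is_file():
        problems.append("file does not exist or is not a regular file")
        return problems  # size and extension checks need an existing file
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported extension {path.suffix!r}")
    if path.stat().st_size > MAX_BYTES:
        problems.append("file is larger than 10MB")
    return problems
```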
"No image data found in Gemini API response"
- Try rephrasing your prompt with more specific details
- Ensure your prompt is appropriate for image generation
- Check if your API key has sufficient quota
Performance Tips
- Image generation: 30-60 seconds typical (includes prompt optimization)
- Image editing: 15-45 seconds typical (includes context analysis)
- Simple prompts work great - the AI automatically adds professional details
- Complex prompts are preserved and further enhanced
- Consider enabling useWorldKnowledge for historical or factual subjects
💰 Usage Notes
- This MCP server uses the paid Gemini API for both prompt optimization and image generation
  - Gemini 2.0 Flash for intelligent prompt enhancement (minimal token usage)
  - Gemini 2.5 Flash Image for actual image generation
- Check current pricing and rate limits at Google AI Studio
- Monitor your API usage to avoid unexpected charges
- The prompt optimization step adds minimal cost while significantly improving output quality
📄 License
MIT License - see LICENSE for details.
Need help? Open an issue or check the troubleshooting section above.