🍌 MCP Image Generator

Powered by Gemini 3 Pro Image - Nano Banana Pro 🍌

A powerful MCP (Model Context Protocol) server that enables AI assistants to generate and edit images using Google's Gemini 3 Pro Image (Nano Banana Pro 🍌). Seamlessly integrate advanced image generation capabilities into Codex, Cursor, Claude Code, and other MCP-compatible AI tools.

✨ Features

- AI-Powered Image Generation: Create images from text prompts using Gemini 3 Pro Image (Nano Banana Pro)
- Intelligent Prompt Enhancement: Automatically optimizes your prompts using Gemini 2.5 Flash for superior image quality
  - Adds photographic and artistic details
  - Enriches lighting, composition, and atmosphere descriptions
  - Preserves your intent while maximizing generation quality
- Image Editing: Transform existing images with natural language instructions
  - Context-aware editing that preserves original style
  - Maintains visual consistency with source image
- High-Resolution Output: Support for 2K and 4K image generation
  - Standard quality for fast generation
  - 2K resolution for enhanced detail
  - 4K resolution for professional-grade images with superior text rendering
- Flexible Aspect Ratios: Multiple aspect ratio options (1:1, 16:9, 9:16, 21:9, and more)
- Advanced Options:
  - Multi-image blending for composite scenes
  - Character consistency across generations
  - World knowledge integration for accurate context
- Multiple Output Formats: PNG, JPEG, WebP support
- File Output: Images are saved as files for easy access and integration

🎨 Agent Skill: Image Generation Prompt Guide

This project also provides a standalone Agent Skill that teaches AI assistants to write better image generation prompts — no MCP server or API key required.

What it does

Note: This skill does not generate images itself — it teaches your AI assistant to write better prompts. Your AI tool must already have built-in image generation capabilities (e.g., Cursor's image generation feature).

A reference guide that AI assistants use to improve image generation prompts based on the Subject-Context-Style framework. Works with any image model (Gemini, DALL-E, Flux, Stable Diffusion, etc.).

Covers:

Prompt structure — How to build prompts around Subject, Context, and Style
Visual details — Lighting, textures, camera angles, atmosphere, text in images
Advanced features — Character consistency, multi-element composition, factual accuracy, purpose-specific output
Image editing — How to describe edits while keeping the original look intact

Install

npx mcp-image skills install --path <target-directory>

The skill will be placed at <path>/image-generation/SKILL.md. Specify the skills directory for your AI tool:

# Cursor
npx mcp-image skills install --path ~/.cursor/skills

# Codex
npx mcp-image skills install --path ~/.codex/skills

# Claude Code
npx mcp-image skills install --path ~/.claude/skills

When to use the Skill vs the MCP server

| | MCP Server | Agent Skill |
| --- | --- | --- |
| Use when | Your AI tool does not have built-in image generation | Your AI tool already generates images natively |
| Requires | Gemini API key | Nothing |
| What it does | Generates images via API | Teaches the AI to write better prompts |
| Works with | MCP-compatible tools | Any tool supporting the Agent Skills standard |

🔧 Prerequisites

Node.js 20 or higher
Gemini API Key - Get yours at Google AI Studio
Codex, Cursor, or Claude Code (file I/O capable AI tools)
Basic terminal/command line knowledge

🚀 Quick Start

1. Get Your Gemini API Key

Get your API key from Google AI Studio

2. MCP Configuration

For Codex

Add to ~/.codex/config.toml:

[mcp_servers.mcp-image]
command = "npx"
args = ["-y", "mcp-image"]

[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"

For Cursor

Add to your Cursor settings:

Global (all projects): ~/.cursor/mcp.json
Project-specific: .cursor/mcp.json in your project root

{
  "mcpServers": {
    "mcp-image": {
      "command": "npx",
      "args": ["-y", "mcp-image"],
      "env": {
        "GEMINI_API_KEY": "your_gemini_api_key_here",
        "IMAGE_OUTPUT_DIR": "/absolute/path/to/images"
      }
    }
  }
}

For Claude Code

Run in your project directory to enable for that project:

cd /path/to/your/project
claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

Or add globally for all projects:

claude mcp add mcp-image --scope user --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

⚠️ Security Note: Never commit your API key to version control. Keep it secure and use environment-specific configuration.

📁 Path Requirements:

IMAGE_OUTPUT_DIR must be an absolute path (e.g., /Users/username/images, not ./images)
Defaults to ./output in the current working directory if not specified
Directory will be created automatically if it doesn't exist
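
The path rules above can be sketched as a small check. This is an illustrative sketch only, not the server's actual code: `isAbsolutePath` and `resolveOutputDir` are hypothetical names, and joining the default with a forward slash is an assumption.

```typescript
// Hypothetical helpers that mirror the documented rules; not taken from the server's source.
function isAbsolutePath(p: string): boolean {
  // POSIX absolute paths start with "/"; Windows ones look like "C:\..." or "C:/...".
  return p.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(p);
}

function resolveOutputDir(envValue: string | undefined, cwd: string): string {
  if (envValue === undefined || envValue === "") {
    // Documented default: ./output under the current working directory.
    return cwd.replace(/[\\/]+$/, "") + "/output";
  }
  if (!isAbsolutePath(envValue)) {
    throw new Error("IMAGE_OUTPUT_DIR must be an absolute path, got: " + envValue);
  }
  return envValue;
}
```

With these assumptions, an unset variable resolves under the working directory, `/Users/username/images` is accepted unchanged, and `./images` is rejected.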

Optional: Skip Prompt Enhancement

Set SKIP_PROMPT_ENHANCEMENT=true to disable automatic prompt optimization and send your prompts directly to the image generator. Useful when you need full control over the exact prompt wording.

Codex:

[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
SKIP_PROMPT_ENHANCEMENT = "true"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"

Cursor: Add "SKIP_PROMPT_ENHANCEMENT": "true" to the env section in your config.

Claude Code:

claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env SKIP_PROMPT_ENHANCEMENT=true --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

📖 Usage Examples

Once configured, your AI assistant can generate images using natural language:

Basic Image Generation

"Generate a serene mountain landscape at sunset with a lake reflection"

The system automatically enhances this to include rich details about lighting, materials, composition, and atmosphere for optimal results.

Image Editing

"Edit this image to make the person face right"
(with inputImagePath: "/path/to/image.jpg")

Advanced Features

Character Consistency:

"Generate a portrait of a medieval knight, maintaining character consistency for future variations"
(with maintainCharacterConsistency: true)

High-Resolution 4K Generation:

"Generate a professional product photo of a smartphone with clear text on the screen"
(with imageSize: "4K")

Custom Aspect Ratio:

"Generate a cinematic landscape of a desert at golden hour"
(with aspectRatio: "21:9")
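
Behind the scenes, an MCP client turns a request like the ones above into a JSON-RPC `tools/call` message addressed to the `generate_image` tool. The sketch below shows roughly what that message looks like for the aspect ratio example. The helper name `buildGenerateImageCall` is hypothetical, and your AI tool normally builds this message for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { [key: string]: unknown } };
}

// Hypothetical helper: wraps tool arguments in a tools/call envelope.
function buildGenerateImageCall(id: number, args: { [key: string]: unknown }): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

// The "Custom Aspect Ratio" example expressed as tool arguments.
const request = buildGenerateImageCall(1, {
  prompt: "Generate a cinematic landscape of a desert at golden hour",
  aspectRatio: "21:9",
});
```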

🔧 API Reference

`generate_image` Tool

The MCP server exposes a single tool for all image operations. Internally, it uses a two-stage process:

1. Prompt Optimization: Gemini 2.5 Flash analyzes and enriches your prompt
2. Image Generation: Gemini 3 Pro Image creates the final image
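
As a rough mental model, the two stages compose as shown below. This is a sketch under stated assumptions, not the server's implementation: the stage signatures and the names `runPipeline`, `fakeEnhance`, and `fakeGenerate` are hypothetical, and the real stages would be asynchronous Gemini API calls rather than the synchronous stand-ins used here.

```typescript
// Assumed stage signatures; the real stages would be asynchronous API calls.
type EnhanceStage = (prompt: string) => string;
type GenerateStage = (prompt: string) => string; // returns the saved file path

function runPipeline(
  prompt: string,
  enhance: EnhanceStage,
  generate: GenerateStage,
  skipEnhancement: boolean
): { finalPrompt: string; filePath: string } {
  // Stage 1: prompt optimization, bypassed when SKIP_PROMPT_ENHANCEMENT=true.
  const finalPrompt = skipEnhancement ? prompt : enhance(prompt);
  // Stage 2: image generation from whichever prompt came out of stage 1.
  const filePath = generate(finalPrompt);
  return { finalPrompt: finalPrompt, filePath: filePath };
}

// Stand-in stages so the sketch runs without network access or an API key.
const fakeEnhance: EnhanceStage = (p) => p + ", golden-hour lighting";
const fakeGenerate: GenerateStage = (_p) => "/absolute/path/to/images/out.png";

const enhanced = runPipeline("a mountain lake", fakeEnhance, fakeGenerate, false);
const direct = runPipeline("a mountain lake", fakeEnhance, fakeGenerate, true);
```

Here `enhanced.finalPrompt` carries the added detail, while `direct.finalPrompt` is the original wording, which is the behavior `SKIP_PROMPT_ENHANCEMENT=true` is described as providing.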

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | ✅ | Text description or editing instruction |
| `inputImagePath` | string | - | Absolute path to input image for editing |
| `fileName` | string | - | Custom filename for output (auto-generated if not specified) |
| `aspectRatio` | string | - | Aspect ratio for the generated image. Supported values: `1:1` (square, default), `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
| `imageSize` | string | - | Output resolution. Leave unspecified for standard quality; `2K` and `4K` produce higher-resolution images with better text rendering and fine details. Supported values: `2K`, `4K` |
| `blendImages` | boolean | - | Enable multi-image blending for combining multiple visual elements naturally |
| `maintainCharacterConsistency` | boolean | - | Maintain character appearance consistency across different poses and scenes |
| `useWorldKnowledge` | boolean | - | Use real-world knowledge for accurate context (recommended for historical figures, landmarks, or factual scenarios) |
| `useGoogleSearch` | boolean | - | Enable Google Search grounding to access real-time web information for factually accurate image generation. Use when the prompt requires current or time-sensitive data that may have changed since the model's knowledge cutoff. Leave disabled for creative, fictional, historical, or timeless content. |
| `purpose` | string | - | Intended use for the image (e.g., "cookbook cover", "social media post", "presentation slide"). Helps tailor visual style, quality level, and details to match the purpose. |
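
For callers that build arguments programmatically, the parameter list can be written down as a type plus a simple pre-flight check. The sketch below is derived only from the table above. The interface name, the `validateParams` helper, and its error messages are hypothetical, and the server's own validation may differ.

```typescript
// Parameter names and types as listed in the table; only `prompt` is required.
interface GenerateImageParams {
  prompt: string;
  inputImagePath?: string;
  fileName?: string;
  aspectRatio?: string;
  imageSize?: string;
  blendImages?: boolean;
  maintainCharacterConsistency?: boolean;
  useWorldKnowledge?: boolean;
  useGoogleSearch?: boolean;
  purpose?: string;
}

const ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"];
const IMAGE_SIZES = ["2K", "4K"];

// Hypothetical pre-flight check: returns a list of problems, empty when none are found.
function validateParams(params: GenerateImageParams): string[] {
  const errors: string[] = [];
  if (params.prompt.trim() === "") {
    errors.push("prompt is required");
  }
  if (params.aspectRatio !== undefined && ASPECT_RATIOS.indexOf(params.aspectRatio) === -1) {
    errors.push("unsupported aspectRatio: " + params.aspectRatio);
  }
  if (params.imageSize !== undefined && IMAGE_SIZES.indexOf(params.imageSize) === -1) {
    errors.push("unsupported imageSize: " + params.imageSize);
  }
  return errors;
}
```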

Response

{
  "type": "resource",
  "resource": {
    "uri": "file:///path/to/generated/image.png",
    "name": "image-filename.png",
    "mimeType": "image/png"
  },
  "metadata": {
    "model": "gemini-3-pro-image-preview",
    "processingTime": 5000,
    "timestamp": "2024-01-01T12:00:00.000Z"
  }
}
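
A client that wants the saved file's location can read it from `resource.uri`. The sketch below types the response after the example above and converts the `file://` URI to a path. `GenerateImageResponse` and `fileUriToPath` are hypothetical names, and the conversion assumes POSIX-style paths.

```typescript
// Response shape as shown in the example above.
interface GenerateImageResponse {
  type: "resource";
  resource: { uri: string; name: string; mimeType: string };
  metadata: { model: string; processingTime: number; timestamp: string };
}

// Hypothetical helper: strips the file:// scheme and decodes percent-escapes.
// Assumes POSIX-style paths; Windows drive-letter URIs would need extra handling.
function fileUriToPath(uri: string): string {
  const prefix = "file://";
  if (uri.indexOf(prefix) !== 0) {
    throw new Error("expected a file:// URI, got: " + uri);
  }
  return decodeURIComponent(uri.slice(prefix.length));
}

const sample: GenerateImageResponse = {
  type: "resource",
  resource: {
    uri: "file:///path/to/generated/image.png",
    name: "image-filename.png",
    mimeType: "image/png",
  },
  metadata: {
    model: "gemini-3-pro-image-preview",
    processingTime: 5000,
    timestamp: "2024-01-01T12:00:00.000Z",
  },
};

const savedPath = fileUriToPath(sample.resource.uri);
```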

🛠️ Troubleshooting

Common Issues

"API key not found"

Ensure GEMINI_API_KEY is set in your environment
Verify the API key is valid and has image generation permissions

"Input image file not found"

Use absolute file paths, not relative paths
Ensure the file exists and is accessible
Supported formats: PNG, JPEG, WebP (max 10MB)
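
The three checks above can be run before calling the tool. This is a minimal sketch, not the server's validation logic: `checkInputImage` is a hypothetical helper, "10MB" is read as 10 × 1024 × 1024 bytes, and both `.jpg` and `.jpeg` are treated as JPEG, all of which are assumptions.

```typescript
const SUPPORTED_EXTENSIONS = ["png", "jpg", "jpeg", "webp"];
const MAX_BYTES = 10 * 1024 * 1024; // assumption: "10MB" interpreted as 10 MiB

// Hypothetical pre-flight check: returns a problem description, or null if none is found.
function checkInputImage(path: string, sizeBytes: number): string | null {
  // Absolute means a leading "/" (POSIX) or a drive prefix such as "C:\" (Windows).
  const absolute = path.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(path);
  if (!absolute) {
    return "use an absolute path";
  }
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return "unsupported format: " + (ext || "(none)");
  }
  if (sizeBytes > MAX_BYTES) {
    return "file exceeds 10MB";
  }
  return null;
}
```

The caller supplies the file size (for example from a file-system stat call), which keeps the check itself free of I/O.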

"No image data found in Gemini API response"

Try rephrasing your prompt with more specific details
Ensure your prompt is appropriate for image generation
Check if your API key has sufficient quota

Performance Tips

Image generation: 30-60 seconds typical (includes prompt optimization)
Image editing: 15-45 seconds typical (includes context analysis)
High-resolution generation (2K/4K): May take longer but provides superior quality
Simple prompts work great - the AI automatically adds professional details
Complex prompts are preserved and further enhanced
Consider enabling useWorldKnowledge for historical or factual subjects
Use imageSize: "4K" when text clarity and fine details are critical

💰 Usage Notes

- This MCP server uses the paid Gemini API for both prompt optimization and image generation
  - Gemini 2.5 Flash for intelligent prompt enhancement (minimal token usage)
  - Gemini 3 Pro Image for actual image generation
- Check current pricing and rate limits at Google AI Studio
- Monitor your API usage to avoid unexpected charges
- The prompt optimization step adds minimal cost while significantly improving output quality

📄 License

MIT License - see LICENSE for details.

Need help? Open an issue or check the troubleshooting section above.
