Gemini MCP
Integrate the full power of Gemini 3 Pro into Claude Code
MCP Server Gemini
A Model Context Protocol (MCP) server for integrating Google's Gemini 3 models with Claude Code, enabling powerful collaboration between both AI systems. Now with a beautiful CLI!
What's New in v0.7.2
Beautiful CLI with Themes! Use Gemini directly from your terminal:
# Install globally
npm install -g @rlabs-inc/gemini-mcp
# Set your API key once
gcli config set api-key YOUR_KEY
# Generate images, videos, search, research, and more!
gcli image "a cat astronaut" --size 4K
gcli search "latest AI news"
gcli research "quantum computing applications" --wait
gcli speak "Hello world" --voice Puck
5 Beautiful Themes: terminal, neon, ocean, forest, minimal
CLI Commands:
- gcli query - Direct Gemini queries with thinking levels
- gcli search - Real-time web search with citations
- gcli research - Deep research agent
- gcli image - Generate images (up to 4K)
- gcli video - Generate videos with Veo
- gcli speak - Text-to-speech with 30 voices
- gcli tokens - Count tokens and estimate costs
- gcli config - Manage settings
MCP Registry Support: Now discoverable in the official MCP ecosystem!
Previous Versions
- v0.6.x: Deep Research, Token Counting, TTS, URL analysis, Context Caching
- v0.5.x: 30+ tools, YouTube analysis, Document analysis
- v0.4.x: Code execution, Google Search
- v0.3.x: Thinking levels, Structured output, 4K images
- v0.2.x: Image/Video generation with Veo
Features
| Feature | Description |
|---|---|
| Deep Research Agent | Autonomous multi-step research with web search and citations |
| Token Counting | Count tokens and estimate costs before API calls |
| Text-to-Speech | 30 unique voices, single speaker or two-speaker dialogues |
| URL Analysis | Analyze, compare, and extract data from web pages |
| Context Caching | Cache large documents for efficient repeated queries |
| YouTube Analysis | Analyze videos by URL with timestamp clipping |
| Document Analysis | PDFs, DOCX, spreadsheets with table extraction |
| 4K Image Generation | Generate images up to 4K with 10 aspect ratios |
| Multi-Turn Image Editing | Iteratively refine images through conversation |
| Video Generation | Create videos with Veo 2.0 (async with polling) |
| Code Execution | Gemini writes and runs Python code (pandas, numpy, matplotlib) |
| Google Search | Real-time web information with inline citations |
| Structured Output | JSON responses with schema validation |
| Data Extraction | Extract entities, facts, sentiment from text |
| Thinking Levels | Control reasoning depth (minimal/low/medium/high) |
| Direct Query | Send prompts to Gemini 3 Pro/Flash models |
| Brainstorming | Claude + Gemini collaborative problem-solving |
| Code Analysis | Analyze code for quality, security, performance |
| Summarization | Summarize content at different detail levels |
Quick Installation
MCP Server for Claude Code
# Using npm (Recommended)
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY npx -y @rlabs-inc/gemini-mcp
# Using bun
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY bunx @rlabs-inc/gemini-mcp
CLI (Global Install)
# Install globally
npm install -g @rlabs-inc/gemini-mcp
# Set your API key once (stored securely)
gcli config set api-key YOUR_KEY
# Now use any command!
gcli search "latest news"
gcli image "sunset over mountains" --ratio 16:9
Get your API key: Visit Google AI Studio - it's free and takes seconds!
Installation Options
# With verbose logging
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY VERBOSE=true bunx -y @rlabs-inc/gemini-mcp
# With custom output directory for generated images/videos
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY GEMINI_OUTPUT_DIR=/path/to/output bunx -y @rlabs-inc/gemini-mcp
Available Tools
gemini-query
Direct queries to Gemini with thinking level control:
prompt: "Explain quantum entanglement"
model: "pro" or "flash"
thinkingLevel: "low" | "medium" | "high" (optional)
- low: Fast responses, minimal reasoning
- medium: Balanced (Flash only)
- high: Deep reasoning for complex tasks (default)
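The parameters above can be modeled as a small typed object. The following TypeScript sketch is illustrative only: `GeminiQueryArgs` and `buildQueryArgs` are hypothetical names that are not part of the package, and rejecting `medium` for the `pro` model is an assumption drawn from the "Flash only" note.

```typescript
// Hypothetical types mirroring the documented gemini-query parameters.
type Model = "pro" | "flash";
type ThinkingLevel = "low" | "medium" | "high";

interface GeminiQueryArgs {
  prompt: string;
  model: Model;
  thinkingLevel: ThinkingLevel;
}

// Build an argument object, applying the documented default ("high")
// and the assumption that "medium" is only valid for the Flash model.
function buildQueryArgs(
  prompt: string,
  model: Model,
  thinkingLevel: ThinkingLevel = "high",
): GeminiQueryArgs {
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  if (thinkingLevel === "medium" && model !== "flash") {
    throw new Error('thinkingLevel "medium" is documented as Flash only');
  }
  return { prompt, model, thinkingLevel };
}
```

For example, `buildQueryArgs("Explain quantum entanglement", "pro")` yields an object with `thinkingLevel` set to `"high"`, while passing `"medium"` together with `"pro"` throws.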
gemini-generate-image
Generate images with Nano Banana Pro (Claude can SEE them!):
prompt: "a futuristic city at sunset"
style: "cyberpunk" (optional)
aspectRatio: "16:9" (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
imageSize: "2K" (1K, 2K, 4K)
useGoogleSearch: false (ground in real-world info)
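Because `aspectRatio` and `imageSize` only accept the listed values, it can help to check them before calling the tool. The sketch below is a hypothetical client-side helper (`buildImageArgs` is not part of the package); the allowed values are copied from the list above, and defaulting `useGoogleSearch` to `false` is an assumption based on the example.

```typescript
// Allowed values copied from the parameter list above.
const ASPECT_RATIOS = [
  "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
] as const;
const IMAGE_SIZES = ["1K", "2K", "4K"] as const;

type AspectRatio = (typeof ASPECT_RATIOS)[number];
type ImageSize = (typeof IMAGE_SIZES)[number];

interface ImageArgs {
  prompt: string;
  style?: string;
  aspectRatio: AspectRatio;
  imageSize: ImageSize;
  useGoogleSearch: boolean;
}

function isAspectRatio(value: string): value is AspectRatio {
  return (ASPECT_RATIOS as readonly string[]).indexOf(value) !== -1;
}

function isImageSize(value: string): value is ImageSize {
  return (IMAGE_SIZES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: validates the two enumerated fields and
// returns an argument object for gemini-generate-image.
function buildImageArgs(
  prompt: string,
  aspectRatio: string,
  imageSize: string,
  style?: string,
): ImageArgs {
  if (!isAspectRatio(aspectRatio)) {
    throw new Error(`unsupported aspectRatio: ${aspectRatio}`);
  }
  if (!isImageSize(imageSize)) {
    throw new Error(`unsupported imageSize: ${imageSize}`);
  }
  const base: ImageArgs = { prompt, aspectRatio, imageSize, useGoogleSearch: false };
  // Only include the optional style when one was given.
  return style === undefined ? base : { ...base, style };
}
```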
gemini-start-image-edit
Start a multi-turn image editing session:
prompt: "a cozy cabin in the mountains"
aspectRatio: "16:9"
imageSize: "2K"
useGoogleSearch: false
Returns a session ID for iterative editing.
gemini-continue-image-edit
Continue refining an image:
sessionId: "edit-123456789"
prompt: "add snow on the roof and make it nighttime"
gemini-end-image-edit
Close an editing session:
sessionId: "edit-123456789"
gemini-list-image-sessions
List all active editing sessions.
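The four session tools form a simple lifecycle: start, continue any number of times, then end. As a rough illustration, the TypeScript sketch below lists the follow-up calls for a session once `gemini-start-image-edit` has returned an ID. `ToolCall` and `followUpCalls` are hypothetical names, and the sketch only builds plain data; it does not talk to the server.

```typescript
// Hypothetical description of a single MCP tool invocation.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Given a session ID returned by gemini-start-image-edit, produce one
// gemini-continue-image-edit call per refinement, followed by a final
// gemini-end-image-edit call that closes the session.
function followUpCalls(sessionId: string, refinements: string[]): ToolCall[] {
  const calls: ToolCall[] = refinements.map((prompt) => ({
    tool: "gemini-continue-image-edit",
    args: { sessionId, prompt },
  }));
  calls.push({ tool: "gemini-end-image-edit", args: { sessionId } });
  return calls;
}
```

Calling `followUpCalls("edit-123456789", ["add snow on the roof and make it nighttime"])` returns two entries: the continue call with that prompt, then the end call.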
gemini-generate-video
Generate videos using Veo:
prompt: "a cat playing piano"
aspectRatio: "16:9" (optional)
negativePrompt: "blurry, text" (optional)
Video generation is async (takes 1-5 minutes). Use gemini-check-video to poll.
gemini-check-video
Check video generation status and download when complete:
operationId: "operations/xxx-xxx-xxx"
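Since generation is asynchronous, a caller typically polls `gemini-check-video` until the operation completes. The following is a minimal polling sketch under stated assumptions: `VideoStatus` is a hypothetical response shape (the actual response format is not documented here), `checkVideo` stands in for whatever function invokes the tool, and the 10-second interval and 5-minute timeout are illustrative values chosen to match the "1-5 minutes" estimate.

```typescript
// Hypothetical status shape; the real tool's response may differ.
interface VideoStatus {
  done: boolean;
  filePath?: string;
}

// Stand-in for a function that invokes gemini-check-video.
type CheckVideo = (operationId: string) => Promise<VideoStatus>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the operation reports done, or give up once another
// wait would run past the timeout.
async function waitForVideo(
  checkVideo: CheckVideo,
  operationId: string,
  intervalMs: number = 10000,
  timeoutMs: number = 5 * 60000,
): Promise<VideoStatus> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await checkVideo(operationId);
    if (status.done) {
      return status;
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`video ${operationId} not ready after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

Passing the check function in as a parameter keeps the loop independent of any particular MCP client and makes it easy to exercise with a fake.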
gemini-analyze-code
Analyze code for issues:
code: "function foo() { ... }"
language: "typescript" (optional)
focus: "quality" | "security" | "performance" | "bugs" | "general"
gemini-analyze-text
Analyze text content:
text: "Your text here..."
type: "sentiment" | "summary" | "entities" | "key-points" | "general"
gemini-brainstorm
Collaborative brainstorming:
prompt: "How could we implement real-time collaboration?"
claudeThoughts: "I think we should use WebSockets..."
maxRounds: 3 (optional)
gemini-summarize
Summarize content:
content: "Long text to summarize..."
length: "brief" | "moderate" | "detailed"
format: "paragraph" | "bullet-points" | "outline"
gemini-run-code
Let Gemini write and execute Python code:
prompt: "Calculate the first 50 prime numbers and plot them"
data: "optional CSV data to analyze" (optional)
Supports libraries: numpy, pandas, matplotlib, scipy, scikit-learn, tensorflow, and more. Generated charts are saved to the output directory and returned as images.
gemini-search
Real-time web search with citations:
query: "What happened in tech news this week?"
returnCitations: true (default)
Returns grounded responses with inline citations and source URLs.
gemini-structured
Get JSON responses matching a schema:
prompt: "Extract the meeting details from this email..."
schema: '{"type":"object","properties":{"date":{"type":"string"},"attendees":{"type":"array"}}}'
useGoogleSearch: false (optional)
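Hand-writing the schema as a string is error-prone, so one option is to define it as an object and serialize it. The sketch below assumes, based on the example above, that the tool accepts the schema as a JSON string. `looksLikeMeeting` is a hypothetical, deliberately minimal check of the two declared properties, not a full JSON Schema validator.

```typescript
// The schema from the example above, written as an object.
const meetingSchema = {
  type: "object",
  properties: {
    date: { type: "string" },
    attendees: { type: "array" },
  },
};

// Assumption: the schema parameter is passed as a JSON string.
const structuredArgs = {
  prompt: "Extract the meeting details from this email...",
  schema: JSON.stringify(meetingSchema),
  useGoogleSearch: false,
};

// Minimal shape check for a parsed response: date must be a string
// and attendees must be an array.
function looksLikeMeeting(
  value: unknown,
): value is { date: string; attendees: unknown[] } {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return typeof record.date === "string" && Array.isArray(record.attendees);
}
```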
gemini-extract
Convenience tool for common extraction patterns:
text: "Your text to analyze..."
extractType: "entities" | "facts" | "summary" | "keywords" | "sentiment" | "custom"
customFields: "name, date, amount" (for custom extraction)
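The `customFields` parameter is a comma-separated string, as in the example above. A small helper can build it from an array. This is a sketch only: `buildExtractArgs` is a hypothetical name, and requiring at least one field for `custom` extraction is an assumption rather than documented behavior.

```typescript
type ExtractType =
  | "entities" | "facts" | "summary" | "keywords" | "sentiment" | "custom";

interface ExtractArgs {
  text: string;
  extractType: ExtractType;
  customFields?: string;
}

// Hypothetical helper: joins field names into the comma-separated
// form shown above and only sets customFields for "custom".
function buildExtractArgs(
  text: string,
  extractType: ExtractType,
  fields: string[] = [],
): ExtractArgs {
  if (extractType !== "custom") {
    return { text, extractType };
  }
  if (fields.length === 0) {
    throw new Error('extractType "custom" needs at least one field');
  }
  return { text, extractType, customFields: fields.join(", ") };
}
```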
gemini-youtube
Analyze YouTube videos directly:
url: "https://www.youtube.com/watch?v=..."
question: "What happens at 2:30?"
startTime: "1m30s" (optional, for clipping)
endTime: "5m00s" (optional, for clipping)
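When clipping, `startTime` should come before `endTime`. The sketch below converts offsets such as "1m30s" into seconds so the two can be compared. `parseOffset` and `assertValidClip` are hypothetical helpers, and the accepted grammar (optional hours, minutes, and seconds, in that order) is an assumption generalized from the two examples above.

```typescript
// Convert an offset like "1m30s" or "5m00s" to seconds.
// Assumed grammar: optional <n>h, <n>m, <n>s parts, in that order.
function parseOffset(offset: string): number {
  const match = /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/.exec(offset);
  if (offset === "" || match === null) {
    throw new Error(`unrecognized offset: ${offset}`);
  }
  // Unmatched groups are undefined, so each part defaults to "0".
  const [, hours = "0", minutes = "0", seconds = "0"] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds);
}

// Throw if the clip window is empty or reversed.
function assertValidClip(startTime: string, endTime: string): void {
  if (parseOffset(startTime) >= parseOffset(endTime)) {
    throw new Error(`startTime ${startTime} must be before endTime ${endTime}`);
  }
}
```

With these definitions, `parseOffset("1m30s")` is 90 and `parseOffset("5m00s")` is 300, so `assertValidClip("1m30s", "5m00s")` passes.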
gemini-youtube-summary
Quick video summarization:
url: "https://www.youtube.com/watch?v=..."
style: "brief" | "detailed" | "bullet-points" | "chapters"
gemini-analyze-document
Analyze PDFs and documents:
filePath: "/path/to/document.pdf"
question: "Summarize the key findings"
mediaResolution: "low" | "medium" | "high"
gemini-summarize-pdf
Quick PDF summarization:
filePath: "/path/to/document.pdf"
style: "brief" | "detailed" | "outline" | "key-points"
gemini-extract-tables
Extract tables from documents:
filePath: "/path/to/document.pdf"
outputFormat: "markdown" | "csv" | "json"
Workflow: Claude + Gemini
The killer combination for development:
| Claude | Gemini |
|---|---|
| Complex logic | Frontend/UI |
| Architecture | Visual components |
| Backend code | Image generation |
| Integration | React/CSS styling |
| Reasoning | Creative generation |
Example workflow:
- Ask Claude to design the backend API
- Use gemini-generate-image for UI mockups
- Ask Gemini to generate React components via gemini-query
- Use multi-turn editing to refine visuals
- Let Claude wire everything together
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| GEMINI_API_KEY | Yes | - | Your Google Gemini API key |
| GEMINI_OUTPUT_DIR | No | ./gemini-output | Where to save generated files |
| GEMINI_MODEL | No | - | Override model for init test |
| GEMINI_PRO_MODEL | No | gemini-3-pro-preview | Pro model (Gemini 3) |
| GEMINI_FLASH_MODEL | No | gemini-3-flash-preview | Flash model (Gemini 3) |
| GEMINI_IMAGE_MODEL | No | gemini-3-pro-image-preview | Image model (Nano Banana Pro) |
| GEMINI_VIDEO_MODEL | No | veo-2.0-generate-001 | Video model |
| VERBOSE | No | false | Enable verbose logging |
| QUIET | No | false | Minimize logging |
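To make the defaults in the table concrete, here is a sketch of how a configuration object could be resolved from the environment. It is not the server's actual implementation: `GeminiConfig` and `resolveConfig` are hypothetical names, and treating only the literal string "true" as enabling `VERBOSE` or `QUIET` is an assumption.

```typescript
interface GeminiConfig {
  apiKey: string;
  outputDir: string;
  initModel?: string;
  proModel: string;
  flashModel: string;
  imageModel: string;
  videoModel: string;
  verbose: boolean;
  quiet: boolean;
}

// Resolve settings from an environment-like map, falling back to
// the defaults listed in the table above.
function resolveConfig(env: Record<string, string | undefined>): GeminiConfig {
  const apiKey = env.GEMINI_API_KEY;
  if (!apiKey) {
    throw new Error("GEMINI_API_KEY is required");
  }
  return {
    apiKey,
    outputDir: env.GEMINI_OUTPUT_DIR ?? "./gemini-output",
    initModel: env.GEMINI_MODEL, // no default in the table
    proModel: env.GEMINI_PRO_MODEL ?? "gemini-3-pro-preview",
    flashModel: env.GEMINI_FLASH_MODEL ?? "gemini-3-flash-preview",
    imageModel: env.GEMINI_IMAGE_MODEL ?? "gemini-3-pro-image-preview",
    videoModel: env.GEMINI_VIDEO_MODEL ?? "veo-2.0-generate-001",
    // Assumption: only the literal string "true" enables these flags.
    verbose: env.VERBOSE === "true",
    quiet: env.QUIET === "true",
  };
}
```

In a Node or Bun process this would be called as `resolveConfig(process.env)`.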
Manual Installation
Global Install
# Using npm
npm install -g @rlabs-inc/gemini-mcp
# Using bun
bun install -g @rlabs-inc/gemini-mcp
Claude Code Configuration
{
  "gemini": {
    "command": "npx",
    "args": ["-y", "@rlabs-inc/gemini-mcp"],
    "env": {
      "GEMINI_API_KEY": "your-api-key",
      "GEMINI_OUTPUT_DIR": "/path/to/save/files"
    }
  }
}
Troubleshooting
Rate Limits (429 Errors)
If you're hitting rate limits on the free tier:
- Set GEMINI_MODEL=gemini-3-flash-preview to use Flash for init (higher limits)
- Or upgrade to a paid plan
Connection Issues
- Verify your API key at Google AI Studio
- Check server status: claude mcp list
- Try with verbose logging: VERBOSE=true
Image/Video Issues
- Ensure your API key has access to image/video generation
- Check output directory permissions
- Files save to GEMINI_OUTPUT_DIR (default: ./gemini-output)
- For 4K images, generation takes longer
Development
git clone https://github.com/rlabs-inc/gemini-mcp.git
cd gemini-mcp
bun install
bun run build
bun run dev -- --verbose
Scripts
| Command | Description |
|---|---|
| bun run build | Build for production |
| bun run dev | Development mode with watch |
| bun run typecheck | Type check without emitting |
| bun run format | Format with Prettier |
| bun run lint | Lint with ESLint |
License
MIT License
Made with Claude + Gemini working together