ask-gemini-mcp
MCP server that enables AI assistants to interact with Google Gemini CLI
Ask LLM
| Package | Type | Version | Downloads |
|---|---|---|---|
| ask-gemini-mcp | MCP Server | | |
| ask-codex-mcp | MCP Server | | |
| ask-ollama-mcp | MCP Server | | |
| ask-llm-mcp | MCP Server | | |
| @ask-llm/plugin | Claude Code Plugin | /plugin install | |
MCP servers + Claude Code plugin for AI-to-AI collaboration
MCP servers that bridge your AI client with multiple LLM providers for AI-to-AI collaboration. Works with Claude Code, Claude Desktop, Cursor, Warp, Copilot, and 40+ other MCP clients. Leverage Gemini's 1M+ token context, Codex's GPT-5.5, or local Ollama models — all via standard MCP.
Why?
- Get a second opinion — Ask another AI to review your coding approach before committing
- Debate plans — Send architecture proposals for critique and alternative suggestions
- Review changes — Have multiple AIs analyze diffs to catch issues your primary AI might miss
- Massive context — Gemini reads entire codebases (1M+ tokens) that would overflow other models
- Local & private — Use Ollama for reviews where no data leaves your machine
Quick Start
Claude Code
# All-in-one — auto-detects installed providers
claude mcp add --scope user ask-llm -- npx -y ask-llm-mcp
Or install providers individually
claude mcp add --scope user gemini -- npx -y ask-gemini-mcp
claude mcp add --scope user codex -- npx -y ask-codex-mcp
claude mcp add --scope user ollama -- npx -y ask-ollama-mcp
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "ask-llm": {
      "command": "npx",
      "args": ["-y", "ask-llm-mcp"]
    }
  }
}
Or install providers individually
{
  "mcpServers": {
    "gemini": {
      "command": "npx",
      "args": ["-y", "ask-gemini-mcp"]
    },
    "codex": {
      "command": "npx",
      "args": ["-y", "ask-codex-mcp"]
    },
    "ollama": {
      "command": "npx",
      "args": ["-y", "ask-ollama-mcp"]
    }
  }
}
Cursor, Codex CLI, OpenCode, and other clients
Cursor (.cursor/mcp.json):
{
  "mcpServers": {
    "ask-llm": { "command": "npx", "args": ["-y", "ask-llm-mcp"] }
  }
}
Codex CLI (~/.codex/config.toml):
[mcp_servers.ask-llm]
command = "npx"
args = ["-y", "ask-llm-mcp"]
Any MCP Client (STDIO transport):
{ "command": "npx", "args": ["-y", "ask-llm-mcp"] }
Replace ask-llm-mcp with ask-gemini-mcp, ask-codex-mcp, or ask-ollama-mcp for a single provider.
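The client snippets above all share one shape, so the config can be generated instead of hand-edited. The following is a minimal sketch and not part of the packages: `buildMcpServers` is a hypothetical helper, and the label rule (strip `ask-` and `-mcp`) is inferred from the examples above rather than required by any client.

```typescript
// Hypothetical helper (not shipped with ask-llm): builds the "mcpServers"
// object used by Claude Desktop and Cursor from a list of package names.
interface StdioServer {
  command: string;
  args: string[];
}

const KNOWN_PACKAGES = [
  "ask-llm-mcp",
  "ask-gemini-mcp",
  "ask-codex-mcp",
  "ask-ollama-mcp",
] as const;

type PackageName = (typeof KNOWN_PACKAGES)[number];

function buildMcpServers(
  packages: PackageName[],
): { mcpServers: Record<string, StdioServer> } {
  const mcpServers: Record<string, StdioServer> = {};
  for (const pkg of packages) {
    // Assumed label rule: "ask-gemini-mcp" -> "gemini"; the orchestrator
    // keeps the label "ask-llm", matching the examples in this README.
    const label = pkg === "ask-llm-mcp" ? "ask-llm" : pkg.slice(4, -4);
    mcpServers[label] = { command: "npx", args: ["-y", pkg] };
  }
  return { mcpServers };
}

// Prints JSON that can be pasted into claude_desktop_config.json.
console.log(
  JSON.stringify(buildMcpServers(["ask-gemini-mcp", "ask-ollama-mcp"]), null, 2),
);
```

Passing only `"ask-llm-mcp"` reproduces the all-in-one config; passing individual packages reproduces the per-provider one.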
Claude Code Plugin
The Ask LLM plugin adds multi-provider code review, brainstorming, and automated hooks directly into Claude Code:
/plugin marketplace add Lykhoyda/ask-llm
/plugin install ask-llm@ask-llm-plugins
What You Get
| Feature | Description |
|---|---|
| /multi-review | Parallel Gemini + Codex review with 4-phase validation pipeline and consensus highlighting |
| /gemini-review | Gemini-only review with confidence filtering |
| /codex-review | Codex-only review with confidence filtering |
| /ollama-review | Local review; no data leaves your machine |
| /brainstorm | Multi-LLM brainstorm: Claude Opus researches the topic against real files in parallel with external providers (Gemini/Codex/Ollama), then synthesizes all findings with verified findings weighted higher |
| /compare | Side-by-side raw responses from multiple providers, no synthesis; for when you want to see how each provider phrases the same answer |
| codex-pair hook | Opt-in continuous review: runs Codex against every Edit/Write/MultiEdit when a .codex-pair/context.md marker is present in the project |
The review agents use a 4-phase pipeline inspired by Anthropic's code-review plugin: context gathering, prompt construction with explicit false-positive exclusions, synthesis, and source-level validation of each finding.
See the plugin docs for details.
Prerequisites
- Node.js v20.0.0 or higher (LTS)
- At least one provider:
  - Gemini CLI: npm install -g @google/gemini-cli && gemini login
  - Codex CLI: installed and authenticated
  - Ollama: running locally with a model pulled (ollama pull qwen2.5-coder:7b)
MCP Tools
| Tool | Package | Purpose |
|---|---|---|
| ask-gemini | ask-gemini-mcp | Send prompts to Gemini CLI with @ file syntax. 1M+ token context. Live progressive output via stream-json |
| ask-gemini-edit | ask-gemini-mcp | Get structured OLD/NEW code edit blocks from Gemini |
| fetch-chunk | ask-gemini-mcp | Retrieve chunks from cached large responses |
| ask-codex | ask-codex-mcp | Send prompts to Codex CLI. GPT-5.5 with mini fallback. Native session resume via sessionId |
| ask-ollama | ask-ollama-mcp | Send prompts to local Ollama. Fully private, zero cost. Server-side conversation replay via sessionId |
| ask-llm | ask-llm-mcp | Unified orchestrator: pick provider per call. Fan out to all installed providers |
| multi-llm | ask-llm-mcp | Dispatch the same prompt to multiple providers in parallel; returns per-provider responses + usage in one call |
| get-usage-stats | all | Per-session token totals, fallback counts, breakdowns by provider/model; all in-memory, no persistence |
| diagnose | ask-llm-mcp | Self-diagnosis: Node version, PATH resolution, provider CLI presence + versions. Read-only |
| ping | all | Connection test: verify MCP setup |
All ask-* tools accept an optional sessionId parameter for multi-turn conversations and now return a structured AskResponse (provider, response, model, sessionId, usage) via MCP outputSchema alongside the human-readable text. The orchestrator (ask-llm-mcp) also exposes usage://current-session as an MCP Resource for live JSON snapshots.
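As a rough illustration of consuming the structured result, here is a sketch of an `AskResponse` type with a runtime guard. The five field names come from the paragraph above; the field types, which fields are optional, and the `prompt` argument name are assumptions made for this example, not the published outputSchema.

```typescript
// Assumed shape: field names are from this README, types are guesses.
interface AskResponse {
  provider: string;
  response: string;
  model: string;
  sessionId?: string;
  usage?: Record<string, unknown>;
}

// Runtime check before trusting structured content from a tool call.
function isAskResponse(value: unknown): value is AskResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["provider"] === "string" &&
    typeof v["response"] === "string" &&
    typeof v["model"] === "string" &&
    (v["sessionId"] === undefined || typeof v["sessionId"] === "string") &&
    (v["usage"] === undefined ||
      (typeof v["usage"] === "object" && v["usage"] !== null))
  );
}

// Carries the sessionId from one reply into the arguments of the next call
// so the provider continues the same conversation. "prompt" is a
// hypothetical argument name used only for this sketch.
function nextCallArgs(
  prompt: string,
  previous?: AskResponse,
): { prompt: string; sessionId?: string } {
  return previous?.sessionId
    ? { prompt, sessionId: previous.sessionId }
    : { prompt };
}
```

The first call omits `sessionId`; later calls pass the previous reply so the conversation continues on the same session.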
Usage Examples
ask gemini to review the changes in @src/auth.ts for security issues
ask codex to suggest a better algorithm for @src/sort.ts
ask ollama to explain @src/config.ts (runs locally, no data sent anywhere)
use gemini to summarize @. the current directory
use multi-llm to compare what gemini and codex think about this approach
CLI Subcommands
The orchestrator binary (ask-llm-mcp) supports two CLI modes alongside the default MCP server:
# Interactive multi-provider REPL — switch providers, persist sessions, see usage live
npx ask-llm-mcp repl
# Diagnose your setup — Node version, PATH, provider CLI versions, env vars
npx ask-llm-mcp doctor # human-readable
npx ask-llm-mcp doctor --json # machine-readable, exit 1 on error
The REPL keeps a separate session per provider (/provider gemini, /provider codex, /new, /sessions, /usage) and inherits all of the executor behavior (quota fallback, stream-json output for Gemini, native session resume).
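Because `doctor --json` exits with code 1 on error, a script can gate on the exit code alone. The sketch below assumes nothing about the JSON report's schema, which is not documented here, and treats it as opaque; `summarizeDoctor` and `runDoctor` are hypothetical names.

```typescript
import { spawnSync } from "node:child_process";

interface DoctorSummary {
  ok: boolean; // true only when the process exited with code 0
  report: unknown; // parsed JSON, or undefined if stdout was not valid JSON
}

// Pure helper: interprets an exit code and the captured stdout.
function summarizeDoctor(exitCode: number | null, stdout: string): DoctorSummary {
  let report: unknown;
  try {
    report = JSON.parse(stdout);
  } catch {
    report = undefined; // the exit code still tells us pass or fail
  }
  return { ok: exitCode === 0, report };
}

// Runs the documented command and summarizes the result.
// On Windows, spawning npx may additionally need { shell: true }.
function runDoctor(): DoctorSummary {
  const result = spawnSync("npx", ["ask-llm-mcp", "doctor", "--json"], {
    encoding: "utf8",
  });
  return summarizeDoctor(result.status, result.stdout ?? "");
}
```

In a CI step, calling `runDoctor()` and failing the job when `ok` is false is enough; the parsed report can be logged for debugging.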
Models
| Provider | Default | Fallback |
|---|---|---|
| Gemini | gemini-3.1-pro-preview | gemini-3-flash-preview (on quota) |
| Codex | gpt-5.5 | gpt-5.5-mini (on quota) |
| Ollama | qwen2.5-coder:7b | qwen2.5-coder:1.5b (if not found) |
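Each provider falls back to its lighter model automatically when the condition in the Fallback column occurs: quota exhaustion for Gemini and Codex, a missing model for Ollama.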
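The table can be read as a small lookup. This sketch is not the package's executor code: the `Failure` categories and `fallbackFor` are hypothetical, and it assumes a fallback is attempted only for the condition named in the table.

```typescript
type Provider = "gemini" | "codex" | "ollama";
type Failure = "quota" | "not-found" | "other";

interface ModelPlan {
  primary: string;
  fallback: string;
  fallbackOn: Failure; // the condition listed in the Fallback column
}

// Model names copied from the table above.
const MODEL_PLANS: Record<Provider, ModelPlan> = {
  gemini: {
    primary: "gemini-3.1-pro-preview",
    fallback: "gemini-3-flash-preview",
    fallbackOn: "quota",
  },
  codex: { primary: "gpt-5.5", fallback: "gpt-5.5-mini", fallbackOn: "quota" },
  ollama: {
    primary: "qwen2.5-coder:7b",
    fallback: "qwen2.5-coder:1.5b",
    fallbackOn: "not-found",
  },
};

// Returns the model to retry with, or undefined when the failure is not one
// the table maps to a fallback (the caller would then surface the error).
function fallbackFor(provider: Provider, failure: Failure): string | undefined {
  const plan = MODEL_PLANS[provider];
  return failure === plan.fallbackOn ? plan.fallback : undefined;
}

console.log(fallbackFor("codex", "quota")); // prints "gpt-5.5-mini"
```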
Documentation
- Docs site: lykhoyda.github.io/ask-llm
- AI-readable: llms.txt | llms-full.txt
Contributing
Contributions are welcome! See open issues for things to work on.
License
MIT License. See LICENSE for details.
Disclaimer: This is an unofficial, third-party tool and is not affiliated with, endorsed, or sponsored by Google or OpenAI.