Voice Mode
A server for natural voice conversations with AI assistants like Claude and ChatGPT.
VoiceMode
Natural voice conversations with Claude Code (and other MCP capable agents)
[!WARNING] Known Issue (2026-04-13): Claude Code 2.1.105+ kills VoiceMode's MCP server when you press ESC to cancel a voice conversation. Workaround: Pin to Claude Code 2.1.104. See discussion #349 for details.
VoiceMode enables natural voice conversations with Claude Code. Voice isn't about replacing typing - it's about being available when typing isn't.
Perfect for:
- Walking to your next meeting
- Cooking while debugging
- Giving your eyes a break after hours of screen time
- Holding a coffee (or a dog)
- Any moment when your hands or eyes are busy
Quick Start
Requirements: Computer with microphone and speakers
Option 1: Claude Code Plugin (Recommended)
The fastest way for Claude Code users to get started:
# Add the VoiceMode marketplace
claude plugin marketplace add mbailey/voicemode
# Install VoiceMode plugin
claude plugin install voicemode@voicemode
# Install dependencies (CLI, local voice services)
/voicemode:install
# Start talking!
/voicemode:converse
Option 2: Python installer package
Installs dependencies and the VoiceMode Python package.
# Install UV package manager (if needed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Run the installer (sets up dependencies and local voice services)
uvx voice-mode-install
# Add to Claude Code
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
# Optional: Add an OpenAI API key as a fallback for when local services are unavailable
export OPENAI_API_KEY=your-openai-key
# Start a conversation
claude converse
For manual setup, see the Getting Started Guide.
Features
- Natural conversations - speak naturally, hear responses immediately
- Works offline - optional local voice services (Whisper STT, Kokoro TTS)
- Low latency - fast enough to feel like a real conversation
- Smart silence detection - stops recording when you stop speaking
- Privacy options - run entirely locally or use cloud services
Compatibility
Platforms: Linux, macOS, Windows (WSL), NixOS
Python: 3.10-3.14
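To confirm that an interpreter falls inside the supported range before installing, a small check like the following can help. It is only a sketch: `is_supported` is a hypothetical helper, not part of VoiceMode, and the bounds are copied from the range stated above.

```python
import sys

# Supported range as stated in the Compatibility section: Python 3.10 through 3.14.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 14)


def is_supported(version=None):
    """Return True if the (major, minor) version is within the documented range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "supported" if is_supported() else "outside the documented range"
    print(f"Python {current}: {status}")
```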
Configuration
VoiceMode works out of the box. For customization:
# Set OpenAI API key (if using cloud services)
export OPENAI_API_KEY="your-key"
# Or configure via file
voicemode config edit
See the Configuration Guide for all options.
Permissions Setup (Optional)
To use VoiceMode without permission prompts, add the following to ~/.claude/settings.json:
{
"permissions": {
"allow": [
"mcp__voicemode__converse",
"mcp__voicemode__service"
]
}
}
See the Permissions Guide for more options.
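If ~/.claude/settings.json already contains other settings, pasting the snippet over the file would discard them. The sketch below merges the two tool names into whatever is already there. It is not part of VoiceMode: `allow_tools` and `update_settings_file` are hypothetical helper names, and the code assumes the file is plain JSON with the `permissions.allow` layout shown above.

```python
import json
from pathlib import Path

# Tool names taken from the permissions snippet above.
VOICEMODE_TOOLS = ["mcp__voicemode__converse", "mcp__voicemode__service"]


def allow_tools(settings, tools=VOICEMODE_TOOLS):
    """Return a copy of settings with tools appended to permissions.allow."""
    merged = dict(settings)
    permissions = dict(merged.get("permissions", {}))
    allow = list(permissions.get("allow", []))
    for tool in tools:
        if tool not in allow:  # skip entries that are already present
            allow.append(tool)
    permissions["allow"] = allow
    merged["permissions"] = permissions
    return merged


def update_settings_file(path):
    """Merge the VoiceMode tools into the JSON settings file at path."""
    path = Path(path).expanduser()
    text = path.read_text().strip() if path.exists() else ""
    settings = json.loads(text) if text else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(allow_tools(settings), indent=2) + "\n")
```

Calling `update_settings_file("~/.claude/settings.json")` applies the merge. Running it twice leaves the file unchanged the second time, because entries already in the list are skipped. Back up the file first if it contains settings you care about.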
Local Voice Services
For privacy or offline use, install local speech services:
- Whisper.cpp - Local speech-to-text
- Kokoro - Local text-to-speech with multiple voices
Both expose OpenAI-compatible APIs, so VoiceMode can switch seamlessly between local and cloud services.
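Because the local services and OpenAI share the same API shape, choosing between them can come down to trying base URLs in order of preference. The sketch below illustrates that idea only. It does not show how VoiceMode selects a provider internally: `is_reachable` and `first_available` are hypothetical names, and probing a `/models` path is an assumption about what a given endpoint serves.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def is_reachable(base_url, timeout=1.0):
    """Return True if an HTTP server answers at base_url (any status code)."""
    try:
        with urlopen(base_url.rstrip("/") + "/models", timeout=timeout):
            return True
    except HTTPError:
        # The server responded, even if with 401 or 404, so it is reachable.
        return True
    except (OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False


def first_available(base_urls, probe=is_reachable):
    """Return the first base URL for which probe() succeeds, or None."""
    for url in base_urls:
        if probe(url):
            return url
    return None
```

For example, `first_available(["http://127.0.0.1:8880/v1", "https://api.openai.com/v1"])` prefers a local service and falls back to the cloud. The local address and port here are placeholders, so substitute whatever your local services actually listen on.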
Installation Details
System Dependencies by Platform
Ubuntu/Debian
sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev
WSL2 users: The pulseaudio packages above are required for microphone access.
Fedora/RHEL
sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel
macOS
brew install ffmpeg node portaudio
NixOS
# Use development shell
nix develop github:mbailey/voicemode
# Or install system-wide
nix profile install github:mbailey/voicemode
Alternative Installation Methods
From source
git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .
NixOS system-wide
# In /etc/nixos/configuration.nix
environment.systemPackages = [
(builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
Troubleshooting
| Problem | Solution |
|---|---|
| No microphone access | Check terminal/app permissions. WSL2 needs pulseaudio packages. |
| UV not found | Run curl -LsSf https://astral.sh/uv/install.sh \| sh |
| OpenAI API error | Verify OPENAI_API_KEY is set correctly |
| No audio output | Check system audio settings and available devices |
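Some of the checks in the table can be automated. The following is a minimal sketch, not a VoiceMode command: `preflight` is a hypothetical helper that looks for `uv` and `ffmpeg` on the PATH and checks whether OPENAI_API_KEY is set. It does not test microphone or speaker access.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means none found."""
    env = os.environ if env is None else env
    problems = []
    if which("uv") is None:
        problems.append("uv not found: run the installer from https://astral.sh/uv/install.sh")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found: install it with your package manager")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (only needed for cloud services)")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["No problems detected."]:
        print(line)
```

The `env` and `which` parameters exist so the check can be exercised without touching the real environment.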
Save Audio for Debugging
export VOICEMODE_SAVE_AUDIO=true
# Files saved to ~/.voicemode/audio/YYYY/MM/
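Once saving is enabled, the newest recordings are usually the ones worth inspecting. The sketch below locates them using the directory layout described above. `audio_dir_for` and `recent_recordings` are hypothetical helper names, and the code assumes the month directory is zero-padded, as the YYYY/MM pattern suggests.

```python
from datetime import date
from pathlib import Path

# Root directory from the comment above: ~/.voicemode/audio/YYYY/MM/
AUDIO_ROOT = Path.home() / ".voicemode" / "audio"


def audio_dir_for(day, root=AUDIO_ROOT):
    """Return the YYYY/MM directory under root for the given date."""
    return Path(root) / f"{day.year:04d}" / f"{day.month:02d}"


def recent_recordings(day=None, root=AUDIO_ROOT, limit=5):
    """Return up to limit files from that month's directory, newest first."""
    folder = audio_dir_for(day or date.today(), root)
    if not folder.is_dir():
        return []
    files = [p for p in folder.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

`recent_recordings()` returns an empty list when nothing has been saved for the current month, which is a quick way to confirm whether VOICEMODE_SAVE_AUDIO took effect.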
Documentation
- Getting Started - Full setup guide
- Configuration - All environment variables
- Whisper Setup - Local speech-to-text
- Kokoro Setup - Local text-to-speech
- Development Setup - Contributing guide
Full documentation: voice-mode.readthedocs.io
Links
- Website: getvoicemode.com
- GitHub: github.com/mbailey/voicemode
- PyPI: pypi.org/project/voice-mode
- YouTube: @getvoicemode
- Twitter/X: @getvoicemode
License
MIT - A Failmode Project
mcp-name: com.failmode/voicemode
