Voice MCP
Enables voice interactions with Claude and other LLMs using an OpenAI API key for STT/TTS services.
VoiceMode
Install via:
uv tool install voice-mode
getvoicemode.com
Natural voice conversations for AI assistants. VoiceMode brings human-like voice interactions to Claude Code and other AI code editors through the Model Context Protocol (MCP).
🖥️ Compatibility
Runs on: Linux • macOS • Windows (WSL) • NixOS | Python: 3.10+
✨ Features
- 🎙️ Natural Voice Conversations with Claude Code - ask questions and hear responses
- 🗣️ Supports local voice models - works with any OpenAI-compatible STT/TTS service
- ⚡ Real-time - low-latency voice interactions with automatic transport selection
- 🔧 MCP Integration - works seamlessly with Claude Code (and other MCP clients)
- 🎯 Silence detection - automatically stops recording when you stop speaking (no more waiting!)
- 🔄 Multiple transports - local microphone or LiveKit room-based communication
🎯 Simple Requirements
All you need to get started:
- 🎤 Computer with microphone and speakers
- 🔑 OpenAI API key (recommended, even if only as a fallback when local services are unavailable)
Quick Start
Install VoiceMode and dependencies with UV (Recommended)
- Linux (Fedora, Debian/Ubuntu)
- macOS
- Windows WSL
# Install VoiceMode MCP python package and dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
uvx voice-mode-install
# While local voice services can be installed automatically, we recommend
# providing an OpenAI API key as a fallback in case local services are unavailable
export OPENAI_API_KEY=your-openai-key # Optional but recommended
# Add VoiceMode to Claude
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
# Start a voice conversation
claude converse
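If you want to confirm the server registered before starting a conversation, the standard Claude Code MCP commands can list it (these are generic claude CLI commands, not specific to VoiceMode):
# Optional: verify the voicemode server is registered with Claude Code
claude mcp list
claude mcp get voicemode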
Manual Installation
For manual setup steps, see the Getting Started Guide.
🎬 Demo
Watch VoiceMode in action with Claude Code:
The converse function makes voice interactions natural - it automatically waits for your response by default, creating a real conversation flow.
Installation
Prerequisites
- Python >= 3.10
- Astral UV - Package manager (install with curl -LsSf https://astral.sh/uv/install.sh | sh)
- OpenAI API Key (or compatible service)
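A quick sanity check that the prerequisites are in place (assuming python3 and uv are on your PATH after installation):
# Confirm interpreter and package manager versions
python3 --version   # should report 3.10 or newer
uv --version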
System Dependencies
Ubuntu/Debian:
sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev
Note for WSL2 users: WSL2 requires additional audio packages (pulseaudio, libasound2-plugins) for microphone access.
Fedora:
sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel
macOS:
# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install dependencies
brew install ffmpeg node portaudio
Windows (WSL):
Follow the Ubuntu/Debian instructions above within WSL.
NixOS:
VoiceMode includes a flake.nix with all required dependencies. You can either:
- Use the development shell (temporary):
nix develop github:mbailey/voicemode
- Install system-wide (see Alternative Installation Options below)
Quick Install
# Using Claude Code (recommended)
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
Configuration for AI Coding Assistants
📖 Looking for detailed setup instructions? Check our comprehensive Getting Started Guide for step-by-step instructions!
Below are quick configuration snippets. For full installation and setup instructions, see the Getting Started Guide above.
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
Or with environment variables:
claude mcp add --scope user --env OPENAI_API_KEY=your-openai-key voicemode -- uvx --refresh voice-mode
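For MCP clients configured through a JSON file rather than the claude CLI, an equivalent server entry typically looks like the sketch below. The mcpServers block is the common MCP client convention; treat the exact file location and keys as assumptions to check against your client's documentation.
{
  "mcpServers": {
    "voicemode": {
      "command": "uvx",
      "args": ["--refresh", "voice-mode"],
      "env": { "OPENAI_API_KEY": "your-openai-key" }
    }
  }
}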
Alternative Installation Options
From source:
git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .
NixOS:
1. Install with nix profile (user-wide):
nix profile install github:mbailey/voicemode
2. Add to NixOS configuration (system-wide):
# In /etc/nixos/configuration.nix
environment.systemPackages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
3. Add to home-manager:
# In home-manager configuration
home.packages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
4. Run without installing:
nix run github:mbailey/voicemode
Configuration
- 📖 Getting Started - Step-by-step setup guide
- 🔧 Configuration Reference - All environment variables
Quick Setup
The only required configuration is your OpenAI API key:
export OPENAI_API_KEY="your-key"
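To keep the key available in new terminal sessions, you could append the export to your shell profile (a minimal sketch; bash shown, adjust the file for your shell):
# Persist the key for future sessions (use ~/.zshrc for zsh)
echo 'export OPENAI_API_KEY="your-key"' >> ~/.bashrc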
Local STT/TTS Services
For privacy-focused or offline usage, VoiceMode supports local speech services:
- Whisper.cpp - Local speech-to-text with OpenAI-compatible API
- Kokoro - Local text-to-speech with multiple voice options
These services provide the same API interface as OpenAI, allowing seamless switching between cloud and local processing.
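As a sketch of what switching to local services can look like, you point the STT/TTS base URLs at the locally running endpoints instead of OpenAI. The variable names and ports below are illustrative assumptions; the canonical names and defaults are listed in the Configuration Reference.
# Illustrative only - confirm variable names and ports in the Configuration Reference
export VOICEMODE_STT_BASE_URLS="http://127.0.0.1:2022/v1"   # assumed local Whisper.cpp endpoint
export VOICEMODE_TTS_BASE_URLS="http://127.0.0.1:8880/v1"   # assumed local Kokoro endpoint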
Troubleshooting
Common Issues
- No microphone access: Check system permissions for terminal/application
- WSL2 Users: Additional audio packages (pulseaudio, libasound2-plugins) required for microphone access
- UV not found: Install with curl -LsSf https://astral.sh/uv/install.sh | sh
- OpenAI API error: Verify your OPENAI_API_KEY is set correctly
- No audio output: Check system audio settings and available devices
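If you are unsure whether any capture or playback device is visible at all, a quick PortAudio-level check (assuming the Python sounddevice package is available, which VoiceMode's PortAudio stack typically implies) is:
# List audio devices as seen by PortAudio
python3 -c "import sounddevice; print(sounddevice.query_devices())"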
Audio Saving
To save all audio files (both TTS output and STT input):
export VOICEMODE_SAVE_AUDIO=true
Audio files are saved to ~/.voicemode/audio/YYYY/MM/ with timestamps in the filename.
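For example, to list what was captured in the current month under the layout above:
# List this month's saved audio files
ls ~/.voicemode/audio/$(date +%Y)/$(date +%m)/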
Documentation
📚 Read the full documentation at voice-mode.readthedocs.io
Getting Started
- Getting Started - Step-by-step setup for all supported tools
- Configuration Guide - Complete environment variable reference
Development
- Development Setup - Local development guide
Service Guides
- Whisper.cpp Setup - Local speech-to-text configuration
- Kokoro Setup - Local text-to-speech configuration
- LiveKit Integration - Real-time voice communication
Links
- Website: getvoicemode.com
- Documentation: voice-mode.readthedocs.io
- GitHub: github.com/mbailey/voicemode
- PyPI: pypi.org/project/voice-mode
Community
- Twitter/X: @getvoicemode
- YouTube: @getvoicemode
See Also
- 🚀 Getting Started - Setup instructions for all supported tools
- 🔧 Configuration Reference - Environment variables and options
- 🎤 Local Services Setup - Run TTS/STT locally for privacy
License
MIT - A Failmode Project
mcp-name: com.failmode/voicemode