Voice MCP Server

OpenAI API 키를 사용하여 STT/TTS 서비스로 Claude 및 다른 LLM과 음성 상호작용을 가능하게 합니다.

문서

VoiceMode

Natural voice conversations with Claude Code (and other MCP capable agents)

VoiceMode enables natural voice conversations with Claude Code. Voice isn't about replacing typing - it's about being available when typing isn't.

Perfect for:

Walking to your next meeting
Cooking while debugging
Giving your eyes a break after hours of screen time
Holding a coffee (or a dog)
Any moment when your hands or eyes are busy

See It In Action

Quick Start

Requirements: Computer with microphone and speakers

Option 1: Claude Code Plugin (Recommended)

The fastest way for Claude Code users to get started:

# Add the VoiceMode marketplace
claude plugin marketplace add mbailey/voicemode

# Install VoiceMode plugin
claude plugin install voicemode@voicemode

## Install dependencies (CLI, Local Voice Services)

/voicemode:install

# Start talking!
/voicemode:converse

Option 2: Python installer package

Installs dependencies and the VoiceMode Python package.

# Install UV package manager (if needed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run the installer (sets up dependencies and local voice services)
uvx voice-mode-install

# Add to Claude Code
claude mcp add --scope user voicemode -- uvx --refresh --from voice-mode voicemode-mcp-launcher

# Optional: Add OpenAI API key as fallback for local services
export OPENAI_API_KEY=your-openai-key

# Start a conversation
claude converse

For manual setup, see the Getting Started Guide.

Features

Natural conversations - speak naturally, hear responses immediately
Works offline - optional local voice services (Whisper STT, Kokoro TTS)
Low latency - fast enough to feel like a real conversation
Smart silence detection - stops recording when you stop speaking
Privacy options - run entirely locally or use cloud services

Compatibility

Platforms: Linux, macOS, Windows (WSL), NixOS Python: 3.10-3.14

Configuration

VoiceMode works out of the box. For customization:

# Set OpenAI API key (if using cloud services)
export OPENAI_API_KEY="your-key"

# Or configure via file
voicemode config edit

See the Configuration Guide for all options.

Permissions Setup (Optional)

To use VoiceMode without permission prompts, add to ~/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "mcp__voicemode__converse",
      "mcp__voicemode__service"
    ]
  }
}

See the Permissions Guide for more options.

Local Voice Services

For privacy or offline use, install local speech services:

Whisper.cpp - Local speech-to-text
Kokoro - Local text-to-speech with multiple voices

These provide the same API as OpenAI, so VoiceMode switches seamlessly between them.

Installation Details

System Dependencies by Platform

Ubuntu/Debian

sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev

WSL2 users: The pulseaudio packages above are required for microphone access.

Fedora/RHEL

sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel

macOS

brew install ffmpeg node portaudio

NixOS

# Use development shell
nix develop github:mbailey/voicemode

# Or install system-wide
nix profile install github:mbailey/voicemode

Alternative Installation Methods

From source

git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .

NixOS system-wide

# In /etc/nixos/configuration.nix
environment.systemPackages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];

Troubleshooting

Problem	Solution
No microphone access	Check terminal/app permissions. WSL2 needs pulseaudio packages.
UV not found	Run `curl -LsSf https://astral.sh/uv/install.sh \| sh`
OpenAI API error	Verify `OPENAI_API_KEY` is set correctly
No audio output	Check system audio settings and available devices

Save Audio for Debugging

export VOICEMODE_SAVE_AUDIO=true
# Files saved to ~/.voicemode/audio/YYYY/MM/

Documentation

Getting Started - Full setup guide
Configuration - All environment variables
Whisper Setup - Local speech-to-text
Kokoro Setup - Local text-to-speech
Development Setup - Contributing guide

Full documentation: voicemode.dev

License

MIT - A Failmode Project

mcp-name: dev.voicemode/voicemode