MCP TTS VOICEVOX
A Text-to-Speech server that integrates with an external VOICEVOX engine.
A text-to-speech MCP server using VOICEVOX
Features
- Advanced playback control - Flexible audio processing with queue management, immediate playback, and synchronous/asynchronous control
- Prefetching - Pre-generates next audio for smooth playback
- Cross-platform support - Works on Windows, macOS, and Linux (including WSL environment audio playback)
- Stdio/HTTP support - Supports Stdio, SSE, and StreamableHttp
- Multiple speaker support - Individual speaker specification per segment
- Automatic text segmentation - Stable audio synthesis through automatic long text segmentation
- Independent client library - Provided as a separate package, @kajidog/voicevox-client
Requirements
- Node.js 18.0.0 or higher
- VOICEVOX Engine or compatible engine
Installation
npm install -g @kajidog/mcp-tts-voicevox
Usage
As MCP Server
1. Start VOICEVOX Engine
Start the VOICEVOX Engine and have it listen on the default port (http://localhost:50021).
2. Start MCP Server
Standard I/O mode (recommended):
npx @kajidog/mcp-tts-voicevox
HTTP server mode:
# Linux/macOS
MCP_HTTP_MODE=true npx @kajidog/mcp-tts-voicevox
# Windows PowerShell
$env:MCP_HTTP_MODE='true'; npx @kajidog/mcp-tts-voicevox
MCP Tools
speak - Text-to-speech
Converts text to speech and plays it.
Parameters:
- text: String (multiple texts separated by newlines; a speaker can be specified per line in "1:text" format)
- speaker (optional): Speaker ID
- speedScale (optional): Playback speed
- immediate (optional): Whether to start playback immediately (default: true)
- waitForStart (optional): Whether to wait for playback to start (default: false)
- waitForEnd (optional): Whether to wait for playback to end (default: false)
Examples:
// Simple text
{ "text": "Hello\nIt's a nice day today" }
// Speaker specification
{ "text": "Hello", "speaker": 3 }
// Per-segment speaker specification
{ "text": "1:Hello\n3:It's a nice day today" }
// Immediate playback (bypass queue)
{
"text": "Emergency message",
"immediate": true,
"waitForEnd": true
}
// Wait for playback to complete (synchronous processing)
{
"text": "Wait for this audio playback to complete before next processing",
"waitForEnd": true
}
// Add to queue but don't auto-play
{
"text": "Wait for manual playback start",
"immediate": false
}
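The parameter list and the "1:text" convention can be summarized as a small TypeScript sketch. The `SpeakArgs` fields come from the parameters above; `Segment` and `parseSegments` are hypothetical names used only for illustration, and the server's real parser may treat edge cases differently.

```typescript
// Arguments accepted by the `speak` tool, as listed in the parameters above.
interface SpeakArgs {
  text: string;
  speaker?: number;
  speedScale?: number;
  immediate?: boolean;
  waitForStart?: boolean;
  waitForEnd?: boolean;
}

// One line of `text`, paired with the speaker that applies to it.
interface Segment {
  speaker: number | undefined;
  text: string;
}

// Hypothetical helper (not the package's API): splits `text` on newlines and
// reads an optional "<speakerId>:" prefix from each line. Lines without a
// prefix fall back to the request-level `speaker`, if any.
function parseSegments(args: SpeakArgs): Segment[] {
  const segments: Segment[] = [];
  for (const rawLine of args.text.split("\n")) {
    const line = rawLine.trim();
    if (line.length === 0) {
      continue; // skip blank lines
    }
    const match = /^(\d+):(.*)$/.exec(line);
    if (match) {
      segments.push({ speaker: Number(match[1]), text: match[2].trim() });
    } else {
      segments.push({ speaker: args.speaker, text: line });
    }
  }
  return segments;
}
```

In this sketch a line such as `12:30 meeting` would be read as speaker 12, so text that starts with digits and a colon is worth checking against the real server before relying on it.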
Advanced Playback Control Features
Immediate Playback (immediate: true)
Play audio immediately by bypassing the queue:
- Parallel operation with regular queue: Does not interfere with existing queue playback
- Multiple simultaneous playback: Multiple immediate playbacks can run simultaneously
- Ideal for urgent notifications: Prioritizes important messages
Synchronous Playback Control (waitForEnd: true)
Wait for playback completion to synchronize processing:
- Sequential processing: Execute next processing after audio playback
- Timing control: Enables coordination between audio and other processing
- UI synchronization: Align screen display with audio timing
// Example 1: Play urgent message immediately and wait for completion
{
"text": "Emergency! Please check immediately",
"immediate": true,
"waitForEnd": true
}
// Example 2: Step-by-step audio guide
{
"text": "Step 1: Please open the file",
"waitForEnd": true
}
// Next processing executes after the above audio completes
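As a minimal sketch of how a client might prepare these requests, the helpers below build `speak` arguments for a step-by-step guide and for an urgent message. `buildGuideRequests` and `buildUrgentRequest` are hypothetical names, not part of the package; only the `text`, `immediate`, and `waitForEnd` fields come from the tool description.

```typescript
// Subset of the `speak` arguments used in this sketch.
interface SpeakArgs {
  text: string;
  immediate?: boolean;
  waitForEnd?: boolean;
}

// Hypothetical helper: one `speak` request per step. Each request sets
// waitForEnd so the caller can send the next one only after playback ends.
function buildGuideRequests(steps: string[]): SpeakArgs[] {
  return steps.map((step, index) => ({
    text: `Step ${index + 1}: ${step}`,
    waitForEnd: true,
  }));
}

// Hypothetical helper: an urgent message that bypasses the queue and
// also waits for playback to finish, as in Example 1.
function buildUrgentRequest(message: string): SpeakArgs {
  return { text: message, immediate: true, waitForEnd: true };
}
```

The caller is still responsible for sending the requests one at a time; the objects only carry the flags that ask the server to wait.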
Other Tools
- generate_query - Generate query for speech synthesis
- synthesize_file - Generate audio file
- stop_speaker - Stop playback and clear queue
- get_speakers - Get speaker list
- get_speaker_detail - Get speaker details
Package Structure
@kajidog/mcp-tts-voicevox (this package)
- MCP Server - Communicates with MCP clients like Claude Desktop
- HTTP Server - Remote MCP communication via SSE/StreamableHTTP
@kajidog/voicevox-client (independent package)
- General-purpose library - Communication functionality with VOICEVOX Engine
- Cross-platform - Node.js and browser environment support
- Advanced playback control - Immediate playback, synchronous playback, and queue management features
MCP Configuration Examples
Claude Desktop Configuration
Add the following configuration to your claude_desktop_config.json file:
{
"mcpServers": {
"tts-mcp": {
"command": "npx",
"args": ["-y", "@kajidog/mcp-tts-voicevox"]
}
}
}
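If the configuration is generated by a script rather than edited by hand, a sketch like the following could produce the same JSON. `buildClaudeDesktopConfig` is a hypothetical helper; the `command` and `args` values mirror the example above, and the optional `env` map is where variables such as VOICEVOX_URL would go.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env?: { [name: string]: string };
}

// Hypothetical helper: builds the `mcpServers` object for
// claude_desktop_config.json. `serverName` is only a label.
function buildClaudeDesktopConfig(
  serverName: string,
  env?: { [name: string]: string }
): { mcpServers: { [name: string]: McpServerEntry } } {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "@kajidog/mcp-tts-voicevox"],
  };
  // Only emit an `env` key when at least one variable is set.
  if (env && Object.keys(env).length > 0) {
    entry.env = env;
  }
  const servers: { [name: string]: McpServerEntry } = {};
  servers[serverName] = entry;
  return { mcpServers: servers };
}

// Serialized form, ready to be written to the configuration file.
const configJson = JSON.stringify(buildClaudeDesktopConfig("tts-mcp"), null, 2);
```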
When SSE Mode is Required
If you need speech synthesis in SSE mode, you can use mcp-remote for SSE↔Stdio conversion:
1. Claude Desktop Configuration
{
"mcpServers": {
"tts-mcp-proxy": {
"command": "npx",
"args": ["-y", "mcp-remote", "http://localhost:3000/sse"]
}
}
}
2. Starting SSE Server
Mac/Linux:
MCP_HTTP_MODE=true MCP_HTTP_PORT=3000 npx @kajidog/mcp-tts-voicevox
Windows:
$env:MCP_HTTP_MODE='true'; $env:MCP_HTTP_PORT='3000'; npx @kajidog/mcp-tts-voicevox
AivisSpeech Configuration Example
{
"mcpServers": {
"tts-mcp": {
"command": "npx",
"args": ["-y", "@kajidog/mcp-tts-voicevox"],
"env": {
"VOICEVOX_URL": "http://127.0.0.1:10101",
"VOICEVOX_DEFAULT_SPEAKER": "888753764"
}
}
}
}
Environment Variables
VOICEVOX Configuration
- VOICEVOX_URL: VOICEVOX Engine URL (default: http://localhost:50021)
- VOICEVOX_DEFAULT_SPEAKER: Default speaker ID (default: 1)
- VOICEVOX_DEFAULT_SPEED_SCALE: Default playback speed (default: 1.0)
Playback Options Configuration
- VOICEVOX_DEFAULT_IMMEDIATE: Whether to start playback immediately when added to queue (default: true)
- VOICEVOX_DEFAULT_WAIT_FOR_START: Whether to wait for playback to start (default: false)
- VOICEVOX_DEFAULT_WAIT_FOR_END: Whether to wait for playback to end (default: false)
Usage Examples:
# Example 1: Wait for completion for all audio playback (synchronous processing)
export VOICEVOX_DEFAULT_WAIT_FOR_END=true
npx @kajidog/mcp-tts-voicevox
# Example 2: Wait for both playback start and end
export VOICEVOX_DEFAULT_WAIT_FOR_START=true
export VOICEVOX_DEFAULT_WAIT_FOR_END=true
npx @kajidog/mcp-tts-voicevox
# Example 3: Manual control (disable auto-play)
export VOICEVOX_DEFAULT_IMMEDIATE=false
npx @kajidog/mcp-tts-voicevox
These options allow fine-grained control of audio playback behavior according to application requirements.
Server Configuration
- MCP_HTTP_MODE: Enable HTTP server mode (set to true to enable)
- MCP_HTTP_PORT: HTTP server port number (default: 3000)
- MCP_HTTP_HOST: HTTP server host (default: 0.0.0.0)
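The variables and defaults listed above can be read into a single settings object. The sketch below is one possible way to do that; `resolveConfig`, `readBoolean`, and `readNumber` are hypothetical helpers, and the parsing rules (only "true" and "false" are recognized, invalid numbers fall back to the default) are assumptions rather than the server's documented behavior.

```typescript
type Env = { [name: string]: string | undefined };

interface ServerConfig {
  voicevoxUrl: string;
  defaultSpeaker: number;
  defaultSpeedScale: number;
  defaultImmediate: boolean;
  defaultWaitForStart: boolean;
  defaultWaitForEnd: boolean;
  httpMode: boolean;
  httpPort: number;
  httpHost: string;
}

// Assumption: only "true" / "false" (case-insensitive) are recognized;
// anything else keeps the fallback.
function readBoolean(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined) return fallback;
  const normalized = value.trim().toLowerCase();
  if (normalized === "true") return true;
  if (normalized === "false") return false;
  return fallback;
}

// Assumption: empty or non-numeric values keep the fallback.
function readNumber(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  return isNaN(parsed) ? fallback : parsed;
}

// Hypothetical helper: applies the documented defaults to an environment map.
function resolveConfig(env: Env): ServerConfig {
  return {
    voicevoxUrl: env["VOICEVOX_URL"] || "http://localhost:50021",
    defaultSpeaker: readNumber(env["VOICEVOX_DEFAULT_SPEAKER"], 1),
    defaultSpeedScale: readNumber(env["VOICEVOX_DEFAULT_SPEED_SCALE"], 1.0),
    defaultImmediate: readBoolean(env["VOICEVOX_DEFAULT_IMMEDIATE"], true),
    defaultWaitForStart: readBoolean(env["VOICEVOX_DEFAULT_WAIT_FOR_START"], false),
    defaultWaitForEnd: readBoolean(env["VOICEVOX_DEFAULT_WAIT_FOR_END"], false),
    httpMode: readBoolean(env["MCP_HTTP_MODE"], false),
    httpPort: readNumber(env["MCP_HTTP_PORT"], 3000),
    httpHost: env["MCP_HTTP_HOST"] || "0.0.0.0",
  };
}
```

In a Node.js program the function would typically be called with `process.env`; taking the map as a parameter keeps the sketch easy to test.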
Usage with WSL (Windows Subsystem for Linux)
How to connect from a WSL environment to an MCP server running on the Windows host.
1. Windows Host Configuration
Start the MCP server from PowerShell (this example uses AivisSpeech):
$env:MCP_HTTP_MODE='true'; $env:MCP_HTTP_PORT='3000'; $env:VOICEVOX_URL='http://127.0.0.1:10101'; $env:VOICEVOX_DEFAULT_SPEAKER='888753764'; npx @kajidog/mcp-tts-voicevox
2. WSL Environment Configuration
Check Windows host IP address:
# Get Windows host IP address from WSL
ip route show | grep default | awk '{print $3}'
The address is usually in the form 172.x.x.1.
Claude Code .mcp.json configuration example:
{
"mcpServers": {
"tts": {
"type": "sse",
"url": "http://172.29.176.1:3000/sse"
}
}
}
Important Points:
- Within WSL, localhost or 127.0.0.1 refers to the WSL instance itself, so it cannot reach Windows host services
- Use the WSL gateway IP (usually 172.x.x.1) to access the Windows host
- Ensure the port is not blocked by the Windows firewall
Connection Test:
# Check connection to Windows host MCP server from WSL
curl http://172.29.176.1:3000
If the server is reachable, it returns 404 Not Found (because the root path does not exist).
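The gateway lookup and the resulting .mcp.json entry can also be scripted. The sketch below takes the text printed by `ip route show`, extracts the same field the awk command prints, and builds the configuration object. `parseDefaultGateway` and `buildMcpJson` are hypothetical helper names, and the route line used in the usage note is only an illustrative sample.

```typescript
// Finds the first route line that begins with "default" and returns its
// third whitespace-separated field (the gateway address), which is the
// field printed by awk '{print $3}'.
function parseDefaultGateway(ipRouteOutput: string): string | undefined {
  for (const line of ipRouteOutput.split("\n")) {
    const fields = line.trim().split(/\s+/);
    if (fields[0] === "default" && fields.length >= 3) {
      return fields[2];
    }
  }
  return undefined;
}

// Hypothetical helper: builds the .mcp.json content for an SSE connection
// to the MCP server on the Windows host.
function buildMcpJson(gatewayIp: string, port: number) {
  return {
    mcpServers: {
      tts: {
        type: "sse",
        url: `http://${gatewayIp}:${port}/sse`,
      },
    },
  };
}
```

For example, passing the sample line `default via 172.29.176.1 dev eth0` to `parseDefaultGateway` and the result to `buildMcpJson` with port 3000 gives the same URL as the configuration above.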
Troubleshooting
Common Issues
- VOICEVOX Engine is not running
  curl http://localhost:50021/speakers
- Audio is not playing
  - Check system audio output device
  - Check platform-specific audio playback tools:
    - Linux: Requires one of aplay, paplay, play, ffplay
    - macOS: afplay (pre-installed)
    - Windows: PowerShell (pre-installed)
- Not recognized by MCP client
  - Check package installation: npm list -g @kajidog/mcp-tts-voicevox
  - Check JSON syntax in configuration file
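When the engine does respond, the /speakers output is also a convenient place to look up speaker IDs. The sketch below flattens that response into readable lines. It assumes the response is an array of speakers, each with a `name` and a `styles` array of `{ name, id }` objects, and that the `id` values are what the `speaker` parameter and VOICEVOX_DEFAULT_SPEAKER expect; both points should be confirmed against the engine in use. `listSpeakerIds` is a hypothetical helper.

```typescript
// Assumed shape of one entry in the /speakers response
// (only the fields used by this sketch).
interface EngineStyle {
  name: string;
  id: number;
}

interface EngineSpeaker {
  name: string;
  styles: EngineStyle[];
}

// Hypothetical helper: one "<id>: <speaker> (<style>)" line per style.
function listSpeakerIds(speakers: EngineSpeaker[]): string[] {
  const lines: string[] = [];
  for (const speaker of speakers) {
    for (const style of speaker.styles) {
      lines.push(`${style.id}: ${speaker.name} (${style.name})`);
    }
  }
  return lines;
}
```

The input would be the parsed JSON from the curl command above; an empty result suggests the engine is running but has no voices installed.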
License
ISC
Developer Information
Instructions for developing this repository locally.
Setup
- Clone the repository:
  git clone https://github.com/kajidog/mcp-tts-voicevox.git
  cd mcp-tts-voicevox
- Install pnpm (if not already installed).
- Install dependencies:
  pnpm install
Main Development Commands
You can run the following commands in the project root.
- Build all packages: pnpm build
- Run all tests: pnpm test
- Run all linters: pnpm lint
- Start root server in development mode: pnpm dev
- Start stdio interface in development mode: pnpm dev:stdio
These commands also run the corresponding tasks for the related packages in the workspace.
