VOICEROID Daemon

A text-to-speech server for VOICEROID2, backed by voiceroid_daemon.

voiceroid_daemon-mcp

MCP (Model Context Protocol) server for VOICEROID2 text-to-speech via voiceroid_daemon.

Features

  • Text-to-speech generation with VOICEROID2 voices
  • Text-to-kana phonetic conversion
  • Customizable voice parameters (volume, speed, pitch, emphasis)
  • Audio playback support for macOS, Windows, and Linux
  • Basic authentication support

Prerequisites

  • Node.js 18 or higher
  • voiceroid_daemon running on your system
  • VOICEROID2 installed (for voiceroid_daemon)

Installation

# Clone the repository
git clone https://github.com/mohemohe/voiceroid_daemon-mcp.git
cd voiceroid_daemon-mcp

# Install dependencies
npm install

Configuration

Create a .env file in the project root (optional):

# voiceroid_daemon server URL (default: http://127.0.0.1:8080)
VOICEROID_DAEMON_URL=http://127.0.0.1:8080

# Basic authentication (if required)
VOICEROID_DAEMON_USERNAME=your_username
VOICEROID_DAEMON_PASSWORD=your_password

# Default voice parameters (optional)
VOICEROID_DEFAULT_VOLUME=1.0        # 0-2
VOICEROID_DEFAULT_SPEED=1.3         # 0.5-4
VOICEROID_DEFAULT_PITCH=1.0         # 0.5-2
VOICEROID_DEFAULT_EMPHASIS=1.1      # 0-2
VOICEROID_DEFAULT_PAUSE_MIDDLE=150  # 80-500
VOICEROID_DEFAULT_PAUSE_LONG=370    # 100-2000
VOICEROID_DEFAULT_PAUSE_SENTENCE=800 # 0-10000
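
The default voice parameters above could be read and range-checked roughly as follows. This is a minimal sketch, not the project's actual code: `loadVoiceDefaults` and `readNumber` are hypothetical names, clamping out-of-range values (rather than rejecting them) is an assumption, and the fallback of 1 is taken from the speak_text defaults. The pause settings are left out because their fallbacks are not documented here.

```typescript
// Hypothetical helper, not taken from the project's source.
type VoiceDefaults = {
  volume: number;
  speed: number;
  pitch: number;
  emphasis: number;
};

// Parse one numeric setting; fall back when it is missing or not a number,
// and clamp it into [min, max] otherwise (clamping is an assumption).
function readNumber(
  raw: string | undefined,
  min: number,
  max: number,
  fallback: number
): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Ranges follow the comments in the .env example above.
function loadVoiceDefaults(
  env: Record<string, string | undefined>
): VoiceDefaults {
  return {
    volume: readNumber(env.VOICEROID_DEFAULT_VOLUME, 0, 2, 1),
    speed: readNumber(env.VOICEROID_DEFAULT_SPEED, 0.5, 4, 1),
    pitch: readNumber(env.VOICEROID_DEFAULT_PITCH, 0.5, 2, 1),
    emphasis: readNumber(env.VOICEROID_DEFAULT_EMPHASIS, 0, 2, 1),
  };
}
```

In a Node.js process, this would typically be called as `loadVoiceDefaults(process.env)` after the .env file has been loaded.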

Usage

Running the MCP Server

# Run directly with tsx (no build required)
npm start

# Or for development with auto-reload
npm run dev

Configuring with Claude Desktop

Add the following to your Claude Desktop configuration file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "voiceroid-daemon": {
      "command": "npx",
      "args": ["tsx", "/path/to/voiceroid_daemon-mcp/src/index.ts"],
      "env": {
        "VOICEROID_DAEMON_URL": "http://127.0.0.1:8080"
      }
    }
  }
}

Available Tools

test_connection

Test the connection to the voiceroid_daemon server.

Parameters: none

convert_text

Convert Japanese text to phonetic kana reading.

Parameters:

  • text (string, required): Text to convert to kana

speak_text

Generate and play speech audio from text.

Parameters:

  • text (string, required): Text to speak
  • kana (string, optional): Phonetic reading in kana
  • volume (number, optional): Voice volume (0-2, default: 1)
  • speed (number, optional): Speech speed (0.5-4, default: 1)
  • pitch (number, optional): Voice pitch (0.5-2, default: 1)
  • emphasis (number, optional): Emphasis level (0-2, default: 1)
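
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows what such a request for speak_text might look like with the parameters listed above. Claude Desktop builds this message for you; `buildSpeakTextRequest` is a hypothetical helper used only for illustration, and the empty-text check is an assumption about sensible validation, not documented server behavior.

```typescript
// Arguments accepted by speak_text, as listed above.
interface SpeakTextArgs {
  text: string;
  kana?: string;
  volume?: number;
  speed?: number;
  pitch?: number;
  emphasis?: number;
}

// Shape of an MCP "tools/call" JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: SpeakTextArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call request.
function buildSpeakTextRequest(id: number, args: SpeakTextArgs): ToolCallRequest {
  if (args.text.trim() === "") {
    throw new Error("text is required");
  }
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "speak_text", arguments: args },
  };
}

const request = buildSpeakTextRequest(1, {
  text: "こんにちは、今日はいい天気ですね",
  speed: 1.2,
});
console.log(JSON.stringify(request, null, 2));
```

Omitted optional fields are simply absent from `arguments`, leaving the server to apply its defaults.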

Example Usage in Claude

Once configured, you can use the tools in Claude:

Use the test_connection tool to check if voiceroid_daemon is running.

Convert "こんにちは" to kana using the convert_text tool.

Use speak_text to say "こんにちは、今日はいい天気ですね" with speed 1.2.

Troubleshooting

Connection Failed

  1. Ensure voiceroid_daemon is running
  2. Check that VOICEROID_DAEMON_URL in your configuration matches the daemon's address and port
  3. Verify firewall settings allow connections
  4. Test with curl: curl http://127.0.0.1:8080/

Audio Playback Issues

  • macOS: Uses afplay (built-in)
  • Windows: Uses PowerShell's System.Media.SoundPlayer
  • Linux: Requires aplay (usually part of alsa-utils)
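
When debugging playback, it can help to see how a player might be chosen per platform. The following is a sketch under stated assumptions, not the project's implementation: `playbackCommand` is a hypothetical name, the audio is assumed to be a WAV file on disk, and the exact PowerShell invocation is a guess at one workable form. The platform strings are the values Node.js reports in `process.platform`.

```typescript
interface PlaybackCommand {
  command: string;
  args: string[];
}

// Hypothetical helper: map a Node.js platform string to a playback command.
function playbackCommand(platform: string, wavPath: string): PlaybackCommand {
  switch (platform) {
    case "darwin":
      // afplay ships with macOS.
      return { command: "afplay", args: [wavPath] };
    case "win32": {
      // Single quotes inside a PowerShell single-quoted string are doubled.
      const escaped = wavPath.replace(/'/g, "''");
      return {
        command: "powershell",
        args: [
          "-NoProfile",
          "-Command",
          "(New-Object System.Media.SoundPlayer '" + escaped + "').PlaySync()",
        ],
      };
    }
    case "linux":
      // aplay is usually provided by alsa-utils.
      return { command: "aplay", args: [wavPath] };
    default:
      throw new Error("Unsupported platform: " + platform);
  }
}
```

Running the resulting command by hand (for example `aplay /tmp/test.wav` on Linux) is a quick way to confirm that the player itself works before looking at the MCP server.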

Authentication Errors

If voiceroid_daemon requires authentication, ensure you've set:

  • VOICEROID_DAEMON_USERNAME
  • VOICEROID_DAEMON_PASSWORD
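
HTTP Basic authentication sends the credentials as `Authorization: Basic <base64 of username:password>`. The sketch below shows how the two variables might be turned into that header. `basicAuthHeaders` is a hypothetical name, skipping the header unless both values are set is an assumption, and `btoa` limits this version to Latin-1 (in practice ASCII) credentials.

```typescript
// Hypothetical helper: build the Authorization header for Basic auth.
// Returns an empty object unless both values are set (an assumption).
function basicAuthHeaders(
  username: string | undefined,
  password: string | undefined
): Record<string, string> {
  if (!username || !password) return {};
  // btoa only accepts Latin-1 input; ASCII credentials are assumed here.
  return { Authorization: "Basic " + btoa(username + ":" + password) };
}

console.log(basicAuthHeaders("user", "pass"));
```

With the environment loaded, this would be called as `basicAuthHeaders(process.env.VOICEROID_DAEMON_USERNAME, process.env.VOICEROID_DAEMON_PASSWORD)`. If only one of the two variables is set, no header is sent, which would surface as an authentication error from the daemon.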

Development

# Type checking
npm run typecheck

# Linting
npm run lint

# Run in development mode
npm run dev

License

MIT
