Access Whissle API for speech-to-text, diarization, translation, and text summarization.
A Python-based server that provides access to Whissle API endpoints for speech-to-text, diarization, translation, and text summarization.
Clone the repository:
git clone <repository-url>
cd whissle_mcp
Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
Install the required packages:
pip install -e .
Set up environment variables:
Create a .env file in the project root with the following content:
WHISSLE_AUTH_TOKEN=insert_auth_token_here # Replace with your actual Whissle API token
WHISSLE_MCP_BASE_PATH=/path/to/your/base/directory
⚠️ Important: Never commit your actual token to the repository. The .env file is included in .gitignore to prevent accidental commits.
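As a rough illustration of how these two variables might be consumed, the sketch below reads them from an environment-like mapping. The `Settings` type and the `load_settings` helper are hypothetical names, not part of the Whissle server, and the Desktop fallback is an assumption based on the default this README gives for WHISSLE_MCP_BASE_PATH. Note that Python does not read a .env file on its own; whether the server uses a loader for that is not stated here.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    auth_token: str
    base_path: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read Whissle settings from an environment-like mapping (hypothetical helper)."""
    token = env.get("WHISSLE_AUTH_TOKEN", "").strip()
    # Treat the placeholder from the example .env as "not configured".
    if not token or token == "insert_auth_token_here":
        raise ValueError("WHISSLE_AUTH_TOKEN is not set")
    base = env.get("WHISSLE_MCP_BASE_PATH", "").strip()
    # Assumption: fall back to the user's Desktop when no base path is given.
    base_path = Path(base) if base else Path.home() / "Desktop"
    return Settings(auth_token=token, base_path=base_path)
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.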
Configure Claude Integration:
Copy claude_config.example.json to claude_config.json and update the paths:
{
  "mcpServers": {
    "Whissle": {
      "command": "/path/to/your/venv/bin/python",
      "args": [
        "/path/to/whissle_mcp/server.py"
      ],
      "env": {
        "WHISSLE_AUTH_TOKEN": "insert_auth_token_here"
      }
    }
  }
}
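If you would rather generate this structure than edit it by hand, a minimal sketch follows. The `build_claude_config` helper is a hypothetical name introduced for illustration; it only assembles the same JSON layout shown above and prints it, leaving it to you to save the output as claude_config.json.

```python
import json
import sys
from pathlib import Path


def build_claude_config(python_path: str, server_path: str, token: str) -> dict:
    """Assemble the mcpServers entry shown above (hypothetical helper)."""
    return {
        "mcpServers": {
            "Whissle": {
                "command": python_path,
                "args": [server_path],
                "env": {"WHISSLE_AUTH_TOKEN": token},
            }
        }
    }


if __name__ == "__main__":
    # Assumption: run from the project root with the virtual environment active,
    # so sys.executable is the venv interpreter and server.py is in this directory.
    config = build_claude_config(
        python_path=sys.executable,
        server_path=str(Path("server.py").resolve()),
        token="insert_auth_token_here",  # replace with your real token
    )
    print(json.dumps(config, indent=2))
```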
In the configuration, replace the following:
/path/to/your/venv/bin/python with the actual path to your Python interpreter in the virtual environment
/path/to/whissle_mcp/server.py with the actual path to your server.py file
The server reads the following environment variables:
WHISSLE_AUTH_TOKEN: Your Whissle API authentication token (required); set it in the .env file
WHISSLE_MCP_BASE_PATH: Base directory for file operations (optional, defaults to the user's Desktop)
The server supports the following audio formats:
Convert speech to text using the Whissle API.
response = speech_to_text(
    audio_file_path="path/to/audio.wav",
    model_name="en-NER",                     # Default model
    timestamps=True,                         # Include word timestamps
    boosted_lm_words=["specific", "terms"],  # Words to boost in recognition
    boosted_lm_score=80                      # Score for boosted words (0-100)
)
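Since boosted_lm_score is documented as a 0-100 value, it can be worth checking arguments locally before making a call. The sketch below is one way to do that; `build_stt_args` is a hypothetical helper, not part of the server, and its defaults simply copy the values from the example above.

```python
from typing import Optional, Sequence


def build_stt_args(
    audio_file_path: str,
    model_name: str = "en-NER",
    timestamps: bool = True,
    boosted_lm_words: Optional[Sequence[str]] = None,
    boosted_lm_score: int = 80,
) -> dict:
    """Check arguments before passing them to speech_to_text (hypothetical helper)."""
    if not audio_file_path:
        raise ValueError("audio_file_path must not be empty")
    if not 0 <= boosted_lm_score <= 100:
        raise ValueError("boosted_lm_score must be between 0 and 100")
    # Drop empty or whitespace-only boost terms.
    words = [w.strip() for w in (boosted_lm_words or []) if w.strip()]
    return {
        "audio_file_path": audio_file_path,
        "model_name": model_name,
        "timestamps": timestamps,
        "boosted_lm_words": words,
        "boosted_lm_score": boosted_lm_score,
    }
```

The returned dict can then be unpacked into the tool call, for example `speech_to_text(**build_stt_args("path/to/audio.wav"))`.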
Convert speech to text with speaker identification.
response = diarize_speech(
    audio_file_path="path/to/audio.wav",
    model_name="en-NER",  # Default model
    max_speakers=2,       # Maximum number of speakers to identify
    boosted_lm_words=["specific", "terms"],
    boosted_lm_score=80
)
Translate text from one language to another.
response = translate_text(
    text="Hello, world!",
    source_language="en",
    target_language="es"
)
Summarize text using an LLM model.
response = summarize_text(
    content="Long text to summarize...",
    model_name="openai",                   # Default model
    instruction="Provide a brief summary"  # Optional
)
List all available ASR models and their capabilities.
response = list_asr_models()
{
  "transcript": "The transcribed text",
  "duration_seconds": 10.5,
  "language_code": "en",
  "timestamps": [
    {
      "word": "The",
      "startTime": 0,
      "endTime": 100,
      "confidence": 0.95
    }
  ],
  "diarize_output": [
    {
      "text": "The transcribed text",
      "speaker_id": 1,
      "start_timestamp": 0,
      "end_timestamp": 10.5
    }
  ]
}
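To show how a response of this shape might be consumed, here is a small sketch that assumes the JSON has already been parsed into a Python dict. The `speaker_turns` and `low_confidence_words` helpers are hypothetical; they rely only on the field names shown above and make no assumption about the units of the timestamp values.

```python
from typing import Any, Dict, List


def speaker_turns(response: Dict[str, Any]) -> List[str]:
    """Format each diarized segment as 'Speaker N [start-end]: text'."""
    lines = []
    for seg in response.get("diarize_output", []):
        lines.append(
            f"Speaker {seg['speaker_id']} "
            f"[{seg['start_timestamp']}-{seg['end_timestamp']}]: {seg['text']}"
        )
    return lines


def low_confidence_words(response: Dict[str, Any], threshold: float = 0.9) -> List[str]:
    """Return words whose confidence falls below the threshold."""
    return [
        w["word"]
        for w in response.get("timestamps", [])
        # Words without a confidence value are treated as fully confident.
        if w.get("confidence", 1.0) < threshold
    ]
```

Both helpers return an empty list when the corresponding field is absent, so they can be applied to plain transcription and diarization responses alike.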
{
  "type": "text",
  "text": "Translation:\nTranslated text here"
}
{
  "type": "text",
  "text": "Summary:\nSummarized text here"
}
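The translation and summary results both place a label line ("Translation:" or "Summary:") ahead of the actual content. If you only need the content, something like the following sketch could strip that first line. `strip_label` is a hypothetical helper and assumes the result has been parsed into a Python dict.

```python
def strip_label(result: dict) -> str:
    """Return the text that follows a leading label line, if one is present."""
    text = result.get("text", "")
    head, sep, body = text.partition("\n")
    # Only strip when the first line looks like a label such as "Translation:".
    if sep and head.endswith(":"):
        return body
    return text
```

Text without a label line is returned unchanged, so the helper is safe to apply to any result of type "text".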
{
  "error": "Error message here"
}
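Because failures are reported as a plain object with an "error" field, callers may want to turn them into exceptions. The sketch below shows one possible approach; `WhissleToolError` and `unwrap` are hypothetical names, and the response is assumed to be an already-parsed Python dict.

```python
class WhissleToolError(Exception):
    """Raised when a tool response carries an 'error' field (hypothetical type)."""


def unwrap(response: dict) -> dict:
    """Return the response unchanged, or raise if it reports an error."""
    if isinstance(response, dict) and "error" in response:
        raise WhissleToolError(response["error"])
    return response
```

Wrapping each tool call, for example `unwrap(translate_text(...))`, keeps error checks out of the code that processes successful results.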
The server includes robust error handling with:
Common error types:
Start the server:
mcp serve
The server will be available at the default MCP port (usually 8000).
A test script is provided to verify the functionality of all tools:
python test_whissle.py
The test script will:
For issues or questions, please:
[Add your license information here]