Speech AI MCP Server

Production speech AI MCP server with pronunciation scoring, speech-to-text, and text-to-speech — 10 tools, 7 resources, 3 prompts.

GitHub

Documentation

Brainiall AI APIs

Production AI APIs for speech, text, image, and LLM inference. Available as REST endpoints and MCP servers for AI agents.

Base URL: https://apim-ai-apis.azure-api.net Full API reference for LLMs: llms-full.txt | llms.txt

Products

Product	Endpoints	Latency	Notes
Pronunciation Assessment	`/v1/pronunciation/assess/base64`	<500ms	17MB ONNX, per-phoneme scoring (39 ARPAbet)
Text-to-Speech	`/v1/tts/synthesize`	<1s	12 voices (American + British), 24kHz WAV
Speech-to-Text	`/v1/stt/transcribe/base64`	<500ms	Compact 17MB model, English, word timestamps
Whisper Pro	`/v1/whisper/transcribe/base64`	<3s	99 languages, speaker diarization
NLP Suite	`/v1/nlp/{toxicity,sentiment,entities,pii,language}`	<50ms	CPU-only, ONNX, 5 endpoints
Image Processing	`/v1/image/{remove-background,upscale,restore-face}/base64`	<3s	GPU (A10), BiRefNet + ESRGAN + GFPGAN
LLM Gateway	`/v1/chat/completions`	varies	113+ models, OpenAI-compatible, streaming

Authentication

Include ONE of these headers in every request:

Ocp-Apim-Subscription-Key: YOUR_KEY
Authorization: Bearer YOUR_KEY
api-key: YOUR_KEY

Get API keys at the portal (GitHub sign-in, purchase credits, create key).

Quick Start

Python — LLM Gateway (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://apim-ai-apis.azure-api.net/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Python — Pronunciation Assessment

import requests, base64

audio_b64 = base64.b64encode(open("audio.wav", "rb").read()).decode()
r = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/pronunciation/assess/base64",
    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},
    json={"audio": audio_b64, "text": "Hello world", "format": "wav"}
)
print(r.json()["overallScore"])  # 0-100

Python — NLP Pipeline

import requests

headers = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}
base = "https://apim-ai-apis.azure-api.net/v1/nlp"

# Sentiment
r = requests.post(f"{base}/sentiment", headers=headers, json={"text": "I love this!"})
print(r.json())  # {"label": "positive", "score": 0.9987}

# PII detection with redaction
r = requests.post(f"{base}/pii", headers=headers, json={"text": "Email [email protected]", "redact": True})
print(r.json()["redacted_text"])  # "Email [EMAIL]"

Node.js — LLM Gateway

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://apim-ai-apis.azure-api.net/v1",
  apiKey: "YOUR_KEY"
});

const res = await client.chat.completions.create({
  model: "claude-sonnet",
  messages: [{ role: "user", content: "Hello!" }]
});
console.log(res.choices[0].message.content);

curl — Image Background Removal

curl -X POST https://apim-ai-apis.azure-api.net/v1/image/remove-background/base64 \
  -H "Ocp-Apim-Subscription-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"image\": \"$(base64 -i photo.jpg)\"}"

LLM Gateway — Popular Models

Model	Alias	Price ($/MTok in/out)
Claude Opus 4.6	`claude-opus`	$5 / $25
Claude Sonnet 4.6	`claude-sonnet`	$3 / $15
Claude Haiku 4.5	`claude-haiku`	$1 / $5
DeepSeek R1	`deepseek-r1`	$1.35 / $5.40
DeepSeek V3	`deepseek-v3`	$0.27 / $1.10
Llama 3.3 70B	`llama-3.3-70b`	$0.72 / $0.72
Amazon Nova Pro	`nova-pro`	$0.80 / $3.20
Amazon Nova Micro	`nova-micro`	$0.035 / $0.14
Mistral Large 3	`mistral-large-3`	$2 / $6
Qwen3 32B	`qwen3-32b`	$0.35 / $0.35

Full list: GET /v1/models (113+ models from 17 providers).

Supports: streaming SSE, tool calling, structured output (json_object/json_schema), extended thinking.

Works with: OpenAI SDK, LiteLLM, LangChain, Cline, Cursor, Aider, Continue, SillyTavern, Open WebUI.

MCP Servers (for AI Agents)

3 MCP servers with 20 tools total. Streamable HTTP transport.

Server	URL	Tools
Speech AI	`https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp`	10 tools + 8 resources + 3 prompts
NLP Tools	`https://apim-ai-apis.azure-api.net/mcp/nlp/mcp`	6 tools + 3 resources + 3 prompts
Image Tools	`https://apim-ai-apis.azure-api.net/mcp/image/mcp`	4 tools + 3 resources + 2 prompts

MCP Configuration (Claude Desktop / Cursor / Cline)

{
  "mcpServers": {
    "brainiall-speech": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    },
    "brainiall-nlp": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/nlp/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    },
    "brainiall-image": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/image/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    }
  }
}

Also available on: Smithery (score 95/100) | MCPize | Apify ($0.02/call) | MCP Registry

Examples

File	Description
`python/basic_usage.py`	Speech APIs — assess, transcribe, synthesize
`python/pronunciation_tutor.py`	Interactive pronunciation tutor
`javascript/basic_usage.js`	Node.js examples for speech APIs
`curl/examples.sh`	curl commands for every endpoint
`mcp/claude-desktop-config.json`	MCP config for Claude Desktop
`mcp/cursor-config.json`	MCP config for Cursor IDE
`llms-full.txt`	Complete API reference for LLM consumption

Pricing

Product	Price	Unit
Pronunciation	$0.02	per call
TTS	$0.01-0.03	per 1K chars
STT (compact)	$0.01	per request
Whisper Pro	$0.02	per minute
NLP (any)	$0.001-0.002	per call
Image (any)	$0.003-0.005	per image
LLM Gateway	competitive pricing	per MTok

Credit packages: $5, $10, $25, $50, $100. Portal | Azure Marketplace (search "Brainiall").

License

MIT — Brainiall