https://github.com/fasuizu-br/speech-ai-examples

Production-ready examples for Brainiall Speech AI APIs — Pronunciation Assessment, STT, TTS. Python, JavaScript, curl, and MCP configs.
https://github.com/fasuizu-br/speech-ai-examples

ai-agents api-examples language-learning mcp pronunciation speech-ai speech-to-text text-to-speech

Last synced: 3 months ago
JSON representation

Production-ready examples for Brainiall Speech AI APIs — Pronunciation Assessment, STT, TTS. Python, JavaScript, curl, and MCP configs.

Host: GitHub
URL: https://github.com/fasuizu-br/speech-ai-examples
Owner: fasuizu-br
License: mit
Created: 2026-02-20T14:26:31.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-02-26T22:33:56.000Z (3 months ago)
Last Synced: 2026-02-27T01:17:47.410Z (3 months ago)
Topics: ai-agents, api-examples, language-learning, mcp, pronunciation, speech-ai, speech-to-text, text-to-speech
Language: Python
Homepage: https://brainiall.com
Size: 55.7 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-mcp-lists - GitHub
awesome-mcp-servers - fasuizu-br/speech-ai-examples - level scoring, 17MB model, <300ms), text-to-speech, and speech-to-text. 8 tools for AI agents building language learning, accessibility, and voice applications. (🖼️ Multimedia Processing)
awesome-mcp-list - fasuizu-br/speech-ai-examples - br/speech-ai-examples?style=social)](https://github.com/fasuizu-br/speech-ai-examples): Speech AI MCP server with pronunciation assessment (phoneme-level scoring, 17MB model, <300ms), text-to-speech, and speech-to-text. 8 tools for language learning, accessibility, and voice applications. (Uncategorized / Uncategorized)

README

          # Brainiall AI APIs

[![API Status](https://img.shields.io/badge/API-Live-brightgreen)](https://apim-ai-apis.azure-api.net/v1/health)

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

[![MCP Servers](https://img.shields.io/badge/MCP-20_Tools-purple)](https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp)

[![Azure Marketplace](https://img.shields.io/badge/Azure-Marketplace-0078D4)](https://azuremarketplace.microsoft.com)

[![Models](https://img.shields.io/badge/LLM_Gateway-113+_Models-orange)](https://apim-ai-apis.azure-api.net/v1/models)

Production AI APIs for speech, text, image, and LLM inference. Available as REST endpoints and MCP servers for AI agents.

**Base URL:** `https://apim-ai-apis.azure-api.net`

**Full API reference for LLMs:** [`llms-full.txt`](llms-full.txt) | [`llms.txt`](llms.txt)

## Products

| Product | Endpoints | Latency | Notes |

|---------|-----------|---------|-------|

| **Pronunciation Assessment** | `/v1/pronunciation/assess/base64` | <500ms | 17MB ONNX, per-phoneme scoring (39 ARPAbet) |

| **Text-to-Speech** | `/v1/tts/synthesize` | <1s | 12 voices (American + British), 24kHz WAV |

| **Speech-to-Text** | `/v1/stt/transcribe/base64` | <500ms | Compact 17MB model, English, word timestamps |

| **Whisper Pro** | `/v1/whisper/transcribe/base64` | <3s | 99 languages, speaker diarization |

| **NLP Suite** | `/v1/nlp/{toxicity,sentiment,entities,pii,language}` | <50ms | CPU-only, ONNX, 5 endpoints |

| **Image Processing** | `/v1/image/{remove-background,upscale,restore-face}/base64` | <3s | GPU (A10), BiRefNet + ESRGAN + GFPGAN |

| **LLM Gateway** | `/v1/chat/completions` | varies | 113+ models, OpenAI-compatible, streaming |

## Authentication

Include ONE of these headers in every request:

```

Ocp-Apim-Subscription-Key: YOUR_KEY

Authorization: Bearer YOUR_KEY

api-key: YOUR_KEY

```

Get API keys at the [portal](https://app.brainiall.com) (GitHub sign-in, purchase credits, create key).

## Quick Start

### Python — LLM Gateway (OpenAI SDK)

```python

from openai import OpenAI

client = OpenAI(

    base_url="https://apim-ai-apis.azure-api.net/v1",

    api_key="YOUR_KEY"

)

response = client.chat.completions.create(

    model="claude-sonnet",

    messages=[{"role": "user", "content": "Hello!"}]

)

print(response.choices[0].message.content)

```

### Python — Pronunciation Assessment

```python

import requests, base64

audio_b64 = base64.b64encode(open("audio.wav", "rb").read()).decode()

r = requests.post(

    "https://apim-ai-apis.azure-api.net/v1/pronunciation/assess/base64",

    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},

    json={"audio": audio_b64, "text": "Hello world", "format": "wav"}

)

print(r.json()["overallScore"])  # 0-100

```

### Python — NLP Pipeline

```python

import requests

headers = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}

base = "https://apim-ai-apis.azure-api.net/v1/nlp"

# Sentiment

r = requests.post(f"{base}/sentiment", headers=headers, json={"text": "I love this!"})

print(r.json())  # {"label": "positive", "score": 0.9987}

# PII detection with redaction

r = requests.post(f"{base}/pii", headers=headers, json={"text": "Email john@acme.com", "redact": True})

print(r.json()["redacted_text"])  # "Email [EMAIL]"

```

### Node.js — LLM Gateway

```javascript

import OpenAI from "openai";

const client = new OpenAI({

  baseURL: "https://apim-ai-apis.azure-api.net/v1",

  apiKey: "YOUR_KEY"

});

const res = await client.chat.completions.create({

  model: "claude-sonnet",

  messages: [{ role: "user", content: "Hello!" }]

});

console.log(res.choices[0].message.content);

```

### curl — Image Background Removal

```bash

curl -X POST https://apim-ai-apis.azure-api.net/v1/image/remove-background/base64 \

  -H "Ocp-Apim-Subscription-Key: YOUR_KEY" \

  -H "Content-Type: application/json" \

  -d "{\"image\": \"$(base64 -i photo.jpg)\"}"

```

## LLM Gateway — Popular Models

| Model | Alias | Price ($/MTok in/out) |

|-------|-------|----------------------|

| Claude Opus 4.6 | `claude-opus` | $5 / $25 |

| Claude Sonnet 4.6 | `claude-sonnet` | $3 / $15 |

| Claude Haiku 4.5 | `claude-haiku` | $1 / $5 |

| DeepSeek R1 | `deepseek-r1` | $1.35 / $5.40 |

| DeepSeek V3 | `deepseek-v3` | $0.27 / $1.10 |

| Llama 3.3 70B | `llama-3.3-70b` | $0.72 / $0.72 |

| Amazon Nova Pro | `nova-pro` | $0.80 / $3.20 |

| Amazon Nova Micro | `nova-micro` | $0.035 / $0.14 |

| Mistral Large 3 | `mistral-large-3` | $2 / $6 |

| Qwen3 32B | `qwen3-32b` | $0.35 / $0.35 |

Full list: `GET /v1/models` (113+ models from 17 providers).

Supports: streaming SSE, tool calling, structured output (`json_object`/`json_schema`), extended thinking.

Works with: OpenAI SDK, LiteLLM, LangChain, Cline, Cursor, Aider, Continue, SillyTavern, Open WebUI.

## MCP Servers (for AI Agents)

3 MCP servers with 20 tools total. Streamable HTTP transport.

| Server | URL | Tools |

|--------|-----|-------|

| **Speech AI** | `https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp` | 10 tools + 8 resources + 3 prompts |

| **NLP Tools** | `https://apim-ai-apis.azure-api.net/mcp/nlp/mcp` | 6 tools + 3 resources + 3 prompts |

| **Image Tools** | `https://apim-ai-apis.azure-api.net/mcp/image/mcp` | 4 tools + 3 resources + 2 prompts |

### MCP Configuration (Claude Desktop / Cursor / Cline)

```json

{

  "mcpServers": {

    "brainiall-speech": {

      "url": "https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp",

      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }

    },

    "brainiall-nlp": {

      "url": "https://apim-ai-apis.azure-api.net/mcp/nlp/mcp",

      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }

    },

    "brainiall-image": {

      "url": "https://apim-ai-apis.azure-api.net/mcp/image/mcp",

      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }

    }

  }

}

```

Also available on: [Smithery](https://smithery.ai/server/fabiosuizu/pronunciation-assessment) (score 95/100) | [MCPize](https://mcpize.com/mcp/pronunciation-assessment) | [Apify](https://apify.com/vivid_astronaut/pronunciation-assessment-mcp) ($0.02/call) | [MCP Registry](https://registry.modelcontextprotocol.io)

## Examples

| File | Description |

|------|-------------|

| [`python/basic_usage.py`](python/basic_usage.py) | Speech APIs — assess, transcribe, synthesize |

| [`python/pronunciation_tutor.py`](python/pronunciation_tutor.py) | Interactive pronunciation tutor |

| [`javascript/basic_usage.js`](javascript/basic_usage.js) | Node.js examples for speech APIs |

| [`curl/examples.sh`](curl/examples.sh) | curl commands for every endpoint |

| [`mcp/claude-desktop-config.json`](mcp/claude-desktop-config.json) | MCP config for Claude Desktop |

| [`mcp/cursor-config.json`](mcp/cursor-config.json) | MCP config for Cursor IDE |

| [`llms-full.txt`](llms-full.txt) | Complete API reference for LLM consumption |

## Pricing

| Product | Price | Unit |

|---------|-------|------|

| Pronunciation | $0.02 | per call |

| TTS | $0.01-0.03 | per 1K chars |

| STT (compact) | $0.01 | per request |

| Whisper Pro | $0.02 | per minute |

| NLP (any) | $0.001-0.002 | per call |

| Image (any) | $0.003-0.005 | per image |

| LLM Gateway | Bedrock pricing | per MTok |

Credit packages: $5, $10, $25, $50, $100. [Portal](https://app.brainiall.com/pricing) | [Azure Marketplace](https://azuremarketplace.microsoft.com) (search "Brainiall").

## License

[MIT](LICENSE) — Brainiall

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/fasuizu-br/speech-ai-examples

Awesome Lists containing this project

README