{"id":45608922,"url":"https://github.com/fasuizu-br/speech-ai-examples","last_synced_at":"2026-03-05T01:01:15.546Z","repository":{"id":340842581,"uuid":"1162650912","full_name":"fasuizu-br/speech-ai-examples","owner":"fasuizu-br","description":"Production-ready examples for Brainiall Speech AI APIs — Pronunciation Assessment, STT, TTS. Python, JavaScript, curl, and MCP configs.","archived":false,"fork":false,"pushed_at":"2026-02-26T22:33:56.000Z","size":57,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-27T01:17:47.410Z","etag":null,"topics":["ai-agents","api-examples","language-learning","mcp","pronunciation","speech-ai","speech-to-text","text-to-speech"],"latest_commit_sha":null,"homepage":"https://brainiall.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fasuizu-br.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-20T14:26:31.000Z","updated_at":"2026-02-26T22:34:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/fasuizu-br/speech-ai-examples","commit_stats":null,"previous_names":["fasuizu-br/speech-ai-examples"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/fasuizu-br/speech-ai-examples","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasuizu-br%2Fspeech-ai-examples","tags_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/fasuizu-br%2Fspeech-ai-examples/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasuizu-br%2Fspeech-ai-examples/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasuizu-br%2Fspeech-ai-examples/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fasuizu-br","download_url":"https://codeload.github.com/fasuizu-br/speech-ai-examples/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasuizu-br%2Fspeech-ai-examples/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30104218,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T00:38:46.881Z","status":"ssl_error","status_checked_at":"2026-03-05T00:38:45.829Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","api-examples","language-learning","mcp","pronunciation","speech-ai","speech-to-text","text-to-speech"],"created_at":"2026-02-23T17:00:20.403Z","updated_at":"2026-03-05T01:01:15.528Z","avatar_url":"https://github.com/fasuizu-br.png","language":"Python","funding_links":[],"categories":["Docker MCP Toolkit","🖼️ Multimedia Processing","Uncategorized","📦 Other"],"sub_categories":["Multimedia \u0026 Design","Uncategorized"],"readme":"# Brainiall AI 
APIs\n\n[![API Status](https://img.shields.io/badge/API-Live-brightgreen)](https://apim-ai-apis.azure-api.net/v1/health)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n[![MCP Servers](https://img.shields.io/badge/MCP-20_Tools-purple)](https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp)\n[![Azure Marketplace](https://img.shields.io/badge/Azure-Marketplace-0078D4)](https://azuremarketplace.microsoft.com)\n[![Models](https://img.shields.io/badge/LLM_Gateway-113+_Models-orange)](https://apim-ai-apis.azure-api.net/v1/models)\n\nProduction AI APIs for speech, text, image, and LLM inference. Available as REST endpoints and MCP servers for AI agents.\n\n**Base URL:** `https://apim-ai-apis.azure-api.net`\n**Full API reference for LLMs:** [`llms-full.txt`](llms-full.txt) | [`llms.txt`](llms.txt)\n\n## Products\n\n| Product | Endpoints | Latency | Notes |\n|---------|-----------|---------|-------|\n| **Pronunciation Assessment** | `/v1/pronunciation/assess/base64` | \u003c500ms | 17MB ONNX, per-phoneme scoring (39 ARPAbet) |\n| **Text-to-Speech** | `/v1/tts/synthesize` | \u003c1s | 12 voices (American + British), 24kHz WAV |\n| **Speech-to-Text** | `/v1/stt/transcribe/base64` | \u003c500ms | Compact 17MB model, English, word timestamps |\n| **Whisper Pro** | `/v1/whisper/transcribe/base64` | \u003c3s | 99 languages, speaker diarization |\n| **NLP Suite** | `/v1/nlp/{toxicity,sentiment,entities,pii,language}` | \u003c50ms | CPU-only, ONNX, 5 endpoints |\n| **Image Processing** | `/v1/image/{remove-background,upscale,restore-face}/base64` | \u003c3s | GPU (A10), BiRefNet + ESRGAN + GFPGAN |\n| **LLM Gateway** | `/v1/chat/completions` | varies | 113+ models, OpenAI-compatible, streaming |\n\n## Authentication\n\nInclude ONE of these headers in every request:\n\n```\nOcp-Apim-Subscription-Key: YOUR_KEY\nAuthorization: Bearer YOUR_KEY\napi-key: YOUR_KEY\n```\n\nGet API keys at the [portal](https://app.brainiall.com) (GitHub sign-in, purchase 
credits, create key).\n\n## Quick Start\n\n### Python — LLM Gateway (OpenAI SDK)\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"https://apim-ai-apis.azure-api.net/v1\",\n    api_key=\"YOUR_KEY\"\n)\n\nresponse = client.chat.completions.create(\n    model=\"claude-sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n)\nprint(response.choices[0].message.content)\n```\n\n### Python — Pronunciation Assessment\n\n```python\nimport requests, base64\n\n# Read the audio inside a context manager so the file handle is closed\nwith open(\"audio.wav\", \"rb\") as f:\n    audio_b64 = base64.b64encode(f.read()).decode()\nr = requests.post(\n    \"https://apim-ai-apis.azure-api.net/v1/pronunciation/assess/base64\",\n    headers={\"Ocp-Apim-Subscription-Key\": \"YOUR_KEY\"},\n    json={\"audio\": audio_b64, \"text\": \"Hello world\", \"format\": \"wav\"}\n)\nprint(r.json()[\"overallScore\"])  # 0-100\n```\n\n### Python — NLP Pipeline\n\n```python\nimport requests\n\nheaders = {\"Ocp-Apim-Subscription-Key\": \"YOUR_KEY\"}\nbase = \"https://apim-ai-apis.azure-api.net/v1/nlp\"\n\n# Sentiment\nr = requests.post(f\"{base}/sentiment\", headers=headers, json={\"text\": \"I love this!\"})\nprint(r.json())  # {\"label\": \"positive\", \"score\": 0.9987}\n\n# PII detection with redaction\nr = requests.post(f\"{base}/pii\", headers=headers, json={\"text\": \"Email john@acme.com\", \"redact\": True})\nprint(r.json()[\"redacted_text\"])  # \"Email [EMAIL]\"\n```\n\n### Node.js — LLM Gateway\n\n```javascript\nimport OpenAI from \"openai\";\n\nconst client = new OpenAI({\n  baseURL: \"https://apim-ai-apis.azure-api.net/v1\",\n  apiKey: \"YOUR_KEY\"\n});\n\nconst res = await client.chat.completions.create({\n  model: \"claude-sonnet\",\n  messages: [{ role: \"user\", content: \"Hello!\" }]\n});\nconsole.log(res.choices[0].message.content);\n```\n\n### curl — Image Background Removal\n\n```bash\n# tr strips the line wraps that GNU base64 inserts, which would otherwise break the JSON body\ncurl -X POST https://apim-ai-apis.azure-api.net/v1/image/remove-background/base64 \\\n  -H \"Ocp-Apim-Subscription-Key: YOUR_KEY\" \\\n  -H 
\"Content-Type: application/json\" \\\n  -d \"{\\\"image\\\": \\\"$(base64 -i photo.jpg | tr -d '\\n')\\\"}\"\n```\n\n## LLM Gateway — Popular Models\n\n| Model | Alias | Price ($/MTok in/out) |\n|-------|-------|----------------------|\n| Claude Opus 4.6 | `claude-opus` | $5 / $25 |\n| Claude Sonnet 4.6 | `claude-sonnet` | $3 / $15 |\n| Claude Haiku 4.5 | `claude-haiku` | $1 / $5 |\n| DeepSeek R1 | `deepseek-r1` | $1.35 / $5.40 |\n| DeepSeek V3 | `deepseek-v3` | $0.27 / $1.10 |\n| Llama 3.3 70B | `llama-3.3-70b` | $0.72 / $0.72 |\n| Amazon Nova Pro | `nova-pro` | $0.80 / $3.20 |\n| Amazon Nova Micro | `nova-micro` | $0.035 / $0.14 |\n| Mistral Large 3 | `mistral-large-3` | $2 / $6 |\n| Qwen3 32B | `qwen3-32b` | $0.35 / $0.35 |\n\nFull list: `GET /v1/models` (113+ models from 17 providers).\n\nSupports: streaming SSE, tool calling, structured output (`json_object`/`json_schema`), extended thinking.\n\nWorks with: OpenAI SDK, LiteLLM, LangChain, Cline, Cursor, Aider, Continue, SillyTavern, Open WebUI.\n\n## MCP Servers (for AI Agents)\n\n3 MCP servers with 20 tools total. 
Streamable HTTP transport.\n\n| Server | URL | Tools |\n|--------|-----|-------|\n| **Speech AI** | `https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp` | 10 tools + 8 resources + 3 prompts |\n| **NLP Tools** | `https://apim-ai-apis.azure-api.net/mcp/nlp/mcp` | 6 tools + 3 resources + 3 prompts |\n| **Image Tools** | `https://apim-ai-apis.azure-api.net/mcp/image/mcp` | 4 tools + 3 resources + 2 prompts |\n\n### MCP Configuration (Claude Desktop / Cursor / Cline)\n\n```json\n{\n  \"mcpServers\": {\n    \"brainiall-speech\": {\n      \"url\": \"https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp\",\n      \"headers\": { \"Ocp-Apim-Subscription-Key\": \"YOUR_KEY\" }\n    },\n    \"brainiall-nlp\": {\n      \"url\": \"https://apim-ai-apis.azure-api.net/mcp/nlp/mcp\",\n      \"headers\": { \"Ocp-Apim-Subscription-Key\": \"YOUR_KEY\" }\n    },\n    \"brainiall-image\": {\n      \"url\": \"https://apim-ai-apis.azure-api.net/mcp/image/mcp\",\n      \"headers\": { \"Ocp-Apim-Subscription-Key\": \"YOUR_KEY\" }\n    }\n  }\n}\n```\n\nAlso available on: [Smithery](https://smithery.ai/server/fabiosuizu/pronunciation-assessment) (score 95/100) | [MCPize](https://mcpize.com/mcp/pronunciation-assessment) | [Apify](https://apify.com/vivid_astronaut/pronunciation-assessment-mcp) ($0.02/call) | [MCP Registry](https://registry.modelcontextprotocol.io)\n\n## Examples\n\n| File | Description |\n|------|-------------|\n| [`python/basic_usage.py`](python/basic_usage.py) | Speech APIs — assess, transcribe, synthesize |\n| [`python/pronunciation_tutor.py`](python/pronunciation_tutor.py) | Interactive pronunciation tutor |\n| [`javascript/basic_usage.js`](javascript/basic_usage.js) | Node.js examples for speech APIs |\n| [`curl/examples.sh`](curl/examples.sh) | curl commands for every endpoint |\n| [`mcp/claude-desktop-config.json`](mcp/claude-desktop-config.json) | MCP config for Claude Desktop |\n| [`mcp/cursor-config.json`](mcp/cursor-config.json) | MCP config for Cursor IDE 
|\n| [`llms-full.txt`](llms-full.txt) | Complete API reference for LLM consumption |\n\n## Pricing\n\n| Product | Price | Unit |\n|---------|-------|------|\n| Pronunciation | $0.02 | per call |\n| TTS | $0.01-0.03 | per 1K chars |\n| STT (compact) | $0.01 | per request |\n| Whisper Pro | $0.02 | per minute |\n| NLP (any) | $0.001-0.002 | per call |\n| Image (any) | $0.003-0.005 | per image |\n| LLM Gateway | Bedrock pricing | per MTok |\n\nCredit packages: $5, $10, $25, $50, $100. [Portal](https://app.brainiall.com/pricing) | [Azure Marketplace](https://azuremarketplace.microsoft.com) (search \"Brainiall\").\n\n## License\n\n[MIT](LICENSE) — Brainiall\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasuizu-br%2Fspeech-ai-examples","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffasuizu-br%2Fspeech-ai-examples","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasuizu-br%2Fspeech-ai-examples/lists"}