https://github.com/iamnortey/ninolex-docs
Pronunciation infrastructure for AI voice applications — architecture and documentation
- Host: GitHub
- URL: https://github.com/iamnortey/ninolex-docs
- Owner: iamnortey
- License: MIT
- Created: 2026-01-10T19:56:16.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-05-25T13:11:13.000Z (about 1 month ago)
- Last Synced: 2026-05-28T05:34:49.401Z (28 days ago)
- Topics: api-design, architecture, documentation, nlp, pronunciation, python, tts, typescript, voice, voice-ai
- Homepage: https://github.com/iamnortey/portfolio/blob/main/case-studies/ninolex.md
- Size: 4.88 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
README
# Ninolex
> Pronunciation infrastructure for AI voice applications.
---
## What It Does
Ninolex provides a cross-platform source of truth for how real-world entities should be pronounced in AI voice applications. TTS engines mispronounce brands, products, and names — Ninolex fixes that.
**Pronunciation Resolution Pipeline:** Text → Normalize → Resolve → Export
Entities are stored in a registry with IPA phonemes and can be exported to ElevenLabs, AWS Polly, and Vapi formats.
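The pipeline above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the private implementation: the `REGISTRY` contents, the function names, and the IPA strings are hypothetical. The export step writes a W3C PLS lexicon, which is the lexicon format AWS Polly accepts; whether Ninolex emits exactly this shape is an assumption.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical in-memory registry; the IPA strings are illustrative only.
REGISTRY = {
    "bmw": "ˌbiː ɛm ˈdʌbəljuː",
    "nike": "ˈnaɪki",
}


def normalize(text: str) -> str:
    """Normalize stage: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def resolve(entity: str) -> Optional[str]:
    """Resolve stage: case-insensitive registry lookup, None when unknown."""
    return REGISTRY.get(entity.lower())


def export_pls(entities: List[str], lang: str = "en-US") -> str:
    """Export stage: serialize resolved entities as a PLS lexicon string."""
    ET.register_namespace("", PLS_NS)  # emit PLS as the default namespace
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for raw in entities:
        entity = normalize(raw)
        ipa = resolve(entity)
        if ipa is None:
            continue  # unresolved entities are skipped in this sketch
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = entity
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_pls(["BMW", "  Nike ", "Unknown Brand"]))
```

Each stage is a plain function here, so one stage can be replaced (for example, a database-backed `resolve`) without touching the others.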
---
## The Problem
- TTS engines mispronounce brands: "BMW" becomes gibberish
- Alphanumeric product names have no obvious spoken form: "WH-1000XM5"
- Domain terminology is mispronounced: medical, legal, and automotive terms
- Cultural names are consistently butchered
---
## Stack
| Layer | Technology |
|-------|------------|
| **Frontend** | Next.js |
| **Backend** | Next.js API Routes, Modal |
| **Resolution** | CMUdict, Phonemizer |
| **Database** | Supabase |
| **Export** | ElevenLabs, AWS Polly, Vapi |
---
## Key Architecture Patterns
- **Two-tier resolution** — CMUdict primary, Phonemizer fallback
- **Alphanumeric expansion** — "WH-1000XM5" → "W H one thousand X M five"
- **Vertical pack architecture** — domain-specific packs (AutoLex, DineLex)
- **API-first design** with versioning
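Two of the patterns above, alphanumeric expansion and two-tier resolution, can be illustrated with a short Python sketch. It is a hedged approximation, not the platform's code: all function names are hypothetical, the `primary` mapping stands in for a CMUdict lookup, and the `fallback` callable stands in for a Phonemizer call. The rule of expanding only tokens that mix letters and digits is also an assumption.

```python
import re
from typing import Callable, Mapping, Optional, Tuple

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999999 in English."""
    if not 0 <= n < 1_000_000:
        raise ValueError("supported range is 0..999999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (f"-{ONES[ones]}" if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return f"{ONES[hundreds]} hundred" + (f" {number_to_words(rest)}" if rest else "")
    thousands, rest = divmod(n, 1000)
    return f"{number_to_words(thousands)} thousand" + (f" {number_to_words(rest)}" if rest else "")


def expand_alphanumeric(text: str) -> str:
    """Expand tokens that mix letters and digits; leave other tokens as-is."""
    out = []
    for token in text.split():
        if not (re.search(r"[A-Za-z]", token) and re.search(r"[0-9]", token)):
            out.append(token)
            continue
        for run in re.findall(r"[A-Za-z]+|[0-9]+", token):
            if run.isdigit():
                # Long runs and runs with a leading zero are read digit by digit.
                if len(run) > 6 or (len(run) > 1 and run[0] == "0"):
                    out.extend(ONES[int(d)] for d in run)
                else:
                    out.append(number_to_words(int(run)))
            else:
                out.extend(run.upper())  # spell letter runs one letter at a time
    return " ".join(out)


def resolve_two_tier(
    word: str,
    primary: Mapping[str, str],
    fallback: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Try the dictionary tier first, then the fallback; report which tier hit."""
    hit = primary.get(word.lower())
    if hit is not None:
        return hit, "primary"
    guess = fallback(word)
    return (guess, "fallback") if guess else (None, "unresolved")
```

With these definitions, `expand_alphanumeric("WH-1000XM5")` returns `"W H one thousand X M five"`, matching the example in the list. Returning the tier alongside the pronunciation makes it easy to flag fallback results for later review.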
---
## API Example
```http
POST /api/v1/resolve
Content-Type: application/json

{
  "entities": ["BMW X5", "WH-1000XM5"],
  "format": "elevenlabs"
}
```
The response includes IPA phonemes and a downloadable dictionary file.
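For a Python client, the same request can be built with the standard library. This is a sketch only: `BASE_URL` is a placeholder host, `build_resolve_request` is a hypothetical helper, and no authentication header is shown because this README does not specify one. The response schema is not documented here either, so the sketch stops at constructing the request.

```python
import json
import urllib.request
from typing import Iterable

BASE_URL = "https://api.example.com"  # placeholder host, not the real endpoint


def build_resolve_request(
    entities: Iterable[str],
    fmt: str = "elevenlabs",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/resolve request."""
    payload = {"entities": list(entities), "format": fmt}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/resolve",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_resolve_request(["BMW X5", "WH-1000XM5"])
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To send it, pass the request to `urllib.request.urlopen(req)` and decode the body with `json.load`, once a real base URL and credentials are available.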
---
## Documentation
| Document | Description |
|----------|-------------|
| [Architecture](./ARCHITECTURE.md) | Pipeline design and data flow |
| [Case Study](https://github.com/iamnortey/portfolio/blob/main/case-studies/ninolex.md) | Full project overview |
| [Security](./docs/security.md) | API authentication and access |
---
## Open Source Component
The [ninolex-gh](https://github.com/iamnortey/ninolex-gh) repository contains the open Ghanaian pronunciation dictionary — demonstrating the data format and linguistic approach used in the full platform.
---
## Access
The core implementation is in a **private repository** to protect intellectual property. This repository contains architecture documentation, API design docs, pipeline diagrams, and integration examples.
For API access or technical discussions: [LinkedIn](https://linkedin.com/in/inortey/)
---
## Related
- [Ninolex-GH (Open Source)](https://github.com/iamnortey/ninolex-gh) — open Ghanaian pronunciation dictionary
- [Portfolio](https://github.com/iamnortey/portfolio) — all case studies and architecture samples
- [Case Study](https://github.com/iamnortey/portfolio/blob/main/case-studies/ninolex.md) — full project deep-dive