{"id":50916693,"url":"https://github.com/sidick/narrator.wyoming","last_synced_at":"2026-06-16T16:01:46.988Z","repository":{"id":364549804,"uuid":"1260719068","full_name":"sidick/narrator.wyoming","owner":"sidick","description":"Drop-in Amiga narrator.device replacement with modern neural voices — forwards text to a Piper TTS server over the Wyoming protocol, plays via AHI.","archived":false,"fork":false,"pushed_at":"2026-06-13T14:12:47.000Z","size":69,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-13T14:13:24.516Z","etag":null,"topics":["ahi","amiga","amigaos","aminet","m68k","narrator-device","neural-tts","piper","piper-tts","pistorm","retrocomputing","text-to-speech","tts","wyoming-protocol"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sidick.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-05T19:51:48.000Z","updated_at":"2026-06-13T14:12:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sidick/narrator.wyoming","commit_stats":null,"previous_names":["sidick/narrator.wyoming"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/sidick/narrator.wyoming","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidick%2Fnarrator.wyoming","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/sidick%2Fnarrator.wyoming/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidick%2Fnarrator.wyoming/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidick%2Fnarrator.wyoming/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sidick","download_url":"https://codeload.github.com/sidick/narrator.wyoming/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidick%2Fnarrator.wyoming/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34412795,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ahi","amiga","amigaos","aminet","m68k","narrator-device","neural-tts","piper","piper-tts","pistorm","retrocomputing","text-to-speech","tts","wyoming-protocol"],"created_at":"2026-06-16T16:01:45.457Z","updated_at":"2026-06-16T16:01:46.972Z","avatar_url":"https://github.com/sidick.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# narrator.wyoming\n\n[![CI](https://github.com/sidick/narrator.wyoming/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/sidick/narrator.wyoming/actions/workflows/ci.yml)\n\nA drop-in replacement for the Amiga 
**`narrator.device`** that speaks with a\nmodern **neural voice** instead of the old Paula formant synthesizer. It forwards\ntext to a [Piper](https://github.com/rhasspy/piper) text-to-speech server over the\n[Wyoming protocol](https://github.com/OHF-Voice/wyoming) (plain TCP) and plays the\nreturned PCM through **AHI**.\n\nBecause it replaces `narrator.device` by name in `DEVS:`, existing Amiga software\ngets neural speech transparently — **the stock `Say` command works with no\nmodification**.\n\n\u003e Aimed at **PiStorm / emu68k**-accelerated and other fast 68k Amigas, where\n\u003e the emulated CPU makes the network round-trip practical. Developed and\n\u003e tested under the Amiberry emulator; see CLAUDE.md and docs/ for the gory\n\u003e details.\n\n## Status\n\nWorks end-to-end on-target:\n\n- `Say \"hello\"` → neural speech out of AHI. ✅\n- Programs that open `narrator.device` directly (`CMD_WRITE`) get neural speech. ✅\n- Warm round-trip latency ~440 ms first-audio on an emulated 68020 (GOOD). ✅\n- `volume` and `sex`→voice mapping; graceful failure if the server is down. ✅\n- Persistent connection held by the device's own task; `CMD_WRITE`s are async\n  (`SendIO` returns immediately) and queued FIFO over the one socket. ✅\n\nNot done by design (see \"Limitations\"): `CMD_READ` mouth-shape/word sync, and\nper-request rate/pitch (a Wyoming limitation).\n\n## How it works\n\n```\nAmiga app ──IOSpeech/CMD_WRITE──▶ narrator.device (this)\n   (Say) ──English──▶ translator.library (this, pass-through)──┘\n                                          │  Wyoming over TCP (bsdsocket)\n                                          ▼\n                                   Piper TTS server (your LAN)\n                                          │  raw 16-bit PCM\n                                          ▼\n                                   narrator.device ──byte-swap──▶ AHI playback\n```\n\n`Say` translates English→phonemes via `translator.library` before writing to\nnarrator. 
Piper wants **English**, so narrator.wyoming ships a **pass-through\n`translator.library`** whose `Translate()` returns the English unchanged — so\nEnglish reaches our device, which forwards it to Piper.\n\n## Requirements\n\n**Amiga side**\n- **AHI v4+** installed, with a Unit 0 audio mode configured (the `paula.audio`\n  driver works under emulation).\n- A **TCP/IP stack** (Roadshow / AmiTCP) providing `bsdsocket.library` — or\n  Amiberry's built-in `bsdsocket.library` emulation.\n- Fast 68k (PiStorm/emu68k or ~68030/40MHz+). No FPU required. No TLS/AmiSSL.\n- **Optional**: `codesets.library` (Aminet: `util/libs/codesets`) for native-locale\n  text input. ASCII and pre-encoded UTF-8 work without it; ISO-8859-1 (accents,\n  umlauts) needs it or Piper may reject the request.\n\n**Server side**\n- A **Piper** TTS server reachable on your LAN with the **Wyoming** protocol\n  enabled (default port **10200**), e.g. the `wyoming-piper` add-on / container\n  (see below).\n\n## Running a Piper/Wyoming server (Docker)\n\nnarrator.wyoming needs a Piper TTS server speaking the Wyoming protocol on your\nLAN. The easiest way to get one is the official **`rhasspy/wyoming-piper`**\ncontainer. Run it on any always-on machine (a NAS, a mini-PC, or the same\nhost you build on):\n\n```\ndocker run -d --name wyoming-piper \\\n  -p 10200:10200 \\\n  -v wyoming-piper-data:/data \\\n  rhasspy/wyoming-piper \\\n  --voice en_US-lessac-medium\n```\n\n- `-p 10200:10200` — exposes the Wyoming TCP port the device connects to.\n- `-v wyoming-piper-data:/data` — persists downloaded voice models across\n  restarts (so they aren't re-fetched each time).\n- `--voice` — the startup voice; Piper auto-downloads it on first run. 
The\n  server listens on `tcp://0.0.0.0:10200` by default.\n\nThen point the device at that machine's LAN IP in `ENV:narrator.wyoming`:\n\n```\nhost 192.168.1.50\nport 10200\n```\n\n### Multiple voices\n\nA single wyoming-piper instance serves **more than one voice** — it loads any\nvoice named in a request on demand (downloading it the first time). So `voice`,\n`voice_male`, and `voice_female` can each point at a different voice against the\n*same* server, and `Say`'s male/female option just selects between them. Browse\nvoices in the [Piper voice samples](https://rhasspy.github.io/piper-samples/) and\nuse the exact model name (e.g. `en_US-ryan-high`).\n\n### Docker Compose\n\nThe same thing as a `docker-compose.yml`:\n\n```yaml\nservices:\n  wyoming-piper:\n    image: rhasspy/wyoming-piper\n    command: --voice en_US-lessac-medium\n    ports:\n      - \"10200:10200\"\n    volumes:\n      - wyoming-piper-data:/data\n    restart: unless-stopped\nvolumes:\n  wyoming-piper-data:\n```\n\n### Verify it before involving the Amiga\n\nFrom your build host, point a native test build at the server — no Amiga needed:\n\n```\nmake host\n./build/host/saytest --voice en_US-lessac-medium --out say.wav 192.168.1.50 10200\n```\n\nA playable `say.wav` confirms the server works. 
(The first request is slow while\nPiper loads the voice model; warm requests are fast — use `wyomingtest --runs 3`\nto see the warm latency.)\n\n## Install\n\nCopy the two built artifacts into place:\n\n```\nCopy narrator.device     DEVS:narrator.device\nCopy translator.library  LIBS:translator.library\n```\n\nCreate the prefs file **`ENV:narrator.wyoming`** (and `ENVARC:` to persist it):\n\n```\nhost 192.168.1.50      ; your Piper/Wyoming server\nport 10200\nvoice         en_US-lessac-medium   ; optional; must exist on your server\nvoice_male    en_US-ryan-high       ; used when the caller sets sex = MALE\nvoice_female  en_US-amy-medium      ; used when the caller sets sex = FEMALE\n```\n\nEvery key has a built-in fallback, so all of them are technically optional, but\n`host` falls back to `127.0.0.1`, so set it unless your server is reachable\nthere. Omit the `voice*` keys to use the server's default voice. The fallbacks\nalso apply if the whole file is missing. See `config/narrator.wyoming.example`.\n\n\u003e **⚠️ Persisting across reboots.** `ENV:` lives in RAM and is rebuilt from\n\u003e `ENVARC:` at every boot, so editing **only** `ENV:narrator.wyoming` lasts until\n\u003e the next reboot. To make your settings permanent, also save the file to\n\u003e `ENVARC:`:\n\u003e\n\u003e ```\n\u003e Copy ENV:narrator.wyoming ENVARC:narrator.wyoming\n\u003e ```\n\u003e\n\u003e (The device reads `ENV:` at runtime; `ENVARC:` is just the on-disk copy the\n\u003e system restores `ENV:` from at startup.)\n\nThen just use speech as normal:\n\n```\nSay \"Hello from the Amiga.\"\n```\n\n## Works with other narrator.device software\n\nAnything that speaks through `narrator.device` (+ `translator.library`) gets the\nneural voice for free — that's the whole point of replacing them by name. A nice\ncompanion is **[speak-handler](https://aminet.net/package/util/sys/speak-handler)**\n(Alexander Fritsch, v39.1), a from-scratch native replacement for the Commodore\n`SPEAK:` handler that drives `narrator.device`/`translator.library`. 
Mount its\n`SPEAK:` and you can pipe text straight to neural speech, e.g.:\n\n```\nType README.txt TO SPEAK:\necho \"Hello from the Amiga\" \u003eSPEAK:\n```\n\nIt pairs well with narrator.wyoming as an up-to-date, native speech handler.\n\n## Build\n\nThe Amiga binaries cross-compile with **Bebbo's m68k-amigaos GCC**, shipped in the\n`stefanreinauer/amiga-gcc:latest` Docker image — no local toolchain needed:\n\n```\nmake docker      # build build/amiga/{narrator.device, translator.library, ...}\nmake host        # native build/host/{wyomingtest, saytest} for protocol testing\nmake clean\n```\n\nBuild outputs:\n- `narrator.device` — the device (libnix `devinit.o`, freestanding).\n- `translator.library` — the pass-through translator (libnix `libinit.o`).\n- `wyomingtest` — latency probe.\n- `saytest` — standalone text→Wyoming→audio pipeline (`.wav` on host).\n- `devtest` — opens and drives the device directly.\n\n`make host` reads `config/narrator.wyoming` (copy from\n`config/narrator.wyoming.example`) for the server\naddress; on Amiga the device reads `ENV:narrator.wyoming` at runtime instead.\n\n## Configuration reference (`ENV:narrator.wyoming`)\n\n`key value` per line; `#` or `;` start comments (inline comments after a value\nare allowed). The device reads `ENV:` at runtime — remember to also copy the\nfile to `ENVARC:` so edits survive a reboot (see the persistence note under\n[Install](#install)).\n\n| key | meaning | default |\n|---|---|---|\n| `host` | Piper/Wyoming server address | `127.0.0.1` |\n| `port` | server port | `10200` |\n| `voice` | default Piper voice name | server default |\n| `voice_male` | voice for `narrator_rb.sex == MALE` | `voice` |\n| `voice_female` | voice for `narrator_rb.sex == FEMALE` | `voice` |\n| `ahi_unit` | `ahi.device` unit to play through (for multiple AHI units) | `0` |\n| `gain` | AHI output level as percent of full scale (1–100). 
Lower = more headroom against Piper's hot peaks | `80` |\n| `smooth` | high-cut strength (cascade of 1-tap averagers) to tame AHI/Paula sibilance. 0 = off | `2` |\n| `split_words` | split long input into ~N-word pipelined chunks for faster start (0 = off) | `0` |\n\n`sex` is set at runtime by the caller (e.g. `Say`'s male/female option); the\nprefs only say which Piper voice each maps to. `volume` (0–64) maps to AHI\noutput level. `rate`/`pitch` are accepted but ignored — the Wyoming `synthesize`\nrequest has no per-request rate/pitch knob (those are properties of the Piper\nvoice, set server-side).\n\n\u003e **Prefs are snapshotted at `OpenDevice`.** The device reads `ENV:narrator.wyoming`\n\u003e once when its task starts (so edits don't take effect mid-open). Interactive\n\u003e `Say` opens a fresh device for each invocation, so live edits keep working\n\u003e in practice; a long-running consumer that holds the device open across edits\n\u003e will keep using the values it loaded at `OpenDevice`. Close and reopen the\n\u003e device to pick up new prefs.\n\n\u003e **Text codeset.** The Wyoming JSON requires UTF-8. The device auto-detects\n\u003e the caller's text: valid UTF-8 (including pure ASCII) is passed through,\n\u003e and anything else is transcoded from ISO-8859-1 to UTF-8 via the optional\n\u003e `codesets.library` (Aminet) if it is installed. If `codesets.library` is\n\u003e not present, non-UTF-8 input is passed through unchanged and Piper may\n\u003e reject it — install codesets.library to handle native-locale text.\n\n### Faster start on long input (`split_words`)\n\nPiper synthesizes one **sentence** at a time and only emits audio once the whole\nsentence is rendered, so a *single long sentence* has a long wait before any\nsound (≈2.3s for a 40-word run-on); multi-sentence text already starts fast\nbecause the device plays sentence 1 while the rest render. Set `split_words` to a\nword count (e.g. 
`12`) to break long input into smaller chunks that are sent\n**pipelined** on the one connection — playback of the first chunk starts in\n~¼s while the server renders the rest. Splits happen **only at punctuation**\n(full stops always; commas/semicolons once a chunk passes the word target), so\nthe joins land in the natural pauses Piper already inserts.\n\n\u003e **Caveat — chunk-boundary glitches.** AHI's read-ahead is only two buffers\n\u003e (~370 ms total at 22050 Hz/16-bit/mono). If per-chunk Piper synthesis +\n\u003e network transfer ever takes longer than the lookahead drains, you'll hear\n\u003e a brief stutter or click at the chunk seam. Under the Amiberry emulator's\n\u003e bsdsocket emulation this is observable on short chunk sizes; on real\n\u003e hardware with PiStorm + a TCP stack like Roadshow the headroom is larger.\n\u003e If you hear boundary artefacts, raise `split_words` (fewer, longer chunks)\n\u003e or drop back to `0` (one request, slower start, but no seams).\n\nA genuinely unpunctuated run-on won't split (it stays one request — slower\nstart, but never a mid-phrase gap). Leave it at `0` for the most faithful\nprosody.\n\n## Limitations / future work\n\n- **`CMD_READ` (mouth shapes) and word/syllable sync are intentionally out of\n  scope.** `CMD_READ` returns `ND_NoWrite` (so talking-head software gets no\n  lip-sync data but doesn't hang). Real sync isn't meaningfully achievable here:\n  we forward English to Piper and get back PCM with no phoneme timing, so mouth\n  shapes could only be faked from audio amplitude, and accurate lip-sync would\n  need playback to run concurrently with each write. Niche value, can't be done\n  properly — deliberately skipped.\n- **Direct ARPABET phoneme input is silently discarded.** Classic\n  `narrator.device` takes phonemes; our drop-in expects English (via the\n  pass-through `translator.library`, which `Say` and standard callers use). 
A\n  program that writes raw phonemes straight to the device can't be served —\n  Piper takes text, not phonemes, over Wyoming — so the device accepts the write\n  but plays nothing rather than emit gibberish. Detection is conservative (no\n  lowercase + a stress digit like `EH1`), so ordinary English, including all-caps\n  words, is always spoken. See `docs/narrator-device.md`.\n- **No `rate`/`pitch` control** (Wyoming limitation, above).\n- **First request after the server is idle is slow** (~1.8 s) while Piper loads\n  the voice model; subsequent (warm) requests are fast.\n- Validated under **Amiberry** only; real PiStorm hardware untested.\n\n## Repository layout\n\n- `src/` — C sources (device, translator, engine, standalone test programs).\n- `Makefile` — host + container (Amiga) builds.\n- `CLAUDE.md` — guidance + the hard-won Amiga/toolchain lessons (worth reading\n  before hacking on this).\n- `docs/narrator-device.md`, `docs/translator-library.md` — API references\n  distilled from the Amiga RKM and the AmigaOS wiki.\n\n## Development\n\nThis project was developed with the assistance of AI tooling (Anthropic's\nClaude, via Claude Code), under human direction and validated on-target under\nthe Amiberry emulator.\n\n## License\n\nMIT — see `LICENSE`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsidick%2Fnarrator.wyoming","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsidick%2Fnarrator.wyoming","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsidick%2Fnarrator.wyoming/lists"}