https://github.com/ysdede/tdt-webgpu-demo
- Host: GitHub
- URL: https://github.com/ysdede/tdt-webgpu-demo
- Owner: ysdede
- Created: 2026-03-01T18:39:43.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-05T23:40:40.000Z (3 months ago)
- Last Synced: 2026-03-05T23:41:02.753Z (3 months ago)
- Language: JavaScript
- Size: 311 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Transformers.js v4 Parakeet TDT Demo
Live demo: https://ysdede.github.io/tdt-webgpu-demo/
This project is a React + Vite web app for automatic speech recognition with NeMo Conformer TDT models using [Transformers.js](https://huggingface.co/docs/transformers.js) v4.
## Features
- Run Parakeet-style TDT ASR models in the browser, for example `ysdede/parakeet-tdt-0.6b-v2-onnx-tfjs4`.
- Choose encoder backend and dtype settings (WebGPU or WASM).
- Switch between HF pipeline mode and direct `model.transcribe()` mode with mode-specific controls.
- Compare browser audio prep paths, including a deterministic custom JS path with linear parity resampling and optional higher-quality SRC modes.
- Transcribe sample or uploaded audio.
- Inspect transcript, timestamps, confidence data, raw JSON output, and separated run/audio/model metrics.
- Run a Node.js CLI test script for quick non-UI checks.
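To illustrate what a deterministic linear resampling path can look like, here is a minimal sketch of a linear-interpolation resampler. It is not the demo's actual implementation: the function name `resampleLinear` and the sample rates in the usage line are assumptions made for illustration, and the higher-quality SRC modes are not covered.

```javascript
// Sketch only: plain linear interpolation between neighbouring input samples.
// `resampleLinear` is a hypothetical helper, not part of the demo or Transformers.js.
function resampleLinear(input, fromRate, toRate) {
  if (input.length === 0) return new Float32Array(0);
  if (fromRate === toRate) return Float32Array.from(input);

  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  const last = input.length - 1;

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), last);
    const right = Math.min(left + 1, last); // clamp at the final sample
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}

// Example call with illustrative rates: 48 kHz input resampled to 16 kHz.
const resampled = resampleLinear(new Float32Array([0, 1, 2, 3, 4, 5]), 48000, 16000);
```

Because the output depends only on the input samples and the two rates, the same audio always produces the same result, which is useful when comparing prep paths.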
## Requirements
- Node.js 18 or newer.
- A modern browser. WebGPU (Chrome or Edge) is recommended for best encoder performance.
- Optional local development setup for `transformers.js` as a sibling folder at `../transformers.js`.
## Install
```bash
git clone <repo-url>
cd transformers-v4-parakeet-demo
npm install
```
## Run
### NPM mode (default)
Uses `@huggingface/transformers@next` from npm:
```bash
npm run dev
```
Then open the URL shown by Vite (typically `http://localhost:5173`).
### Local source mode
Use this when you want to test local `transformers.js` changes without publishing a package.
1. Keep both repositories as siblings:
- `.../transformers.js/`
- `.../transformers-v4-parakeet-demo/`
2. Build transformers from the `transformers.js` root:
```bash
cd path/to/transformers.js
pnpm --filter @huggingface/transformers run build
```
3. Start the demo in local mode:
```bash
cd path/to/transformers-v4-parakeet-demo
npm run dev:local
```
`dev:local` sets `TRANSFORMERS_LOCAL=true` and aliases `@huggingface/transformers` to `../transformers.js/packages/transformers/dist/transformers.web.js`.
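The switch between NPM mode and local source mode can be sketched as a small helper that returns Vite `resolve.alias` entries. This is an assumption about how such a config could be written, not a copy of the repository's `vite.config.js`; the helper name `transformersAlias` is hypothetical.

```javascript
// Sketch only: decide whether to alias the package to the local build.
// `transformersAlias` is a hypothetical helper name.
function transformersAlias(env) {
  // NPM mode: no alias, the package resolves from node_modules.
  if (env.TRANSFORMERS_LOCAL !== 'true') return {};

  // Local source mode: point the package name at the sibling build output.
  // A real config would usually resolve this to an absolute path.
  return {
    '@huggingface/transformers':
      '../transformers.js/packages/transformers/dist/transformers.web.js',
  };
}

// Assumed shape of a vite.config.js that uses the helper:
//   export default defineConfig({
//     resolve: { alias: transformersAlias(process.env) },
//   });
```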
## Production Build
```bash
npm run build
npm run preview
```
## GitHub Pages Deployment
Deployment is handled by `.github/workflows/deploy-pages.yml`.
The workflow:
1. Checks out this demo repository.
2. Checks out `transformers.js` into a sibling directory.
3. Builds `@huggingface/transformers` from source with `pnpm`.
4. Builds this demo with `TRANSFORMERS_LOCAL=true`.
5. Publishes `dist/` to GitHub Pages.
Repository settings:
- Enable Pages and select `GitHub Actions` as the source.
- Optional repository variable `TRANSFORMERS_REPO` (default `ysdede/transformers.js`).
- Optional repository variable `TRANSFORMERS_REPO_REF` (default `v4-nemo-conformer-tdt-main-r3`).
- Optional secret `TRANSFORMERS_REPO_TOKEN` if `transformers.js` is private.
Notes:
- GitHub Actions can only build commits that are pushed to GitHub.
- `workflow_dispatch` supports a `transformers_ref` input for one-off branch/tag/SHA overrides.
- The checked-in workflow defaults to the fork branch `v4-nemo-conformer-tdt-main-r3`.
- The Vite base path is set automatically for both project pages and user pages.
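One way to derive the base path for both page types is sketched below. It assumes the build reads the `GITHUB_REPOSITORY` variable that GitHub Actions provides in `owner/repo` form; the helper name `pagesBasePath` is hypothetical and the repository's own logic may differ.

```javascript
// Sketch only: compute a Vite `base` value for GitHub Pages.
// `pagesBasePath` is a hypothetical helper name.
function pagesBasePath(githubRepository) {
  // Outside GitHub Actions (local builds) serve from the root.
  if (!githubRepository) return '/';

  const [owner, repo] = githubRepository.split('/');
  // User pages live in a repository named `<owner>.github.io` and are served from `/`.
  const isUserSite = repo.toLowerCase() === `${owner.toLowerCase()}.github.io`;
  // Project pages are served from `/<repo>/`.
  return isUserSite ? '/' : `/${repo}/`;
}

// Assumed usage in vite.config.js:
//   export default defineConfig({ base: pagesBasePath(process.env.GITHUB_REPOSITORY) });
```

For this repository the project-pages branch would yield `/tdt-webgpu-demo/`, which matches the live demo URL.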
## Hugging Face Spaces Sync
Sync is handled by `.github/workflows/sync-hf-space.yml`.
The workflow:
1. Exports an HF-safe copy of the app to `hf_export/`.
2. Removes GitHub/local-only files and COI service worker wiring.
3. Writes HF-specific `README.md`, `vite.config.js`, and `package.json`.
4. Pushes the result to `https://huggingface.co/spaces/ysdede/tdt-webgpu-demo`.
Repository settings:
- Add secret `HF_TOKEN` with write access to `ysdede/tdt-webgpu-demo`.
## Node CLI Test
Run a quick transcription from the terminal:
```bash
npm run test:node -- --model ysdede/parakeet-tdt-0.6b-v2-onnx-tfjs4 --audio <path-to-wav> --encoder-device webgpu
```
By default, this script loads the local transformers build from `../transformers.js/packages/transformers/dist/transformers.node.mjs`.
Use `--npm` to use the installed npm package instead.
Node CLI input must be WAV (`.wav`).
| Option | Description |
|--------|-------------|
| `--model <id>` | Model ID or local model path |
| `--audio <path>` | WAV file path |
| `--encoder-device <device>` | Encoder device (`cpu` is safer for Node) |
| `--encoder-dtype`, `--decoder-dtype` | Examples: `fp16`, `int8`, `fp32` |
| `--timestamps` | Request word-level timestamps |
| `--loop <n>` | Repeat transcription `n` times |
| `--npm` | Use `@huggingface/transformers` from `node_modules` |
| `--local-module <path>` | Path to a local transformers node build |
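Since the Node CLI accepts only WAV input, a script of this kind has to turn WAV bytes into float samples before calling the model. The following is a minimal sketch of that step, assuming 16-bit PCM data; `decodePcm16Wav` is a hypothetical helper and not the test script's actual code.

```javascript
// Sketch only: decode a 16-bit PCM RIFF/WAVE ArrayBuffer into mono Float32 samples.
// `decodePcm16Wav` is a hypothetical helper name.
function decodePcm16Wav(buffer) {
  const view = new DataView(buffer);
  const tag = (offset) =>
    String.fromCharCode(
      view.getUint8(offset), view.getUint8(offset + 1),
      view.getUint8(offset + 2), view.getUint8(offset + 3));

  if (tag(0) !== 'RIFF' || tag(8) !== 'WAVE') throw new Error('Not a RIFF/WAVE file');

  let sampleRate = 0;
  let channels = 0;
  let samples = null;
  let offset = 12; // first chunk after the RIFF header

  while (offset + 8 <= view.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;

    if (id === 'fmt ') {
      const format = view.getUint16(body, true);
      channels = view.getUint16(body + 2, true);
      sampleRate = view.getUint32(body + 4, true);
      const bits = view.getUint16(body + 14, true);
      if (format !== 1 || bits !== 16) throw new Error('This sketch handles 16-bit PCM only');
    } else if (id === 'data') {
      if (!channels) throw new Error('Expected a fmt chunk before the data chunk');
      const bytes = Math.min(size, view.byteLength - body);
      const frames = Math.floor(bytes / (2 * channels));
      samples = new Float32Array(frames);
      for (let i = 0; i < frames; i++) {
        let sum = 0;
        for (let c = 0; c < channels; c++) {
          sum += view.getInt16(body + 2 * (i * channels + c), true);
        }
        samples[i] = sum / channels / 32768; // average channels, scale to [-1, 1)
      }
    }
    offset = body + size + (size % 2); // chunks are padded to even sizes
  }

  if (!samples) throw new Error('No data chunk found');
  return { sampleRate, samples };
}
```

When the bytes come from `fs.readFileSync`, pass `buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength)` so the `DataView` starts at the beginning of the file.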
## Included Sample
Sample audio file used by the UI: `public/assets/Harvard-L2-1.ogg`.
## Additional Notes
- [Conformer TDT return granularity details](./docs/return-granularity.md)
## UI Notes
- Model configuration supports load mode, model ID, device, dtype, and WASM thread tuning.
- Transcription options include explicit inference mode selection, direct NeMo API flags, pipeline timestamp settings, and audio prep controls.
- Metrics are split by mode: pipeline shows wall-clock run timing plus audio prep, while direct mode shows audio prep plus direct model internals.
- The transcribe workspace is organized into three columns with transcript and API contract visible at the same time.
- Settings and theme preferences are persisted in `localStorage`.
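Persisting settings in `localStorage` typically comes down to a JSON round trip with a fallback to defaults. The sketch below shows one way to do that; the key name `demo-settings` and both helper names are assumptions, not identifiers taken from the app.

```javascript
// Sketch only: JSON-based settings persistence with defaults as a fallback.
// The key and helper names are hypothetical.
const SETTINGS_KEY = 'demo-settings';

function loadSettings(storage, defaults) {
  try {
    const raw = storage.getItem(SETTINGS_KEY);
    // Stored values override defaults; missing keys keep their default value.
    return raw ? { ...defaults, ...JSON.parse(raw) } : { ...defaults };
  } catch {
    return { ...defaults }; // corrupt JSON or blocked storage
  }
}

function saveSettings(storage, settings) {
  try {
    storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
  } catch {
    // Quota or privacy-mode errors are ignored; settings just are not persisted.
  }
}

// In the browser: saveSettings(window.localStorage, { theme: 'dark' });
```

Passing the storage object as a parameter keeps the helpers usable outside the browser, for example with an in-memory stand-in during tests.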
## License
See the repository license.