https://github.com/m4n5ter/tokstream
A token streaming simulator powered by Hugging Face tokenizers. It downloads a tokenizer from HF Hub and generates tokens at a target rate, with live stats for target vs actual throughput.
- Host: GitHub
- URL: https://github.com/m4n5ter/tokstream
- Owner: M4n5ter
- License: mit
- Created: 2025-12-20T10:32:22.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-12-20T10:41:17.000Z (6 months ago)
- Last Synced: 2025-12-22T15:51:05.547Z (6 months ago)
- Topics: simulator, tokenizer
- Language: JavaScript
- Homepage:
- Size: 1010 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# tokstream
[中文](README_ZH.md) | English
A token streaming simulator powered by Hugging Face tokenizers. It downloads a tokenizer from HF Hub and generates tokens at a target rate, with live stats for target vs actual throughput.
## Highlights
- Rust CLI with high‑precision pacing (sleep + spin)
- Web demo (WASM) and npx executable
- Random English / Chinese generation and text replay
- Configurable filtering strategy
- Target vs actual tokens/sec stats
- Workspace layout with reusable core
## Project Layout
```
.
├── crates
│ ├── tokstream-core # tokenizer engine
│ ├── tokstream-cli # Rust CLI
│ └── tokstream-wasm # wasm-bindgen bindings
├── npm # npx CLI + web demo
├── bin # npm bin entry
├── Cargo.toml # workspace
├── justfile
├── package.json
├── README.md
└── README_ZH.md
```
## Rust CLI
### Quick Start
```bash
cargo run -p tokstream-cli -- --model gpt2 --mode english --rate 8
cargo run -p tokstream-cli -- --model gpt2 --mode chinese --rate 8
cargo run -p tokstream-cli -- --model gpt2 --mode text --text "Hello" --repeat 3
```
### Install from crates.io
```bash
cargo install tokstream-cli
# or
cargo binstall tokstream-cli
```
Notes:
- The binary name is `tokstream` after installation.
- `cargo binstall` will compile from source unless you provide prebuilt release assets and set `repository` in the crate metadata.
### Model & Auth
- `--model ` HF Hub model id (default: `gpt2`)
- `--revision ` HF revision (default: `main`)
- `--hf-token ` access token for private models
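As a rough illustration of what `--model`, `--revision`, and `--hf-token` map to, the sketch below builds the Hub download URL and the auth header. It assumes the tokenizer lives in a `tokenizer.json` file at the repo root and is fetched through the Hub's `resolve` endpoint; the helper names `tokenizer_url` and `auth_header` are hypothetical and not part of tokstream.

```rust
/// Builds the Hub URL for a model's `tokenizer.json` at a given revision.
/// Assumption: the file is named `tokenizer.json` and sits at the repo root.
fn tokenizer_url(model: &str, revision: &str) -> String {
    format!("https://huggingface.co/{model}/resolve/{revision}/tokenizer.json")
}

/// Builds the `Authorization` header value for a private model.
/// Returns `None` when no token was supplied (public models need none).
fn auth_header(hf_token: Option<&str>) -> Option<String> {
    hf_token.map(|t| format!("Bearer {t}"))
}
```

With the defaults, `tokenizer_url("gpt2", "main")` yields `https://huggingface.co/gpt2/resolve/main/tokenizer.json`. Revisions containing special characters would need URL encoding, which this sketch leaves out.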
### Modes
- `--mode ` generation mode: `english`, `chinese`, or `text`
- `--text ` text mode input
- `--text-file ` text mode input from file
- `--loop-text` loop text forever
- `--repeat ` repeat text n times
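Text replay with `--repeat` and `--loop-text` can be pictured as an iterator over the encoded token ids. The sketch below is a minimal illustration, not tokstream's code: the function name `replay` is hypothetical, and letting the loop flag take precedence over the repeat count is an assumption.

```rust
/// Replays `tokens` forever when `loop_text` is set, otherwise `repeat` times.
/// Assumption: the loop flag wins if both options are given.
fn replay<'a>(
    tokens: &'a [u32],
    repeat: usize,
    loop_text: bool,
) -> Box<dyn Iterator<Item = u32> + 'a> {
    if loop_text {
        // `cycle` restarts the slice iterator each time it is exhausted.
        Box::new(tokens.iter().copied().cycle())
    } else {
        // Yield the whole slice `repeat` times, then flatten to single ids.
        Box::new(std::iter::repeat(tokens).take(repeat).flatten().copied())
    }
}
```

A caller would typically combine the looping form with a limit such as `--max-tokens`, for example `replay(&ids, 0, true).take(max_tokens)`.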
### Rate Control
- `--rate ` target tokens/sec
- `--rate-min ` min rate for random range
- `--rate-max ` max rate for random range
- `--rate-sample-interval ` sampling interval for rate range (seconds, default: 1)
- `--batch ` tokens emitted per batch
- `--max-tokens ` stop after n tokens
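The arithmetic behind these flags is simple: a batch of `b` tokens at `r` tokens/sec has a time budget of `b / r` seconds, and a random range picks `r` somewhere between the minimum and maximum at each sampling interval. The sketch below shows that calculation under stated assumptions; `batch_interval` and `sample_rate` are hypothetical names, and uniform sampling is assumed because the source does not state the distribution.

```rust
use std::time::Duration;

/// Time budget for one batch: `batch` tokens at `rate` tokens/sec.
fn batch_interval(rate: f64, batch: u32) -> Duration {
    assert!(rate > 0.0, "rate must be positive");
    Duration::from_secs_f64(batch as f64 / rate)
}

/// Maps a uniform sample `u` in [0, 1) onto the range [rate_min, rate_max].
/// Assumption: rates are drawn uniformly from the range.
fn sample_rate(rate_min: f64, rate_max: f64, u: f64) -> f64 {
    rate_min + (rate_max - rate_min) * u
}
```

For example, `--rate 8` with a batch size of 1 gives a 125 ms budget per token, and a batch size of 4 gives 500 ms per batch at the same overall rate.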
### Pacing & Throughput
- `--pace ` pacing mode (default: `strict`)
- `--spin-threshold-us ` busy‑spin threshold for `strict` mode
- `--no-throttle` disable pacing (measure max throughput)
- `--no-output` disable stdout output (closer to tokenizer upper bound)
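The sleep + spin idea behind `strict` pacing and `--spin-threshold-us` can be sketched as follows: sleep while the deadline is far away, then busy-spin for the final stretch, because an OS sleep may overshoot. This is a minimal sketch of the general technique, not tokstream's actual implementation; the names `wait_until` and `pace` and the use of absolute deadlines are assumptions.

```rust
use std::time::{Duration, Instant};

/// Waits until `deadline`: sleeps while far away, then busy-spins.
fn wait_until(deadline: Instant, spin_threshold: Duration) {
    loop {
        let now = Instant::now();
        if now >= deadline {
            return;
        }
        let remaining = deadline - now;
        if remaining > spin_threshold {
            // Coarse wait: sleep may overshoot, so stop short by the threshold.
            std::thread::sleep(remaining - spin_threshold);
        } else {
            // Fine wait: burn CPU to land close to the deadline.
            std::hint::spin_loop();
        }
    }
}

/// Paces `n` ticks at `rate` per second and returns the total elapsed time.
/// Deadlines are computed from the start time, so jitter does not accumulate.
fn pace(n: u32, rate: f64, spin_threshold: Duration) -> Duration {
    let start = Instant::now();
    let interval = Duration::from_secs_f64(1.0 / rate);
    for i in 1..=n {
        wait_until(start + interval * i, spin_threshold);
        // A real emitter would write one token (or one batch) here.
    }
    start.elapsed()
}
```

A larger spin threshold trades CPU time for tighter timing; a threshold of zero degrades to plain sleeping.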
### Stats
- `--no-stats` disable stats output (stderr)
- `--stats-interval ` stats interval seconds (default: 1)
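The target-versus-actual comparison comes down to dividing the tokens emitted in a stats window by the window length. The sketch below illustrates that; the function names and the output format are hypothetical, since the source does not show tokstream's real stats line.

```rust
use std::time::Duration;

/// Actual throughput over a stats window, in tokens/sec.
fn actual_rate(tokens_emitted: u64, window: Duration) -> f64 {
    let secs = window.as_secs_f64();
    if secs == 0.0 {
        0.0 // avoid dividing by zero for an empty window
    } else {
        tokens_emitted as f64 / secs
    }
}

/// Formats one stats line comparing target and actual throughput.
/// The format is illustrative only.
fn stats_line(target: f64, tokens_emitted: u64, window: Duration) -> String {
    let actual = actual_rate(tokens_emitted, window);
    format!("target={target:.2} tok/s actual={actual:.2} tok/s")
}
```

Since stats go to stderr, a caller would print the line with `eprintln!` so it does not mix with token output on stdout.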
### Random Output Filters
- `--no-skip-special` do not skip special tokens
- `--allow-digits`
- `--allow-punct`
- `--allow-space`
- `--allow-non-ascii`
- `--no-require-letter`
- `--no-require-cjk`
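The filter flags are listed without descriptions, so the sketch below only illustrates one plausible reading of their names for English mode: a decoded token is rejected if it contains a character class that has not been allowed, and it must contain at least one letter unless that requirement is switched off. The `Filter` struct, the `accept` function, and the all-off defaults are assumptions, and the source does not say how tokstream treats a token's leading space.

```rust
/// Hypothetical filter settings mirroring the flag names; defaults are assumed.
#[derive(Clone, Copy, Default)]
struct Filter {
    allow_digits: bool,
    allow_punct: bool,
    allow_space: bool,
    allow_non_ascii: bool,
    no_require_letter: bool,
}

/// Returns true if a decoded token passes the filter (English-mode sketch).
fn accept(token: &str, f: Filter) -> bool {
    let ok_chars = token.chars().all(|c| {
        if c.is_ascii_alphabetic() {
            true
        } else if c.is_ascii_digit() {
            f.allow_digits
        } else if c.is_ascii_punctuation() {
            f.allow_punct
        } else if c.is_whitespace() {
            f.allow_space
        } else if !c.is_ascii() {
            f.allow_non_ascii
        } else {
            false // remaining ASCII control characters are always rejected
        }
    });
    let has_letter = token.chars().any(|c| c.is_alphabetic());
    ok_chars && (f.no_require_letter || has_letter)
}
```

A Chinese-mode variant would presumably swap the letter requirement for a CJK check, matching `--no-require-cjk`.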
### Seed
- `--seed ` random seed
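A fixed seed makes random generation reproducible: the same seed yields the same sequence of picks. The sketch below demonstrates this with a small xorshift generator; it is a stand-in chosen for brevity, since the source does not say which random number generator tokstream uses, and `XorShift64` and `pick` are hypothetical names.

```rust
/// Minimal xorshift64 generator, used here only to illustrate seeding.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The state must be non-zero, or the generator stays stuck at zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Picks an index in `0..n`, e.g. a candidate token id.
    /// Uses a plain modulo, which has a slight bias for large `n`.
    fn pick(&mut self, n: u64) -> u64 {
        self.next_u64() % n
    }
}
```

Two generators created with `XorShift64::new(42)` produce identical pick sequences, which is the property a `--seed` option relies on for repeatable runs.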
### Examples
```bash
# Random rate range sampled every 2 seconds
cargo run -p tokstream-cli -- --model gpt2 --mode english --rate-min 6 --rate-max 12 --rate-sample-interval 2
# Text mode from file, repeat 5 times
cargo run -p tokstream-cli -- --model gpt2 --mode text --text-file ./sample.txt --repeat 5
# Infinite loop text
cargo run -p tokstream-cli -- --model gpt2 --mode text --text "Hello" --loop-text
# Throughput upper bound (no throttle, no output)
cargo run -p tokstream-cli -- --model gpt2 --mode english --no-throttle --no-output
```
## npx CLI
### Quick Start
```bash
npx tokstream@latest --model gpt2 --mode english --rate 8
npx tokstream@latest --web --port 8787
```
For local development in this repo:
```bash
npx . --model gpt2 --mode english --rate 8
```
### Supported Flags (npx)
- `--model `
- `--revision `
- `--hf-token ` (or env `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN`)
- `--mode `
- `--text `
- `--loop` (loop text forever)
- `--repeat `
- `--rate `
- `--rate-min ` / `--rate-max `
- `--rate-sample-interval `
- `--seed `
- `--max-tokens `
- `--no-skip-special`
- `--allow-digits` / `--allow-punct` / `--allow-space` / `--allow-non-ascii`
- `--no-require-letter` / `--no-require-cjk`
- `--no-stats` / `--stats-interval `
- `--no-throttle` / `--no-output`
- `--web --port `
Notes:
- `--loop-text`, `--text-file`, `--batch`, `--pace`, and `--spin-threshold-us` are Rust‑CLI only.
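The token fallback listed above (`--hf-token`, or the `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN` environment variables) can be sketched as a simple precedence chain. The order used here, flag first and then the two variables in the order listed, is an assumption, and `resolve_token` is a hypothetical helper. It is written in Rust for consistency with the other sketches, although the npx CLI itself is JavaScript.

```rust
/// Resolves the access token: explicit flag first, then environment variables.
/// `lookup` abstracts the environment so the function is easy to test.
/// Assumption: the flag wins, then `HF_TOKEN`, then `HUGGINGFACE_HUB_TOKEN`.
fn resolve_token(
    flag: Option<&str>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    flag.map(str::to_owned)
        .or_else(|| lookup("HF_TOKEN"))
        .or_else(|| lookup("HUGGINGFACE_HUB_TOKEN"))
}
```

In a real program the lookup would be `|key| std::env::var(key).ok()`.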
## Web Demo
```bash
npx tokstream@latest --web --port 8787
# open http://localhost:8787
```
While running, you can drag the rate slider or enable random rate range. The page shows target and actual throughput. The output pane is fixed‑height and scrolls independently.
## Accuracy Notes
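- The Rust CLI's `strict` pacing mode sleeps for most of each interval and then busy‑spins briefly, which gives high timing precision.
- The web demo and npx CLI pace on a best‑effort basis, because event‑loop scheduling and I/O limit their timing precision.
- If actual throughput stops rising as you raise the target rate, you have likely hit the tokenizer's throughput limit.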
- For maximum throughput testing, use the Rust CLI with `--no-output --no-throttle`.
## Build WASM (optional refresh)
```bash
npm run build:wasm
```
WASM artifacts are committed and included in the npm package.
## just Recipes
```bash
just
```
## Tests
```bash
cargo clippy --workspace
cargo nextest run --workspace
```
## License
MIT