https://github.com/m4n5ter/tokstream

A token streaming simulator powered by Hugging Face tokenizers. It downloads a tokenizer from HF Hub and generates tokens at a target rate, with live stats for target vs actual throughput.

# tokstream

[中文](README_ZH.md) | English

A token streaming simulator powered by Hugging Face tokenizers. It downloads a tokenizer from HF Hub and generates tokens at a target rate, with live stats for target vs actual throughput.

## Highlights

- Rust CLI with high‑precision pacing (sleep + spin)
- Web demo (WASM) and npx executable
- Random English / Chinese generation and text replay
- Configurable filtering strategy
- Target vs actual tokens/sec stats
- Workspace layout with reusable core

## Project Layout

```
.
├── crates
│ ├── tokstream-core # tokenizer engine
│ ├── tokstream-cli # Rust CLI
│ └── tokstream-wasm # wasm-bindgen bindings
├── npm # npx CLI + web demo
├── bin # npm bin entry
├── Cargo.toml # workspace
├── justfile
├── package.json
├── README.md
└── README_ZH.md
```

## Rust CLI

### Quick Start

```bash
cargo run -p tokstream-cli -- --model gpt2 --mode english --rate 8
cargo run -p tokstream-cli -- --model gpt2 --mode chinese --rate 8
cargo run -p tokstream-cli -- --model gpt2 --mode text --text "Hello" --repeat 3
```

### Install from crates.io

```bash
cargo install tokstream-cli
# or
cargo binstall tokstream-cli
```

Notes:
- The binary name is `tokstream` after installation.
- `cargo binstall` will compile from source unless you provide prebuilt release assets and set `repository` in the crate metadata.

### Model & Auth

- `--model` HF Hub model id (default: `gpt2`)
- `--revision` HF revision (default: `main`)
- `--hf-token` access token for private models

### Modes

- `--mode` generation mode: `english`, `chinese`, or `text`
- `--text` text mode input
- `--text-file` text mode input from file
- `--loop-text` loop text forever
- `--repeat` repeat text n times

### Rate Control

- `--rate` target tokens/sec
- `--rate-min` min rate for random range
- `--rate-max` max rate for random range
- `--rate-sample-interval` sampling interval for rate range (seconds, default: 1)
- `--batch` tokens emitted per batch
- `--max-tokens` stop after n tokens
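The rate flags can be read as a simple schedule: at a target of `rate` tokens/sec with `batch` tokens per batch, one batch is due every `batch / rate` seconds, and with a rate range a new target is drawn between the minimum and maximum once per sampling interval. The sketch below illustrates that arithmetic only. It assumes uniform sampling and uses a small std-only xorshift generator; `batch_interval`, `XorShift64`, and `sample_rate` are hypothetical names, not tokstream's API.

```rust
use std::time::Duration;

/// Time between batches so that `batch` tokens per batch average `rate` tokens/sec.
fn batch_interval(rate: f64, batch: u32) -> Duration {
    assert!(rate > 0.0, "rate must be positive");
    Duration::from_secs_f64(batch as f64 / rate)
}

/// Tiny xorshift64 PRNG (hypothetical helper, std only) so the sketch is seedable.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform value in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw a target rate between `min` and `max` (assumed uniform).
fn sample_rate(rng: &mut XorShift64, min: f64, max: f64) -> f64 {
    min + (max - min) * rng.next_f64()
}
```

For example, `batch_interval(8.0, 1)` is 125 ms, and `batch_interval(8.0, 4)` is 500 ms: larger batches keep the same average rate but emit in bursts.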

### Pacing & Throughput

- `--pace` pacing mode (default: `strict`)
- `--spin-threshold-us` busy‑spin threshold for `strict` mode
- `--no-throttle` disable pacing (measure max throughput)
- `--no-output` disable stdout output (closer to tokenizer upper bound)

### Stats

- `--no-stats` disable stats output (stderr)
- `--stats-interval` stats interval seconds (default: 1)
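As a rough illustration of what "target vs actual" means here, actual throughput is the number of tokens emitted divided by the wall-clock time elapsed, compared against the requested rate. The function names and the output format below are assumptions made for this sketch; tokstream's real stats line may look different.

```rust
use std::time::Duration;

/// Actual throughput: tokens emitted divided by elapsed wall-clock seconds.
fn actual_rate(tokens: u64, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        tokens as f64 / secs
    } else {
        0.0
    }
}

/// Format one stats line (hypothetical format, not tokstream's actual output).
fn format_stats(target: f64, tokens: u64, elapsed: Duration) -> String {
    let actual = actual_rate(tokens, elapsed);
    // Percentage of the target that was actually achieved.
    let pct = if target > 0.0 { actual / target * 100.0 } else { 0.0 };
    format!(
        "target={:.2} tok/s actual={:.2} tok/s ({:.1}%)",
        target, actual, pct
    )
}
```

With a target of 8 tokens/sec and 14 tokens emitted in 2 seconds, the actual rate is 7 tokens/sec, or 87.5% of the target.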

### Random Output Filters

- `--no-skip-special` do not skip special tokens
- `--allow-digits`
- `--allow-punct`
- `--allow-space`
- `--allow-non-ascii`
- `--no-require-letter`
- `--no-require-cjk`
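One plausible reading of these flags, sketched below, is a per-token predicate: each `allow_*` switch permits a character class that is otherwise rejected, and each `require_*` switch demands at least one character of a class. This is an interpretation for illustration only; `FilterConfig`, `accept`, the CJK range check, and the exact semantics are assumptions, not tokstream's implementation.

```rust
/// Hypothetical filter settings mirroring the CLI flags.
#[derive(Clone, Copy)]
struct FilterConfig {
    allow_digits: bool,
    allow_punct: bool,
    allow_space: bool,
    allow_non_ascii: bool,
    require_letter: bool,
    require_cjk: bool,
}

/// Simplified CJK test: only the CJK Unified Ideographs block (U+4E00..=U+9FFF).
fn is_cjk(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Decide whether a decoded token string passes the (assumed) filter rules.
fn accept(text: &str, cfg: &FilterConfig) -> bool {
    if text.is_empty() {
        return false;
    }
    for c in text.chars() {
        // Reject any character whose class is not allowed.
        if c.is_ascii_digit() && !cfg.allow_digits {
            return false;
        }
        if c.is_ascii_punctuation() && !cfg.allow_punct {
            return false;
        }
        if c.is_whitespace() && !cfg.allow_space {
            return false;
        }
        if !c.is_ascii() && !cfg.allow_non_ascii {
            return false;
        }
    }
    // Enforce the "must contain at least one ..." requirements.
    if cfg.require_letter && !text.chars().any(|c| c.is_ascii_alphabetic()) {
        return false;
    }
    if cfg.require_cjk && !text.chars().any(is_cjk) {
        return false;
    }
    true
}
```

Under these assumptions, an English-style configuration (everything disallowed, `require_letter` on) accepts `hello` and rejects `h3llo`, while a Chinese-style configuration (`allow_non_ascii` and `require_cjk` on) accepts `你好` and rejects `hello`.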

### Seed

- `--seed` random seed

### Examples

```bash
# Random rate range sampled every 2 seconds
cargo run -p tokstream-cli -- --model gpt2 --mode english --rate-min 6 --rate-max 12 --rate-sample-interval 2

# Text mode from file, repeat 5 times
cargo run -p tokstream-cli -- --model gpt2 --mode text --text-file ./sample.txt --repeat 5

# Infinite loop text
cargo run -p tokstream-cli -- --model gpt2 --mode text --text "Hello" --loop-text

# Throughput upper bound (no throttle, no output)
cargo run -p tokstream-cli -- --model gpt2 --mode english --no-throttle --no-output
```

## npx CLI

### Quick Start

```bash
npx tokstream@latest --model gpt2 --mode english --rate 8
npx tokstream@latest --web --port 8787
```

For local development in this repo:

```bash
npx . --model gpt2 --mode english --rate 8
```

### Supported Flags (npx)

- `--model`
- `--revision`
- `--hf-token` (or env `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN`)
- `--mode`
- `--text`
- `--loop` (loop text forever)
- `--repeat`
- `--rate`
- `--rate-min` / `--rate-max`
- `--rate-sample-interval`
- `--seed`
- `--max-tokens`
- `--no-skip-special`
- `--allow-digits` / `--allow-punct` / `--allow-space` / `--allow-non-ascii`
- `--no-require-letter` / `--no-require-cjk`
- `--no-stats` / `--stats-interval`
- `--no-throttle` / `--no-output`
- `--web --port`

Notes:
- `--loop-text`, `--text-file`, `--batch`, `--pace`, and `--spin-threshold-us` are Rust‑CLI only.

## Web Demo

```bash
npx tokstream@latest --web --port 8787
# open http://localhost:8787
```

While running, you can drag the rate slider or enable random rate range. The page shows target and actual throughput. The output pane is fixed‑height and scrolls independently.

## Accuracy Notes

- Rust CLI `strict` uses sleep + short spin for high precision.
- Web / npx are best‑effort due to event loop and I/O limits.
- If actual throughput doesn’t change while raising target rates, you likely hit tokenizer limits.
- For maximum throughput testing, use the Rust CLI with `--no-output --no-throttle`.
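The "sleep + short spin" idea can be sketched with the standard library alone: sleep for most of the wait, then busy-spin for the final stretch, because OS sleeps tend to overshoot by an unpredictable amount. This is a minimal sketch of the general technique, not tokstream's code; `wait_until`, `pace`, and the threshold parameter are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Wait until `deadline`: sleep for the bulk of the wait, then busy-spin
/// for the last `spin_threshold` to avoid sleep overshoot.
fn wait_until(deadline: Instant, spin_threshold: Duration) {
    let now = Instant::now();
    if deadline <= now {
        return;
    }
    let remaining = deadline - now;
    if remaining > spin_threshold {
        std::thread::sleep(remaining - spin_threshold);
    }
    while Instant::now() < deadline {
        std::hint::spin_loop();
    }
}

/// Call `emit` `count` times at `rate` calls per second; returns total elapsed time.
fn pace(
    count: u32,
    rate: f64,
    spin_threshold: Duration,
    mut emit: impl FnMut(u32),
) -> Duration {
    let start = Instant::now();
    let step = Duration::from_secs_f64(1.0 / rate);
    for i in 0..count {
        // Deadlines are derived from the start time, so a late tick
        // does not push every later tick back.
        wait_until(start + step * i, spin_threshold);
        emit(i);
    }
    start.elapsed()
}
```

A larger spin threshold trades CPU time for tighter timing; a threshold of zero degrades to plain sleeping.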

## Build WASM (optional refresh)

```bash
npm run build:wasm
```

WASM artifacts are committed and included in the npm package.

## just Recipes

```bash
just
```

## Tests

```bash
cargo clippy --workspace
cargo nextest run --workspace
```

## License

MIT