https://github.com/maxim-saplin/tiktoken-bench
Comparing OpenAI tokeniser (tiktoken) performance - stock Python/Rust vs JS/WASM
- Host: GitHub
- URL: https://github.com/maxim-saplin/tiktoken-bench
- Owner: maxim-saplin
- Created: 2023-12-17T16:49:37.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-28T17:05:14.000Z (over 2 years ago)
- Last Synced: 2024-12-28T18:55:21.968Z (over 1 year ago)
- Topics: ai, chatgpt, javascript, nlp, python, tiktoken
- Language: Dart
- Size: 5.37 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Comparing OpenAI tokeniser (tiktoken) performance - stock Python/Rust vs JS/WASM.
All tests were run on an M1 MacBook Pro.
# Results
- Small File (68 tokens)
```
Python/Rust (tiktoken 0.5.2) ████ (0.04ms)
Pure JS (js-tiktoken 1.0.8) █████ (0.05ms)
JS/WASM (tiktoken 1.0.11) ██████████ (0.11ms)
@dqbd/WASM 1.0.7 ██████████████████ (0.18ms)
```
- Medium File (1068 tokens)
```
Python/Rust (tiktoken 0.5.2) ██████ (0.54ms)
JS/WASM (tiktoken 1.0.11) █████████ (0.78ms)
@dqbd/WASM 1.0.7 █████████ (0.80ms)
Pure JS (js-tiktoken 1.0.8) ██████████ (0.96ms)
```
- Large File (923942 tokens)
```
Python/Rust (tiktoken 0.5.2) ████████████████ (359.49ms)
@dqbd/WASM 1.0.7 ████████████████████ (421.71ms)
JS/WASM (tiktoken 1.0.11) ██████████████████████ (451.92ms)
Pure JS (js-tiktoken 1.0.8) █████████████████████████████████████ (1005.69ms)
```
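To put the large-file numbers in perspective, the timings can be converted into throughput and into a slowdown factor relative to the Python/Rust baseline. The sketch below only reuses the figures reported above; the `summarize` helper and the variable names are illustrative and not part of the repository.

```python
# Large-file timings in milliseconds, copied from the results above.
LARGE_FILE_TOKENS = 923_942
BASELINE = "Python/Rust (tiktoken 0.5.2)"

timings_ms = {
    "Python/Rust (tiktoken 0.5.2)": 359.49,
    "@dqbd/WASM 1.0.7": 421.71,
    "JS/WASM (tiktoken 1.0.11)": 451.92,
    "Pure JS (js-tiktoken 1.0.8)": 1005.69,
}


def summarize(timings, tokens, baseline):
    """Return {name: (tokens_per_second, slowdown_vs_baseline)}."""
    base_ms = timings[baseline]
    return {
        name: (tokens / (ms / 1000.0), ms / base_ms)
        for name, ms in timings.items()
    }


summary = summarize(timings_ms, LARGE_FILE_TOKENS, BASELINE)
for name, (tokens_per_s, slowdown) in summary.items():
    print(f"{name:<30} {tokens_per_s / 1e6:5.2f} M tokens/s  {slowdown:4.2f}x")
```

On the large file this works out to roughly 2.6 million tokens per second for Python/Rust, with the pure JS implementation about 2.8 times slower and the two WASM builds within about 1.2x to 1.3x of the baseline.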
# Python 3.11.6
## tiktoken 0.5.2
OpenAI's implementation (uses Rust behind the scenes).
```
File: 0_small.txt (2000 - 68) - Avg Time: 0.04ms, StdDev: 29.47%
File: 1_medium.txt (200 - 1068) - Avg Time: 0.54ms, StdDev: 3.07%
File: 2_large.txt (20 - 923942) - Avg Time: 359.49ms, StdDev: 0.85%
```
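The repository's own benchmark script is not reproduced here. As a minimal sketch of how numbers like these could be measured, the code below times repeated `encode` calls and reports the mean and the standard deviation as a percentage of the mean. Several things are assumptions rather than facts from this README: that the pair in parentheses is (number of runs - token count), that `StdDev` is relative to the mean, that the `cl100k_base` encoding is used, and that the input files sit in the working directory. The `bench` and `run_tiktoken_benchmark` helpers are hypothetical names.

```python
import statistics
import time


def bench(encode, text, runs):
    """Time encode(text) `runs` times; return (avg_ms, stddev_pct, n_tokens)."""
    samples_ms = []
    n_tokens = 0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = encode(text)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
        n_tokens = len(tokens)
    avg_ms = statistics.mean(samples_ms)
    # Assumption: "StdDev: N%" is the standard deviation relative to the mean.
    stddev_pct = statistics.pstdev(samples_ms) / avg_ms * 100.0 if avg_ms else 0.0
    return avg_ms, stddev_pct, n_tokens


def run_tiktoken_benchmark(path, runs, encoding_name="cl100k_base"):
    """Benchmark tiktoken on one file (encoding name is an assumption)."""
    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding(encoding_name)
    with open(path, encoding="utf-8") as f:
        text = f.read()
    avg_ms, stddev_pct, n_tokens = bench(enc.encode, text, runs)
    print(
        f"File: {path} ({runs} - {n_tokens}) - "
        f"Avg Time: {avg_ms:.2f}ms, StdDev: {stddev_pct:.2f}%"
    )
```

Calling `run_tiktoken_benchmark("0_small.txt", runs=2000)` would print a line in the same shape as the output above. Passing the encoder in as a plain callable keeps `bench` independent of tiktoken, so the same timing loop can be reused for any tokeniser.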
# JS, Node 21.2.0
```
npm install
npm start
```
Covers the pure JS and WebAssembly (WASM) versions.
## tiktoken 1.0.11 (WASM)
```
File: 0_small.txt (2000 - 68) - Avg Time: 0.11ms, StdDev: 55.14%
File: 1_medium.txt (200 - 1068) - Avg Time: 0.78ms, StdDev: 6.84%
File: 2_large.txt (20 - 923942) - Avg Time: 451.92ms, StdDev: 0.75%
```
## js-tiktoken 1.0.8
```
File: 0_small.txt (2000 - 68) - Avg Time: 0.05ms, StdDev: 125.55%
File: 1_medium.txt (200 - 1068) - Avg Time: 0.96ms, StdDev: 29.51%
File: 2_large.txt (20 - 923942) - Avg Time: 1005.69ms, StdDev: 0.58%
```
## @dqbd/tiktoken 1.0.7 (WASM)
```
File: 0_small.txt (2000 - 68) - Avg Time: 0.18ms, StdDev: 48.96%
File: 1_medium.txt (200 - 1068) - Avg Time: 0.80ms, StdDev: 11.21%
File: 2_large.txt (20 - 923942) - Avg Time: 421.71ms, StdDev: 1.30%
```