https://github.com/ivanitskiy/njs-gpt-encoder
Tokenizing text with GPT-3 models for NJS
- Host: GitHub
- URL: https://github.com/ivanitskiy/njs-gpt-encoder
- Owner: ivanitskiy
- License: apache-2.0
- Created: 2023-09-20T15:20:30.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-09-20T23:12:34.000Z (almost 2 years ago)
- Last Synced: 2025-02-08T07:24:50.111Z (5 months ago)
- Language: JavaScript
- Size: 667 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
This is an attempt to port https://github.com/latitudegames/GPT-3-Encoder/ to the NJS runtime.
Some of the code had to be changed because NJS does not support all of the syntax the original uses.
The build uses the [rollup-replace](https://www.npmjs.com/package/@rollup/plugin-replace) plug-in to embed some assets at build time, producing a single self-contained JS file.
To start it:
$ docker compose up
It can then be tested like this:
```
curl -d "Welcome. Replace this with your text to see how tokenization works." http://localhost:8009/
body: Welcome. Replace this with your text to see how tokenization works.
encoded: ,234,220,,220,,220,,220,,220,,220,,220,,220,,220,,220,,234
decoded: . .%
```
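The build-time embedding mentioned earlier can be illustrated without Rollup. The sketch below only mimics the idea behind the replace plug-in, which substitutes target strings in source files while bundling. The placeholder name `__ENCODER_JSON__`, the helper `embedAssets`, and the two-entry vocabulary are hypothetical and are not taken from this repository.

```javascript
// Minimal sketch of build-time asset embedding (not the project's actual build).
// Assumption: the source refers to an asset through a placeholder token,
// here the made-up name __ENCODER_JSON__.
const source = 'const encoder = __ENCODER_JSON__;\nexport default encoder;';

// Replace every occurrence of each placeholder with its literal value,
// so the resulting file carries the data inline.
function embedAssets(code, values) {
  let out = code;
  for (const [placeholder, value] of Object.entries(values)) {
    out = out.split(placeholder).join(value);
  }
  return out;
}

// Hypothetical stand-in for a real vocabulary file.
const vocabulary = { hello: 0, world: 1 };

const bundled = embedAssets(source, {
  __ENCODER_JSON__: JSON.stringify(vocabulary),
});

console.log(bundled);
```

With the real plug-in, a mapping like this would be passed as options in the Rollup configuration, and the substitution would happen as part of bundling rather than in a helper function.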
Related tiktoken projects for JavaScript:
- https://github.com/niieani/gpt-tokenizer (the most complete, but hard to port to NJS)
- https://github.com/ceifa/tiktoken-node
- https://github.com/dqbd/tiktoken
- https://github.com/openai/tiktoken
- https://github.com/latitudegames/GPT-3-Encoder