https://github.com/ivanitskiy/njs-gpt-encoder
Tokenizing text with GPT-3 models for NJS
- Host: GitHub
- URL: https://github.com/ivanitskiy/njs-gpt-encoder
- Owner: ivanitskiy
- License: apache-2.0
- Created: 2023-09-20T15:20:30.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-20T23:12:34.000Z (over 1 year ago)
- Last Synced: 2024-10-28T02:07:25.562Z (about 2 months ago)
- Language: JavaScript
- Size: 667 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
This is an attempt to port https://github.com/latitudegames/GPT-3-Encoder/ to the NJS runtime.
It turned out that some of the code had to be changed because NJS does not support some of the syntax it uses.
The [rollup-replace](https://www.npmjs.com/package/@rollup/plugin-replace) plug-in is used to embed some assets at build time, producing a single self-contained JS file.
To start it:
$ docker compose up
It can then be tested like this:
```
curl -d "Welcome. Replace this with your text to see how tokenization works." http://localhost:8009/
body: Welcome. Replace this with your text to see how tokenization works.
encoded: ,234,220,,220,,220,,220,,220,,220,,220,,220,,220,,220,,234
decoded: . .%
```
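For orientation, here is a minimal sketch of how an NJS handler might produce the `body` / `encoded` / `decoded` response shown above. It is not the repository's actual code. The `encode` and `decode` functions below are placeholder stand-ins (they map characters to char codes, not BPE tokens) so that the sketch runs on its own; in the real module the ported GPT-3-Encoder functions would take their place. The names `formatResponse` and `handle` are assumptions for illustration.

```javascript
// Placeholder tokenizer: NOT the real BPE encoder. It maps each character to
// its char code so the sketch is self-contained.
function encode(text) {
  var tokens = [];
  for (var i = 0; i < text.length; i++) {
    tokens.push(text.charCodeAt(i));
  }
  return tokens;
}

// Inverse of the placeholder encoder above.
function decode(tokens) {
  var out = "";
  for (var i = 0; i < tokens.length; i++) {
    out += String.fromCharCode(tokens[i]);
  }
  return out;
}

// Build the plain-text response in the body/encoded/decoded layout.
function formatResponse(body) {
  var tokens = encode(body);
  return "body: " + body + "\n" +
         "encoded: " + tokens.join(",") + "\n" +
         "decoded: " + decode(tokens) + "\n";
}

// NJS HTTP handler: r.requestText holds the request body as a string and
// r.return(status, body) sends the response.
function handle(r) {
  r.return(200, formatResponse(r.requestText || ""));
}

// In an NJS module this would be followed by: export default { handle };
```

The code sticks to older JavaScript syntax on purpose, since the README notes that NJS does not accept everything the upstream encoder uses.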
Related tiktoken projects for JavaScript:
- https://github.com/niieani/gpt-tokenizer (the most complete, but hard to port to NJS)
- https://github.com/ceifa/tiktoken-node
- https://github.com/dqbd/tiktoken
- https://github.com/openai/tiktoken
- https://github.com/latitudegames/GPT-3-Encoder
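
The build-time embedding of assets mentioned in the README could look roughly like the Rollup configuration below. This is a sketch under assumptions, not the repository's actual build file: the file paths, the output name, and the `__ENCODER_JSON__` / `__VOCAB_BPE__` placeholders are hypothetical, and the assets are assumed to be the `encoder.json` and `vocab.bpe` files that the upstream GPT-3-Encoder reads from disk at runtime.

```javascript
// rollup.config.mjs (hypothetical file name and paths throughout)
import { readFileSync } from "node:fs";
import replace from "@rollup/plugin-replace";

export default {
  input: "src/index.js",
  output: { file: "dist/encoder.js", format: "es" },
  plugins: [
    replace({
      // Do not touch placeholders that appear on the left of an assignment.
      preventAssignment: true,
      values: {
        // JSON text is pasted into the source verbatim as an object literal.
        __ENCODER_JSON__: readFileSync("assets/encoder.json", "utf8"),
        // Plain text has to be quoted so it becomes a string literal.
        __VOCAB_BPE__: JSON.stringify(readFileSync("assets/vocab.bpe", "utf8")),
      },
    }),
  ],
};
```

With a configuration like this, source code would reference the placeholders directly, for example `const encoder = __ENCODER_JSON__;`, so the bundled file needs no filesystem access when it runs.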