Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/zer0int/gpt-4-tiktoken-tokens-etc
Extracted from OpenAI/tiktoken
Last synced: 6 days ago
- Host: GitHub
- URL: https://github.com/zer0int/gpt-4-tiktoken-tokens-etc
- Owner: zer0int
- Created: 2023-09-05T10:19:23.000Z (about 1 year ago)
- Default Branch: CLIP-vision
- Last Pushed: 2023-09-05T10:43:27.000Z (about 1 year ago)
- Last Synced: 2023-09-05T11:35:32.187Z (about 1 year ago)
- Size: 2.24 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GPT-4-tiktoken-tokens-etc
GPT-4 tokenizer, extracted from [OpenAI/tiktoken](https://github.com/openai/tiktoken).
Can be used for curiosity, or for uploading to GPT-4 code interpreter / advanced data analysis.
Upload the tiktoken .whl (get it with pip download tiktoken [...], or from pypi.org; choose the *manylinux cp38 build) and the file in "/data-gym-cache".
The "/data-gym-cache" folder contains the file (base64-encoded tokens; for decoded tokens, see "tokens.txt") that tiktoken would otherwise fetch from the internet on a call such as `encoding = tiktoken.get_encoding("cl100k_base")`. This encoding is for use with gpt-4, gpt-3.5-turbo, and text-embedding-ada-002. The file has to be uploaded because the AI doesn't have internet access within its sandbox environment.
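
To illustrate the relationship between base64-encoded tokens and decoded tokens, here is a minimal sketch. It assumes the token file uses a one-entry-per-line layout of `<base64 token> <rank>`; the function name `parse_token_lines` and the three sample lines are hypothetical and chosen only for illustration.

```python
import base64
from typing import Dict


def parse_token_lines(text: str) -> Dict[bytes, int]:
    """Parse lines of the assumed form '<base64 token> <rank>' into a bytes -> rank map."""
    ranks: Dict[bytes, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


# Hypothetical sample in the assumed layout (not read from the real file).
sample = "IQ== 0\nIg== 1\nIw== 2\n"
ranks = parse_token_lines(sample)
for token, rank in ranks.items():
    print(rank, token)
```

Tokens are raw bytes rather than text, so some entries in a real vocabulary may not decode to valid UTF-8 on their own.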
Note: The "tiktoken" library needs to be modified to bypass or wrap the code that would try to access the internet (and fail). GPT-4 can make that change for you if you can't.
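
As a possible alternative to patching the library, the sketch below copies the uploaded file into a local cache directory under the name tiktoken is expected to look for. It is not the method this repository describes, and it rests on several assumptions: that the installed tiktoken version reads the `TIKTOKEN_CACHE_DIR` environment variable, that it names cached files by the SHA-1 hex digest of the source URL, and that `CL100K_URL` below is the URL it requests for `cl100k_base`. The helper names and any paths you pass in are hypothetical.

```python
import hashlib
import os
import shutil
from pathlib import Path

# Assumption: the URL tiktoken requests for the cl100k_base encoding.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def cache_key(url: str) -> str:
    """Assumed cache file name: SHA-1 hex digest of the source URL."""
    return hashlib.sha1(url.encode()).hexdigest()


def install_into_cache(uploaded_file: str, cache_dir: str, url: str = CL100K_URL) -> Path:
    """Copy an uploaded token file into cache_dir under the assumed cache name."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    target = cache / cache_key(url)
    shutil.copyfile(uploaded_file, target)
    # Assumption: tiktoken checks this variable before trying the network.
    os.environ["TIKTOKEN_CACHE_DIR"] = str(cache)
    return target
```

If those assumptions hold for the installed version, calling `install_into_cache(...)` before `tiktoken.get_encoding("cl100k_base")` may let the encoding load without network access. If they do not hold, the library still needs the modification described in the note above.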