{"id":17220851,"url":"https://github.com/kojix2/tiktoken-c","last_synced_at":"2025-10-07T01:31:58.717Z","repository":{"id":187438782,"uuid":"676428867","full_name":"kojix2/tiktoken-c","owner":"kojix2","description":"C API for tiktoken-rs","archived":false,"fork":false,"pushed_at":"2025-05-21T03:51:35.000Z","size":53,"stargazers_count":12,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-02T20:06:08.857Z","etag":null,"topics":["bpe","c","tiktoken","tokenizer"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kojix2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"ko_fi":"kojix2"}},"created_at":"2023-08-09T07:18:47.000Z","updated_at":"2025-06-30T02:30:19.000Z","dependencies_parsed_at":"2023-11-23T10:30:42.846Z","dependency_job_id":"23b22f6b-c167-409f-9f3f-1653ea6c72c1","html_url":"https://github.com/kojix2/tiktoken-c","commit_stats":null,"previous_names":["kojix2/tiktoken-c"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/kojix2/tiktoken-c","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kojix2%2Ftiktoken-c","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kojix2%2Ftiktoken-c/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kojix2%2Ftiktoken-c/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kojix2%2Ftiktoken-c/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kojix2","download_url":"https://codeload.github.com/kojix2/tiktoken-c/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kojix2%2Ftiktoken-c/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263208051,"owners_count":23430675,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bpe","c","tiktoken","tokenizer"],"created_at":"2024-10-15T03:53:25.068Z","updated_at":"2025-10-07T01:31:58.709Z","avatar_url":"https://github.com/kojix2.png","language":"Rust","funding_links":["https://ko-fi.com/kojix2"],"categories":[],"sub_categories":[],"readme":"# tiktoken-c\n\n[![test](https://github.com/kojix2/tiktoken-c/actions/workflows/test.yml/badge.svg)](https://github.com/kojix2/tiktoken-c/actions/workflows/test.yml)\n[![Lines of Code](https://img.shields.io/endpoint?url=https%3A%2F%2Ftokei.kojix2.net%2Fbadge%2Fgithub%2Fkojix2%2Ftiktoken-c%2Flines)](https://tokei.kojix2.net/github/kojix2/tiktoken-c)\n[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/kojix2/tiktoken-c)\n\n- C API for [Tiktoken](https://github.com/openai/tiktoken), OpenAI's tokenizer\n- Compatible with [tiktoken-rs](https://github.com/zurawiki/tiktoken-rs) 0.7.0+\n\n## Installation\n\nDownload from [GitHub Releases](https://github.com/kojix2/tiktoken-c/releases) or build from source:\n\n```sh\ngit clone https://github.com/kojix2/tiktoken-c\ncd tiktoken-c\ncargo build --release\n# Output: target/release/libtiktoken_c.{so,dylib,dll}\n```\n\n## C API Overview\n\nThe API mirrors the functionality of [tiktoken-rs](https://docs.rs/tiktoken-rs/). Below are key types and functions.\n\n### Types\n\n```c\ntypedef void CoreBPE;\ntypedef uint32_t Rank;\n\ntypedef struct CFunctionCall {\n  const char *name;\n  const char *arguments;\n} CFunctionCall;\n\ntypedef struct CChatCompletionRequestMessage {\n  const char *role;\n  const char *content;\n  const char *name;\n  const struct CFunctionCall *function_call;\n} CChatCompletionRequestMessage;\n```\n\n### Core Functions\n\n#### Version / Init\n\n```c\nconst char *tiktoken_c_version(void);\nvoid tiktoken_init_logger(void);\n```\n\n#### Load Tokenizer\n\n```c\nCoreBPE *tiktoken_get_bpe_from_model(const char *model);\nCoreBPE *tiktoken_r50k_base(void);   // GPT-3 models\nCoreBPE *tiktoken_p50k_base(void);   // Code models\nCoreBPE *tiktoken_p50k_edit(void);   // Edit models\nCoreBPE *tiktoken_cl100k_base(void); // ChatGPT models\nCoreBPE *tiktoken_o200k_base(void);  // GPT-4o models\n```\n\n#### Encoding \u0026 Decoding\n\n```c\nRank *tiktoken_corebpe_encode(CoreBPE *ptr, const char *text,\n                              const char *const *allowed_special,\n                              size_t allowed_special_len,\n                              size_t *num_tokens);\n\nRank *tiktoken_corebpe_encode_ordinary(CoreBPE *ptr, const char *text, size_t *num_tokens);\nRank *tiktoken_corebpe_encode_with_special_tokens(CoreBPE *ptr, const char *text, size_t *num_tokens);\nchar *tiktoken_corebpe_decode(CoreBPE *ptr, const Rank *tokens, size_t num_tokens);\n```\n\n#### Token Counting\n\n```c\nsize_t tiktoken_get_completion_max_tokens(const char *model, const char *prompt);\n\nsize_t tiktoken_num_tokens_from_messages(const char *model,\n                                         uint32_t num_messages,\n                                         const CChatCompletionRequestMessage *messages);\n\nsize_t tiktoken_get_chat_completion_max_tokens(const char *model,\n                                               uint32_t num_messages,\n                                               const CChatCompletionRequestMessage *messages);\n```\n\n#### Cleanup\n\n```c\nvoid tiktoken_destroy_corebpe(CoreBPE *ptr);\nvoid tiktoken_free(void *ptr);\n```\n\n## Memory Management\n\nUse `tiktoken_free()` to release any heap memory returned by the library:\n\n| Function                                              | Return Type       | Free with                    |\n| ----------------------------------------------------- | ----------------- | ---------------------------- |\n| `*_encode*` / `*_decode`                              | `Rank*` / `char*` | `tiktoken_free(ptr)`         |\n| `tiktoken_*_base()` / `tiktoken_get_bpe_from_model()` | `CoreBPE*`        | `tiktoken_destroy_corebpe()` |\n\nImportant Notes:\n\n- Do NOT pass the pointer returned by `tiktoken_c_version()` to any free function (static string).\n- On Windows, always prefer `tiktoken_free()` rather than `free()`.\n- When encoding results in 0 tokens, the returned pointer may be NULL. Always check for NULL before use.\n\n## Example\n\n### Count Tokens\n\n```c\n#include \u003cstdio.h\u003e\n#include \u003cstdlib.h\u003e\n#include \"tiktoken.h\"\n\nint main() {\n  CoreBPE *bpe = tiktoken_get_bpe_from_model(\"gpt-4\");\n  if (!bpe) return 1;\n\n  const char *text = \"Hello, world!\";\n  size_t num_tokens;\n  Rank *tokens = tiktoken_corebpe_encode_with_special_tokens(bpe, text, \u0026num_tokens);\n\n  if (tokens) {\n    printf(\"Token count: %zu\\n\", num_tokens);\n    tiktoken_free(tokens);\n  }\n\n  tiktoken_destroy_corebpe(bpe);\n  return 0;\n}\n```\n\n## Language Bindings\n\n| Language | Repository                                           |\n| -------- | ---------------------------------------------------- |\n| Crystal  | [tiktoken-cr](https://github.com/kojix2/tiktoken-cr) |\n\n## Development\n\n```sh\n# Run tests\ncargo test\ncd test \u0026\u0026 ./test.sh\n\n# Generate header\ncargo install --force cbindgen\ncbindgen --config cbindgen.toml --crate tiktoken-c --output tiktoken.h\n\n# Patch header to insert typedefs for CoreBPE and Rank\nperl -i -pe '$i ||= /#include/; $_ = \"\\ntypedef void CoreBPE;\\ntypedef uint32_t Rank;\\n\" if $i \u0026\u0026 /^$/ \u0026\u0026 !$f++; $i = 0 if /^$/ \u0026\u0026 $f' tiktoken.h\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkojix2%2Ftiktoken-c","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkojix2%2Ftiktoken-c","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkojix2%2Ftiktoken-c/lists"}