{"id":20355778,"url":"https://github.com/zackproser/tokenization-demo","last_synced_at":"2025-07-29T22:38:42.774Z","repository":{"id":240217735,"uuid":"802005021","full_name":"zackproser/tokenization-demo","owner":"zackproser","description":null,"archived":false,"fork":false,"pushed_at":"2024-05-17T10:39:06.000Z","size":215,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-29T05:34:15.829Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zackproser.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-17T10:34:01.000Z","updated_at":"2024-06-16T12:16:41.000Z","dependencies_parsed_at":"2024-05-17T11:53:11.183Z","dependency_job_id":null,"html_url":"https://github.com/zackproser/tokenization-demo","commit_stats":null,"previous_names":["zackproser/tokenization-demo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zackproser/tokenization-demo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Ftokenization-demo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Ftokenization-demo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Ftokenization-demo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Ftokenization-demo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/zackproser","download_url":"https://codeload.github.com/zackproser/tokenization-demo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Ftokenization-demo/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267772851,"owners_count":24142095,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T23:14:04.126Z","updated_at":"2025-07-29T22:38:42.752Z","avatar_url":"https://github.com/zackproser.png","language":"TypeScript","readme":"# Tokenization Demo\n\n### Understand how Large Language Models (LLMs) see your context and prompts\n\n![Tokenization demo](./docs/tokenization-demo.png)\n\n### Built With\n\n- Next.js + Tailwind CSS\n- The `js-tiktoken` library\n- Node version 20 or higher\n\n---\n\n### Start the project\n\n**Requires Node version 20+**\n\nFrom the project root directory, run the following command to install dependencies:\n\n```bash\nnpm install\n```\n\nThere are no required environment variables for this project, and it does not use any \nthird party services that cost money. 
It simply passes the text input from the frontend to the \nbackend API route, which tokenizes the text using `js-tiktoken`.\n\nStart the app:\n\n```bash\nnpm run dev\n```\n\n## Project structure\n\nIn this example we opted to use Next.js and the App Router, which colocates the frontend and backend code in a single repository.\n\n**Frontend client**\n\nThe frontend uses Next.js and Tailwind CSS to let users enter free-form text. When the user clicks the `Tokenize text` button, the text is sent to the backend API route, which converts it to tokens with the `js-tiktoken` library.\n\nThe library applies the model's byte pair encoding (BPE), assigning each token an ID from the model's vocabulary; a token is often a subword rather than a whole word.\n\n**Backend API route**\n\nThis project exposes an API route, `/api/tokenize`, that uses the `js-tiktoken` library to tokenize the text it receives from the frontend:\n\n```typescript\nimport { NextRequest, NextResponse } from 'next/server';\nimport { encodingForModel } from \"js-tiktoken\";\n\nexport async function POST(req: NextRequest) {\n  try {\n    // Load the BPE encoding for the target model\n    const enc = encodingForModel('gpt-3.5-turbo');\n\n    const { inputText } = await req.json();\n    console.log(`inputText: ${inputText}`);\n\n    // Encode the input into an array of token IDs\n    const tokens = enc.encode(inputText);\n\n    return NextResponse.json({ tokens }, { status: 200 });\n  } catch (error) {\n    return NextResponse.json({ error }, { status: 500 });\n  }\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzackproser%2Ftokenization-demo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzackproser%2Ftokenization-demo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzackproser%2Ftokenization-demo/lists"}