{"id":16622005,"url":"https://github.com/itzderock/llama-playground","last_synced_at":"2025-10-29T21:31:53.411Z","repository":{"id":147519461,"uuid":"618852047","full_name":"ItzDerock/llama-playground","owner":"ItzDerock","description":"A simple to use and powerful web-interface to mess around with Meta's LLaMA LLM. ","archived":false,"fork":false,"pushed_at":"2023-03-30T18:56:42.000Z","size":272,"stargazers_count":16,"open_issues_count":3,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-10T06:09:38.775Z","etag":null,"topics":["llama","llama-cpp","llama-inference-server","llamacpp","nextjs","trpc"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ItzDerock.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-25T14:42:22.000Z","updated_at":"2024-06-10T23:23:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"c0fd62b6-9289-4111-92bf-0c32b304a704","html_url":"https://github.com/ItzDerock/llama-playground","commit_stats":{"total_commits":11,"total_committers":1,"mean_commits":11.0,"dds":0.0,"last_synced_commit":"d8358d07b39769a5da9cfbf5ce6330db12327fd6"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ItzDerock%2Fllama-playground","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ItzDerock%2Fllama-playground/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ItzDerock%2F
llama-playground/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ItzDerock%2Fllama-playground/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ItzDerock","download_url":"https://codeload.github.com/ItzDerock/llama-playground/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219857154,"owners_count":16556071,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llama","llama-cpp","llama-inference-server","llamacpp","nextjs","trpc"],"created_at":"2024-10-12T02:49:08.837Z","updated_at":"2025-10-29T21:31:52.501Z","avatar_url":"https://github.com/ItzDerock.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🦙 LLaMA Playground 🛝\n\nA simple OpenAI-inspired interface that uses [llama.cpp#tcp_server](https://github.com/ggerganov/llama.cpp/tree/tcp_server) in the background.\n\n![demo](./public/demo.gif)\n\n## Difference vs. other interfaces\n\nOther interfaces use the llama.cpp CLI command to run the model. This is not ideal since it requires spawning a new process for each request. This is not only slow but also requires loading the model each time. This interface uses the llama.cpp tcp_server to run the model in the background. 
This allows multiple requests to run in parallel and keeps the model cached in memory.\n\n## Features\n\n- Simple to use UI\n- Able to handle multiple requests in parallel quickly\n- Controls to change the model parameters on the fly\n  - Does not require rebooting; changes are applied instantly\n- Save and load templates to save your work\n  - Templates are saved in the browser's local storage and are not sent to the server\n\n## About\n\nBuilt on top of a modified [T3-stack](https://github.com/t3-oss/create-t3-app) application.  \nFastify is used instead of the regular Next.js server since websocket support is needed.  \n[Mantine](https://mantine.dev/) is used for the UI.  \n[tRPC](https://trpc.io/) is used for an end-to-end type-safe API.\n\nThe Fastify server starts a tcp_server from llama.cpp in the background.  \nUpon each request, the server establishes a new TCP connection to the tcp_server and sends the request.  \nOutput is then forwarded to the client via websockets.\n\n## Notice\n\nThis is not meant to be used in production. There is no rate-limiting, no authentication, etc. It is just a simple interface to play with the models.\n\n## Usage\n\n### Getting the model\n\nThis repository will not include the model weights as these are the property of Meta. Do not share the weights in this repository.\n\nCurrently, the application will not convert and quantize the model for you. You will need to do this yourself. 
This means you will need the llama.cpp build dependencies.\n\n- For Ubuntu: `build-essential make python3`\n- For Arch: `base-devel make python3`\n\n```bash\n# build llama.cpp\ngit clone https://github.com/ggerganov/llama.cpp\ncd llama.cpp\nmake\n\n# obtain the original LLaMA model weights and place them in ./models\nls ./models\n65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model\n\n# install Python dependencies\npython3 -m pip install torch numpy sentencepiece\n\n# convert the 7B model to ggml FP16 format\npython3 convert-pth-to-ggml.py models/7B/ 1\n\n# quantize the model to 4 bits\npython3 quantize.py 7B\n```\n\n\u003csub\u003e\u003csup\u003e^ (source [llama.cpp/README.md](https://github.com/ggerganov/llama.cpp/))\u003c/sup\u003e\u003c/sub\u003e\n\nThen you can start the server using one of the methods below:\n\n### With Docker\n\n```bash\n# Clone the repository\ngit clone --depth 1 https://github.com/ItzDerock/llama-playground .\n\n# Edit the docker-compose.yml file to point to the correct model\nvim docker-compose.yml\n\n# Start the server\ndocker-compose up -d\n```\n\n### Without Docker\n\n```bash\n# Clone the repository\ngit clone --depth 1 https://github.com/ItzDerock/llama-playground .\n\n# Install dependencies\npnpm install # you will need pnpm\n\n# Edit the .env file to point to the correct model\nvim .env\n\n# Build the server\npnpm build\n\n# Start the server\npnpm start\n```\n\n## Development\n\nRun `pnpm run dev` to start the development server.\n\n## License\n\n[MIT](./LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzderock%2Fllama-playground","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fitzderock%2Fllama-playground","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzderock%2Fllama-playground/lists"}