{"id":21058255,"url":"https://github.com/kuvaus/llama-chat","last_synced_at":"2025-05-16T00:33:14.498Z","repository":{"id":156366367,"uuid":"632843892","full_name":"kuvaus/llama-chat","owner":"kuvaus","description":"Simple chat program for LLaMa models","archived":false,"fork":false,"pushed_at":"2023-04-29T11:18:41.000Z","size":778,"stargazers_count":8,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-19T11:17:39.180Z","etag":null,"topics":["ai","cpp","gpt","gpt4all","llama"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kuvaus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-26T08:46:42.000Z","updated_at":"2025-04-06T23:38:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"eaf45da5-1749-4448-9127-e8443d0ea36f","html_url":"https://github.com/kuvaus/llama-chat","commit_stats":{"total_commits":40,"total_committers":1,"mean_commits":40.0,"dds":0.0,"last_synced_commit":"bb53fa50f0f6066e65d19f4bc1ef1357535c3b37"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kuvaus%2Fllama-chat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kuvaus%2Fllama-chat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kuvaus%2Fllama-chat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kuvaus%2Fllama-chat/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kuvaus","download_url":"https://codeload.github.com/kuvaus/llama-chat/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251626898,"owners_count":21617743,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","cpp","gpt","gpt4all","llama"],"created_at":"2024-11-19T17:07:16.348Z","updated_at":"2025-05-16T00:33:09.458Z","avatar_url":"https://github.com/kuvaus.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CMake](https://github.com/kuvaus/llama-chat/actions/workflows/cmake.yml/badge.svg)](https://github.com/kuvaus/llama-chat/actions/workflows/cmake.yml)\n# Llama-Chat\nSimple command line chat program for [LLaMA](https://en.wikipedia.org/wiki/LLaMA) models written in C++. Based on [llama.cpp](https://github.com/ggerganov/llama.cpp) with some bindings from [gpt4all-chat](https://github.com/nomic-ai/gpt4all-chat).\n\n\u003cimg alt=\"Llama-Chat demo\" src=\"https://user-images.githubusercontent.com/22169537/234532183-eb70ebca-7136-43d6-ac07-4ea9b4fab28f.gif\" width=\"600\" /\u003e\n\n# Table of contents\n\u003c!-- TOC --\u003e\n* [LLaMA-model](#llama-model)\n* [Installation](#installation)\n* [Usage](#usage)\n* [Detailed command list](#detailed-command-list)\n* [License](#license)\n\u003c!-- TOC --\u003e\n\n## LLaMA model\nYou need to download a LLaMA model first. The original weights are for research purposes and you can apply for access [here](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/). 
Below are direct links to derived models:\n\n\u003e- Vicuna 7b **v1.1**: [ggml-vicuna-7b-1.1-q4_2.bin](https://gpt4all.io/models/ggml-vicuna-7b-1.1-q4_2.bin)\n\u003e- Vicuna 13b **v1.1**: [ggml-vicuna-13b-1.1-q4_2.bin](https://gpt4all.io/models/ggml-vicuna-13b-1.1-q4_2.bin)\n\u003e- GPT-4-All **l13b-snoozy**: [ggml-gpt4all-l13b-snoozy.bin](https://gpt4all.io/models/ggml-gpt4all-l13b-snoozy.bin)\n\nThe LLaMA models are quite large: the 7B parameter versions are around 4.2 GB and the 13B parameter versions around 8.2 GB each. The chat program loads the model into RAM at runtime, so you need enough memory to run it. You can get more details on LLaMA models from the [whitepaper](https://arxiv.org/abs/2302.13971) or the Meta AI [website](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/).\n\n## Installation\n### Download\n\n```sh\ngit clone --recurse-submodules https://github.com/kuvaus/llama-chat\ncd llama-chat\n```\n\n### Build\n\n```sh\nmkdir build\ncd build\ncmake ..\ncmake --build . --parallel\n```\n\u003e **Note**\n\u003e If you have an old processor, you can turn AVX2 instructions off in the build step with the `-DAVX2=OFF` flag.\n\n## Usage\n\nAfter compiling, the binary is located at:\n\n```sh\nbuild/bin/chat\n```\nBut you're free to move it anywhere. 
A simple command to get started with 4 threads:\n```sh\n./chat -m \"/path/to/modelfile/ggml-vicuna-13b-1.1-q4_2.bin\" -t 4\n```\n\nHappy chatting!\n\n## Detailed command list\nYou can view the help and full parameter list with:\n`\n./chat -h\n`\n\n```sh\nusage: ./bin/chat [options]\n\nA simple chat program for LLaMA based models.\nYou can set specific initial prompt with the -p flag.\nRuns default in interactive and continuous mode.\nType 'quit', 'exit' or, 'Ctrl+C' to quit.\n\noptions:\n  -h, --help            show this help message and exit\n  --run-once            disable continuous mode\n  --no-interactive      disable interactive mode altogether (uses given prompt only)\n  -s SEED, --seed SEED  RNG seed (default: -1)\n  -t N, --threads N     number of threads to use during computation (default: 4)\n  -p PROMPT, --prompt PROMPT\n                        prompt to start generation with (default: empty)\n  --random-prompt       start with a randomized prompt.\n  -n N, --n_predict N   number of tokens to predict (default: 200)\n  --top_k N             top-k sampling (default: 50400)\n  --top_p N             top-p sampling (default: 1.0)\n  --temp N              temperature (default: 0.9)\n  -b N, --batch_size N  batch size for prompt processing (default: 9)\n  -r N, --remember N    number of chars to remember from start of previous answer (default: 200)\n  -j,   --load_json FNAME\n                        load options instead from json at FNAME (default: empty/no)\n  -m FNAME, --model FNAME\n                        model path (current: models/ggml-vicuna-13b-1.1-q4_2.bin)\n```\n\nYou can also load parameters from a JSON file with the `--load_json \"/path/to/file.json\"` flag.  
The JSON file has to be in the following format:\n\n```javascript\n{\"top_p\": 1.0, \"top_k\": 50400, \"temp\": 0.9, \"n_batch\": 9}\n```\nThis is useful when you want to store different temperature and sampling settings.\n\n## License\n\nThis project is licensed under the MIT [License](https://github.com/kuvaus/llama-chat/blob/main/LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkuvaus%2Fllama-chat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkuvaus%2Fllama-chat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkuvaus%2Fllama-chat/lists"}