{"id":49332662,"url":"https://github.com/countzero/windows_exllama","last_synced_at":"2026-04-26T23:03:51.697Z","repository":{"id":177850089,"uuid":"660999923","full_name":"countzero/windows_exllama","owner":"countzero","description":"This is a playground to explore the ExLlama project in a Windows environment.","archived":false,"fork":false,"pushed_at":"2023-07-20T12:00:07.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-04-23T21:41:22.285Z","etag":null,"topics":["conda","cuda","exllama","python","torch"],"latest_commit_sha":null,"homepage":"","language":"PowerShell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/countzero.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-07-01T13:23:26.000Z","updated_at":"2024-04-23T21:41:22.286Z","dependencies_parsed_at":"2023-07-10T20:16:21.434Z","dependency_job_id":"456176fc-70f1-4978-8115-fa22a9ed5e47","html_url":"https://github.com/countzero/windows_exllama","commit_stats":{"total_commits":14,"total_committers":2,"mean_commits":7.0,"dds":0.5,"last_synced_commit":"9851482cd5299016bbfa841cb9047c02f98b2b86"},"previous_names":["countzero/windows_exllama"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/countzero/windows_exllama","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/countzero%2Fwindows_exllama","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/countzero%2Fwindows_exllama/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/countzero%2Fwindows_exllama/re
leases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/countzero%2Fwindows_exllama/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/countzero","download_url":"https://codeload.github.com/countzero/windows_exllama/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/countzero%2Fwindows_exllama/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32315714,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T21:09:39.134Z","status":"ssl_error","status_checked_at":"2026-04-26T21:09:21.240Z","response_time":129,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["conda","cuda","exllama","python","torch"],"created_at":"2026-04-26T23:03:50.715Z","updated_at":"2026-04-26T23:03:51.688Z","avatar_url":"https://github.com/countzero.png","language":"PowerShell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Learning ExLlama\n\nThis is a playground to explore the [ExLlama](https://github.com/turboderp/exllama) project in a Windows environment.\n\n## Installation\n\n### 1. 
Install Prerequisites\n\nDownload and install the latest versions of:\n\n* [CMake](https://cmake.org/download/)\n* [CUDA](https://developer.nvidia.com/cuda-downloads)\n* [Git Large File Storage](https://git-lfs.com)\n* [Git](https://git-scm.com/download)\n* [Miniconda](https://conda.io/projects/conda/en/stable/user-guide/install)\n* [Visual Studio 2022 - Community](https://visualstudio.microsoft.com/downloads/)\n\n**Hint:** When installing Visual Studio 2022 it is sufficient to install the `Build Tools for Visual Studio 2022` package. Also make sure that `Desktop development with C++` is enabled in the installer.\n\n### 2. Clone the repository from GitHub\n\nClone the repository to a suitable location on your machine via:\n\n```Shell\ngit clone --recurse-submodules git@github.com:countzero/windows_exllama.git\n```\n\n### 3. Update the exllama submodule to the latest version (optional)\n\nThis repository may reference an outdated version of the exllama repository. To update the submodule to the latest version, execute the following:\n\n```Shell\ngit submodule update --remote --merge\n```\n\nThen add, commit and push the changes to make the update available to others.\n\n```Shell\ngit add --all; git commit -am \"Update exllama submodule to latest commit\"; git push\n```\n\n**Hint:** This is optional because the build script will pull the latest version.\n\n### 4. Create a new Conda environment\n\nCreate a new Conda environment for this project with a specific version of Python:\n\n```Shell\nconda create --name exllama python=3.10\n```\n\n### 5. Initialize Conda for shell interaction\n\nTo make Conda available in your current shell, execute the following:\n\n```Shell\nconda init\n```\n\n**Hint:** You can always revert this via `conda init --reverse`.\n\n### 6. Execute the build script\n\n```PowerShell\n./rebuild_exllama.ps1\n```\n\n### 7. Download a large language model\n\nDownload a large language model (LLM) with weights in the GPTQ format into the `./models` directory. 
For example, you can download the [vicuna-7b-v1.3](https://huggingface.co/lmsys/vicuna-7b-v1.3) model in a quantized GPTQ format via:\n\n```Shell\ngit clone https://huggingface.co/TheBloke/vicuna-7B-v1.3-GPTQ ./models/vicuna-7B-v1.3-GPTQ\n```\n\n**Hint:** See the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) for best-in-class open-source LLMs.\n\n## Usage\n\n### Chat\n\nActivate the Conda environment to make the dependencies available via:\n\n```Shell\nconda activate exllama\n```\n\nExecute the following to chat with a GPTQ formatted model:\n\n```PowerShell\npython ./vendor/exllama/example_chatbot.py `\n    --directory \"./models/vicuna-7B-v1.3-GPTQ\" `\n    --prompt \"./prompts/chatbot.txt\" `\n    --botname \"Vicuña\" `\n    --username \"User\" `\n    --length 2048 `\n    --no_newline\n```\n\n### Benchmark\n\nActivate the Conda environment to make the dependencies available via:\n\n```Shell\nconda activate exllama\n```\n\nExecute the following to benchmark your system:\n\n```PowerShell\npython ./vendor/exllama/test_benchmark_inference.py `\n    --directory \"./models/vicuna-7B-v1.3-GPTQ\" `\n    --perf\n```\n\n### Measure model perplexity\n\nActivate the Conda environment to make the dependencies available via:\n\n```Shell\nconda activate exllama\n```\n\nExecute the following to measure the perplexity of the GPTQ formatted model:\n\n```PowerShell\npython ./vendor/exllama/test_benchmark_inference.py `\n    --directory \"./models/vicuna-7B-v1.3-GPTQ\" `\n    --perplexity `\n    --perplexity_dataset \"./vendor/exllama/datasets/wikitext2_val_sample.jsonl\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcountzero%2Fwindows_exllama","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcountzero%2Fwindows_exllama","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcountzero%2Fwindows_exllama/lists"}