{"id":18689709,"url":"https://github.com/mrseanryan/gpt-local","last_synced_at":"2025-08-10T08:11:19.605Z","repository":{"id":185619227,"uuid":"673821304","full_name":"mrseanryan/gpt-local","owner":"mrseanryan","description":"Local GPT (llama 2 or dolly or gpt etc.) via Python - using ctransforers project","archived":false,"fork":false,"pushed_at":"2023-08-09T17:18:52.000Z","size":10,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-12T05:53:41.633Z","etag":null,"topics":["ai","gpt","llm","python","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mrseanryan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-02T13:54:08.000Z","updated_at":"2023-12-31T19:11:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"45e6fbdd-5947-4d55-889a-0518bafecc3a","html_url":"https://github.com/mrseanryan/gpt-local","commit_stats":null,"previous_names":["mrseanryan/gpt-local"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mrseanryan/gpt-local","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrseanryan%2Fgpt-local","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrseanryan%2Fgpt-local/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrseanryan%2Fgpt-local/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrseanryan%2Fgpt-local/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mrseanryan","download_url":"https://codeload.github.com/mrseanryan/gpt-local/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrseanryan%2Fgpt-local/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269693593,"owners_count":24460248,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-10T02:00:08.965Z","response_time":71,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","gpt","llm","python","transformers"],"created_at":"2024-11-07T10:44:49.506Z","updated_at":"2025-08-10T08:11:19.585Z","avatar_url":"https://github.com/mrseanryan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# gpt-local README\n\nLocal LLM for GPT (llama 2 or dolly or gpt etc.) via Python - using the excellent ctransformers project.\n\nThis is basically a wrapper for quickly setting up a local LLM.\n\n## What is GPT ?\n\n*Generative Pre-trained Transformers, commonly known as GPT, are a family of neural network models that uses the transformer architecture and is a key advancement in artificial intelligence (AI) powering generative AI applications such as ChatGPT. 
GPT models give applications the ability to create human-like text and content (images, music, and more), and answer questions in a conversational manner.*\n\n\n## Usage\n\n```\n./go.sh \u003cpath_to_model\u003e \u003cmodel_type\u003e \u003cprompt\u003e\n```\n\nOR\n\n```\npython3 local-llm.py \u003cpath_to_model\u003e \u003cmodel_type\u003e \u003cprompt\u003e\n```\n\nExample output:\n\n```\nAI model: /home/sean/Downloads/models/llama-2-13b-chat.ggmlv3.q4_0.bin [llama]\n\u003e\u003e If Mary is faster than Joe and Sam is slower than Mary, then who is the fastest?\n\nThe answer is Mary.\nHow can I help you? [press ENTER to exit] \u003e\u003e\n```\n\n## Test\n\n1. Download a compatible model. To see which model types are supported, see the [ctransformers](https://github.com/marella/ctransformers) project.\n\nQuality models are available on Hugging Face - see [TheBloke](https://huggingface.co/TheBloke).\n\nExample: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin\n\nOR via bash:\n\n```\n./download-llama-2-13B-model.sh\n```\n\n2. Load the model and send it a prompt\n\nnote: This test assumes that the model is located under ~/Downloads/model.\n\n```\n./test.sh\n```\n\nOutput:\n\n```\nAI model: /home/sean/Downloads/models/llama-2-13b-chat.ggmlv3.q4_0.bin [llama]\n\u003e\u003e If Mary is faster than Joe and Sam is slower than Mary, then who is the fastest?\n\nThe answer is Mary.\nHow can I help you? 
[press ENTER to exit] \u003e\u003e\n```\n\n## Dependencies\n\n- Python 3\n- pip3\n- OS: Unix - Tested on Ubuntu\n\n```\npip install ctransformers\n```\n\n## Using GPU with the local model\n\nIf you have an NVIDIA graphics card, then you can run part or all of the model (depending on the card's memory) on the GPU,\nwhich has a much higher level of parallelism than the typical CPU.\n\nRequired:\n- latest NVIDIA graphics driver\n- up-to-date version of CUDA\n\n**Tip: if you get errors when running the model, like this:**\n\n```\n \u003e\u003e Cuda error: no kernel image is available for execution on the device\n```\n\nthen try building ctransformers locally.\n\nThis is actually quite simple:\n\n```\npip3 uninstall ctransformers\npip3 install ctransformers --no-binary ctransformers # use --no-binary to force a local build. This ensures that the local version of CUDA and NVIDIA graphics driver will be used.\n```\n\nYou can tweak the settings in `config.py`.\n\nFor more details, see the main tool [ctransformers](https://github.com/marella/ctransformers).\n\n## Alternatives to ctransformers\n\n### [Alternative 1] (more complicated to set up, has a nice web user interface) Python web UI via PyTorch and text-generation-webui\n\nThese are some rough notes, taken from [YouTube](https://www.youtube.com/watch?v=k2FHUP0krqg\u0026ab_channel=MatthewBerman) - thanks to the guy who made that video! See also the [gist](https://gist.github.com/mberman84/45545e48040ef6aafb6a1cb3442edb83).\n\n1. install conda (package manager)\n2. use conda to create and activate an environment:\n\n```\nconda create -n textgen python=3.10.9\nconda activate textgen\n```\n\n3. install PyTorch and text-generation-webui, then start the server:\n\n```\npip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117\n\ngit clone https://github.com/oobabooga/text-generation-webui\n\ncd text-generation-webui\n\npip install -r requirements.txt\n\npython server.py\n```\n\n4. Download a model\n\nExample: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML\n\n5. 
In the web UI: refresh the model list\n\n6. In the web UI: load the model\n\n7. In the web UI: switch to chat mode\n\n### [Alternative 2] C++ (better performance, harder to customize)\n\nhttps://replicate.com/blog/run-llama-locally\n\nnote: make sure you pick the correct script for your OS!\n\n## References\n\n- [ctransformers](https://github.com/marella/ctransformers)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrseanryan%2Fgpt-local","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmrseanryan%2Fgpt-local","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrseanryan%2Fgpt-local/lists"}