{"id":13452945,"url":"https://github.com/PromtEngineer/localGPT","last_synced_at":"2025-03-24T00:32:42.651Z","repository":{"id":168901069,"uuid":"644715009","full_name":"PromtEngineer/localGPT","owner":"PromtEngineer","description":"Chat with your documents on your local device using GPT models. No data leaves your device and 100% private. ","archived":false,"fork":false,"pushed_at":"2025-03-02T02:42:51.000Z","size":3784,"stargazers_count":20380,"open_issues_count":479,"forks_count":2267,"subscribers_count":168,"default_branch":"main","last_synced_at":"2025-03-18T14:51:19.419Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PromtEngineer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":"promptengineering","tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2023-05-24T05:32:40.000Z","updated_at":"2025-03-18T14:20:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"1d84c731-abb5-4fc4-8b2e-9598f7190571","html_url":"https://github.com/PromtEngineer/localGPT","commit_stats":null,"previous_names":["promtengineer/privategpt-plus"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PromtEngineer%2FlocalGPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PromtEng
ineer%2FlocalGPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PromtEngineer%2FlocalGPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PromtEngineer%2FlocalGPT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PromtEngineer","download_url":"https://codeload.github.com/PromtEngineer/localGPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245191489,"owners_count":20575246,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T08:00:28.927Z","updated_at":"2025-03-24T00:32:42.638Z","avatar_url":"https://github.com/PromtEngineer.png","language":"Python","funding_links":["https://ko-fi.com/promptengineering"],"categories":["Python","Knowledge Management","[Local GPT](https://github.com/PromtEngineer/localGPT)","AI and Language Models","others","A01_文本生成_文本对话","Community Projects","HarmonyOS","Learning","Repos","工具","📚 Contents","🤖 Deep Research Systems","Tools","Open-Source Local LLM Projects","Autonomous Agents","Open Source Projects","🧠 AI Applications \u0026 Platforms","Security \u0026 Privacy Agents","5. 
Retrieval-Augmented Generation (RAG) \u0026 Knowledge","AI Assistants"],"sub_categories":["Links","大语言对话模型及数据","RAG \u0026 Knowledge Management","Windows Manager","Repositories","代理","🌐 Open-Source Deep Research Implementations","Agents","Tools","Post-Exploitation Agents"],"readme":"# LocalGPT: Secure, Local Conversations with Your Documents 🌐\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://trendshift.io/repositories/2947\" target=\"_blank\"\u003e\u003cimg src=\"https://trendshift.io/api/badge/repositories/2947\" alt=\"PromtEngineer%2FlocalGPT | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n[![GitHub Stars](https://img.shields.io/github/stars/PromtEngineer/localGPT?style=social)](https://github.com/PromtEngineer/localGPT/stargazers)\n[![GitHub Forks](https://img.shields.io/github/forks/PromtEngineer/localGPT?style=social)](https://github.com/PromtEngineer/localGPT/network/members)\n[![GitHub Issues](https://img.shields.io/github/issues/PromtEngineer/localGPT)](https://github.com/PromtEngineer/localGPT/issues)\n[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/PromtEngineer/localGPT)](https://github.com/PromtEngineer/localGPT/pulls)\n[![License](https://img.shields.io/github/license/PromtEngineer/localGPT)](https://github.com/PromtEngineer/localGPT/blob/main/LICENSE)\n\n🚨🚨 You can run localGPT on a pre-configured [Virtual Machine](https://bit.ly/localGPT). Make sure to use the code: PromptEngineering to get 50% off. I will get a small commission!\n\n**LocalGPT** is an open-source initiative that allows you to converse with your documents without compromising your privacy. With everything running locally, you can be assured that no data ever leaves your computer. 
Dive into the world of secure, local document interactions with LocalGPT.\n\n## Features 🌟\n- **Utmost Privacy**: Your data remains on your computer, ensuring 100% security.\n- **Versatile Model Support**: Seamlessly integrate a variety of open-source models, including HF, GPTQ, GGML, and GGUF.\n- **Diverse Embeddings**: Choose from a range of open-source embeddings.\n- **Reuse Your LLM**: Once downloaded, reuse your LLM without the need for repeated downloads.\n- **Chat History**: Remembers your previous conversations (in a session).\n- **API**: LocalGPT has an API that you can use for building RAG Applications.\n- **Graphical Interface**: LocalGPT comes with two GUIs, one uses the API and the other is standalone (based on streamlit).\n- **GPU, CPU, HPU \u0026 MPS Support**: Supports multiple platforms out of the box, Chat with your data using `CUDA`, `CPU`, `HPU (Intel® Gaudi®)` or `MPS` and more!\n\n## Dive Deeper with Our Videos 🎥\n- [Detailed code-walkthrough](https://youtu.be/MlyoObdIHyo)\n- [Llama-2 with LocalGPT](https://youtu.be/lbFmceo4D5E)\n- [Adding Chat History](https://youtu.be/d7otIM_MCZs)\n- [LocalGPT - Updated (09/17/2023)](https://youtu.be/G_prHSKX9d4)\n\n## Technical Details 🛠️\nBy selecting the right local models and the power of `LangChain` you can run the entire RAG pipeline locally, without any data leaving your environment, and with reasonable performance.\n\n- `ingest.py` uses `LangChain` tools to parse the document and create embeddings locally using `InstructorEmbeddings`. It then stores the result in a local vector database using `Chroma` vector store.\n- `run_localGPT.py` uses a local LLM to understand questions and create answers. The context for the answers is extracted from the local vector store using a similarity search to locate the right piece of context from the docs.\n- You can replace this local LLM with any other LLM from the HuggingFace. 
Make sure whatever LLM you select is in the HF format.\n\nThis project was inspired by the original [privateGPT](https://github.com/imartinez/privateGPT).\n\n## Built Using 🧩\n- [LangChain](https://github.com/hwchase17/langchain)\n- [HuggingFace LLMs](https://huggingface.co/models)\n- [InstructorEmbeddings](https://instructor-embedding.github.io/)\n- [LLAMACPP](https://github.com/abetlen/llama-cpp-python)\n- [ChromaDB](https://www.trychroma.com/)\n- [Streamlit](https://streamlit.io/)\n\n# Environment Setup 🌍\n\n1. 📥 Clone the repo using git:\n\n```shell\ngit clone https://github.com/PromtEngineer/localGPT.git\n```\n\n2. 🐍 Install [conda](https://www.anaconda.com/download) for virtual environment management. Create and activate a new virtual environment.\n\n```shell\nconda create -n localGPT python=3.10.0\nconda activate localGPT\n```\n\n3. 🛠️ Install the dependencies using pip\n\nTo set up your environment to run the code, first install all requirements:\n\n```shell\npip install -r requirements.txt\n```\n\n***Installing LLAMA-CPP :***\n\nLocalGPT uses [LlamaCpp-Python](https://github.com/abetlen/llama-cpp-python) for GGML (you will need llama-cpp-python \u003c=0.1.76) and GGUF (llama-cpp-python \u003e=0.1.83) models.\n\nTo run the quantized Llama3 model, ensure you have llama-cpp-python version 0.2.62 or higher installed.\n\nIf you want to use BLAS or Metal with [llama-cpp](https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast--metal) you can set appropriate flags:\n\nFor `NVIDIA` GPUs support, use `cuBLAS`\n\n```shell\n# Example: cuBLAS\nCMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir\n```\n\nFor Apple Metal (`M1/M2`) support, use\n\n```shell\n# Example: METAL\nCMAKE_ARGS=\"-DLLAMA_METAL=on\"  FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir\n```\nFor more details, please refer to 
[llama-cpp](https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast--metal)\n\n## Docker 🐳\n\nInstalling the required packages for GPU inference on NVIDIA GPUs, like gcc 11 and CUDA 11, may cause conflicts with other packages in your system.\nAs an alternative to Conda, you can use Docker with the provided Dockerfile.\nIt includes CUDA; your system only needs Docker, BuildKit, your NVIDIA GPU driver, and the NVIDIA container toolkit.\nBuild with `docker build -t localgpt .` (requires BuildKit).\nDocker BuildKit does not currently support GPU access during *docker build*, only during *docker run*.\nRun with `docker run -it --mount src=\"$HOME/.cache\",target=/root/.cache,type=bind --gpus=all localgpt`.\nTo run the code on Intel® Gaudi® HPU, use `Dockerfile_hpu` instead.\n\n## Test dataset\n\nFor testing, this repository comes with the [Constitution of USA](https://constitutioncenter.org/media/files/constitution.pdf) as an example file.\n\n## Ingesting your OWN Data.\nPut your files in the `SOURCE_DOCUMENTS` folder. You can put multiple folders within the `SOURCE_DOCUMENTS` folder and the code will recursively read your files.\n\n### Supported file formats:\nLocalGPT currently supports the following file formats. LocalGPT uses `LangChain` for loading these file formats. The code in `constants.py` uses a `DOCUMENT_MAP` dictionary to map a file format to the corresponding loader. 
To add support for another file format, add an entry to this dictionary that maps the file extension to the corresponding loader from [LangChain](https://python.langchain.com/docs/modules/data_connection/document_loaders/).\n\n```shell\nDOCUMENT_MAP = {\n    \".txt\": TextLoader,\n    \".md\": TextLoader,\n    \".py\": TextLoader,\n    \".pdf\": PDFMinerLoader,\n    \".csv\": CSVLoader,\n    \".xls\": UnstructuredExcelLoader,\n    \".xlsx\": UnstructuredExcelLoader,\n    \".docx\": Docx2txtLoader,\n    \".doc\": Docx2txtLoader,\n}\n```\n\n### Ingest\n\nRun the following command to ingest all the data.\n\nIf you have `cuda` set up on your system:\n\n```shell\npython ingest.py\n```\nYou will see an output like this:\n\u003cimg width=\"1110\" alt=\"Screenshot 2023-09-14 at 3 36 27 PM\" src=\"https://github.com/PromtEngineer/localGPT/assets/134474669/c9274e9a-842c-49b9-8d95-606c3d80011f\"\u003e\n\n\nUse the `--device_type` argument to specify a given device.\nTo run on `cpu`\n\n```sh\npython ingest.py --device_type cpu\n```\n\nTo run on `M1/M2`\n\n```sh\npython ingest.py --device_type mps\n```\n\nUse `--help` for a full list of supported devices.\n\n```sh\npython ingest.py --help\n```\n\nThis will create a new folder called `DB` and use it for the newly created vector store. You can ingest as many documents as you want, and all will be accumulated in the local embeddings database.\nIf you want to start from an empty database, delete the `DB` folder and re-ingest your documents.\n\nNote: When you run this for the first time, it will need internet access to download the embedding model (default: `Instructor Embedding`). 
In the subsequent runs, no data will leave your local environment and you can ingest data without internet connection.\n\n## Ask questions to your documents, locally!\n\nIn order to chat with your documents, run the following command (by default, it will run on `cuda`).\n\n```shell\npython run_localGPT.py\n```\nYou can also specify the device type just like `ingest.py`\n\n```shell\npython run_localGPT.py --device_type mps # to run on Apple silicon\n```\n\n```shell\n# To run on Intel® Gaudi® hpu\nMODEL_ID = \"mistralai/Mistral-7B-Instruct-v0.2\" # in constants.py\npython run_localGPT.py --device_type hpu\n```\n\nThis will load the ingested vector store and embedding model. You will be presented with a prompt:\n\n```shell\n\u003e Enter a query:\n```\n\nAfter typing your question, hit enter. LocalGPT will take some time based on your hardware. You will get a response like this below.\n\u003cimg width=\"1312\" alt=\"Screenshot 2023-09-14 at 3 33 19 PM\" src=\"https://github.com/PromtEngineer/localGPT/assets/134474669/a7268de9-ade0-420b-a00b-ed12207dbe41\"\u003e\n\nOnce the answer is generated, you can then ask another question without re-running the script, just wait for the prompt again.\n\n\n***Note:*** When you run this for the first time, it will need internet connection to download the LLM (default: `TheBloke/Llama-2-7b-Chat-GGUF`). After that you can turn off your internet connection, and the script inference would still work. No data gets out of your local environment.\n\nType `exit` to finish the script.\n\n### Extra Options with run_localGPT.py\n\nYou can use the `--show_sources` flag with `run_localGPT.py` to show which chunks were retrieved by the embedding model. By default, it will show 4 different sources/chunks. You can change the number of sources/chunks\n\n```shell\npython run_localGPT.py --show_sources\n```\n\nAnother option is to enable chat history. ***Note***: This is disabled by default and can be enabled by using the  `--use_history` flag. 
The context window is limited, so keep in mind that enabling history consumes part of it and may cause it to overflow.\n\n```shell\npython run_localGPT.py --use_history\n```\n\nYou can store user questions and model responses in a CSV file, `/local_chat_history/qa_log.csv`, with the `--save_qa` flag. Every interaction will be stored.\n\n```shell\npython run_localGPT.py --save_qa\n```\n\n# Run the Graphical User Interface\n\n1. Open `constants.py` in an editor of your choice and set the LLM you want to use. By default, the following model will be used:\n\n   ```shell\n   MODEL_ID = \"TheBloke/Llama-2-7b-Chat-GGUF\"\n   MODEL_BASENAME = \"llama-2-7b-chat.Q4_K_M.gguf\"\n   ```\n\n2. Open up a terminal and activate your Python environment that contains the dependencies installed from requirements.txt.\n\n3. Navigate to the `/LOCALGPT` directory.\n\n4. Run the following command `python run_localGPT_API.py`. The API should begin to run.\n\n5. Wait until everything has loaded in. You should see something like `INFO:werkzeug:Press CTRL+C to quit`.\n\n6. Open up a second terminal and activate the same Python environment.\n\n7. Navigate to the `/LOCALGPT/localGPTUI` directory.\n\n8. Run the command `python localGPTUI.py`.\n\n9. Open up a web browser and go to the address `http://localhost:5111/`.\n\n\n# How to select different LLM models?\n\nTo change the models you will need to set both `MODEL_ID` and `MODEL_BASENAME`.\n\n1. Open up `constants.py` in the editor of your choice.\n2. Change the `MODEL_ID` and `MODEL_BASENAME`. If you are using a quantized model (`GGML`, `GPTQ`, `GGUF`), you will need to provide `MODEL_BASENAME`. For unquantized models, set `MODEL_BASENAME` to `None`.\n5. A number of example models from HuggingFace have already been tested, both original trained models (names ending with HF, or with a .bin file in \"Files and versions\") and quantized models (names ending with GPTQ, or with a .no-act-order or .safetensors file in \"Files and versions\").\n6. 
For models whose names end with HF or that have a .bin file under \"Files and versions\" on their HuggingFace page:\n\n   - Make sure you have a `MODEL_ID` selected. For example -\u003e `MODEL_ID = \"TheBloke/guanaco-7B-HF\"`\n   - Go to the [HuggingFace Repo](https://huggingface.co/TheBloke/guanaco-7B-HF)\n\n7. For models that contain GPTQ in their name and/or have a .no-act-order or .safetensors file under \"Files and versions\" on their HuggingFace page:\n\n   - Make sure you have a `MODEL_ID` selected. For example -\u003e `MODEL_ID = \"TheBloke/wizardLM-7B-GPTQ\"`\n   - Go to the corresponding [HuggingFace Repo](https://huggingface.co/TheBloke/wizardLM-7B-GPTQ) and select \"Files and versions\".\n   - Pick one of the model names and set it as `MODEL_BASENAME`. For example -\u003e `MODEL_BASENAME = \"wizardLM-7B-GPTQ-4bit.compat.no-act-order.safetensors\"`\n\n8. Follow the same steps for `GGUF` and `GGML` models.\n\n# GPU and VRAM Requirements\n\nBelow are the VRAM requirements for different models depending on their size (billions of parameters). The estimates in the table do not include VRAM used by the embedding models, which use an additional 2 GB to 7 GB of VRAM depending on the model.\n\n| Model Size (B) | float32   | float16   | GPTQ 8bit      | GPTQ 4bit          |\n| ------- | --------- | --------- | -------------- | ------------------ |\n| 7B      | 28 GB     | 14 GB     | 7 GB - 9 GB    | 3.5 GB - 5 GB      |\n| 13B     | 52 GB     | 26 GB     | 13 GB - 15 GB  | 6.5 GB - 8 GB      |\n| 32B     | 130 GB    | 65 GB     | 32.5 GB - 35 GB| 16.25 GB - 19 GB   |\n| 65B     | 260.8 GB  | 130.4 GB  | 65.2 GB - 67 GB| 32.6 GB - 35 GB    |\n\n\n# System Requirements\n\n## Python Version\n\nTo use this software, you must have Python 3.10 or later installed. 
The code will not run on earlier versions of Python.\n\n## C++ Compiler\n\nIf you encounter an error while building a wheel during the `pip install` process, you may need to install a C++ compiler on your computer.\n\n### For Windows 10/11\n\nTo install a C++ compiler on Windows 10/11, follow these steps:\n\n1. Install Visual Studio 2022.\n2. Make sure the following components are selected:\n   - Universal Windows Platform development\n   - C++ CMake tools for Windows\n3. Download the MinGW installer from the [MinGW website](https://sourceforge.net/projects/mingw/).\n4. Run the installer and select the \"gcc\" component.\n\n### NVIDIA Driver Issues:\n\nFollow this [page](https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-22-04) to install NVIDIA drivers.\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=PromtEngineer/localGPT\u0026type=Date)](https://star-history.com/#PromtEngineer/localGPT\u0026Date)\n\n# Disclaimer\n\nThis is a test project to validate the feasibility of a fully local solution for question answering using LLMs and vector embeddings. It is not production-ready, and it is not meant to be used in production. 
Vicuna-7B is based on the Llama model, so it carries the original Llama license.\n\n# Common Errors\n\n - [Torch not compiled with CUDA enabled](https://github.com/pytorch/pytorch/issues/30664)\n\n   -  Get CUDA version\n      ```shell\n      nvcc --version\n      ```\n      ```shell\n      nvidia-smi\n      ```\n   - Try installing the PyTorch build that matches your CUDA version\n      ```shell\n         conda install -c pytorch torchvision cudatoolkit=10.1 pytorch\n      ```\n   - If it doesn't work, try reinstalling\n      ```shell\n         pip uninstall torch\n         pip cache purge\n         pip install torch -f https://download.pytorch.org/whl/torch_stable.html\n      ```\n\n- [ERROR: pip's dependency resolver does not currently take into account all the packages that are installed](https://stackoverflow.com/questions/72672196/error-pips-dependency-resolver-does-not-currently-take-into-account-all-the-pa/76604141#76604141)\n  ```shell\n     pip install h5py\n     pip install typing-extensions\n     pip install wheel\n  ```\n- [Failed to import transformers](https://github.com/huggingface/transformers/issues/11262)\n  - Try reinstalling\n    ```shell\n       conda uninstall tokenizers transformers\n       pip install transformers\n    ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPromtEngineer%2FlocalGPT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPromtEngineer%2FlocalGPT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPromtEngineer%2FlocalGPT/lists"}