{"id":14964733,"url":"https://github.com/uminosachi/open-llm-webui","last_synced_at":"2025-10-25T08:30:22.088Z","repository":{"id":166430487,"uuid":"641766239","full_name":"Uminosachi/open-llm-webui","owner":"Uminosachi","description":"This repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).","archived":false,"fork":false,"pushed_at":"2024-08-18T01:52:56.000Z","size":1309,"stargazers_count":31,"open_issues_count":1,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-09-23T11:05:25.333Z","etag":null,"topics":["chatbot","ggml","gradio","huggingface","language-model","llama","llama2","llama3","llava","llava-llama3","llm","nlp","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Uminosachi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-17T06:02:34.000Z","updated_at":"2024-09-10T12:56:07.000Z","dependencies_parsed_at":"2024-05-04T05:26:43.108Z","dependency_job_id":"45e8e4c7-ef8f-47a7-a2f0-40f7aa1fa856","html_url":"https://github.com/Uminosachi/open-llm-webui","commit_stats":{"total_commits":193,"total_committers":1,"mean_commits":193.0,"dds":0.0,"last_synced_commit":"c994a3593af76aed2fa106c5ea6f045105e010c6"},"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Uminosachi%2Fopen-llm-webui","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/Uminosachi%2Fopen-llm-webui/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Uminosachi%2Fopen-llm-webui/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Uminosachi%2Fopen-llm-webui/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Uminosachi","download_url":"https://codeload.github.com/Uminosachi/open-llm-webui/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219867832,"owners_count":16554368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","ggml","gradio","huggingface","language-model","llama","llama2","llama3","llava","llava-llama3","llm","nlp","transformers"],"created_at":"2024-09-24T13:33:42.337Z","updated_at":"2025-10-25T08:30:22.082Z","avatar_url":"https://github.com/Uminosachi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Open LLM WebUI\n\nThis repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).\n\n## Installation\n\nPlease follow these steps to install the software:\n\n* Create a new conda environment:\n\n  ```bash\n  conda create -n ollm python=3.10\n  conda activate ollm\n  ```\n\n* Clone the software repository:\n\n  ```bash\n  git clone https://github.com/Uminosachi/open-llm-webui.git\n  cd open-llm-webui\n  ```\n\n### Python Package Installation\n\n#### General Instructions\n\n* Install the necessary Python packages by executing:\n\n  ```bash\n  pip install -r 
requirements.txt\n  ```\n\n#### (Optional) Installation for Flash Attention\n\n* If CUDA is available, install the `flash-attn` package to enable Flash Attention in models that support it:\n\n  ```bash\n  pip install packaging ninja\n  pip install flash-attn --no-build-isolation\n  ```\n\n#### Platform-Specific Instructions\n\n* **For Windows (with CUDA support):**\n\n  ##### Install pre-built wheel for Windows\n\n  * It is possible to install a pre-built wheel with CUDA support.\n    * Source URL: [https://abetlen.github.io/llama-cpp-python/whl/cu121/llama-cpp-python/](https://abetlen.github.io/llama-cpp-python/whl/cu121/llama-cpp-python/)\n\n    ```bash\n    wget https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.4-cu121/llama_cpp_python-0.3.4-cp310-cp310-win_amd64.whl\n    pip install llama_cpp_python-0.3.4-cp310-cp310-win_amd64.whl\n    pip install -r requirements.txt\n    ```\n\n  ##### (Optional) Build with CUDA for Windows\n\n  * Install [Visual Studio](https://learn.microsoft.com/en-us/visualstudio/install/install-visual-studio?view=vs-2022):\n    * ⚠️ Important: Make sure to select `Desktop development with C++` during the installation process.\n  * Copy MSBuild extensions for CUDA as an administrator (adjust the CUDA version `v12.1` as needed):\n\n    ```bash\n    xcopy /e \"C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.1\\extras\\visual_studio_integration\\MSBuildExtensions\" \"C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\MSBuild\\Microsoft\\VC\\v170\\BuildCustomizations\"\n    ```\n\n  * Configure the required environment variables for the build (adjust the CUDA version as necessary):\n\n    ```bash\n    set PATH=C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.1\\bin;%PATH%\n    \"C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Auxiliary\\Build\\vcvars64.bat\"\n    set FORCE_CMAKE=1\n    set CMAKE_ARGS=\"-DGGML_CUDA=ON -DCMAKE_CXX_FLAGS=/utf-8 -DCMAKE_C_FLAGS=/utf-8\"\n    set 
CMAKE_BUILD_PARALLEL_LEVEL=16\n    ```\n\n  * Install the necessary Python packages (this process may take some time):\n\n    ```bash\n    pip install ninja cmake scikit-build-core[pyproject]\n    pip install --force-reinstall --no-cache-dir llama-cpp-python\n    pip install -r requirements.txt\n    ```\n\n* **For Linux (with CUDA support):**\n\n  ##### Install pre-built wheel for Linux\n\n  * It is possible to install a pre-built wheel with CUDA support.\n    * Source URL: [https://abetlen.github.io/llama-cpp-python/whl/cu121/llama-cpp-python/](https://abetlen.github.io/llama-cpp-python/whl/cu121/llama-cpp-python/)\n\n    ```bash\n    wget https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.4-cu121/llama_cpp_python-0.3.4-cp310-cp310-linux_x86_64.whl\n    pip install llama_cpp_python-0.3.4-cp310-cp310-linux_x86_64.whl\n    pip install -r requirements.txt\n    ```\n\n  ##### (Optional) Build with CUDA for Linux\n\n  * Configure the required environment variables for the build (if not already set):\n\n    ```bash\n    export PATH=/usr/local/cuda/bin:${PATH}\n    export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:${LD_LIBRARY_PATH}\n    ```\n\n  * Install the OpenMP libraries used for the build:\n\n    ```bash\n    sudo apt-get update\n    sudo apt-get install libgomp1 libomp-dev\n    ```\n\n  * Install the necessary Python packages:\n\n    ```bash\n    conda install -c conda-forge libstdcxx-ng\n    pip install ninja cmake scikit-build-core[pyproject]\n    export CMAKE_ARGS=\"-DGGML_CUDA=ON\"\n    pip install --force-reinstall --no-cache-dir llama-cpp-python\n    pip install -r requirements.txt\n    ```\n\n* **For macOS (without CUDA support):**\n  * Install the necessary Python packages:\n\n    ```bash\n    BUILD_CUDA_EXT=0 pip install -r requirements.txt\n    ```\n\n  * Rebuild the `bitsandbytes` package with the CPU option:\n\n    ```bash\n    pip uninstall bitsandbytes\n    git clone 
https://github.com/TimDettmers/bitsandbytes.git\n    cd bitsandbytes\n    cmake -DCOMPUTE_BACKEND=cpu -S .\n    make\n    pip install .\n    ```\n\n  * Install CMake and set the compiler:\n  \n    ```bash\n    brew install cmake\n    export CC=/usr/bin/gcc\n    export CXX=/usr/bin/g++\n    ```\n\n  * Install `llama-cpp-python` with Metal support:\n\n    ```bash\n    export CMAKE_ARGS=\"-DLLAMA_METAL=on\"\n    export FORCE_CMAKE=1\n    pip install -U llama-cpp-python --no-cache-dir\n    ```\n\n  * Known Issue: Running the LLaVA model on Mac results in an error.\n\n## Running the application\n\n```bash\npython ollm_app.py\n```\n\n* Open \u003chttp://127.0.0.1:7860/\u003e in your browser.\n\n## Downloading the Model\n\nTo download the model:\n\n* Launch this application.\n* Click on the \"Download model\" button next to the LLM model ID.\n* Wait for the download to complete.\n\n### 📜 Model List (transformers)\n\n| Provider      | Model Names                                                                                |\n|---------------|--------------------------------------------------------------------------------------------|\n| Microsoft     | Phi-3-mini-4k-instruct                                                                     |\n| Google        | gemma-2-9b-it, gemma-1.1-2b-it, gemma-1.1-7b-it                                            |\n| NVIDIA        | Llama3-ChatQA-1.5-8B                                                                       |\n| Qwen          | Qwen2.5-1.5B-Instruct, Qwen2.5-7B-Instruct                                                 |\n| Mistral AI    | Mistral-7B-Instruct-v0.3                                                                   |\n| Rakuten       | RakutenAI-7B-chat, RakutenAI-7B-instruct                                                   |\n| rinna         | youri-7b-chat                                                                              |\n| TheBloke      | Llama-2-7b-Chat-GPTQ, Kunoichi-7B-GPTQ             
                                        |\n\n* 📋 Note: By adding the repository paths of models to `model_manager/add_tfs_models.txt`, they will be included in the list of Model IDs and displayed in the UI.\n* 🔍 Note: The downloaded model file will be stored in the `.cache/huggingface/hub` directory of your home directory.\n\n#### Access and Download Gemma and Llama Models\n\n* Before downloading any models, ensure that you have obtained the necessary access rights through Hugging Face. Please visit the following pages to request access:\n  * [Llama 3 model by Meta](https://huggingface.co/meta-llama/Meta-Llama-3-8B)\n  * [Llama 2 model by Meta](https://huggingface.co/meta-llama/Llama-2-7b-hf)\n  * [Gemma 1.1 model by Google](https://huggingface.co/google/gemma-1.1-2b-it)\n\n#### Login to Hugging Face\n\n* Before downloading any models, please log in via the command line using:\n\n  ```bash\n  huggingface-cli login\n  ```\n\n### 🦙 Model List (llama.cpp)\n\n| Provider      | Model Names                                                                                |\n|---------------|--------------------------------------------------------------------------------------------|\n| Microsoft     | Phi-3-mini-4k-instruct-q4.gguf, Phi-3-mini-4k-instruct-fp16.gguf                           |\n| TheBloke      | llama-2-7b-chat.Q4_K_M.gguf                                                                |\n| QuantFactory  | Meta-Llama-3-8B-Instruct.Q4_K_M.gguf                                                       |\n\n#### Using any GGUF file\n\n* 🔍 File Placement: Place files with the `.gguf` extension in the `models` directory within the `open-llm-webui` folder. 
These files will then appear in the model list on the `llama.cpp` tab of the web UI and can be used accordingly.\n* 📝 Metadata Usage: If the metadata of a GGUF model includes `tokenizer.chat_template`, this template will be used to create the prompts.\n\n### 🖼️ Model List (Multimodal LLaVA)\n\n| Provider      | Model Names                                                                                |\n|---------------|--------------------------------------------------------------------------------------------|\n| Google        | google/gemma-3-4b-it, google/paligemma2-3b-pt-224, google/paligemma2-3b-pt-448             |\n| Microsoft     | Phi-3.5-vision-instruct, Phi-3-vision-128k-instruct                                        |\n| Meta          | Llama-3.2-11B-Vision (limited support as a trial)                                          |\n| llava-hf      | llava-v1.6-mistral-7b-hf, llava-v1.6-vicuna-7b-hf, llava-1.5-7b-hf                         |\n| tinyllava     | TinyLLaVA-Phi-2-SigLIP-3.1B                                                                |\n| openbmb       | MiniCPM-V-2_6-int4, MiniCPM-V-2_6, MiniCPM-Llama3-V-2_5-int4, MiniCPM-Llama3-V-2_5         |\n| SakanaAI      | EvoVLM-JP-v1-7B                                                                            |\n\n#### Access and Download Llama 3 Models\n\n* Before downloading any models, ensure that you have obtained the necessary access rights through Hugging Face. Please visit the following pages to request access:\n  * [Llama 3.2 Vision model by Meta](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)\n  * [Llama 3 model by Meta](https://huggingface.co/meta-llama/Meta-Llama-3-8B)\n\n## Usage\n\n* Enter your message into the \"Input text\" box. 
Adjust the slider for \"Max new tokens\" as needed.\n* Under \"Advanced options\", adjust the settings for \"Temperature\", \"Top k\", \"Top p\", and \"Repetition Penalty\" as needed.\n* To replace the system message of the prompt, enable the checkbox under \"Advanced options\" and enter the new text.\n* Press \"Enter\" on your keyboard or click the \"Generate\" button.\n  * ⚠️ Note: If the cloud-based model has been updated, it may be downloaded upon execution.\n* Click the \"Clear chat\" button to clear the chat history.\n\n### transformers tab\n\n* When the `CPU execution` checkbox is enabled, the model is loaded with the argument `device_map=\"cpu\"`.\n* Some of the transformers models are loaded with 4-bit or 8-bit quantization settings using the `bitsandbytes` package.\n\n### llama.cpp tab\n\n* Use the radio buttons in the `Default chat template` to select the template that will be used if the GGUF model lacks a `chat_template`.\n\n### LLaVA tab\n\n* You can upload an image to the LLaVA Image area of this tab and input a prompt related to the image.\n* Some of the LLaVA models are loaded with 4-bit or 8-bit quantization settings using the `bitsandbytes` package.\n\n### Continuous Processing of Multiple Prompts\n\n* Enter `input_prompts.json` in the `Input text` textbox.\n* Ensure the `input_prompts.json` file in the current folder contains an array of objects with the key `\"prompt\"`.\n\n* An example of the `input_prompts.json` file structure is as follows:\n\n  ```json\n  [\n    {\n      \"prompt\": \"What is your name?\"\n    },\n    {\n      \"prompt\": \"How are you?\"\n    }\n  ]\n  ```\n\n### options\n\n* When you enable the `Translate (ja-\u003een/en-\u003eja)` checkbox:\n  * Any input in Japanese will be automatically translated to English, and responses in English will be automatically translated back into Japanese.\n  * ⚠️ Note: Downloading the translation model for the first time may take some time.\n\n![UI 
image](images/open-ollm-webui_ui_image_1.png)\n\n## Model Credit\n\n| Developer           | Model                        | License                                                        |\n|---------------------|------------------------------|----------------------------------------------------------------|\n| Meta                | Llama-3.2                    | [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) |\n| Meta                | Llama-3.1                    | [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE) |\n| Meta                | Llama-3                      | [Llama 3 Community License](https://github.com/meta-llama/llama3/blob/main/LICENSE) |\n| Meta                | Llama-2                      | [Llama 2 Community License](https://github.com/facebookresearch/llama/blob/main/LICENSE) |\n| Microsoft           | Phi-3.5, Phi-3               | [The MIT License](https://opensource.org/licenses/MIT)         |\n| Google              | Gemma                        | [Gemma Terms of Use](https://ai.google.dev/gemma/terms)        |\n| NVIDIA              | Llama3-ChatQA                | [Llama 3 Community License](https://github.com/meta-llama/llama3/blob/main/LICENSE) |\n| Alibaba Group       | Qwen2.5-3B-Instruct          | [Qwen RESEARCH LICENSE](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE) |\n| Alibaba Group       | Qwen2.5-7B-Instruct          | [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |\n| Mistral AI          | Mistral-7B-Instruct          | [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |\n| Rakuten             | RakutenAI                    | [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |\n| rinna               | 
Youri                        | [Llama 2 Community License](https://ai.meta.com/llama/license/) |\n| Sanji Watsuki       | Kunoichi-7B                  | [CC-BY-NC-4.0](https://spdx.org/licenses/CC-BY-NC-4.0)         |\n| Hugging Face        | llava-v1.6-mistral-7b-hf     | [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |\n| Hugging Face        | llava-v1.6-vicuna-7b-hf, llava-1.5-7b-hf | [Llama 2 Community License](https://ai.meta.com/llama/license/) |\n| TinyLLaVA           | TinyLLaVA-Phi-2-SigLIP-3.1B  | [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |\n| OpenBMB             | MiniCPM                      | [MiniCPM Model License](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md) |\n| Sakana AI           | EvoVLM-JP-v1-7B              | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuminosachi%2Fopen-llm-webui","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuminosachi%2Fopen-llm-webui","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuminosachi%2Fopen-llm-webui/lists"}