{"id":39709920,"url":"https://github.com/digihunch/chat-service","last_synced_at":"2026-01-18T10:37:26.249Z","repository":{"id":246794021,"uuid":"819430606","full_name":"digihunch/chat-service","owner":"digihunch","description":"Sample LLM-based chatbot service with local and remote models","archived":false,"fork":false,"pushed_at":"2024-07-25T23:31:07.000Z","size":36,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-07-26T16:03:12.641Z","etag":null,"topics":["chatbot","chatgpt","chatgpt-api","docker","docker-compose","gpt-35-turbo","litellm","llama3","llama31","llm","nginx","ollama","openai","openai-api","openwebui","postgresql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/digihunch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-24T13:42:20.000Z","updated_at":"2024-07-25T23:32:06.000Z","dependencies_parsed_at":"2024-07-11T04:03:19.578Z","dependency_job_id":null,"html_url":"https://github.com/digihunch/chat-service","commit_stats":null,"previous_names":["digihunch/chat-sample","digihunch/chat-service"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/digihunch/chat-service","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digihunch%2Fchat-service","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digihunch%2Fchat-service/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digihunch%2F
chat-service/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digihunch%2Fchat-service/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/digihunch","download_url":"https://codeload.github.com/digihunch/chat-service/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digihunch%2Fchat-service/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28534533,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T10:13:46.436Z","status":"ssl_error","status_checked_at":"2026-01-18T10:13:11.045Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","chatgpt","chatgpt-api","docker","docker-compose","gpt-35-turbo","litellm","llama3","llama31","llm","nginx","ollama","openai","openai-api","openwebui","postgresql"],"created_at":"2026-01-18T10:37:26.142Z","updated_at":"2026-01-18T10:37:26.235Z","avatar_url":"https://github.com/digihunch.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# chat-service\n\nThis project demos how to host a chat service using Open WebUI and Large Language Models (LLMs). The artifact is a Docker compose file that orchestrates the required components in Docker daemon running on any supported operating system. 
\n\nThe Docker compose file supports two profiles: `local-model` and `remote-model`, as distinguished in the table below:\n\n\n|  | local-model  | remote-model |\n|---------|-------------|--------------|\n| Language Model | A llama3.1 model locally managed by Ollama | A GPT-3.5-Turbo model remotely hosted on the OpenAI platform |\n| API Spec | Ollama API | OpenAI API |\n| Host Requirement | A server with an NVIDIA GPU | Any commodity-grade device  |\n\nIf the hosting environment supports it, we can run with both profiles, providing models from multiple sources for the end users to choose from. In this scenario, the architecture is illustrated below:\n\n```mermaid\ngraph LR;\n    A(End User via\\nWeb Browser) --\u003e |HTTPS\\n TCP 443|B[Nginx \\n Web Proxy];\n    subgraph Virtual Machine\n    subgraph Docker daemon\n    B --\u003e|HTTP| C[Open Web UI]\n    C --\u003e|Ollama API| D[llama3.1 model\\n running on Ollama];\n    C --\u003e|Open AI API| E[LiteLLM\\n Model Proxy];\n    E -.-|SQL| F[(PostgreSQL\\n Database)];\n    end\n    D -.- G[NVIDIA driver\\non host OS]\n    G -.- I(GPU)\n    end\n    E --\u003e|Open AI API| H[gpt-3.5-turbo model\\n platform.openai.com];\n```\nDisclaimer: this demo project isn't production-ready. For professional deployment services, contact the author [Digi Hunch](https://www.digihunch.com/). Each section below is tagged with ![Static Badge](https://img.shields.io/badge/local--model-blue) or ![Static Badge](https://img.shields.io/badge/remote--model-darkgreen), or both, to indicate which profile it applies to.\n\n\n## Configure a Virtual Machine with GPU ![Static Badge](https://img.shields.io/badge/local--model-blue)\n\nRunning a local model requires a server with a GPU, be it bare metal or a virtual machine. In this project, we use an EC2 instance, a cloud virtual machine (VM) from Amazon Web Services. 
Without a GPU, the model would be unbearably slow.\n\nThe VM should be reachable from the Internet (to host the website) and should be able to connect to the Internet (for downloading Docker images, models, etc.). The VM should also have sufficient disk space to download a model, so let's make the root device 100G. A good option for this demo is the `g4dn.xlarge` instance type, which comes with a single GPU (NVIDIA T4). For the OS, we use an Ubuntu 24.04 AMI. I use the following AWS CLI command to list such AMIs in a given region, sorted by name, and pick the latest:\n\n```sh\naws ec2 describe-images --owners amazon --filters \"Name=name,Values=ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*\" --query 'Images[].{ImageId:ImageId, Name:Name}|sort_by(@,\u0026Name)' --output table --region us-east-1\n```\n\nI picked AMI ID `ami-04b70fa74e45c3917` in region `us-east-1` for my testing. On top of the VM, we host the application in Docker containers because this saves a lot of effort troubleshooting dependencies and provides better portability. With these AMIs the default Linux user is `ubuntu`.\n\n## Install Docker Compose ![Static Badge](https://img.shields.io/badge/local--model-blue) ![Static Badge](https://img.shields.io/badge/remote--model-darkgreen)\n\nTo install Docker community edition, follow the [official guide](https://docs.docker.com/engine/install/) and locate the section relevant to your OS. To verify that the Docker compose plugin has been installed, run the following command and it should return the version:\n```sh\ndocker compose version\n\u003e Docker Compose version v2.27.1\n```\nWe need a Docker Compose version greater than 2.20 to leverage some of its latest features.\nNote: if you're testing on a Linux VM, we need `root` privileges to start the Docker daemon and run CLI commands that interact with it. For this reason, the docker commands in the rest of these instructions all have `sudo` at the beginning. 
If you're running these commands from a MacBook or another OS, `sudo` may not be required.\n\n\n## Install NVIDIA driver ![Static Badge](https://img.shields.io/badge/local--model-blue)\n\nTo install the NVIDIA driver on the VM, run the following commands:\n```sh\nsudo apt update\nsudo apt install nvidia-driver-550 nvidia-dkms-550\n```\n\nIn the command, 550 is the driver branch, but the specific command may vary depending on when it is run. Check the `Manual driver installation` section of [this guide](https://ubuntu.com/server/docs/nvidia-drivers-installation) to figure out the correct command for the current date.\n\nAfter installation, run the `sudo nvidia-smi` command, which should return the driver version and CUDA version:\n```sh\n+-----------------------------------------------------------------------------------------+\n| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |\n|-----------------------------------------+------------------------+----------------------+\n| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |\n| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |\n|                                         |                        |               MIG M. 
|\n|=========================================+========================+======================|\n|   0  Tesla T4                       Off |   00000000:00:1E.0 Off |                    0 |\n| N/A   42C    P0             25W /   70W |       1MiB /  15360MiB |      8%      Default |\n|                                         |                        |                  N/A |\n+-----------------------------------------+------------------------+----------------------+\n\n+-----------------------------------------------------------------------------------------+\n| Processes:                                                                              |\n|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |\n|        ID   ID                                                               Usage      |\n|=========================================================================================|\n|  No running processes found                                                             |\n+-----------------------------------------------------------------------------------------+\n```\n\nIn addition to the driver, let's also install `nvidia-container-toolkit` following the [instruction](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt) to install with apt. 
Then configure the runtime and restart the Docker daemon:\n\n```sh\nsudo nvidia-ctk runtime configure --runtime=docker\n\nsudo systemctl restart docker\n```\nNote: if you skip this step, you will come across this error when trying to bring up the model from Docker:\n`could not select device driver \"nvidia\" with capabilities: [[gpu]]`\n\n## Configure Compose Profile ![Static Badge](https://img.shields.io/badge/local--model-blue) ![Static Badge](https://img.shields.io/badge/remote--model-darkgreen)\n\nLet's download the files in this repository and enter the directory to edit the environment variable file `.env`:\n```sh\ncd ~\ngit clone https://github.com/digihunch/chat-service.git\ncd chat-service\nvim .env\n```\nSet `COMPOSE_PROFILES` to the profile or profiles to activate, based on how you would like to source the models. The value can be `local-model`, `remote-model` or `local-model,remote-model` to use both profiles. The value of `ENABLE_OLLAMA_LOCAL_MODEL` should be `True` if `local-model` is used in `COMPOSE_PROFILES`.\n\n## Configure Model ![Static Badge](https://img.shields.io/badge/remote--model-darkgreen)\n\nTo configure access to the `gpt-3.5-turbo` model hosted remotely on platform.openai.com, you must have an [OpenAI](https://platform.openai.com/) account and an [API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key). Get an API key from OpenAI and use it as the value of the variable `REMOTE_OPENAI_API_KEY` in the file `.env` to tell the LiteLLM proxy how to connect to the remote model. We also set `LITELLM_MASTER_KEY` to a value of our choosing. With that, we can provision another key for Open Web UI to connect to LiteLLM. 
To do so, first, we launch the `litellm-proxy-svc` service: \n\n```sh\nsudo docker compose up litellm-proxy-svc\n```\nThen we can send the following request using `curl` from another terminal session (replace `sk-liteLLM1234` with your `LITELLM_MASTER_KEY` value if it differs):\n```sh\ncurl 'http://0.0.0.0:4000/key/generate' \\\n--header 'Authorization: Bearer sk-liteLLM1234' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\"models\": [\"gpt-3.5-turbo\"], \"metadata\": {\"user\": \"user@digihunch.com\"}}'\n```\nIn the response, one of the attributes is `key`, with a value starting with `sk-`. Use this value as the value for `LITELLM_OPENAI_API_KEY` in the environment variable file `.env`.\n\n\n## Configure Model ![Static Badge](https://img.shields.io/badge/local--model-blue)\n\nTo configure the llama3.1 model, let's start the `ollama-svc` service:\n```sh\nsudo docker compose up ollama-svc\n```\nOnce the service is started, we can pull the `llama3.1` model by running the `ollama` client from the container. Start a new terminal session and run:\n```sh\nsudo docker exec -it ollama ollama pull llama3.1\nsudo docker exec -it ollama ollama list\n```\nThe second command verifies that the model is available locally. \n\n## Start the Web Server ![Static Badge](https://img.shields.io/badge/local--model-blue) ![Static Badge](https://img.shields.io/badge/remote--model-darkgreen)\nBefore launching the website, we need to create a demo certificate. Suppose the site name is `chatsample.digihunch.com`, and the work directory is `/home/ubuntu/chat-service`. 
Run the following `openssl` command:\n```sh\nopenssl req -x509 -sha256 -newkey rsa:4096 -days 365 -nodes -subj /C=CA/ST=Ontario/L=Waterloo/O=Digihunch/OU=Development/CN=chatsample.digihunch.com/emailAddress=chatsample@digihunch.com -keyout /home/ubuntu/chat-service/nginx/certs/hostname-domain.key -out /home/ubuntu/chat-service/nginx/certs/hostname-domain.crt\n```\nThis command creates the `hostname-domain.key` and `hostname-domain.crt` files in the `/home/ubuntu/chat-service/nginx/certs/` directory, which are referenced by relative path in the `nginx.conf` file included in the repo.\n\nThen we can (re)start docker compose to reflect the changes to the litellm and nginx config:\n\n```sh\nsudo docker compose up\n```\nIf all services start correctly, the web service listens on port 443. If your service is running on your MacBook, you can hit port 443 directly from a browser. If your service is running on a VM, you can hit the server's public IP at port 443. Alternatively, if you have SSH access to the VM, you can map the server's port 443 to your laptop's port 8443 with port forwarding:\n```bash\nssh ubuntu@i-0ebba94b69620677e -L 8443:localhost:443\n```\nThen hit port 8443 from your browser to visit the site. \n\nSign up and enjoy the chat.\n\n## Troubleshooting\n\nFor more troubleshooting details, refer to my blog post.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigihunch%2Fchat-service","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdigihunch%2Fchat-service","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigihunch%2Fchat-service/lists"}