https://github.com/fly-apps/ollama-open-webui
Self-host a ChatGPT-style web interface for Ollama 🦙
- Host: GitHub
- URL: https://github.com/fly-apps/ollama-open-webui
- Owner: fly-apps
- License: MIT
- Created: 2024-02-06T11:13:30.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-14T09:30:26.000Z (almost 2 years ago)
- Last Synced: 2025-02-06T07:08:52.057Z (over 1 year ago)
- Topics: ai, gemma, gpu, llama3, llava, mistral, mixtral, ollama, ollama-webui
- Language: Shell
- Homepage: https://fly.io/docs/gpus/
- Size: 28.3 KB
- Stars: 73
- Watchers: 8
- Forks: 26
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
___________
## Deploy
Everyone loves a one-liner — let's clone the repo and deploy the app with [flyctl](https://fly.io/docs/hands-on/install-flyctl/):
```bash
fly launch --from https://github.com/fly-apps/ollama-open-webui
```
That's it! When you visit `https://[app].fly.dev` you should see the Open WebUI interface where you can log in and create the initial admin user. You can then optionally disable signups and make the app private by setting `ENABLE_SIGNUP = "false"` in your fly.toml [`env` variables section](https://fly.io/docs/reference/configuration/#the-env-variables-section).
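
As a rough sketch, the signup setting described above might look like the fragment printed below. The helper name `signup_env_fragment` is hypothetical: it only prints TOML so you can compare it with your own fly.toml, and it assumes you merge the key into any `[env]` table that already exists rather than adding a second one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the fly.toml fragment that turns off signups.
# It only writes to stdout and never edits fly.toml itself.
signup_env_fragment() {
  cat <<'EOF'
[env]
  ENABLE_SIGNUP = "false"
EOF
}

signup_env_fragment

# After merging the key into fly.toml, redeploy so it takes effect:
#   fly deploy
```

Create the initial admin user before applying this, since new accounts cannot sign up once the setting is live.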
> [!IMPORTANT]
> By default, the app runs on Fly GPUs, specifically Nvidia L40S. You can change this in the fly.toml [`vm` settings](https://github.com/fly-apps/ollama-open-webui/blob/e168239c26fb2548ee26d1e44e1df3ab1278497d/fly.toml#L26). It will _probably_ also run on a standard (CPU-only) Fly Machine, because Ollama builds on llama.cpp, which supports CPU inference, but performance will be drastically reduced.
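
If you want to try a CPU-only Machine, the `vm` section of fly.toml could be changed along the lines of the fragment below. This is a sketch under assumptions: the helper name `cpu_vm_fragment` is hypothetical, and the `performance-8x` preset is an illustrative choice that does not come from this repo, so check Fly's current list of Machine sizes before using it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a CPU-only [[vm]] fragment for fly.toml.
# "performance-8x" is an assumed preset name, not a value from this repo.
cpu_vm_fragment() {
  cat <<'EOF'
[[vm]]
  size = "performance-8x"
EOF
}

cpu_vm_fragment

# Replace the existing [[vm]] table in fly.toml with the fragment, then:
#   fly deploy
```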
## Scaling to Zero
By default, the app scales to zero when idle. This is recommended (especially with GPUs) to save on costs. When the Fly proxy forwards a new request to the app, the Machine boots in ~3s and the Web UI server is ready to serve requests in ~15s. Loading a model into VRAM can take longer, depending on the model's size.
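
On Fly, scale-to-zero behavior is typically controlled by a few keys in the service section of fly.toml. The fragment below is a sketch, not a copy of this repo's configuration: the helper name `scale_to_zero_fragment` is hypothetical, the `[http_service]` table is assumed, and only the keys relevant to stopping and starting Machines are shown.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the fly.toml keys that govern scale-to-zero.
# Assumes the app uses an [http_service] table; other required keys
# (such as internal_port) are left out on purpose.
scale_to_zero_fragment() {
  cat <<'EOF'
[http_service]
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0
EOF
}

scale_to_zero_fragment
```

With `min_machines_running = 0`, the proxy may stop every Machine when there is no traffic, and `auto_start_machines` lets it boot one again on the next request.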
## Having trouble?
Create an issue or ask a question here: https://community.fly.io/