https://github.com/mhajder/openwebui-stack
Docker Compose stack for Open WebUI with LiteLLM proxy, observability, and security best practices.
- Host: GitHub
- URL: https://github.com/mhajder/openwebui-stack
- Owner: mhajder
- License: mit
- Created: 2026-01-01T19:07:34.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-01-01T20:03:57.000Z (5 months ago)
- Last Synced: 2026-01-07T04:20:28.152Z (5 months ago)
- Topics: ai, docker, docker-compose, litellm, open-webui, openwebui, otel, qdrant, traefik
- Language: Shell
- Homepage:
- Size: 67.4 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Open WebUI Stack
Docker Compose stack for [Open WebUI](https://github.com/open-webui/open-webui) with LiteLLM proxy, observability, and security best practices.
## Architecture
```mermaid
graph TB
    Internet[Internet]
    Traefik["Traefik<br/>Reverse Proxy"]
    OpenWebUI[Open WebUI]
    LiteLLM["LiteLLM<br/>Proxy"]
    LGTM["LGTM Stack<br/>Grafana/Loki/Tempo/Mimir"]
    Qdrant["Qdrant<br/>Vector DB"]
    Valkey["Valkey<br/>Shared Cache"]
    PostgreSQL["PostgreSQL<br/>Shared DB"]
    OTEL["OpenTelemetry<br/>Collector"]
    Exporters["Exporters<br/>Node/PostgreSQL/GPU"]

    Internet -->|HTTPS| Traefik
    Traefik --> OpenWebUI
    Traefik --> LiteLLM
    Traefik --> LGTM
    OpenWebUI --> Qdrant
    OpenWebUI --> Valkey
    OpenWebUI --> LiteLLM
    LiteLLM --> Valkey
    LiteLLM --> PostgreSQL
    Exporters --> PostgreSQL
    Exporters --> OTEL
    OTEL --> LGTM
```
## Features
- **Traefik Reverse Proxy**: Automatic HTTPS using self-signed certificates generated by Traefik, plus HTTP-to-HTTPS redirection
- **LiteLLM Proxy**: Unified gateway for multiple LLM providers (OpenAI, Anthropic, Gemini, Ollama)
- **LGTM Observability Stack**: Grafana, Loki, Tempo, Mimir for logs, traces, and metrics
- **Qdrant Vector Database**: Production-ready vector search for RAG
- **PostgreSQL**: Shared database for Open WebUI and LiteLLM
- **OpenTelemetry**: Full observability with distributed tracing
- **Security**: Network isolation, secure headers, rate limiting
## Quick Start
### Prerequisites
- Docker Engine 24.0+
- Docker Compose v2.20+
- 8GB+ RAM recommended
- (Optional) NVIDIA GPU with drivers for GPU metrics
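
The version minimums above can be verified before installing. The following is a minimal sketch rather than part of the repository: `version_ge` and `check_prereqs` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS releases).

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not shipped with the repository.

# version_ge <have> <want>
# Succeeds when version <have> is greater than or equal to <want>.
# Relies on `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compares the installed Docker Engine and Compose versions with the
# minimums listed in the prerequisites (24.0 and 2.20).
check_prereqs() {
  local engine compose
  engine="$(docker version --format '{{.Server.Version}}' 2>/dev/null)" || {
    echo "Docker Engine is not installed or not running" >&2
    return 1
  }
  compose="$(docker compose version --short 2>/dev/null)" || {
    echo "Docker Compose v2 plugin not found" >&2
    return 1
  }
  compose="${compose#v}" # strip a leading "v" if present
  version_ge "$engine" 24.0 || { echo "Docker Engine $engine is older than 24.0" >&2; return 1; }
  version_ge "$compose" 2.20 || { echo "Docker Compose $compose is older than 2.20" >&2; return 1; }
  echo "Docker Engine $engine and Compose $compose meet the minimums"
}
```

After sourcing the file, run `check_prereqs`; it returns a non-zero status and prints the reason when a requirement is not met.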
### Installation
1. **Clone the repository**
   ```bash
   git clone https://github.com/mhajder/openwebui-stack.git
   cd openwebui-stack
   ```
2. **Run the setup script**
   ```bash
   chmod +x scripts/*.sh
   ./scripts/setup.sh
   ```
   To register the first admin user through Open WebUI, set `ENABLE_SIGNUP` to `true` in `docker-compose.yml`. You can disable it again once that user has been created.
3. **Start the stack**
   ```bash
   docker compose --profile monitoring --profile gpu up -d
   ```
   The `gpu` profile starts the NVIDIA GPU Exporter, which requires an NVIDIA GPU; omit `--profile gpu` on hosts without one.
4. **Access the services**
   - Open WebUI: https://localhost
   - Grafana: https://grafana.localhost
   - LiteLLM: https://litellm.localhost
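
Containers can take a while to become reachable after `docker compose up -d` returns. As a rough sketch (the helper names `retry` and `wait_for_url` are hypothetical, and the attempt count and delay are arbitrary), the service URLs listed above can be polled with `curl`; `-k` is needed because the certificates are self-signed.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not part of the repository.

# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# wait_for_url <url>
# Succeeds once the URL answers with an HTTP status below 400.
# -k accepts the self-signed certificate, -s silences progress output,
# -f turns HTTP errors into a non-zero exit status.
wait_for_url() {
  retry 30 2 curl -ksf -o /dev/null "$1"
}
```

For example, `wait_for_url https://localhost || echo "Open WebUI is not reachable yet"` waits up to roughly a minute before giving up.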
### Local Domain Resolution
Add these entries to your `/etc/hosts`:
```
127.0.0.1 localhost grafana.localhost litellm.localhost traefik.localhost
```
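
If you prefer to script this step, the sketch below appends a host name only when it is not already mapped, so it can be run repeatedly. `add_hosts_entry` is a hypothetical helper, not something the repository provides, and writing to `/etc/hosts` requires root privileges.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the repository.

# add_hosts_entry <hosts-file> <ip> <hostname>
# Appends "<ip> <hostname>" unless an uncommented line already maps
# the host name.
add_hosts_entry() {
  local file="$1" ip="$2" host="$3" host_re
  # Escape dots so they match literally in the regular expression.
  host_re="$(printf '%s' "$host" | sed 's/\./\\./g')"
  if grep -Eq "^[^#]*[[:space:]]${host_re}([[:space:]]|\$)" "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$host" >> "$file"
}
```

Run as root, a loop such as `for h in grafana.localhost litellm.localhost traefik.localhost; do add_hosts_entry /etc/hosts 127.0.0.1 "$h"; done` adds whichever names are missing.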
## Configuration
### Environment Variables
Key environment variables in `.env`:
| Variable | Description | Example |
|----------|-------------|---------|
| `COMPOSE_PROJECT_NAME` | Docker Compose project name | `openwebui-stack` |
| `DOMAIN` | Base domain for services | `localhost` or `example.com` |
| `POSTGRES_USER` | PostgreSQL admin user | `postgres` |
| `POSTGRES_PASSWORD` | PostgreSQL admin password | Generated by setup script |
| `OPENWEBUI_DB_NAME` | Open WebUI database name | `openwebui` |
| `OPENWEBUI_DB_USER` | Open WebUI database user | `openwebui` |
| `OPENWEBUI_DB_PASSWORD` | Open WebUI database password | Generated by setup script |
| `LITELLM_DB_NAME` | LiteLLM database name | `litellm` |
| `LITELLM_DB_USER` | LiteLLM database user | `litellm` |
| `LITELLM_DB_PASSWORD` | LiteLLM database password | Generated by setup script |
| `OPENWEBUI_SECRET_KEY` | Secret key for Open WebUI sessions | Generated by setup script |
| `LITELLM_MASTER_KEY` | Master API key for LiteLLM | Generated by setup script |
| `LITELLM_SALT_KEY` | Salt key for LiteLLM encryption | Generated by setup script |
| `TRAEFIK_DASHBOARD_USER` | Traefik dashboard username | `admin` |
| `TRAEFIK_DASHBOARD_PASSWORD` | Traefik dashboard password (plain) | Generated by setup script |
| `TRAEFIK_DASHBOARD_PASSWORD_HASH` | Traefik dashboard password (apr1 hash) | Auto-generated by setup script |
| `GF_SECURITY_ADMIN_PASSWORD` | Grafana admin password | Generated by setup script |
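
Before starting the stack it can be useful to confirm that the generated values are actually present in `.env`. The following is a small sketch with a hypothetical helper, `check_env`; it only tests for a non-empty `VAR=value` line and does not interpret quoting or variable interpolation.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the repository.

# check_env <env-file> <VAR>...
# Reports every listed variable that is missing or has an empty value.
# Returns 1 if any variable fails the check, 0 otherwise.
check_env() {
  local file="$1" missing=0 var
  shift
  for var in "$@"; do
    if ! grep -Eq "^${var}=.+" "$file"; then
      echo "missing or empty: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `check_env .env POSTGRES_PASSWORD OPENWEBUI_SECRET_KEY LITELLM_MASTER_KEY LITELLM_SALT_KEY GF_SECURITY_ADMIN_PASSWORD`.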
### Services Overview
#### Core Services
- **Traefik**: Reverse proxy with automatic HTTPS, routing, and load balancing
- **Open WebUI**: Web UI for interacting with LLMs
- **LiteLLM**: Unified gateway for multiple LLM providers
- **PostgreSQL**: Shared relational database for Open WebUI and LiteLLM
- **Valkey**: High-performance Redis alternative for caching and sessions
#### Storage & Search
- **Qdrant**: Vector database for semantic search and RAG operations
#### Observability (LGTM Stack)
- **LGTM** (Grafana OTel): Integrated stack providing:
  - **Prometheus/Mimir**: Metrics collection and storage
  - **Loki**: Centralized log aggregation
  - **Tempo**: Distributed tracing
  - **Grafana**: Visualization and dashboards
#### Exporters & Collectors
- **Node Exporter**: System-level metrics (CPU, memory, disk, network)
- **OpenTelemetry Collector**: Collects and forwards metrics, logs, and traces
- **PostgreSQL Exporter** (optional profile): Database metrics
- **NVIDIA GPU Exporter** (optional profile): GPU metrics (requires GPU)
### Production Deployment
For production environments:
1. **Use proper certificates**
   - Replace self-signed certs with Let's Encrypt or CA-signed certificates
   - Update [traefik/traefik.yml](traefik/traefik.yml) for ACME configuration (Let's Encrypt)
   - Example ACME configuration:
     ```yaml
     certificatesResolvers:
       letsencrypt:
         acme:
           email: admin@example.com
           storage: /data/acme.json
           httpChallenge:
             entryPoint: web
     ```
2. **Update security settings**
   - Change default passwords in `.env` (setup script generates strong ones automatically)
   - Enable authentication on all service dashboards
   - Configure rate limiting and WAF rules in Traefik
3. **Enable GPU support** (if available)
   ```bash
   docker compose --profile gpu up -d
   ```
   This activates the NVIDIA GPU Exporter for monitoring GPU metrics.
4. **Configure persistent backups**
   - Use external volume drivers for data persistence
   - Set up automated backup scripts using cron
   - Store backups in secure, off-site locations
5. **Network security**
   - Use VPN or IP whitelisting for external access
   - Configure firewall rules (UFW, iptables)
   - Use private networks when possible
6. **Database optimization**
   - Configure PostgreSQL replication for HA
   - Set up automated backup and recovery procedures
   - Monitor database performance and storage
7. **Monitoring and alerting**
   - Configure alert rules in Grafana for critical metrics
   - Set up notification channels (email, Slack, PagerDuty)
   - Implement log retention policies in Loki
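
The cron-driven backups mentioned in step 4 could look roughly like the sketch below. It is not part of the repository: the helper names are hypothetical, the Compose service name `postgres` is an assumption that must match `docker-compose.yml`, and it assumes `pg_dump` can connect inside the container without a password prompt.

```bash
#!/usr/bin/env bash
# Hypothetical backup helpers; not part of the repository.
# ASSUMPTION: the PostgreSQL service is named "postgres" in docker-compose.yml.

# dump_db <db-user> <db-name> <output-file.sql.gz>
# Dumps one database from the running container and compresses it.
# The plain dump is written first so a pg_dump failure is detected.
dump_db() {
  local plain="${3%.gz}"
  docker compose exec -T postgres pg_dump -U "$1" "$2" > "$plain" || {
    rm -f -- "$plain"
    return 1
  }
  gzip -f "$plain"
}

# prune_backups <dir> <keep>
# Deletes all but the <keep> most recently modified *.sql.gz files.
# Assumes file names without whitespace.
prune_backups() {
  local dir="$1" keep="$2" old
  ls -1t "$dir"/*.sql.gz 2>/dev/null | tail -n "+$((keep + 1))" |
    while read -r old; do
      rm -f -- "$old"
    done
}
```

A wrapper script calling `dump_db` for the `openwebui` and `litellm` databases followed by `prune_backups` can then be scheduled from cron; copying the resulting files off-site is left out of this sketch.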
## Monitoring
### Grafana Dashboards
Access Grafana at **https://grafana.localhost** with the following credentials:
- **Username**: `admin`
- **Password**: Set in `.env` as `GF_SECURITY_ADMIN_PASSWORD` (generated by setup script)
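
To confirm that Grafana is up before logging in, its `/api/health` endpoint can be queried; it needs no authentication and reports the state of Grafana's own database. The helper name `grafana_db_ok` below is hypothetical, and the check is a plain text match rather than a full JSON parse.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the repository.

# grafana_db_ok
# Reads the JSON body of Grafana's /api/health response on stdin and
# succeeds when it contains "database": "ok".
grafana_db_ok() {
  grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}
```

For example: `curl -ks https://grafana.localhost/api/health | grafana_db_ok && echo "Grafana is healthy"`. The `-k` flag is only needed while the self-signed certificate is in use.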
#### Pre-configured Dashboards
The stack includes several pre-configured dashboards:
1. **litellm-dashboard.json** - LiteLLM proxy metrics and performance
   - Request rates and latencies
   - Token usage and costs
   - Error rates by model
   - Model provider health
2. **node-exporter-dashboard.json** - System metrics
   - CPU, memory, and disk usage
   - Network I/O
   - Process metrics
   - System uptime
3. **openwebui-dashboard.json** - Open WebUI application metrics
   - Request rates and response times
   - User activity
   - Message counts
   - API endpoint performance
4. **postgresql-dashboard.json** - Database metrics
   - Query performance
   - Connection pool status
   - Cache hit rates
   - Replication lag (if configured)
5. **traefik-dashboard.json** - Reverse proxy metrics
   - Request rates by service
   - Response time percentiles
   - HTTP status codes
   - SSL/TLS certificate expiry
6. **opentelemetry-dashboard.json** - System-wide observability
   - Distributed traces
   - Span analysis
   - Service dependencies
   - Error tracking
7. **nvidia-dcgm-dashboard.json** - GPU metrics (if GPU profile enabled)
   - GPU memory usage
   - Compute utilization
   - Temperature monitoring
   - Power consumption
### Available Metrics
**Prometheus/Mimir** stores these metric types:
| Category | Metrics |
|----------|---------|
| **System** | CPU load, memory usage, disk I/O, network traffic |
| **Database** | Connection count, query latency, transaction rate |
| **HTTP** | Request rate, response time, status codes, error rate |
| **Cache** | Hit/miss rates, eviction rates, memory usage |
| **Application** | Custom metrics from services |
| **GPU** | Memory usage, compute, temperature, power (if enabled) |
### Traces in Tempo
OpenTelemetry traces are automatically collected and stored in Tempo:
- Access via Grafana → Explore → Tempo
- View distributed traces across all services
- Analyze service dependencies
- Identify performance bottlenecks
- Track errors through the system
## Additional Resources
- [Open WebUI Documentation](https://docs.openwebui.com/)
- [LiteLLM Proxy Docs](https://docs.litellm.ai/docs/proxy/configs)
- [Traefik Documentation](https://traefik.io/traefik/)
- [Grafana LGTM Stack](https://grafana.com/products/lgtm-stack/)
- [Qdrant Documentation](https://qdrant.tech/documentation/)
- [OpenTelemetry](https://opentelemetry.io/)
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.