https://github.com/psalias2006/gpu-hot
🔥 Real-time NVIDIA GPU dashboard
https://github.com/psalias2006/gpu-hot
charts cuda docker flask gpu gpu-monitoring nvidia nvidia-smi python real-time-monitoring socker-io system-monitoring
Last synced: 1 day ago
JSON representation
🔥 Real-time NVIDIA GPU dashboard
- Host: GitHub
- URL: https://github.com/psalias2006/gpu-hot
- Owner: psalias2006
- License: mit
- Created: 2025-10-05T11:48:17.000Z (2 days ago)
- Default Branch: main
- Last Pushed: 2025-10-05T14:24:06.000Z (1 day ago)
- Last Synced: 2025-10-05T14:34:26.664Z (1 day ago)
- Topics: charts, cuda, docker, flask, gpu, gpu-monitoring, nvidia, nvidia-smi, python, real-time-monitoring, socker-io, system-monitoring
- Language: HTML
- Homepage:
- Size: 2.62 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# GPU Hot
### Real-Time NVIDIA GPU Monitoring Dashboard
Web-based monitoring for NVIDIA GPUs. Track 30+ metrics per GPU with live charts and real-time updates.

[](https://www.python.org/)
[](https://www.docker.com/)
[](LICENSE)
[](https://www.nvidia.com/)[Features](#features) • [Quick Start](#quick-start) • [Installation](#installation) • [Documentation](#documentation) • [Contributing](#contributing)
---
## Features
- **30+ GPU Metrics** - Utilization, temperature, memory, power, clocks, encoder/decoder stats, and more per GPU
- **Multi-GPU Support** - Automatic detection and independent monitoring of all NVIDIA GPUs
- **Live Historical Charts** - Real-time graphs with statistics (min/max/avg), threshold indicators, and contextual tooltips
- **Process Monitoring** - Track active GPU processes with memory usage and PIDs
- **Clean UI** - Responsive interface with glassmorphism design and smooth animations
- **WebSocket Updates** - Sub-second refresh rates (2s) for real-time monitoring
- **Docker Deployment** - One-command setup with NVIDIA Container Toolkit support
- **Zero Configuration** - Works out of the box with any NVIDIA GPU## Monitored Metrics
### Core GPU Metrics
- GPU & Memory Utilization (%)
- Core & Memory Temperature (°C)
- Memory Usage (Used/Free/Total MB)
- Power Draw & Limits (W)
- Fan Speed (%)
- Clock Speeds (Graphics, SM, Memory, Video MHz)### Advanced Metrics
- PCIe Generation & Lane Width (Current/Max)
- Performance State (P-State)
- Compute Mode
- Encoder/Decoder Sessions & Stats
- Driver & VBIOS Version
- Throttle Status Detection### System Context
- Host CPU & RAM Usage
- Active GPU Processes with Memory Tracking## Quick Start
### Docker Deployment (Recommended)
```bash
git clone https://github.com/psalias2006/gpu-hot
cd gpu-hot
docker-compose up --build
```Access the dashboard at `http://localhost:1312`
### Local Development
```bash
pip install -r requirements.txt
python app.py
```Access the dashboard at `http://localhost:1312`
## Installation
### Prerequisites
- NVIDIA GPU with drivers installed (verify with `nvidia-smi`)
- Docker & Docker Compose (for containerized deployment)
- NVIDIA Container Toolkit (for Docker GPU access)
- Python 3.8+ (for local development)### NVIDIA Container Toolkit Setup
Required for Docker deployment to access GPUs.
**Installation:**
Follow the official installation guide for your distribution:
📖 [NVIDIA Container Toolkit Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)The guide includes instructions for Ubuntu, Debian, RHEL, CentOS, Fedora, and other distributions.
**Verify Installation:**
```bash
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```## Documentation
### Project Structure
```
gpu-hot/
├── app.py # Flask application with WebSocket server
├── templates/
│ └── index.html # Web dashboard with live charts
├── requirements.txt # Python dependencies
├── Dockerfile # Container configuration
├── docker-compose.yml # Docker Compose setup
└── README.md # Documentation
```### API Endpoints
**HTTP:**
- `GET /` - Dashboard interface
- `GET /api/gpu-data` - Current GPU metrics (JSON)**WebSocket:**
- `gpu_data` - Real-time metrics broadcast (2s interval)
- `connect` / `disconnect` - Connection events### Configuration
**Environment Variables:**
- `NVIDIA_VISIBLE_DEVICES` - GPU visibility (default: all)
- `NVIDIA_DRIVER_CAPABILITIES` - GPU capabilities (default: all)**Customization in `app.py`:**
- Update interval: Modify `eventlet.sleep(2)` for refresh rate
- Port: Change in `socketio.run()` (default: 1312)
- Chart history: Adjust data retention (default: 30 points)### Docker Configuration
- Base: `nvidia/cuda:12.1-devel-ubuntu22.04`
- Self-contained nvidia-smi (no host mounting required)
- Health checks every 30s
- Automatic restart on failure
- Exposes port 1312### Development
**Adding New Metrics:**
1. Modify nvidia-smi query in `parse_nvidia_smi()`
2. Update frontend to display new metrics
3. Add chart configuration if needed**Local Development:**
```bash
pip install -r requirements.txt
python app.py
```Access at `http://localhost:1312` (Dashboard) or `http://localhost:1312/api/gpu-data` (API)
## Troubleshooting
### nvidia-smi not found
- Verify NVIDIA drivers: `nvidia-smi`
- Install NVIDIA Container Toolkit (see Installation section)
- Restart Docker daemon: `sudo systemctl restart docker`
- Test GPU access: `docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi`### No GPU data
- Check host GPU access: `nvidia-smi`
- Verify Container Toolkit: `nvidia-ctk --version`
- Review logs: `docker-compose logs`
- Configure Docker runtime: `sudo nvidia-ctk runtime configure --runtime=docker`### WebSocket issues
- Check port 1312 accessibility
- Review browser console for errors
- Verify firewall settings### Enable debug logging
```python
# In app.py
socketio.run(app, host='0.0.0.0', port=1312, debug=True)
```## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/NewFeature`)
3. Commit your changes (`git commit -m 'Add NewFeature'`)
4. Push to the branch (`git push origin feature/NewFeature`)
5. Open a Pull Request## License
See the [LICENSE](LICENSE) file for full details.## Acknowledgments
- NVIDIA for nvidia-smi
- Flask & Socket.IO teams
- Chart.js for visualization
- Open-source community---
**GPU Hot** - Real-Time GPU Monitoring
[Report Bug](https://github.com/psalias2006/gpu-hot/issues) • [Request Feature](https://github.com/psalias2006/gpu-hot/issues)