{"id":31581257,"url":"https://github.com/psalias2006/gpu-hot","last_synced_at":"2026-02-26T04:02:04.098Z","repository":{"id":318154090,"uuid":"1070160746","full_name":"psalias2006/gpu-hot","owner":"psalias2006","description":"🔥 Real-time NVIDIA GPU dashboard","archived":false,"fork":false,"pushed_at":"2025-12-25T13:37:02.000Z","size":72513,"stargazers_count":1254,"open_issues_count":5,"forks_count":58,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-12-27T01:19:10.940Z","etag":null,"topics":["charts","cuda","dashboard","devops","docker","flask","gpu","gpu-monitoring","llm","mlops","nvidia","nvidia-docker","nvidia-smi","python","real-time","real-time-monitoring","system-monitoring"],"latest_commit_sha":null,"homepage":"https://psalias2006.github.io/gpu-hot/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/psalias2006.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-05T11:48:17.000Z","updated_at":"2025-12-26T07:52:00.000Z","dependencies_parsed_at":"2025-10-30T11:35:12.626Z","dependency_job_id":null,"html_url":"https://github.com/psalias2006/gpu-hot","commit_stats":null,"previous_names":["psalias2006/gpu-hot"],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/psalias2006/gpu-hot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psalias2006%2Fgpu-hot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psalias2006
%2Fgpu-hot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psalias2006%2Fgpu-hot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psalias2006%2Fgpu-hot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/psalias2006","download_url":"https://codeload.github.com/psalias2006/gpu-hot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psalias2006%2Fgpu-hot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29530640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-17T00:57:22.232Z","status":"online","status_checked_at":"2026-02-17T02:00:08.105Z","response_time":100,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["charts","cuda","dashboard","devops","docker","flask","gpu","gpu-monitoring","llm","mlops","nvidia","nvidia-docker","nvidia-smi","python","real-time","real-time-monitoring","system-monitoring"],"created_at":"2025-10-05T21:55:30.639Z","updated_at":"2026-02-26T04:02:04.091Z","avatar_url":"https://github.com/psalias2006.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# GPU Hot\n\nReal-time NVIDIA GPU monitoring dashboard. 
Lightweight, web-based, and self-hosted.\n\n[![Python](https://img.shields.io/badge/Python-3.8+-3776AB?style=flat-square\u0026logo=python\u0026logoColor=white)](https://www.python.org/)\n[![Docker](https://img.shields.io/badge/Docker-Ready-2496ED?style=flat-square\u0026logo=docker\u0026logoColor=white)](https://www.docker.com/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n[![NVIDIA](https://img.shields.io/badge/NVIDIA-GPU-76B900?style=flat-square\u0026logo=nvidia\u0026logoColor=white)](https://www.nvidia.com/)\n\n\u003cimg src=\"gpu-hot.png\" alt=\"GPU Hot Dashboard\" width=\"800\" /\u003e\n\n\u003cp\u003e\n\u003ca href=\"https://psalias2006.github.io/gpu-hot/demo.html\"\u003e\n\u003cimg src=\"https://img.shields.io/badge/%E2%96%B6%20%20Live_Demo-try_it_in_your_browser-1a1a1a?style=for-the-badge\u0026labelColor=76B900\" alt=\"Live Demo\" /\u003e\n\u003c/a\u003e\n\u003c/p\u003e\n\n\u003c/div\u003e\n\n---\n\n## Usage\n\nMonitor a single machine or an entire cluster with the same Docker image.\n\n**Single machine:**\n```bash\ndocker run -d --gpus all -p 1312:1312 ghcr.io/psalias2006/gpu-hot:latest\n```\n\n**Multiple machines:**\n```bash\n# On each GPU server\ndocker run -d --gpus all -p 1312:1312 -e NODE_NAME=$(hostname) ghcr.io/psalias2006/gpu-hot:latest\n\n# On a hub machine (no GPU required)\ndocker run -d -p 1312:1312 -e GPU_HOT_MODE=hub -e NODE_URLS=http://server1:1312,http://server2:1312,http://server3:1312 ghcr.io/psalias2006/gpu-hot:latest\n```\n\nOpen `http://localhost:1312`\n\n**Older GPUs:** Add `-e NVIDIA_SMI=true` if metrics don't appear.\n\n**Process monitoring:** Add `--init --pid=host` to see process names. 
Note: This allows the container to access host process information.\n\n**From source:**\n```bash\ngit clone https://github.com/psalias2006/gpu-hot\ncd gpu-hot\ndocker-compose up --build\n```\n\n**Requirements:** Docker + [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)\n\n---\n\n## Features\n\n- Real-time metrics (sub-second)\n- Automatic multi-GPU detection\n- Process monitoring (PID, memory usage)\n- Historical charts (utilization, temperature, power, clocks)\n- System metrics (CPU, RAM)\n- Scale from 1 to 100+ GPUs\n\n**Metrics:** Utilization, temperature, memory, power draw, fan speed, clock speeds, PCIe info, P-State, throttle status, encoder/decoder sessions\n\n---\n\n## Configuration\n\n**Environment variables:**\n```bash\nNVIDIA_VISIBLE_DEVICES=0,1     # Specific GPUs (default: all)\nNVIDIA_SMI=true                # Force nvidia-smi mode for older GPUs\nGPU_HOT_MODE=hub               # Set to 'hub' for multi-node aggregation (default: single node)\nNODE_NAME=gpu-server-1         # Node display name (default: hostname)\nNODE_URLS=http://host:1312...  
# Comma-separated node URLs (required for hub mode)\n```\n\n**Backend (`core/config.py`):**\n```python\nUPDATE_INTERVAL = 0.5  # Polling interval in seconds\nPORT = 1312            # Server port\n```\n\n---\n\n## API\n\n### HTTP\n```bash\nGET /              # Dashboard\nGET /api/gpu-data  # JSON metrics snapshot\nGET /api/version   # Version and update info\n```\n\n### WebSocket\n```javascript\nconst ws = new WebSocket('ws://localhost:1312/socket.io/');\n\nws.onmessage = (event) =\u003e {\n  const data = JSON.parse(event.data);\n  // data.gpus      — per-GPU metrics\n  // data.processes  — active GPU processes\n  // data.system     — host CPU, RAM, swap, disk, network\n};\n```\n\n---\n\n## Project Structure\n\n```\ngpu-hot/\n├── app.py                      # FastAPI server + routes\n├── version.py                  # Version info\n├── core/\n│   ├── config.py               # Configuration\n│   ├── monitor.py              # NVML GPU monitoring\n│   ├── handlers.py             # WebSocket handlers\n│   ├── hub.py                  # Multi-node hub aggregator\n│   ├── hub_handlers.py         # Hub WebSocket handlers\n│   ├── nvidia_smi_fallback.py  # nvidia-smi fallback for older GPUs\n│   └── metrics/\n│       ├── collector.py        # Metrics collection\n│       └── utils.py            # Metric utilities\n├── static/\n│   ├── css/\n│   │   ├── tokens.css          # Design tokens (colors, spacing)\n│   │   ├── layout.css          # Page layout (sidebar, main)\n│   │   └── components.css      # UI components (cards, charts)\n│   ├── js/\n│   │   ├── chart-config.js     # Chart.js configurations\n│   │   ├── chart-manager.js    # Chart data + lifecycle\n│   │   ├── chart-drawer.js     # Correlation drawer\n│   │   ├── gpu-cards.js        # GPU card rendering\n│   │   ├── socket-handlers.js  # WebSocket + batched rendering\n│   │   ├── ui.js               # Sidebar navigation\n│   │   └── app.js              # Init + version check\n│   └── favicon.svg\n├── 
templates/index.html\n├── Dockerfile\n├── docker-compose.yml\n└── requirements.txt\n```\n\n---\n\n## Troubleshooting\n\n**No GPUs detected:**\n```bash\nnvidia-smi  # Verify drivers work\ndocker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi  # Test Docker GPU access\n```\n\n**Hub can't connect to nodes:**\n```bash\ncurl http://node-ip:1312/api/gpu-data  # Test connectivity\nsudo ufw allow 1312/tcp                # Allow port 1312 through the firewall\n```\n\n**Performance issues:** Increase `UPDATE_INTERVAL` in `core/config.py` to poll less often\n\n---\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=psalias2006/gpu-hot\u0026type=date\u0026legend=top-left)](https://www.star-history.com/#psalias2006/gpu-hot\u0026type=date\u0026legend=top-left)\n\n## Contributing\n\nPRs welcome. Open an issue for major changes.\n\n## License\n\nMIT - see [LICENSE](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsalias2006%2Fgpu-hot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpsalias2006%2Fgpu-hot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsalias2006%2Fgpu-hot/lists"}