{"id":43806927,"url":"https://github.com/skobkin/amdgputop-web","last_synced_at":"2026-04-01T22:01:32.374Z","repository":{"id":333235208,"uuid":"1074858476","full_name":"skobkin/amdgputop-web","owner":"skobkin","description":"AMD GPU status monitor web panel written in Go using LLMs","archived":false,"fork":false,"pushed_at":"2026-03-21T17:43:32.000Z","size":643,"stargazers_count":4,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-22T07:41:20.770Z","etag":null,"topics":["amd","amdgpu","dashboard","gpu","gpu-monitoring","linux","metrics","metrics-collection","metrics-collector","metrics-exporter","metrics-gathering","metrics-visualization","monitoring","resource-usage","top","utilization"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/skobkin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-10-12T16:01:54.000Z","updated_at":"2026-03-21T17:38:40.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/skobkin/amdgputop-web","commit_stats":null,"previous_names":["skobkin/amdgputop-web"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/skobkin/amdgputop-web","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skobkin%2Famdgputop-web","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skobkin%2Famdgputo
p-web/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skobkin%2Famdgputop-web/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skobkin%2Famdgputop-web/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/skobkin","download_url":"https://codeload.github.com/skobkin/amdgputop-web/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skobkin%2Famdgputop-web/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31292631,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T21:15:39.731Z","status":"ssl_error","status_checked_at":"2026-04-01T21:15:34.046Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amd","amdgpu","dashboard","gpu","gpu-monitoring","linux","metrics","metrics-collection","metrics-collector","metrics-exporter","metrics-gathering","metrics-visualization","monitoring","resource-usage","top","utilization"],"created_at":"2026-02-05T23:02:44.442Z","updated_at":"2026-04-01T22:01:32.366Z","avatar_url":"https://github.com/skobkin.png","language":"Go","readme":"# amdgpu_top-web\n\n[![CI](https://github.com/skobkin/amdgputop-web/actions/workflows/ci.yml/badge.svg)](https://github.com/skobkin/amdgputop-web/actions/workflows/ci.yml)\n\nRead-only web UI for live AMD GPU telemetry inspired by 
the `amdgpu_top` CLI.\nThe backend is pure Go (stdlib HTTP + WebSockets) and the frontend is a compact\nPreact single-page app.\n\n![AMD GPU telemetry UI](docs/screenshot.webp \"Current UI snapshot\")\n\n## Features\n\n- 🖥️ Enumerates DRM GPUs and streams utilization, clocks, temps, VRAM/GTT usage.\n- 🧾 Optional “process top” view sourced from `/proc/*/fdinfo` with engine-time\n  deltas when exposed by the kernel.\n- 📈 Historical charts (uPlot) for the selected GPU with hover tooltips.\n- 🌐 REST endpoints for `/api/gpus`, `/api/gpus/\u003cid\u003e/metrics`, and `/api/gpus/\u003cid\u003e/procs`\n  alongside a WebSocket feed (`/ws`).\n- 📊 Optional Prometheus `/metrics` export with per-GPU telemetry (no per-process data).\n- ⚙️ Configuration via environment variables (`APP_*`), including sampler cadence,\n  process scanner limits, and allowed origins.\n\n## Quick Start (host build)\n\n```bash\ncd web \u0026\u0026 npm ci \u0026\u0026 npm run build\ngo build ./cmd/amdgputop-web\n./amdgputop-web            # listens on :8080 by default\n\n# Alternatively, run the default build pipeline:\n# make\n```\n\nThe frontend build output is generated into `internal/httpserver/assets/` and is\nembedded at compile time; those files are not committed to the repository.\n\nOn AMD hardware you can sanity-check the sampler without the web UI:\n\n```bash\ngo run ./cmd/sampler-test -sample\n```\n\n## Docker\n\nThe official image built by GitHub Actions is available here: [`ghcr.io/skobkin/amdgputop-web`](https://github.com/skobkin/amdgputop-web/pkgs/container/amdgputop-web).\n\n### Docker Compose\n\nExample Docker stack: https://git.skobk.in/skobkin/docker-stacks/src/branch/master/amdgputop-web\n\n### Running manually\n\nAn Alpine-based multi-stage image is defined in `Dockerfile`.\n\n```bash\ndocker build -t amdgputop-web:dev .\n\nVID_GID=$(getent group video | cut -d: -f3)\nRENDER_GID=$(getent group render | cut -d: -f3)\n\ndocker run --rm -p 8080:8080 \\\n  --device=/dev/dri \\\n  
--device=/dev/kfd \\\n  --group-add \"${VID_GID}\" \\\n  --group-add \"${RENDER_GID}\" \\\n  --pid=host \\\n  --cap-add SYS_PTRACE \\\n  --user root \\\n  amdgputop-web:dev\n```\n\n### Important notes\n\n\u003e **GPU names**: the image bundles Alpine's `/usr/share/hwdata/pci.ids`, so GPU\n\u003e model names resolve without any extra volume mounts. If you want to override\n\u003e the bundled database with the host's copy, bind-mount it explicitly.\n\n\u003e **Why root + `SYS_PTRACE`?** Reading `/proc/\u003cpid\u003e/fdinfo` for host workloads\n\u003e requires elevated privileges and the `CAP_SYS_PTRACE` capability. Running the\n\u003e container as `root` with `--cap-add SYS_PTRACE` is the simplest way to let the\n\u003e process scanner observe GPU clients outside the container. If you only need\n\u003e device-level metrics, you can omit `--pid=host`, `--user root`, and the extra\n\u003e capability and run with the default non-root user.\n\nRefer to `docs/DOCKER.md` for more detail, including why `--pid=host` is needed\nto observe host processes.\n\n#### Troubleshooting \u0026 permissions\n\n- The [permissions matrix](docs/DOCKER.md#permissions-matrix) explains which\n  flags, groups, and capabilities are required for device-only metrics versus\n  host process telemetry.\n- If the UI shows empty process tables or partial metrics, consult the\n  [troubleshooting section](docs/DOCKER.md#troubleshooting) for the most common\n  container permission fixes.\n\n## Configuration\n\n| Variable                   | Default             | Description                                                    |\n|----------------------------|---------------------|----------------------------------------------------------------|\n| `APP_LISTEN_ADDR`          | `:8080`             | HTTP listen address.                                           |\n| `APP_LOG_LEVEL`            | `INFO`              | Log verbosity (`DEBUG`, `INFO`, `WARN`, `ERROR`).              
|\n| `APP_ALLOWED_ORIGINS`      | `*`                 | Comma-separated origins allowed for WebSocket/HTTP.            |\n| `APP_DEFAULT_GPU`          | `auto`              | GPU pre-selected on connect (`auto` = first detected).         |\n| `APP_ENABLE_PROMETHEUS`    | `false`             | Enable `/metrics` endpoint with per-GPU telemetry when `true`. |\n| `APP_ENABLE_PPROF`         | `false`             | Expose Go pprof handlers on `/debug/pprof/*`.                  |\n| `APP_CHARTS_ENABLE`        | `true`              | Toggle historical charts feature.                              |\n| `APP_CHARTS_MAX_POINTS`    | `7200`              | Maximum data points retained per chart.                        |\n| `APP_SAMPLE_INTERVAL`      | `2s`                | Metrics sampling cadence.                                      |\n| `APP_PROC_ENABLE`          | `true`              | Toggle process scanner feature.                                |\n| `APP_PROC_SCAN_INTERVAL`   | `2s`                | Interval between process snapshot scans.                       |\n| `APP_PROC_MAX_PIDS`        | `5000`              | Upper bound on tracked process count per scan.                 |\n| `APP_PROC_MAX_FDS_PER_PID` | `64`                | Max file descriptors per PID to inspect.                       |\n| `APP_WS_MAX_CLIENTS`       | `1024`              | Maximum concurrent WebSocket clients.                          |\n| `APP_WS_WRITE_TIMEOUT`     | `3s`                | WebSocket write timeout.                                       |\n| `APP_WS_READ_TIMEOUT`      | `30s`               | WebSocket read timeout.                                        |\n| `APP_SYSFS_ROOT`           | `/sys`              | Override sysfs root (test-only).                               |\n| `APP_DEBUGFS_ROOT`         | `/sys/kernel/debug` | Override debugfs root (test-only).                             |\n| `APP_PROC_ROOT`            | `/proc`             | Override procfs root (test-only).       
                       |\n\nSee `internal/config/config.go` for the full list, including test-only roots\n(`APP_SYSFS_ROOT`, `APP_DEBUGFS_ROOT`, `APP_PROC_ROOT`).\n\n## Prometheus\n\nSet `APP_ENABLE_PROMETHEUS=true` to expose `GET /metrics`. The exporter\npublishes WebSocket counters along with the latest per-GPU telemetry pulled from\nthe sampler. Each gauge is labeled with `gpu_id` and includes:\n\n- Busy percentages for graphics and memory engines.\n- Current SCLK/MCLK frequencies, temperature, fan RPM, and power draw.\n- VRAM/GTT usage and capacity.\n- Timestamps and age for the most recent sample.\n\nPer-process statistics stay out of the Prometheus surface area.\n\n## Development\n\n```bash\n# Backend\ngo test ./...\n\n# Frontend\ncd web \u0026\u0026 npm ci \u0026\u0026 npm run build\n```\n\nCI (see `.github/workflows/ci.yml`) enforces `gofmt`, `go vet`, Go tests,\nfrontend build, and publishes tagged releases with Linux binaries and Docker\nimages.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fskobkin%2Famdgputop-web","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fskobkin%2Famdgputop-web","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fskobkin%2Famdgputop-web/lists"}