https://github.com/antvirf/stui
Manage Slurm nodes and jobs with a Terminal User Interface
go gui hpc slurm tui tview
- Host: GitHub
- URL: https://github.com/antvirf/stui
- Owner: Antvirf
- Created: 2025-03-30T14:30:06.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-04-01T14:33:13.000Z (6 months ago)
- Last Synced: 2025-04-01T15:25:21.803Z (6 months ago)
- Topics: go, gui, hpc, slurm, tui, tview
- Language: Go
- Homepage:
- Size: 52.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# `stui` - Slurm terminal user interface
*Like [k9s](https://k9scli.io/), but for Slurm*
Terminal User Interface (TUI) for viewing and managing Slurm nodes and jobs.
## Features
- List and view nodes and jobs, and quickly filter the lists with a regexp
- View individual node details (`scontrol show node` equivalent)
- View individual job details (`scontrol show job` equivalent)
- Show `sdiag` output
## Installation
With [`go 1.22`](https://go.dev/doc/install) or newer installed:
```bash
go install github.com/antvirf/stui@latest
sudo mv ~/go/bin/stui /usr/bin
```
## Usage
1. Ensure your Slurm binaries are working and you can talk to your cluster, e.g. `sdiag` shows a valid output.
2. Run `stui`, or run `go run main.go` from inside the repo. See `-help` for arguments.
```
Usage of ./stui:
  -copied-lines-separator string
        string to use when separating copied lines in clipboard (default "\n")
  -copy-first-column-only
        if true, only copy the first column of the table to clipboard when copying (default true)
  -debug-multiplier int
        multiplier for nodes and jobs, helpful when debugging and developing (default 1)
  -job-view-columns string
        comma-separated list of scontrol fields to show in job view (default "JobId,UserId,Partition,JobName,JobState,RunTime,NodeList")
  -node-view-columns string
        comma-separated list of scontrol fields to show in node view (default "NodeName,Partitions,State,CPUTot,RealMemory,CPULoad,Reason,Sockets,CoresPerSocket,ThreadsPerCore,Gres")
  -refresh-interval duration
        interval in seconds when to refetch data (default 15ns)
  -request-timeout duration
        timeout setting for fetching data (default 4ns)
  -search-debounce-interval duration
        interval in milliseconds to wait before searching (default 50ns)
  -slurm-binaries-path string
        path where Slurm binaries like 'sinfo' and 'squeue' can be found (default "/usr/local/bin")
  -slurm-conf-location string
        path to slurm.conf for the desired cluster, sets 'SLURM_CONF' environment variable (default "/etc/slurm/slurm.conf")
  -slurm-restd-address string
        URI for Slurm REST API if available, including protocol and port
```
## Developing `stui`
The helpers below configure a locally running cluster with `888` virtual nodes across several partitions, which provides realistic data when working on `stui`.
```bash
make build-cluster # build Slurm with required options
make config-cluster # copy mock config to /etc/slurm/
make run-cluster # start `slurmctld` and `slurmd`
make launch-jobs # launch a few hundred sleep jobs
make stop-cluster # stop cluster
make setup # install pre-commit and download Go deps
```
## To-do
- Selector/limit by partition across both job and node views
- Control commands: Set node state and reason for all selected nodes
- Control commands: Cancel jobs / Send to top of queue for all selected jobs
- Improve handling of `sdiag` and other calls when no scheduler is available; by default they hang for a long time
- Ability to use `slurmrestd` / REST API instead of Slurm binaries