{"id":34519875,"url":"https://github.com/dedalus-labs/slurmq","last_synced_at":"2026-01-13T21:04:07.358Z","repository":{"id":330370419,"uuid":"1121488576","full_name":"dedalus-labs/slurmq","owner":"dedalus-labs","description":"Quota monitoring and management for Slurm","archived":false,"fork":false,"pushed_at":"2025-12-25T02:06:48.000Z","size":452,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-26T13:52:25.926Z","etag":null,"topics":["cli","devops","gpu","hpc","monitoring","quota","slurm","tooling"],"latest_commit_sha":null,"homepage":"https://dedalus-labs.github.io/slurmq","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dedalus-labs.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-23T04:38:20.000Z","updated_at":"2025-12-25T20:36:28.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/dedalus-labs/slurmq","commit_stats":null,"previous_names":["dedalus-labs/slurmq"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/dedalus-labs/slurmq","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dedalus-labs%2Fslurmq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dedalus-labs%2Fslurmq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dedalus-labs%2Fslurmq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dedalus-labs%2Fslurmq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dedalus-labs","download_url":"https://codeload.github.com/dedalus-labs/slurmq/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dedalus-labs%2Fslurmq/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28220352,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2026-01-05T02:00:06.358Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","devops","gpu","hpc","monitoring","quota","slurm","tooling"],"created_at":"2025-12-24T04:39:59.190Z","updated_at":"2026-01-13T21:04:07.353Z","avatar_url":"https://github.com/dedalus-labs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# slurmq\n\nGPU quota management for Slurm clusters.\n\n```console\n$ slurmq check\n\n╭──────────────────── GPU Quota Report ────────────────────╮\n│                                                          │\n│   User:     dedalus                                      │\n│   QoS:      medium                                       │\n│   Cluster:  Stella HPC                                   │\n│                                                          │\n│   ████████████████████░░░░░░░░░░ 68.5%                   │\n│                                                          │\n│   Used:      342.5 GPU-hours                             │\n│   Remaining: 157.5 GPU-hours                             │\n│   Quota:     500 GPU-hours (rolling 30 days)             │\n│                                                          │\n╰──────────────────────────────────────────────────────────╯\n```\n\n## Install\n\n```bash\nuv tool install slurmq\n```\n\n## Setup\n\n```bash\nslurmq config init       # interactive wizard\nslurmq config show       # verify settings\nslurmq config validate   # check syntax before deploy\n```\n\nConfig resolution order:\n\n1. `SLURMQ_CONFIG` env var\n2. `~/.config/slurmq/config.toml` (user)\n3. `/etc/slurmq/config.toml` (system-wide)\n\n```toml\ndefault_cluster = \"stella\"\n\n[clusters.stella]\nname = \"Stella HPC\"\naccount = \"research\"\nqos = [\"low\", \"medium\"]\nquota_limit = 500        # GPU-hours\nrolling_window_days = 30\n```\n\n## Commands\n\n### check\n\n```bash\nslurmq check                  # current user\nslurmq check --user alice     # specific user\nslurmq check --cluster other  # different cluster\nslurmq check --forecast       # usage projection\nslurmq --json check           # machine-readable\nslurmq --quiet check          # silent on success (for scripts)\n```\n\n### efficiency\n\nAnalyze job resource efficiency (like `seff`).\n\n```bash\nslurmq efficiency 12345\n```\n\nFlags low efficiency: CPU \u003c 30%, Memory \u003c 20%.\n\n### report\n\nGenerate usage reports (admin).\n\n```bash\nslurmq report                          # table view\nslurmq report --format csv -o out.csv\n```\n\n### monitor\n\nReal-time monitoring with optional enforcement (admin).\n\n```bash\nslurmq monitor                # live dashboard, 30s refresh\nslurmq monitor --interval 10\nslurmq monitor --once         # single check, for cron\nslurmq monitor --enforce      # cancel jobs over quota\n```\n\n### stats\n\nCluster-wide analytics with month-over-month comparison.\n\n```bash\nslurmq stats                          # GPU utilization + wait times\nslurmq stats --days 14                # custom period\nslurmq stats --no-compare             # skip MoM comparison\nslurmq stats -p gpu -p gpu-large      # specific partitions\nslurmq stats --small-threshold 25     # custom job size threshold\nslurmq --json stats                   # machine-readable\n```\n\nShows:\n\n- GPU utilization by partition/QoS\n- Wait time analysis (median, % jobs waiting \u003e 6h)\n- Small vs large job breakdown\n- Month-over-month trends\n\n## Enforcement\n\nCancel jobs automatically when users exceed quota.\n\n```toml\n[enforcement]\nenabled = true\ndry_run = true            # preview mode\ngrace_period_hours = 24   # warn before cancel\nexempt_users = [\"admin\"]\nexempt_job_prefixes = [\"checkpoint_\"]\n```\n\nRun with `slurmq monitor --enforce`. Disable `dry_run` when ready.\n\nGrace period: users exceeding quota get a warning window before jobs are cancelled.\n\n## Job States\n\nProblematic states are highlighted:\n\n| State | Meaning       |\n| ----- | ------------- |\n| `OOM` | Out of Memory |\n| `TO`  | Timeout       |\n| `NF`  | Node Failure  |\n| `F`   | Failed        |\n| `PR`  | Preempted     |\n\n## Scripting\n\n```bash\n# check quota status\nif slurmq --json check | jq -e '.status == \"exceeded\"' \u003e /dev/null; then\n  echo \"Quota exceeded\"\nfi\n\n# cron: enforce every 5 minutes (quiet mode)\n*/5 * * * * slurmq --quiet monitor --once --enforce \u003e\u003e /var/log/slurmq.log 2\u003e\u00261\n```\n\n## Documentation\n\n**Online:** [dedalus-labs.github.io/slurmq](https://dedalus-labs.github.io/slurmq)\n\n**For LLMs:** [llms.txt](https://dedalus-labs.github.io/slurmq/llms.txt) | [llms-full.txt](https://dedalus-labs.github.io/slurmq/llms-full.txt)\n\n**Locally:**\n\n```bash\nuv sync --extra docs\nuv run mkdocs serve\n```\n\n## Development\n\n```bash\ngit clone https://github.com/dedalus-labs/slurmq.git \u0026\u0026 cd slurmq\nuv sync --all-extras\nuv run pytest\nuv run ruff check\nuv run ty check\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdedalus-labs%2Fslurmq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdedalus-labs%2Fslurmq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdedalus-labs%2Fslurmq/lists"}