{"id":50385071,"url":"https://github.com/stackhpc/slurm-appliance-lab","last_synced_at":"2026-05-30T14:30:57.350Z","repository":{"id":262654854,"uuid":"887931563","full_name":"stackhpc/slurm-appliance-lab","owner":"stackhpc","description":"Separate repo for ansible-slurm-appliance training sessions","archived":false,"fork":false,"pushed_at":"2024-11-13T15:39:33.000Z","size":2161,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"artemis","last_synced_at":"2024-11-13T15:44:09.141Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jinja","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stackhpc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-13T14:28:24.000Z","updated_at":"2024-11-13T15:39:38.000Z","dependencies_parsed_at":"2024-11-13T21:01:19.512Z","dependency_job_id":null,"html_url":"https://github.com/stackhpc/slurm-appliance-lab","commit_stats":null,"previous_names":["stackhpc/slurm-appliance-lab"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/stackhpc/slurm-appliance-lab","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackhpc%2Fslurm-appliance-lab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackhpc%2Fslurm-appliance-lab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackhpc%2Fslurm-appliance-lab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackhpc%2Fslurm-appliance-lab/man
ifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stackhpc","download_url":"https://codeload.github.com/stackhpc/slurm-appliance-lab/tar.gz/refs/heads/artemis","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackhpc%2Fslurm-appliance-lab/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33696681,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-30T14:30:56.611Z","updated_at":"2026-05-30T14:30:57.341Z","avatar_url":"https://github.com/stackhpc.png","language":"Jinja","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Test deployment and image build on OpenStack](https://github.com/stackhpc/ansible-slurm-appliance/actions/workflows/stackhpc.yml/badge.svg)](https://github.com/stackhpc/ansible-slurm-appliance/actions/workflows/stackhpc.yml)\n\n# StackHPC Slurm Appliance\n\nThis repository contains playbooks and configuration to define a Slurm-based HPC environment. 
This includes:\n- [Rocky Linux](https://rockylinux.org/)-based hosts.\n- [OpenTofu](https://opentofu.org/) configurations to define the cluster's infrastructure-as-code.\n- Packages for Slurm and MPI software stacks from [OpenHPC](https://openhpc.community/).\n- Shared filesystem(s) using NFS (with in-cluster or external servers) or [CephFS](https://docs.ceph.com/en/latest/cephfs/) via [OpenStack Manila](https://wiki.openstack.org/wiki/Manila).\n- Slurm accounting using a MySQL database.\n- Monitoring integrated with Slurm jobs using Prometheus, ElasticSearch and Grafana.\n- A web-based portal from [OpenOndemand](https://openondemand.org/).\n- Production-ready default Slurm configurations for access and memory limits.\n- [Packer](https://developer.hashicorp.com/packer)-based image build configurations for node images.\n\nThe repository is expected to be forked for a specific HPC site but can contain multiple environments for e.g. development, staging and production clusters\nsharing a common configuration. It has been designed to be modular and extensible, so if you add features for your HPC site please feel free to submit PRs\nback upstream to us!\n\nWhile it is tested on OpenStack, it should work on any cloud with appropriate OpenTofu configuration files.\n\n## Demonstration Deployment\n\nThe default configuration in this repository may be used to create a cluster to explore use of the appliance. 
It provides:\n- Persistent state backed by an OpenStack volume.\n- NFS-based shared file system backed by another OpenStack volume.\n\nNote that the OpenOndemand portal and its remote apps are not usable with this default configuration.\n\nIt requires an OpenStack cloud, and an Ansible \"deploy host\" with access to that cloud.\n\nBefore starting ensure that:\n- You have root access on the deploy host.\n- You can create instances using a Rocky 9 GenericCloud image (or an image based on that).\n    - **NB**: In general it is recommended to use the [latest released image](https://github.com/stackhpc/ansible-slurm-appliance/releases) which already contains the required packages. This is built and tested in StackHPC's CI. However the appliance will install the necessary packages if a GenericCloud image is used.\n- You have an SSH keypair defined in OpenStack, with the private key available on the deploy host.\n- Created instances have access to the internet (note proxies can be set up through the appliance if necessary).\n- Created instances have accurate/synchronised time (for VM instances this is usually provided by the hypervisor; if not or for bare metal instances it may be necessary to configure a time service via the appliance).\n\n### Setup deploy host\n\nThe following operating systems are supported for the deploy host:\n\n- Rocky Linux 9\n- Rocky Linux 8\n\nThese instructions assume the deployment host is running Rocky Linux 8:\n\n    sudo yum install -y git python38\n    git clone https://github.com/stackhpc/ansible-slurm-appliance\n    cd ansible-slurm-appliance\n    ./dev/setup-env.sh\n\nYou will also need to install [OpenTofu](https://opentofu.org/docs/intro/install/rpm/).\n\n### Create a new environment\n\nUse the `cookiecutter` template to create a new environment to hold your configuration. In the repository root run:\n\n    . 
venv/bin/activate\n    cd environments\n    cookiecutter skeleton\n\nand follow the prompts to complete the environment name and description.\n\n**NB:** In subsequent sections this new environment is referred to as `$ENV`.\n\nNow generate secrets for this environment:\n\n    ansible-playbook ansible/adhoc/generate-passwords.yml\n\n### Define infrastructure configuration\n\nCreate an OpenTofu variables file to define the required infrastructure, e.g.:\n\n    # environments/$ENV/terraform/terraform.tfvars:\n\n    cluster_name = \"mycluster\"\n    cluster_net = \"some_network\" # *\n    cluster_subnet = \"some_subnet\" # *\n    key_pair = \"my_key\" # *\n    control_node_flavor = \"some_flavor_name\"\n    login_nodes = {\n        login-0: \"login_flavor_name\"\n    }\n    cluster_image_id = \"rocky_linux_9_image_uuid\"\n    compute = {\n        general = {\n            nodes: [\"compute-0\", \"compute-1\"]\n            flavor: \"compute_flavor_name\"\n        }\n    }\n\nVariables marked `*` refer to OpenStack resources which must already exist. 
The above is a minimal configuration - for all variables\nand descriptions see `environments/$ENV/terraform/terraform.tfvars`.\n\n### Deploy appliance\n\n    ansible-playbook ansible/site.yml\n\nYou can now log in to the cluster using:\n\n    ssh rocky@$login_ip\n\nwhere the IP of the login node is given in `environments/$ENV/inventory/hosts.yml`.\n\n\n## Overview of directory structure\n\n- `environments/`: See [docs/environments.md](docs/environments.md).\n- `ansible/`: Contains the Ansible playbooks to configure the infrastructure.\n- `packer/`: Contains automation to use Packer to build machine images for an environment - see the README in this directory for further information.\n- `dev/`: Contains development tools.\n\nFor further information see the [docs](docs/) directory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackhpc%2Fslurm-appliance-lab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstackhpc%2Fslurm-appliance-lab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackhpc%2Fslurm-appliance-lab/lists"}