{"id":26842529,"url":"https://github.com/biocomputingup/elastic-slurm-in-openstack","last_synced_at":"2025-03-30T18:30:04.611Z","repository":{"id":283309662,"uuid":"866992664","full_name":"BioComputingUP/elastic-slurm-in-openstack","owner":"BioComputingUP","description":"Configure an elastic Slurm cluster on OpenStack cloud","archived":false,"fork":false,"pushed_at":"2025-03-19T14:57:13.000Z","size":63,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-19T15:39:51.052Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BioComputingUP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-03T08:59:52.000Z","updated_at":"2025-03-19T14:57:17.000Z","dependencies_parsed_at":"2025-03-19T15:49:57.284Z","dependency_job_id":null,"html_url":"https://github.com/BioComputingUP/elastic-slurm-in-openstack","commit_stats":null,"previous_names":["biocomputingup/elastic-slurm-in-openstack"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioComputingUP%2Felastic-slurm-in-openstack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioComputingUP%2Felastic-slurm-in-openstack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioComputingUP%2Felastic-slurm-in-openstack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioComp
utingUP%2Felastic-slurm-in-openstack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BioComputingUP","download_url":"https://codeload.github.com/BioComputingUP/elastic-slurm-in-openstack/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246363005,"owners_count":20765208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-30T18:30:03.600Z","updated_at":"2025-03-30T18:30:04.522Z","avatar_url":"https://github.com/BioComputingUP.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Slurm cluster in OpenStack cloud\nThese Ansible playbooks create and manage a dynamically allocated (elastic) Slurm cluster in an OpenStack cloud.\nThe cluster is based on CentOS 8 (Rocky 8) and [OpenHPC 2.x](https://openhpc.community/downloads/). Slurm configurations are based on the work contained \nin [Jetstream_Cluster](https://github.com/XSEDE/CRI_Jetstream_Cluster).\nThis repo is based on the project [slurm-cluster-in-openstack](https://github.com/CornellCAC/slurm-cluster-in-openstack)\nadapted for use with [CloudVeneto](https://cloudveneto.ict.unipd.it/) OpenStack cloud.\n\n## Prerequisites\n### Install Ansible\nRun the `install_ansible.sh` command:\n```bash\n./install_ansible.sh\n```\n\n### Configure CloudVeneto gateway (Gate) for SSH access\nFor this you should have a CloudVeneto account and access to the Gate machine (`cv_user` and `cv_pass`):\n```bash\n# generate a new key pair locally (preferably with passphrase). 
Skip and adapt if you already have a key pair:\nssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_vm\n\n# copy the public key to Gate machine (it will ask for cv_pass):\ncat ~/.ssh/id_ed25519_vm.pub | \\\n  ssh cv_user@gate.cloudveneto.it 'cat \u003eid_ed25519_vm.pub \u0026\u0026 \\\n  mkdir -p .ssh \u0026\u0026 \\\n  chmod 700 .ssh \u0026\u0026 \\\n  mv id_ed25519_vm.pub .ssh/id_ed25519_vm.pub \u0026\u0026 \\\n  cat .ssh/id_ed25519_vm.pub \u003e\u003e.ssh/authorized_keys'\n\n# copy the private key to Gate machine (it will ask for cv_pass):\ncat ~/.ssh/id_ed25519_vm | \\\n  ssh cv_user@gate.cloudveneto.it \\\n  'cat \u003e.ssh/id_ed25519_vm \u0026\u0026 chmod 600 .ssh/id_ed25519_vm'\n\n# connect to Gate machine (it will ask for SSH key passphrase, if used):\nssh -i ~/.ssh/id_ed25519_vm cv_user@gate.cloudveneto.it\n```\nIf you also have the credentials and IP of a VM running in the cloud (`vm_user`, `vm_pass`, `vm_ip`), you can add the public key to it:\n```bash\n# copy the public key from the Gate machine to VM (it will ask for vm_pass)\ncat ~/.ssh/id_ed25519_vm.pub | \\\n  ssh vm_user@vm_ip 'cat \u003e.ssh/id_ed25519_vm.pub \u0026\u0026 \\\n  cat .ssh/id_ed25519_vm.pub \u003e\u003e.ssh/authorized_keys'\n\n# test connection to VM from Gate machine (it will ask for SSH key passphrase, if used)\nssh -i ~/.ssh/id_ed25519_vm vm_user@vm_ip\nexit\n```\nAccessing a VM from your local machine requires proxying the SSH connection through the CloudVeneto Gate. 
You can do this with the following SSH command:\n```bash\n# (optionally) add key to ssh-agent (it may ask for SSH key passphrase)\nssh-add ~/.ssh/id_ed25519_vm\n\n# connect to VM via proxy\nssh -i ~/.ssh/id_ed25519_vm \\\n  -o StrictHostKeyChecking=accept-new \\\n  -o ProxyCommand=\"ssh -i ~/.ssh/id_ed25519_vm \\\n  -W %h:%p cv_user@gate.cloudveneto.it\" \\\n  vm_user@vm_ip\n```\nYou can simplify the SSH connection to the VM by adding both hosts to your SSH config file:\n```bash\n# append the Gate proxy and VM entries to the SSH config\ncat \u003c\u003cEOF | tee -a ~/.ssh/config\n\nHost cvgate\n\tHostName gate.cloudveneto.it\n\tUser cv_user\n\tIdentityFile ~/.ssh/id_ed25519_vm\n\nHost vm\n\tHostName vm_ip\n\tUser vm_user\n\tIdentityFile ~/.ssh/id_ed25519_vm\n\tUserKnownHostsFile /dev/null\n\tStrictHostKeyChecking=accept-new\n\tProxyJump cvgate\nEOF\n```\nTest the connection:\n```bash\n# connect to VM\nssh vm\n\n# copy files to and from VM with scp\nscp localdir/file vm:remotedir/\nscp vm:remotedir/file localdir/\n# or rsync\nrsync -ahv localdir/ vm:remotedir/\nrsync -ahv vm:remotedir/ localdir/\n```\n\n## Deploy Slurm Cluster\n### Download latest Rocky Linux 8 image\n```bash\nwget https://dl.rockylinux.org/pub/rocky/8/images/x86_64/Rocky-8-GenericCloud-Base.latest.x86_64.qcow2\n# no need to upload it to OpenStack, Ansible will do it\n# openstack image create --disk-format qcow2 --container-format bare --file Rocky-8-GenericCloud-Base.latest.x86_64.qcow2 rocky-8\n```\n### Configure cluster\nCopy `vars/main.yml.example` to `vars/main.yml` and adjust to your needs.\n\nCopy `clouds.yaml.example` to `clouds.yaml` and fill in your OpenStack credentials.\n\n### Deployment\nDeployment is done in four steps:\n1. Create the head node\n2. Provision the head node\n3. Create and provision the compute node\n4. 
Create the compute node image\n\n#### Create the head node\n```bash\nansible-playbook create_headnode.yml\n```\n\n#### Provision the head node\n```bash\nansible-playbook provision_headnode.yml\n```\n\n#### Create and provision the compute node\n```bash\nansible-playbook create_compute_node.yml\n```\n\n#### Create the compute node image\n```bash\nansible-playbook create_compute_image.yml\n```\n\n#### All-in-one deployment\n```bash\ntime ( \\\nansible-playbook create_headnode.yml \u0026\u0026 \\\nansible-playbook provision_headnode.yml \u0026\u0026 \\\nansible-playbook create_compute_node.yml \u0026\u0026 \\\nansible-playbook create_compute_image.yml \u0026\u0026 \\\necho \"Deployment completed\" || echo \"Deployment failed\" )\n```\nOr, with per-step timing and a desktop notification when the run finishes:\n```bash\n/bin/time -f \"\\n### overall time: \\n### wall clock: %E\" /bin/bash -c '\\\n/bin/time -f \"\\n### timing \\\"%C ...\\\"\\n### wall clock: %E\" ansible-playbook create_headnode.yml \u0026\u0026 \\\n/bin/time -f \"\\n### timing \\\"%C ...\\\"\\n### wall clock: %E\" ansible-playbook provision_headnode.yml \u0026\u0026 \\\n/bin/time -f \"\\n### timing \\\"%C ...\\\"\\n### wall clock: %E\" ansible-playbook create_compute_node.yml \u0026\u0026 \\\n/bin/time -f \"\\n### timing \\\"%C ...\\\"\\n### wall clock: %E\" ansible-playbook create_compute_image.yml \u0026\u0026 \\\necho \"Deployment completed\" | tee /dev/tty | notify-send -t 0 \"$(\u003c/dev/stdin)\" || \\\necho \"Deployment failed\" | tee /dev/tty | notify-send -t 0 \"$(\u003c/dev/stdin)\"'\n```\n\n### Cleanup\nDelete all cloud resources with:\n```bash\nansible-playbook destroy_cluster.yml\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiocomputingup%2Felastic-slurm-in-openstack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbiocomputingup%2Felastic-slurm-in-openstack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiocomputingup%2Felastic-slurm-in-openstack/lists"}