Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-cloud-hpc
A curated list of Cloud HPC.
https://github.com/kjrstory/awesome-cloud-hpc
Management Tool
- Azure HPC OnDemand Platform - Azure-based HPC cluster solution with features like Terraform, Ansible, Packer integration, job scheduling, autoscaling, and monitoring ([Repository](https://github.com/Azure/az-hop), [Marketplace](https://azuremarketplace.microsoft.com/marketplace/apps/azhpc.azhop)).
- HPC-NOW - The platform aims to simplify the process of starting and managing HPC workloads in the cloud.
- Alibaba E-HPC - Alibaba Cloud's computing service for resource management, job submission, performance analysis, and VNC in the E-HPC console.
- AWS ParallelCluster - Open source cluster management tool for deploying and managing HPC clusters ([Repository](https://github.com/aws/aws-parallelcluster)).
- Azure CycleCloud - Secure and flexible cloud HPC and Big Compute environments.
- CloudyCluster - Turn-Key Cloud HPC elastic orchestration with a familiar HPC look and feel.
- KT Cloud HPC - KT Cloud's HPC management product integrating Altair's solutions.
- OCI HPC Cluster - Automated HPC cluster deployment on OCI.
- OCI HPC File System (HFS) - Solution for deploying various HPC file servers on OCI.
- SCP HPC Cluster - HPC cluster environment on SCP.
- JedAI Cloud - Optimized HPC stacks by Define Tech that enable easy cluster management and on-demand HPC through pre-integrated solutions, delivering bare metal infrastructure, virtualized services, and containerized apps via a single management interface.
- TrinityX - Next-gen open-source HPC, AI, and cloud platform offering customizable installations with efficient provisioning, SLURM/OpenPBS, OpenHPC, and more for modern cluster management.
- AWS ParallelCluster UI - Front-end for AWS ParallelCluster.
- Flight Environment - The Flight User Suite for improved HPC access through CLI tools, the Flight Web Suite as a web interface for HPC end-users, and the Flight Admin Tools for administrative HPC environment configuration.
- Cluster in the Cloud - Multi-cloud solution that uses Terraform for infrastructure setup, Ansible for software configuration, and Slurm with custom Python scripts for dynamic node management in a cloud-based HPC environment.
- Magic Castle - Multi-cloud HPC cluster solution that leverages Terraform and Puppet for deployment, featuring job scheduling with Slurm and over 3000 research software applications.
- GCP HPC Toolkit - Google Cloud's open-source software for deploying high-performance computing environments on GCP, featuring customizable Terraform modules and Packer integration ([Repository](https://github.com/GoogleCloudPlatform/hpc-toolkit)).
IaaS-Image
- Azhpc-images - Installation scripts for HPC images in Azure Marketplace, specifically CentOS-HPC, Ubuntu-HPC, and AlmaLinux-HPC.
- Flight Solo - HPC-ready, platform-agnostic image approach to deploying HPC resources, powered by Alces Flight.
- GCP HPC-ready VM - CentOS 7.9 or Rocky Linux 8 based VM image that is optimized for tightly coupled HPC workloads ([Marketplace CentOS 7](https://console.cloud.google.com/marketplace/product/click-to-deploy-images/hpc-vm-image-centos-7), [Marketplace Rocky Linux 8](https://console.cloud.google.com/marketplace/product/click-to-deploy-images/hpc-vm-image-rocky-linux-8?q=search&referrer=search)).
- HPC Pack 2019 - Microsoft HPC Pack 2019 image powered by Cloud Infrastructure Services ([Marketplace(Azure)](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/cloud-infrastructure-services.hpc2019-windows-server-2019), [Marketplace(AWS)](https://aws.amazon.com/marketplace/pp/prodview-hxo3dtqd4srdk), [Marketplace(GCP)](https://console.cloud.google.com/marketplace/product/cloud-infrastructure-services/hpc2019-windows-2019)).
- HPCBOX - Desktop-centric, intelligent workflow cloud HPC platform for automating and executing your application pipelines ([Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps?search=hpcbox)).
- NVIDIA Virtual Machine Images - Operating system environment for running NVIDIA GPU accelerated software in the cloud.
Job Scheduler
- Slurm on Google Cloud Platform - Open-source software solution that enables setting up Slurm clusters on Google Cloud Platform with ease.
- Altair Access - HPC Job Submission Portal for Researchers and Engineers.
- Altair NavOps - Cloud Migration, Automation, and Spend Management for HPC.
- Altair Grid Engine - Distributed Resource Management and Optimization.
- Altair PBS-Professional - Industry-leading Workload Manager and Job Scheduler for HPC and High-throughput Computing.
- MS HPC Pack
- Altair Control - HPC Administrator's Control Center for Managing, Optimizing, and Forecasting Resources with seamless cloud bursting capabilities.
- Altair HPCWorks - High-Performance Computing (HPC) and Cloud Platform by Altair.
- IBM Spectrum LSF Suites - Workload management platform and job scheduler for HPC with dynamic HPC cloud support for all major cloud providers ([Repository](https://github.com/IBM/ibm-spectrum-scale-cloud-install)).
- Slurm Power Saving Guide - Guide to suspending and resuming nodes as needed, with support for cloud integration with providers like AWS, GCP, and Azure for workload management and cloud bursting.
Recipes
- Azure HPC - Easy automation scripts for building an HPC environment in Azure.
- Cloud MPI - Collection of scripts for optimizing MPI performance in tightly coupled HPC workloads on GCP Compute Engines.
- Dynamic EC2 budget control - Dynamic EC2 core allocation limit for each business unit (BU), automatically adapted according to spending over a past time frame (e.g. one week) on AWS ParallelCluster.
- HPC Recipes for AWS - Example recipes that demonstrate how to build HPC systems using AWS ParallelCluster, Research and Engineering Studio on AWS, and other AWS products.
Azure CycleCloud
Solution
IaaS-Server
- Amazon EC2 Hpc7g - HPC-optimized instances powered by AWS Graviton3E processors.
- Amazon EC2 Hpc7a - HPC-optimized instances powered by 4th Generation AMD EPYC processors.
- Amazon EC2 Hpc6id - HPC-optimized instances powered by 3rd Generation Intel Xeon Scalable processors.
- Amazon EC2 P5 - GPU instances powered by NVIDIA H100 GPUs.
- Amazon EC2 P4 - GPU instances powered by NVIDIA A100 (80 GB, 40 GB) GPUs.
- Amazon EC2 P3 - GPU instances powered by NVIDIA V100 GPUs.
- Amazon EC2 G5 - GPU instances powered by NVIDIA A10G GPUs and 2nd Gen AMD EPYC processors.
- Azure HBv4-series - HPC-optimized instances powered by 4th Generation AMD EPYC processors.
- Azure HBv3-series - HPC-optimized instances powered by 3rd Generation AMD EPYC processors.
- Azure HBv2-series - HPC-optimized instances powered by 2nd Generation AMD EPYC processors.
- Azure HB-series - HPC-optimized instances powered by 1st Generation AMD EPYC processors.
- Azure HC-series - HPC-optimized instances powered by 1st Generation Intel Xeon Scalable processors.
- Azure HX-series - Instances optimized for workloads that require significant memory capacity, with twice the memory capacity of HBv4.
- Azure NDm H100 v5-series - GPU instances powered by NVIDIA H100 GPUs.
- Azure NDm A100 v4-series - GPU instances powered by NVIDIA A100 (80 GB) GPUs and 3rd Generation AMD EPYC processors.
- Azure NC A100 v4-series - GPU instances powered by NVIDIA A100 (40 GB) GPUs and 3rd Generation AMD EPYC processors.
- Azure NCv3-series - GPU instances powered by NVIDIA V100 GPUs.
- Azure NCasT4_v3-series - GPU instances powered by NVIDIA T4 GPUs and 2nd Gen AMD EPYC CPUs.
- Super Computing Cluster - Based on ECS Bare Metal Instance powered by Alibaba Cloud, utilizes high-speed RDMA-based connections to enhance network performance and acceleration ratio in large-scale clusters, providing high-bandwidth and low-latency networks.
- GCP G2 machine-series - GPU instances powered by NVIDIA L4 GPUs.
IaaS-Network
- Azure InfiniBand - RDMA capable HB-series and N-series VMs communicate over the InfiniBand network.
- Elastic Fabric Adapter - Network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale.
- Compute Clusters - Group of high performance computing (HPC), GPU, or optimized instances on OCI that are connected with a high-bandwidth, ultra low-latency network.
IaaS-Storage
- Amazon FSx for Lustre - Fully managed shared storage with the scalability and performance of the popular Lustre file system.
- Amazon FSx for OpenZFS - Fully managed shared storage built on the popular OpenZFS file system.
- Azure HPC Cache
- Azure Managed Lustre - Managed, pay-as-you-go file system for high-performance computing (HPC) and AI workloads.
- Azure NetApp Files - Enterprise-grade Azure file shares, powered by NetApp.
- GCP File Store - High-performance, fully managed file storage.
- GCP Parallel Store - Based on Intel DAOS and delivers up to 6.3x greater read throughput performance compared to competitive Lustre scratch offerings.
PaaS
- AWS Batch - Fully managed batch computing service.
- Azure Batch - Cloud-scale job scheduling and compute management.
- GCP Batch - Fully managed batch service to schedule, queue, and execute batch jobs on Google's infrastructure.
- NICE DCV - High-performance remote display protocol that provides customers with a secure way to deliver remote desktops and application streaming.
- NICE EnginFrame - Unified interface to submit jobs for both on-premises and cloud workflow.
- Research and Engineering Studio - Open source, easy-to-use web-based portal for administrators to create and manage secure cloud-based research and engineering environments on AWS.
- Rntier Cloud - R&D cloud platform enabling easy and quick access to complex HPC simulations, vGPU-based remote 3D design, and multi-GPU deep learning environments via a web browser.
- Scyld Cloud Central™ - Fully managed, cloud-based, end-to-end solution for high performance computing that makes it easier and faster for end-users, developers, and data scientists to deploy pure HPC, pure AI, and converged HPC/AI workloads on high-performance clusters.
- Scyld ClusterWare - Intelligent suite of management functionality, including node provisioning, image customization, and cluster monitoring, while serving as a platform for additional software and schedulers.
- Scyld Cloud Workstation - Unparalleled performance and a breadth of features that allow it to stand out as a solution for remote access.
- AWS Parallel Computing Service - Managed service for HPC cluster deployment and scaling on AWS using Slurm.
- Batch Compute - Cloud service for massive simultaneous batch processing on Alibaba Cloud.
- Amazon DCV - High-performance remote display protocol that provides customers with a secure way to deliver remote desktops and application streaming.
- NI SP EF Portal - Unified interface to submit jobs for both on-premises and cloud workflow.
- Skypilot - Framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution ([Repository](https://github.com/skypilot-org/skypilot)).
SaaS
- CloudHPC - On-demand cloud computing for CAE engineering simulations powered by CFD FEA SERVICE.
- Nimbix - A comprehensive cloud computing solution powered by Atos, offering access to the HyperHub Application Marketplace with over 1,000 high-performance applications and workflows for diverse industries ([Repository](https://github.com/nimbix)).
- Sabalcore - User-friendly, pay-as-you-go high performance computing cloud service with a full-featured, light-weight client that doesn't require a browser.
- Scala Computing - Optimized, automated cloud-based HPC resource management platform with integrated network simulation and EDA tools, offering flexible, on-demand computing, secure workflows, and global infrastructure access.
- TAESUNG Cloud - Offering Ansys applications as a service in a cloud-based SaaS.
- dicehub - Real-time collaborative CFD (Computational Fluid Dynamics) simulation platform that simplifies your engineering workflow, offers massively parallel scaling, and runs in a web browser.
- Uber Cloud - A platform featuring HD 3D graphics desktop GUI, BYOL simulation software support, scalable container-based architecture, and automated cloud computing on AWS, Azure, Google, and HPE.
- Kaleidosim - Enables browser-based access to HPC software through advanced cloud orchestration technology.
- OnScale Solve - The cloud engineering simulation platform built by engineers for engineers.
- SyncHPC - Powerful and flexible hybrid HPC and VDI management platform that provides a comprehensive solution for managing high-performance computing (HPC) and Virtual Desktop Infrastructure (VDI) resources.
- EPIC - Web-based platform created by Zenotech, primarily for CFD applications, which also includes Zenotech's ZCFD.
- Luminary Cloud - A cloud-based, pay-per-use SaaS simulation platform with a fast, GPU-powered, cloud-native CFD solver and comprehensive high-fidelity capabilities.
CAE and EDA ISV
- Altair One - Cloud Gateway offering dynamic and collaborative access to simulation and data analytics technology, along with scalable HPC and cloud resources.
- Altair Unlimited - A turnkey, state-of-the-art private appliance available in both on-premises and cloud-based formats, offering unlimited access to a wide range of Altair HyperWorks solver software.
- Ansys Cloud Direct - Cloud-based interactive workstations and HPC clusters, with flexible licensing that can be accessed from desktop.
- Ansys Gateway by AWS - Cloud-based solution for managing Ansys Simulation & CAD/CAE developments via a web browser.
- Cadence OnCloud Platform - SaaS software platform for all your system design and simulation needs that can operate on any hardware, removing the requirement to run and maintain expensive infrastructure hardware.
- Simulia Cloud
- Synopsys Cloud - Platform that enables delivery of EDA tools, IP and infrastructure for end-to-end chip design through a browser.
- Managed Cloud Service - EDA-optimized platform powered by Cadence that provides a fully integrated and proven cloud environment to jump-start product design, verification, and implementation.
- Palladium and Protium Cloud - Emulation and prototyping offering powered by Cadence that provides pre-silicon hardware system verification and debug.
- 3DEXPERIENCE platform on the cloud - Complete suite of industry-leading apps and software (CATIA, SIMULIA, DELMIA, 3DEXCITE, etc.) powered by Dassault Systèmes.
- Cloud Passport - Cloud-ready tools powered by Cadence that have been optimized for use in customers' own cloud environment.
- Ansys Access on Microsoft Azure - Cloud-based simulation solution available on the Azure Marketplace, offering fast, scalable access to Ansys applications ([Marketplace](https://azuremarketplace.microsoft.com/marketplace/apps/ansys.ansysaccessonmicrosoftazure?tab=overview)).
- Simcenter Cloud HPC - Part of the Xcelerator as a Service (XaaS) offering powered by Siemens, offers increased flexibility and scalability for CFD simulations with no additional setup needed.
Resource
Blog Documentation YouTube
- Day 1 HPC - AWS engineering's HPC community site.