{"id":13467561,"url":"https://github.com/trevor-vincent/awesome-high-performance-computing","last_synced_at":"2025-04-05T03:30:44.474Z","repository":{"id":38270507,"uuid":"342449954","full_name":"trevor-vincent/awesome-high-performance-computing","owner":"trevor-vincent","description":"A curated list of awesome high performance computing resources","archived":false,"fork":false,"pushed_at":"2024-10-29T19:04:17.000Z","size":2422,"stargazers_count":629,"open_issues_count":0,"forks_count":64,"subscribers_count":22,"default_branch":"main","last_synced_at":"2024-10-29T21:19:05.710Z","etag":null,"topics":["awesome","awesome-list","hpc"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/trevor-vincent.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-26T03:19:12.000Z","updated_at":"2024-10-29T19:04:21.000Z","dependencies_parsed_at":"2023-10-16T16:56:02.994Z","dependency_job_id":"22936b94-2860-4e5e-8b36-ce09306d9e4d","html_url":"https://github.com/trevor-vincent/awesome-high-performance-computing","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trevor-vincent%2Fawesome-high-performance-computing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trevor-vincent%2Fawesome-high-performance-computing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trevor-vincent%2Fawesome-high-performance-computing/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trevor-vincent%2Fawesome-high-performance-computing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/trevor-vincent","download_url":"https://codeload.github.com/trevor-vincent/awesome-high-performance-computing/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247284911,"owners_count":20913691,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["awesome","awesome-list","hpc"],"created_at":"2024-07-31T15:00:57.912Z","updated_at":"2025-04-05T03:30:44.423Z","avatar_url":"https://github.com/trevor-vincent.png","language":null,"funding_links":[],"categories":["Others","High-performance computing","Other Lists","⭐ Acknowledgements","Software Engineering"],"sub_categories":["TeX Lists","Learning Tools","HPC"],"readme":"\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://www.montana.edu/uit/rci/assets/hpc.png\" width=\"600\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nA curated list of awesome high performance computing resources. 
\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://awesome.re\"\u003e\n    \u003cimg src=\"https://awesome.re/badge.svg\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n## Table of Contents\n\n - [General Info](#general-info)\n - [Software](#software)\n - [Hardware](#hardware) \n - [People](#people)\n - [Resources](#resources)\n - [Other Curated Lists](#other-curated-lists)\n - [Acknowledgements](#acknowledgements )\n\n## General Info\n\n### A Few Upcoming Supercomputers \n - [Tianhe-3](https://www.nextplatform.com/2019/05/02/china-fleshes-out-exascale-design-for-tianhe-3/) - 2022, ~700 Petaflop (Linpack500)\n - [Venado](https://discover.lanl.gov/news/0530-venado/) - 2024, Grace-Hopper based ~10 exaflops\n   \n### Most Recent List of the Top500 Supercomputers\n - [Top500 (Nov. 2024)](https://www.top500.org/lists/top500/2024/11/)\n - [HPCG Top500 (Nov. 2024)](https://www.top500.org/lists/hpcg/2024/11/)\n - [Green500 (Nov. 2024)](https://www.top500.org/lists/green500/2024/11/)\n - [io500](https://io500.org/)\n \n### History\n - [History of Supercomputing (Wikipedia)](https://en.wikipedia.org/wiki/History_of_supercomputing)\n - [History of Parallel Computing (Wikipedia)](https://en.wikipedia.org/wiki/Parallel_computing#History)\n - [History of the Top500 (Wikipedia)](https://en.wikipedia.org/wiki/TOP500)\n - [History of LLNL Computing](https://computing.llnl.gov/about/machine-history)\n - [The Supermen: The Story of Seymour Cray ... 
(1997)](https://www.amazon.ca/Supermen-Seymour-Technical-Wizards-Supercomputer/dp/0471048852/ref=sr_1_1?crid=1IOWC3IOYWPOP\u0026keywords=seymour+cray\u0026qid=1690959561\u0026sprefix=seymour+cray%2Caps%2C88\u0026sr=8-1)\n - [Unmatched - 50 Years of Supercomputing (2023)](https://www.routledge.com/Unmatched-50-Years-of-Supercomputing/Barkai/p/book/9780367479619)\n   \n### Trends\n - [Trends in HPC for AI workloads](https://epochai.org/trends)\n \n## Software\n\n#### Popular HPC Programming Libraries/APIs/Tools/Standards/Simulators\n- [alpaka](https://github.com/alpaka-group/alpaka) - The alpaka library is a header-only C++17 abstraction library for accelerator development\n- [async-rdma](https://github.com/datenlord/async-rdma) - A framework for writing RDMA applications with high-level abstraction and asynchronous APIs\n- [CAF](https://github.com/actor-framework/actor-framework) - An Open Source Implementation of the Actor Model in C++\n- [Chapel](https://chapel-lang.org/) - A Programming Language for Productive Parallel Computing on Large-scale Systems\n- [Charm++](http://charm.cs.illinois.edu/research/charm) - Parallel Programming with Migratable Objects\n- [Cilk Plus](https://www.cilkplus.org/) - C/C++ Extension for Data and Task Parallelism\n- [Codon](https://github.com/exaloop/codon) - high-performance Python compiler that compiles Python code to native machine code without any runtime overhead\n- [CUDA](https://developer.nvidia.com/cuda-toolkit) - High performance NVIDIA GPU acceleration\n- [dask](https://dask.org) - Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love\n- [DeepSpeed](https://github.com/microsoft/DeepSpeed) - An easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for Deep Learning Training and Inference\n- [DeterminedAI](https://www.determined.ai/) - Distributed deep learning\n- [FastFlow](https://github.com/fastflow/fastflow) - High-performance 
Parallel Patterns in C++\n- [Galois](https://github.com/IntelligentSoftwareSystems/Galois) - A C++ Library to Ease Parallel Programming with Irregular Parallelism\n- [Halide](https://halide-lang.org/index.html#gettingstarted) - A language for fast, portable computation on images and tensors\n- [Heteroflow](https://github.com/Heteroflow/Heteroflow) - Concurrent CPU-GPU Task Programming using Modern C++\n- [highway](https://github.com/google/highway) - Performance portable SIMD intrinsics\n- [HIP](https://github.com/ROCm-Developer-Tools/HIP) - HIP is a C++ Runtime API and Kernel Language for AMD/Nvidia GPU\n- [HPC-X](https://developer.nvidia.com/networking/hpc-x) - Nvidia implementation of MPI\n- [HPX](https://github.com/STEllAR-GROUP/hpx) - A C++ Standard Library for Concurrency and Parallelism\n- [Horovod](https://github.com/horovod/horovod) - Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet\n- [ISPC](https://ispc.github.io/) - An open-source compiler for high-performance SIMD programming on the CPU and GPU\n- [Intel ISPC](https://github.com/ispc/ispc) - SPMD compiler\n- [Intel TBB](https://www.threadingbuildingblocks.org/) - Threading Building Blocks\n- [joblib](https://joblib.readthedocs.io/en/latest/why.html) - Data-flow programming for performance (python)\n- [Kompute](https://github.com/KomputeProject/kompute) - The general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA \u0026 friends)\n- [Kokkos](https://github.com/kokkos/kokkos) - A C++ Programming Model for Writing Performance Portable Applications on HPC platforms\n- [Kubeflow MPI Operator](https://github.com/kubeflow/mpi-operator) - MPI Operator for Kubeflow\n- [Legate](https://github.com/nv-legate/legate.numpy) - Nvidia replacement for numpy based on Legion\n- [Legion](https://github.com/StanfordLegion/legion) - Distributed heterogeneous programming library\n- [MAGMA](https://developer.nvidia.com/magma) - Next generation 
linear algebra (LA) GPU accelerated libraries\n- [Merlin](https://merlin.readthedocs.io/en/latest/) - A distributed task queuing system, designed to allow complex HPC workflows to scale to large numbers of simulations\n- [Metal](https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu) - Apple's GPU API\n- [Microsoft MPI](https://docs.microsoft.com/en-us/message-passing-interface/microsoft-mpi) - Microsoft's implementation of MPI\n- [MOGSLib](https://github.com/ECLScheduling/MOGSLib) - User defined schedulers\n- [mpi4jax](https://github.com/mpi4jax/mpi4jax) - Zero-copy MPI for JAX arrays\n- [mpi4py](https://mpi4py.readthedocs.io/en/stable/) - Python bindings for MPI\n- [MPI](https://www.open-mpi.org/) - Open MPI implementation of the Message Passing Interface\n- [MPI](https://www.mpich.org/) - MPICH implementation of the Message Passing Interface\n- [MPI Standardization Forum](https://www.mpi-forum.org/) - Forum for MPI standardization\n- [MVAPICH](https://mvapich.cse.ohio-state.edu/) - Implementation of MPI\n- [NCCL](https://developer.nvidia.com/nccl) - The NVIDIA Collective Communication Library for multi-GPU and multi-node communication\n- [cuNumeric](https://developer.nvidia.com/cunumeric) - GPU drop-in for numpy\n- [stdpar](https://developer.nvidia.com/blog/accelerating-standard-c-with-gpus-using-stdpar/) - GPU accelerated C++ from NVIDIA\n- [numba](https://numba.pydata.org/) - A JIT compiler that translates a subset of Python into fast machine code\n- [oneAPI](https://www.oneapi.io/) - A unified, multiarchitecture, multi-vendor programming model\n- [OpenACC](https://www.openacc.org/) - \"OpenMP for GPUs\"\n- [OpenCilk](https://www.opencilk.org/) - MIT continuation of Cilk Plus\n- [OpenMP](https://www.openmp.org/) - Multi-platform Shared-memory Parallel Programming in C/C++ and Fortran\n- [PVM](https://www.csm.ornl.gov/pvm/) - Parallel Virtual Machine: A predecessor to MPI for distributed computing\n- 
[PMIX](https://pmix.github.io/standard) - Standard for process management\n- [Pollux](https://github.com/polluxio/pollux-payload) - Message Passing Cloud orchestrator\n- [Pyfi](https://github.com/radiantone/pyfi) - Distributed flow and computation system\n- [Pyper](https://github.com/pyper-dev/pyper) - concurrent python made simple\n- [RAJA](https://github.com/LLNL/RAJA) - Architecture and programming model portability for HPC applications\n- [RaftLib](https://github.com/RaftLib/RaftLib) - A C++ Library for Enabling Stream and Dataflow Parallel Computation\n- [ray](https://www.ray.io/) - Scale AI and Python workloads from reinforcement learning to deep learning\n- [ROCM](https://rocmdocs.com/en/latest/) - First open-source software development platform for HPC/Hyperscale-class GPU computing\n- [RS MPI](https://rsmpi.github.io/rsmpi/mpi/index.html) - Rust bindings for MPI\n- [Scalix](https://github.com/NAGAGroup/Scalix) - Data parallel computing framework\n- [Simgrid](https://simgrid.org/) - Simulate cluster/HPC environments\n- [SkelCL](https://skelcl.github.io/) - A Skeleton Library for Heterogeneous Systems\n- [STAPL](https://parasol.tamu.edu/stapl/) - Standard Template Adaptive Parallel Programming Library in C++\n- [STLab](http://stlab.cc/libraries/concurrency/) - High-level Constructs for Implementing Multicore Algorithms with Minimized Contention\n- [SYCL](https://www.khronos.org/sycl/) - C++ Abstraction layer for heterogeneous devices\n- [Taichi](https://github.com/taichi-dev/taichi) - Parallel programming language for high-performance numerical computations in Python\n- [Taskflow](https://github.com/taskflow/taskflow) - A Modern C++ Parallel Task Programming Library\n- [The Open Community Runtime](https://wiki.modelado.org/Open_Community_Runtime) - Specification for Asynchronous Many Task systems\n- [Transwarp](https://github.com/bloomen/transwarp) - A Header-only C++ Library for Task Concurrency\n- [Triton](https://triton-lang.org/main/index.html) - Triton 
is a language and compiler for parallel programming\n- [Tuplex](https://tuplex.cs.brown.edu/) - Blazing fast python data science\n- [UCX](https://github.com/openucx/ucx#using-ucx) - Optimized, production-proven communication framework\n- [Zluda](https://github.com/vosen/ZLUDA) - Run unmodified CUDA applications with near-native performance on Intel and AMD GPUs.\n- [HyperQueue](https://github.com/It4innovations/hyperqueue) - HyperQueue is a tool designed to simplify execution of large workflows (task graphs) on HPC clusters.\n  \n#### Cluster Hardware Discovery Tools\n- [cpuid](https://en.wikipedia.org/wiki/CPUID) - A software instruction available on Intel, AMD, and other processors that can be used to determine processor type and features.\n- [cpuid instruction note](https://www.scss.tcd.ie/~jones/CS4021/processor-identification-cpuid-instruction-note.pdf) - A detailed note on the CPUID instruction used for processor identification.\n- [cpufetch](https://github.com/Dr-Noob/cpufetch) - A simple yet fancy CPU architecture fetching tool.\n- [gpufetch](https://github.com/Dr-Noob/gpufetch) - A tool similar to cpufetch, but for fetching GPU architecture.\n- [intel cpuinfo](https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/cpuinfo.html) - Intel tool providing information about the characteristics of Intel CPUs.\n- [Likwid](https://github.com/RRZE-HPC/likwid) - Provides all information about the supercomputer/cluster.\n- [LIKWID.jl](https://juliaperf.github.io/LIKWID.jl/dev/) - Julia wrapper for LIKWID.\n- [openmpi hwloc](https://www.open-mpi.org/projects/hwloc/) - Portable Hardware Locality (hwloc) software project.\n- [PRK - Parallel Research Kernels](https://github.com/ParRes/Kernels) - A collection of kernels for parallel programming research.\n\n#### Cluster Management/Tools/Schedulers/Stacks\n- [BeeGFS](http://beegfs.io/docs/whitepapers/Introduction_to_BeeGFS_by_ThinkParQ.pdf) - A parallel file system 
designed for performance-critical environments.\n- [Bluebanquise](https://github.com/bluebanquise/bluebanquise) - An open-source cluster management tool.\n- [Bright Cluster Manager](https://www.brightcomputing.com/brightclustermanager) - Software for deploying and managing HPC and AI server clusters.\n- [Ceph](https://ceph.io/en/) - An open-source distributed storage system.\n- [DeepOps](https://github.com/NVIDIA/deepops) - Nvidia's GPU infrastructure and automation tools for Kubernetes and Slurm clusters.\n- [E4S - The Extreme Scale HPC Scientific Stack](https://e4s-project.github.io/) - A collection of open-source software packages for HPC environments.\n- [EasyBuild](https://docs.easybuild.io/en/latest/) - A package manager for HPC/supercomputers.\n- [EESSI](https://www.eessi.io) - A shared stack of scientific software installations.\n- [Flux framework](https://flux-framework.org/) - A framework for high-performance computing clusters.\n- [fpsync](http://www.fpart.org/fpsync/) - A tool for fast parallel data transfer using fpart and rsync.\n- [GPFS](https://en.wikipedia.org/wiki/GPFS) - A high-performance parallel file system developed by IBM.\n- [Guix](https://hpc.guix.info/) - A package manager for HPC/supercomputers.\n- [Intel DAOS](https://daos.io) - A software-defined scale-out object store for HPC applications.\n- [LSF](https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=lsf-batch-jobs-tasks) - A batch system for HPC and distributed computing environments.\n- [Lmod](https://lmod.readthedocs.io/en/latest/) - A Lua-based module system for software environment management on HPC systems.\n- [Lustre Parallel File System](https://www.lustre.org/) - A high-performance distributed filesystem for large-scale cluster computing.\n- [moosefs](https://moosefs.com/) - A fault-tolerant, highly available, distributed file system.\n- [NetApp](https://www.netapp.com) - Intelligent data infrastructure for various workloads.\n- [Open Cluster 
Scheduler](https://github.com/hpc-gridware/clusterscheduler/) - A scalable HPC/AI workload manager based on SGE.\n- [OpenHPC](https://openhpc.community/) - A community-led set of HPC components.\n- [OpenOnDemand](https://openondemand.org/) - A web portal for accessing supercomputing resources.\n- [OpenPBS](https://www.openpbs.org/) - A software for workload management and job scheduling.\n- [OpenXdMod](https://open.xdmod.org/7.5/index.html) - A tool for managing high-performance computing resources.\n- [RADIUSS](https://computing.llnl.gov/projects/radiuss) - Rapid Application Development via an Institutional Universal Software Stack.\n- [rocks](http://www.rocksclusters.org/) - An open-source Linux cluster distribution.\n- [Ruse](https://github.com/JanneM/Ruse) - A tool for managing software environments in HPC clusters.\n- [SGE](http://star.mit.edu/cluster/docs/0.93.3/guides/sge.html) - A resource management software for large clusters of computers.\n- [Slurm](https://slurm.schedmd.com/overview.html) - A cluster management and job scheduling system for Linux clusters.\n- [Spectrum LSF](https://www.ibm.com/products/hpc-workload-management) - Workload management platform and job scheduler for distributed high performance computing (HPC)\n- [Spack](https://spack.io/) - A package manager for HPC/supercomputers.\n- [sstack](https://gitlab.com/nmsu_hpc/sstack) - A tool to install multiple software stacks such as Spack, EasyBuild, and Conda.\n- [Starfish](https://starfishstorage.com/) - Unstructured data management and metadata solution for files and objects.\n- [Warewulf](https://warewulf.lbl.gov/) - An operating system provisioning system and cluster management tool.\n- [xCat](https://xcat.org/) - A distributed computing management and provisioning tool.\n- [XDMoD](https://supremm.xdmod.org/10.0/supremm-overview.html) - An open-source tool for managing high-performance computing resources.\n- [Globus Connect](https://www.globus.org/globus-connect) - A fast data transfer 
tool between supercomputers.\n- [Slurm Web](https://slurm-web.com/) - Open source web dashboard for Slurm HPC clusters.\n  \n#### HPC-specific Operating Systems\n- [Kitten](https://www.sandia.gov/app/uploads/sites/210/2022/11/pedretti_lanl11.pdf) - A lightweight kernel designed for high-performance computing. It focuses on providing low noise and predictable performance for HPC applications.\n- [McKernel](https://github.com/RIKEN-SysSoft/mckernel) - A hybrid kernel that combines Linux and a lightweight kernel designed to provide high performance for HPC applications.\n- [mOS](http://cs.iit.edu/~khale/docs/mos.pdf) - A specialized operating system for high-performance computing, designed to support large-scale, manycore processors.\n\n#### Development/Workflow/Monitoring Tools for HPC\n\n- [Apache Airflow](https://airflow.apache.org/) - A platform to programmatically author, schedule, and monitor workflows.\n- [Apptainer (formerly Singularity)](https://singularity.lbl.gov/) - Container platform designed for scientific and high-performance computing (HPC) environments.\n- [arbiter2](https://github.com/CHPC-UofU/arbiter2) - Monitors and protects interactive nodes with cgroups.\n- [Charliecloud](https://hpc.github.io/charliecloud/) - Lightweight container solution for high-performance computing (HPC).\n- [Docker](https://www.docker.com/) - A set of platform as a service products that use OS-level virtualization to deliver software in packages called containers.\n- [genv](https://github.com/run-ai/genv) - GPU Environment Management for managing and scheduling GPU resources.\n- [Grafana](https://github.com/grafana/grafana) - Open-source platform for monitoring and observability, visualizing metrics.\n- [grpc](https://grpc.io/) - A high-performance, open-source universal RPC framework.\n- [HPC Rocket](https://github.com/SvenMarcus/hpc-rocket) - Allows submitting Slurm jobs in Continuous Integration (CI) pipelines.\n- [HTCondor](https://research.cs.wisc.edu/htcondor/) - An 
open-source high-throughput computing software framework.\n- [Jacamar-ci](https://gitlab.com/ecp-ci/jacamar-ci/-/blob/develop/README.md) - CI/CD tool designed for HPC and scientific computing workflows.\n- [Kubernetes](https://kubernetes.io/) - An open-source system for automating deployment, scaling, and management of containerized applications.\n- [nextflow](https://www.nextflow.io/) - A workflow framework to deploy data-driven computational pipelines.\n- [perun](https://github.com/Helmholtz-AI-Energy/perun) - Energy monitor for HPC systems, focusing on performance and energy efficiency.\n- [Prefect](https://www.prefect.io/) - A workflow management system, designed for modern infrastructure and powered by the open-source Prefect Core workflow engine.\n- [Prometheus](https://prometheus.io/) - An open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.\n- [redun](https://github.com/insitro/redun) - Workflow engine that emphasizes simplicity, reliability, and scalability.\n- [remora](https://github.com/TACC/remora) - Tool for monitoring and reporting the performance of batch jobs on HPC systems.\n- [ruptime](https://github.com/alexmyczko/ruptime) - A utility for monitoring the status of computational jobs and systems.\n- [Slurmvision slurm dashboard](https://github.com/Ruunyox/slurmvision) - A dashboard for monitoring and managing Slurm jobs.\n- [slurm docker cluster](https://github.com/giovtorres/slurm-docker-cluster) - A Slurm cluster implemented using Docker containers, for development and testing.\n- [snakemake](https://snakemake.readthedocs.io/en/stable/) - A workflow management system that reduces the complexity of creating reproducible and scalable data analyses.\n- [Stui slurm dashboard for the terminal](https://github.com/mil-ad/stui) - A terminal-based UI for managing and monitoring Slurm clusters.\n- [Vaex](https://github.com/vaexio/vaex) - A Python library for lazy 
Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets.\n\n  \n#### Debugging Tools for HPC\n\n- [ddt](https://www.arm.com/products/development-tools/server-and-hpc/forge/ddt) - A powerful debugger designed for developers to solve complex problems on multi-threaded and multi-process environments in HPC.\n- [marmot MPI checker](https://www.lrz.de/services/software/parallel/marmot/) - A tool for detecting and reporting issues in MPI (Message Passing Interface) applications.\n- [python debugging tools](https://wiki.python.org/moin/PythonDebuggingTools) - A collection of tools for debugging Python applications, including pdb and other utilities.\n- [seer modern gui for gdb](https://github.com/epasveer/seer) - A graphical user interface for GDB, aiming to improve the debugging experience with modern features and visuals.\n- [Summary of C/C++ debugging tools](http://pramodkumbhar.com/2018/06/summary-of-debugging-tools/) - An overview of various debugging tools available for C/C++ applications, focusing on HPC environments.\n- [totalview](https://totalview.io/) - A comprehensive source code analysis and debugging tool designed for complex software running on HPC systems, supporting a wide range of languages and architectures.\n\n\n#### Performance/Benchmark Tools for HPC\n\n- [demonspawn](https://github.com/TACC/demonspawn) - A framework for automated execution of benchmarks and simulations, designed for HPC environments.\n- [Google benchmark](https://github.com/google/benchmark) - A microbenchmark support library for C++ that tracks performance over time.\n- [HPL benchmark](https://www.netlib.org/benchmark/hpl/) - The High Performance Linpack Benchmark for measuring floating-point computing power of systems.\n- [kerncraft](https://github.com/RRZE-HPC/kerncraft) - A tool for analytical modeling of loop performance and cache behavior on HPC systems.\n- [NASA parallel benchmark suite](https://www.nas.nasa.gov/software/npb.html) - A set of 
benchmarks designed to evaluate the performance of parallel supercomputers.\n- [papi](https://icl.utk.edu/papi/) - Provides standard APIs for accessing hardware performance counters available on modern microprocessors.\n- [scalasca](https://www.scalasca.org/) - A software tool that supports performance analysis of large-scale parallel applications.\n- [scalene](https://github.com/plasma-umass/scalene) - A high-performance, high-precision CPU, GPU, and memory profiler for Python.\n- [Summary of code performance analysis tools](https://doku.lrz.de/display/PUBLIC/Performance+and+Code+Analysis+Tools+for+HPC) - An overview of tools for analyzing HPC application performance.\n- [Summary of profiling tools](https://pramodkumbhar.com/2017/04/summary-of-profiling-tools/) - A comprehensive list of profiling tools for performance analysis in HPC.\n- [tau](https://www.cs.uoregon.edu/research/tau/home.php) - TAU (Tuning and Analysis Utilities) is a profiling and tracing toolkit for performance analysis of parallel programs.\n- [The Bandwidth Benchmark](https://github.com/RRZE-HPC/TheBandwidthBenchmark/) - A tool for measuring memory bandwidth across various CPUs and systems.\n- [vampir](https://vampir.eu/) - A tool for detailed analysis of MPI program executions by visualizing their event traces.\n- [bytehound memory profiler](https://github.com/koute/bytehound) - A detailed memory profiler for tracking down memory issues and leaks.\n- [Flamegraphs](https://www.brendangregg.com/flamegraphs.html) - Visualization tool for profiling software, allowing quick identification of performance bottlenecks.\n- [fio](https://linux.die.net/man/1/fio) - Flexible I/O tester for benchmarking and stress/hardware verification.\n- [IBM Spectrum Scale Key Performance Indicators (KPI)](https://github.com/IBM/SpectrumScale_NETWORK_READINESS) - Provides key performance indicators for IBM Spectrum Scale, aiding in performance tuning and monitoring.\n- [Ior](https://github.com/hpc/ior) - A parallel 
file system I/O benchmarking tool used widely in HPC for testing storage systems.\n- [stress-ng](https://github.com/ColinIanKing/stress-ng) - A versatile tool for stressing various subsystems of a computer to find hardware faults or to benchmark performance.\n- [Hotspot](https://github.com/KDAB/hotspot/) - The Linux perf GUI for in-depth performance analysis and visualization of software behavior.\n- [mixbench](https://github.com/ekondis/mixbench) - A benchmark suite designed to evaluate CPUs and GPUs across different compute and memory operations.\n- [pmu-tools (toplev)](https://github.com/andikleen/pmu-tools) - Performance monitoring tools for modern Intel CPUs, offering detailed insights into hardware and application performance.\n- [SPEC CPU Benchmark](https://www.spec.org/benchmarks.html) - A benchmark suite designed to provide a comparative measure of compute-intensive performance across the widest practical range of hardware.\n- [STREAM Memory Bandwidth Benchmark](https://www.cs.virginia.edu/stream/) - Measures sustainable memory bandwidth and the corresponding computation rate for simple vector kernels.\n- [Intel MPI benchmarks](https://www.intel.com/content/www/us/en/docs/mpi-library/user-guide-benchmarks/2021-2/overview.html) - A set of benchmarks designed to measure the performance and scalability of MPI implementations on Intel architectures.\n- [Ohio state MPI benchmarks](https://mvapich.cse.ohio-state.edu/benchmarks/) - A comprehensive suite of benchmarks for evaluating MPI performance across a variety of message passing patterns and communication protocols.\n- [hpctoolkit](http://hpctoolkit.org/man/hpctoolkit.html) - An integrated suite of tools for measurement and analysis of program performance on computers ranging from desktops to supercomputers.\n- [core-to-core-latency](https://github.com/nviennot/core-to-core-latency) - A diagnostic tool designed to measure and report the latency between CPU cores, aiding in the optimization of parallel 
computing tasks.\n- [speedscope](https://github.com/jlfwong/speedscope) - An interactive, web-based viewer for performance profiles of software. It supports various formats and provides a flamegraph visualization to identify hot paths efficiently.\n- [Differential Flamegraphs](https://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html) - A visualization technique developed by Brendan Gregg that highlights differences between performance profiles, making it easier to spot performance regressions or improvements.\n- [Hyperfine](https://github.com/sharkdp/hyperfine) - A command-line benchmarking tool that provides a simple and user-friendly means to compare the performance of commands, featuring statistical analysis across multiple runs.\n- [Openfoam HPC benchmark](https://develop.openfoam.com/committees/hpc/-/wikis/home) - A benchmarking suite for evaluating the High Performance Computing capabilities of OpenFOAM, an open-source CFD software, under various computational loads.\n- [OSU microbenchmarks](https://mvapich.cse.ohio-state.edu/benchmarks/) - A collection of microbenchmarks designed to evaluate the performance of MPI implementations across various communication protocols and message sizes.\n- [fio flexible I/O tester](https://fio.readthedocs.io/) - A versatile tool for I/O workload simulation and benchmarking, capable of testing a wide array of storage and filesystem configurations.\n- [vftrace](https://github.com/SX-Aurora/Vftrace) - A tracing tool specifically designed for the NEC SX-Aurora TSUBASA Vector Engine, enabling detailed performance analysis of vectorized code.\n- [tinymembench](https://github.com/ssvb/tinymembench) - A simple memory benchmark tool, focusing on benchmarking memory bandwidth and latency with minimal dependencies, suitable for various platforms.\n- [Geekbench](https://www.geekbench.com/) - Cross platform benchmarking tool\n- [Empirical Roofline Tool 
(ERT)](https://crd.lbl.gov/divisions/amcr/computer-science-amcr/par/research/roofline/software/ert/) - Create empirical roofline plots, alternative to intel vtune for any machine\n- [Roofline Visualizer for ERT](https://crd.lbl.gov/divisions/amcr/computer-science-amcr/par/research/roofline/software/roofline-visualizer/) - Visualizer for ERT\n- [Caliper](https://github.com/LLNL/Caliper) - A Performance Analysis Toolbox in a Library\n- [KDiskMark](https://github.com/JonMagon/KDiskMark) - Benchmarking Tool For SSD/HDD Drives\n- [OpenBenchmarking](https://openbenchmarking.org/) - Open benchmarks on a variety of algorithms and hardware\n- [Phoronix Test Suite](https://github.com/phoronix-test-suite/phoronix-test-suite) - Benchmarking suite for Linux\n\n#### IO/Visualization Tools for HPC\n- [ADIOS2](https://github.com/ornladios/ADIOS2) - The Adaptable IO System version 2, designed for flexible and efficient I/O for scientific data, supporting a wide range of HPC simulations.\n- [Amira](https://www.thermofisher.com/ca/en/home/electron-microscopy/products/software-em-3d-vis/amira-software.html) - A powerful, multifaceted 3D software platform for visualizing, manipulating, and understanding Life Science and bio-medical data coming from all types of sources.\n- [hdf5](https://www.hdfgroup.org/solutions/hdf5/) - The Hierarchical Data Format version 5 (HDF5), is an open source file format that supports large, complex, heterogeneous data.\n- [paraview](https://www.paraview.org/) - An open-source, multi-platform data analysis and visualization application.\n- [Scientific Visualization Wiki](https://en.wikipedia.org/wiki/Scientific_visualization) - A comprehensive guide to the field of scientific visualization, detailing techniques, tools, and applications.\n- [the yt project](https://yt-project.org/) - An open-source, Python-based package for analyzing and visualizing volumetric data.\n- [vedo](https://vedo.embl.es/) - A lightweight and powerful python module for scientific 
analysis and visualization of 3D objects and point clouds based on VTK.\n- [visit](https://wci.llnl.gov/simulation/computer-codes/visit) - An Open Source, interactive, scalable, visualization, animation and analysis tool.\n\n#### General Purpose Scientific Computing Libraries for HPC\n - [petsc](https://petsc.org/release/)\n - [ginkgo](https://ginkgo-project.github.io/)\n - [GSL](https://www.gnu.org/software/gsl/)\n - [Scalapack](https://netlib.org/scalapack/)\n - [rapids.ai - collection of libraries for executing end-to-end data science pipelines entirely on the GPU](https://rapids.ai)\n - [trilinos](https://trilinos.github.io/)\n - [tnl project](https://tnl-project.org/)\n \n#### Misc.\n - [mimalloc memory allocator](https://github.com/microsoft/mimalloc)\n - [jemalloc memory allocator](https://github.com/jemalloc/jemalloc)\n - [tcmalloc memory allocator](https://github.com/google/tcmalloc)\n - [Hoard memory allocator](https://github.com/emeryberger/Hoard)\n - [Software utilization at UK National Supercomputing Service, ARCHER2](https://www.archer2.ac.uk/support-access/status.html#software-usage-data)\n  \n#### Wikis\n- [Comparison of cluster software](https://en.wikipedia.org/wiki/Comparison_of_cluster_software)\n- [List of cluster management software](https://en.wikipedia.org/wiki/List_of_cluster_management_software)\n\n## Hardware\n\n### Interconnects/Topology\n\n- [Ethernet](https://en.wikipedia.org/wiki/Ethernet)\n- [Infiniband](https://en.wikipedia.org/wiki/InfiniBand)\n- [Network topologies](https://www.hpcwire.com/2019/07/15/super-connecting-the-supercomputers-innovations-through-network-topologies/)\n- [Battle of the infinibands - Omnipath vs Infiniband](https://www.nextplatform.com/2017/11/29/the-battle-of-the-infinibands/)\n- [Mellanox infiniband cluster config](https://www.mellanox.com/clusterconfig/)\n- [RoCE - RDMA Over Converged Ethernet](https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet)\n- [Slingshot 
interconnect](https://www.hpe.com/ca/en/compute/hpc/slingshot-interconnect.html)\n- [CXL - Compute Express Link](https://www.computeexpresslink.org/)\n- [Infiniband Essentials](https://academy.nvidia.com/en/course/infiniband-essentials/?cm=244)\n- [NVlink](https://en.wikipedia.org/wiki/NVLink)\n- [List of LAN-based interconnect bit rates](https://en.wikipedia.org/wiki/List_of_interface_bit_rates)\n- [List of internet-based interconnect bit rates](https://en.wikipedia.org/wiki/Bandwidth_(computing)#Internet_connection_bandwidths)\n  \n### CPU\n- [Wikichip](https://en.wikichip.org/wiki/WikiChip)\n- [Microarchitecture of Intel/AMD CPUs](https://www.agner.org/optimize/microarchitecture.pdf)\n- [Apple M1](https://en.wikipedia.org/wiki/Apple_M1)\n- [Apple M2](https://en.wikipedia.org/wiki/Apple_M2)\n- [Apple M2 Teardown](https://www.ifixit.com/News/62674/m2-macbook-air-teardown-apple-forgot-the-heatsink)\n- [Apple M1/M2 AMX](https://github.com/corsix/amx)\n- [Apple M3](https://en.wikipedia.org/wiki/Apple_M3)\n- [List of Intel processors](https://en.wikipedia.org/wiki/List_of_Intel_processors)\n- [List of Intel micro architectures](https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures)\n- [Comparison of Intel processors](https://en.wikipedia.org/wiki/Comparison_of_Intel_processors)\n- [Comparison of Apple processors](https://en.wikipedia.org/wiki/Apple-designed_processors)\n- [List of AMD processors](https://en.wikipedia.org/wiki/List_of_AMD_processors)\n- [List of AMD CPU micro architectures](https://en.wikipedia.org/wiki/List_of_AMD_CPU_microarchitectures)\n- [Comparison of AMD architectures](https://en.wikipedia.org/wiki/Table_of_AMD_processors)\n\n### GPU\n\n- [Gpu Architecture Analysis](https://graphicscodex.courses.nvidia.com/app.html?page=_rn_parallel)\n- [A trip through the Graphics Pipeline](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)\n- [A100 
Whitepaper](https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf)\n- [MIG](https://www.nvidia.com/en-us/technologies/multi-instance-gpu/)\n- [Gentle Intro to GPU Inner Workings](https://vksegfault.github.io/posts/gentle-intro-gpu-inner-workings/)\n- [AMD Instinct GPUs](https://en.wikipedia.org/wiki/AMD_Instinct_accelerators)\n- [AMD GPU ROCm Support and OS Compatibility](https://rocm.docs.amd.com/en/latest/release/gpu_os_support.html)\n- [List of AMD GPUs](https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units)\n- [Comparison of CUDA architectures](https://en.wikipedia.org/wiki/CUDA)\n- [Tales of the M1 GPU](https://asahilinux.org/2022/11/tales-of-the-m1-gpu/)\n- [List of Intel GPUs](https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units)\n- [Performance of DGX Cluster](https://www.computer.org/csdl/proceedings-article/cloudcom/2022/636700a170/1JNqFu7QdTG)\n\n### TPU/Tensor Cores\n\n- [Google TPU](https://thechipletter.substack.com/p/googles-first-tpu-architecture)\n- [TPU Wiki](https://en.wikipedia.org/wiki/Tensor_Processing_Unit)\n- [NVIDIA Tensor Cores](https://www.nvidia.com/en-us/data-center/tensor-cores/)\n\n### Many integrated core processor (MIC)\n\n- [Xeon Phi](https://en.wikipedia.org/wiki/Xeon_Phi)\n\n### Cloud\n\n- [Awesome Cloud HPC](https://github.com/kjrstory/awesome-cloud-hpc)\n\n#### Vendors\n\n- [Official NVIDIA Vendors](https://marketplace.nvidia.com/en-us/enterprise/cloud-solutions/?limit=15)\n- [AWS HPC](https://aws.amazon.com/hpc/)\n- [Azure HPC](https://azure.microsoft.com/en-us/solutions/high-performance-computing/#intro)\n- [rescale](https://rescale.com/)\n- [vast.ai](https://vast.ai/)\n- [vultr - cheap bare metal CPU, GPU, DGX servers](https://www.vultr.com)\n- [hetzner - cheap servers incl. 
80-core ARM](https://www.hetzner.com/)\n- [Ampere ARM cloud-native processors](https://amperecomputing.com/)\n- [Scaleway](https://www.scaleway.com/en/)\n- [Chameleon Cloud](https://www.chameleoncloud.org/)\n- [Lambda Labs](https://lambdalabs.com/)\n- [Runpod](https://www.runpod.io/)\n\n#### Articles/Papers\n- [The use of Microsoft Azure for high performance cloud computing – A case study](https://www.diva-portal.org/smash/get/diva2:1704798/FULLTEXT01.pdf)\n- [AWS Cluster in the cloud](https://cluster-in-the-cloud.readthedocs.io/en/latest/aws-infrastructure.html)\n- [AWS Parallel Cluster](https://docs.aws.amazon.com/parallelcluster/latest/ug/tutorials-running-your-first-job-on-version-3.html)\n- [AWS HPC Workshop](https://www.hpcworkshops.com/)\n- [An Empirical Study of Containerized MPI and GUI Application on HPC in the Cloud](https://ieeexplore.ieee.org/abstract/document/10046607)\n\n### Custom/FPGA/ASIC/APU\n\n- [OpenPiton](http://parallel.princeton.edu/openpiton/)\n- [Parallella](https://www.parallella.org/)\n- [AMD APU](https://en.wikipedia.org/wiki/AMD_Accelerated_Processing_Unit)\n\n### Certification\n\n- [Intel Cluster Ready](https://en.wikipedia.org/wiki/Intel_Cluster_Ready)\n\n### Student Opportunities / Workshops\n\n- [Supercomputing Conference Student Opportunities](https://sc21.supercomputing.org/program/studentssc/)\n- [SCC Student cluster competition](https://www.studentclustercompetition.us/)\n- [Winter Classic Invitational](https://www.winterclassicinvitational.com/)\n- [Linux Clusters Institute](https://linuxclustersinstitute.org/)\n\n### Other/Wikis\n\n- [Supercomputer](https://en.wikipedia.org/wiki/Supercomputer)\n- [Supercomputer architecture](https://en.wikipedia.org/wiki/Supercomputer_architecture)\n- [Beowulf cluster](https://en.wikipedia.org/wiki/Beowulf_cluster)\n- [Computer cluster](https://en.wikipedia.org/wiki/Computer_cluster)\n- [Comparison of Intel processors](https://en.wikipedia.org/wiki/Comparison_of_Intel_processors)\n- [Comparison 
of Apple processors](https://en.wikipedia.org/wiki/Apple-designed_processors)\n- [Comparison of AMD architectures](https://en.wikipedia.org/wiki/Table_of_AMD_processors)\n- [Comparison of CUDA architectures](https://en.wikipedia.org/wiki/CUDA)\n- [Cache](https://en.wikipedia.org/wiki/Cache_(computing))\n- [Google TPU](https://en.wikipedia.org/wiki/Tensor_Processing_Unit)\n- [IPMI](https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface)\n- [FRU](https://en.wikipedia.org/wiki/Field-replaceable_unit)\n- [Disk Arrays](https://en.wikipedia.org/wiki/Disk_array)\n- [RAID](https://en.wikipedia.org/wiki/RAID)\n- [Cray](https://en.wikipedia.org/wiki/Cray)\n- [Digital Signal Processors](https://en.wikipedia.org/wiki/Digital_signal_processor)\n- [Vector Processor](https://en.wikipedia.org/wiki/Vector_processor)\n  \n## People\n\n - [Jack Dongarra - 2021 Turing Award - LINPACK, BLAS, LAPACK, MPI](https://www.nature.com/articles/s43588-022-00245-w)\n - [Bill Gropp - 2010 IEEE TCSC Medal for Excellence in Scalable Computing](https://en.wikipedia.org/wiki/Bill_Gropp)\n - [David Bader - built the first Linux supercomputer](https://en.wikipedia.org/wiki/David_Bader_(computer_scientist))\n - [Thomas Sterling - \"Father of Beowulf clusters\", ParalleX/HPX](https://en.wikipedia.org/wiki/Thomas_Sterling_(computing))\n - [Seymour Cray - Inventor of the Cray Supercomputer](https://en.wikipedia.org/wiki/Seymour_Cray)\n - [Larry Smarr - HPC Application Pioneer](https://en.wikipedia.org/wiki/Larry_Smarr)\n - [Donald Becker - Beowulf cluster software, Gordon Bell Prize Winner](https://en.wikipedia.org/wiki/Donald_Becker)\n  \n## Resources\n\n#### Books/Manuals\n- [Free Modern HPC Books by Victor Eijkhout](https://theartofhpc.com/)\n- [High Performance Parallel Runtimes](https://www.amazon.com/High-Performance-Parallel-Runtimes-Implementation-ebook/dp/B08WH82KF9/ref=sr_1_1?keywords=High+Performance+Parallel+Runtimes\u0026qid=1689287759\u0026sr=8-1)\n- [The OpenMP Common Core: 
Making OpenMP Simple Again](https://www.amazon.com/OpenMP-Common-Core-Engineering-Computation/dp/0262538865/ref=d_pd_sbs_sccl_2_1/130-5660046-7109016?pd_rd_w=Cqnxw\u0026content-id=amzn1.sym.3676f086-9496-4fd7-8490-77cf7f43f846\u0026pf_rd_p=3676f086-9496-4fd7-8490-77cf7f43f846\u0026pf_rd_r=HG04QQS87WDHAGV578EE\u0026pd_rd_wg=u0csS\u0026pd_rd_r=8a6a0024-5dec-4934-8fa5-99e24d9fc4bd\u0026pd_rd_i=0262538865\u0026psc=1)\n- [Parallel and High Performance Computing](https://www.manning.com/books/parallel-and-high-performance-computing)\n- [Algorithms for Modern Hardware](https://en.algorithmica.org/hpc/)\n- [High Performance Computing: Modern Systems and Practices](https://www.amazon.ca/High-Performance-Computing-Systems-Practices/dp/012420158X) - Thomas Sterling, Maciej Brodowicz, Matthew Anderson 2017\n- [Introduction to High Performance Computing for Scientists and Engineers](https://www.amazon.ca/Introduction-Performance-Computing-Scientists-Engineers/dp/143981192X/ref=sr_1_1?crid=1L276HPEB8K7I\u0026keywords=Introduction+to+High+Performance+Computing+for+Scientists+and+Engineers\u0026qid=1645137608\u0026s=books\u0026sprefix=introduction+to+high+performance+computing+for+scientists+and+engineers%2Cstripbooks%2C46\u0026sr=1-1) - Hager 2010\n- [Computer Organization and Design](https://www.amazon.ca/Computer-Organization-Design-RISC-V-Interface/dp/0128203315/ref=sr_1_1?crid=1XLX1HWLGRVO6\u0026keywords=Computer+Organization+and+Design\u0026qid=1645137443\u0026s=books\u0026sprefix=computer+organization+and+design%2Cstripbooks%2C48\u0026sr=1-1)\n- [Optimizing HPC Applications with Intel Cluster Tools: Hunting Petaflops](C+Applications+with+Intel+Cluster+Tools\u0026qid=1645137507\u0026s=books\u0026sprefix=optimizing+hpc+applications+with+intel+cluster+tools%2Cstripbooks%2C80\u0026sr=1-1)\n- [Introduction to High Performance Scientific Computing](https://web.corral.tacc.utexas.edu/CompEdu/pdf/stc/EijkhoutIntroToHPC.pdf) - Victor Eijkhout 2021\n- [Parallel Programming for 
Science and Engineering](https://web.corral.tacc.utexas.edu/CompEdu/pdf/pcse/EijkhoutParallelProgramming.pdf) - Victor Eijkhout 2021\n- [Parallel Programming for Science and Engineering - HTML Version](https://pages.tacc.utexas.edu/~eijkhout/pcse/html/)\n- [C++ High Performance](https://www.amazon.ca/High-Performance-Master-optimizing-functioning/dp/1839216549/ref=sr_1_1?crid=31OVX4VQ6Z84X\u0026keywords=C%2B%2B+high+performance\u0026qid=1640671313\u0026sprefix=c%2B%2B+high+performance%2Caps%2C99\u0026sr=8-1)\n- [Data Parallel C++ Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL](https://www.apress.com/gp/book/9781484255735)\n- [High Performance Python](https://www.amazon.ca/High-Performance-Python-Performant-Programming/dp/1449361595)\n- [C++ Concurrency in Action: Practical Multithreading](https://www.manning.com/books/c-plus-plus-concurrency-in-action) - Anthony Williams 2012\n- [The Art of Multiprocessor Programming](https://www.amazon.com/Art-Multiprocessor-Programming-Revised-Reprint/dp/0123973376/ref=sr_1_1?ie=UTF8\u0026qid=1438003865\u0026sr=8-1\u0026keywords=maurice+herlihy) - Maurice Herlihy 2012\n- [Parallel Computing: Theory and Practice](http://www.cs.cmu.edu/afs/cs/academic/class/15210-f15/www/tapp.html#ch:work-stealing) - Umut A. Acar 2016\n- [Introduction to Parallel Computing](https://www.amazon.ca/Introduction-Parallel-Computing-Zbigniew-Czech/dp/1107174392/ref=sr_1_7?dchild=1\u0026keywords=parallel+computing\u0026qid=1625711415\u0026sr=8-7) - Zbigniew J. 
Czech\n- [Practical guide to bare metal C++](https://arobenko.github.io/bare_metal_cpp/)\n- [Optimizing software in C++](https://www.agner.org/optimize/optimizing_cpp.pdf)\n- [Optimizing subroutines in assembly code](https://www.agner.org/optimize/optimizing_assembly.pdf)\n- [Microarchitecture of Intel/AMD CPUs](https://www.agner.org/optimize/microarchitecture.pdf)\n- [Parallel Programming with MPI](https://www.cs.usfca.edu/~peter/ppmpi/)\n- [HPC, Big Data, AI Convergence Towards Exascale: Challenge and Vision](https://www.taylorfrancis.com/books/edit/10.1201/9781003176664/hpc-big-data-ai-convergence-towards-exascale-olivier-terzo-jan-martinovi%C4%8D?refId=2cd8b0ad-d63d-42fa-9c3e-fe47fbbe0e29\u0026context=ubx)\n- [Introduction to parallel computing](https://www.amazon.com/Introduction-Parallel-Computing-Ananth-Grama/dp/0201648652/ref=sr_1_1?crid=LE1VD245VDX5\u0026keywords=Ananth+Grama+-+Introduction+to+parallel+computing\u0026qid=1644907263\u0026sprefix=ananth+grama+-+introduction+to+parallel+computing%2Caps%2C43\u0026sr=8-1) - Ananth Grama\n- [The Student Supercomputer Challenge Guide](https://www.amazon.ca/Student-Supercomputer-Challenge-Guide-Supercomputing/dp/9811338310/ref=sr_1_1?crid=2J5374I76RP2Y\u0026keywords=The+student+supercomputer+challenge\u0026qid=1657060946\u0026sprefix=the+student+supercomputer+challenge%2Caps%2C53\u0026sr=8-1)\n- [The Rust Performance Book](https://nnethercote.github.io/perf-book/introduction.html)\n- [E-Zines on Bash, Linux, Perf, etc - Julia Evans](https://wizardzines.com/)\n- [The Art of Writing Efficient Programs: An Advanced Programmer's Guide to Efficient Hardware Utilization and Compiler Optimizations Using C++ Examples](https://www.amazon.ca/Art-Writing-Efficient-Programs-optimizations/dp/1800208111)\n- [OpenMP Examples - openmp.org](https://www.openmp.org/wp-content/uploads/openmp-examples-4.5.0.pdf)\n- [Latest books on OpenMP - openmp.org](https://www.openmp.org/resources/openmp-books/)\n- [Programming Massively Parallel 
Processors 4th Edition 2023](https://www.amazon.ca/Programming-Massively-Parallel-Processors-Hands/dp/0128119861/ref=sr_1_1?crid=18EW0LVO2VFMC\u0026keywords=Programming+Massively+Parallel+Processors+4th+Edition+2023\u0026qid=1695110729\u0026s=books\u0026sprefix=programming+massively+parallel+processors+4th+edition+2023%2Cstripbooks%2C88\u0026sr=1-1)\n- [Software Optimization Cookbook](https://www.amazon.ca/Software-Optimization-Cookbook-Performance-Platforms/dp/0976483211)\n- [Power and Performance_ Software Analysis and Optimization](https://www.amazon.ca/Power-Performance-Software-Analysis-Optimization-ebook/dp/B00WZ1AX6S/ref=sr_1_1?crid=22HMPRFCYAXC0\u0026keywords=Power+and+Performance_+Software+Analysis+and+Optimization\u0026qid=1695111518\u0026s=books\u0026sprefix=power+and+performance_+software+analysis+and+optimization%2Cstripbooks%2C85\u0026sr=1-1)\n- [Gropp books on MPI](https://wgropp.cs.illinois.edu/usingmpiweb/)\n- [Performance Analysis and Tuning on Modern CPUs](https://book.easyperf.net/perf_book)\n- [High Performance Computing in Biomimetics Modeling, Architecture and Applications](https://link.springer.com/book/10.1007/978-981-97-1017-1)\n- [Systems Performance - Brendan Gregg](https://www.amazon.com/Systems-Performance-Brendan-Gregg/dp/0136820158)\n- [Is Parallel Programming Hard, And, If So, What Can You Do About It? - Paul E. 
McKenney](https://cdn.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html)\n- [The Little Book of Semaphores](https://greenteapress.com/wp/semaphores/)\n    \n#### Courses\n- [HPC Carpentry](https://www.hpc-carpentry.org/)\n- [Berkeley: Applications of Parallel Computers](https://sites.google.com/lbl.gov/cs267-spr2019/) - Detailed course on HPC\n- [CS6290 High-performance Computer Architecture](https://www.udacity.com/course/high-performance-computer-architecture--ud007) - Milos Prvulovic and Catherine Gamboa at Georgia Tech\n- [Udacity High Performance Computing](https://www.youtube.com/playlist?list=PLAwxTw4SYaPk8NaXIiFQXWK6VPnrtMRXC)\n- [Parallel Numerical Algorithms](https://solomonik.cs.illinois.edu/teaching/cs554/index.html)\n- [Vanderbilt - Intro to HPC](https://github.com/vanderbiltscl/SC3260_HPC)\n- [Illinois - Intro to HPC](https://andreask.cs.illinois.edu/Teaching/HPCFall2012/) - Creator of PyCuda\n- [Archer1 Courses](http://www.archer.ac.uk/training/past_courses.php)\n- [TACC tutorials](https://portal.tacc.utexas.edu/tutorials)\n- [Livermore training materials](https://hpc.llnl.gov/training/tutorials)\n- [Xsede training materials](https://www.hpc-training.org/xsede/moodle/)\n- [Parallel Computation Math](https://www.cct.lsu.edu/~pdiehl/teaching/2021/4997/)\n- [Introduction to High-Performance and Parallel Computing - Coursera](https://www.coursera.org/learn/introduction-high-performance-computing)\n- [Foundations of HPC 2020/2021](https://github.com/Foundations-of-HPC)\n- [Principles of Distributed Computing](https://disco.ethz.ch/courses/podc_allstars/)\n- [High Performance Visualization](https://www.uni-bremen.de/ag-high-performance-visualization)\n- [Temple course on building/maintaining a cluster](https://www.hpc.temple.edu/mhpc/2021/hpc-technology/index.html)\n- [Nvidia Deep Learning Course](https://www.nvidia.com/en-us/training/online/)\n- [Coursera GPU Programming 
Specialization](https://www.coursera.org/specializations/gpu-programming)\n- [Coursera Fundamentals of Parallelism on Intel Architecture](https://www.coursera.org/learn/parallelism-ia)\n- [Coursera Introduction to High Performance Computing](https://www.coursera.org/learn/introduction-high-performance-computing)\n- [Archer2 Shared Memory Programming with OpenMP](https://www.archer2.ac.uk/training/courses/210000-openmp-self-service/)\n- [Archer2 Message-Passing Programming with MPI](https://www.archer2.ac.uk/training/courses/210000-mpi-self-service/)\n- [HetSys 2022 Course](https://www.youtube.com/playlist?list=PL5Q2soXY2Zi9XrgXR38IM_FTjmY6h7Gzm)\n- [Edukamu Introduction to Supercomputing](https://edukamu.fi/elements-of-supercomputing)\n- [Heterogeneous Parallel Programming by S K](https://www.youtube.com/channel/UCbD5dhBi6DBSvCTgEDFz7uA/videos)\n- [NCSA HPC Training Moodle](https://www.hpc-training.org/xsede/moodle/)\n- [Supercomputing in plain english](http://www.oscer.ou.edu/education.php)\n- [Cornell workshop](https://cvw.cac.cornell.edu/topics)\n- [Carpentries Incubator HPC Intro](https://carpentries-incubator.github.io/hpc-intro/)\n- [UL HPC School](https://ulhpc-tutorials.readthedocs.io/en/latest/hpc-school/)\n- [Introduction to High-Performance Parallel Distributed Computing using Chapel, UPC++ and Coarray Fortran](https://bitbucket.org/berkeleylab/upcxx/wiki/events/CUF23)\n- [Performance Engineering of Software Systems (MIT-OCW)](https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/video_galleries/lecture-videos/)\n- [Introduction to Parallel Computing (CMSC 498X/818X)](https://www.cs.umd.edu/class/fall2020/cmsc498x/lectures.shtml)\n- [Infiniband Essentials](https://academy.nvidia.com/en/course/infiniband-essentials/?cm=244)\n- [Performance Ninja Optimization Course](https://github.com/dendibakh/perf-ninja)\n- [HPC Administration Virtual Residency 2024](https://www.youtube.com/@VirtualResidency2024/videos)\n- [Programming 
Parallel Computers](https://ppc-exercises.cs.aalto.fi/courses)\n- [High Performance Machine Learning - Columbia University](https://www.cs.columbia.edu/~aa4870/high-performance-machine-learning/)\n        \n#### Tutorials/Guides/Articles\n##### General\n- [MpiTutorial](https://mpitutorial.com) - A fantastic MPI tutorial\n- [Beginners Guide to HPC](http://www.shodor.org/petascale/materials/UPModules/beginnersGuideHPC/)\n- [Rookie HPC Guide](https://rookiehpc.github.io/index.html)\n- [RedHat High Performance Computing 101](https://www.redhat.com/en/blog/high-performance-computing-101)\n- [Parallel Computing Training Tutorials](https://hpc.llnl.gov/training/tutorials) - Lawrence Livermore National Laboratory\n- [Foundations of Multithreaded, Parallel, and Distributed Programming](https://www.amazon.com/Foundations-Multithreaded-Parallel-Distributed-Programming/dp/B00F4I7HM2/ref=sr_1_2?dchild=1\u0026keywords=Gregory+R.+Andrews+Distributed+Programming\u0026qid=1625766665\u0026s=books\u0026sr=1-2)\n- [Building pipelines using slurm dependencies](https://hpc.nih.gov/docs/job_dependencies.html)\n- [Writing Slurm scripts in Python, R, and Bash](https://vsoch.github.io/lessons/sherlock-jobs/)\n- [Xsede new user tutorials](https://portal.xsede.org/online-training)\n- [Supercomputing in plain english](http://www.oscer.ou.edu/education.php)\n- [Improving Performance with SIMD intrinsics](https://stackoverflow.blog/2020/07/08/improving-performance-with-simd-intrinsics-in-three-use-cases/)\n- [Want speed? 
Pass by value](https://web.archive.org/web/20140205194657/http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/)\n- [Introduction to low level bit hacks](https://catonmat.net/low-level-bit-hacks)\n- [How to write fast numerical code: An Introduction](https://users.ece.cmu.edu/~franzf/papers/gttse07.pdf)\n- [Lecture notes on Loop optimizations](https://www.cs.cmu.edu/~fp/courses/15411-f13/lectures/17-loopopt.pdf)\n- [A practical approach to code optimization](https://www.einfochips.com/wp-content/uploads/resources/a-practical-approach-to-optimize-code-implementation.pdf)\n- [Software optimization manuals](https://www.agner.org/optimize/)\n- [Guide into OpenMP: Easy multithreading programming for C++](https://bisqwit.iki.fi/story/howto/openmp/)\n- [An Introduction to the Partitioned Global Address Space (PGAS) Programming Model](https://cnx.org/contents/gtg1AzdI@7/An-Introduction-to-the-Partitioned-Global-Address-Space-PGAS-Programming-Model)\n- [Jax in 2022](https://www.assemblyai.com/blog/why-you-should-or-shouldnt-be-using-jax-in-2022/)\n- [C++ Benchmarking for beginners](https://unum.cloud/post/2022-03-04-gbench/)\n- [Mapping MPI ranks to multiple cuda GPU](https://github.com/olcf-tutorials/local_mpi_to_gpu)\n- [Oak Ridge National Lab Tutorials](https://github.com/olcf-tutorials)\n- [How to perform large scale data processing in bioinformatics](https://medium.com/dnanexus/how-to-perform-large-scale-data-processing-in-bioinformatics-4006e8088af2)\n- [Step by step SGEMM in OpenCL](https://cnugteren.github.io/tutorial/pages/page1.html)\n- [Frontier User Guide](https://docs.olcf.ornl.gov/systems/frontier_user_guide.html)\n- [Allocating large blocks of memory in bare-metal C programming](https://lemire.me/blog/2020/01/17/allocating-large-blocks-of-memory-bare-metal-c-speeds/)\n- [Hashmap benchmarks 2022](https://martin.ankerl.com/2022/08/27/hashmap-bench-01/)\n- [LLNL HPC Tutorials](https://hpc.llnl.gov/documentation/tutorials)\n- [High Performance Computing: 
A Bird's Eye View](https://umashankar.blog/high-performance-computing-a-birds-eye-view/)\n- [The dirty secret of high performance computing](https://www.techradar.com/news/the-dirty-secret-of-high-performance-computing)\n- [Multiple GPUs with pytorch](https://www.run.ai/guides/multi-gpu/pytorch-multi-gpu-4-techniques-explained)\n- [Brendan Gregg on Linux Performance](https://www.brendangregg.com/linuxperf.html)\n- [Automatic Slurm build scripts](https://www.ni-sp.com/slurm-build-script-and-container-commercial-support/#h-automatic-slurm-build-script-for-rh-centos-7-8-and-9)\n- [Fastest unordered_map implementation / benchmarks](https://martin.ankerl.com/2022/08/27/hashmap-bench-01/)\n- [Memory bandwidth NapkinMath](https://www.forrestthewoods.com/blog/memory-bandwidth-napkin-math/)\n- [Avoiding Instruction Cache Misses](https://paweldziepak.dev/2019/06/21/avoiding-icache-misses/)\n- [Multi-GPU Programming with Standard Parallel C++](https://developer.nvidia.com/blog/multi-gpu-programming-with-standard-parallel-c-part-1/)\n- [EuroCC National Competence Center Sweden (ENCCS) HPC tutorials](https://enccs.se/lessons/)\n- [LLNL hpc tutorials](https://hpc-tutorials.llnl.gov/)\n- [python.org Python Performance Tips](https://wiki.python.org/moin/PythonSpeed/PerformanceTips)\n- [HPC toolset tutorial (cluster management)](https://github.com/ubccr/hpc-toolset-tutorial)\n- [OpenMP tutorials](https://www.openmp.org/resources/tutorials-articles/)\n- [CUDA best practices guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html)\n- [Understanding CPU Architecture And Performance Using LIKWID](https://pramodkumbhar.com/2020/03/architectural-optimisations-using-likwid-profiler/)\n- [32 OpenMP Traps For C++ Developers](https://pvs-studio.com/en/blog/posts/cpp/a0054/#ID0EWEAC)\n- [Best practices for running jobs on a HPC cluster](https://hpc.dccn.nl/docs/cluster_howto/best_practices.html)\n- [Glossary of HPC related terms](https://www.gigabyte.com/Glossary?lan=en)\n- 
[Setting the record straight: What is HPC?](https://www.gigabyte.com/Article/setting-the-record-straight-what-is-hpc-a-tech-guide-by-gigabyte?lan=en)\n- [Atomic operations and contention](https://fgiesen.wordpress.com/2014/08/18/atomics-and-contention/)\n- [A concurrency cost hierarchy](https://travisdowns.github.io/blog/2020/07/06/concurrency-costs.html)\n  \n##### Machine Learning Related\n- [Best practices for machine learning with HPC](https://info.gwdg.de/news/en/best-practices-for-machine-learning-with-hpc/)\n- [How to pick the right hardware for AI - Gigabyte - Part 1](https://www.gigabyte.com/Article/how-to-pick-the-right-server-for-ai-part-one-cpu-gpu)\n- [A practitioner's guide to testing and running large GPU clusters for training generative AI models](https://www.together.ai/blog/a-practitioners-guide-to-testing-and-running-large-gpu-clusters-for-training-generative-ai-models)\n- [AWS HPC Workshop](https://www.hpcworkshops.com/)\n- [Hardware Acceleration of LLMs: A comprehensive survey and comparison](https://news.ycombinator.com/item?id=41470074)\n- [The Ultra-Scale Playbook - Training LLMs on GPU Clusters](https://huggingface.co/spaces/nanotron/ultrascale-playbook)\n  \n#### Review Papers/Articles\n- [Interactive and Urgent HPC Challenges (2024)](https://arxiv.org/pdf/2401.14550.pdf)\n- [The Landscape of Exascale Research: A Data-Driven Literature Analysis (2020)](https://dl.acm.org/doi/pdf/10.1145/3372390)\n- [The Landscape of Parallel Computing Research: A View from Berkeley](https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf)\n- [Extreme Heterogeneity 2018: Productive Computational Science in the Era of Extreme Heterogeneity](references/2018-Extreme-Heterogeneity-DoE.pdf)\n- [Programming for Exascale Computers - Will Gropp, Marc Snir](https://snir.cs.illinois.edu/listed/J55.pdf)\n- [On the Memory Underutilization: Exploring Disaggregated Memory on HPC Systems 
(2020)](https://www.mcs.anl.gov/research/projects/argo/publications/2020-sbacpad-peng.pdf)\n- [Advances in Parallel \u0026 Distributed Processing, and Applications (conference proceedings)](https://link.springer.com/book/10.1007/978-3-030-69984-0)\n- [Designing Heterogeneous Systems: Large Scale Architectural Exploration Via Simulation](https://ieeexplore.ieee.org/abstract/document/9651152)\n- [Reinventing High Performance Computing: Challenges and Opportunities (2022)](https://arxiv.org/pdf/2203.02544.pdf)\n- [Challenges in Heterogeneous HPC White Paper (2022)](https://www.etp4hpc.eu/pujades/files/ETP4HPC_WP_Heterogeneous-HPC_20220216.pdf)\n- [An Evolutionary Technical \u0026 Conceptual Review on High Performance Computing Systems (Dec 2021)](https://kalaharijournals.com/resources/DEC_597.pdf)\n- [New Horizons for High-Performance Computing (2022)](https://csdl-downloads.ieeecomputer.org/mags/co/2022/12/09963771.pdf?Expires=1669702667\u0026Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jc2RsLWRvd25sb2Fkcy5pZWVlY29tcHV0ZXIub3JnL21hZ3MvY28vMjAyMi8xMi8wOTk2Mzc3MS5wZGYiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2Njk3MDI2Njd9fX1dfQ__\u0026Signature=s3K~-JXyED6vMVT9IKGj7LOhR75CrkQXiqAEsAEQt4zRqTbUFywmSoT10th1CdAaZcfZFuMsg23o2e719FRkCD6flVNB55d5tKyMUp7jUbkUtxnatOWLAKXfE4yQ-zrYQQEWBhtpSLKrTAS1oVmJ00YwkWqLYqCjhFIjW9La5od2SGQZEFZ136bbaGzxLZlED3JlMCMLB54YXKr-Ng1rngV4I9Wi-wSTFyLiA92~fUlk1KPQKU0XjtsMyYMYlt06Ze5H6jcQw4ytJ6c7r7qNJ43ifnsZepWmBywA8lVy2g3joOvZJtVjl~S91R8EZbiyWlYdWBGrO7pPdO6hH48~NQ__\u0026Key-Pair-Id=K12PMWTCQBDMDT)\n- [Confidential High-Performance Computing in the Public Cloud](https://arxiv.org/pdf/2212.02378.pdf)\n- [Containerisation for High Performance Computing Systems: Survey and Prospects](https://ieeexplore.ieee.org/abstract/document/9985426)\n- [Heterogeneous Computing Systems (2023)](https://arxiv.org/pdf/2212.14418.pdf)\n- [Myths and Legends in High-Performance Computing](https://arxiv.org/pdf/2301.02432.pdf)\n- [Energy-Aware 
Scheduling for High-Performance Computing Systems: A Survey](https://www.mdpi.com/1996-1073/16/2/890)\n- [Ultimate Physical limits to computation - Seth Lloyd](https://arxiv.org/abs/quant-ph/9908043)\n- [Myths and Legends in High-Performance Computing](https://arxiv.org/abs/2301.02432)\n- [Abstract Machine Models and Proxy Architectures for Exascale Computing, 2014, Sandia National Laboratories and Lawrence Berkeley National Laboratory](https://www.osti.gov/servlets/purl/1561498)\n- [Some thoughts on the environmental impact of High Performance Computing](https://sifflez.org/publications/environment-hpc/)\n- [A Research Retrospective on AMD's Exascale Computing Journey](https://dl.acm.org/doi/abs/10.1145/3579371.3589349)\n  \n#### News\n- [InsideHPC](https://insidehpc.com/)\n- [HPCWire](https://www.hpcwire.com/)\n- [NextPlatform](https://www.nextplatform.com)\n- [Datacenter Dynamics](https://www.datacenterdynamics.com/en/)\n- [Admin Magazine HPC](https://www.admin-magazine.com/HPC/News)\n- [Toms hardware](https://www.tomshardware.com/)\n- [Tech Radar](https://www.techradar.com/)\n- [Phoronix](https://www.phoronix.com/)\n- [The Register](https://www.theregister.com/on_prem/hpc/)\n\n#### Podcasts\n- [This week in HPC](https://soundcloud.com/this-week-in-hpc)\n- [Preparing Applications for Aurora in the Exascale Era](https://connectedsocialmedia.com/20114/preparing-applications-for-aurora-in-the-exascale-era/)\n- [Slurm podcast](https://www.rce-cast.com/index.php/Podcast/rce-10-slurm.html)\n- [HPCPodcast](https://insidehpc.com/category/resources/hpc-podcast/)\n- [Developer Stories - The path to a career in high performance computing is not always equitable or clear.](https://rseng.github.io/devstories/2024/jay-lofstead/)\n- [Developer Stories - HPCToolkit](https://rseng.github.io/devstories/2024/wileam-phan/)\n  \n#### Video Presentations/Courses/Channels\n- [Argonne lectures on Extreme Scale Computing 
2022](https://www.youtube.com/playlist?list=PLcbxjEfgjpO9OeDu--H9_XqyxPj3MkjdN)\n- [Argonne supercomputer tour](https://www.youtube.com/watch?v=UT9HCgp2X3A)\n- [Containers in HPC - what they fix and what they break ](https://youtube.com/watch?v=WQTrA4-9ZXk\u0026feature=share) \n- [HPC Tech Shorts](https://www.youtube.com/channel/UChSIn5kcWQvJxW17KIjdLVw)\n- [CppCon](https://www.youtube.com/user/CppCon/videos)\n- [Create a clustering server](https://www.youtube.com/watch?v=4LyL4sNZ1u4)\n- [Argonne national lab](https://www.youtube.com/channel/UCfwgjtIQB3puojz_N9ly_Ag)\n- [Oak Ridge National Lab](https://www.youtube.com/user/OakRidgeNationalLab)\n- [Concurrency in C++20 and Beyond](https://www.youtube.com/watch?v=jozHW_B3D4U) - A. Williams\n- [Is Parallel Programming still Hard?](https://www.youtube.com/watch?v=YM8Xy6oKVQg) - P. McKenney, M. Michael, and M. Wong at CppCon 2017\n- [The Speed of Concurrency: Is Lock-free Faster?](https://www.youtube.com/watch?v=9hJkWwHDDxs) - Fedor G Pikus in CppCon 2016\n- [Expressing Parallelism in C++ with Threading Building Blocks](https://www.youtube.com/watch?v=9Otq_fcUnPE) - Mike Voss at Intel Webinar 2018\n- [A Work-stealing Runtime for Rust](https://www.youtube.com/watch?v=4DQakkJ8XLI) - Aaron Todd in Air Mozilla 2017\n- [C++11/14/17 atomics and memory model: Before the story consumes you](https://www.youtube.com/watch?v=DS2m7T6NKZQ) - Michael Wong in CppCon 2015\n- [The C++ Memory Model](https://www.youtube.com/watch?v=gpsz8sc6mNU) - Valentin Ziegler at C++ Meeting 2014\n- [Sharcnet HPC](https://www.youtube.com/channel/UCCRmb5_GMWT2hSlALHlwIMg)\n- [Low Latency C++ for fun and profit](https://www.youtube.com/watch?v=BxfT9fiUsZ4)\n- [Scalene Python profiler](https://youtu.be/5iEf-_7mM1k)\n- [Kokkos lectures](https://www.youtube.com/watch?v=rUIcWtFU5qM\u0026t=698s)\n- [EasyBuild Tech Talk I - The ABCs of Open MPI, part 1 (by Jeff Squyres \u0026 Ralph Castain)](https://www.youtube.com/watch?v=WpVbcYnFJmQ)\n- [The Spack 2022 
Roadmap](https://www.youtube.com/watch?v=HyA7RpjoY1k)\n- [A Not So Simple Matter of Software | Talk by Turing Award Winner Prof. Jack Dongarra](https://youtu.be/QBCX3Oxp3vw)\n- [Vectorization/SIMD intrinsics](https://www.youtube.com/watch?v=x9Scb5Mku1g)\n- [New Silicon for Supercomputers: A Guide for Software Engineers](https://www.youtube.com/watch?v=w3xNLj6nRgs\u0026t=197s)\n- [TechTechPotato Channel](TechTechPotato)\n- [How to write the perfect hash table](https://www.youtube.com/watch?v=DMQ_HcNSOAI)\n- [FOSDEM 2024 HPC Big Data Conference videos](https://fosdem.org/2024/schedule/track/hpc-big-data-data-science/)\n- [Bright Computing Cluster Management Technical Overview](https://www.youtube.com/watch?v=0AxzcZuviW0)\n- [What is HPC? An introduction by Canonical](https://www.youtube.com/watch?v=tGIobcyKViI)\n- [Slurm job scheduler basics](https://www.youtube.com/watch?v=Juo_mb3otJ0)\n  \n#### Presentation Slides\n- [Task based Parallelism and why it's awesome](https://www.fz-juelich.de/ias/jsc/EN/Expertise/Workshops/Conferences/CSAM-2015/Programme/lecture7a_gonnet-pdf.pdf?__blob=publicationFile) - Pedro Gonnet\n- [Tuning Slurm Scheduling for Optimal Responsiveness and Utilization](https://slurm.schedmd.com/SUG14/sched_tutorial.pdf)\n- [Parallel Programming Models Overview (2020)](https://www.researchgate.net/publication/348187154_Parallel_programming_models_overview_2020)\n- [Comparative Analysis of Kokkos and Sycl (Jeff Hammond)](https://www.iwocl.org/wp-content/uploads/iwocl-2019-dhpcc-jeff-hammond-a-comparitive-analysis-of-kokkos-and-sycl.pdf) \n- [Hybrid OpenMP/MPI Programming](https://www.nersc.gov/assets/Uploads/NUG2013hybridMPIOpenMP2.pdf)\n- [Designs, Lessons and Advice from Building Large Distributed Systems - Jeff Dean (Google)](http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf)\n- 
[Practical Debugging and Performance Engineering](https://orbilu.uni.lu/bitstream/10993/55305/1/Practical_Debugging_and_Performance_Engineering_for_HPC.pdf)\n\n  \n#### Building Clusters/Virtual Clusters\n- [Resources for learning about HPC networks and storage r/HPC](https://www.reddit.com/r/HPC/comments/17o0q5d/resources_for_learning_about_hpc_networks_and/)\n- [Slurm for dummies guide](https://github.com/SergioMEV/slurm-for-dummies)\n- [Build a cluster under 50k](https://www.reddit.com/r/HPC/comments/srssrt/build_a_minicluster_under_50000/)\n- [Build a Beowulf cluster](https://github.com/darshanmandge/Cluster) \n- [Build a Raspberry Pi Cluster](https://www.raspberrypi.com/tutorials/cluster-raspberry-pi-tutorial/)\n- [Puget Systems](https://www.pugetsystems.com/)\n- [Lambda Systems](https://lambdalabs.com/)\n- [Titan computers](https://www.titancomputers.com)\n- [Temple course on building/maintaining a cluster](https://www.hpc.temple.edu/mhpc/2021/hpc-technology/index.html)\n- [Detailed reddit discussion on setting up a small cluster](https://www.reddit.com/r/HPC/comments/xeipt7/setting_up_a_small_hpc_cluster/)\n- [Tiny titan - build a really cool pi supercomputer](https://github.com/tinytitan)\n- [Turing PI - mini PI cluster off the shelf](https://turingpi.com/product/turing-pi-2-5/)\n- [Raspberry Pi Cluster](https://www.raspberrypi.com/tutorials/cluster-raspberry-pi-tutorial/)\n- [Building an Intel HPC cluster with OpenHPC](https://cdrdv2-public.intel.com/671501/installguide-openhpc2-centos8-18jul21.pdf)\n- [Reddit r/HPC post on building clusters](https://www.reddit.com/r/HPC/comments/11azmhy/wanting_to_setup_a_cluster/)\n- [Build a virtual cluster with PelicanHPC](https://sourceforge.net/projects/pelicanhpc/)\n- [Building a High-performance Computing Cluster Using FreeBSD](https://people.freebsd.org/~brooks/papers/bsdcon2003/fbsdcluster/)\n- [Supermicro GPU racks](https://www.supermicro.com/en/products/gpu)\n- [VirtualOrfeo - Virtual HPC 
Cluster](https://gitlab.com/area7/datacenter/codes/virtualorfeo)\n- [Is there a reason to build a Raspberry Pi cluster](https://www.reddit.com/r/HPC/comments/1bfywk8/is_there_ever_a_reason_to_build_a_raspberry_pi/)\n    \n#### Forums\n - [r/hpc](https://www.reddit.com/r/HPC/)\n - [r/homelab](https://www.reddit.com/r/homelab/)\n - [r/slurm](https://www.reddit.com/r/SLURM/)\n\n#### Careers/Jobs\n - [HPC University Careers search](http://hpcuniversity.org/careers/)\n - [HPC wire career site](https://careers.hpcwire.com/)\n - [HPC wire job postings](https://jobs.hpcwire.com/)\n - [HPC certification](https://www.hpc-certification.org/)\n - [HPC SysAdmin Jobs (reddit)](https://www.reddit.com/r/HPC/comments/w5eu66/systems_administrator_systems_engineer_jobs/)\n - [The United States Research Software Engineer Association](https://us-rse.org/)\n - [NCSA Internship](https://wiki.ncsa.illinois.edu/display/NCSACIP/NCSA+Internship+Program+for+CI+Professionals+Home)\n - [AI and Future HPC Job Prospect](https://www.reddit.com/r/HPC/comments/12anrgq/hpc_future_career_prospects/)\n - [HPC sys admin career (reddit)](https://www.reddit.com/r/HPC/comments/16jkqlv/it_support_for_an_academic_hpc_cluster_as_a_career/)\n   \n#### Membership Clubs\n - [Association for Computing Machinery](https://www.acm.org/)\n - [ETP4HPC](https://www.etp4hpc.eu/)\n - [The SIGHPC Systems Professionals](https://sighpc-syspros.org/)\n   \n#### Blogs\n - [1024 Cores](http://www.1024cores.net/) - Dmitry Vyukov \n - [The Black Art of Concurrency](https://www.internalpointers.com/post-group/black-art-concurrency) - Internal Pointers\n - [Cluster Monkey](https://www.clustermonkey.net/)\n - [Jonathan Dursi](https://www.dursi.ca/)\n - [Arm Vendor HPC blog](https://community.arm.com/developer/tools-software/hpc/b/hpc-blog)\n - [HPC Notes](https://www.hpcnotes.com/)\n - [Brendan Gregg Performance Blog](https://www.brendangregg.com/blog/index.html)\n - [Performance engineering blog](https://pramodkumbhar.com)\n - [Concurrency 
Freaks](https://concurrencyfreaks.blogspot.com/)\n - [Servers@Home](https://servers.hydrology.cc/blog/)\n - [Dr. Bandwidth's Blog](https://sites.utexas.edu/jdm4372/2010/10/01/welcome-to-dr-bandwidths-blog/)\n - [Johnny's Software Lab](https://johnnysswlab.com/)\n - [Daniel Lemire Blog](https://lemire.me/blog/)\n - [Gigabyte HPC Blog](https://www.gigabyte.com/)\n   \n#### Journals\n - [IEEE Transactions on Parallel and Distributed Systems (TPDS)](https://www.computer.org/csdl/journal/td) \n - [Journal of Parallel and Distributed Computing](https://www.journals.elsevier.com/journal-of-parallel-and-distributed-computing)\n  \n#### Conferences\n\n - [ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP)](https://ppopp19.sigplan.org/home)\n - [ACM Symposium on Parallel Algorithms and Architectures (SPAA)](https://spaa.acm.org/)\n - [SC conference (SC)](https://supercomputing.org/)\n - [IEEE International Parallel and Distributed Processing Symposium (IPDPS)](http://www.ipdps.org/)\n - [International Conference on Parallel Processing (ICPP)](https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/)\n - [IEEE High Performance Extreme Computing Conference (HPEC)](https://ieee-hpec.org/cfp.htm)\n - [FOSDEM](https://fosdem.org/)\n\n#### Communities/Chat Groups\n  - [HPC Social Discord server](https://hpc.social/projects/chat/)\n  - [HPC Social slack group](https://hpcsocial.slack.com/)\n  - [HPC Social](https://hpc.social/)\n  - [Beowulf Mailing List](https://www.beowulf.org/)\n  - [Society of Research Software Engineering](https://society-rse.org/get-involved/)\n  - [Women In HPC](https://womeninhpc.org/)\n  - [HPC Hallway](https://hpc-hallway.github.io/The-Hallway/)\n  - [The High Performance Computing Special Interest Group](https://hpc-sig.org.uk/)\n  - [SigHPC](https://www.sighpc.org/)\n    \n#### Twitters\n - [Top500](https://twitter.com/top500supercomp?s=20)\n - [HPE HPC](https://twitter.com/hpe_hpc)\n - [HPC Wire](https://twitter.com/HPCwire)\n - 
[Rookie HPC](https://twitter.com/RookieHPC?s=20)\n - [HPC_Guru](https://twitter.com/HPC_Guru?s=20\u0026t=jHjVtUaZhz4s6Rq62IAmYg)\n - [Jeff Hammond](https://twitter.com/science_dot)\n \n#### Consulting\n- [Redline Performance](https://redlineperf.com/)\n- [R systems](http://rsystemsinc.com/)\n- [Advanced Clustering](https://www.advancedclustering.com/)\n\n#### Interview Preparation\n  - [Reddit Entry Level HPC interview help](https://www.reddit.com/r/HPC/comments/nhpdfb/entrylevel_hpc_job_interview/)\n\n#### Organizations\n  - [PRACE](https://prace-ri.eu/)\n  - [XSEDE](https://www.xsede.org/)\n  - [Compute Canada](https://www.computecanada.ca/)\n  - [RIKEN R-CCS](https://www.riken.jp/en/research/labs/r-ccs/)\n  - [Pawsey](https://pawsey.org.au/)\n  - [International Data Corporation](https://www.idc.com/)\n  - [List of Federally funded research and development centers](https://en.wikipedia.org/wiki/Federally_funded_research_and_development_centers)\n\n#### Interesting r/HPC posts\n  - [Finding a supercomputer to use for research](https://www.reddit.com/r/HPC/comments/19e58z7/how_do_i_go_about_finding_a_supercomputer_to_use/)\n\n#### Misc. 
Wikis\n- [Amdahl's Law](https://en.wikipedia.org/wiki/Amdahl%27s_law)\n- [HPC Wiki](https://hpc-wiki.info/hpc/HPC_Wiki)\n- [FLOPS](https://en.wikipedia.org/wiki/FLOPS)\n- [Computational complexity of math operations](https://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations)\n- [Many Task Computing](https://en.wikipedia.org/wiki/Many-task_computing)\n- [High Throughput Computing](https://en.wikipedia.org/wiki/High-throughput_computing)\n- [Parallel Virtual Machine](https://en.wikipedia.org/wiki/Parallel_Virtual_Machine)\n- [OSI Model](https://en.wikipedia.org/wiki/OSI_model)\n- [Workflow management](https://en.wikipedia.org/wiki/Scientific_workflow_system)\n- [Compute Canada Documentation](https://docs.computecanada.ca/wiki/Compute_Canada_Documentation)\n- [Network Interface Controller (NIC)](https://en.wikipedia.org/wiki/Network_interface_controller)\n- [Just in time compilation](https://en.wikipedia.org/wiki/Just-in-time_compilation)\n- [List of distributed computing projects](https://en.wikipedia.org/wiki/List_of_distributed_computing_projects)\n- [Computer cluster](https://en.wikipedia.org/wiki/Computer_cluster)\n- [Quasi-opportunistic supercomputing](https://en.wikipedia.org/wiki/Quasi-opportunistic_supercomputing)\n- [Limits of Computation](https://en.wikipedia.org/wiki/Limits_of_computation)\n- [Bremermann's Limit](https://en.wikipedia.org/wiki/Bremermann%27s_limit)\n- [Concurrency patterns](https://en.wikipedia.org/wiki/Concurrency_pattern)\n- [Parallel Computing](https://en.wikipedia.org/wiki/Parallel_computing)\n- [Server Management](https://wiki.hydrology.cc/en/home)\n  \n#### Misc. 
Papers/Articles\n- [Advanced Parallel Programming in C++](https://www.diehlpk.de/assets/modern_cpp.pdf)\n- [Tools for scientific computing](https://arxiv.org/pdf/2108.13053.pdf)\n- [Quantum Computing for High Performance Computing](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9537178)\n- [Benchmarking data science: Twelve ways to lie with statistics and performance on parallel computers.](http://ww.unixer.de/publications/img/hoefler-12-ways-data-science-preprint.pdf)\n- [Establishing the IO500 Benchmark](https://www.vi4io.org/_media/io500/about/io500-establishing.pdf)\n- [NVIDIA High Performance Computing articles](https://research.nvidia.com/research-area/high-performance-computing)\n- [Let's write a superoptimizer](https://austinhenley.com/blog/superoptimizer.html)\n- [Why I think C++ is still a desirable coding platform compared to Rust](https://lucisqr.substack.com/p/why-i-think-c-is-still-a-very-attractive)\n- [The State of Fortran (arXiv paper 2022)](https://arxiv.org/abs/2203.15110)\n- [50 years later, is two-phase locking still the best](https://concurrencyfreaks.blogspot.com/2023/09/50-years-later-is-two-phase-locking.html)\n- [Estimating your memory bandwidth](https://lemire.me/blog/2024/01/13/estimating-your-memory-bandwidth/)\n  \n#### Misc. 
Repos\n  - [Build a Beowulf cluster](https://github.com/darshanmandge/Cluster)\n  - [libsc - Supercomputing library](https://github.com/cburstedde/libsc)\n  - [xbyak jit assembler](https://github.com/herumi/xbyak)\n  - [cpufetch - pretty cpu info fetcher](https://github.com/Dr-Noob/cpufetch)\n  - [RRZE-HPC](https://github.com/RRZE-HPC)\n  - [Argonne Github](https://github.com/Argonne-National-Laboratory)\n  - [Argonne Leadership Computing Facility](https://github.com/argonne-lcf)\n  - [Oak Ridge National Lab Github](https://github.com/ORNL)\n  - [Compute Canada](https://github.com/ComputeCanada)\n  - [HPCInfo by Jeff Hammond](https://github.com/jeffhammond/HPCInfo)\n  - [Texas Advanced Computing Center (TACC) Github](https://github.com/TACC)\n  - [LANL HPC Github](https://github.com/hpc)\n  - [Rust in HPC](https://github.com/westernmagic/rust-in-hpc)\n  - [University of Buffalo - Center for Computational Research](https://github.com/ubccr)\n  - [Center for High Performance Computing - University of Utah](https://github.com/CHPC-UofU)\n  - [Top500 Supercomputer Data Analysis](https://github.com/glennklockwood/top500-data)\n    \n#### Misc. 
Theses\n   - [Rust programming language in the high-performance computing environment](https://www.research-collection.ethz.ch/handle/20.500.11850/474922)\n\n#### Misc.\n  - [Exascale Project](https://www.exascaleproject.org/)\n  - [Pocket HPC Survival Guide](https://tin6150.github.io/psg/lsf.html)\n  - [HPC Summer school](https://www.ihpcss.org/)\n  - [Overview of all linear algebra packages](http://www.netlib.org/utk/people/JackDongarra/la-sw.html)\n  - [Latency numbers](http://norvig.com/21-days.html#answers)\n  - [Nvidia HPC benchmarks](https://ngc.nvidia.com/catalog/containers/nvidia:hpc-benchmarks)\n  - [Intel Intrinsics Guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#)\n  - [AWS Cloud calculator](https://calculator.aws/)\n  - [Quickly benchmark C++ functions](https://quick-bench.com/)\n  - [LLNL Software repository](https://software.llnl.gov/)\n  - [Boinc - volunteer computing projects](https://boinc.berkeley.edu/projects.php)\n  - [Prace Training Events](https://events.prace-ri.eu/category/2/)\n  - [Nice discussion on FlameGraph profiling](https://stackoverflow.com/questions/27842281/unknown-events-in-nodejs-v8-flamegraph-using-perf-events/27867426#27867426)\n  - [Nice discussion on parts of a supercomputer on reddit](https://www.reddit.com/r/HPC/comments/11elh93/job_node_socket_task_runner_device_thread_logical/)\n  - [Technical Report on C++ performance](https://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf)\n  - [BOINC Compute for science](https://boinc.berkeley.edu/)\n  - [Count prime numbers using MPI](https://people.sc.fsu.edu/~jburkardt/c_src/prime_mpi/prime_mpi.html)\n  - [How to build your LEGO Scafell Pike Supercomputer](https://www.youtube.com/watch?v=m499o5rLh38)\n\n#### Games/Challenges\n  - [Deadlock empire - practice concurrency](https://github.com/deadlockempire/deadlockempire.github.io)\n  - [Sad Server - practice linux server management](https://sadservers.com/scenarios)\n    \n## Other Curated Lists \n   - [Awesome 
Cloud HPC](https://github.com/kjrstory/awesome-cloud-hpc)\n   - [Parallel Computing Guide](https://github.com/mikeroyal/Parallel-Computing-Guide)\n   - [Awesome Parallel Computing](https://github.com/taskflow/awesome-parallel-computing)\n   - [Princeton resources on OpenMP](https://researchcomputing.princeton.edu/education/external-online-resources/openmp)\n   - [Awesome HPC](https://github.com/dstdev/awesome-hpc/)\n   - [Sig HPC Education](https://sighpceducation.acm.org/resources/hpcresources/)\n   - [Fortran Codes On Github](https://github.com/Beliavsky/Fortran-code-on-GitHub)\n   - [Fortran Tools](https://github.com/Beliavsky/Fortran-Tools)\n     \n## Acknowledgements\n\nThis repo started from the great curated list https://github.com/taskflow/awesome-parallel-computing\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrevor-vincent%2Fawesome-high-performance-computing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftrevor-vincent%2Fawesome-high-performance-computing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrevor-vincent%2Fawesome-high-performance-computing/lists"}