Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-high-performance-computing
A curated list of awesome high performance computing resources
https://github.com/trevor-vincent/awesome-high-performance-computing
-
People
-
Other/Wikis
- Jack Dongarra - 2021 Turing Award - LINPACK, BLAS, LAPACK, MPI
- Bill Gropp - 2010 IEEE TCSC Medal for Excellence in Scalable Computing
- David Bader - built the first Linux supercomputer
- Thomas Sterling - Inventor of Beowulf cluster, ParalleX/HPX
- Seymour Cray - Inventor of the Cray Supercomputer
- Larry Smarr - HPC Application Pioneer
-
-
Software
-
Trends
- alpaka - The alpaka library is a header-only C++17 abstraction library for accelerator development
- async-rdma - A framework for writing RDMA applications with high-level abstraction and asynchronous APIs
- CAF - An Open Source Implementation of the Actor Model in C++
- Codon - high-performance Python compiler that compiles Python code to native machine code without any runtime overhead
- DeepSpeed - An easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for Deep Learning Training and Inference
- FastFlow - High-performance Parallel Patterns in C++
- Galois - A C++ Library to Ease Parallel Programming with Irregular Parallelism
- Heteroflow - Concurrent CPU-GPU Task Programming using Modern C++
- highway - Performance portable SIMD intrinsics
- HPX - A C++ Standard Library for Concurrency and Parallelism
- Horovod - Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet
- Intel ISPC - SPMD compiler
- Kompute - The general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends)
- Kokkos - A C++ Programming Model for Writing Performance Portable Applications on HPC platforms
- Kubeflow MPI Operator - MPI Operator for Kubeflow
- Legate - Nvidia replacement for numpy based on Legion
- Legion - Distributed heterogeneous programming library
- MOGSLib - User defined schedulers
- mpi4jax - Zero-copy mpi for jax arrays
- Pollux - Message Passing Cloud orchestrator
- Pyfi - Distributed flow and computation system
- RAJA - Architecture and programming model portability for HPC applications
- RaftLib - A C++ Library for Enabling Stream and Dataflow Parallel Computation
- Scalix - Data parallel computing framework
- Taichi - Parallel programming language for high-performance numerical computations in Python
- Taskflow - A Modern C++ Parallel Task Programming Library
- Transwarp - A Header-only C++ Library for Task Concurrency
- UCX - Optimized, production-proven communication framework
- Zluda - Run unmodified CUDA applications with near-native performance on Intel and AMD GPUs.
- HyperQueue - HyperQueue is a tool designed to simplify execution of large workflows (task graphs) on HPC clusters.
- cpufetch - A simple yet fancy CPU architecture fetching tool.
- gpufetch - A tool similar to cpufetch, but for fetching GPU architecture.
- Likwid - Performance monitoring and benchmarking tool suite that reports hardware topology and performance counters for cluster nodes.
- PRK - Parallel Research Kernels - A collection of kernels for parallel programming research.
- Bluebanquise - An open-source cluster management tool.
- DeepOps - Nvidia's GPU infrastructure and automation tools for Kubernetes and Slurm clusters.
- Ruse - A tool for managing software environments in HPC clusters.
- sstack - A tool to install multiple software stacks such as Spack, EasyBuild, and Conda.
- McKernel - A hybrid kernel that combines Linux and a lightweight kernel designed to provide high performance for HPC applications.
- arbiter2 - Monitors and protects interactive nodes with cgroups.
- Charliecloud - Lightweight container solution for high-performance computing (HPC).
- genv - GPU Environment Management for managing and scheduling GPU resources.
- Grafana - Open-source platform for monitoring and observability, visualizing metrics.
- HPC Rocket - Allows submitting Slurm jobs in Continuous Integration (CI) pipelines.
- perun - Energy monitor for HPC systems, focusing on performance and energy efficiency.
- redun - Workflow engine that emphasizes simplicity, reliability, and scalability.
- remora - Tool for monitoring and reporting the performance of batch jobs on HPC systems.
- ruptime - A utility for monitoring the status of computational jobs and systems.
- Slurmvision slurm dashboard - A dashboard for monitoring and managing Slurm jobs.
- slurm docker cluster - A Slurm cluster implemented using Docker containers, for development and testing.
- Stui slurm dashboard for the terminal - A terminal-based UI for managing and monitoring Slurm clusters.
- Vaex - A Python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets.
- seer modern gui for gdb - A graphical user interface for GDB, aiming to improve the debugging experience with modern features and visuals.
- Geekbench - Cross platform benchmarking tool
- Chapel - A Programming Language for Productive Parallel Computing on Large-scale Systems
- Charm++ - Parallel Programming with Migratable Objects
- Cilk Plus - C/C++ Extension for Data and Task Parallelism
- CUDA - High performance NVIDIA GPU acceleration
- DeterminedAI - Distributed deep learning
- Halide - A language for fast, portable computation on images and tensors
- HPC-X - Nvidia implementation of MPI
- ISPC - An open-source compiler for high-performance SIMD programming on the CPU and GPU
- joblib - Data-flow programming for performance (python)
- MAGMA - Next generation linear algebra (LA) GPU accelerated libraries
- Merlin - A distributed task queuing system, designed to allow complex HPC workflows to scale to large numbers of simulations
- mpi4py - Python bindings for MPI
- MPI - OpenMPI implementation of the Message passing interface
- MPI - MPICH implementation of the Message passing interface
- MPI Standardization Forum - Forum for MPI standardization
- MVAPICH - Implementation of MPI
- NCCL - The NVIDIA Collective Communication Library for multi-GPU and multi-node communication
- cuNumeric - GPU drop-in for numpy
- stdpar - GPU accelerated C++ from NVIDIA
- numba - A JIT compiler that translates a subset of Python into fast machine code
- oneAPI - A unified, multiarchitecture, multi-vendor programming model
- OpenACC - "OpenMP for GPUs"
- OpenCilk - MIT continuation of Cilk Plus
- OpenMP - Multi-platform Shared-memory Parallel Programming in C/C++ and Fortran
- PVM - Parallel Virtual Machine: A predecessor to MPI for distributed computing
- PMIX - Standard for process management
- ray - Scale AI and Python workloads from reinforcement learning to deep learning
- ROCm - First open-source software development platform for HPC/Hyperscale-class GPU computing
- RS MPI - Rust bindings for MPI
- Simgrid - Simulate cluster/HPC environments
- SkelCL - A Skeleton Library for Heterogeneous Systems
- STAPL - Standard Template Adaptive Parallel Programming Library in C++
- STLab - High-level Constructs for Implementing Multicore Algorithms with Minimized Contention
- SYCL - C++ Abstraction layer for heterogeneous devices
- The Open Community Runtime - Specification for Asynchronous Many Task systems
- Tuplex - Blazing fast python data science
- cpuid instruction note
- openmpi hwloc
- Flux framework - A framework for high-performance computing clusters.
- E4S - The Extreme Scale HPC Scientific Stack - A collection of open-source software packages for HPC environments.
- OpenHPC - A community-led set of HPC components.
- Lustre Parallel File System - A high-performance distributed filesystem for large-scale cluster computing.
- Spack - A package manager for HPC/supercomputers.
- Easybuild - A package manager for HPC/supercomputers.
- Lmod - A Lua-based module system for software environment management on HPC systems.
- LSF - A batch system for HPC and distributed computing environments.
- moosefs - A fault-tolerant, highly available, distributed file system.
- OpenOnDemand - A web portal for accessing supercomputing resources.
- mOS - A specialized operating system for high-performance computing, designed to support large-scale, manycore processors.
- Kitten - A lightweight kernel designed for high-performance computing. It focuses on providing low noise and predictable performance for HPC applications.
- Apptainer (formerly Singularity) - Container platform designed for scientific and high-performance computing (HPC) environments.
- Docker - A set of platform as a service products that use OS-level virtualization to deliver software in packages called containers.
- HTCondor - An open-source high-throughput computing software framework.
- grpc - A high-performance, open-source universal RPC framework.
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows.
- the yt project - An open-source, Python-based package for analyzing and visualizing volumetric data.
- Summary of C/C++ debugging tools
- ddt
- totalview - A comprehensive source code analysis and debugging tool designed for complex software running on HPC systems, supporting a wide range of languages and architectures.
- marmot MPI checker - A tool for detecting and reporting issues in MPI (Message Passing Interface) applications.
- python debugging tools - A collection of tools for debugging Python applications, including pdb and other utilities.
- Summary of profiling tools - A comprehensive list of profiling tools for performance analysis in HPC.
- Summary of code performance analysis tools
- vampir - A tool for detailed analysis of MPI program executions by visualizing their event traces.
- Flamegraphs - Visualization tool for profiling software, allowing quick identification of performance bottlenecks.
- fio - Flexible I/O tester for benchmarking and stress/hardware verification.
- SPEC CPU Benchmark - A benchmark suite designed to provide a comparative measure of compute-intensive performance across the widest practical range of hardware.
- STREAM Memory Bandwidth Benchmark - Measures sustainable memory bandwidth and the corresponding computation rate for simple vector kernels.
- Intel MPI benchmarks - A set of benchmarks designed to measure the performance and scalability of MPI implementations on Intel architectures.
- tnl project
- visit
- petsc
- ginkgo
- GSL
- Scalapack
- trilinos
- Software utilization at UK National Supercomputing Service, ARCHER2
- Comparison of cluster software
- List of cluster management software
- Intel DAOS
- fio flexible I/O tester
- cpuid - A software instruction available on Intel, AMD, and other processors that can be used to determine processor type and features.
- intel cpuinfo - Intel tool providing information about the characteristics of Intel CPUs.
- LIKWID.jl - Julia wrapper for LIKWID.
- BeeGFS - A parallel file system designed for performance-critical environments.
- Ceph - An open-source distributed storage system.
- fpsync - A tool for fast parallel data transfer using fpart and rsync.
- GPFS - A high-performance parallel file system developed by IBM.
- Guix - A package manager for HPC/supercomputers.
- OpenPBS - A software for workload management and job scheduling.
- Open XDMoD - A tool for managing high-performance computing resources.
- RADIUSS - Rapid Application Development via an Institutional Universal Software Stack.
- rocks - An open-source Linux cluster distribution.
- SGE - A resource management software for large clusters of computers.
- Slurm - A cluster management and job scheduling system for Linux clusters.
- Starfish - Unstructured data management and metadata solution for files and objects.
- Warewulf - An operating system provisioning system and cluster management tool.
- xCat - A distributed computing management and provisioning tool.
- XDMoD - An open-source tool for managing high-performance computing resources.
- Globus Connect - A fast data transfer tool between supercomputers.
- HIP - A C++ Runtime API and Kernel Language for AMD/Nvidia GPUs
- Jacamar-ci - CI/CD tool designed for HPC and scientific computing workflows.
- Kubernetes - An open-source system for automating deployment, scaling, and management of containerized applications.
- nextflow - A workflow framework to deploy data-driven computational pipelines.
- Prefect - A workflow management system, designed for modern infrastructure and powered by the open-source Prefect Core workflow engine.
- Prometheus - An open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.
- snakemake - A workflow management system that reduces the complexity of creating reproducible and scalable data analyses.
- Metal - Apple's GPU API
- Intel DAOS - A software-defined scale-out object store for HPC applications.
- Amira - A powerful, multifaceted 3D software platform for visualizing, manipulating, and understanding Life Science and bio-medical data coming from all types of sources.
- paraview - An open-source, multi-platform data analysis and visualization application.
- Scientific Visualization Wiki - A comprehensive guide to the field of scientific visualization, detailing techniques, tools, and applications.
- vedo - A lightweight and powerful python module for scientific analysis and visualization of 3D objects and point clouds based on VTK.
- HPL benchmark - The High Performance Linpack Benchmark for measuring floating-point computing power of systems.
- EESSI - A shared stack of scientific software installations.
- NASA parallel benchmark suite - A set of benchmarks designed to evaluate the performance of parallel supercomputers.
- papi - Provides standard APIs for accessing hardware performance counters available on modern microprocessors.
- scalasca - A software tool that supports performance analysis of large-scale parallel applications.
- tau - TAU (Tuning and Analysis Utilities) is a profiling and tracing toolkit for performance analysis of parallel programs.
- hpctoolkit - An integrated suite of tools for measurement and analysis of program performance on computers ranging from desktops to supercomputers.
- Differential Flamegraphs - A visualization technique developed by Brendan Gregg that highlights differences between performance profiles, making it easier to spot performance regressions or improvements.
- Openfoam HPC benchmark - A benchmarking suite for evaluating the High Performance Computing capabilities of OpenFOAM, an open-source CFD software, under various computational loads.
- Slurm Web - Open source web dashboard for Slurm HPC clusters.
- Open Cluster Scheduler - A scalable HPC/AI workload manager based on SGE.
- OSU microbenchmarks - A collection of microbenchmarks designed to evaluate the performance of MPI implementations across various communication protocols and message sizes.
- Roofline Visualizer for ERT - Visualizer for ERT
- Empirical Roofline Tool (ERT) - Create empirical roofline plots for any machine; an alternative to Intel VTune
- Triton - Triton is a language and compiler for parallel programming
- hdf5 - The Hierarchical Data Format version 5 (HDF5), is an open source file format that supports large, complex, heterogeneous data.
-
-
Hardware
-
Cloud
- The use of Microsoft Azure for high performance cloud computing – A case study
- AWS HPC
- Azure HPC
- rescale
- vast.ai
- hetzner - cheap servers incl. 80-core ARM
- Ampere ARM cloud-native processors
- Scaleway
- Chameleon Cloud
- Runpod
- AWS Cluster in the cloud
- AWS Parallel Cluster
- An Empirical Study of Containerized MPI and GUI Application on HPC in the Cloud
-
Interconnects/Topology
-
CPU
-
Other/Wikis
-
GPU
-
TPU/Tensor Cores
-
Many integrated core processor (MIC)
-
Custom/FPGA/ASIC/APU
-
Certification
-
Student Opportunities / Workshops
-
-
Resources
-
Other/Wikis
- High Performance Computing in Biomimetics Modeling, Architecture and Applications
- High Performance Computing: A Bird's Eye View
- Advances in Parallel & Distributed Processing, and Applications (conference proceedings)
- Sad Server - practice linux server management
- The OpenMP Common Core: Making OpenMP Simple Again
- Parallel and High Performance Computing
- Algorithms for Modern Hardware
- High Performance Computing: Modern Systems and Practices - Thomas Sterling, Maciej Brodowicz, Matthew Anderson 2017
- Introduction to High Performance Computing for Scientists and Engineers - Hager 2010
- Computer Organization and Design
- Introduction to High Performance Scientific Computing - Victor Eijkhout 2021
- Free Modern HPC Books by Victor Eijkhout
- High Performance Parallel Runtimes
- Parallel Programming for Science and Engineering - Victor Eijkhout 2021
- Parallel Programming for Science and Engineering - HTML Version
- C++ High Performance
- Data Parallel C++ Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL
- High Performance Python
- C++ Concurrency in Action: Practical Multithreading - Anthony Williams 2012
- The Art of Multiprocessor Programming - Maurice Herlihy 2012
- Parallel Computing: Theory and Practice - Umut A. Acar 2016
- Introduction to Parallel Computing - Zbigniew J. Czech
- Optimizing software in C++
- Optimizing subroutines in assembly code
- Parallel Programming with MPI
- HPC, Big Data, AI Convergence Towards Exascale: Challenge and Vision
- Introduction to parallel computing - Ananth Grama
- Guide into OpenMP: Easy multithreading programming for C++
- An Introduction to the Partitioned Global Address Space (PGAS) Programming Model
- Jax in 2022
- C++ Benchmarking for beginners
- Argonne supercomputer tour
- HPC Notes
- Oak Ridge National Lab Github
- Compute Canada
- The Student Supercomputer Challenge Guide
- The Rust Performance Book
- E-Zines on Bash, Linux, Perf, etc - Julia Evans
- The Art of Writing Efficient Programs: An Advanced Programmer's Guide to Efficient Hardware Utilization and Compiler Optimizations Using C++ Examples
- OpenMP Examples - openmp.org
- Latest books on OpenMP - openmp.org
- Programming Massively Parallel Processors 4th Edition 2023
- Software Optimization Cookbook
- Power and Performance: Software Analysis and Optimization
- Gropp books on MPI
- Performance Analysis and Tuning on Modern CPUs
- HPC Carpentry
- Berkeley: Applications of Parallel Computers - Detailed course on HPC
- Udacity High Performance Computing
- Parallel Numerical Algorithms
- Illinois - Intro to HPC - Creator of PyCuda
- Archer1 Courses
- TACC tutorials
- Building pipelines using slurm dependencies
- Parallel Computation Math
- Foundations of HPC 2020/2021
- Principles of Distributed Computing
- High Performance Visualization
- Temple course on building/maintaining a cluster
- Nvidia Deep Learning Course
- Coursera GPU Programming Specialization
- Coursera Fundamentals of Parallelism on Intel Architecture
- Archer2 Shared Memory Programming with OpenMP
- Archer2 Message-Passing Programming with MPI
- HetSys 2022 Course
- Edukamu Introduction to Supercomputing
- Heterogeneous Parallel Programming by S K
- Supercomputing in Plain English
- Cornell workshop
- UL HPC School
- Introduction to High-Performance Parallel Distributed Computing using Chapel, UPC++ and Coarray Fortran
- Performance Engineering of Software Systems (MIT-OCW)
- Introduction to Parallel Computing (CMSC 498X/818X)
- Beginners Guide to HPC
- Rookie HPC Guide
- RedHat High Performance Computing 101
- Foundations of Multithreaded, Parallel, and Distributed Programming
- Writing Slurm scripts in Python, R, and Bash
- Xsede new user tutorials
- Improving Performance with SIMD intrinsics
- Want speed? Pass by value
- Introduction to low level bit hacks
- How to write fast numerical code: An Introduction
- Lecture notes on Loop optimizations
- A practical approach to code optimization
- Software optimization manuals
- Oak Ridge National Lab Tutorials
- How to perform large scale data processing in bioinformatics
- Step by step SGEMM in OpenCL
- Frontier User Guide
- Allocating large blocks of memory in bare-metal C programming
- LLNL HPC Tutorials
- The dirty secret of high performance computing
- Multiple GPUs with pytorch
- Brendan Gregg on Linux Performance
- Automatic Slurm build scripts
- Memory bandwidth NapkinMath
- Avoiding Instruction Cache Misses
- Multi-GPU Programming with Standard Parallel C++
- EuroCC National Competence Center Sweden (ENCCS) HPC tutorials
- LLNL hpc tutorials
- python.org Python Performance Tips
- OpenMP tutorials
- CUDA best practices guide
- Understanding CPU Architecture And Performance Using LIKWID
- 32 OpenMP Traps For C++ Developers
- The Landscape of Parallel Computing Research: A View from Berkeley
- Programming for Exascale Computers - Will Gropp, Marc Snir
- On the Memory Underutilization: Exploring Disaggregated Memory on HPC Systems (2020)
- Designing Heterogeneous Systems: Large Scale Architectural Exploration Via Simulation
- Reinventing High Performance Computing: Challenges and Opportunities (2022)
- Challenges in Heterogeneous HPC White Paper (2022)
- An Evolutionary Technical & Conceptual Review on High Performance Computing Systems (Dec 2021)
- New Horizons for High-Performance Computing (2022)
- Confidential High-Performance Computing in the Public Cloud
- Containerisation for High Performance Computing Systems: Survey and Prospects
- Heterogeneous Computing Systems (2023)
- Myths and Legends in High-Performance Computing
- Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
- Ultimate Physical limits to computation - Seth Lloyd
- Abstract Machine Models and Proxy Architectures for Exascale Computing, 2014, Sandia National Laboratories and Lawrence Berkeley National Laboratory
- Some thoughts on the environmental impact of High Performance Computing
- InsideHPC
- HPCWire
- NextPlatform
- Datacenter Dynamics
- Admin Magazine HPC
- Tom's Hardware
- Tech Radar
- This week in HPC
- Preparing Applications for Aurora in the Exascale Era
- Slurm podcast
- HPCPodcast
- Argonne lectures on Extreme Scale Computing 2022
- HPC Tech Shorts
- Create a clustering server
- Argonne national lab
- Oak Ridge National Lab
- Concurrency in C++20 and Beyond - A. Williams
- Is Parallel Programming still Hard? - P. McKenney, M. Michael, and M. Wong at CppCon 2017
- The Speed of Concurrency: Is Lock-free Faster? - Fedor G Pikus in CppCon 2016
- Expressing Parallelism in C++ with Threading Building Blocks - Mike Voss at Intel Webinar 2018
- A Work-stealing Runtime for Rust - Aaron Todd in Air Mozilla 2017
- C++11/14/17 atomics and memory model: Before the story consumes you - Michael Wong in CppCon 2015
- The C++ Memory Model - Valentin Ziegler at C++ Meeting 2014
- Sharcnet HPC
- Low Latency C++ for fun and profit
- Scalene Python profiler
- Kokkos lectures
- EasyBuild Tech Talk I - The ABCs of Open MPI, part 1 (by Jeff Squyres & Ralph Castain)
- The Spack 2022 Roadmap
- A Not So Simple Matter of Software | Talk by Turing Award Winner Prof. Jack Dongarra
- Vectorization/SIMD intrinsics
- New Silicon for Supercomputers: A Guide for Software Engineers
- Task based Parallelism and why it's awesome - Pedro Gonnet
- Tuning Slurm Scheduling for Optimal Responsiveness and Utilization
- Parallel Programming Models Overview (2020)
- Comparative Analysis of Kokkos and Sycl (Jeff Hammond)
- Hybrid OpenMP/MPI Programming
- Designs, Lessons and Advice from Building Large Distributed Systems - Jeff Dean (Google)
- Practical Debugging and Performance Engineering
- Resources for learning about HPC networks and storage r/HPC
- Build a cluster under 50k
- Build a Raspberry Pi Cluster
- Puget Systems
- Titan computers
- Detailed reddit discussion on setting up a small cluster
- Tiny titan - build a really cool pi supercomputer
- Building an Intel HPC cluster with OpenHPC
- Reddit r/HPC post on building clusters
- Build a virtual cluster with PelicanHPC
- Building a High-performance Computing Cluster Using FreeBSD
- Supermicro GPU racks
- r/hpc
- r/homelab
- r/slurm
- HPC University Careers search
- HPC wire career site
- HPC certification
- HPC SysAdmin Jobs (reddit)
- The United States Research Software Engineer Association
- NCSA Internship
- AI and Future HPC Job Prospect
- HPC sys admin career (reddit)
- ETP4HPC
- The Black Art of Concurrency - Internal Pointers
- Cluster Monkey
- Jonathan Dursi
- Arm Vendor HPC blog
- Brendan Gregg Performance Blog
- Performance engineering blog
- Concurrency Freaks
- IEEE Transactions on Parallel and Distributed Systems (TPDS)
- Journal of Parallel and Distributed Computing
- ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP)
- ACM Symposium on Parallel Algorithms and Architectures (SPAA)
- SC conference (SC)
- IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- International Conference on Parallel Processing (ICPP)
- IEEE High Performance Extreme Computing Conference (HPEC)
- HPC Social Discord server
- HPC Social slack group
- HPC Social
- Beowulf Mailing List
- Top500
- HPE HPC
- HPC Wire
- Rookie HPC
- HPC_Guru
- Jeff Hammond
- Redline Performance
- R systems
- Advanced Clustering
- Reddit Entry Level HPC interview help
- Prace
- Xsede
- Compute Canada
- RIKEN CCS
- Pawsey
- International Data Corporation
- List of Federally funded research and development centers
- Amdahl's Law
- HPC Wiki
- FLOPS
- Computational complexity of math operations
- Many Task Computing
- High Throughput Computing
- Parallel Virtual Machine
- OSI Model
- Workflow management
- Compute Canada Documentation
- Network Interface Controller (NIC)
- Just in time compilation
- List of distributed computing projects
- Quasi-opportunistic supercomputing
- Limits of Computation
- Bremermann's Limit
- Concurrency patterns
- Advanced Parallel Programming in C++
- Tools for scientific computing
- Quantum Computing for High Performance Computing
- Benchmarking data science: Twelve ways to lie with statistics and performance on parallel computers.
- Establishing the IO500 Benchmark
- NVIDIA High Performance Computing articles
- Let's write a superoptimizer
- Why I think C++ is still a desirable coding platform compared to Rust
- The State of Fortran (arxiv paper 2022)
- RRZE-HPC
- Argonne Github
- Argonne Leadership Computing Facility
- Texas Advanced Computing Center (TACC) Github
- LANL HPC Github
- University at Buffalo - Center for Computational Research
- Center for High Performance Computing - University of Utah
- Rust programming language in the high-performance computing environment
- Exascale Project
- Pocket HPC Survival Guide
- HPC Summer school
- Latency numbers
- Nvidia HPC benchmarks
- Intel Intrinsics Guide
- AWS Cloud calculator
- Quickly benchmark C++ functions
- LLNL Software repository
- Boinc - volunteer computing projects
- Prace Training Events
- Nice discussion on FlameGraph profiling
- Nice discussion on parts of a supercomputer on reddit
- Technical Report on C++ performance
- BOINC Compute for science
- Count prime numbers using MPI
- Bright Computing Cluster Management Technical Overview
- What is HPC? An introduction by Canonical
- Slurm job scheduler basics
- Phoronix
- Servers@Home
- finding a supercomputer to use for research
- FosDem 2024 HPC Big Data Conference videos
- Fastest unordered_map implementation / benchmarks
- Interactive and Urgent HPC Challenges (2024)
- Dr. Bandwidth Blog
- Johnny's Software Lab
- Daniel Lemire Blog
- Estimating your memory bandwidth
- Infiniband Essentials
- Lambda Systems
- Containers in HPC - what they fix and what they break
- How to write the perfect hash table
- The SIGHPC Systems Professionals
- 50 years later, is two phase locking still the best
- High Performance Computing: A Bird's Eye View
- Advances in Parallel & Distributed Processing, and Applications (conference proceedings)
- High Performance Computing: A Bird's Eye View
- Advances in Parallel & Distributed Processing, and Applications (conference proceedings)
- High Performance Computing in Biomimetics Modeling, Architecture and Applications
- Computer cluster
- Coursera Introduction to High Performance Computing
- NCSA HPC Training Moodle
- Tiny titan - build a really cool pi supercomputer
- Is there a reason to build a Raspberry Pi cluster
- Server Management
- A Research Retrospective on AMD's Exascale Computing Journey
- Scalene Python profiler
- Kokkos lectures
- A Not So Simple Matter of Software | Talk by Turing Award Winner Prof. Jack Dongarra
- New Silicon for Supercomputers: A Guide for Software Engineers
- EasyBuild Tech Talk I - The ABCs of Open MPI, part 1 (by Jeff Squyres & Ralph Castain)
- Microarchitecture of Intel/AMD CPUs
- Developer Stories - The path to a career in high performance computing is not always equitable or clear.
- Developer Stories - HPCToolkit
- Reinventing High Performance Computing: Challenges and Opportunities (2022)
- Confidential High-Performance Computing in the Public Cloud
- Heterogeneous Computing Systems (2023)
- Myths and Legends in High-Performance Computing
- Tools for scientific computing
- HPC Administration Virtual Residency 2024
-
-
General Info
-
A Few Upcoming Supercomputers
- El Capitan - 2023, AMD-based, ~1.5 exaflops
- Tianhe-3 - 2022, ~700 petaflops (Linpack500)
- Venado - 2024, Grace-Hopper based, ~10 exaflops
-
Most Recent List of the Top500 Supercomputers
-
History
-
Trends
-
-
Other Curated Lists
-
Other/Wikis
-
Programming Languages
Sub Categories
Keywords
hpc
12
gpu
7
c-plus-plus
6
mpi
6
cuda
6
python
6
parallel-programming
5
parallel-computing
5
gpu-programming
5
machine-learning
4
deep-learning
4
cpp
4
linux
4
intel
3
heterogeneous-parallel-programming
3
data-science
3
high-performance-computing
3
c
3
parallel
3
pytorch
3
rust
3
gpu-computing
3
machinelearning
2
monitoring
2
ray
2
tensorflow
2
simd
2
multithreading
2
docker
2
pgas
2
shmem
2
kokkos
2
programming-model
2
distributed-computing
2
kubernetes
2
threading
2
nvidia-gpu
2
benchmarking
2
macos
2
openacc
2
async
2
cpp17
2
compiler
2
slurm-cluster
2
hip
2
slurm
2
openmp
2
message-passing
1
blt
1
actor-model
1