An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with high-performance-computing

A curated list of projects in awesome lists tagged with high-performance-computing.

https://github.com/parallel101/course

High-performance parallel programming and optimization - course slides

course cpp cpp17 high-performance-computing parallel-computing slides

Last synced: 29 Apr 2025

https://github.com/kokkos/kokkos

Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

abstraction c-plus-plus high-performance-computing hpsf kokkos parallel-computing programming-model

Last synced: 13 May 2025

https://github.com/adaptivecpp/adaptivecpp

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

adaptivecpp compiler gpgpu gpu-computing high-performance high-performance-computing hipsycl hpc opensycl stdpar sycl

Last synced: 11 Dec 2025

https://github.com/AdaptiveCpp/AdaptiveCpp

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

adaptivecpp compiler gpgpu gpu-computing high-performance high-performance-computing hipsycl hpc opensycl stdpar sycl

Last synced: 21 Apr 2025

https://github.com/mratsim/arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends

autograd automatic-differentiation cuda cudnn deep-learning gpgpu gpu-computing high-performance-computing iot linear-algebra machine-learning matrix-library multidimensional-arrays ndarray neural-networks nim opencl openmp parallel-computing tensor

Last synced: 14 May 2025

https://github.com/ropensci/drake

An R-focused pipeline toolkit for reproducibility and high-performance computing

data-science drake high-performance-computing makefile peer-reviewed pipeline r r-package reproducibility reproducible-research ropensci rstats workflow

Last synced: 13 May 2025

https://mratsim.github.io/Arraymancer/

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends

autograd automatic-differentiation cuda cudnn deep-learning gpgpu gpu-computing high-performance-computing iot linear-algebra machine-learning matrix-library multidimensional-arrays ndarray neural-networks nim opencl openmp parallel-computing tensor

Last synced: 08 May 2025

https://github.com/mratsim/Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends

autograd automatic-differentiation cuda cudnn deep-learning gpgpu gpu-computing high-performance-computing iot linear-algebra machine-learning matrix-library multidimensional-arrays ndarray neural-networks nim opencl openmp parallel-computing tensor

Last synced: 16 Apr 2025

https://github.com/liu-xiandong/how_to_optimize_in_gpu

A series of GPU optimization topics explaining in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. According to the author, these kernels perform at or near the theoretical limit.

elementwise gpu-acceleration high-performance-computing hpc reduce sgemm sgemv

Last synced: 03 Oct 2025

https://github.com/Liu-xiandong/How_to_optimize_in_GPU

A series of GPU optimization topics explaining in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. According to the author, these kernels perform at or near the theoretical limit.

elementwise gpu-acceleration high-performance-computing hpc reduce sgemm sgemv

Last synced: 14 May 2025

https://github.com/precice/precice

A coupling library for partitioned multi-physics simulations, including, but not restricted to, fluid-structure interaction and conjugate heat transfer simulations.

calculix co-simulation code-aster computer-aided-engineering conjugate-heat-transfer coupling cpp dealii fenics fluent fluid-structure-interaction high-performance-computing multi-physics multiphysics openfoam precice research-and-development simulation su2

Last synced: 14 May 2025

https://github.com/zanellia/prometeo

An experimental Python-to-C transpiler and domain-specific language for embedded high-performance computing

c compiler domain-specific-language embedded-systems high-performance-computing hpc python python-to-c source-to-source static-analysis static-typing transcompiler transpiler

Last synced: 16 May 2025

https://github.com/llnl/sundials

Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.

dae-solver high-performance-computing hpc math-physics nonlinear-equation-solver ode-solver parallel-computing radiuss scientific-computing sensitivity-analysis solver time-integration

Last synced: 15 May 2025

https://github.com/cselab/aphros

Finite volume solver for incompressible multiphase flows with surface tension. Foaming flows in complex geometries.

cfd chemical-engineering fluid high-performance-computing multiphase-flow paraview simulation surface-tension

Last synced: 14 Mar 2025

https://github.com/mpi4jax/mpi4jax

Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python

gpu high-performance-computing jax jit mpi parallel-computing xla

Last synced: 21 Oct 2025

https://github.com/mrshaw01/software-engineer

A curated learning repository focused on High-Performance Computing (HPC), covering fundamentals to advanced topics in CUDA, MPI, C++, and Python-C++ interoperability.

cpp cuda high-performance-computing hip python

Last synced: 16 Jul 2025

https://github.com/dionhaefner/pyhpc-benchmarks

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python

benchmarks cupy gpu high-performance-computing jax parallel-computing python pytorch tensorflow

Last synced: 12 Apr 2025

https://github.com/QMCPACK/qmcpack

Main repository for QMCPACK, an open-source, production-level, many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids, with fully performance-portable GPU support

c-plus-plus cuda electronic-structure gpu high-performance-computing hpc mpi oneapi quantum-chemistry quantum-monte-carlo rocm

Last synced: 26 Mar 2025

https://github.com/ornladios/adios2

Next generation of ADIOS developed in the Exascale Computing Project

adios cmake ecp exascale exascale-computing hdf5 high-performance-computing hpc io

Last synced: 21 Oct 2025

https://github.com/Xiangyu-Hu/SPHinXsys

SPHinXsys provides C++ APIs for engineering simulation and optimization. It aims at complex systems driven by fluid, structure, multi-body dynamics and beyond. The multi-physics library is based on a unique and unified computational framework by which strong coupling has been achieved for all involved physics.

computer-aided-engineering cpp finite-volume-method fluid-dynamics fluid-structure-interaction gpu high-performance-computing multi-physics multi-platforms multiphysics-coupling research-and-development smoothed-particle-hydrodynamics solid-dynamics sycl

Last synced: 04 Apr 2025

https://github.com/mratsim/laser

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers

assembler blas compiler-optimization convolution deep-learning gemm high-performance-computing jit matrix-multiplication openmp parallel runtime-cpu-detection simd tensor

Last synced: 08 Apr 2025

https://github.com/Trinkle23897/Fast-Poisson-Image-Editing

A fast Poisson image editing implementation that can use a multi-core CPU or GPU to handle high-resolution image input.

cpp cuda high-performance-computing image-processing jacobi-iteration jacobi-method mpi numpy openmp parallel-computing poisson-image-editing pybind11 python

Last synced: 02 Apr 2025

https://github.com/trinkle23897/fast-poisson-image-editing

A fast Poisson image editing implementation that can use a multi-core CPU or GPU to handle high-resolution image input.

cpp cuda high-performance-computing image-processing jacobi-iteration jacobi-method mpi numpy openmp parallel-computing poisson-image-editing pybind11 python

Last synced: 05 Apr 2025

https://github.com/sciml/nonlinearsolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.

bracketing deep-equilibrium-models differential-equations equilibrium factorization high-performance-computing julia newton-krylov newton-method newton-raphson nonlinear-equations scientific-machine-learning sciml sparse-matrices sparse-matrix steady-state

Last synced: 14 May 2025

https://github.com/SciML/NonlinearSolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.

bracketing deep-equilibrium-models differential-equations equilibrium factorization high-performance-computing julia newton-krylov newton-method newton-raphson nonlinear-equations scientific-machine-learning sciml sparse-matrices sparse-matrix steady-state

Last synced: 04 May 2025

https://github.com/df308/x9

High-performance message-passing library

high-frequency-trading high-performance-computing low-latency ultra-low-latency

Last synced: 09 Aug 2025

https://github.com/ECP-copa/Cabana

Performance-portable library for particle-based simulations

co-design exascale exascale-computing high-performance-computing hpc kokkos particles

Last synced: 28 Mar 2025

https://github.com/ceed/libceed

CEED Library: Code for Efficient Extensible Discretizations

api ceed cuda ecp exascale-computing gpu high-order high-performance-computing hpc julia linear-algebra

Last synced: 15 May 2025

https://github.com/CEED/libCEED

CEED Library: Code for Efficient Extensible Discretizations

api ceed cuda ecp exascale-computing gpu high-order high-performance-computing hpc julia linear-algebra

Last synced: 07 May 2025

https://github.com/dlr-amr/t8code

Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.

adaptive-mesh-refinement high-performance-computing hpc mesh modeling mpi parallel parallel-computing simulation

Last synced: 16 May 2025

https://github.com/projectphysx/opencl-benchmark

A small OpenCL benchmark program to measure peak GPU/CPU performance.

bandwidth benchmark benchmarking flops gpgpu gpu gpu-computing high-performance-computing hpc opencl tool tools

Last synced: 04 Apr 2025

https://github.com/tikv/minstant

Performant time measuring in Rust

high-performance high-performance-computing timing tsc

Last synced: 12 Apr 2025

https://github.com/CaNS-World/CaNS

A code for fast, massively-parallel direct numerical simulations (DNS) of canonical flows

cfd computational-fluid-dynamics fluid-dynamics fluid-simulation fortran gpu gpu-computing high-performance-computing turbulence

Last synced: 14 Mar 2025

https://github.com/p-costa/CaNS

A code for fast, massively-parallel direct numerical simulations (DNS) of canonical flows

cfd computational-fluid-dynamics fluid-dynamics fluid-simulation fortran gpu gpu-computing high-performance-computing turbulence

Last synced: 22 Feb 2025

https://github.com/hao-lh/the-books-making-you-better

A list of timeless classic books that help you understand not only how things work, but also when they work and why they work that way.

bayesian-inference computer-architecture computer-vision deep-learning high-performance-computing linear-algebra machine-learning probabilistic-graphical-models reinforcement-learning statistical-learning

Last synced: 15 Apr 2025

https://github.com/librapid/librapid

A highly optimised C++ library for mathematical applications and neural networks.

array cpp cpp20 cpp23 cuda gpu high-performance-computing library matrix multidimensional-arrays multithreading parallel-programming pypy pypy3 python python3 simd

Last synced: 08 Oct 2025

https://github.com/DLR-AMR/t8code

Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.

adaptive-mesh-refinement high-performance-computing hpc mesh modeling mpi parallel parallel-computing simulation

Last synced: 09 Sep 2025

https://github.com/LibRapid/librapid

A highly optimised C++ library for mathematical applications and neural networks.

array cpp cpp20 cpp23 cuda gpu high-performance-computing library matrix multidimensional-arrays multithreading parallel-programming pypy pypy3 python python3 simd

Last synced: 01 Aug 2025

https://github.com/mschubert/clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH

cluster high-performance-computing lsf r-package sge slurm ssh

Last synced: 15 May 2025

https://github.com/kahypar/mt-kahypar

Mt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequential partitioning algorithms. Mt-KaHyPar can partition extremely large hypergraphs very fast and with high quality.

algorithm-engineering graph-algorithms graph-partitioning graphs high-performance-computing hypergraph hypergraph-partitioning hypergraphs parallel-computing partitioning partitioning-algorithms shared-memory tbb

Last synced: 04 Apr 2025

https://github.com/pranabdas/espresso

Notes and tutorials on Density Functional Theory calculation using Quantum ESPRESSO.

density-functional-theory dft first-principles-calculations high-performance-computing hpc materials-modelling quantum-espresso tutorial wannier

Last synced: 07 May 2025