Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
gpu-guide
Graphics Processing Unit (GPU) Architecture Guide
https://github.com/mikeroyal/gpu-guide
Parallel Computing Tools, Libraries, and Frameworks
- AWS ParallelCluster - an AWS-supported open source cluster management tool that makes it easy to deploy and manage High Performance Computing (HPC) clusters on AWS. ParallelCluster uses a simple text file to model and provision all the resources needed for your HPC applications in an automated and secure manner.
- Numba - an open source, NumPy-aware optimizing compiler for Python sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. Numba can compile a large subset of numerically focused Python, including many NumPy functions. It also supports automatic parallelization of loops, generation of GPU-accelerated code, and creation of ufuncs and C callbacks.
- Chainer - a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
- cuML - a RAPIDS suite of GPU-accelerated machine learning libraries whose Python API in most cases matches that of scikit-learn.
- Apache Flume
- Slurm - an open source workload manager designed specifically to satisfy the demanding needs of high performance computing.
- Parallel Computing Toolbox™ - a MATLAB toolbox that lets you solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. High-level constructs such as parallel for-loops, special array types, and parallelized numerical algorithms enable you to parallelize MATLAB® applications without CUDA or MPI programming. The toolbox lets you use parallel-enabled functions in MATLAB and other toolboxes. You can use the toolbox with Simulink® to run multiple simulations of a model in parallel. Programs and models can run in both interactive and batch modes.
- Statistics and Machine Learning Toolbox™
- OpenMP - an API that supports multi-platform shared-memory parallel programming in C/C++ and Fortran. The OpenMP API defines a portable, scalable model with a simple and flexible interface for developing parallel applications on platforms from the desktop to the supercomputer.
- CUDA®
- Message Passing Interface (MPI) - a standardized and portable message-passing standard designed to function on parallel computing architectures.
- MATLAB Parallel Server™
Parallel Computing Learning Resources
- GPU computing in Vulkan | Udemy
- High Performance Computing Courses | Udacity
- Parallel Computing Courses | Stanford Online
- Parallel Computing with CUDA | Pluralsight
- HPC Architecture and System Design | Intel
- Parallel Computing - a type of computation in which many calculations are carried out simultaneously. Its main forms are [bit-level](https://en.wikipedia.org/wiki/Bit-level_parallelism), [instruction-level](https://en.wikipedia.org/wiki/Instruction-level_parallelism), [data](https://en.wikipedia.org/wiki/Data_parallelism), and [task parallelism](https://en.wikipedia.org/wiki/Task_parallelism).
- Accelerated Computing - Training | NVIDIA Developer
- Fundamentals of Accelerated Computing with CUDA Python Course | NVIDIA
- Top Parallel Computing Courses Online | Udemy
- Scientific Computing Masterclass: Parallel and Distributed
- Learn Parallel Computing in Python | Udemy
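The Parallel Computing entry in the list above distinguishes data parallelism from task parallelism. The following is a minimal sketch of that distinction using only Python's standard `concurrent.futures` module. The function names (`square`, `data_parallel`, `task_parallel`) and the worker counts are illustrative assumptions, not taken from any of the listed courses or tools.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Per-element work used by the data-parallel example."""
    return x * x


def data_parallel(values):
    """Data parallelism: the SAME operation is applied to every element."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(square, values))


def task_parallel(values):
    """Task parallelism: DIFFERENT operations run concurrently on the same data."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "sum": pool.submit(sum, values),
            "min": pool.submit(min, values),
            "max": pool.submit(max, values),
        }
        # result() blocks until the corresponding task has finished.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    data = list(range(8))
    print(data_parallel(data))  # [0, 1, 4, 9, 16, 25, 36, 49]
    print(task_parallel(data))  # {'sum': 28, 'min': 0, 'max': 7}
```

Threads are used here only to keep the sketch short. In standard CPython builds the global interpreter lock prevents threads from speeding up CPU-bound pure-Python code, so real workloads would more likely use `ProcessPoolExecutor` (which offers the same `map` and `submit` interface) or one of the tools listed earlier, such as Numba, OpenMP, or MPI.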