https://github.com/jeng1220/openacc_fortran_examples
Simple OpenACC Fortran Examples
- Host: GitHub
- URL: https://github.com/jeng1220/openacc_fortran_examples
- Owner: jeng1220
- License: apache-2.0
- Created: 2020-02-20T08:06:51.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-08-01T07:03:50.000Z (over 3 years ago)
- Last Synced: 2024-10-14T20:47:46.775Z (3 months ago)
- Topics: cuda, fortran, openacc
- Language: Fortran
- Homepage:
- Size: 119 KB
- Stars: 52
- Watchers: 6
- Forks: 10
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Simple OpenACC Fortran Examples #
Author: Jeng Bai-Cheng ([email protected])

A code example is worth a thousand words. This repository hosts fundamental but useful examples. Each example is just a few dozen lines of code. Most of them come from my past experience in HPC projects, but readers do not need an HPC background to understand them.

## Examples ##
### Basic ###
* [acc_async](/acc_async) - faster way to enqueue GPU routines (kernels)
* [access_efficiency](/access_efficiency) - faster way to access a GPU array
* [alternative_nested_parallelism](/alternative_nested_parallelism) - alternative to nested parallelism on the GPU
* [array_of_pointers](/array_of_pointers) - array of pointers usage
* [array_setting](/array_setting) - faster way to initialize a GPU array
* [atomic_op](/atomic_op) - use atomic operation to maximize parallelism
* [auto_depend](/auto_depend) - automatic dependency solution
* [cuda_c_binding](/cuda_c_binding) - call CUDA C from Fortran
* [cuda_graph](/cuda_graph) - faster way to launch GPU kernels
* [device_routine](/device_routine) - usage of GPU routine. Call other routines in the GPU kernel
* [device_variable](/device_variable) - usage of GPU variable. Access a global variable from other modules in the GPU kernel
* [do_concurrent](/do_concurrent) - DO CONCURRENT construct (Fortran 2008)
* [hybrid_omp_acc](/hybrid_omp_acc) - usage of OpenMP and OpenACC

### MPI ###
* [cuda_mpi_sendrecv](/cuda_mpi_sendrecv) - CUDA-Aware MPI, faster way to use MPI on GPU
* [cuda_unified_memory_mpi_bcast](/cuda_unified_memory_mpi_bcast) - usage of CUDA Unified Memory and MPI, more convenient way to use MPI on GPU
* [nccl_alltoall](/nccl_alltoall) - faster Alltoall on GPU
* [nccl_alltoallv](/nccl_alltoallv) - faster Alltoallv on GPU

### Profiling ###
* [auto_nvtx](/auto_nvtx) - use compiler to insert CPU profiling routines automatically
* [profiling_range](/profiling_range) - demonstration of focused profiling via a profiling tool

## Requirement ##
* NVIDIA HPC SDK 21.3

To install the HPC SDK via Docker, visit NVIDIA GPU Cloud: https://ngc.nvidia.com/catalog/containers/nvidia:nvhpc/tags

Or download the HPC SDK from the official website: https://developer.nvidia.com/hpc-sdk

## Build ##
```sh
$ cd
$ make
```

## Run ##
```sh
$ cd
$ ./
```
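
To try the build-and-run flow before cloning the repository, here is a minimal stand-alone sketch in the spirit of the `atomic_op` entry listed above. It is not taken from the repository: the program name, the array sizes, and the file name `atomic_sketch.f90` are assumptions made for illustration.

```fortran
! Hypothetical stand-alone sketch; not code from this repository.
program atomic_sketch
  implicit none
  integer, parameter :: n = 1000, nbins = 4
  integer :: i, b
  integer :: hist(nbins)

  hist = 0

  ! Many iterations increment the same bin, so the update must be atomic
  ! when the loop runs in parallel on the GPU.
  !$acc parallel loop copy(hist) private(b)
  do i = 1, n
    b = mod(i, nbins) + 1
    !$acc atomic update
    hist(b) = hist(b) + 1
  end do

  ! Every bin should hold n / nbins = 250 counts.
  print '(4(i0,1x))', hist
  if (any(hist /= n / nbins)) error stop "unexpected bin count"
end program atomic_sketch
```

With the NVIDIA HPC SDK, something like `nvfortran -acc atomic_sketch.f90 -o atomic_sketch` should build it for the GPU. Without `-acc`, the directives are treated as comments and the loop runs serially on the CPU, which makes the sketch easy to check on any Fortran compiler.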