https://github.com/rmeli/cuda-pg

CUDA C++ Playground
https://github.com/rmeli/cuda-pg

cpp cuda gpu

Last synced: 2 months ago
JSON representation

CUDA C++ Playground

Host: GitHub
URL: https://github.com/rmeli/cuda-pg
Owner: RMeli
Created: 2019-05-24T15:33:22.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2022-07-22T09:10:00.000Z (almost 4 years ago)
Last Synced: 2025-10-05T08:39:43.248Z (9 months ago)
Topics: cpp, cuda, gpu
Language: Cuda
Homepage:
Size: 318 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSES/README.md

Awesome Lists containing this project

README

          # CUDA Playground

Playground for experimentation with CUDA C++. Most of the code in this repository is an adaptation of the code associated to [CUDA by Example - An Introduction to General Purpose GPU Programming](https://developer.nvidia.com/cuda-example) (see `LICENSES`).

## CUDA Concepts

* `01_info`: [device information](01_info/README.md#cuda-concepts)

* `02_axpy`: [kernel functions](02_axpy/README.md#kernel-function), [kernel calls](02_axpy/README.md#kernel-call), [blocks](02_axpy/README.md#blocks), [host/device memory](02_axpy/README.md#device-memory)

* `03_mandelbrot`: [device functions](03_mandelbrot/README.md#device-functions), [grid of blocks](03_mandelbrot/README.md#grid-of-blocks)

* `04_axpy`: [threads](04_axpy/README.md#threads), [blocks and threads](04_axpy/README.md#blocks-and-threads)

* `05_dot`: [shared memory](05_dot/README.md#shared-memory), [thread synchronization](05_dot/README.md#thread-synchronization)

* `06_raytracer`: [constant memory](06_raytracer#constant-memory), [warps](06_raytracer#warps), [host/device functions](06_raytracer#hostdevice-functions), [GPU timing and events](06_raytracer#gpu-timing)

* `07_hist`: [atomic operations](07_hist#atomic-operations), [set device memory](07_hist#set-device-memory)

* `08_mem`: [page-locked memory](08_mem#page-locked-memory)

* `09_memasync`: [CUDA streams](09_memasync#cuda-streams), [asyncronous memory copy](09_memasync#asyncronous-memory-copy)

### Timings

CPU timings are measured for the kernel execution. GPU timings are measured for device memory allocation, copy of data from host to device, kernel execution, and copy of data from device to host (unless such timings are clearly disaggregated).

## Build Experiments

```bash

mkdir build        # create build directory

cd build           # change to build directory

cmake ..           # generato Makefile

make               # compile project

```

### Singularity

Run [Singularity](https://singularity.hpcng.org/) container with CUDA and [CMake](https://cmake.org/) interactively:

```bash

singularity shell --nv singularity/.sif

```

## References

* [CSCS HPC Summer School 2018](https://github.com/eth-cscs/SummerSchool2018)

* [CUDA by Example - An Introduction to General Purpose GPU Programming](https://developer.nvidia.com/cuda-example)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/rmeli/cuda-pg

Awesome Lists containing this project

README