Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/david-palma/cuda-programming
Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.
- Host: GitHub
- URL: https://github.com/david-palma/cuda-programming
- Owner: david-palma
- License: mit
- Created: 2019-05-19T15:15:01.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-11-15T16:17:19.000Z (2 months ago)
- Last Synced: 2024-11-15T17:24:25.766Z (2 months ago)
- Topics: c-cpp, cpp, cuda, cuda-toolkit, education, gpu, gpu-programming, kernel, matrix-operations, nvcc, nvidia, parallel-computing, parallel-programming, practice, profiling, threads
- Language: Cuda
- Homepage:
- Size: 23.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# CUDA C/C++ programming
This repository provides open-source educational resources on CUDA C/C++ programming, the C/C++ interface to the CUDA parallel computing platform.
In CUDA, the host refers to the CPU and its memory, while the device refers to the GPU and its memory.
Code running on the host can manage memory on both the host and the device, and it launches kernels, which are functions executed on the device by many GPU threads in parallel.

**NOTE**: it is assumed that you have access to a computer with a CUDA-enabled NVIDIA GPU.
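As a minimal sketch of this host/device split (a hypothetical stand-alone example, not one of the repository's exercises):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: runs on the device, one thread per element.
__global__ void add(int n, const float *a, const float *b, float *c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Host allocations and initialisation.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device allocations: host code manages device memory too.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch the kernel: 4 blocks of 256 threads cover all 1024 elements.
    add<<<4, 256>>>(n, d_a, d_b, d_c);

    // Copy the result back; cudaMemcpy synchronises with the kernel.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

The `<<<blocks, threads>>>` execution configuration is the part with no C/C++ counterpart: it tells the runtime how many parallel instances of the kernel to launch, a pattern the vector-addition exercises below explore in detail.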
## List of the exercises
Here you can find the solutions for different simple exercises about GPU programming in CUDA C/C++.
The source code is well commented and easy to follow, though a minimum knowledge of parallel architectures is recommended.

- [exercise 00](./exercises/ex00.cu): hello, world!
- [exercise 01](./exercises/ex01.cu): print devices properties
- [exercise 02](./exercises/ex02.cu): addition
- [exercise 03](./exercises/ex03.cu): vector addition using parallel blocks
- [exercise 04](./exercises/ex04.cu): vector addition using parallel threads
- [exercise 05](./exercises/ex05.cu): vector addition combining blocks and threads
- [exercise 06](./exercises/ex06.cu): single-precision A\*X Plus Y
- [exercise 07](./exercises/ex07.cu): time, bandwidth, and throughput computation (single-precision A\*X Plus Y)
- [exercise 08](./exercises/ex08.cu): multiplication of square matrices
- [exercise 09](./exercises/ex09.cu): transpose of a square matrix
- [exercise 10](./exercises/ex10.cu): dot product using shared memory
- [exercise 11](./exercises/ex11.cu): prefix sum (exclusive scan) using shared memory

## Compiling and running the code
The CUDA C/C++ compiler `nvcc`, part of the NVIDIA CUDA Toolkit, separates the source code into host and device components, compiling each appropriately, so a single `nvcc` invocation is enough to build an executable.
**NOTE**: to find out how long a kernel takes to run, or to check for memory access errors, you can run the compiled executable under `nvprof` or `cuda-memcheck`, respectively.
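A typical session might look like the following (the exercise chosen is illustrative; note also that `nvprof` and `cuda-memcheck` ship with older CUDA Toolkits, while newer releases replace them with Nsight Systems and `compute-sanitizer`):

```sh
# Compile one of the exercises with nvcc.
nvcc exercises/ex05.cu -o ex05

# Run the resulting executable.
./ex05

# Time the kernels (legacy profiler on older toolkits).
nvprof ./ex05

# Check for out-of-bounds or misaligned memory accesses.
cuda-memcheck ./ex05
```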
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.