https://github.com/chrisdalvit/gpu-matrix-transpose
Implementation and benchmarking of different matrix transpose with CUDA
c cpp cuda cuda-kernels cuda-programming gpu-acceleration gpu-computing gpu-programming matrix-transpose nvidia-gpu
- Host: GitHub
- URL: https://github.com/chrisdalvit/gpu-matrix-transpose
- Owner: chrisdalvit
- Created: 2024-12-19T17:53:06.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-12-19T19:22:03.000Z (about 2 months ago)
- Last Synced: 2024-12-19T20:24:28.035Z (about 2 months ago)
- Topics: c, cpp, cuda, cuda-kernels, cuda-programming, gpu-acceleration, gpu-computing, gpu-programming, matrix-transpose, nvidia-gpu
- Language: C++
- Homepage: https://chrisdalvit.github.io/gpu-matrix-transpose
- Size: 193 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GPU Matrix Transpose
This repository implements and benchmarks different matrix transpose algorithms for GPUs using CUDA. For background and results, see the [corresponding blog post](https://chrisdalvit.github.io/gpu-matrix-transpose).
## Repository structure
The ```data``` folder contains the original benchmark data from the tested architectures that was used in the experimental analyses.
The ```lib``` folder contains C functions used by all tested algorithms, together with the corresponding header file.
The ```src``` folder contains C files with the different algorithms for matrix transposition.
## Setup project
After cloning the repository you can run
```
make
```
This compiles the files in ```src```, runs the benchmarks, and stores the results in the ```stats``` folder (which the Makefile creates). __Warning: The benchmarks can take a long time to finish!__
If you only want to compile the files in ```src```, create a folder ```bin``` and run
```
make compile_c
```
This should compile all C files in ```src``` and store the binaries in the ```bin``` folder without starting the benchmarks. Compiled binaries follow the naming convention of ```-```.
To run the experiments on the Marzola cluster, run
```
make marzola
```
The script compiles the source code and launches SLURM jobs on the cluster. Results are stored in the ```stats/``` folder.
## Validate implementations
The correctness of the provided implementations can be verified by running the compiled binaries in 'debug mode'. After compilation you can run
```
./bin/ --debug
```
For example
```
./bin/naive-0 2 --debug
```
This should output a randomly initialized matrix with dimension 2^2 and the corresponding transposed matrix. Additionally, the execution time and the effective bandwidth are displayed.
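To check a transposed matrix or a reported bandwidth figure by hand, the following is a minimal CPU-side sketch in plain C. It is not taken from the repository: the function names are hypothetical, and it assumes a square, row-major `float` matrix with side length `n`. The bandwidth helper uses the common definition (bytes read plus bytes written, divided by elapsed time); the README does not state which formula the binaries use.

```c
#include <stddef.h>

/* Reference transpose on the CPU for a row-major n x n matrix:
 * element (r, c) of `in` ends up at (c, r) of `out`. */
void transpose_reference(const float *in, float *out, size_t n) {
    for (size_t r = 0; r < n; ++r) {
        for (size_t c = 0; c < n; ++c) {
            out[c * n + r] = in[r * n + c];
        }
    }
}

/* Returns 1 if `candidate` is exactly the transpose of `in`, 0 otherwise.
 * Exact comparison is fine here because a transpose only moves values. */
int is_transpose(const float *in, const float *candidate, size_t n) {
    for (size_t r = 0; r < n; ++r) {
        for (size_t c = 0; c < n; ++c) {
            if (candidate[c * n + r] != in[r * n + c]) {
                return 0;
            }
        }
    }
    return 1;
}

/* Effective bandwidth in GB/s (1 GB = 1e9 bytes), assuming every element
 * is read once and written once during the transpose. */
double effective_bandwidth_gbs(size_t n, size_t elem_size, double seconds) {
    double bytes_moved = 2.0 * (double)n * (double)n * (double)elem_size;
    return bytes_moved / seconds / 1e9;
}
```

Under these assumptions, a 1024 x 1024 `float` matrix transposed in 1 ms moves 2 * 1024 * 1024 * 4 bytes, which works out to roughly 8.39 GB/s.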