https://github.com/zalo/matmul_cuda
A simple learning example for CUDA
- Host: GitHub
- URL: https://github.com/zalo/matmul_cuda
- Owner: zalo
- License: mit
- Created: 2022-08-04T03:31:38.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-08-19T19:00:44.000Z (almost 4 years ago)
- Last Synced: 2025-01-14T23:40:12.266Z (over 1 year ago)
- Topics: cuda
- Language: Cuda
- Homepage:
- Size: 55.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# matmul_cuda
A simple learning example for CUDA.
The program takes in a `.bin` file containing a series of matrices, each stored as `int32 rows, int32 cols, int64_t* matrixContents`, and multiplies them together as quickly as possible. It saves the resulting matrix as `output.bin` in the same format.
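On Windows, this application compiles with the following command; the `-ccbin` path must point to the MSVC host compiler of your local Visual Studio installation: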
```
nvcc -o matmul_cuda_v2 matmul_cuda.cu -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64" -O 3 -arch=all --extra-device-vectorization
```
On Linux, this application compiles with:
```
nvcc -o matmul_cuda_v2 matmul_cuda.cu -O 3 -arch=all --extra-device-vectorization
```