https://github.com/jundaf2/cutlass-kernel-volta-gemm
volta fp16 gemm kernel
- Host: GitHub
- URL: https://github.com/jundaf2/cutlass-kernel-volta-gemm
- Owner: jundaf2
- Created: 2023-09-04T03:44:58.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-01T10:13:35.000Z (over 1 year ago)
- Last Synced: 2024-11-15T13:33:06.074Z (2 months ago)
- Language: Cuda
- Homepage:
- Size: 161 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
# cutlass-kernel-volta-gemm
This is a CUTLASS-based kernel-level GEMM for the Volta architecture.

## Dependencies
- pytorch
- pytest

## Notes
Tiling is done in two levels: first lay out the thread-block tiles, then lay out the warp tiles inside each block tile.
- Block
  - BM = 64
  - BN = 64
- Warp
  - WM = 16
  - WN = 16

`NUM_WARPS = (BM × BN) / (WM × WN) = (64 × 64) / (16 × 16) = 16`

`NUM_THREADS_PER_CTA = WARP_SIZE × NUM_WARPS = 32 × 16 = 512`
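
The arithmetic above can be checked at compile time. The following is a minimal sketch, not code taken from this repository: the tile sizes and `WARP_SIZE` come from the notes, while `WARPS_M`, `WARPS_N`, `ceil_div`, `WarpCoord`, and `warp_tile_origin` are hypothetical names, and the row-major assignment of warps to warp tiles is an assumption made only for illustration.

```cpp
// Tile sizes from the notes above.
constexpr int BM = 64;         // block tile rows
constexpr int BN = 64;         // block tile columns
constexpr int WM = 16;         // warp tile rows
constexpr int WN = 16;         // warp tile columns
constexpr int WARP_SIZE = 32;  // threads per warp on NVIDIA GPUs

// The block tile must be evenly divisible into warp tiles.
static_assert(BM % WM == 0 && BN % WN == 0,
              "block tile must be a multiple of the warp tile");

// Hypothetical names: warp tiles along each dimension of a block tile.
constexpr int WARPS_M = BM / WM;  // 4
constexpr int WARPS_N = BN / WN;  // 4

constexpr int NUM_WARPS = WARPS_M * WARPS_N;                // (BM*BN)/(WM*WN)
constexpr int NUM_THREADS_PER_CTA = WARP_SIZE * NUM_WARPS;  // 32 * 16

static_assert(NUM_WARPS == 16, "expected 16 warps per CTA");
static_assert(NUM_THREADS_PER_CTA == 512, "expected 512 threads per CTA");
static_assert(NUM_THREADS_PER_CTA <= 1024,
              "CUDA allows at most 1024 threads per block");

// Hypothetical helper: number of block tiles needed to cover `extent`.
constexpr int ceil_div(int extent, int tile) {
    return (extent + tile - 1) / tile;
}

// Hypothetical helper: top-left corner of a warp's tile inside the block
// tile, assuming warps are assigned to warp tiles in row-major order.
struct WarpCoord {
    int row;
    int col;
};

constexpr WarpCoord warp_tile_origin(int warp_id) {
    return WarpCoord{(warp_id / WARPS_N) * WM, (warp_id % WARPS_N) * WN};
}
```

Under these assumptions, an M × N output would be covered by a grid of `ceil_div(M, BM)` by `ceil_div(N, BN)` thread blocks, each launched with `NUM_THREADS_PER_CTA` threads. The kernel in this repository may map warps to tiles differently.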