https://github.com/xlite-dev/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
- Host: GitHub
- URL: https://github.com/xlite-dev/CUDA-Learn-Notes
- Owner: xlite-dev
- License: gpl-3.0
- Created: 2022-12-17T08:19:52.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-22T15:59:13.000Z (about 1 month ago)
- Last Synced: 2025-03-24T15:01:55.614Z (29 days ago)
- Topics: cuda, cuda-kernels, cuda-programming, cuda-toolkit, cudnn, cutlass, flash-attention, flash-mla, gemm, gemv, hgemm
- Language: Cuda
- Homepage:
- Size: 221 MB
- Stars: 2,973
- Watchers: 22
- Forks: 309
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE