An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with gemv

A curated list of projects in awesome lists tagged with gemv .

https://github.com/xlite-dev/cuda-learn-notes

📚Modern CUDA Learn Notes: 200+ Tensor/CUDA Cores Kernels🎉, HGEMM, FA2 via MMA and CuTe, 98~100% TFLOPS of cuBLAS/FA2.

cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm

Last synced: 15 Apr 2025

https://github.com/xlite-dev/CUDA-Learn-Notes

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm

Last synced: 26 Mar 2025

https://github.com/DefTruth/CUDA-Learn-Notes

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm

Last synced: 20 Mar 2025

https://github.com/Bruce-Lee-LY/cuda_hgemv

Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.

cublas cuda cuda-core gemm gemv gpu hgemm hgemv matrix-multiply nvidia tensor-core

Last synced: 14 May 2025

https://github.com/bruce-lee-ly/cuda_hgemv

Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.

cublas cuda cuda-core gemm gemv gpu hgemm hgemv matrix-multiply nvidia tensor-core

Last synced: 19 Dec 2024