https://github.com/tlc-pack/cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
- Host: GitHub
- URL: https://github.com/tlc-pack/cutlass_fpA_intB_gemm
- Owner: tlc-pack
- License: apache-2.0
- Created: 2023-06-06T22:26:34.000Z
- Default Branch: main
- Last Pushed: 2025-02-22T02:11:00.000Z
- Last Synced: 2025-02-22T03:19:34.988Z
- Language: C++
- Size: 204 KB
- Stars: 88
- Watchers: 18
- Forks: 22
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-gemm: cutlass_fpA_intB_gemm - [`Apache License 2.0`](https://github.com/tlc-pack/cutlass_fpA_intB_gemm/blob/main/LICENSE) (Libraries / GPU Libraries)
README
CUTLASS GEMM kernels for fp16 activations (A) and int8/int4 quantized weights (B), extracted from FasterTransformer for easier integration into third-party projects. See the original code at the links below; a reference sketch of the computation follows them.
* https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/cutlass_extensions/include/cutlass_extensions
* https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm
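
The kernels implement weight-only quantized GEMM: fp16 activations are multiplied against int8/int4 weights that are dequantized on the fly inside the GEMM main loop. The following is a minimal reference sketch of that computation, assuming per-output-channel dequantization scales (typical for FasterTransformer-style weight-only quantization); floats stand in for fp16 here, and nothing below reflects the actual kernel API or implementation.

```
#include <cstdint>
#include <cstdio>
#include <vector>

// Reference semantics of a weight-only-quantized GEMM:
//   C[m][n] = sum_k A[m][k] * (B_q[k][n] * scale[n])
// A models the fp16 activation, B_q the int8 quantized weight, and
// scale a per-output-channel dequantization factor. The CUTLASS kernel
// fuses the dequantization into its main loop; this loop only
// illustrates the math.
void gemm_fpA_intB_reference(const std::vector<float>& A,     // M x K
                             const std::vector<int8_t>& B_q,  // K x N
                             const std::vector<float>& scale, // N
                             std::vector<float>& C,           // M x N
                             int M, int N, int K) {
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                // Dequantize the int8 weight on the fly.
                acc += A[m * K + k] *
                       (static_cast<float>(B_q[k * N + n]) * scale[n]);
            }
            C[m * N + n] = acc;
        }
    }
}

int main() {
    const int M = 2, N = 3, K = 4;
    std::vector<float> A(M * K, 1.0f);
    std::vector<int8_t> B_q(K * N, 2);
    std::vector<float> scale(N, 0.5f);
    std::vector<float> C(M * N, 0.0f);
    gemm_fpA_intB_reference(A, B_q, scale, C, M, N, K);
    printf("C[0][0] = %.1f\n", C[0]); // 4 * (1.0 * 2 * 0.5) = 4.0
    return 0;
}
```

Fusing dequantization into the main loop is the point of the kernel: the weight matrix is read at int8/int4 width, cutting memory traffic, while the multiply-accumulate still happens at higher precision.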
Build with

```
mkdir build && cd build
cmake ..
make
```
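
To consume the kernels from a third-party project, a CMake setup along the following lines should work. The `third_party/cutlass_fpA_intB_gemm` path and the `fpA_intB_gemm` target name are assumptions for illustration; check this repository's CMakeLists.txt for the target it actually defines.

```
# Hypothetical consumer CMakeLists.txt.
cmake_minimum_required(VERSION 3.18)
project(my_app LANGUAGES CXX CUDA)

# Assumes this repository has been cloned (e.g., as a git submodule)
# into third_party/cutlass_fpA_intB_gemm.
add_subdirectory(third_party/cutlass_fpA_intB_gemm)

add_executable(my_app main.cu)
# Link against the GEMM library; target name is an assumption.
target_link_libraries(my_app PRIVATE fpA_intB_gemm)
```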