https://github.com/jiaau/kernels

This repository showcases common optimization techniques for kernels.


# Kernels

## Focus Areas

- [reduce](./src/reduce)
  - CUDA Warp-Level Primitives
  - Parallel reduction

- [transpose](./src/transpose)
  - Memory Coalescing
  - Shared Memory
  - Bank Conflict
  - Swizzling
  - CuTe

- [sgemm](./src/sgemm)
  - Tile Size Tuning
  - Shared Memory
  - Bank Conflict
  - Double Buffer
  - Warp Divergence
  - Vectorized memory access
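
The bank-conflict and swizzling topics above can be illustrated without a GPU. The following is a minimal host-side C++ sketch, not code from this repository: it models a 32x32 `float` tile in shared memory, assumes 32 banks of 4-byte words, and counts how many lanes of a warp hit the same bank when lane `i` reads element `(i, col)`, as happens in a tiled transpose. The names `Layout`, `word_offset`, and `max_conflict_degree` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int kBanks = 32;  // assumption: 32 shared-memory banks, 4-byte words
constexpr int kTile = 32;   // tile is kTile x kTile floats

enum class Layout { Naive, Padded, Swizzled };

// Word offset of logical element (row, col) inside the shared-memory tile.
int word_offset(Layout layout, int row, int col) {
    assert(row >= 0 && row < kTile && col >= 0 && col < kTile);
    switch (layout) {
        case Layout::Naive:    return row * kTile + col;          // row-major, stride 32
        case Layout::Padded:   return row * (kTile + 1) + col;    // one padding word per row
        case Layout::Swizzled: return row * kTile + (col ^ row);  // XOR-permute columns per row
    }
    return -1;  // unreachable
}

// Worst-case number of lanes that map to one bank when lane i reads (i, col).
int max_conflict_degree(Layout layout) {
    int worst = 0;
    for (int col = 0; col < kTile; ++col) {
        std::array<int, kBanks> hits{};
        for (int lane = 0; lane < kTile; ++lane) {
            ++hits[word_offset(layout, lane, col) % kBanks];
        }
        worst = std::max(worst, *std::max_element(hits.begin(), hits.end()));
    }
    return worst;
}
```

In this model the naive layout gives a 32-way conflict on column reads, while both padding and XOR swizzling bring the degree down to 1. Swizzling achieves this without the extra shared memory that padding consumes.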

## Build and Run

### Build the project

```bash
make build
make install
```

### Run the tests

```bash
make run
```

### Profile with NVIDIA Nsight Compute (`ncu`)

```bash
make ncu
```

### Clean build files

```bash
make clean
```

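## Command-line Options

The SGEMM test supports the following options:

- `--bench`: enable benchmark mode
- `--times N`: number of benchmark iterations (default: 3)
- `--help`: show the help message

For example: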

```bash
make run -- --bench --times 10
```
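
To make the option semantics concrete, here is a minimal sketch of how the three flags could be parsed in C++. It is an illustration only and is not taken from the repository: the `Options` struct, its field names, and `parse_options` are hypothetical, and the only facts it relies on are the flag names and the default of 3 iterations listed above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical options struct; field names are illustrative.
struct Options {
    bool bench = false;  // --bench
    int times = 3;       // --times N (default: 3)
    bool help = false;   // --help
    bool ok = true;      // false if an argument was unknown or malformed
};

// Parses the arguments that follow the program name.
Options parse_options(const std::vector<std::string>& args) {
    Options opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg == "--bench") {
            opts.bench = true;
        } else if (arg == "--help") {
            opts.help = true;
        } else if (arg == "--times" && i + 1 < args.size()) {
            const char* text = args[i + 1].c_str();
            char* end = nullptr;
            const long n = std::strtol(text, &end, 10);
            // Reject empty input, trailing garbage, and non-positive counts.
            if (end == text || *end != '\0' || n <= 0) {
                opts.ok = false;
                break;
            }
            opts.times = static_cast<int>(n);
            ++i;  // skip the value that was just consumed
        } else {
            opts.ok = false;  // unknown flag, or --times without a value
            break;
        }
    }
    return opts;
}
```

With the arguments from the example above (`--bench --times 10`), this sketch would set `bench` to `true` and `times` to `10`; with no arguments, `times` stays at the default of 3.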

## Acknowledgments

- This project uses some utility functions from [Chtholly-Boss/swizzle](https://github.com/Chtholly-Boss/swizzle).