https://github.com/statusfailed/vibekernels

LLM-written matmul kernels

Host: GitHub
URL: https://github.com/statusfailed/vibekernels
Owner: statusfailed
Created: 2025-12-10T16:16:25.000Z (6 months ago)
Default Branch: master
Last Pushed: 2025-12-11T13:42:43.000Z (6 months ago)
Last Synced: 2025-12-12T16:40:18.872Z (6 months ago)
Language: C++
Size: 25.4 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Vibe Kernels

Fast CPU matrix multiplication kernels built by LLMs.

See PROMPT.md for LLM instructions.

## Kernels

**Reference**: OpenBLAS single-threaded SGEMM ~147.07 GFLOPS

| Kernel                                  | Technique                 | GFLOPS  | vs Reference |
|-----------------------------------------|---------------------------|---------|--------------|
| blocked                                 | Cache blocking            | ~4.96   | 3.4%         |
| blocked_avx2                            | AVX2 SIMD                 | ~25.66  | 17.4%        |
| blocked_avx2_microkernel4x4             | 4×8 microkernel + packing | ~83.40  | 56.7%        |
| blocked_avx512_microkernel4x16          | AVX512 16-wide            | ~162.31 | 110.4%       |
| blocked_avx512_microkernel4x16_prefetch | Memory prefetching        | ~154.52 | 105.1%       |
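
To illustrate the technique named in the first row, here is a minimal sketch of cache blocking. It is not the repository's `blocked` kernel: the function name `matmul_blocked`, the row-major layout, and the default block size of 64 are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not taken from kernels/).
// Computes C += A * B for row-major matrices:
//   A is M x K, B is K x N, C is M x N.
// The three outer loops walk over block x block tiles so that the data
// touched by the inner loops is more likely to stay in cache.
void matmul_blocked(const std::vector<float>& A, const std::vector<float>& B,
                    std::vector<float>& C, std::size_t M, std::size_t N,
                    std::size_t K, std::size_t block = 64) {
    assert(block > 0);
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);

    for (std::size_t i0 = 0; i0 < M; i0 += block) {
        const std::size_t i_end = std::min(i0 + block, M);
        for (std::size_t k0 = 0; k0 < K; k0 += block) {
            const std::size_t k_end = std::min(k0 + block, K);
            for (std::size_t j0 = 0; j0 < N; j0 += block) {
                const std::size_t j_end = std::min(j0 + block, N);
                // Multiply one tile of A by one tile of B into a tile of C.
                for (std::size_t i = i0; i < i_end; ++i) {
                    for (std::size_t k = k0; k < k_end; ++k) {
                        const float a = A[i * K + k];
                        for (std::size_t j = j0; j < j_end; ++j) {
                            C[i * N + j] += a * B[k * N + j];
                        }
                    }
                }
            }
        }
    }
}
```

The caller is expected to zero-initialize `C` before the call. The best block size depends on the cache sizes of the target CPU, so 64 is only a placeholder.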

## Usage

```bash
nix develop
make clean && make

./run list      # List kernels
./run test      # Test correctness
./run bench     # Benchmark performance
./run compare   # Compare kernels
```
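
For reading the benchmark numbers, the sketch below shows how a GFLOPS figure is conventionally derived for a matrix multiply, counting 2·M·N·K floating-point operations (one multiply and one add per inner-product term). The helper names `gemm_gflops` and `percent_of_reference` are hypothetical, and this README does not state that the harness computes its numbers this way.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: GFLOPS for an M x K by K x N multiply that took
// `seconds` of wall-clock time, assuming the usual 2*M*N*K operation count.
double gemm_gflops(std::size_t M, std::size_t N, std::size_t K, double seconds) {
    assert(seconds > 0.0);
    const double flops = 2.0 * static_cast<double>(M) *
                         static_cast<double>(N) * static_cast<double>(K);
    return flops / seconds / 1e9;
}

// Hypothetical helper: a kernel's throughput as a percentage of a reference,
// matching the meaning of the "vs Reference" column.
double percent_of_reference(double kernel_gflops, double reference_gflops) {
    assert(reference_gflops > 0.0);
    return 100.0 * kernel_gflops / reference_gflops;
}
```

For example, `percent_of_reference(162.31, 147.07)` gives roughly 110.4, which agrees with the `blocked_avx512_microkernel4x16` row in the table.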

## Architecture

- `kernels/` - Kernel implementations
- `*_harness.hpp` - Test/benchmark/tuning frameworks
- `main.cpp` - Command-line interface

Each kernel adds one optimization technique, so the performance effect of each step can be measured systematically.
