https://github.com/coffeevampir3/hyper-amx

Repo for AMX + FAST
amx avx512 inference inference-engine matmul numa-aware quantization tensor tensor-parallelism

Host: GitHub
URL: https://github.com/coffeevampir3/hyper-amx
Owner: CoffeeVampir3
Created: 2025-10-24T17:43:15.000Z (8 months ago)
Default Branch: master
Last Pushed: 2025-11-01T06:53:47.000Z (8 months ago)
Last Synced: 2025-11-01T08:31:23.855Z (8 months ago)
Topics: amx, avx512, inference, inference-engine, matmul, numa-aware, quantization, tensor, tensor-parallelism
Language: C++
Homepage:
Size: 674 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

### Actively in Development
This project is a work in progress and is not feature complete. The features listed below exist today, but substantial work remains before it can run inference.

### Main points:
- Modern C++ (C++23)
- Modules
- No external dependencies
- Megatron Tensor-Parallel Row/Col interleaving
- NUMA Awareness
- Targets AVX-512 + AMX hardware exclusively
- Pure AMX GEMM
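
To illustrate the Megatron-style row/column split named above, here is a minimal scalar sketch (no AMX or AVX-512 intrinsics, no NUMA placement). It shows only the general technique: the first weight matrix is split by columns, the second by rows, and the per-shard partial results are summed. The `Matrix` type and every function name below are hypothetical helpers for illustration and are not taken from this repository, whose interleaving scheme may differ.

```cpp
#include <cstddef>
#include <vector>

// Row-major dense matrix. Hypothetical helper type, not part of hyper-amx.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Plain reference GEMM: out = a * b.
Matrix matmul(const Matrix& a, const Matrix& b) {
    Matrix out(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k)
            for (std::size_t j = 0; j < b.cols; ++j)
                out.at(i, j) += a.at(i, k) * b.at(k, j);
    return out;
}

// Copy columns [begin, end) of m.
Matrix col_slice(const Matrix& m, std::size_t begin, std::size_t end) {
    Matrix out(m.rows, end - begin);
    for (std::size_t i = 0; i < m.rows; ++i)
        for (std::size_t j = begin; j < end; ++j)
            out.at(i, j - begin) = m.at(i, j);
    return out;
}

// Copy rows [begin, end) of m.
Matrix row_slice(const Matrix& m, std::size_t begin, std::size_t end) {
    Matrix out(end - begin, m.cols);
    for (std::size_t i = begin; i < end; ++i)
        for (std::size_t j = 0; j < m.cols; ++j)
            out.at(i - begin, j) = m.at(i, j);
    return out;
}

// Computes (x * a) * b with `a` sharded by columns and `b` sharded by rows.
// Each shard needs only its own slices; the final sum stands in for the
// all-reduce across workers. Assumes a.cols is divisible by `shards`.
Matrix tensor_parallel_mlp(const Matrix& x, const Matrix& a, const Matrix& b,
                           std::size_t shards) {
    Matrix out(x.rows, b.cols);
    const std::size_t width = a.cols / shards;
    for (std::size_t s = 0; s < shards; ++s) {
        const Matrix a_s = col_slice(a, s * width, (s + 1) * width);
        const Matrix b_s = row_slice(b, s * width, (s + 1) * width);
        const Matrix partial = matmul(matmul(x, a_s), b_s);
        for (std::size_t i = 0; i < out.data.size(); ++i)
            out.data[i] += partial.data[i];
    }
    return out;
}
```

Pairing a column split with a row split means the intermediate activation never has to be gathered between the two GEMMs; only the final partial sums are combined.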

### AMXQ
- AMXQ (Grouped asymmetric mean-centered quantization)
- Fused AMXQ AMX GEMM (Reduces bandwidth pressure by shrinking the accumulator)
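
The README does not specify the AMXQ format, so the following is only one plausible reading of "grouped asymmetric mean-centered quantization", written as a scalar sketch. It assumes each group stores its mean and a single scale, and that values are quantized to `int8` after subtracting the mean. The struct, function names, group size, and bit width are all assumptions made for illustration; the repository's actual layout may differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// One quantized group: q[i] * scale + mean approximates the original value.
// Hypothetical layout, assumed for illustration only.
struct QuantGroup {
    float mean;
    float scale;
    std::vector<std::int8_t> q;
};

// Quantize n values: subtract the group mean, then scale the largest
// deviation to 127 and round to the nearest integer.
QuantGroup quantize_group(const float* x, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += x[i];
    const float mean = sum / static_cast<float>(n);

    float max_dev = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        max_dev = std::max(max_dev, std::fabs(x[i] - mean));
    // A constant group has no deviation; any nonzero scale works.
    const float scale = max_dev > 0.0f ? max_dev / 127.0f : 1.0f;

    QuantGroup g{mean, scale, std::vector<std::int8_t>(n)};
    for (std::size_t i = 0; i < n; ++i) {
        const long r = std::lround((x[i] - mean) / scale);
        g.q[i] = static_cast<std::int8_t>(std::clamp(r, -127L, 127L));
    }
    return g;
}

// Split x into consecutive groups of group_size (the last may be shorter).
std::vector<QuantGroup> quantize(const std::vector<float>& x, std::size_t group_size) {
    std::vector<QuantGroup> groups;
    for (std::size_t begin = 0; begin < x.size(); begin += group_size) {
        const std::size_t n = std::min(group_size, x.size() - begin);
        groups.push_back(quantize_group(x.data() + begin, n));
    }
    return groups;
}

// Reconstruct approximate floats from the quantized groups.
std::vector<float> dequantize(const std::vector<QuantGroup>& groups) {
    std::vector<float> out;
    for (const QuantGroup& g : groups)
        for (std::int8_t v : g.q)
            out.push_back(static_cast<float>(v) * g.scale + g.mean);
    return out;
}
```

Under these assumptions, centering on the mean lets the `int8` range cover only the spread within a group rather than its absolute magnitude, so a group whose values cluster far from zero loses less precision than it would with a zero-centered scale.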
