An open API service indexing awesome lists of open source software.

https://github.com/ericlbuehler/simd_matmul

O(n^2) matmul with SIMD.
https://github.com/ericlbuehler/simd_matmul

Last synced: 12 months ago
JSON representation

O(n^2) matmul with SIMD.

Awesome Lists containing this project

README

          

# SIMD Matmul

This is an optimization of the naive matmul algorithm which uses SIMD.

## Asymptotic complexity analysis
It is $O(n^2)$. For $A = (m,n)$, $B = (n,p)$ and $A*B = C = (m,p)$, the exact running time for SIMD matmul including addition and multiplication operations is $mp(1+n-1) = mpn$ while the running time for Naive matmul is $mp(2n-1) = 2mpn-mp$. Therefore, the precise ratio is $$\frac{n}{2n-1}$$

However, when calculating the Big-O complexity we ignore addition operations and as such the running time is $mp$ or $O(n^2)$.

## Mathematical Formulation
For $A = (m,n)$, $B = (n,p)$, I calculate $B^T$. This results in the inputs to the algorithm, $A = (m,n)$, $B = (p,n)$. I note that the transpose algorithm is also $O(n^2)$.
The output $C' = (m,p)$ and is equivalent to $C = A*B$.

## Advantages
- Far lower theoretical Big-O
- Header-only library

## Disadvantages
- $n$ is constrained by SIMD lanes
- Requires transpose or store matrices in transposed form
- Requires conversion of matrix rows to SIMD vectors