https://github.com/dssgabriel/arm-sve-benchmarks

Performance comparison between small hand-written SVE kernels and compiler-generated ones.

arm64 assembly benchmarks compiler simd sve

Last synced: about 2 months ago

Host: GitHub
URL: https://github.com/dssgabriel/arm-sve-benchmarks
Owner: dssgabriel
License: mit
Created: 2022-05-24T19:01:07.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2022-06-17T18:19:51.000Z (about 4 years ago)
Last Synced: 2025-07-07T18:52:43.734Z (12 months ago)
Topics: arm64, assembly, benchmarks, compiler, simd, sve
Language: C
Homepage:
Size: 52.7 KB
Stars: 10
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# ARM SVE Benchmarks

This repository gathers some small kernels to benchmark the performance of hand-written SVE code against compiler-generated code.
The implemented kernels are:
- Initialization (store);
- Copy (load, store);
- Reduction (load, add);
- Dot product (load, load, mul, add);
- DAXPY (load, load, load, mul, add, store);
- Vector sum (load, load, load, add, store);
- Vector scale (load, load, load, mul, store).
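For orientation, here is a minimal sketch of what the plain C side of three of these kernels might look like, i.e. the kind of loop a compiler would be left to vectorize. The function names (`reduc_ref`, `dotprod_ref`, `daxpy_ref`) and their signatures are hypothetical and are not taken from the repository; in particular, passing the DAXPY scalar by value is an assumption.

```c
#include <stddef.h>

/* Reduction: one load and one add per element. */
double reduc_ref(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Dot product: two loads, one multiply and one add per element. */
double dotprod_ref(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* DAXPY: y = alpha * x + y, updated in place.
 * Assumption: alpha is a scalar passed by value. */
void daxpy_ref(double alpha, const double *x, double *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

Scalar loops like these are also a convenient reference for validating the output of a hand-written SVE version.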

## Usage
*Note:* the provided Makefile uses the `armclang` compiler, but both `clang` and `gcc` have been tested and can be used as well.
Keep in mind that the architecture-specific flags (`AFLAGS`) might need to be changed depending on the chosen compiler.
See the [comparison between compiler flags across architectures](https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/compiler-flags-across-architectures-march-mtune-and-mcpu) for more information.

To build the benchmarks:
```
make build
```

You can then run one of the benchmarks listed above and specify the vectors' size (in bytes), the number of iterations and the error tolerance through the provided option flags.

Example (reduction benchmark with 64KiB vectors, 100k iterations and an error tolerance of $10^{-14}$):
```
target/arm_bench -k reduc -s 8192 -i 100000 -e 1e-14
```
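As an illustration of how an error tolerance such as `1e-14` could be applied, the sketch below compares a kernel result against a reference value using a relative error. The helper name `within_tolerance` and the choice of a relative (rather than absolute) comparison are assumptions; the repository may validate results differently.

```c
#include <stdbool.h>

/* Absolute value without pulling in <math.h>. */
static double abs_f64(double x)
{
    return x < 0.0 ? -x : x;
}

/* Hypothetical helper: true when `result` matches `expected`
 * within the relative tolerance `eps`. */
bool within_tolerance(double result, double expected, double eps)
{
    double diff = abs_f64(result - expected);

    /* Fall back to an absolute comparison when the reference is exactly zero. */
    if (expected == 0.0)
        return diff <= eps;

    return diff / abs_f64(expected) <= eps;
}
```

A relative comparison keeps the check meaningful for reductions and dot products, whose magnitude grows with the vector size.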
