https://github.com/nihui/ruapu

# ruapu

![GitHub License](https://img.shields.io/github/license/nihui/ruapu?style=for-the-badge)
![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/nihui/ruapu/ci.yml?style=for-the-badge)

Detect CPU ISA features with a single file

**CPU**

* ✅ x86, x86-64
* ✅ arm, aarch64
* ✅ mips
* ✅ powerpc
* ✅ s390x
* ✅ loongarch
* ✅ risc-v
* ✅ openrisc

```c
#include <stdio.h> // fprintf

#define RUAPU_IMPLEMENTATION
#include "ruapu.h"

int main()
{
    // initialize ruapu once
    ruapu_init();

    // now, tell me if this cpu has avx2
    int has_avx2 = ruapu_supports("avx2");

    // loop all supported features
    // (the list ends with a null pointer)
    const char* const* supported = ruapu_rua();
    while (*supported)
    {
        fprintf(stderr, "%s\n", *supported);
        supported++;
    }

    return 0;
}
```

**OS**

* ✅ Windows
* ✅ Linux
* ✅ macOS
* ✅ Android
* ✅ iOS
* ✅ FreeBSD
* ✅ NetBSD
* ✅ OpenBSD
* ✅ DragonflyBSD
* ✅ Solaris
* ✅ SyterKit

**Compiler**

* ✅ GCC
* ✅ Clang
* ✅ MSVC
* ✅ MinGW

#### Best practice for using `ruapu.h` in multiple compilation units

1. Create one `ruapu.c` for your project
2. `ruapu.c` contains **ONLY** `#define RUAPU_IMPLEMENTATION` and `#include "ruapu.h"`
3. All other sources `#include "ruapu.h"` but do **NOT** `#define RUAPU_IMPLEMENTATION`

## Features

* Detect **CPU ISA with single-file**   
_`sse2`, `avx`, `avx512f`, `neon`, etc._
* Detect **vendor extended ISA**    
_apple `amx`, risc-v vendor ISA, etc._
* Detect **richer ISA on Windows ARM**  
_`IsProcessorFeaturePresent()` returns little ISA information_
* Detect **`x86-avx512` on macOS correctly** 
_macOS hides it in `cpuid`_
* Detect **new CPU's ISA on old systems** 
_they are usually not exposed in `auxv` or `MISA`_
* Detect **CPU hidden ISA**       
_`fma4` on zen1, ISA in hypervisor, etc._

## Supported ISA _(more are coming ... :)_

|CPU|ISA|
|:---:|---|
|x86|`mmx` `sse` `sse2` `sse3` `ssse3` `sse41` `sse42` `sse4a` `xop` `avx` `f16c` `fma` `fma4` `avx2` `avx512f` `avx512bw` `avx512cd` `avx512dq` `avx512vl` `avx512vnni` `avx512bf16` `avx512ifma` `avx512vbmi` `avx512vbmi2` `avx512fp16` `avx512er` `avx5124fmaps` `avx5124vnniw` `avxvnni` `avxvnniint8` `avxvnniint16` `avxifma` `amxfp16` `amxbf16` `amxint8` `amxtile` `aesni` `sha` |
|arm|`half` `edsp` `neon` `vfpv4` `idiv`|
|aarch64|`neon` `vfpv4` `lse` `cpuid` `asimdrdm` `asimdhp` `asimddp` `asimdfhm` `bf16` `i8mm` `frint` `jscvt` `fcma` `mte` `mte2` `sve` `sve2` `svebf16` `svei8mm` `svef32mm` `svef64mm` `sme` `smef16f16` `smef64f64` `smei64i64` `pmull` `crc32` `aes` `sha1` `sha2` `sha3` `sha512` `sm3` `sm4` `svepmull` `svebitperm` `sveaes` `svesha3` `svesm4` `amx`|
|mips|`msa` `mmi` `sx` `asx` `msa2` `crypto`|
|powerpc|`vsx`|
|s390x|`zvector`|
|loongarch|`lsx` `lasx`|
|risc-v|`i` `m` `a` `f` `d` `c` `v` `zba` `zbb` `zbc` `zbs` `zbkb` `zbkc` `zbkx` `zcb` `zfa` `zfbfmin` `zfh` `zfhmin` `zicond` `zicsr` `zifencei` `zmmul` `zvbb` `zvbc` `zvfh` `zvfhmin` `zvfbfmin` `zvfbfwma` `zvkb` `zvl32b` `zvl64b` `zvl128b` `zvl256b` `zvl512b` `zvl1024b` `xtheadba` `xtheadbb` `xtheadbs` `xtheadcondmov` `xtheadfmemidx` `xtheadfmv` `xtheadmac` `xtheadmemidx` `xtheadmempair` `xtheadsync` `xtheadvdot` `spacemitvmadot` `spacemitvmadotn` `spacemitvfmadot`|
|openrisc| `orbis32` `orbis64` `orfpx32` `orfpx64` `orvdx64` |

## Let's ruapu

### ruapu with C

Compile ruapu test program

```shell
# GCC / MinGW
gcc main.c -o ruapu
```
```shell
# Clang
clang main.c -o ruapu
```
```shell
# MSVC
cl.exe /Fe: ruapu.exe main.c
```

Run ruapu in command line

```shell
./ruapu
mmx = 1
sse = 1
sse2 = 1
sse3 = 1
ssse3 = 1
sse41 = 1
sse42 = 1
sse4a = 1
xop = 0
... more lines omitted ...
```

### ruapu with Python

Compile and install ruapu library

```shell
# from pypi
pip3 install ruapu
```

```shell
# from source code
pip3 install ./python
```

Use ruapu in Python

```python
import ruapu

ruapu.supports("avx2")
# True

ruapu.supports(isa="avx2")
# True

ruapu.rua()
# ('mmx', 'sse', 'sse2', 'sse3', 'ssse3', 'sse41', 'sse42', 'avx', 'f16c', 'fma', 'avx2')
```

### ruapu with Rust

Compile ruapu library

```shell
# from source code
cd rust
cargo build --release
```

Use ruapu in Rust

```rust
extern crate ruapu;

fn main() {
    println!("supports neon: {}", ruapu::supports("neon").unwrap());
    println!("supports avx2: {}", ruapu::supports("avx2").unwrap());
    println!("rua: {:?}", ruapu::rua());
}
```

### ruapu with Lua

Compile ruapu library

```shell
# from source code
cd lua
# lua binding has been tested on Lua 5.2~5.4
luarocks make
```

Use ruapu in Lua

```Lua
ruapu = require "ruapu";
print(ruapu.supports("mmx"));
for _, ext in ipairs(ruapu.rua()) do
    print(ext);
end
```

### ruapu with Erlang

Compile ruapu library

```erlang
% add this to deps list
% in your rebar.config
{ruapu, "0.1.0"}
```

Use ruapu in Erlang `rebar3 shell`

```erlang
> ruapu:rua().
{ok,["neon","vfpv4","asimdrdm","asimdhp","asimddp",
"asimdfhm","bf16","i8mm","pmull","crc32","aes","sha1",
"sha2","sha3","sha512","amx"]}
> ruapu:supports("neon").
true
> ruapu:supports(neon).
true
> ruapu:supports(<<"neon">>).
true
> ruapu:supports("avx2").
false
> ruapu:supports(avx2).
false
> ruapu:supports(<<"avx2">>).
false
```

### ruapu with Fortran

Compile ruapu library

```shell
# from source code
cd fortran
cmake -B build
cmake --build build
```

Use ruapu in Fortran

```fortran
program main
    use ruapu, only: ruapu_init, ruapu_supports, ruapu_rua
    implicit none

    character(len=:), allocatable :: isa_supported(:)
    integer :: i

    call ruapu_init()

    print *, "supports sse: ", ruapu_supports("sse")
    print *, "supports neon: ", ruapu_supports("neon")

    isa_supported = ruapu_rua()
    do i = 1, size(isa_supported)
        print *, trim(isa_supported(i))
    end do
end program main

```

### ruapu with Golang

Compile ruapu library

```shell
cd go
go build -o ruapu-go
```

Use ruapu in Golang

```go
package main

import (
	"fmt"
	"ruapu-go/ruapu"
	"strconv"
)

func main() {
	ruapu.Init()
	avx2Status := ruapu.Supports("avx2")
	fmt.Println("avx2:" + strconv.Itoa(avx2Status))
	rua := ruapu.Rua()
	fmt.Println(rua)
}
```

### ruapu with Haskell

Add ruapu library to your project

`haskell/Ruapu.hs`, `haskell/ruapu.c` and `ruapu.h` should be copied in your
project.

Use ruapu in Haskell

```haskell
import Ruapu
-- Ruapu.rua :: IO [String]
-- Ruapu.supports :: String -> IO Bool
main = do
    Ruapu.init
    Ruapu.supports "mmx" >>= putStrLn . show
    Ruapu.rua >>= sequence_ . map putStrLn
```

### ruapu with Vlang

Compile ruapu library

```shell
cd vlang
v .
```

Use ruapu in Vlang

```go
module main

import ruapu

fn main() {
    ruapu.ruapu_init()
    mut avx2_status := ruapu.ruapu_supports('avx2')
    if avx2_status {
        println('avx2: ' + avx2_status.str())
    }

    println(ruapu.ruapu_rua())
}
```

### ruapu with Pascal

Compile ruapu library

```shell
cd pascal
sudo apt install fpc
cmake .
make
fpc ruapu.lpr
```

Use ruapu in Pascal

```pascal
program ruapu;

uses ruapu_pascal;

var
  has_avx2: integer;
  supported: PPAnsiChar;
begin
  // initialize ruapu once
  ruapu_init();

  // now, tell me if this cpu has avx2
  has_avx2 := ruapu_supports('avx2');

  // loop all supported features
  supported := ruapu_rua();
  while supported^ <> nil do
  begin
    writeln(supported^);
    inc(supported);
  end;

  readln();
end.

```

### ruapu with Java

Compile ruapu library and example

```shell
./gradlew build
```
Run example
```shell
java -cp \
./build/libs/ruapu-1.0-SNAPSHOT.jar \
./Example.java
```

Use ruapu in Java

```java
import ruapu.Ruapu;
import java.util.*;

class Example {
    public static void main(String args[]) {
        Ruapu ruapu = new Ruapu();

        System.out.println("avx: " + ruapu.supports("avx"));
        // avx: 1
        System.out.println(Arrays.toString(ruapu.rua()));
        // [mmx, sse, sse2, sse3, ssse3, sse41, sse42, avx, f16c, fma, avx2]
    }
}

```

### ruapu with Cangjie

Compile ruapu library

```bash
cd cangjie
cd c-src
cmake .
make
```
Run example
```bash
cd cangjie
cjpm run
```
Or compile the example and run the binary
```bash
cd cangjie
cjpm build
./target/release/bin/main
```

Use ruapu in Cangjie

```swift
import ruapu.*
main(): Int64 {
    ruapu_init()
    let neon_supported = ruapu_supports("neon")
    println("supports neon: ${neon_supported}")
    let d = ruapu_rua()
    for (i in d) {
        println(i)
    }
    return 0
}
```

GitHub-hosted runner result (Linux)

```
mmx = 1
sse = 1
sse2 = 1
sse3 = 1
ssse3 = 1
sse41 = 1
sse42 = 1
sse4a = 1
xop = 0
avx = 1
f16c = 1
fma = 1
avx2 = 1
avx512f = 0
avx512bw = 0
avx512cd = 0
avx512dq = 0
avx512vl = 0
avx512vnni = 0
avx512bf16 = 0
avx512ifma = 0
avx512vbmi = 0
avx512vbmi2 = 0
avx512fp16 = 0
avx512er = 0
avx5124fmaps = 0
avx5124vnniw = 0
avxvnni = 0
avxvnniint8 = 0
avxifma = 0
amxfp16 = 0
amxbf16 = 0
amxint8 = 0
amxtile = 0
```

GitHub-hosted runner result (macOS)

```
mmx = 1
sse = 1
sse2 = 1
sse3 = 1
ssse3 = 1
sse41 = 1
sse42 = 1
sse4a = 0
xop = 0
avx = 1
f16c = 1
fma = 1
avx2 = 1
avx512f = 0
avx512bw = 0
avx512cd = 0
avx512dq = 0
avx512vl = 0
avx512vnni = 0
avx512bf16 = 0
avx512ifma = 0
avx512vbmi = 0
avx512vbmi2 = 0
avx512fp16 = 0
avx512er = 0
avx5124fmaps = 0
avx5124vnniw = 0
avxvnni = 0
avxvnniint8 = 0
avxifma = 0
amxfp16 = 0
amxbf16 = 0
amxint8 = 0
amxtile = 0
```

GitHub-hosted runner result (macOS M1)

```
neon = 1
vfpv4 = 1
cpuid = 0
asimdhp = 1
asimddp = 1
asimdfhm = 1
bf16 = 0
i8mm = 0
sve = 0
sve2 = 0
svebf16 = 0
svei8mm = 0
svef32mm = 0
```

GitHub-hosted runner result (Windows)

```
mmx = 1
sse = 1
sse2 = 1
sse3 = 1
ssse3 = 1
sse41 = 1
sse42 = 1
sse4a = 1
xop = 0
avx = 1
f16c = 1
fma = 1
avx2 = 1
avx512f = 0
avx512bw = 0
avx512cd = 0
avx512dq = 0
avx512vl = 0
avx512vnni = 0
avx512bf16 = 0
avx512ifma = 0
avx512vbmi = 0
avx512vbmi2 = 0
avx512fp16 = 0
avx512er = 0
avx5124fmaps = 0
avx5124vnniw = 0
avxvnni = 0
avxvnniint8 = 0
avxifma = 0
amxfp16 = 0
amxbf16 = 0
amxint8 = 0
amxtile = 0
```

FreeBSD/NetBSD/OpenBSD VM result (x86_64)

```
mmx = 1
sse = 1
sse2 = 1
sse3 = 1
ssse3 = 1
sse41 = 1
sse42 = 1
sse4a = 1
xop = 0
avx = 1
f16c = 1
fma = 1
fma4 = 0
avx2 = 1
avx512f = 0
avx512bw = 0
avx512cd = 0
avx512dq = 0
avx512vl = 0
avx512vnni = 0
avx512bf16 = 0
avx512ifma = 0
avx512vbmi = 0
avx512vbmi2 = 0
avx512fp16 = 0
avx512er = 0
avx5124fmaps = 0
avx5124vnniw = 0
avxvnni = 0
avxvnniint8 = 0
avxifma = 0
amxfp16 = 0
amxbf16 = 0
amxint8 = 0
amxtile = 0
```

## Techniques inside ruapu
ruapu is implemented in C to ensure the widest possible portability.

ruapu determines whether the CPU supports an instruction set by trying to execute one of its instructions and detecting whether an `Illegal Instruction` exception occurs. It relies neither on the `cpuid` instruction and other architecture-specific registers, nor on `MISA` information or operating system calls. This yields more detailed CPU ISA information than those sources usually report.
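The probing idea described above can be sketched in a few lines of POSIX C. This is a minimal illustration under stated assumptions, not ruapu's actual implementation: `try_execute`, `on_sigill`, `harmless` and `fake_unsupported` are hypothetical names, and `raise(SIGILL)` stands in for an instruction the CPU does not support so that the sketch behaves the same on any architecture. Windows needs a different exception mechanism, which is not shown here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for sigaction and sigsetjmp */
#endif

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Jump target used to escape from the SIGILL handler. */
static sigjmp_buf probe_env;

static void on_sigill(int signo)
{
    (void)signo;
    /* Unwind back into try_execute(), which then reports "not supported". */
    siglongjmp(probe_env, 1);
}

/* Returns 1 if candidate() ran to completion, 0 if it raised SIGILL. */
int try_execute(void (*candidate)(void))
{
    struct sigaction sa, old_sa;
    volatile int supported = 0;

    assert(candidate != NULL);

    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGILL, &sa, &old_sa);

    /* The second argument 1 saves the signal mask, so SIGILL is
       unblocked again after jumping out of the handler. */
    if (sigsetjmp(probe_env, 1) == 0) {
        candidate();
        supported = 1;
    }

    /* Restore whatever handler was installed before the probe. */
    sigaction(SIGILL, &old_sa, NULL);
    return supported;
}

/* Stand-in for an instruction the CPU supports. */
void harmless(void) {}

/* Stand-in for an unsupported instruction (assumption: a real probe
   would execute the encoded instruction bytes instead). */
void fake_unsupported(void) { raise(SIGILL); }
```

Calling `try_execute(harmless)` yields 1 and `try_execute(fake_unsupported)` yields 0. Saving the signal mask in `sigsetjmp` matters because a probe library runs many such checks in a row, and a SIGILL that stays blocked after the first failure would break the later ones.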

## FAQ
#### Why is the project named ruapu?

ruapu is short for rua-cpu, which means using various extended instructions to harass and amuse the CPU (rua!). Whether the CPU reacts violently (throws an illegal instruction exception) tells us whether it supports a given extended instruction set.

#### Why is the ruapu API designed like this?

We consider the GCC builtin functions `__builtin_cpu_init()` and `__builtin_cpu_supports()` to be good practice. ruapu follows this design, so it can serve as a 1:1 replacement for the GCC functions while supporting more operating systems and compilers, which makes it more portable.
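For comparison, here is a minimal sketch of the GCC builtins on x86. The wrapper name `gcc_builtin_has_avx2` is hypothetical. The builtins exist only for x86 targets in GCC and Clang, so the sketch guards them and returns -1 elsewhere.

```c
/* Returns 1 or 0 on x86 with GCC/Clang, -1 where the builtins are unavailable. */
int gcc_builtin_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Fill the compiler runtime's CPU feature cache once. */
    __builtin_cpu_init();
    /* The feature name must be a string literal. */
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return -1;
#endif
}
```

With ruapu the same check is `ruapu_init()` followed by `ruapu_supports("avx2")`, as in the C example at the top of this README, and it is not tied to GCC or x86.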

#### Why does SIGILL occur when executing in a debugger or simulator, such as `gdb`, `lldb`, `qemu-user`, `sde` etc.?

Debuggers and simulators intercept the signal by default and stop the program before the ruapu signal handler can run. You can simply continue execution at that point, or configure the tool accordingly, for example with `handle SIGILL nostop` in gdb. ruapu technically cannot prevent programs from stopping in debuggers and emulators.

#### How to add detection capabilities for new instructions to ruapu?

_Assume that the new extended instruction set is named `rua`_

1. Add `RUAPU_INSTCODE(rua, rua-inst-hex) // rua r0,r0` and `RUAPU_ISAENTRY(rua)` in `ruapu.h`
2. Add `PRINT_ISA_SUPPORT(rua)` in `main.c` to print the detection result
3. Add entries about `rua` in README.md
4. Create a pull request!

_https://godbolt.org/ is a good helper to view the compiled binary code of instructions._
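To picture why one instruction-code line plus one table-entry line can be enough to register a new ISA, here is a hedged sketch of a table-driven design. It is an assumption about the general pattern, not a copy of `ruapu.h`: every `demo_` and `DEMO_` name is hypothetical, and the probes return hard-coded results instead of executing real instruction bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe type: returns 1 if the instruction executed, 0 otherwise. */
typedef int (*demo_probe_fn)(void);

struct demo_isa_entry {
    const char *name;    /* ISA name used for lookups */
    demo_probe_fn probe; /* how to test for it */
    int supported;       /* cached result, filled in by demo_init() */
};

/* DEMO_PROBE plays the role of an instruction-code macro: it defines the
   per-ISA probe. The result is hard-coded here for illustration only. */
#define DEMO_PROBE(isa, result) \
    static int demo_probe_##isa(void) { return (result); }

/* DEMO_ISAENTRY plays the role of a table-entry macro: it adds one row. */
#define DEMO_ISAENTRY(isa) { #isa, demo_probe_##isa, 0 },

DEMO_PROBE(foo, 1)
DEMO_PROBE(rua, 0)

static struct demo_isa_entry demo_table[] = {
    DEMO_ISAENTRY(foo)
    DEMO_ISAENTRY(rua)
};

#define DEMO_TABLE_SIZE (sizeof(demo_table) / sizeof(demo_table[0]))

/* Run every probe once and cache the results. */
void demo_init(void)
{
    size_t i;
    for (i = 0; i < DEMO_TABLE_SIZE; i++)
        demo_table[i].supported = demo_table[i].probe();
}

/* Look up a cached result by name; unknown names count as unsupported. */
int demo_supports(const char *isa)
{
    size_t i;
    assert(isa != NULL);
    for (i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (strcmp(demo_table[i].name, isa) == 0)
            return demo_table[i].supported;
    }
    return 0;
}
```

After `demo_init()`, `demo_supports("foo")` returns 1 while `demo_supports("rua")` and any unknown name return 0. In this layout, adding an ISA touches only the probe definition and the table, and the lookup code stays unchanged.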

## Repos that use ruapu
* [ncnn](https://github.com/Tencent/ncnn)  _High-performance neural network inference framework_
* [libllm](https://github.com/ling0322/libllm)  _Efficient inference of large language models_

## Credits


### Contribution behavior
* [@nihui](https://github.com/nihui)  _Wrote the initial POC code, ruapu maintainer_
* [@kernelbin](https://github.com/kernelbin)  _Implement exception handling for Windows_
* [@zchrissirhcz](https://github.com/zchrissirhcz)  _Detect x86 FMA4_
* [@MollySophia](https://github.com/MollySophia)  _Fix C++ export symbol_
* [@strongtz](https://github.com/strongtz)  _Detect more aarch64 ISA_
* [@monkeyking](https://github.com/monkeyking)  _Detect apple arm64 AMX_
* [@junchao-loongson](https://github.com/junchao-loongson)  _Add loongarch support_
* [@ziyao233](https://github.com/ziyao233)  _Detect more risc-v ISA_
* [@dreamcmi](https://github.com/dreamcmi)  _Detect more risc-v ISA_
* [@cocoa-xu](https://github.com/cocoa-xu)  _Add FreeBSD support, python support_
* [@YuzukiTsuru](https://github.com/YuzukiTsuru)  _Add OpenRISC support_
* [@whyb](https://github.com/whyb)  _Detect x86 AMX_

## License
MIT License