https://github.com/themactep/ai-benchmarks
A set of scripts to prep your hardware for AI
https://github.com/themactep/ai-benchmarks
Last synced: 10 days ago
JSON representation
A set of scripts to prep your hardware for AI
- Host: GitHub
- URL: https://github.com/themactep/ai-benchmarks
- Owner: themactep
- Created: 2026-05-01T02:54:18.000Z (about 1 month ago)
- Default Branch: master
- Last Pushed: 2026-05-01T03:14:39.000Z (about 1 month ago)
- Last Synced: 2026-05-23T19:31:13.506Z (15 days ago)
- Language: Python
- Size: 21.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# AI Benchmarks
Benchmark suite for the Intel Core Ultra 9 185H (Meteor Lake) AI PC.
**Accelerators**: CPU (AVX2 + VNNI), Intel Arc iGPU, Intel AI Boost NPU (11 TOPS INT8), AMD Radeon RX 580 (Vulkan)
## Quick start
```bash
# Run everything
./run-all.sh
# Or individually:
python3 01-system-info.py # Hardware detection
python3 02-pytorch-cpu.py # PyTorch CPU benchmarks
python3 03-openvino-all.py # OpenVINO CPU / GPU / NPU
bash 04-llama-cpp-check.sh # llama.cpp Vulkan devices
```
Run a specific device only:
```bash
python3 03-openvino-all.py --device NPU
python3 03-openvino-all.py --device GPU
python3 03-openvino-all.py --matmul-size 2048
```
## Expected results (this system)
| Device | Matmul 4096² f32 | CNN FPS | Best for |
|--------|-----------------|---------|----------|
| CPU | ~0.2 TFLOPS | ~32 | Training, float32 inference |
| GPU | ~0.7 TFLOPS | ~65 | Vision models, OpenVINO |
| NPU | ~1.2 TFLOPS | ~58 | Low-power INT8 inference |
**PyTorch CPU** (with IPEX + perf governor): ~0.57 TFLOPS matmul, ~493 GFLOPS peak at 16 threads.
**llama.cpp Vulkan LLM inference**:
- Intel Arc: 48 GB shared → run 70B models
- AMD RX 580: 8 GB VRAM → 7B-13B models at ~20-40 tok/s
## Prerequisites
All packages are already installed. If setting up from scratch:
```bash
# Core AI libraries
pip install torch intel-extension-for-pytorch openvino numpy
# Intel GPU compute (OpenVINO GPU plugin needs this)
sudo apt install intel-opencl-icd libze-intel-gpu1
# Intel NPU (downloaded from GitHub)
# 1. Level Zero NPU driver:
# https://github.com/intel/linux-npu-driver/releases
# → intel-level-zero-npu_*.deb
# 2. NPU compiler:
# https://github.com/openvinotoolkit/npu_compiler/releases
# → extract lib → copy to openvino/libs/libopenvino_intel_npu_compiler.so
# CPU governor (already set to performance)
echo performance | sudo tee /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
# User must be in render group for NPU access
sudo usermod -a -G render $USER
# Log out and back in for this to take effect
# llama.cpp with Vulkan
cd ~/llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j$(nproc)
```
## Architecture notes
- **CPU**: 6 P-cores (4.8-5.1 GHz) + 8 E-cores (2.5-3.8 GHz) + 2 LP E-cores. No AMX (datacenter-only). Has AVX-VNNI for INT8 acceleration. **bfloat16 is extremely slow** — use float32 or INT8 instead. Pin PyTorch to P-cores with `torch.set_num_threads(8)` or `taskset -c 0-11`.
- **GPU (Arc)**: Integrated GPU sharing system RAM (48 GB max). Access via OpenVINO GPU plugin or llama.cpp Vulkan backend.
- **NPU (AI Boost)**: Dedicated AI accelerator, ~11 TOPS INT8. Best for sustained low-power INT8 inference. Requires Level Zero NPU driver + OpenVINO NPU compiler.
- **GPU (RX 580)**: Discrete Polaris GPU, 8 GB VRAM. No ROCm support. Use via Vulkan (llama.cpp only).
## File structure
```
ai-benchmarks/
01-system-info.py Hardware detection, drivers, instruction sets
02-pytorch-cpu.py Matmul, thread scaling, transformer sim, mem bw
03-openvino-all.py CPU vs GPU vs NPU (matmul + CNN)
04-llama-cpp-check.sh Vulkan device listing for LLM inference
run-all.sh Run all benchmarks sequentially
README.md This file
```