https://github.com/jinmingyi1998/opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
https://github.com/jinmingyi1998/opencl_kernels
benchmark numpy opencl opencv python3 tuner
Last synced: 7 months ago
JSON representation
An easy way to run, test, benchmark and tune OpenCL kernel files
- Host: GitHub
- URL: https://github.com/jinmingyi1998/opencl_kernels
- Owner: jinmingyi1998
- License: mit
- Created: 2023-04-01T16:14:32.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-08-25T09:48:52.000Z (about 2 years ago)
- Last Synced: 2025-02-28T01:38:40.642Z (7 months ago)
- Topics: benchmark, numpy, opencl, opencv, python3, tuner
- Language: C++
- Homepage: https://opencl-kernel-python-wrapper.readthedocs.io/en/latest/
- Size: 483 KB
- Stars: 23
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# OpenCL Kernel Python Wrapper
[](https://github.com/jinmingyi1998/opencl_kernels)
[](https://opencl-kernel-python-wrapper.readthedocs.io/en/latest/)

[](https://pypi.org/project/pyoclk/)



[](https://pypi.org/project/pyoclk/)## Install
### Requirements
* OpenCL GPU hardware
* numpy
* cmake(if compile from source)### Install from wheel
```shell
pip install pyoclk
```or download wheel from [release](https://github.com/jinmingyi1998/opencl_kernels/releases) and install
### Compile from source
**Clone this repo**
clone by http
```shell
git clone --recursive https://github.com/jinmingyi1998/opencl_kernels.git
```with ssh
```shell
git clone --recursive git@github.com:jinmingyi1998/opencl_kernels.git
```**Install**
```shell
cd opencl_kernels
python setup.py install
```***DO NOT move this directory after install***
## Usage
### Kernel File:
a file named `add.cl`
```c
kernel void add(global float*a, global float*out, int int_arg, float float_arg){
int x = get_global_id(0);
if(x==0){
printf(" accept int arg: %d, accept float arg: %f\n",int_arg,float_arg);
}
out[x] = a[x] * float_arg + int_arg;
}
```### Python Code
#### OOP Style
```python
import numpy as np
import oclka = np.random.rand(100, 100).reshape([10, -1])
a = np.ascontiguousarray(a, np.float32)
out = np.zeros(a.shape)
out = np.ascontiguousarray(out, np.float32)runner = oclk.Runner()
runner.load_kernel("add.cl", "add", "")timer = oclk.TimerArgs(
enable=True,
warmup=10,
repeat=50,
name='add_kernel'
)
runner.run(
kernel_name="add",
input=[
{"name": "a", "value": a, },
{"name": "out", "value": out, },
{"name": "int_arg", "value": 1, "type": "int"},
{"name": "float_arg", "value": 12.34}
],
output=['out'],
local_work_size=[1, 1],
global_work_size=a.shape,
timer=timer
)
# check result
a = a.reshape([-1])
out = out.reshape([-1])
print(a[:8])
print(out[:8])
```### Kernel Benchmark
1. write a config like [bench_add.yaml](examples/bench_add.yaml)
2. run `python -m oclk benchmark -f examples/bench_add.yaml`#### Example
```shell
python -m oclk benchmark -f examples/bench_add.yaml
```
output:
```text
[Timer bench_add.add] [CNT: 1] [AVG: 0.539ms] [STDEV 0.000ms] [TOTAL 0.539ms]
[Timer bench_add.add_constant] [CNT: 1] [AVG: 0.576ms] [STDEV 0.000ms] [TOTAL 0.576ms]
[Timer bench_add.add_batch] [CNT: 1] [AVG: 0.150ms] [STDEV 0.000ms] [TOTAL 0.150ms]
``````shell
python -m oclk benchmark -f examples/bench_add.yaml -s table
```
output:
```text
benchmark results
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ timer name ┃ avg time(ms) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ bench_add.add │ 0.538525390625 │
│ bench_add.add_constant │ 0.581396484375 │
│ bench_add.add_batch │ 0.149169921875 │
└────────────────────────┴────────────────┘
``````shell
python -m oclk benchmark -f examples/bench_add.yaml -s json -o bench_add.json
```
output to json file `bench_add.json`
```json
[
{
"name": "bench_add.add",
"time(ms)": 0.54248046875
},
{
"name": "bench_add.add_constant",
"time(ms)": 0.5767089843750001
},
{
"name": "bench_add.add_batch",
"time(ms)": 0.15048828125000002
}
]
```### Kernel Tune
1. given a OpenCL kernel file `add.cl`
2. run `python -m oclk new tune add`, then generate a new file `tune_add.py`
3. edit `tune_add.py`
4. run `python -m oclk tune -f tune_add.py -o add_tune_result.json`
5. results are stored in `add_tune_result.json`#### Example
```shell
python -m oclk tune -f examples/tune/tune_add.py -k 3
```
then output `output.json`
```json
[
{
"name": [
"examples.tune.tune_add",
"AddTuner"
],
"k": 3,
"topk_results": [
{
"kwargs": {
"local_work_size": [
512
],
"vector_size": 4,
"tile_size": 4,
"method": "naive"
},
"time_ms": 0.67691162109375
},
{
"kwargs": {
"local_work_size": [
128
],
"vector_size": 4,
"tile_size": 4,
"method": "naive"
},
"time_ms": 0.6769140625
},
{
"kwargs": {
"local_work_size": [
64
],
"vector_size": 4,
"tile_size": 4,
"method": "naive"
},
"time_ms": 0.677001953125
}
]
}
]
```