{"id":16031736,"url":"https://github.com/jinmingyi1998/opencl_kernels","last_synced_at":"2025-03-17T16:30:49.063Z","repository":{"id":162426426,"uuid":"622272830","full_name":"jinmingyi1998/opencl_kernels","owner":"jinmingyi1998","description":"An easy way to run, test, benchmark and tune OpenCL kernel files","archived":false,"fork":false,"pushed_at":"2023-08-25T09:48:52.000Z","size":495,"stargazers_count":23,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-28T01:38:40.642Z","etag":null,"topics":["benchmark","numpy","opencl","opencv","python3","tuner"],"latest_commit_sha":null,"homepage":"https://opencl-kernel-python-wrapper.readthedocs.io/en/latest/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jinmingyi1998.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-01T16:14:32.000Z","updated_at":"2024-01-28T04:25:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"f4f8af87-eea5-4615-9449-4b1d925f3097","html_url":"https://github.com/jinmingyi1998/opencl_kernels","commit_stats":{"total_commits":90,"total_committers":2,"mean_commits":45.0,"dds":"0.022222222222222254","last_synced_commit":"4c178e1f0765bbd0b3fe2a4365c4aef4a68a77eb"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jinmingyi1998%2Fopencl_kernels","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jinmingyi1998%2Fopencl_kernels/tags","releases_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jinmingyi1998%2Fopencl_kernels/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jinmingyi1998%2Fopencl_kernels/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jinmingyi1998","download_url":"https://codeload.github.com/jinmingyi1998/opencl_kernels/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243871473,"owners_count":20361358,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","numpy","opencl","opencv","python3","tuner"],"created_at":"2024-10-08T21:05:14.691Z","updated_at":"2025-03-17T16:30:49.057Z","avatar_url":"https://github.com/jinmingyi1998.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenCL Kernel Python Wrapper\n\n[![github badge](https://img.shields.io/badge/view%20on%20github-gray?style=plastic\u0026logo=github)](https://github.com/jinmingyi1998/opencl_kernels)\n[![readthedocs](https://img.shields.io/badge/readthedocs-8CA1AF?logo=readthedocs\u0026labelColor=white)](https://opencl-kernel-python-wrapper.readthedocs.io/en/latest/)\n![GitHub release (with filter)](https://img.shields.io/github/v/release/jinmingyi1998/opencl_kernels)\n[![PyPI - Version](https://img.shields.io/pypi/v/pyoclk)](https://pypi.org/project/pyoclk/)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/pyoclk)\n![license](https://img.shields.io/pypi/l/pyoclk)\n![GitHub Repo stars](https://img.shields.io/github/stars/jinmingyi1998/opencl_kernels)\n[![PyPI - Python 
Version](https://img.shields.io/pypi/pyversions/pyoclk)](https://pypi.org/project/pyoclk/)\n\n## Install\n\n### Requirements\n\n* OpenCL GPU hardware\n* numpy\n* cmake (if compiling from source)\n\n### Install from wheel\n\n```shell\npip install pyoclk\n```\n\nor download a wheel from the [releases](https://github.com/jinmingyi1998/opencl_kernels/releases) page and install it\n\n### Compile from source\n\n**Clone this repo**\n\nclone over HTTPS\n\n```shell\ngit clone --recursive https://github.com/jinmingyi1998/opencl_kernels.git\n```\n\nor over SSH\n\n```shell\ngit clone --recursive git@github.com:jinmingyi1998/opencl_kernels.git\n```\n\n**Install**\n\n```shell\ncd opencl_kernels\npython setup.py install\n```\n\n***DO NOT move this directory after install***\n\n## Usage\n\n### Kernel File:\n\na file named `add.cl`\n\n```c\nkernel void add(global float*a, global float*out, int int_arg, float float_arg){\n    int x = get_global_id(0);\n    if(x==0){\n        printf(\" accept int arg: %d, accept float arg: %f\\n\",int_arg,float_arg);\n    }\n    out[x] = a[x] * float_arg + int_arg;\n}\n```\n\n### Python Code\n\n#### OOP Style\n\n```python\nimport numpy as np\nimport oclk\n\na = np.random.rand(100, 100).reshape([10, -1])\na = np.ascontiguousarray(a, np.float32)\nout = np.zeros(a.shape)\nout = np.ascontiguousarray(out, np.float32)\n\nrunner = oclk.Runner()\nrunner.load_kernel(\"add.cl\", \"add\", \"\")\n\ntimer = oclk.TimerArgs(\n    enable=True,\n    warmup=10,\n    repeat=50,\n    name='add_kernel'\n)\nrunner.run(\n    kernel_name=\"add\",\n    input=[\n        {\"name\": \"a\", \"value\": a, },\n        {\"name\": \"out\", \"value\": out, },\n        {\"name\": \"int_arg\", \"value\": 1, \"type\": \"int\"},\n        {\"name\": \"float_arg\", \"value\": 12.34}\n    ],\n    output=['out'],\n    local_work_size=[1, 1],\n    global_work_size=a.shape,\n    timer=timer\n)\n# check result\na = a.reshape([-1])\nout = out.reshape([-1])\nprint(a[:8])\nprint(out[:8])\n```\n\n### Kernel 
Benchmark\n\n1. write a config like [bench_add.yaml](examples/bench_add.yaml)\n2. run `python -m oclk benchmark -f examples/bench_add.yaml`\n\n#### Example\n\n```shell\npython -m oclk benchmark -f examples/bench_add.yaml\n```\noutput:\n```text\n[Timer bench_add.add] [CNT: 1] [AVG: 0.539ms] [STDEV 0.000ms] [TOTAL 0.539ms]\n[Timer bench_add.add_constant] [CNT: 1] [AVG: 0.576ms] [STDEV 0.000ms] [TOTAL 0.576ms]\n[Timer bench_add.add_batch] [CNT: 1] [AVG: 0.150ms] [STDEV 0.000ms] [TOTAL 0.150ms]\n```\n\n\n```shell\npython -m oclk benchmark -f examples/bench_add.yaml -s table\n```\noutput:\n```text\n             benchmark results             \n┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓\n┃ timer name             ┃   avg time(ms) ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩\n│ bench_add.add          │ 0.538525390625 │\n│ bench_add.add_constant │ 0.581396484375 │\n│ bench_add.add_batch    │ 0.149169921875 │\n└────────────────────────┴────────────────┘\n```\n\n```shell\npython -m oclk benchmark -f examples/bench_add.yaml -s json -o bench_add.json\n```\noutput is written to the JSON file `bench_add.json`\n```json\n[\n  {\n    \"name\": \"bench_add.add\",\n    \"time(ms)\": 0.54248046875\n  },\n  {\n    \"name\": \"bench_add.add_constant\",\n    \"time(ms)\": 0.5767089843750001\n  },\n  {\n    \"name\": \"bench_add.add_batch\",\n    \"time(ms)\": 0.15048828125000002\n  }\n]\n```\n\n### Kernel Tune\n\n1. given an OpenCL kernel file `add.cl`\n2. run `python -m oclk new tune add`, which generates a new file `tune_add.py`\n3. edit `tune_add.py`\n4. run `python -m oclk tune -f tune_add.py -o add_tune_result.json`\n5. 
results are stored in `add_tune_result.json`\n\n#### Example\n\n```shell\npython -m oclk tune -f examples/tune/tune_add.py -k 3\n```\nthen output `output.json`\n```json\n[\n  {\n    \"name\": [\n      \"examples.tune.tune_add\",\n      \"AddTuner\"\n    ],\n    \"k\": 3,\n    \"topk_results\": [\n      {\n        \"kwargs\": {\n          \"local_work_size\": [\n            512\n          ],\n          \"vector_size\": 4,\n          \"tile_size\": 4,\n          \"method\": \"naive\"\n        },\n        \"time_ms\": 0.67691162109375\n      },\n      {\n        \"kwargs\": {\n          \"local_work_size\": [\n            128\n          ],\n          \"vector_size\": 4,\n          \"tile_size\": 4,\n          \"method\": \"naive\"\n        },\n        \"time_ms\": 0.6769140625\n      },\n      {\n        \"kwargs\": {\n          \"local_work_size\": [\n            64\n          ],\n          \"vector_size\": 4,\n          \"tile_size\": 4,\n          \"method\": \"naive\"\n        },\n        \"time_ms\": 0.677001953125\n      }\n    ]\n  }\n]\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjinmingyi1998%2Fopencl_kernels","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjinmingyi1998%2Fopencl_kernels","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjinmingyi1998%2Fopencl_kernels/lists"}