An open API service indexing awesome lists of open source software.

https://github.com/fumiama/gozel

Pure-Go GPU/NPU compute wrapper for Intel NEO driver Level Zero APIs.
https://github.com/fumiama/gozel

dpcpp golang golang-examples golang-library gpgpu gpgpu-computing gpu gpu-acceleration gpu-computing gpu-programming intel intel-gpu level-zero level-zero-loader llvm llvm-ir oneapi opencl spirv sycl

Last synced: 1 day ago
JSON representation

Pure-Go GPU/NPU compute wrapper for Intel NEO driver Level Zero APIs.

Awesome Lists containing this project

README

          

# GoZeL

[![CI](https://github.com/fumiama/gozel/actions/workflows/ci.yml/badge.svg)](https://github.com/fumiama/gozel/actions/workflows/ci.yml)
[![Go Reference](https://pkg.go.dev/badge/github.com/fumiama/gozel.svg)](https://pkg.go.dev/github.com/fumiama/gozel)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)

**gozel** is a pure-Go binding for the [Intel oneAPI Level Zero](https://github.com/oneapi-src/level-zero) API — enabling direct GPU/NPU compute from Go **without cgo**.

Built on [purego](https://github.com/ebitengine/purego) and Windows syscall, gozel loads `ze_loader` at runtime via FFI, avoiding all C compiler dependencies. The entire API surface is auto-generated from the official Level Zero SDK headers, keeping bindings always in sync with upstream.

| Before Scaling (1272 x 855) | After Scaling (512 x 344) |
|:--:|:--:|
|![暖笺贺春.webp](examples/image_scale/暖笺贺春.webp)|![small.png](examples/image_scale/small.png)|

---

## Table of Contents

- [Projects Using gozel](#projects-using-gozel)
- [Features](#features)
- [Platform Support](#platform-support)
- [Quick Start](#quick-start)
- [The `ze` Package — High-Level API](#the-ze-package--high-level-api)
- [Contributing](#contributing)
- [License](#license)
- [Appendix I. Architecture](#appendix-i-architecture)
- [Appendix II. Regenerating Bindings from Headers](#appendix-ii-regenerating-bindings-from-headers)
- [Appendix III. Building SPIR-V Kernels](#appendix-iii-building-spir-v-kernels)

---

## Projects Using gozel

We maintain a list of projects built with gozel. If your project uses this library, please open a PR to add it here!

| Project | Description | Link |
|---|---|---|
| | | |

## Features

- **No cgo** — Pure Go via [purego](https://github.com/ebitengine/purego) and Windows syscall. No C toolchain required at build time.
- **Auto-generated** — All 200+ Level Zero functions generated directly from the [official C headers](https://github.com/oneapi-src/level-zero/tree/master/include), covering Core, Sysman, Tools and Runtime APIs.
- **High-level wrapper** — The `ze` sub-package provides an idiomatic Go API over the raw bindings: typed handles, error returns, automatic resource lifetime.
- **Kernel in Go** — Embed SPIR-V binaries with `//go:embed`, load and launch GPU kernels entirely from Go.
- **Extensible** — Add new examples, wrap more APIs, or plug in your own SPIR-V pipelines.

## Platform Support
> gozel targets all Intel GPUs including Intel Arc / Iris Xe / Data Center GPUs that ship a Level Zero driver. Any device visible to `ze_loader` should work.

| OS | Architecture | Runtime Requirement |
|---|---|---|
| Windows | amd64 | [Intel GPU driver](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html) with `ze_loader.dll` |
| Linux | amd64 | [Intel compute-runtime](https://github.com/intel/compute-runtime) with `libze_loader.so` |
| More | under | testing... |

## Quick Start

### Install

```bash
go get github.com/fumiama/gozel
```

### Minimal Example

```go
package main

import (
"fmt"

"github.com/fumiama/gozel/ze"
)

func main() {
gpus, err := ze.InitGPUDrivers()
if err != nil {
panic(err)
}
fmt.Printf("Found %d GPU driver(s)\n", len(gpus))

devs, _ := gpus[0].DeviceGet()
for _, d := range devs {
prop, _ := d.DeviceGetProperties()
fmt.Printf(" Device: %s\n", string(prop.Name[:]))
}

// Found 1 GPU driver(s)
// Device: Graphics
}
```

### Vector-Add

This example adds up two large float32 vectors.

```bash
cd examples/vadd
# go generate # Optional, requires SYCL toolchain (see below)
go run main.go
```

You will see the result like

```bash
=============== Device Basic Properties ===============
Running on device: ID = 1 , Name = Graphics @ 4.00 GHz.
=============== Device Compute Properties ===============
Max Group Size (X, Y, Z): (1024, 1024, 1024)
Max Group Count (X, Y, Z): (4294967295, 4294967295, 4294967295)
Max Total Group Size: 1024
Max Shared Local Memory: 65536
Num Subgroup Sizes: 3
Subgroup Sizes: [8 16 32 0 0 0 0 0]
=============== Computation Configuration ===============
Group Size (X, Y, Z): (1024, 1, 1)
Group Count: 65536
Total Elements (N): 67108864
Buffer Size: 256 MiB
=============== Calculation Results ===============
GPU Execution Time: 56.114500 ms
GPU Throughput: 4.78 GiB/s
=============== Validation Results ===============
CPU Execution Time: 77.190700 ms
CPU Throughput: 3.48 GiB/s
Test Passed!!!
```

### More Examples

| Example | Description | Source |
|---|---|---|
| **vadd** | Vector addition | [examples/vadd](examples/vadd/) |
| **vadd_event** | Vector addition with event | [examples/vadd_event](examples/vadd_event/) |
| **image_scale** | Scale image using hardware sampler | [examples/image_scale](examples/image_scale/) |

## The `ze` Package — High-Level API

All examples used `ze` sub-package, which wraps the raw Level Zero bindings into idiomatic Go with typed handles and method chains:

```go
// Initialize
gpus, _ := ze.InitGPUDrivers()

// Context + Device
ctx, _ := gpus[0].ContextCreate()
devs, _ := gpus[0].DeviceGet()

// Memory
hostBuf, _ := ctx.MemAllocHost(size, alignment)
devBuf, _ := ctx.MemAllocDevice(devs[0], size, alignment)
defer ctx.MemFree(hostBuf)
defer ctx.MemFree(devBuf)

// Module + Kernel
mod, _ := ctx.ModuleCreate(devs[0], spirvBytes)
krn, _ := mod.KernelCreate("my_kernel")
krn.SetArgumentValue(0, unsafe.Sizeof(uintptr(0)), unsafe.Pointer(&devBuf))
krn.SetGroupSize(256, 1, 1)

// Command submission
q, _ := ctx.CommandQueueCreate(devs[0])
cl, _ := ctx.CommandListCreate(devs[0])
cl.AppendMemoryCopy(devBuf, hostBuf, size)
cl.AppendBarrier()
cl.AppendLaunchKernel(krn, &gozel.ZeGroupCount{Groupcountx: 4, Groupcounty: 1, Groupcountz: 1})
cl.AppendBarrier()
cl.Close()
q.ExecuteCommandLists(cl)
q.Synchronize()
```

Compared to the raw `gozel.ZeCommandListCreate(...)` calls, the `ze` package:

- Eliminates boilerplate descriptor initialization (struct types, `Stype` fields)
- Returns Go `error` instead of `ZeResult` codes
- Provides method syntax on handles (`ctx.CommandListCreate(dev)` vs standalone functions)
- Manages handle lifetimes with `defer`-friendly `Destroy()` methods

The `ze` package currently covers the most common workflows. **Contributions to expand coverage are very welcome** — see [Contributing](#contributing).

> 🙌 **Have an example to share?** We'd love to grow this table — see [Contributing](#contributing).

## Contributing

Contributions of all kinds are welcome. Some particularly impactful areas:

- **Examples** — Add new `examples/` demonstrating different GPU compute patterns (matrix multiply, reduction, image processing, etc.). Every example helps new users get started faster.
- **`ze` package coverage** — The high-level wrapper doesn't cover the full Level Zero surface yet. Wrapping additional APIs (events, fences, images, sysman queries, etc.) directly benefits all downstream users.
- **Testing** — Help improve test coverage across packages.
- **Documentation** — Improve godoc comments, add usage guides, or translate documentation.
- **Project showcase** — If you use gozel in your project, open a PR to add it to the [Projects Using gozel](#projects-using-gozel) table above.

## License

- This project is generally licensed under the [GNU Affero General Public License v3.0](LICENSE).
- The files in [gozel](gozel) folder follows their original license, which is [MIT](https://github.com/oneapi-src/level-zero/blob/master/LICENSE).

---

## Appendix I. Architecture

The FFI layer (`internal/zecall`) loads the Level Zero loader library at runtime, caches procedure addresses, and dispatches calls through `purego` or Windows syscall. All pointer arguments are protected against GC collection during syscalls via `go:uintptrescapes`.

```
gozel/
├── api.go # Auto-generated: registers all L0 functions at init
├── core_*.go # Auto-generated: Core API bindings (ze*)
├── rntm_*.go # Auto-generated: Runtime API bindings (zer*)
├── sysm_*.go # Auto-generated: Sysman API bindings (zes*)
├── tols_*.go # Auto-generated: Tools API bindings (zet*)
├── ze/ # High-level idiomatic Go wrapper
│ ├── init.go # Driver initialization
│ ├── context.go # Context management
│ ├── device.go # Device enumeration & properties
│ ├── module.go # SPIR-V module loading
│ ├── kernel.go # Kernel creation & argument binding
│ ├── mem.go # Device/host memory allocation
│ └── command.go # Command queues, lists, barriers
├── internal/zecall/ # purego FFI layer (loads ze_loader at runtime)
├── cmd/gen/ # Code generator: parses L0 headers → Go source
├── spec/ # Optional L0 SDK headers for dev purpose (input to cmd/gen)
└── examples/
├── quick_start/ # The quick start shown in this README
└── vadd/ # Vector addition: SYCL kernel + Go host
```

## Appendix II. Regenerating Bindings from Headers

gozel includes a code generator (`cmd/gen`) that parses the four Level Zero API headers and produces all `core_*.go`, `rntm_*.go`, `sysm_*.go`, `tols_*.go` files plus `api.go`.

### From a local SDK

Place (or symlink) the Level Zero SDK under `spec/`:

```
spec/
├── include/level_zero/
│ ├── ze_api.h # Core
│ ├── zer_api.h # Runtime
│ ├── zes_api.h # Sysman
│ └── zet_api.h # Tools
└── lib/
```

Then run:

```bash
go run ./cmd/gen -spec ./spec
```

### From a specific release

The generator can download the SDK directly from the [level-zero releases](https://github.com/oneapi-src/level-zero/releases):

```bash
go run ./cmd/gen -spec v1.28.0
```

### Via `go generate`

```bash
go generate .
```

This invokes `cmd/gen` with the local `spec/` directory as configured in `doc.go`.

## Appendix III. Building SPIR-V Kernels

GPU kernels in the [examples](examples) folder are written in SYCL C++ and compiled to SPIR-V for embedding into Go programs, which is a little bit hacky. You can also use `ocloc`, which is a common practice and you can search for the build doc elsewhere. The build pipeline uses `go generate` directives:

```
main.cpp ──clang++ -fsycl──▶ device_kern.bc

sycl-post-link

▼ device_kern_0.bc

llvm-spirv

▼ main.spv ← embedded via //go:embed
```

### Prerequisites

- [Intel LLVM SYCL Compiler](https://github.com/intel/llvm) (provides `clang++` with SYCL support, `sycl-post-link`, etc.)

### Build

```bash
cd examples/vadd
go generate # compiles main.cpp → main.spv
go run main.go # runs vector addition on GPU
```