{"id":50977674,"url":"https://github.com/townsendmerino/wgpu","last_synced_at":"2026-06-19T10:01:36.424Z","repository":{"id":365599959,"uuid":"1272854058","full_name":"townsendmerino/wgpu","owner":"townsendmerino","description":"Minimal compute-only Go (CGO) binding over wgpu-native v29 — a drop-in for the cogentcore/webgpu subset, with full control over the WGSL dot4I8Packed builtin.","archived":false,"fork":false,"pushed_at":"2026-06-18T02:55:22.000Z","size":57337,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-18T04:25:41.249Z","etag":null,"topics":["cgo","compute","dp4a","go","gpu","metal","webgpu","wgpu-native","wgsl"],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/townsendmerino.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-18T01:56:30.000Z","updated_at":"2026-06-18T02:55:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/townsendmerino/wgpu","commit_stats":null,"previous_names":["townsendmerino/wgpu"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/townsendmerino/wgpu","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/townsendmerino%2Fwgpu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/townsendmerino%2Fwgpu/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/townsendmerino%2Fwgpu/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/townsendmerino%2Fwgpu/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/townsendmerino","download_url":"https://codeload.github.com/townsendmerino/wgpu/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/townsendmerino%2Fwgpu/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34489247,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-18T02:00:06.871Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cgo","compute","dp4a","go","gpu","metal","webgpu","wgpu-native","wgsl"],"created_at":"2026-06-19T10:01:32.668Z","updated_at":"2026-06-19T10:01:36.407Z","avatar_url":"https://github.com/townsendmerino.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# wgpu — minimal compute-only Go binding over wgpu-native v29.0.0.0\n\nA small **CGO** binding over [wgpu-native](https://github.com/gfx-rs/wgpu-native)\n**v29.0.0.0**, built for one purpose: full control over the WGSL\n`dot4I8Packed` builtin from Go, with an API that is a **drop-in for the slice of\n[`github.com/cogentcore/webgpu`](https://github.com/cogentcore/webgpu) that\ngoinfer's `./gpu` package uses**. Migrating goinfer is a near-mechanical import\nswap.\n\n## Why this exists\n\n`go-webgpu` (the zero-CGO `goffi` binding) SIGABRTs at `RequestAdapter` on Go\n1.26: `goffi` targets the Go 1.25 `crosscall2` callback ABI, which broke. **cgo\ncallbacks are robust across Go versions.** This binding uses cgo and statically\nlinks a prebuilt `libwgpu_native.a` into a single binary.\n\n## Status\n\nBuilt and validated on **darwin/arm64 (Apple M1 Pro, Metal)** against the real\nv29.0.0.0 static lib:\n\n```\nadapter: Apple M1 Pro | backend=Metal | type=IntegratedGPU\nABI validation: PASS (dot4I8Packed results match CPU reference)\ndot4I8Packed :  38.7 Gdot4/s\nscalar       :  12.3 Gdot4/s\nspeedup (scalar/dot4): 3.16x   →  GO ✅\n```\n\n`dot4I8Packed` needs **no device feature** for correctness (naga polyfills it\neverywhere). There is **no packed-dot feature flag** in wgpu-native v29 — the\nDP4A fast path is selected automatically by wgpu-core/naga. Detection is\nempirical: compile the builtin and **measure** (see `cmd/dot4probe`).\n\n## Layout\n\n```\n*.go                       the binding (package wgpu)\nlib/\u003cgoos\u003e/\u003cgoarch\u003e/libwgpu_native.a   vendored static libs\nlib/webgpu.h, lib/wgpu.h   headers from the v29.0.0.0 release (match the libs)\nlib/licenses/              wgpu-native MIT + Apache-2.0 texts\nscripts/fetch.sh           re-vendor the libs/headers (build works offline after)\ncmd/dot4probe/             Phase-2 go/no-go: ABI validation + DP4A measurement\n```\n\nVendored platforms: `darwin/arm64`, `darwin/amd64`, `linux/amd64`,\n`linux/arm64`, `windows/amd64` (GNU). Re-fetch with `bash scripts/fetch.sh`.\n\n## Build \u0026 run\n\n```sh\nCGO_ENABLED=1 go build ./...\nCGO_ENABLED=1 go test ./...          # ABI correctness test (skips without a GPU)\nCGO_ENABLED=1 go run ./cmd/dot4probe # full DP4A measurement\n```\n\n## Migrating goinfer\n\ngoinfer's `./gpu` imports `github.com/cogentcore/webgpu/wgpu` as `wgpu`. The\nexported type, method, and descriptor names here match that subset, so:\n\n```sh\n# in goinfer/gpu\ngrep -rl 'cogentcore/webgpu/wgpu' . | xargs sed -i '' \\\n  's#github.com/cogentcore/webgpu/wgpu#github.com/townsendmerino/wgpu#g'\n```\n\nThe import alias stays `wgpu`, every `wgpu.X` call site is unchanged, and the\nblocking call style (`RequestAdapter` returns `(*Adapter, error)`; `MapAsync` +\n`Poll(true, nil)`) is preserved. v29's async futures are hidden behind\nsynchronous wrappers.\n\n## Beyond the drop-in (v29 extras)\n\nThe cogentcore surface is mirrored exactly; on top of it this binding also\nexposes v29-only capabilities useful for the dot4 work:\n\n| Feature | API |\n|---|---|\n| GPU timestamp queries | `Device.CreateQuerySet`, `ComputePassDescriptor.TimestampWrites`, `CommandEncoder.ResolveQuerySet`, `Queue.GetTimestampPeriod` |\n| Pipeline-overridable WGSL constants | `ProgrammableStageDescriptor.Constants []ConstantEntry` |\n| Push-constant-equivalent immediates | `ComputePassEncoder.SetImmediates`, `NativeFeatureImmediates`, `Limits.MaxPushConstantSize` (→ `maxImmediateSize`) |\n| Subgroup adapter info | `AdapterInfo.SubgroupMinSize/MaxSize`, `FeatureNameSubgroups`, `NativeFeatureSubgroup` |\n| Batched dispatch recording (one CGO crossing for a whole chain; ~5× faster record) | `ComputePassEncoder.RecordSteps([]ComputeStep)` |\n\n## License\n\nMIT (this binding). Vendored wgpu-native blobs are MIT OR Apache-2.0 — see\n`NOTICE` and `lib/licenses/`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftownsendmerino%2Fwgpu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftownsendmerino%2Fwgpu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftownsendmerino%2Fwgpu/lists"}