https://github.com/planetis-m/compute-sim

Learn and understand compute shader operations and control flow.

# computesim

A compute shader emulator for learning and debugging GPU compute shaders.

## Features
- Emulates GPU compute shader execution on CPU
- Simulates workgroups and subgroups with lockstep execution
- Supports GLSL subgroup operations
- Thread state visualization and debugging
- Works with any Nim code that follows compute shader patterns

## Example

```nim
# Compile with appropriate thread pool size and optimization settings
# -d:ThreadPoolSize=MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)
# -d:danger --threads:on --mm:arc

import std/math, computesim

type
  Buffers = object
    input: seq[int32]
    atomicSum: int32

proc reduce(b: ptr Buffers; numElements: uint32) {.computeShader.} =
  let gid = gl_GlobalInvocationID.x
  let value = if gid < numElements: b.input[gid] else: 0

  # First reduce within the subgroup using an efficient subgroup operation
  let sum = subgroupAdd(value)

  # Only one thread per subgroup needs to add to the global sum
  if gl_SubgroupInvocationID == 0:
    atomicAdd b.atomicSum, sum

const
  NumElements = 1024'u32
  WorkGroupSize = 256'u32

proc main() =
  # Set up compute dimensions
  let numWorkGroups = uvec3(ceilDiv(NumElements, WorkGroupSize), 1, 1)
  let workGroupSize = uvec3(WorkGroupSize, 1, 1)

  # Initialize buffers
  var buffers = Buffers(
    input: newSeq[int32](NumElements),
    atomicSum: 0
  )
  for i in 0 ..< NumElements.int:
    buffers.input[i] = int32(i + 1)

  # Dispatch the emulated shader (the tail of this example was truncated in
  # this copy; the dispatch call is reconstructed here, so check the computesim
  # docs if the entry point differs in your version)
  runComputeOnCpu(numWorkGroups, workGroupSize, reduce, addr buffers, NumElements)

  # 1 + 2 + ... + 1024 = 524800
  echo "Sum: ", buffers.atomicSum

main()
```

> [!WARNING]
> ### Workgroup Scheduling
> While this emulator runs workgroups on CPU threads, real GPUs make no fairness guarantees between workgroups: they can execute in any order and with varying degrees of parallelism. Code that depends on workgroup execution order or scheduling may therefore work correctly in this emulator yet fail on real GPU hardware.
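
As a purely illustrative sketch of the kind of code the warning targets: the `flag` field and the producer/consumer split below are invented for this example (they are not part of computesim's API), and it assumes computesim exposes the standard `gl_WorkGroupID` built-in alongside the other GLSL built-ins used above.

```nim
# ANTI-PATTERN sketch: inter-workgroup spin-waiting. This may happen to pass in
# the CPU emulator, where every workgroup eventually gets scheduled, but can
# hang on a real GPU if the "producer" workgroup is never resident while the
# others spin.
proc badProducerConsumer(b: ptr Buffers) {.computeShader.} =
  if gl_WorkGroupID.x == 0:
    b.flag = 1            # hypothetical "producer" workgroup publishes a value
  else:
    while b.flag == 0:    # other workgroups busy-wait on workgroup 0
      discard             # no fairness guarantee: this loop may never exit
```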

## Compile-time Defines

### Thread Management
- `ThreadPoolSize` - Required. Must be at least `MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)`
- `SubgroupSize` - Size of each subgroup/wavefront (default: 8)
- `MaxConcurrentWorkGroups` - Maximum concurrent workgroups (default: 2)
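
For example, with the defaults above (`SubgroupSize = 8`, `MaxConcurrentWorkGroups = 2`) and the 256-thread workgroup from the example, the required minimum works out as follows:

```nim
import std/math

const
  WorkGroupSize = 256          # threads per workgroup (as in the example)
  SubgroupSize = 8             # default
  MaxConcurrentWorkGroups = 2  # default

# MaxConcurrentWorkGroups * (ceilDiv(workgroupSize, SubgroupSize) + 1)
const MinThreadPoolSize =
  MaxConcurrentWorkGroups * (ceilDiv(WorkGroupSize, SubgroupSize) + 1)
echo MinThreadPoolSize  # 2 * (32 + 1) = 66
```

So this configuration needs at least `-d:ThreadPoolSize=66`.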

### Debug Options
With `-d:debugSubgroup`, these control which workgroup/subgroup to debug:
- `debugWorkgroupX/Y/Z` - Workgroup coordinates to debug (default: 0)
- `debugSubgroupID` - Subgroup ID to debug (default: 0)

```shell
# Example: Configure thread pool and groups
nim c -d:ThreadPoolSize=8 -d:SubgroupSize=4 myshader.nim

# Example: Enable debugging for specific group
nim c -d:debugSubgroup -d:debugWorkgroupX=1 myshader.nim
```

## License
MIT