https://github.com/planetis-m/compute-sim
Learn and understand compute shader operations and control flow.
- Host: GitHub
- URL: https://github.com/planetis-m/compute-sim
- Owner: planetis-m
- License: mit
- Created: 2024-12-25T11:11:58.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2025-02-20T10:36:18.000Z (2 months ago)
- Last Synced: 2025-04-05T07:31:56.082Z (27 days ago)
- Topics: compute-shader, compute-shaders, gpgpu, gpgpu-computing, gpu-poor, gpu-simulation, nim
- Language: Nim
- Homepage: https://planetis-m.github.io/compute-sim/computesim.html
- Size: 206 KB
- Stars: 19
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
# computesim
A compute shader emulator for learning and debugging GPU compute shaders.
## Features
- Emulates GPU compute shader execution on CPU
- Simulates workgroups and subgroups with lockstep execution
- Supports GLSL subgroup operations
- Thread state visualization and debugging
- Works with any Nim code that follows compute shader patterns

## Example
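The example in this section reduces within each subgroup first and then issues one atomic add per subgroup. As a rough illustration of that two-level pattern, here is a minimal sketch in plain Nim that does not use computesim at all. The helper name `twoLevelSum` and the input values `0..1023` are assumptions made for illustration; `SubgroupSize = 8` mirrors the default documented under Compile-time Defines.

```nim
# Plain-Nim sketch of a two-level reduction; it does not use computesim.
const
  NumElements = 1024
  SubgroupSize = 8   # mirrors the emulator's documented default

proc twoLevelSum(input: seq[int32]): int32 =
  ## Sums `input` in subgroup-sized chunks, then adds each partial sum
  ## to a single total (standing in for one atomic add per subgroup).
  var start = 0
  while start < input.len:
    let stop = min(start + SubgroupSize, input.len)
    var partial = 0'i32
    for i in start ..< stop:
      partial += input[i]   # stands in for the subgroup-wide add
    result += partial       # stands in for the atomic add by one invocation
    start = stop

# Hypothetical input: the values 0, 1, ..., 1023.
var input = newSeq[int32](NumElements)
for i in 0 ..< NumElements:
  input[i] = int32(i)

echo twoLevelSum(input)  # 523776, i.e. 1023 * 1024 / 2
```

With 1024 elements and a subgroup size of 8, this performs 128 additions into the shared total instead of 1024, which is the reason the shader version restricts the atomic add to one invocation per subgroup.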
```nim
# Compile with appropriate thread pool size and optimization settings
# -d:ThreadPoolSize=MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)
# -d:danger --threads:on --mm:arc

import std/math, computesim

type
  Buffers = object
    input: seq[int32]
    atomicSum: int32

proc reduce(b: ptr Buffers; numElements: uint32) {.computeShader.} =
  let gid = gl_GlobalInvocationID.x
  let value = if gid < numElements: b.input[gid] else: 0

  # First reduce within subgroup using efficient subgroup operation
  let sum = subgroupAdd(value)

  # Only one thread per subgroup needs to add to global sum
  if gl_SubgroupInvocationID == 0:
    atomicAdd b.atomicSum, sum

const
  NumElements = 1024'u32
  WorkGroupSize = 256'u32

proc main() =
  # Set up compute dimensions
  let numWorkGroups = uvec3(ceilDiv(NumElements, WorkGroupSize), 1, 1)
  let workGroupSize = uvec3(WorkGroupSize, 1, 1)

  # Initialize buffers
  var buffers = Buffers(
    input: newSeq[int32](NumElements),
    atomicSum: 0
  )
  for i in 0..
  # ... (the rest of the example is cut off in this copy)
```

> [!WARNING]
> ### Workgroup Scheduling
> While this emulator runs workgroups using CPU threads, real GPU compute shaders have no fairness guarantees between workgroups. This means your code might work correctly in this CPU emulator but fail on real GPU hardware where workgroups can execute in any order and with varying levels of parallelism. Do not rely on any assumptions about workgroup execution order or scheduling that might be true in this CPU emulator but not guaranteed on actual GPUs.

## Compile-time Defines
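
The `ThreadPoolSize` formula from the example's compile comment can be checked with a few lines of Nim before passing the value on the command line. The sketch below is only an illustration: `requiredThreadPoolSize` is a hypothetical helper, not part of computesim, and the inputs assume the documented defaults (`SubgroupSize` 8, `MaxConcurrentWorkGroups` 2) together with the 256-wide workgroup used in the example.

```nim
import std/math

# Hypothetical helper for working out the define; not part of computesim.
proc requiredThreadPoolSize(maxConcurrentWorkGroups, workgroupSize,
                            subgroupSize: int): int =
  ## MaxConcurrentWorkGroups * (ceilDiv(workgroupSize, SubgroupSize) + 1)
  maxConcurrentWorkGroups * (ceilDiv(workgroupSize, subgroupSize) + 1)

# Documented defaults with the example's workgroup size of 256:
# 2 * (ceilDiv(256, 8) + 1) = 2 * (32 + 1)
echo requiredThreadPoolSize(2, 256, 8)  # 66
```

Under these assumptions the minimum would be `-d:ThreadPoolSize=66`. A smaller `SubgroupSize` or a larger workgroup raises the requirement, since each workgroup is split into more subgroups.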
### Thread Management
- `ThreadPoolSize` - Required. Must be at least `MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)`
- `SubgroupSize` - Size of each subgroup/wavefront (default: 8)
- `MaxConcurrentWorkGroups` - Maximum concurrent workgroups (default: 2)

### Debug Options
With `-d:debugSubgroup`, these control which workgroup/subgroup to debug:
- `debugWorkgroupX/Y/Z` - Workgroup coordinates to debug (default: 0)
- `debugSubgroupID` - Subgroup ID to debug (default: 0)

```nim
# Example: Configure thread pool and groups
nim c -d:ThreadPoolSize=8 -d:SubgroupSize=4 myshader.nim

# Example: Enable debugging for specific group
nim c -d:debugSubgroup -d:debugWorkgroupX=1 myshader.nim
```

## License
MIT