{"id":26600719,"url":"https://github.com/planetis-m/compute-sim","last_synced_at":"2025-04-09T16:23:40.474Z","repository":{"id":270515520,"uuid":"908180138","full_name":"planetis-m/compute-sim","owner":"planetis-m","description":"Learn and understand compute shader operations and control flow.","archived":false,"fork":false,"pushed_at":"2025-02-20T10:36:18.000Z","size":211,"stargazers_count":19,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-05T07:31:56.082Z","etag":null,"topics":["compute-shader","compute-shaders","gpgpu","gpgpu-computing","gpu-poor","gpu-simulation","nim"],"latest_commit_sha":null,"homepage":"https://planetis-m.github.io/compute-sim/computesim.html","language":"Nim","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/planetis-m.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-25T11:11:58.000Z","updated_at":"2025-03-24T14:27:51.000Z","dependencies_parsed_at":"2024-12-31T21:21:20.945Z","dependency_job_id":"0c08fef4-be60-4fb9-aeeb-91668f4ed9a8","html_url":"https://github.com/planetis-m/compute-sim","commit_stats":null,"previous_names":["planetis-m/compute-sim"],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/planetis-m%2Fcompute-sim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/planetis-m%2Fcompute-sim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/planetis-m%2Fcompute-sim/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/planetis-m%2Fcompute-sim/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/planetis-m","download_url":"https://codeload.github.com/planetis-m/compute-sim/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248065924,"owners_count":21042000,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compute-shader","compute-shaders","gpgpu","gpgpu-computing","gpu-poor","gpu-simulation","nim"],"created_at":"2025-03-23T18:34:03.064Z","updated_at":"2025-04-09T16:23:40.468Z","avatar_url":"https://github.com/planetis-m.png","language":"Nim","funding_links":[],"categories":[],"sub_categories":[],"readme":"# computesim\n\nA compute shader emulator for learning and debugging GPU compute shaders.\n\n## Features\n- Emulates GPU compute shader execution on CPU\n- Simulates workgroups and subgroups with lockstep execution\n- Supports GLSL subgroup operations\n- Thread state visualization and debugging\n- Works with any Nim code that follows compute shader patterns\n\n## Example\n\n```nim\n# Compile with appropriate thread pool size and optimization settings\n# -d:ThreadPoolSize=MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)\n# -d:danger --threads:on --mm:arc\n\nimport std/math, computesim\n\ntype\n  Buffers = object\n    input: seq[int32]\n    atomicSum: int32\n\nproc reduce(b: ptr Buffers; numElements: uint32) {.computeShader.} =\n  let gid = gl_GlobalInvocationID.x\n  let value = if gid \u003c numElements: b.input[gid] else: 0\n\n  # 
First reduce within subgroup using efficient subgroup operation\n  let sum = subgroupAdd(value)\n\n  # Only one thread per subgroup needs to add to global sum\n  if gl_SubgroupInvocationID == 0:\n    atomicAdd b.atomicSum, sum\n\nconst\n  NumElements = 1024'u32\n  WorkGroupSize = 256'u32\n\nproc main() =\n  # Set up compute dimensions\n  let numWorkGroups = uvec3(ceilDiv(NumElements, WorkGroupSize), 1, 1)\n  let workGroupSize = uvec3(WorkGroupSize, 1, 1)\n\n  # Initialize buffers\n  var buffers = Buffers(\n    input: newSeq[int32](NumElements),\n    atomicSum: 0\n  )\n  for i in 0..\u003cNumElements:\n    buffers.input[i] = int32(i)\n\n  # Run reduction on CPU\n  runComputeOnCpu(\n    numWorkGroups = numWorkGroups,\n    workGroupSize = workGroupSize,\n    compute = reduce,\n    ssbo = addr buffers,\n    args = NumElements\n  )\n\n  let result = buffers.atomicSum\n  let expected = int32(NumElements * (NumElements - 1)) div 2\n  echo \"Reduction result: \", result, \", expected: \", expected\n\nmain()\n```\n\nThe example demonstrates:\n- Using subgroup operations for efficient reduction\n- Automatic handling of divergent control flow\n- Atomic operations for cross-workgroup communication\n- Proper synchronization between threads\n\n## Installation\n```\nnimble install computesim\n```\n\n## Usage\n\n1. Write your shader using the `computeShader` macro which:\n   - Transforms control flow for lockstep execution\n   - Converts subgroup operations into commands\n   - Handles thread synchronization\n\n2. 
Configure execution:\n   - Set up workgroup dimensions\n   - Prepare data buffers and shared memory\n   - Call `runComputeOnCpu` with your shader\n\nSee the examples directory for more patterns and use cases.\n\n## Limitations\n- Single wavefront/subgroup size\n- Limited subset of GLSL/compute operations\n- Performance is not representative of real GPU execution\n\n\u003e [!WARNING]\n\u003e ### Workgroup Scheduling\n\u003e While this emulator runs workgroups using CPU threads, real GPU compute shaders have no fairness guarantees between workgroups. Code that works correctly in this CPU emulator can therefore fail on real GPU hardware, where workgroups can execute in any order and with varying levels of parallelism. Do not rely on any workgroup execution order or scheduling behavior observed in this CPU emulator; actual GPUs do not guarantee it.\n\n## Compile-time Defines\n\n### Thread Management\n- `ThreadPoolSize` - Required. Must be at least `MaxConcurrentWorkGroups*(ceilDiv(workgroupSize, SubgroupSize)+1)`\n- `SubgroupSize` - Size of each subgroup/wavefront (default: 8)\n- `MaxConcurrentWorkGroups` - Maximum concurrent workgroups (default: 2)\n\n### Debug Options\nWith `-d:debugSubgroup`, these control which workgroup/subgroup to debug:\n- `debugWorkgroupX/Y/Z` - Workgroup coordinates to debug (default: 0)\n- `debugSubgroupID` - Subgroup ID to debug (default: 0)\n\n```\n# Example: Configure thread pool and groups\nnim c -d:ThreadPoolSize=8 -d:SubgroupSize=4 myshader.nim\n\n# Example: Enable debugging for specific group\nnim c -d:debugSubgroup -d:debugWorkgroupX=1 myshader.nim\n```\n\n## License\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplanetis-m%2Fcompute-sim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplanetis-m%2Fcompute-sim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplanetis-m%2Fcompute-sim/lists"}