{"id":28751095,"url":"https://github.com/nvpro-samples/nv_cluster_builder","last_synced_at":"2025-06-16T22:09:10.967Z","repository":{"id":296439043,"uuid":"921073785","full_name":"nvpro-samples/nv_cluster_builder","owner":"nvpro-samples","description":"small generic spatial clustering C++ library","archived":false,"fork":false,"pushed_at":"2025-05-30T17:58:04.000Z","size":137,"stargazers_count":29,"open_issues_count":0,"forks_count":2,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-05-31T02:07:49.137Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nvpro-samples.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.txt","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-23T09:37:54.000Z","updated_at":"2025-05-31T02:00:35.000Z","dependencies_parsed_at":"2025-05-31T02:27:17.262Z","dependency_job_id":"737a9e21-85b1-4f4e-b0cd-2f1a15946987","html_url":"https://github.com/nvpro-samples/nv_cluster_builder","commit_stats":null,"previous_names":["nvpro-samples/nv_cluster_builder"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nvpro-samples/nv_cluster_builder","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fnv_cluster_builder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fnv_cluster_builder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fnv_cluster_builder/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fnv_cluster_builder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nvpro-samples","download_url":"https://codeload.github.com/nvpro-samples/nv_cluster_builder/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fnv_cluster_builder/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260250005,"owners_count":22980768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-16T22:09:10.351Z","updated_at":"2025-06-16T22:09:10.957Z","avatar_url":"https://github.com/nvpro-samples.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nv_cluster_builder\n\n**nv_cluster_builder** is a small generic spatial clustering C++ library, created\nto cluster triangle meshes for ray tracing. It is less than 2k lines of code and\nvery similar to a recursive node splitting algorithm to create a bounding volume\nhierarchy (BVH). 
It is limited to axis aligned splits but also produces clusters\nwith desirable attributes for raytracing.\n\n![clusters](doc/clusters.svg)\n\n**Input**\n\n- Spatial locality\n  - ${\\color{red}\\text{Bounding\\ boxes}}$\n  - ${\\color{blue}\\text{Centroids}}$\n- ${\\color{green}\\text{Connectivity}}$ (Optional)\n  - Adjacency lists\n  - Weights\n\n![input](doc/input.svg)\n\n**Output**\n\nCluster items (membership)\n- Ranges: \\{ \\{ ${\\color{blue}0,4}$ \\} , \\{ ${\\color{red}4,4}$ \\} \\}\n- Items: \\{\n    ${\\color{blue}3}$, ${\\color{blue}4}$, ${\\color{blue}6}$, ${\\color{blue}1}$,\n    ${\\color{red}2}$, ${\\color{red}7}$, ${\\color{red}0}$, ${\\color{red}1}$\n    \\}\n\n![output](doc/output.svg)\n\n**Notable features:**\n\n- Primarily spatial, making clusters from bounding boxes\n- Optional user-defined weighted adjacency\n- Generic, not just triangles\n- Customizable [min–max] cluster sizes\n- Parallel, using std::execution\n- Segmented API for clustering multiple subsets at once\n- Knobs to balance optimizations\n\nFor a complete usage example, see https://github.com/nvpro-samples/vk_animated_clusters.\n\n## Usage Example\n\nFor more details, refer to [`nvcluster.h`](include/nvcluster/nvcluster.h) (and optionally [`nvcluster_storage.hpp`](include/nvcluster/nvcluster_storage.hpp)).\n\n```\n\n...\n\n// Create bounding boxes for each item to be clustered\nstd::vector\u003cnvcluster::AABB\u003e boundingBoxes{\n    nvcluster::AABB{{0, 0, 0}, {1, 1, 1}}, // for example\n    ...\n};\n\n// Generate centroids\nstd::vector\u003cglm::vec3\u003e centroids(boundingBoxes.size());\nfor(size_t i = 0; i \u003c boundingBoxes.size(); i++)\n{\n  centroids[i] = 0.5f * (glm::vec3(boundingBoxes[i].bboxMin) + glm::vec3(boundingBoxes[i].bboxMax));\n}\n\n// Input structs\nnvcluster::SpatialElements spatialElements{.boundingBoxes = boundingBoxes.data(),\n                                           .centroids     = reinterpret_cast\u003cconst float*\u003e(centroids.data()),\n        
                                   .elementCount  = static_cast\u003cuint32_t\u003e(boundingBoxes.size())};\nnvcluster::Input           input{\n    .config =\n        {\n            .minClusterSize    = 128,\n            .maxClusterSize    = 128,\n            .costUnderfill     = 0.0f,  // zero to one (exclusive)\n            .costOverlap       = 0.0f,  // zero to one (exclusive)\n            .preSplitThreshold = 0,     // median-split bigger nodes (0=disable)\n        },\n    .spatialElements = \u0026spatialElements,\n    .graph           = nullptr, // optional adjacency - see test_clusterizer.cpp for examples of this in use\n};\n\n// Create context\nnvcluster::ContextCreateInfo info{};\nnvcluster::Context context;\nnvclusterCreateContext(\u0026info, \u0026context);\n\n// Create clusters\n// This is a thin wrapper with std::vector output storage for nvcluster::nvclusterCreate(...)\nnvcluster::ClusterStorage clustering;\nnvcluster::generateClusters(context, input, clustering);\n\n// Do something with the result\nfor(size_t clusterIndex = 0; clusterIndex \u003c clustering.clusterRanges.size(); ++clusterIndex)\n{\n  const nvcluster::Range\u0026 range = clustering.clusterRanges[clusterIndex];\n  for(uint32_t clusterItemIndex = 0; clusterItemIndex \u003c range.count; ++clusterItemIndex)\n  {\n    uint32_t clusterItem = clustering.clusterItems[range.offset + clusterItemIndex];\n    ...\n  }\n}\n```\n\n## Build Integration\n\nThis library uses CMake and requires C++20. It is currently a static\nlibrary, designed with C compatibility in mind with data passed as a structure\nof arrays and output allocated by the user. Integration has been verified by\ndirectly including it with `add_subdirectory`:\n\n```\nadd_subdirectory(nv_cluster_builder)\n...\ntarget_link_libraries(my_target PUBLIC nv_cluster_builder)\n```\n\nIf there is interest, please reach out for CMake config files (for\n`find_package()`) or any other features. 
GitHub issues are welcome.\n\n### Dependencies\n\nNone.\n\nIf tests are enabled (set the CMake `BUILD_TESTING` variable to `ON`),\nnv_cluster_builder will use [`FetchContent`](https://cmake.org/cmake/help/latest/module/FetchContent.html)\nto download GoogleTest.\n\n## How it works\n\nAuthors and contact:\n\n- Pyarelal Knowles (pknowles 'at' nvidia.com), NVIDIA\n- Karthik Vaidyanathan, NVIDIA\n\nCluster goals:\n\n- Consistent size for batch processing\n- Spatially adjacent\n- Well connected\n- Small bounding box (low SAH cost)\n- Low overlap\n- Useful for ray tracing\n\nThe algorithm is basic recursive bisection:\n\n1. Sort inputs by centroids on each axis\n2. Initialize with a root node containing everything\n3. Recursively split until the desired leaf size is reached\n   - Compute candidate split costs for all positions in all axes\n   - Split at the lowest cost, maintaining sorted centroids by partitioning\n4. Leaves become clusters\n\nNovel additions:\n\n- Limit split candidates to guarantee fixed cluster sizes\n- Optimize for full clusters\n- Optimize for less bounding box overlap\n- Optimize for minimum *ratio cut* cost if adjacency exists\n\nThe optimizations are implemented by converting the additional costs to surface\narea heuristic (SAH) units, summing them with the SAH cost and choosing the\nsplit position, on any axis, with minimum total cost.\n\n### Fixed Size Clusters\n\nOnly split at $i \\bmod C = 0$, with $i$ items to the left of a candidate split,\nto make clusters of size $C$. There will be at most one undersized cluster. This\nrule alone will largely break SAH, as shown below for clustering along just one\naxis. In reality, split candidates would be chosen for any axis.\n\n![fixed_breaks_sah](doc/fixed_breaks_sah.svg)\n\nRelax the fixed $C$ constraint to allow a range, $[C_A, C_B]$ where $(1 \\le C_A\n\\le C_B)$. Only split if clusters within the target size range could be formed\non both sides. For example, the figure below shows forming clusters of size 127\nor 128 items. 
Choosing splits in grey regions will produce clusters in the left node\n(top) and right node (bottom) of the desired size range. Limit split candidates\nto the intersection of the grey regions. The equivalent conditions are described\nby the equations below, where $n$ is the number of items in the node being split.\n\n![valid_split_positions](doc/valid_split_positions.svg)\n\n$$i \\bmod C_A \\le (C_B - C_A) \\lfloor \\frac{i}{C_A} \\rfloor$$\n$$(n - i) \\bmod C_A \\le (C_B - C_A) \\lfloor \\frac{n - i}{C_A} \\rfloor$$\n\nFor small inputs it is possible that there is no overlap in valid ranges, in\nwhich case the algorithm falls back to choosing just one. As in the fixed\n$C_A = C_B$ case, there will be at most one undersized cluster.\n\n### Maximize Cluster Sizes\n\nA cluster \"underfill\" cost is introduced to encourage bigger clusters. For\nexample, in the figure below a split position is being considered for clusters\nin the range [1, 4]. The split candidate would produce a node of 2.75 clusters\non the left and 1.25 on the right. This results in $p$ missing cluster items.\nThis value is converted to SAH units and added to the split cost. This library\ncurrently uses a linear cost with a tunable `costUnderfill` constant, but a\ntransfer function that models the true cost, e.g. of performance or memory,\nwould be ideal.\n\n![underfill_cost](doc/underfill_cost.svg)\n\n$$p_{\\text{left}} = C_B \\lceil \\frac{i}{C_B} \\rceil - i$$\n$$p_{\\text{right}} = C_B \\lceil \\frac{n - i}{C_B} \\rceil - (n - i)$$\n$$p = C_B ( \\lceil \\frac{i}{C_B} \\rceil + \\lceil \\frac{n - i}{C_B} \\rceil ) - n$$\n\n### Minimize Bounding Box Overlap\n\nBounding box overlap is bad for ray tracing because rays must enter both boxes while\nin the overlap volume. 
A cost is added for overlapping bounding boxes. Very much\nlike SAH, it is just $n$ multiplied by the surface area of the intersection of\nthe bounding boxes, balanced with a tunable `costOverlap` constant.\n\n### Minimize Adjacency Cut Cost\n\nIf provided, adjacency is integrated by adding the *cut cost* - the sum of\nweights of all item connections broken by the split - to each candidate split\nposition. *Ratio cut* [Wei and Cheng, 1989] is used to avoid degenerate\nsolutions. The cut cost is arbitrarily scaled by the number of items in the node\nto make it relative to SAH, then added to the other costs above.\n\nTo compute the cut cost, the adjacency data is rebuilt to reference node items\nbefore each iteration of recursive node splitting. This allows cut costs to be\ncomputed with a prefix sum scan of summed starting and ending connection\nweights.\n\nTo explain, there are initially three arrays, sorted by item centroids in *X*,\n*Y* and *Z* respectively. After splitting, these arrays are partitioned,\nmaintaining sorted order within nodes. These hold original item indices and in\nfact trivially hold the clustering result after splitting. The input adjacency\narrays index original items, but we instead need the index in those initially\nsorted arrays. This is done by duplicating the adjacency arrays and scatter\nwriting their node-sorted indices. The image below shows this for one axis.\n\n![adjacency_sweep](doc/adjacency_sorted.svg)\n\nWhen computing cut costs for a node, an array of summed weights is created. The\nimage below shows an example with unit weights. The array is initialized with\nthe sum of connecting item weights - positive for connections to the right and\nnegative for connections to the left. Connections to other nodes are ignored.\nThe reindexed adjacency arrays trivially give this information by comparing the\nconnection index with the current item's index and the node boundaries. 
The\nweights array is then prefix summed to obtain the cut cost for each position in\nthe node.\n\n![adjacency_sweep](doc/adjacency_sweep.svg)\n\n### Citation\n\nThe BibTeX entry to cite `nv_cluster_builder` is\n\n```bibtex\n@online{nv_cluster_builder,\n   title   = {{{NVIDIA}}\\textregistered{} {nv_cluster_builder}},\n   author  = {{NVIDIA}},\n   year    = 2025,\n   url     = {https://github.com/nvpro-samples/nv_cluster_builder},\n   urldate = {2025-01-30},\n}\n```\n\n## Limitations\n\nClusters are created by making recursive axis aligned splits. This is useful as\nit greatly reduces the search space and improves performance when clusters are\nused in ray tracing. However, more general clustering solutions than\naxis aligned splits are not considered.\n\nRecursive splitting is done greedily, picking the lowest cost split at each\nstep, which may not lead to a global optimum.\n\nThe algorithm is primarily spatial due to splitting in order of centroids, but\nsolutions can be skewed by adjusting the costs in `nvcluster::Config` and\nadjacency weights in `nvcluster::Graph::connectionWeights`. For example, choosing\nadjacency weights to represent connected triangles or number of shared vertices\ncan result in more vertex reuse within clusters. Weights may also represent face\nnormal similarity or a balance of multiple attributes.\n\nBadly chosen weights can result in degenerate solutions where recursive\nbisection splits off single leaves. This is both slow and rarely desirable.\n\nParallel execution is only supported with libstdc++ and MSVC STL, not libc++.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fnv_cluster_builder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnvpro-samples%2Fnv_cluster_builder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fnv_cluster_builder/lists"}