{"id":27679876,"url":"https://github.com/owensgroup/RXMesh","last_synced_at":"2025-04-25T04:01:49.486Z","repository":{"id":41444767,"uuid":"357954110","full_name":"owensgroup/RXMesh","owner":"owensgroup","description":"GPU-accelerated triangle mesh processing","archived":false,"fork":false,"pushed_at":"2025-04-23T15:33:32.000Z","size":11434,"stargazers_count":255,"open_issues_count":2,"forks_count":35,"subscribers_count":21,"default_branch":"main","last_synced_at":"2025-04-23T16:47:38.805Z","etag":null,"topics":["3d","3d-graphics","cuda","data-structure","geometry","geometry-processing","gpu","mesh","mesh-processing","parallel-computing","surface-mesh"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/owensgroup.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-04-14T15:29:48.000Z","updated_at":"2025-04-23T15:33:36.000Z","dependencies_parsed_at":"2024-04-02T04:30:36.908Z","dependency_job_id":"7d47ef35-92d3-4a2c-a183-58b107951dae","html_url":"https://github.com/owensgroup/RXMesh","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owensgroup%2FRXMesh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owensgroup%2FRXMesh/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owensgroup%2FRXMesh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owensgr
oup%2FRXMesh/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/owensgroup","download_url":"https://codeload.github.com/owensgroup/RXMesh/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250751840,"owners_count":21481313,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d","3d-graphics","cuda","data-structure","geometry","geometry-processing","gpu","mesh","mesh-processing","parallel-computing","surface-mesh"],"created_at":"2025-04-25T04:00:57.168Z","updated_at":"2025-04-25T04:01:49.471Z","avatar_url":"https://github.com/owensgroup.png","language":"Cuda","funding_links":[],"categories":["Applications"],"sub_categories":[],"readme":"# **RXMesh** [![Ubuntu](https://github.com/owensgroup/RXMesh/actions/workflows/Ubuntu.yml/badge.svg)](https://github.com/owensgroup/RXMesh/actions/workflows/Ubuntu.yml) [![Windows](https://github.com/owensgroup/RXMesh/actions/workflows/Windows.yml/badge.svg)](https://github.com/owensgroup/RXMesh/actions/workflows/Windows.yml)\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./assets/david_pacthes.png\" width=\"80%\"\u003e\u003cbr\u003e\n\u003c/p\u003e\n\n## **Contents**\n- [**About**](#about)\n- [**Compilation**](#compilation)\n  * [**Dependencies**](#dependencies)\n- [**Organization**](#organization)\n- [**Programming Model**](#programming-model)  \n  * [**Structures**](#structures)\n  * [**Computation**](#computation)\n  * [**Viewer**](#viewer)\n  * [**Matrices and Vectors**](#matrices-and-vectors)\n- [**Replicability**](#replicability)\n- 
[**Bibtex**](#bibtex)\n\n## **About**\nRXMesh is a surface triangle mesh data structure and programming model for processing static meshes on the GPU. RXMesh aims to provide a high-performance, generic, and compact data structure that can handle meshes regardless of their quality (e.g., non-manifold). The programming model helps to hide the complexity of the data structure and provides an intuitive access model for different use cases. For more details, please check out our paper and GTC talk:\n\n- *[RXMesh: A GPU Mesh Data Structure](https://escholarship.org/uc/item/8r5848vp)*\u003cbr\u003e\n*[Ahmed H. Mahmoud](https://www.ece.ucdavis.edu/~ahdhn/), [Serban D. Porumbescu](https://web.cs.ucdavis.edu/~porumbes/), and [John D. Owens](https://www.ece.ucdavis.edu/~jowens/)*\u003cbr\u003e\n*[ACM Transactions on Graphics](https://dl.acm.org/doi/abs/10.1145/3450626.3459748) (Proceedings of SIGGRAPH 2021)*\n\n- *[RXMesh: A High-performance Mesh Data Structure and Programming Model on the GPU  [S41051]](https://www.nvidia.com/gtc/session-catalog/?tab.scheduledorondemand=1583520458947001NJiE\u0026search=rxmesh#/session/1633891051385001Q9SE)—NVIDIA GTC 2022*\n\nThe library also features a sparse and dense matrix infrastructure that is tightly coupled with the mesh data structure. We expose various [cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html), [cuSparse](https://docs.nvidia.com/cuda/cusparse/), and [cuBlas](https://docs.nvidia.com/cuda/cublas/) operations through the sparse and dense matrices, tailored for geometry processing applications.\n\nThis repository provides 1) source code to reproduce the results presented in the paper (git tag [`v0.1.0`](https://github.com/owensgroup/RXMesh/tree/v0.1.0)) and 2) ongoing development of RXMesh.\n\n## **Compilation**\nThe code can be compiled on Ubuntu, Windows, and WSL provided that CUDA (\u003e=11.1.0) is installed. 
To run the executable(s), an NVIDIA GPU should be installed on the machine.\n\n### **Dependencies**\n- [OpenMesh](https://www.graphics.rwth-aachen.de:9000/OpenMesh/OpenMesh) to verify the applications against reference CPU implementation\n- [RapidJson](https://github.com/Tencent/rapidjson) to report the results in JSON file(s)\n- [GoogleTest](https://github.com/google/googletest) for unit tests\n- [spdlog](https://github.com/gabime/spdlog) for logging\n- [glm](https://github.com/g-truc/glm.git) for small vectors and matrices operations \n- [Eigen](https://gitlab.com/libeigen/eigen) for small vectors and matrices operations \n- [Polyscope ](https://github.com/nmwsharp/polyscope) for visualization  \n- [cereal](https://github.com/USCiLab/cereal.git) for serialization \n\n\nAll the dependencies are installed automatically! To compile the code:\n\n```\n\u003e git clone https://github.com/owensgroup/RXMesh.git\n\u003e cd RXMesh\n\u003e mkdir build \n\u003e cd build \n\u003e cmake ../\n```\nDepending on the system, this will generate either a `.sln` project on Windows or a `make` file for a Linux system.\n\n## **Organization**\nRXMesh is a CUDA/C++ header-only library. All unit tests are under the `tests/` folder. This includes the unit test for some basic functionalities along with the unit test for the query operations. All applications are under the `apps/` folder.\n\n## **Programming Model**\nThe goal of defining a programming  model is to make it easy to write applications using RXMesh without getting into the nuances of the data structure. Applications written using RXMesh are composed of one or more of the high-level building blocks defined under [**Computation**](#computation). To use these building blocks, the user would have to interact with data structures specific to RXMesh discussed under [**Structures**](#structures). 
Finally, RXMesh integrates [Polyscope](https://polyscope.run) as a mesh [**Viewer**](#viewer) which the user can use to render their final results or for debugging purposes. \n\n### **Structures**\n- **Attributes** are the metadata (geometry information) attached to vertices, edges, or faces. Allocation of the attributes is per-patch basis and managed internally by RXMesh. The allocation could be done on the host, device, or both. Allocating attributes on the host is only beneficial for I/O operations or initializing attributes and then eventually moving them to the device. \n  - Example: allocation\n    ```c++\n    RXMeshStatic rx(\"input.obj\");\n    auto vertex_color = \n      rx.add_vertex_attribute\u003cfloat\u003e(\"vColor\", //Unique name \n                                     3,        //Number of attribute per vertex \n                                     DEVICE,   //Allocation place \n                                     SoA);     //Memory layout (SoA vs. AoS)                                 \n\n    ```\n  - Example: reading from `std::vector`\n    ```c++\n    RXMeshStatic rx(\"input.obj\");\n    std::vector\u003cstd::vector\u003cfloat\u003e\u003e face_color_vector;\n    //....\n\n    auto face_color = \n      rx.add_face_attribute\u003cint\u003e(face_color_vector,//Input attribute where number of attributes per face is inferred \n                                 \"fColor\",         //Unique name                                \n                                 SoA);             //Memory layout (SoA vs. AoS)                                  \n    ```\n  - Example: move, reset, and copy\n    ```c++    \n    //By default, attributes are allocated on both host and device     \n    auto edge_attr = rx.add_edge_attribute\u003cfloat\u003e(\"eAttr\", 1);  \n    //Initialize edge_attr on the host \n    // ..... 
\n\n    //Move attributes from host to device \n    edge_attr.move(HOST, DEVICE);\n\n    //Reset all entries to zero\n    edge_attr.reset(0, DEVICE);\n\n    auto edge_attr_1 = rx.add_edge_attribute\u003cfloat\u003e(\"eAttr1\", 1);  \n\n    //Copy from another attribute. \n    //Here, what is on the host side of edge_attr will be copied into the device side of edge_attr_1\n    edge_attr_1.copy_from(edge_attr, HOST, DEVICE);\n    ```\n\n- **Handles** are the unique identifiers for vertices, edges, and faces. They are usually internally populated by RXMesh (by concatenating the patch ID and mesh element index within the patch). Handles can be used to access attributes and are the inputs to `for_each` and query operations. \n\n  - Example: Setting vertex attribute using vertex handle \n    ```c++  \n    auto vertex_color = ...    \n    VertexHandle vh; \n    //...\n    \n    vertex_color(vh, 0) = 0.9;\n    vertex_color(vh, 1) = 0.5;\n    vertex_color(vh, 2) = 0.6;\n    ```\n\n- **Iterators** are used during query operations to iterate over the output of the query operation. The type of iterator defines the type of mesh element iterated on, e.g., `VertexIterator` iterates over vertices, which are the output of `VV`, `EV`, or `FV` query operations. Since query operations are only supported on the device, iterators can only be used inside a GPU kernel. Iterators are usually populated internally. \n\n  - Example: Iterating over faces \n      ```c++  \n      FaceIterator f_iter; \n      //...\n\n      for (uint32_t f = 0; f \u003c f_iter.size(); ++f) {\t\n        FaceHandle fh = f_iter[f];\n        //do something with fh ....\n      }\n      ```\n\n\n### **Computation**\n- **`for_each`** runs a computation over all vertices, edges, or faces _without_ requiring information from neighbor mesh elements. The computation that runs on each mesh element is defined as a lambda function that takes a handle as an input. The lambda function could run either on the host, device, or both. 
On the host, we parallelize the computation using OpenMP. Care must be taken for lambda function on the device since it needs to be annotated using `__device__` and it can only capture by value. More about lambda function in CUDA can be found [here](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#extended-lambda)\n  - Example: using `for_each` to initialize attributes \n    ```cpp\n    RXMeshStatic rx(\"input.obj\");\n    auto vertex_pos   = rx.get_input_vertex_coordinates();                   //vertex position \n    auto vertex_color = rx.add_vertex_attribute\u003cfloat\u003e(\"vColor\", 3, DEVICE); //vertex color \n\n    //This function will be executed on the device \n    rx.for_each_vertex(\n        DEVICE,\n        [vertex_color, vertex_pos] __device__(const VertexHandle vh) {\n            vertex_color(vh, 0) = 0.9;\n            vertex_color(vh, 1) = vertex_pos(vh, 1);\n            vertex_color(vh, 2) = 0.9;\n        });\n    ```\n  Alternatively, `for_each` operations could be written the same way as Queries operations (see below). This might be useful if the user would like to combine a `for_each` with queries operations in the same kernel. For more examples, checkout [`ForEach`](/tests/RXMesh_test/test_for_each.cuh) unit test. \n\n- **Queries** operations supported by RXMesh with description are listed below \n\n  | Query |  Description                               |\n  |-------|:-------------------------------------------|\n  | `VV`  | For vertex V, return its adjacent vertices |\n  | `VE`  | For vertex V, return its incident edges    |\n  | `VF`  | For vertex V, return its incident faces    |\n  | `EV`  | For edge E, return its incident vertices   |\n  | `EF`  | For edge E, return its incident faces      |\n  | `FV`  | For face F, return its incident vertices   |\n  | `FE`  | For face F, return its incident edges      |\n  | `FF`  | For face F, return its adjacent faces      |\n\n  Queries are only supported on the device. 
RXMesh API for queries takes a lambda function along with the type of query. The lambda function defines the computation that will be run on the query output. \n\n  - Example: [vertex normal computation](./apps/VertexNormal/vertex_normal_kernel.cuh)\n    ```cpp\n    template\u003cuint32_t blockSize\u003e\n    __global__ void vertex_normal (Context context){      \n\t    auto compute_vn = [\u0026](const FaceHandle face_id, const VertexIterator\u0026 fv) {\n        \t//This thread is assigned to face_id\n\n        \t// get the coordinates of the face's three vertices\n        \tvec3\u003cT\u003e c0(coords(fv[0], 0), coords(fv[0], 1), coords(fv[0], 2));\n        \tvec3\u003cT\u003e c1(coords(fv[1], 0), coords(fv[1], 1), coords(fv[1], 2));\n\t        vec3\u003cT\u003e c2(coords(fv[2], 0), coords(fv[2], 1), coords(fv[2], 2));\n\n          //compute face normal\n          vec3\u003cT\u003e n = cross(c1 - c0, c2 - c0);\n\n\t        // add the face's normal to its vertices\n        \t\tfor (uint32_t v = 0; v \u003c 3; ++v)     // for every vertex in this face\n\t            for (uint32_t i = 0; i \u003c 3; ++i)   // for the vertex 3 coordinates\n        \t\t        atomicAdd(\u0026normals(fv[v], i), n[i]);          \n\t    };\n\n      //Query must be called by all threads in the block. Thus, we create this cooperative_group\n      //that uses all threads in the block and pass it to the Query \n      auto block = cooperative_groups::this_thread_block();\n      \n      Query\u003cblockSize\u003e query(context);\n\n      //Query will first perform the query and store the results in shared memory. 
ShmemAllocator is \n      //passed to the function to make sure we don't over-allocate or overwrite user-allocated shared\n      //memory \n      ShmemAllocator shrd_alloc;\n\n      //Finally, we run the user-defined computation i.e., compute_vn\n      query.dispatch\u003cOp::FV\u003e(block, shrd_alloc, compute_vn);\n    } \n    ```\n  To save computation, `query.dispatch` could be run on a subset of the input mesh elements, i.e., the _active set_. The user can define the active set using a lambda function that returns true if the input mesh element is in the active set. \n\n  - Example: defining active set\n    ```cpp\n    template\u003cuint32_t blockSize\u003e\n    __global__ void active_set_query (Context context){\n      auto active_set = [\u0026](FaceHandle face_id) -\u003e bool{ \n        // ....         \n\t    };\n\n\t    auto computation = [\u0026](const FaceHandle face_id, const VertexIterator\u0026 fv) {          \n        // ....         \n\t    };\n\n\t    query.dispatch\u003cOp::FV, blockSize\u003e(context, computation, active_set);\n    } \n    ```\n\n- **Reduction** operations apply a binary associative operation on the input attributes. RXMesh provides dot products between two attributes (of the same type), the L2 norm of an input attribute, and user-defined reduction operations on an input attribute. For a user-defined reduction operation, the user needs to pass a binary reduction functor with member `__device__ T operator()(const T \u0026a, const T \u0026b)` or use one of [CUB's thread operators](https://github.com/NVIDIA/cub/blob/main/cub/thread/thread_operators.cuh), e.g., `cub::Max()`. Reduction operations require allocation of temporary buffers, which we abstract away using `ReduceHandle`. 
\n\n  - Example: dot product, L2 norm, user-defined reduction \n    ```cpp \n    RXMeshStatic rx(\"input.obj\");\n    auto vertex_attr1 = rx.add_vertex_attribute\u003cfloat\u003e(\"v_attr1\", 3, DEVICE);\n    auto vertex_attr2 = rx.add_vertex_attribute\u003cfloat\u003e(\"v_attr2\", 3, DEVICE);\n\n    // Populate vertex_attr1 and vertex_attr2 \n    //....\n\n    //Reduction handle \n    ReduceHandle reduce(vertex_attr1);\n\n    //Dot product between two attributes. Results are returned on the host \n    float dot_product = reduce.dot(vertex_attr1, vertex_attr2);\n\n    cudaStream_t stream; \n    //init stream \n    //...\n\n    //Reduction operation could be performed on specific attribute and using specific stream \n    float l2_norm = reduce.norm2(vertex_attr1, //input attribute \n                                 1,            //attribute ID. If not specified, reduction is run on all attributes \n                                 stream);      //stream used for reduction. \n    \n\n    //User-defined reduction operation (here, the maximum value) \n    float max_val = reduce.reduce(vertex_attr1,                          //input attribute \n                                  cub::Max(),                            //binary reduction functor \n                                  std::numeric_limits\u003cfloat\u003e::lowest()); //initial value \n    ```\n\n### **Viewer**\nStarting with v0.2.1, RXMesh integrates [Polyscope](https://polyscope.run) as a mesh viewer. To use it, make sure to turn on the CMake parameter `USE_POLYSCOPE` i.e., \n\n```\n\u003e cd build \n\u003e cmake -DUSE_POLYSCOPE=True ../\n``` \nBy default, the parameter is set to True. RXMesh implements the necessary functionalities to pass attributes to Polyscope—thanks to its [data adaptors](https://polyscope.run/data_adaptors/). However, attributes must first be moved to the host before they are passed to Polyscope. 
For more information about Polyscope's different visualization options, please checkout Polyscope's [Surface Mesh documentation](https://polyscope.run/structures/surface_mesh/basics/).\n\n  - Example: [render vertex color](./tests/Polyscope_test/test_polyscope.cu)      \n    ```cpp\n    RXMeshStatic rx(\"dragon.obj\");\n\n    //vertex color attribute \n    auto vertex_color = rx.add_vertex_attribute\u003cfloat\u003e(\"vColor\", 3);\n\n    //Populate vertex color on the device\n    //....\n    \n    //Move vertex color to the host \n    vertex_color.move(DEVICE, HOST);\n\n    //polyscope instance associated with rx \n    auto polyscope_mesh = rx.get_polyscope_mesh();\n\n    //pass vertex color to polyscope \n    polyscope_mesh-\u003eaddVertexColorQuantity(\"vColor\", vertex_color);\n\n    //render \n    polyscope::show();\n    ```\n    \u003cp align=\"center\"\u003e\n    \t\u003cimg src=\"./assets/polyscope_dragon.PNG\" width=\"80%\"\u003e\u003cbr\u003e\n    \u003c/p\u003e    \n\n### **Matrices and Vectors**\n- **Large Matrices:** RXMesh has built-in support for large sparse and dense matrices built on top of [cuSparse](https://docs.nvidia.com/cuda/cusparse/) and [cuBlas](https://docs.nvidia.com/cuda/cublas/), respectively. For example, attributes can be converted to dense matrices as follows \n\n```cpp\n\nRXMeshStatic rx(\"input.obj\");\n\n//Input mesh coordinates as VertexAttribute\nstd::shared_ptr\u003cVertexAttribute\u003cfloat\u003e\u003e x = rx.get_input_vertex_coordinates();\n\n//Convert the attributes to a (#vertices x 3) dense matrix \nstd::shared_ptr\u003cDenseMatrix\u003cfloat\u003e\u003e x_mat = x-\u003eto_matrix();\n\n//do something with x_mat\n//....\n\n//Populate the VertexAttribute coordinates back with the content of the dense matrix\nx-\u003efrom_matrix(x_mat.get());\n\n```\nDense matrices can be accessed using the usual row and column indices or via the mesh element handle (Vertex/Edge/FaceHandle) as a row index. 
This allows for easy access to the correct row associated with a specific vertex, edge, or face. Dense matrices support various operations such as absolute sum, AXPY, dot products, norm2, scaling, and swapping.\n\nRXMesh supports sparse matrices, where the sparsity pattern matches the query operations. For example, it is often necessary to build a sparse matrix of size #V x #V with non-zero values at (i, j) only if the vertex corresponding to row i is connected by an edge to the vertex corresponding to column j. Currently, we only support the VV sparsity pattern, but we are working on expanding to all other types of queries.\n\nThe sparse matrix can be used to solve a linear system via Cholesky, LU, or QR factorization (relying on [cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html)). The solver offers two APIs. The high-level API reorders the input sparse matrix (to reduce non-zero fill-in after matrix factorization) and allocates the additional memory needed to solve the system. Repeated calls to this API will reorder the matrix and allocate/deallocate the temporary memory with each call. For scenarios where the matrix remains unchanged but multiple right-hand sides need to be solved, users can utilize the low-level API, which splits the solve method into `pre_solve()` and `solve()`. The former reorders the matrix and allocates temporary memory only once. The low-level API is currently only supported for Cholesky-based factorization. Check out the MCF application for an example of how to set up and use the solver.\n\nSimilar to dense matrices, sparse matrices also support accessing the matrix using the VertexHandle and multiplication by dense matrices.\n\n- **Small Matrices:**\nIt is often necessary to perform operations on small matrices as part of geometry processing applications, such as computing the SVD of a 3x3 matrix or normalizing a 1x3 vector. 
For this purpose, RXMesh attributes can be converted into glm or Eigen matrices, as demonstrated in the vertex_normal example above. Both glm and Eigen support small matrix operations inside the GPU kernel.\n\n\n\n## **Replicability**\nThis repo was awarded the [replicability stamp](http://www.replicabilitystamp.org#https-github-com-owensgroup-rxmesh) by the Graphics Replicability Stamp Initiative (GRSI) :tada:. Visit git tag [`v0.1.0`](https://github.com/owensgroup/RXMesh/tree/v0.1.0) for more information about replicability scripts.\n\n## **Bibtex**\n```\n@article{Mahmoud:2021:RAG,\n  author       = {Ahmed H. Mahmoud and Serban D. Porumbescu and John D. Owens},\n  title        = {{RXM}esh: A {GPU} Mesh Data Structure},\n  journal      = {ACM Transactions on Graphics},\n  year         = 2021,\n  volume       = 40,\n  number       = 4,\n  month        = aug,\n  issue_date   = {August 2021},\n  articleno    = 104,\n  numpages     = 16,\n  pages        = {104:1--104:16},\n  url          = {https://escholarship.org/uc/item/8r5848vp},\n  full_talk    = {https://youtu.be/Se_cNAol4hY},\n  short_talk   = {https://youtu.be/V_SHMXnCVws},\n  doi          = {10.1145/3450626.3459748},\n  acmauthorize = {https://dl.acm.org/doi/10.1145/3450626.3459748?cid=81100458295},\n  acceptance   = {149/444 (33.6\\%)},\n  ucdcite      = {a140}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowensgroup%2FRXMesh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fowensgroup%2FRXMesh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowensgroup%2FRXMesh/lists"}