{"id":19817157,"url":"https://github.com/nvpro-samples/gl_cuda_simple_interop","last_synced_at":"2025-09-12T07:39:24.406Z","repository":{"id":169191735,"uuid":"623188982","full_name":"nvpro-samples/gl_cuda_simple_interop","owner":"nvpro-samples","description":"Sample showing OpenGL and CUDA interop","archived":false,"fork":false,"pushed_at":"2024-06-28T09:54:37.000Z","size":747,"stargazers_count":10,"open_issues_count":0,"forks_count":1,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-06-28T11:15:23.385Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nvpro-samples.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-03T21:55:46.000Z","updated_at":"2024-06-28T09:54:40.000Z","dependencies_parsed_at":"2024-06-28T11:08:27.003Z","dependency_job_id":null,"html_url":"https://github.com/nvpro-samples/gl_cuda_simple_interop","commit_stats":null,"previous_names":["nvpro-samples/gl_cuda_simple_interop"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fgl_cuda_simple_interop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fgl_cuda_simple_interop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fgl_cuda_simple_interop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fgl_cuda_simple_in
terop/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nvpro-samples","download_url":"https://codeload.github.com/nvpro-samples/gl_cuda_simple_interop/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224253425,"owners_count":17280934,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T10:11:51.521Z","updated_at":"2024-11-12T10:11:51.672Z","avatar_url":"https://github.com/nvpro-samples.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenGL Interop\n\n\nBy Maximilian Müller\n\nThis blog is an introduction to fast OpenGL and CUDA interop. The goal is to explain how to mix CUDA as compute backend and \nOpenGL for displaying in the same application. In a nutshell, to achieve this, all objects are allocated in Vulkan, \nbut rendered with OpenGL.\nA sample how to do this with Vulkan only using a compute shader is shown [here](https://github.com/nvpro-samples/gl_vk_simple_interop).\n\nTopics covered:\n- Importing Vulkan memory to GL and CUDA\n- Interoperability OGL \u003c==\u003e CUDA using VK semaphores\n\n![Screenshot](doc/screenshot.png)\n\n# Interop Paradigm\n\nFor OpenGL to work with CUDA, it is important that all memory objects (buffers and semaphores) are allocated in Vulkan. \nA handle of those objects needs to be retrieved which is used to import those elements to CUDA and GL. 
Those new \nOpenGL and CUDA objects point to the exact same memory location as the Vulkan one, meaning that changes made through \nany of the APIs are visible to all of them.\n\nIn the current example, we will deal with two memory objects:\n\n- Vertices: holding the triangle objects\n- Image: the pixels of the image\n\nAnother important aspect is the synchronization between OpenGL, CUDA and Vulkan. This topic is discussed in detail\nin the Semaphores section.\n\n\n![Screenshot](doc/interop_api.jpg )\n\n# Prerequisite\n\nTo compile the project, please clone the [nvpro_core](https://github.com/nvpro-samples/nvpro_core) repository into the same parent folder as this repository, \nor provide the path to the parent directory of the nvpro_core repository via the cmake variable `BASE_DIRECTORY`.\nPlease note that the repository must be cloned recursively.\nFurthermore, you need to have the Vulkan SDK and [CUDA toolkit](https://developer.nvidia.com/cuda-downloads) installed.\nThe sample was tested with CUDA versions 11.7 and 11.8.\nPlease note that a cmake version higher than 3.12 is required.\n\n## Vulkan Instance and Device \n\nA Vulkan Instance and a Device must be created to be able to create and allocate memory buffers on a physical device. \n\nIn the example (main.cpp), the Vulkan Instance is created by calling `createInstance()`. To create the Vulkan Device, we do not need a \nsurface since we will not draw anything using Vulkan. 
We create it using `createDevice()` and pick the first device (GPU) \non the computer.\n\n\n\n## Vulkan Extensions\n\nBefore we can start allocating Vulkan buffers and using semaphores, Vulkan needs \nto have extensions enabled so that objects can be exported.\n\nInstance extensions are set with `requireExtensions`:\n- **VK_KHR_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME**\n- **VK_KHR_EXTERNAL_SEMAPHORE_CAPABILITIES_EXTENSION_NAME**\n\nFor the creation of the Device, extensions are set with `requireDeviceExtensions`:\n- **VK_KHR_EXTERNAL_MEMORY_EXTENSION_NAME**\n- **VK_KHR_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME**\n- **VK_KHR_EXTERNAL_SEMAPHORE_EXTENSION_NAME**\n- **VK_KHR_EXTERNAL_SEMAPHORE_WIN32_EXTENSION_NAME**\n\n## OpenGL\nFor OpenGL we are using OpenGL 4.5 and need the extensions [EXT_external_objects](https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects.txt) \nand [GL_EXT_semaphore](https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects.txt) \n\nHere are the extra functions we are using:\n* `glCreateMemoryObjectsEXT`\n* `glImportMemoryWin32HandleEXT`\n* `glNamedBufferStorageMemEXT`\n* `glTextureStorageMem2DEXT`\n* `glSignalSemaphoreEXT`\n* `glWaitSemaphoreEXT`\n\n\n# Vulkan Allocation\n\nWhen allocating a Vulkan buffer, it is required to use the [ExportMemoryAllocation](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/man/html/VK_KHR_external_memory.html) extension.\n\nIn this example, we are using a simple Vulkan memory allocator. This allocator performs dedicated allocations, i.e. one memory allocation per buffer. 
\nThis is not the recommended approach: it would be better to allocate larger memory blocks and bind buffers to sub-ranges of them, but it is fine for the purpose of this example.\n\nFor this sample we use the exporting Vulkan memory allocator (`ExportResourceAllocatorDedicated`) so that all memory allocations are exportable.\nSee `nvpro-samples\\nvpro_core\\nvvkpp\\resourceallocator_vk.hpp`.\n\nSince we want to flag this memory to be exported, we have to set `pNext` as seen below:\n~~~C++\nVkExternalMemoryImageCreateInfo extMemInfo{VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO};\nextMemInfo.handleTypes = getDefaultMemHandleType();\nimageCreateInfo.pNext  = \u0026extMemInfo;  // \u003c-- Enabling Export\nnvvk::Image     image  = m_alloc.createImage(imageCreateInfo);\n~~~\n\nOnce this is done, the device memory object has an exportable handle type.\n\n\n**!!! note**\n    This must be done for all memory objects that need to be visible to both Vulkan and OpenGL/CUDA.\n\n**!!! warn Best Memory Usage Practice**\n    We have used a very simplistic approach. For better usage of memory, see this [blog](https://developer.nvidia.com/vulkan-memory-management).\n\n\n# CUDA \n\nTo import objects allocated by Vulkan, we are using the [External Resource Interoperability](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__EXTRES__INTEROP.html) API of CUDA.\nWe first have to import the semaphore to CUDA by retrieving its handle and filling in the `cudaExternalSemaphoreHandleDesc` as can be seen below.\n\n```C++\n    cudaExternalSemaphoreHandleDesc externalSemaphoreHandleDesc = {};\n    externalSemaphoreHandleDesc.type = cudaExternalSemaphoreHandleTypeOpaqueWin32;\n    externalSemaphoreHandleDesc.flags = 0;\n    externalSemaphoreHandleDesc.handle.win32.handle = vk_semaphore_handle;\n    cudaImportExternalSemaphore(\u0026cuda_semaphore, \u0026externalSemaphoreHandleDesc);\n```\n\nOn a per-frame basis we have to wait for a semaphore and signal that processing is finished using 
`cudaWaitExternalSemaphoresAsync` and `cudaSignalExternalSemaphoresAsync`.\nFurthermore, we create a CUDA surface so that we can write to the graphics memory using `surf2Dwrite`. \n\n# Handle and Memory Object retrieval\n\nTo retrieve the memory object for OpenGL or CUDA, we must get the memory `HANDLE`. \nSee file: `gl_vkpp.hpp`\n\nNote: the Vulkan buffer structure was extended to hold the OpenGL information.\n\n~~~~C++\n// #VKGL Extra for Interop\nstruct BufferVkGL : public Buffer\n{\n  HANDLE handle       = nullptr;  // The Win32 handle\n  GLuint memoryObject = 0;        // OpenGL memory object\n  GLuint oglId        = 0;        // OpenGL object ID\n};\n~~~~\n\n\n~~~~ C++\n  // #VKGL:  Get the shared Win32 handle between Vulkan and other APIs\n  bufGl.handle = device.getMemoryWin32HandleKHR(\n\t\t\t\t\t{bufGl.bufVk.allocation, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueWin32});\n~~~~\n\nWith the `HANDLE` we can retrieve the equivalent OpenGL or CUDA memory object.\n\n~~~~ C++\n  // Get the OpenGL Memory object\n  glCreateMemoryObjectsEXT(1, \u0026bufGl.memoryObject);\n  auto req     = device.getBufferMemoryRequirements(bufGl.bufVk.buffer);\n  glImportMemoryWin32HandleEXT(bufGl.memoryObject, req.size, GL_HANDLE_TYPE_OPAQUE_WIN32_EXT, bufGl.handle);\n~~~~\n\n\n~~~~ C++\n  // Get the CUDA Memory object\n  nvvk::Image    image  = m_alloc.createImage(imageCreateInfo);\n  auto mem_info = m_alloc.getMemoryAllocator()-\u003egetMemoryInfo(image.memHandle);\n  // Export the Win32 handle of the allocation; the handle type must match the one given to CUDA below\n  // (mem_info.memory is assumed to be the VkDeviceMemory of the allocation)\n  HANDLE mem_handle = device.getMemoryWin32HandleKHR(\n      {mem_info.memory, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueWin32});\n  cudaExternalMemoryHandleDesc cudaExtMemHandleDesc = {};\n  cudaExtMemHandleDesc.type = cudaExternalMemoryHandleTypeOpaqueWin32;\n  cudaExtMemHandleDesc.handle.win32.handle = mem_handle;\n  cudaExtMemHandleDesc.size = mem_info.size;\n  cudaExternalMemory_t cudaImageMemory; \n  cudaImportExternalMemory(\u0026cudaImageMemory, \u0026cudaExtMemHandleDesc);\n~~~~\n\n\n# OpenGL Memory Binding\n\nTo use the retrieved OpenGL memory object, 
you must create the buffer and then _link it_ using the \n[External Memory Object](https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects.txt) extension.\n\nIn Vulkan we bind memory to our resources. In OpenGL we can create new resources from a range within imported memory, \nor we can attach existing resources to use that memory via [NV_memory_attachment](https://www.khronos.org/registry/OpenGL/extensions/NV/NV_memory_attachment.txt).\n\n~~~~C++\n  glCreateBuffers(1, \u0026bufGl.oglId);\n  glNamedBufferStorageMemEXT(bufGl.oglId, req.size, bufGl.memoryObject, 0);\n~~~~\n\nAt this point, `m_bufferVk` shares the data that was allocated in Vulkan.\n\n\n\n# OpenGL Images\n\nFor images, everything is done the same way as for buffers. The memory \nallocation must be told that the object will be exported, so the allocation \nalso adds the `memoryHandleEx` structure to `memAllocInfo.pNext`.\n\nIn this example, the image is written by CUDA. That image\nis made available to OpenGL in the function `createTextureGL`. 
\n\nThe handle for the texture is retrieved with: \n~~~~C++\n  // Retrieving the memory handle\n  texGl.handle = device.getMemoryWin32HandleKHR({texGl.texVk.allocation, vk::ExternalMemoryHandleTypeFlagBits::eOpaqueWin32}, d);\n~~~~ \n\nThe memory backing the image is imported just like that of a buffer:\n~~~~ C++\n  // Create a 'memory object' in OpenGL, and associate it with the memory allocated in Vulkan\n  glCreateMemoryObjectsEXT(1, \u0026texGl.memoryObject);\n  auto req = device.getImageMemoryRequirements(texGl.texVk.image);\n  glImportMemoryWin32HandleEXT(texGl.memoryObject, req.size, GL_HANDLE_TYPE_OPAQUE_WIN32_EXT, texGl.handle);\n~~~~ \n\nFinally, the texture is created using the memory object:\n\n~~~~C++\n  glCreateTextures(GL_TEXTURE_2D, 1, \u0026texGl.oglId);\n  glTextureStorageMem2DEXT(texGl.oglId, texGl.mipLevels, format, texGl.imgSize.width, texGl.imgSize.height, texGl.memoryObject, 0);\n~~~~\n\n\n\n# Semaphores\n\nAs we are writing an image through CUDA and displaying it with OpenGL,\nit is necessary to synchronize the two environments. Semaphores are created in\nVulkan and used to wait for OpenGL or CUDA to finish.\n\n~~~~ batch\n                                                           \n  +------------+                             +------------+\n  | GL Context | signal               wait   | GL Context |\n  +------------+     |                  ^    +------------+\n                     v  +-------------+   |                  \n                   wait |CUDA Context | signal               \n                        +-------------+                      \n~~~~\n\n**!!! 
note**\nTo achieve correct layout transitions from VK to GL and back, we need to specify the corresponding layout.\nFor `WaitSemaphoreEXT` we have to specify the GL layout matching the last VK layout that was used (see [table 4.4](https://github.com/KhronosGroup/OpenGL-Registry/blob/5bae8738b23d06968e7c3a41308568120943ae77/extensions/EXT/EXT_external_objects.txt#L472)).\nThat way GL can take care of transitioning to the layout required by the following GL calls.\nConversely, with `SignalSemaphoreEXT` we provide the layout that we want to have in VK. \n\nThose semaphores are created in Vulkan and, as before, the OpenGL versions are retrieved in a way similar to the CUDA import shown above.\n\n~~~~ C++\nstruct Semaphores\n{\n  vk::Semaphore vkReady;\n  vk::Semaphore vkComplete;\n  GLuint        glReady;\n  GLuint        glComplete;\n} m_semaphores;\n~~~~\n\nThis handle type tells Vulkan that the semaphore is going to be exported.\n~~~~C++\nauto handleType = vk::ExternalSemaphoreHandleTypeFlagBits::eOpaqueWin32;\n~~~~ \n\nThe semaphores must be created with this export information: 
\n~~~~C++\nvk::ExportSemaphoreCreateInfo esci{ handleType };\nvk::SemaphoreCreateInfo       sci;\nsci.pNext = \u0026esci;\nm_semaphores.vkReady = m_device.createSemaphore (sci);\nm_semaphores.vkComplete = m_device.createSemaphore (sci);\n~~~~ \n\nThe conversion to OpenGL will be done the following way:\n~~~~C++\n// Import semaphores\nHANDLE hglReady = m_device.getSemaphoreWin32HandleKHR({ m_semaphores.vkReady, handleType }, \n                                                       m_dynamicDispatch);\nHANDLE hglComplete = m_device.getSemaphoreWin32HandleKHR({ m_semaphores.vkComplete, handleType }, \n                                                         m_dynamicDispatch);\nglGenSemaphoresEXT (1, \u0026m_semaphores.glReady);\nglGenSemaphoresEXT (1, \u0026m_semaphores.glComplete);\nglImportSemaphoreWin32HandleEXT (m_semaphores.glReady, \n                                 GL_HANDLE_TYPE_OPAQUE_WIN32_EXT, hglReady);\nglImportSemaphoreWin32HandleEXT (m_semaphores.glComplete, \n                                 GL_HANDLE_TYPE_OPAQUE_WIN32_EXT, hglComplete);\n~~~~\n\n\n\n# Animation\n\nSince the Vulkan memory for the vertex buffer was allocated using\nthe flags: \n\n`VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT`\n\n`vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent`\n\nWe can easily update the buffer doing the following:\n\n~~~~C++\ng_vertexDataVK[0].pos.x = sin(t);\ng_vertexDataVK[1].pos.y = cos(t);\ng_vertexDataVK[2].pos.x = -sin(t);\nmemcpy(m_vkBuffer.mapped, g_vertexDataVK.data(), g_vertexDataVK.size() * sizeof(Vertex));\n~~~~\n \nNote we use a host-visible buffer for the sake of simplicity, at the expense of efficiency. For best performance the geometry\nwould need to be uploaded to device-local memory through a staging buffer.\n\n\n## License\n\nThis project uses the Apache 2.0 license. 
Please see the copyright notice in the [LICENSE](LICENSE) file.\n\nThis project also uses the NVIDIA nvpro-samples framework. Please see the license for nvpro-samples' shared_sources [here](https://github.com/nvpro-samples/shared_sources/blob/master/LICENSE.md), and the third-party packages it uses in shared_external [here](https://github.com/nvpro-samples/shared_external/blob/master/README.md).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fgl_cuda_simple_interop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnvpro-samples%2Fgl_cuda_simple_interop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fgl_cuda_simple_interop/lists"}