Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/stotko/stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU
https://github.com/stotko/stdgpu

cpp cpp17 cpp20 cuda data-structures gpgpu gpu gpu-acceleration gpu-computing hip modern-cpp openmp rocm stl stl-containers stl-like

Last synced: about 2 months ago
JSON representation

stdgpu: Efficient STL-like Data Structures on the GPU

Lists

README

        



stdgpu: Efficient STL-like Data Structures on the GPU






















Features |
Examples |
Getting Started |
Contributing |
License |
Contact

## Features

stdgpu is an open-source library providing **generic GPU data structures** for fast and reliable data management.

- Lightweight C++17 library with minimal dependencies
- **CUDA**, **OpenMP**, and **HIP (experimental)** backends
- Familiar STL-like GPU containers
- High-level, *agnostic* container functions like `insert(begin, end)`, to write shared C++ code
- Low-level, *native* container functions like `find(key)`, to write custom CUDA kernels, etc.
- Interoperability with [thrust](https://github.com/NVIDIA/thrust) GPU algorithms

Instead of providing yet another ecosystem, stdgpu is designed to be a *lightweight container library*. Previous libraries such as thrust, VexCL, ArrayFire or Boost.Compute focus on the fast and efficient implementation of various algorithms and only operate on contiguously stored data. stdgpu follows an *orthogonal approach* and focuses on *fast and reliable data management* to enable the rapid development of more general and flexible GPU algorithms just like their CPU counterparts.

At its heart, stdgpu offers the following GPU data structures and containers:

atomic & atomic_ref
Atomic primitive types and references
bitset
Space-efficient bit array
deque
Dynamically sized double-ended queue

queue & stack
Container adapters
unordered_map & unordered_set
Hashed collection of unique keys and key-value pairs
vector
Dynamically sized contiguous array

In addition, stdgpu also provides further commonly used helper functionality in [`algorithm`](https://stotko.github.io/stdgpu/doxygen/group__algorithm.html), [`bit`](https://stotko.github.io/stdgpu/doxygen/group__bit.html), [`contract`](https://stotko.github.io/stdgpu/doxygen/group__contract.html), [`cstddef`](https://stotko.github.io/stdgpu/doxygen/group__cstddef.html), [`execution`](https://stotko.github.io/stdgpu/doxygen/group__execution.html), [`functional`](https://stotko.github.io/stdgpu/doxygen/group__functional.html), [`iterator`](https://stotko.github.io/stdgpu/doxygen/group__iterator.html), [`limits`](https://stotko.github.io/stdgpu/doxygen/group__limits.html), [`memory`](https://stotko.github.io/stdgpu/doxygen/group__memory.html), [`mutex`](https://stotko.github.io/stdgpu/doxygen/group__mutex.html), [`numeric`](https://stotko.github.io/stdgpu/doxygen/group__numeric.html), [`ranges`](https://stotko.github.io/stdgpu/doxygen/group__ranges.html), [`type_traits`](https://stotko.github.io/stdgpu/doxygen/group__type__traits.html), [`utility`](https://stotko.github.io/stdgpu/doxygen/group__utility.html).

## Examples

In order to reliably perform complex tasks on the GPU, stdgpu offers flexible interfaces that can be used in both **agnostic code**, e.g. via the algorithms provided by thrust, as well as in **native code**, e.g. in custom CUDA kernels.

For instance, stdgpu is extensively used in [SLAMCast](https://www.researchgate.net/publication/331303359_SLAMCast_Large-Scale_Real-Time_3D_Reconstruction_and_Streaming_for_Immersive_Multi-Client_Live_Telepresence), a scalable live telepresence system, to implement real-time, large-scale 3D scene reconstruction as well as real-time 3D data streaming between a server and an arbitrary number of remote clients.

**Agnostic code**. In the context of [SLAMCast](https://www.researchgate.net/publication/331303359_SLAMCast_Large-Scale_Real-Time_3D_Reconstruction_and_Streaming_for_Immersive_Multi-Client_Live_Telepresence), a simple task is the integration of a range of updated blocks into the duplicate-free set of queued blocks for data streaming which can be expressed very conveniently:

```cpp
#include // stdgpu::index_t
#include // stdgpu::make_device
#include // stdgpu::unordered_set

class stream_set
{
public:
void
add_blocks(const short3* blocks,
const stdgpu::index_t n)
{
set.insert(stdgpu::make_device(blocks),
stdgpu::make_device(blocks + n));
}

// Further functions

private:
stdgpu::unordered_set set;
// Further members
};
```

**Native code**. More complex operations such as the creation of the duplicate-free set of updated blocks or other algorithms can be implemented natively, e.g. in custom CUDA kernels with stdgpu's CUDA backend enabled:

```cpp
#include // stdgpu::index_t
#include // stdgpu::unordered_map
#include // stdgpu::unordered_set

__global__ void
compute_update_set(const short3* blocks,
const stdgpu::index_t n,
const stdgpu::unordered_map tsdf_block_map,
stdgpu::unordered_set mc_update_set)
{
// Global thread index
stdgpu::index_t i = blockIdx.x * blockDim.x + threadIdx.x;
if (i >= n) return;

short3 b_i = blocks[i];

// Neighboring candidate blocks for the update
short3 mc_blocks[8]
= {
short3(b_i.x - 0, b_i.y - 0, b_i.z - 0),
short3(b_i.x - 1, b_i.y - 0, b_i.z - 0),
short3(b_i.x - 0, b_i.y - 1, b_i.z - 0),
short3(b_i.x - 0, b_i.y - 0, b_i.z - 1),
short3(b_i.x - 1, b_i.y - 1, b_i.z - 0),
short3(b_i.x - 1, b_i.y - 0, b_i.z - 1),
short3(b_i.x - 0, b_i.y - 1, b_i.z - 1),
short3(b_i.x - 1, b_i.y - 1, b_i.z - 1),
};

for (stdgpu::index_t j = 0; j < 8; ++j)
{
// Only consider existing neighbors
if (tsdf_block_map.contains(mc_blocks[j]))
{
mc_update_set.insert(mc_blocks[j]);
}
}
}
```

More examples can be found in the [`examples`](https://github.com/stotko/stdgpu/tree/master/examples) directory.

## Getting Started

stdgpu requires a **C++17 compiler** as well as minimal backend dependencies and can be easily built and integrated into your project via **CMake**:

- [Building From Source](https://stotko.github.io/stdgpu/getting_started/building_from_source.html)
- [Integrating Into Your Project](https://stotko.github.io/stdgpu/getting_started/integrating_into_your_project.html)

More guidelines as well as a comprehensive introduction into the design and API of stdgpu can be found in the [documentation](https://stotko.github.io/stdgpu).

## Contributing

For detailed information on how to contribute, see the [Contributing](https://stotko.github.io/stdgpu/development/contributing.html) section in the documentation.

## License

Distributed under the Apache 2.0 License. See [`LICENSE`](https://github.com/stotko/stdgpu/blob/master/LICENSE) for more information.

If you use stdgpu in one of your projects, please cite the following publications:

[**stdgpu: Efficient STL-like Data Structures on the GPU**](https://www.researchgate.net/publication/335233070_stdgpu_Efficient_STL-like_Data_Structures_on_the_GPU)

```bib
@UNPUBLISHED{stotko2019stdgpu,
author = {Stotko, P.},
title = {{stdgpu: Efficient STL-like Data Structures on the GPU}},
year = {2019},
month = aug,
note = {arXiv:1908.05936},
url = {https://arxiv.org/abs/1908.05936}
}
```

[**SLAMCast: Large-Scale, Real-Time 3D Reconstruction and Streaming for Immersive Multi-Client Live Telepresence**](https://www.researchgate.net/publication/331303359_SLAMCast_Large-Scale_Real-Time_3D_Reconstruction_and_Streaming_for_Immersive_Multi-Client_Live_Telepresence)

```bib
@article{stotko2019slamcast,
author = {Stotko, P. and Krumpen, S. and Hullin, M. B. and Weinmann, M. and Klein, R.},
title = {{SLAMCast: Large-Scale, Real-Time 3D Reconstruction and Streaming for Immersive Multi-Client Live Telepresence}},
journal = {IEEE Transactions on Visualization and Computer Graphics},
volume = {25},
number = {5},
pages = {2102--2112},
year = {2019},
month = may
}
```

## Contact

Patrick Stotko - [[email protected]](mailto:[email protected])