https://github.com/sandeepnmenon/gpu_datastructures

CUDA implementation of Datastructures
# GPU-Accelerated Data Structures

# Hashmap
## Description
This project is a custom GPU hashmap data structure with insert and find functionality, written in C++ and CUDA. It was developed as part of the "GPU Programming" course at New York University.
[Notion link to resources](https://sandeepmenon.notion.site/HashMaps-ec799e90c9894fd4aca45c06a1ed6f8e?pvs=4)

## Usage
The usage of the code is demonstrated in `driver.cu` and `kernels.cu`.
`basic_hashmap.cu` contains the hashmap implementation. The definition of the hashmap is as follows:
```cpp
template <typename Key, typename Value>
class Hashmap
{
public:
    Hashmap(size_t capacity);
    ~Hashmap();

    __device__ bool insert(const Key k, const Value v);

    __device__ bool insert(cg::thread_block_tile<CG_SIZE> group, const Key k, const Value v);

    __device__ Value find(const Key k) const;

    __device__ void find(cg::thread_block_tile<CG_SIZE> group, const Key k, Value *out) const;

    Bucket *buckets;
    size_t capacity{};
};
```
Example usage
```cpp
#include "hashmap_gpu.cu" // Include the hashmap header file

Hashmap<int, int> *hashmap; // Integer-to-integer hashmap
cudaMallocManaged(&hashmap, sizeof(Hashmap<int, int>));
new (hashmap) Hashmap<int, int>(capacity);

// Initialize the hashmap. This is done by default in the constructor. You can call it again if you want to reinitialize the hashmap.
hashmap->initialize();

// Using this hashmap in a kernel
__global__ void testIntInsert(const int *keys, const int *values, const size_t numElements, Hashmap<int, int> *hashmap)
{
int idx = threadIdx.x + blockIdx.x * blockDim.x;
if (idx < numElements)
{
hashmap->insert(keys[idx], values[idx]);
}
}

// Using this hashmap in a kernel with cooperative groups
__global__ void testIntInsertCG(const int *keys, const int *values, const size_t numElements, Hashmap<int, int> *hashmap)
{
int idx = (threadIdx.x + blockIdx.x * blockDim.x) / CG_SIZE;
if (idx < numElements)
{
auto group = cg::tiled_partition<CG_SIZE>(cg::this_thread_block());
hashmap->insert(group, keys[idx], values[idx]);
}
}
```
The `find` function can be used in a similar way by calling `hashmap->find(key)` or `hashmap->find(group, key, &out)`.
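For illustration, lookup kernels mirroring the insert kernels above might look like this (the kernel names and the `results` output array are hypothetical, not part of the project's code):

```cpp
// Hypothetical lookup kernel: one thread per key.
__global__ void testIntFind(const int *keys, int *results, const size_t numElements, Hashmap<int, int> *hashmap)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < numElements)
    {
        results[idx] = hashmap->find(keys[idx]);
    }
}

// Hypothetical cooperative-groups variant: one tile of CG_SIZE threads per key.
__global__ void testIntFindCG(const int *keys, int *results, const size_t numElements, Hashmap<int, int> *hashmap)
{
    int idx = (threadIdx.x + blockIdx.x * blockDim.x) / CG_SIZE;
    if (idx < numElements)
    {
        auto group = cg::tiled_partition<CG_SIZE>(cg::this_thread_block());
        hashmap->find(group, keys[idx], &results[idx]);
    }
}
```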

## Running Experiments
**Note: the code runs best on the cuda2 cluster; other clusters may give errors.**
1. Build using the `make` command.
2. Run the code using this template
```bash
./driver -n <num_elements> -l <load_factor> -t <threads_per_block> -b <num_blocks> -[iIsS]
```
* `-n` sets the number of elements to insert into the hashmap.
* `-l` sets the load factor of the hashmap.
* `-t` sets the number of threads per block.
* `-b` sets the number of blocks.
* `-i` runs the default insert.
* `-I` runs insert with cooperative groups.
* `-s` runs the default search.
* `-S` runs search with cooperative groups.

For benchmarking, run `./benchmark.sh`; the results are written to `benchmark_results.csv`.