https://github.com/sandeepnmenon/gpu_datastructures

CUDA implementation of Datastructures
# GPU-Accelerated Data Structures

# Hashmap
## Description
This project is a custom GPU hashmap data structure with insert and find functionality, written in C++ and CUDA. It was developed as part of the "GPU Programming" course at New York University.
[Notion link to resources](https://sandeepmenon.notion.site/HashMaps-ec799e90c9894fd4aca45c06a1ed6f8e?pvs=4)

## Usage
The usage of the code is demonstrated in `driver.cu` and `kernels.cu`.
`basic_hashmap.cu` contains the hashmap implementation. The definition of the hashmap is as follows:
```cpp
template <typename Key, typename Value>
class Hashmap
{
public:
    Hashmap(size_t capacity);
    ~Hashmap();

    __device__ bool insert(const Key k, const Value v);

    __device__ bool insert(cg::thread_block_tile<CG_SIZE> group, const Key k, const Value v);

    __device__ Value find(const Key k) const;

    __device__ void find(cg::thread_block_tile<CG_SIZE> group, const Key k, Value *out) const;

    Bucket *buckets;
    size_t capacity{};
};
```
Example usage
```cpp
#include "hashmap_gpu.cu" // Include the hashmap header file

Hashmap<int, int> *hashmap; // Integer-to-integer hashmap
cudaMallocManaged(&hashmap, sizeof(Hashmap<int, int>));
new (hashmap) Hashmap<int, int>(capacity);

// Initialize the hashmap. This is done by default in the constructor. You can call it again if you want to reinitialize the hashmap.
hashmap->initialize();

// Using this hashmap in a kernel
__global__ void testIntInsert(const int *keys, const int *values, const size_t numElements, Hashmap<int, int> *hashmap)
{
int idx = threadIdx.x + blockIdx.x * blockDim.x;
if (idx < numElements)
{
hashmap->insert(keys[idx], values[idx]);
}
}

// Using this hashmap in a kernel with cooperative groups
__global__ void testIntInsertCG(const int *keys, const int *values, const size_t numElements, Hashmap<int, int> *hashmap)
{
int idx = (threadIdx.x + blockIdx.x * blockDim.x) / CG_SIZE;
if (idx < numElements)
{
auto group = cg::tiled_partition<CG_SIZE>(cg::this_thread_block());
hashmap->insert(group, keys[idx], values[idx]);
}
}
```
The `find` function can be used in a similar way by calling `hashmap->find(key)` or `hashmap->find(group, key, &out)`.
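For illustration, lookup kernels mirroring the insert kernels above might look like this (the kernel names and the `results` output array are hypothetical, not part of the project's code):

```cpp
// Hypothetical lookup kernel: one thread per key.
__global__ void testIntFind(const int *keys, int *results, const size_t numElements, Hashmap<int, int> *hashmap)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < numElements)
    {
        results[idx] = hashmap->find(keys[idx]);
    }
}

// Hypothetical cooperative-groups variant: one tile of CG_SIZE threads per key.
__global__ void testIntFindCG(const int *keys, int *results, const size_t numElements, Hashmap<int, int> *hashmap)
{
    int idx = (threadIdx.x + blockIdx.x * blockDim.x) / CG_SIZE;
    if (idx < numElements)
    {
        auto group = cg::tiled_partition<CG_SIZE>(cg::this_thread_block());
        hashmap->find(group, keys[idx], &results[idx]);
    }
}
```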

## Running Experiments
**Note: the code runs best on the cuda2 cluster; other clusters may give errors.**
1. Build using the `make` command.
2. Run the code using this template
```bash
./driver -n <num_elements> -l <load_factor> -t <threads_per_block> -b <num_blocks> -[iIsS]
```
* `-n` sets the number of elements to insert into the hashmap.
* `-l` sets the load factor of the hashmap.
* `-t` sets the number of threads per block.
* `-b` sets the number of blocks.
* `-i` runs the default insert.
* `-I` runs insert with cooperative groups.
* `-s` runs the default search.
* `-S` runs search with cooperative groups.

For benchmarking, run `./benchmark.sh`; the results are written to `benchmark_results.csv`.