https://github.com/dizys/nyu-gpu-final-project

NYU CV Final Project: Comparing CUDA to OpenMP on GPUs

Host: GitHub
URL: https://github.com/dizys/nyu-gpu-final-project
Owner: dizys
License: mit
Created: 2022-11-17T20:13:35.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2022-12-07T21:28:06.000Z (about 2 years ago)
Last Synced: 2024-12-20T20:03:10.607Z (21 days ago)
Language: C++
Homepage:
Size: 137 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# nyu-gpu-final-project

Comparing CUDA to OpenMP on GPUs with K-Means, BFS, Floyd-Warshall, and N-Queens algorithms.

## Getting Started

### Set up CIMS GPU machines

We experimented on `cuda3.cims.nyu.edu`.

1. Load the CUDA module:

```bash
module load cuda-10.1
```

2. Download GCC with OpenMP GPU-offloading support:

```bash
mkdir -p /tmp/$(whoami)
wget -O /tmp/$(whoami)/gcc.zip https://github.com/nyu-multicore/cims-gpu/releases/download/gcc/gcc-11.3.1_cims_gpu_offload_22112501.zip && unzip /tmp/$(whoami)/gcc.zip -d /tmp/$(whoami) && rm -f /tmp/$(whoami)/gcc.zip
```

> **Alternatively, for step 2, you can build GCC from source with OpenMP GPU-offload support:**
>
> ```bash
> wget -O build_gcc.sh https://gist.githubusercontent.com/dizys/8dedbe94439b91d759b6c1e6e316d542/raw/3ddbd8def8cc5bc7ce42549317820df16daf9e96/build_gcc_with_offload.sh && sh build_gcc.sh && rm -f build_gcc.sh
> ```
>
> Building GCC 11 from source takes roughly 30 minutes. GCC is installed temporarily to `/tmp//gcc`.

### Compile the programs

Before compiling, add the `lib64` directory of the GCC installation to the `LD_LIBRARY_PATH` environment variable:

```bash
export MY_GCC_PATH=/tmp/$(whoami)/gcc
export LD_LIBRARY_PATH=$MY_GCC_PATH/lib64:${LD_LIBRARY_PATH}
```

Then, build the project:

```bash
make GPP_BIN=$MY_GCC_PATH/bin/g++
```
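
If the build fails with a missing compiler or shared-library error, a quick check of the environment can narrow it down. The following is a minimal sketch, assuming GCC was unpacked to the `MY_GCC_PATH` directory exported above; the function name `check_gcc_env` is hypothetical and not part of the repository:

```bash
# Hypothetical helper (not part of the repository): checks that the
# offload-enabled g++ exists and that its lib64 directory is listed
# in LD_LIBRARY_PATH.
check_gcc_env() {
  gcc_path="$1"
  if [ ! -x "$gcc_path/bin/g++" ]; then
    echo "missing compiler: $gcc_path/bin/g++" >&2
    return 1
  fi
  # Wrap the variable in colons so the first and last entries match too.
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$gcc_path/lib64:"*) ;;
    *)
      echo "LD_LIBRARY_PATH does not contain $gcc_path/lib64" >&2
      return 1
      ;;
  esac
  echo "ok: $gcc_path"
}
```

For example, `check_gcc_env "$MY_GCC_PATH"` prints `ok: ...` when both conditions hold and returns a non-zero status otherwise.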

### Download datasets

```bash
mkdir -p /tmp/$(whoami)/data

# KMeans datasets
wget -O /tmp/$(whoami)/data/kmeans_10000.txt https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data/dataset-10000.txt
wget -O /tmp/$(whoami)/data/kmeans_100000.txt https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data/dataset-100000.txt
wget -O /tmp/$(whoami)/data/kmeans_1000000.txt https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data/dataset-1000000.txt
wget -O /tmp/$(whoami)/data/kmeans_5000000.txt https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data/dataset-5000000.txt
wget -O /tmp/$(whoami)/data/kmeans_10000000.txt https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data/dataset-10000000.txt

# BFS datasets
wget -O /tmp/$(whoami)/data/graph_g1000_s100.txt https://github.com/nyu-multicore/cims-gpu/releases/download/bfs-data/graph_g1000_s100.txt
wget -O /tmp/$(whoami)/data/graph_g2000_s100.txt https://github.com/nyu-multicore/cims-gpu/releases/download/bfs-data/graph_g2000_s100.txt
wget -O /tmp/$(whoami)/data/graph_g4000_s100.txt https://github.com/nyu-multicore/cims-gpu/releases/download/bfs-data/graph_g4000_s100.txt
wget -O /tmp/$(whoami)/data/graph_g8000_s100.txt https://github.com/nyu-multicore/cims-gpu/releases/download/bfs-data/graph_g8000_s100.txt
wget -O /tmp/$(whoami)/data/graph_g16000_s100.txt https://github.com/nyu-multicore/cims-gpu/releases/download/bfs-data/graph_g16000_s100.txt

# Floyd-Warshall datasets
cd bin && ./generategraph && cd .. # generate dataset .txt

# N-Queens: N-Queens programs don't need any extra dataset files to run
```
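
The K-Means downloads above differ only in the dataset size, so they can also be expressed as a loop. This is a sketch under the assumption that the URLs keep the pattern shown above; the helper names `kmeans_downloads` and `fetch_all` are hypothetical:

```bash
# Hypothetical helper: prints "<destination> <url>" for each K-Means
# dataset size listed in the download commands above.
kmeans_downloads() {
  data_dir="$1"
  base="https://media.githubusercontent.com/media/nyu-multicore/k-means/main/data"
  for n in 10000 100000 1000000 5000000 10000000; do
    echo "$data_dir/kmeans_$n.txt $base/dataset-$n.txt"
  done
}

# Hypothetical helper: reads "<destination> <url>" pairs from stdin and
# downloads each one, skipping files that already exist and are non-empty.
fetch_all() {
  while read -r dest url; do
    [ -s "$dest" ] || wget -O "$dest" "$url"
  done
}
```

Running `kmeans_downloads /tmp/$(whoami)/data | fetch_all` would then fetch only the files that are still missing, which helps when a large download was interrupted.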

## Run the programs

```bash
cd bin

# KMeans
./kmeans_seq /tmp/$(whoami)/data/kmeans_.txt # run the sequential version
./kmeans_cuda /tmp/$(whoami)/data/kmeans_.txt # run the CUDA version
./kmeans_openmp /tmp/$(whoami)/data/kmeans_.txt # run the OpenMP version

# BFS
./bfs_seq /tmp/$(whoami)/data/graph_g_s.txt # run the sequential version
./bfs_cuda /tmp/$(whoami)/data/graph_g_s.txt # run the CUDA version
./bfs_openmp /tmp/$(whoami)/data/graph_g_s.txt # run the OpenMP version

# Floyd-Warshall
./warshall_seq .txt # run the sequential version
./warshall_cuda .txt # run the CUDA version
./warshall_openmp .txt # run the OpenMP version

# N-Queens
./nqueens_seq # run the sequential version
./nqueens_cuda # run the CUDA version
./nqueens_openmp # run the OpenMP version
```
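
To compare the three builds of one algorithm on the same input, the commands above can be wrapped in a small loop. This is only a sketch: the helper name `run_variants` is hypothetical, and it assumes the binaries follow the `<algorithm>_seq`, `<algorithm>_cuda`, `<algorithm>_openmp` naming shown above and that the current directory is `bin`:

```bash
# Hypothetical helper: runs the sequential, CUDA, and OpenMP builds of one
# algorithm with the same arguments, repeating each build a fixed number
# of times. Stops at the first run that fails.
run_variants() {
  algo="$1"
  runs="$2"
  shift 2
  for variant in seq cuda openmp; do
    i=1
    while [ "$i" -le "$runs" ]; do
      echo "== ${algo}_${variant} run $i =="
      "./${algo}_${variant}" "$@" || return 1
      i=$((i + 1))
    done
  done
}
```

For example, `run_variants kmeans 5 /tmp/$(whoami)/data/kmeans_10000.txt` runs each K-Means build five times on the smallest downloaded dataset, and `run_variants nqueens 5` does the same for N-Queens, which takes no dataset file.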

## Experiment Results

We experimented on `cuda3.cims.nyu.edu`.

To measure running time, every experiment setting was run 5 times; average the 5 runs to get the final result.

Result data files:

| Filename | Description |
| ------------------------------------ | ----------------------------------------------------------------------------------------------------- |
| [exp_data.csv](exp_data.csv) | Raw time cost experiment data with 5 iterations |
| [exp_data_avg.csv](exp_data_avg.csv) | Averaged time cost experiment data, generated by [exp_data_crunching.ipynb](exp_data_crunching.ipynb) |
| [exp_mem_data.csv](exp_mem_data.csv) | Max GPU memory usage data |
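
The averaging itself is simple arithmetic: sum the 5 timings of one setting and divide by 5. The sketch below shows that step with `awk`. It is not the notebook's implementation, and the CSV layout is an assumption: a header row followed by `program,input,seconds` columns, which may not match the real `exp_data.csv`. The helper name `average_times` is hypothetical.

```bash
# Hypothetical helper: averages the third column per (program, input) pair.
# Assumes a header row and columns "program,input,seconds"; the real
# exp_data.csv may be laid out differently.
average_times() {
  awk -F, '
    NR > 1 { key = $1 "," $2; sum[key] += $3; n[key]++ }
    END    { for (k in sum) printf "%s,%.3f\n", k, sum[k] / n[k] }
  ' "$1" | sort
}
```

Called as `average_times some_timings.csv`, it prints one `program,input,mean` line per setting, sorted by name.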

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.