https://github.com/xmas7/cudampi
A large hybrid CPU/GPU sorting network using CUDA and MPI. The sorting network uses a standard Quicksort for CPUs and a custom Bitonic Sort for GPUs. These two algorithms were the fastest in a number of prior benchmarks.
- Host: GitHub
- URL: https://github.com/xmas7/cudampi
- Owner: xmas7
- License: lgpl-3.0
- Created: 2022-09-02T03:29:58.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2022-09-02T03:30:31.000Z (over 2 years ago)
- Last Synced: 2025-02-01T09:28:03.711Z (4 months ago)
- Topics: cpu, cuda, gpu, hybrid, mpi, network
- Language: Shell
- Homepage:
- Size: 1.17 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
CUDAMPI
=======

A large hybrid CPU/GPU sorting network using CUDA and MPI.
The sorting network uses a standard Quicksort for CPUs and a custom
Bitonic Sort for GPUs. These two algorithms were the fastest
in a number of prior benchmarks.

We execute the first step of a bucketsort algorithm to presort the data.
A bucket only contains numbers in a given range.
We put each number into its corresponding bucket. This can be done in parallel.
Now each bucket can be sorted on either a CPU or a GPU.

The sorting network uses the filesystem as a process management solution.
Therefore no explicit locks are required.
MPI-IO is used to write to the filesystem in parallel.
The result is a number of sorted files.
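
The presort step described above can be sketched in plain C. This is a minimal, single-process illustration and not the code from this repository: the function names (`bucket_of`, `bucket_partition`, `sort_buckets`), the use of 32-bit unsigned keys, and the equal-width bucket ranges are all assumptions made for the example. The C library's `qsort` stands in for the CPU sorter; it is not guaranteed to be a Quicksort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Map a key to a bucket index in [0, nbuckets).
   Assumption: each bucket covers an equal slice of the 32-bit key range.
   The mapping is monotonic, so bucket b only holds keys that are
   less than or equal to every key in bucket b + 1. */
size_t bucket_of(uint32_t key, size_t nbuckets)
{
    /* 64-bit arithmetic: floor(key * nbuckets / 2^32). */
    return (size_t)(((uint64_t)key * (uint64_t)nbuckets) >> 32);
}

/* Copy in[0..n) into out[0..n), grouped by bucket.
   On return, bucket b occupies out[offsets[b] .. offsets[b + 1]).
   offsets must have room for nbuckets + 1 entries.
   Returns 0 on success, -1 if memory allocation fails. */
int bucket_partition(const uint32_t *in, uint32_t *out, size_t n,
                     size_t nbuckets, size_t *offsets)
{
    assert(nbuckets > 0 && nbuckets <= UINT32_MAX);

    /* Pass 1: count the keys per bucket, stored one slot to the right. */
    for (size_t b = 0; b <= nbuckets; b++)
        offsets[b] = 0;
    for (size_t i = 0; i < n; i++)
        offsets[bucket_of(in[i], nbuckets) + 1]++;

    /* Prefix sum turns the counts into start offsets. */
    for (size_t b = 0; b < nbuckets; b++)
        offsets[b + 1] += offsets[b];

    /* Pass 2: scatter each key to the next free slot of its bucket. */
    size_t *cursor = malloc(nbuckets * sizeof *cursor);
    if (cursor == NULL)
        return -1;
    for (size_t b = 0; b < nbuckets; b++)
        cursor[b] = offsets[b];
    for (size_t i = 0; i < n; i++)
        out[cursor[bucket_of(in[i], nbuckets)]++] = in[i];

    free(cursor);
    return 0;
}

/* Three-way comparison of two uint32_t values for qsort. */
int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sort every bucket independently. Because the bucket ranges are
   disjoint and ordered, the whole array is sorted afterwards. */
void sort_buckets(uint32_t *data, size_t nbuckets, const size_t *offsets)
{
    for (size_t b = 0; b < nbuckets; b++)
        qsort(data + offsets[b], offsets[b + 1] - offsets[b],
              sizeof *data, cmp_u32);
}
```

Calling `bucket_partition` and then `sort_buckets` on an array leaves it fully sorted, since the buckets never overlap. In a hybrid setup, each iteration of the loop in `sort_buckets` is an independent unit of work that could instead be handed to a CPU rank or a GPU.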