https://github.com/alexyzha/cuda-bioinformatics
A CUDA-Accelerated Bioinformatics Toolchain
- Host: GitHub
- URL: https://github.com/alexyzha/cuda-bioinformatics
- Owner: alexyzha
- Created: 2025-03-06T22:33:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-08T05:37:03.000Z (about 1 year ago)
- Last Synced: 2025-04-08T06:27:45.979Z (about 1 year ago)
- Topics: bioinformatics, bioinformatics-tool, cplusplus, cuda
- Language: C++
- Homepage: https://github.com/alexyzha/CUDA-Bioinformatics/wiki
- Size: 219 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# About This Project
All documentation can be found [here](https://github.com/alexyzha/CUDA-Bioinformatics/wiki).
## Why?
I decided to write a bioinformatics toolchain because, even though I am a Quantitative/Computational Biology major, we don't get to work with much code in class. What I've done with code so far has mostly been statistics/data related, and I have a lot more fun coding from the ground up. I also wanted more experience coding in `CUDA`, and the algorithms used in bioinformatics are (1) massively parallel and (2) somewhat familiar to me. Finally, I wanted more experience setting up things like unit/integration tests and CI/CD pipelines.
## What's in it?
The code in this repository parses, formats/packages, and analyzes `.fasta`, `.fastq`, and `.sam` files. The regular `C++` code supports all three formats, while the `CUDA` code only supports operations on `.fastq` data. All of the `CUDA` code is wrapped in `__host__` functions, so you don't have to process or format the data yourself to use the `CUDA` kernels I wrote.
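To give a rough idea of the kind of `.fastq` work described above, here is a minimal sketch in plain `C++`. It is not the code from this repository: `FastqRecord`, `parse_fastq`, `parse_fastq_string`, and `mean_quality` are hypothetical names made up for this example, and it assumes four-line records with Phred+33 quality strings. See the wiki for the real API.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the repository's own types may differ.
struct FastqRecord {
    std::string id;        // header line without the leading '@'
    std::string sequence;  // base calls
    std::string quality;   // Phred+33 encoded quality string
};

// Reads four-line FASTQ records from `in` and stops at the first
// malformed record. Assumes sequences are not wrapped across lines.
std::vector<FastqRecord> parse_fastq(std::istream& in) {
    std::vector<FastqRecord> records;
    std::string header, seq, plus, qual;
    while (std::getline(in, header) && std::getline(in, seq) &&
           std::getline(in, plus) && std::getline(in, qual)) {
        const bool well_formed = !header.empty() && header[0] == '@' &&
                                 !plus.empty() && plus[0] == '+' &&
                                 seq.size() == qual.size();
        if (!well_formed) break;
        records.push_back({header.substr(1), seq, qual});
    }
    return records;
}

// Convenience overload for parsing FASTQ text already held in memory.
std::vector<FastqRecord> parse_fastq_string(const std::string& text) {
    std::istringstream in(text);
    return parse_fastq(in);
}

// Mean Phred score of one read, assuming Phred+33 encoding.
double mean_quality(const FastqRecord& rec) {
    if (rec.quality.empty()) return 0.0;
    long total = 0;
    for (char c : rec.quality) {
        assert(c >= '!' && "quality characters must be Phred+33");
        total += c - 33;  // '!' (ASCII 33) encodes a score of 0
    }
    return static_cast<double>(total) / rec.quality.size();
}
```

Each read's mean quality depends only on that read. That independence is what makes this kind of computation a good fit for a GPU, where one thread or block could handle one read.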
## Compiling
- Use CMake to compile any projects.
- An example template can be found in the root directory (`./CMakeLists.txt`). It compiles all `CUDA` code with all `CUDA` tests.
- To build, from the root directory:
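```bash
# Create a separate build directory and enter it
mkdir build && cd build
# Generate the build files from ./CMakeLists.txt, then compile
cmake ..
make
# Run the resulting executable
./[EXEC_NAME]
```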
- To compile only the pure `C++` part of this project, you can use a plain Makefile instead. A template can be found at `./tests/cpu/Makefile`. It compiles all `C++` code with all CPU tests (`GTEST`).
## Development Environment
- Docker base image: `cuda:12.1.1-devel-ubuntu22.04`. On my Mac, I use `ubuntu:latest` instead, since it doesn't have an NVIDIA GPU.
- Docker packages: `GTEST`, `sra-toolkit`, `valgrind`, `gdb`, `cmake`, `build-essential`, and a few less important ones (see `./Dockerfile`).
- On my desktop, I use `nvidia-ctk` to run `CUDA` code inside a Docker container on an `RTX 4060`.