https://github.com/im-mou/gpu-kmer-counter
Implementation of the algorithm for counting k-mers in a genetic sequence using GPUs.
- Host: GitHub
- URL: https://github.com/im-mou/gpu-kmer-counter
- Owner: im-mou
- Created: 2021-06-28T21:08:19.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-07-05T16:10:22.000Z (almost 4 years ago)
- Last Synced: 2025-01-19T20:58:14.595Z (4 months ago)
- Topics: bioinformatics, gpu, k-mer-counting, lock-free-hashtable
- Language: C
- Homepage:
- Size: 21.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
#### im-mou/gpu-kmer-counter
# Speeding up the algorithm to count K-mers in a genetic sequence using GPUs.

## Abstract
K-mer counting builds a histogram of every substring of length k (a k-mer) that occurs in an input string S. From an algorithmic point of view this looks like a very simple task, but with recent advances in sequencing technology, sequencing machines now generate large amounts of data in a very short time, which turns even the simple task of building a histogram into a challenge. In recent years the performance of k-mer counting algorithms has improved significantly, and there has been much interest in using graphics processing units (GPUs) for the task. The purpose of this research is to analyze different algorithms for counting k-mer occurrences in a sequence under different k-mer settings, and then to optimize and speed up one of those algorithms using GPUs.

## Source repository
[Source code repository: lh3/kmer-cnt](https://github.com/lh3/kmer-cnt)
This repository contains all the source code of the different implementations that were used and experimented with for this research.

## Instructions to use and test the implementations
```sh
git clone https://github.com/im-mou/gpu-kmer-counter
cd gpu-kmer-counter
make
```

### Download and parse the dataset for different implementations
```sh
wget https://github.com/lh3/kmer-cnt/releases/download/v0.1/M_abscessus_HiSeq_10M.fa.gz
./parse-data ./M_abscessus_HiSeq_10M.fa.gz
```

## Execute implementations
By default, all of these scripts run the code with a k-mer length of 32. To experiment with a different k, edit the corresponding slurm file and uncomment the line with the desired k-length.
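For context on the default k of 32: when each base is packed into 2 bits, 32 bases exactly fill one 64-bit word. The README does not say how the implementations encode k-mers, so the following is only a sketch under that assumption; `base_code` and `pack_kmer` are hypothetical names, not functions from this repository.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a nucleotide to a 2-bit code,
   or return -1 for anything else (for example 'N' or the string terminator). */
static int base_code(char c)
{
    switch (c) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:            return -1;
    }
}

/* Pack the first k bases of seq into *out, 2 bits per base,
   with the first base ending up in the highest-order bits used.
   Returns 1 on success, 0 if a non-ACGT character is met
   (which also covers sequences shorter than k). */
static int pack_kmer(const char *seq, int k, uint64_t *out)
{
    assert(k >= 1 && k <= 32);   /* 32 bases * 2 bits = 64 bits */
    uint64_t x = 0;
    for (int i = 0; i < k; ++i) {
        int c = base_code(seq[i]);
        if (c < 0)
            return 0;
        x = (x << 2) | (uint64_t)c;
    }
    *out = x;
    return 1;
}
```

With this encoding a k-mer becomes a plain integer key, so comparing or hashing two k-mers costs a single 64-bit operation instead of a string comparison.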
### Sequential: kc-c1-fast.c
```sh
sbatch ./slurm_scripts/slurm-kc-c1-fast.sub
```

### Parallel: cuda-fast.c - Best implementation
```sh
sbatch ./slurm_scripts/slurm-cuda-fast.sub
```

### Parallel: cuda-dumb.c - Pretty dumb. Non-atomic, for experimental purposes only.
```sh
sbatch ./slurm_scripts/slurm-cuda-dumb.sub
```

## Scripts
The two final implementations, one sequential and one parallel, that work properly and produce correct output are the following:
- kc-c1-fast.c
- cuda-fast.cu
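
The repository's `lock-free-hashtable` topic and the "non-atomic" note on `cuda-dumb.c` suggest that correct parallel counting depends on atomic updates. The sketch below illustrates that idea on the CPU with C11 atomics. It is not the repository's code: the table layout, the sizes, and the names `kmer_table`, `table_add`, and `table_get` are assumptions made for illustration, and keys are assumed to be k-mers already packed into 64-bit integers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_BITS 10                    /* illustrative size: 1024 slots */
#define TABLE_SIZE (1u << TABLE_BITS)
#define EMPTY_KEY  UINT64_MAX            /* sentinel; a key equal to it cannot be stored */

typedef struct {
    _Atomic uint64_t key[TABLE_SIZE];
    _Atomic uint32_t count[TABLE_SIZE];
} kmer_table;

static void table_init(kmer_table *t)
{
    for (uint32_t i = 0; i < TABLE_SIZE; ++i) {
        atomic_init(&t->key[i], EMPTY_KEY);
        atomic_init(&t->count[i], 0);
    }
}

/* Multiplicative hash: keep the top TABLE_BITS bits of the product. */
static uint32_t slot_of(uint64_t key)
{
    return (uint32_t)((key * 0x9E3779B97F4A7C15ull) >> (64 - TABLE_BITS));
}

/* Count one occurrence of key. Returns 1 on success, 0 if the table is full. */
static int table_add(kmer_table *t, uint64_t key)
{
    assert(key != EMPTY_KEY);
    uint32_t i = slot_of(key);
    for (uint32_t probes = 0; probes < TABLE_SIZE; ++probes) {
        uint64_t seen = EMPTY_KEY;
        /* Try to claim an empty slot. On failure, seen holds the key
           that already owns the slot. */
        if (atomic_compare_exchange_strong(&t->key[i], &seen, key) || seen == key) {
            atomic_fetch_add(&t->count[i], 1);   /* no increment is lost */
            return 1;
        }
        i = (i + 1) & (TABLE_SIZE - 1);          /* linear probing */
    }
    return 0;
}

/* Look up the count stored for key (0 if absent). */
static uint32_t table_get(kmer_table *t, uint64_t key)
{
    uint32_t i = slot_of(key);
    for (uint32_t probes = 0; probes < TABLE_SIZE; ++probes) {
        uint64_t seen = atomic_load(&t->key[i]);
        if (seen == key)
            return atomic_load(&t->count[i]);
        if (seen == EMPTY_KEY)
            return 0;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}
```

A non-atomic version would read a counter, add one, and write it back, so two threads hitting the same k-mer at once could both write the same value and lose a count. On a GPU the same pattern would typically be written with CUDA's `atomicCAS` and `atomicAdd`.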