# Implementation of Context Binning and Model Clustering for Compression of Genetic Data

This is my master's thesis, written as part of the computer science course at Jagiellonian University.

## Abstract

In recent years, the speed of DNA sequencing methods has taken a gigantic
leap, allowing us to quickly sequence the DNA of complex organisms such as
humans. However, this leads to increasing demand for disk storage, as the
sizes of the databases containing such data can easily reach dozens of
terabytes. In his article "Context binning, model clustering and adaptivity
for data compression of genetic data", Jarek Duda proposes promising compression
techniques that should help build a compressor better than the current state of
the art. This thesis describes the compressor built to evaluate those
techniques, tests it with real-world data and compares it to other genetic data
compression tools.

## Download

The PDF file can be downloaded from the
[GitHub Releases page](https://github.com/m4tx/masters-thesis/releases/download/final/Implementation_of_Context_Binning_and_Model_Clustering_for_Compression_of_Genetic_Data.pdf).

## Building

Make sure you have Inkscape and a LaTeX distribution installed on your
system, then run:

```bash
make
```
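
If the build fails, a quick check that the prerequisites are on your `PATH` may help. The snippet below is only a sketch: the command names `inkscape`, `latex`, and `make` are assumptions about how the tools are exposed on your system, and the Makefile may call a different LaTeX engine. The helper `missing_tools` is hypothetical and not part of this repository.

```bash
# Print the name of each given command that is not found on PATH.
# (Hypothetical helper, not part of this repository.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed command names; adjust them to match your installation.
missing=$(missing_tools inkscape latex make)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All prerequisites found."
fi
```

An empty result only means the commands exist; it does not verify that the required LaTeX packages are installed.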

## License

This work is licensed under a
[Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).