# ngsderive



Forensic analysis tool for backwards computing (inferring) information from next-generation sequencing data and for annotating splice junctions.




> Notice: `ngsderive` is largely a forensic analysis tool for backwards computing information
> from next-generation sequencing data. Most results are provided as a 'best guess':
> the tool does not claim 100% accuracy, and results should be interpreted with that in mind.
> The exception is the `junction-annotation` tool, which analyzes more concrete evidence than the other tools.

## 🎨 Features

The following attributes can be guessed using ngsderive:

* Illumina Instrument. Infer which Illumina instrument was used to generate the data by matching against known instrument and flowcell naming patterns. Each guess comes with a confidence score.
* RNA-Seq Strandedness. Infer from the data whether RNA-Seq data was generated using a Stranded-Forward, Stranded-Reverse, or Unstranded protocol.
* Pre-trimmed Read Length. Compute the distribution of read lengths in the file and attempt to guess what the original read length of the experiment was.
* PHRED Score Encoding. Infer which encoding scheme was used to store PHRED scores as ASCII characters.
* Junction Annotation. Annotate splice junctions as novel, partial novel, or known in comparison to a reference gene model.
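
To give a feel for why these results are best guesses, the sketch below illustrates two of the ideas in plain Python: guessing the PHRED encoding from the lowest quality character seen, and guessing the pre-trimmed read length from the distribution of observed lengths. It is not ngsderive's implementation. The function names, the returned labels, and the `majority_cutoff` parameter (with its 0.7 default) are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, Optional


def guess_phred_encoding(quality_strings: Iterable[str]) -> str:
    """Guess the PHRED offset from the lowest ASCII code observed.

    Hypothetical helper. A Phred+33 file that happens to contain only
    high-quality bases would be misread as Phred+64, which is why the
    answer is only a best guess.
    """
    lowest = min(
        (ord(ch) for qual in quality_strings for ch in qual),
        default=None,
    )
    if lowest is None:
        return "unknown"  # no quality characters were provided
    if lowest < 59:
        return "Phred+33"  # characters below ';' only occur with offset 33
    if lowest < 64:
        return "Solexa+64"  # Solexa scores can dip below '@'
    return "Phred+64"


def guess_original_read_length(
    read_lengths: Iterable[int], majority_cutoff: float = 0.7
) -> Optional[int]:
    """Return the longest observed length if enough reads share it.

    Hypothetical helper; the cutoff value is an illustrative assumption.
    Returns None when the evidence is too weak to make a guess.
    """
    counts = Counter(read_lengths)
    if not counts:
        return None
    longest = max(counts)
    share = counts[longest] / sum(counts.values())
    return longest if share >= majority_cutoff else None
```

For example, `guess_original_read_length([100] * 8 + [75, 60])` finds that 80% of reads have the maximum length and reports 100, while an even split between two lengths yields `None` rather than a low-confidence answer.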

## 📚 Getting Started

### Installation

You can install ngsderive using the Python Package Index ([PyPI](https://pypi.org/)).

```bash
pip install ngsderive
```

## 🖥️ Development

If you are interested in contributing to the code, please first review our [CONTRIBUTING.md][contributing-md] document.

To bootstrap a development environment, please use the following commands.

```bash
# Clone the repository
git clone git@github.com:stjudecloud/ngsderive.git
cd ngsderive

# Install the project using poetry
poetry install
```

## 🚧️ Tests

ngsderive provides a (currently patchy) set of unit and end-to-end tests.

```bash
py.test
```

## 🤝 Contributing

Contributions, issues, and feature requests are welcome!
Feel free to check the [issues page](https://github.com/stjudecloud/ngsderive/issues). You can also take a look at the [contributing guide][contributing-md].

## 📝 License

This project is licensed as follows:

* All code related to the `instrument` subcommand is licensed under the [AGPL v2.0][agpl-v2]. This is not due to any strict requirement. Rather, out of deference to some [code][10x-inspiration] I drew inspiration from (and copied patterns from), the decision was made to license this code consistently with it.
* The rest of the project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details.

Copyright © 2020 [St. Jude Cloud Team](https://github.com/stjudecloud).

[10x-inspiration]: https://github.com/10XGenomics/supernova/blob/master/tenkit/lib/python/tenkit/illumina_instrument.py
[agpl-v2]: http://www.affero.org/agpl2.html
[contributing-md]: https://github.com/stjudecloud/ngsderive/blob/master/CONTRIBUTING.md
[license-md]: https://github.com/stjudecloud/ngsderive/blob/master/LICENSE.md