https://kundajelab.github.io/dragonn/
A toolkit to learn how to model and interpret regulatory sequence data using deep learning.
https://kundajelab.github.io/dragonn/
deep-learning genomics
Last synced: 5 months ago
JSON representation
A toolkit to learn how to model and interpret regulatory sequence data using deep learning.
- Host: GitHub
- URL: https://kundajelab.github.io/dragonn/
- Owner: kundajelab
- License: mit
- Created: 2016-06-02T00:03:18.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2023-08-01T06:04:46.000Z (over 1 year ago)
- Last Synced: 2024-10-29T00:29:45.822Z (6 months ago)
- Topics: deep-learning, genomics
- Language: Jupyter Notebook
- Homepage: http://kundajelab.github.io/dragonn/
- Size: 165 MB
- Stars: 254
- Watchers: 26
- Forks: 71
- Open Issues: 17
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.txt
- License: LICENSE
Awesome Lists containing this project
- awesome-dl4g - DragoNN - [TensorFlow] - Predictive modeling of regulatory genomics, nucleotide-resolution feature discovery, and simulations for systematic development and benchmarking. (2016) (Software packages / DL4G Packages)
README
# [DragoNN](http://kundajelab.github.io/dragonn/)
[](https://travis-ci.org/kundajelab/dragonn)
[](https://github.com/kundajelab/dragonn/blob/master/LICENSE)The `dragonn` package implements Deep RegulAtory GenOmic Neural Networks (DragoNNs) for predictive modeling of regulatory genomics, nucleotide-resolution feature discovery, and simulations for systematic development and benchmarking.

## Installation
To install the latest released version of DragoNN, install the [Anaconda](https://www.continuum.io/downloads) python distribution. Then, run:
```
conda install dragonn -c kundajelab
```
DragoNN is compatible with Python2 and Python3. Specific optional features such as [DeepLIFT](https://github.com/kundajelab/deeplift) and [MOE](https://github.com/Yelp/MOE) are compatible with Python2 only.## 15 seconds to your first DragoNN model
The `dragonn` package provides a simple command line interface to train DragoNN models, test them, and predict on sequence data. Train an example model by running:```
dragonn train --pos-sequences examples/example_pos_sequences.fa --neg-sequences examples/example_neg_sequences.fa --prefix training_example
```This will store a model file, training_example.model.json, with the model architecture and a weights file, training_example.weights.h5, with the parameters of the trained model. Test the model by running:
```
dragonn test --pos-sequences examples/example_pos_sequences.fa --neg-sequences examples/example_neg_sequences.fa --arch-file training_example.arch.json --weights-file training_example.weights.h5
```This will print the model's test performance metrics. Model predictions on sequence data can be obtained by running:
```
dragonn predict --sequences examples/example_pos_sequences.fa --arch-file training_example.arch.json --weights-file training_example.weights.h5 --output-file example_predictions.txt
```This will store the model predictions for sequences in example_pos_sequences.fa in the output file example_predictions.txt. Interpret sequence data with a dragonn model by running:
```
dragonn interpret --sequences examples/example_pos_sequences.fa --arch-file training_example.arch.json --weights-file training_example.weights.h5 --prefix example_interpretation
```This will write the most important subsequence in each input sequence along with its location in the input sequence in the file example_interpretation.task_0.important_sequences.txt.
Note: by default, only examples with predicted positive class probability >0.5 are interpreted. Examples below this threshold yield important subsequence of Ns with location -1. This default can be changed with the flag --pos-threshold.We encourage DragoNN users to share models in the [Kipoi Genomics Model Zoo](http://kipoi.org/). Enjoy!
## DragoNN paper supplement
We provide trained models, data, and code in the [paper supplement](paper_supplement) to reproduce results in the DragoNN manuscript.