# lamp-virus

[![Build Status](https://travis-ci.org/pseudogene/lamp-virus.svg?branch=master)](https://travis-ci.org/pseudogene/lamp-virus)

We foster the openness, integrity, and reproducibility of scientific research.

Scripts and tools used to develop one-step real-time RT-LAMP assays.

## Associated publications

> **Development of four one-step real-time RT-LAMP assays for specific detection of dengue virus serotypes**.
> Lopez-Jimena B, Bekaert M, Bakheit M, Frischmann S, Patel P, Sakuntabhai A, Lambrechts L, Fall C, Faye O, Sall A and Weidmann M.
> _PLOS Negl. Trop. Dis._ 12(7): e0006381

[![DOI](https://img.shields.io/badge/DOI-10.1371%2Fjournal.pntd.0006381-blue.svg)](https://doi.org/10.1371/journal.pntd.0006381)

> **Development of a single-tube one-step RT-LAMP assay to detect the Chikungunya virus genome**.
> Lopez-Jimena B, Wehner S, Harold G, Bakheit M, Frischmann S, Bekaert M, Faye O, Sall A and Weidmann M.
> _PLOS Negl. Trop. Dis._ 12(7): e0006448

[![DOI](https://img.shields.io/badge/DOI-10.1371%2Fjournal.pntd.0006448-blue.svg)](https://doi.org/10.1371/journal.pntd.0006448)

## How to use this repository?

This repository hosts both the scripts and the tools developed for this study. Feel free to adapt them, but remember to cite their authors!

To look at our scripts and raw results, **browse** through this repository. If you want to reproduce our results, you will need to **clone** this repository, build the docker, and then run all the scripts. If you want to use our data for your own research, **fork** this repository and **cite** the authors.

## Prepare a docker

All required files and tools run in a self-contained [docker](https://www.docker.com/) image.

#### Clone the repository

```
git clone https://github.com/pseudogene/lamp-virus.git
cd lamp-virus
```

#### Create a docker

```
docker build --rm=true --file=Dockerfile -t lamp-virus .
```

#### Start the docker

To import and export the results of your analysis, you need to link a folder to the docker. In this example, your data will be stored in `~/results` (on the host filesystem), which will be seen as `/virus` from within the docker by using `-v ~/results:/virus`.

```
mkdir ~/results
docker run -i -t --rm -v ~/results:/virus lamp-virus /bin/bash
```

#### Run a new analysis

**1** - Collect NCBI genomes, automatically align them with [GramAlign v3.0](http://bioinfo.unl.edu/gramalign.php) and run [R/adegenet](http://adegenet.r-forge.r-project.org/) on the alignment to generate a PCA and a phylogeny.

```
collect_genomevirus.pl -p zika -q "txid64320[Organism:exp]" -a 2004 -b 2016
```

where:

`-p` is the filename prefix for all output files.

`-q` is the NCBI entrez query string. (e.g. Zika virus: "txid64320[Organism:exp]")

`-a` (after) is the lower limit for the year.

`-b` (before) is the upper limit for the year.
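As a minimal sketch, the options above can be wrapped in a small shell function that checks the year range before anything is sent to NCBI. The `collect_cmd` helper is hypothetical and not part of this repository; it only prints the command so that you can inspect it before running it.

```sh
# Hypothetical helper (not part of lamp-virus): print the
# collect_genomevirus.pl command for a given prefix, query and year range.
# Fails if a year is not four digits or if the range is reversed.
collect_cmd() {
    prefix=$1 query=$2 after=$3 before=$4
    for year in "$after" "$before"; do
        case $year in
            [0-9][0-9][0-9][0-9]) ;;
            *) echo "invalid year: $year" >&2; return 1 ;;
        esac
    done
    if [ "$after" -gt "$before" ]; then
        echo "lower limit $after is after upper limit $before" >&2
        return 1
    fi
    printf 'collect_genomevirus.pl -p %s -q "%s" -a %s -b %s\n' \
        "$prefix" "$query" "$after" "$before"
}

# Reproduces the Zika command shown above.
collect_cmd zika 'txid64320[Organism:exp]' 2004 2016
```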

**2** - In `/virus` (from within the docker) or `results` (from outside), you now have all the result files, including the PCA, the phylogeny and a TSV file.

Edit the TSV (tab-separated values) file so that it holds the genotype/group numbers you manually identified. The second column holds the calculated groups (used with the standard parameter sets); the third column holds the more granular subgrouping (used with the `--alt` parameter).

e.g.:

```
sequence_1 1 1
sequence_2 1 2
sequence_3 2 3
```
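Because this file is edited by hand, a quick sanity check before step 3 may save a failed run. The sketch below is a hypothetical helper, not part of this repository. It assumes three tab-separated columns with integer group numbers, and it tolerates an optional header line; both points are assumptions about the file layout.

```sh
# Hypothetical helper (not part of lamp-virus): validate a groups TSV and
# report how many sequences fall in each group of the second column.
# Assumes three tab-separated columns: sequence ID, group, alternative group.
check_groups() {
    awk -F '\t' '
        NF != 3 { printf "line %d: expected 3 columns, found %d\n", NR, NF; bad = 1; next }
        NR == 1 && $2 !~ /^[0-9]+$/ { next }   # assumed optional header line
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: group numbers must be integers\n", NR; bad = 1; next
        }
        { count[$2]++; if ($2 + 0 > max) max = $2 + 0 }
        END {
            if (bad) exit 1
            # Walk the group numbers in ascending order for a stable report.
            for (g = 0; g <= max; g++)
                if (g in count) printf "group %s: %d sequence(s)\n", g, count[g]
        }
    ' "$1"
}
```

Calling `check_groups zika.groups.tsv` prints one line per group, or the offending line numbers with a non-zero exit status.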

**3** - Retrieve the subgroups/genotypes from the TSV file and run [LAVA-DNA](https://github.com/dylanstorey/lava-dna) on each genotype and each combination.

```
class_sequences.pl -p zika -a zika.align.fa -g zika.groups.tsv -c > zika.log
# or, using the edited groups file and the loose parameter set:
class_sequences.pl -p zika -a zika.align.fa -g zika.groups.edited.tsv -c -l > zika.loose.log
```

where:

`-p` is the filename prefix for all output files.

`-a` is the alignment generated by GramAlign (fasta-align format).

`-g` is the list of subgroups/genotypes (TSV file).

`-c` tests combinations of genotypes, not only each genotype on its own.

`-l` uses the `loose` parameter set of [LAVA-DNA](https://github.com/dylanstorey/lava-dna) rather than the `standard` one.
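To get a feel for how much work `-c` can add, the sketch below lists every non-empty combination of a set of group numbers. It is a hypothetical illustration, not the logic of `class_sequences.pl`: whether the script tests all subsets, and the `_`-joined labels (modelled on file names such as `zika.2_5.primers`), are assumptions.

```sh
# Hypothetical helper (not part of lamp-virus): list every non-empty
# combination of the given group IDs, joined with "_".
group_combinations() {
    n=$#
    total=$(( (1 << n) - 1 ))   # number of non-empty subsets
    mask=1
    while [ "$mask" -le "$total" ]; do
        label=
        bit=1
        for id in "$@"; do
            # Include this group when its bit is set in the current mask.
            if [ $(( mask & bit )) -ne 0 ]; then
                label=${label:+${label}_}$id
            fi
            bit=$(( bit << 1 ))
        done
        echo "$label"
        mask=$(( mask + 1 ))
    done
}

# Three groups give 2^3 - 1 = 7 combinations.
group_combinations 1 2 3
```

With n groups there are 2^n - 1 non-empty combinations, so an exhaustive search grows quickly with the number of genotypes.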

**4** - Then evaluate the best minimal set of primers and map each primer set.

```
map_lamp.pl -svg -a zika.align.fa \
-p zika.1.primers \
-p zika.2_5.primers \
-p zika.4.primers \
> zika.svg
```

where:

`-svg` or `-png` will generate an SVG or a PNG image, respectively.

`-a` is the alignment generated by GramAlign (fasta-align format).

`-p` are lists of primer sets generated by `class_sequences.pl`.
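When many primer files have been generated, typing each `-p` by hand gets tedious. As a rough sketch, the hypothetical `map_cmd` helper below (not part of this repository) prints a `map_lamp.pl` command covering every `<prefix>.*.primers` file in the current directory. It assumes file names without spaces and is meant for a first overview; you would still trim the list to the minimal set you settle on.

```sh
# Hypothetical helper (not part of lamp-virus): print a map_lamp.pl command
# that maps every <prefix>.*.primers file found in the current directory.
map_cmd() {
    prefix=$1
    cmd="map_lamp.pl --svg -a $prefix.align.fa"
    found=0
    for f in "$prefix".*.primers; do
        [ -e "$f" ] || continue   # the glob matched nothing
        cmd="$cmd -p $f"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no primer files for prefix $prefix" >&2
        return 1
    fi
    echo "$cmd > $prefix.svg"
}
```

Run `map_cmd zika` inside `/virus` to print the command, check it, then paste it into the shell.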

## Parameters

![Additional description of a LAMP signature](https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-12-240/MediaObjects/12859_2010_Article_4632_Fig1_HTML.jpg)

Additional description of a LAMP signature. Each named pair refers to a sequence location corresponding to the primer regions of like-numbered primers. These pairs represent the location and orientation of the primers with respect to the target template during each extension in which they participate.

#### Super strict ("Stricter")
Parameter | Default Target
--- | ---
Outer primer length | 18-22 bp
Middle primer length | 18-22 bp
Loop primer length | 18-22 bp
Inner primer length | 18-22 bp
Outer primer Tm | 58-62°C
Middle primer Tm | 58-62°C
Loop primer Tm | 60-64°C
Inner primer Tm | 63-67°C
Maximum signature length | 180 bp
Minimum spacing from middle to inner primers | 40 bp
Maximum consecutive repeated bases | 4

#### Strict
Parameter | Default Target
--- | ---
Outer primer length | 15-25 bp
Middle primer length | 15-25 bp
Loop primer length | 15-25 bp
Inner primer length | 15-25 bp
Outer primer Tm | 53-67°C
Middle primer Tm | 53-67°C
Loop primer Tm | 55-69°C
Inner primer Tm | 58-72°C
Maximum signature length | 200 bp
Minimum spacing from middle to inner primers | 20 bp
Maximum consecutive repeated bases | 4

#### Standard
Default values of the most commonly adjusted LAVA parameters ([Torres et al, 2011](http://dx.doi.org/10.1186/1471-2105-12-240))

Parameter | Default Target
--- | ---
Outer primer length | 18-23 bp
Middle primer length | 18-23 bp
Loop primer length | 18-23 bp
Inner primer length | 20-26 bp
Outer primer Tm | 59-61°C
Middle primer Tm | 59-61°C
Loop primer Tm | 58-62°C
Inner primer Tm | 62-66°C
Maximum signature length | 320 bp
Minimum spacing from middle to inner primers | 25 bp
Maximum consecutive repeated bases | 5

#### Loose
Parameter | Default Target
--- | ---
Outer primer length | 17-24 bp
Middle primer length | 17-24 bp
Loop primer length | 17-24 bp
Inner primer length | 18-28 bp
Outer primer Tm | 58-62°C
Middle primer Tm | 58-62°C
Loop primer Tm | 57-63°C
Inner primer Tm | 61-66°C
Maximum signature length | 400 bp
Minimum spacing from middle to inner primers | 20 bp
Maximum consecutive repeated bases | 5

#### Very Loose ("Looser")

Parameter | Default Target
--- | ---
Outer primer length | 16-25 bp
Middle primer length | 16-25 bp
Loop primer length | 16-25 bp
Inner primer length | 16-29 bp
Outer primer Tm | 57-63°C
Middle primer Tm | 57-63°C
Loop primer Tm | 56-64°C
Inner primer Tm | 60-67°C
Maximum signature length | 500 bp
Minimum spacing from middle to inner primers | 18 bp
Maximum consecutive repeated bases | 6
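Two of the constraints in these tables, primer length and maximum consecutive repeated bases, are easy to check by hand. The sketch below uses hypothetical helpers that are not part of this repository or of LAVA. It reads "consecutive repeated bases" as the longest run of a single base, which is an assumption, and it does not check Tm, since that depends on the thermodynamic model used by LAVA. The example sequence is made up.

```sh
# Hypothetical helper: print the longest run of one repeated base in a primer.
longest_run() {
    printf '%s\n' "$1" | awk '{
        seq = toupper($0); best = 0; run = 0; prev = ""
        for (i = 1; i <= length(seq); i++) {
            base = substr(seq, i, 1)
            run = (base == prev) ? run + 1 : 1
            if (run > best) best = run
            prev = base
        }
        print best
    }'
}

# Hypothetical helper: succeed if a primer fits a length range and a repeat limit.
# Arguments: sequence, minimum length, maximum length, maximum repeated bases.
primer_ok() {
    seq=$1 min_len=$2 max_len=$3 max_repeat=$4
    len=${#seq}
    [ "$len" -ge "$min_len" ] && [ "$len" -le "$max_len" ] &&
        [ "$(longest_run "$seq")" -le "$max_repeat" ]
}

# A made-up 20 bp outer primer containing a run of five T bases.
primer_ok ACGTTTTTGACCATGGCATC 18 23 5 && echo "fits Standard (18-23 bp, max 5 repeats)"
primer_ok ACGTTTTTGACCATGGCATC 18 22 4 || echo "rejected by Stricter (18-22 bp, max 4 repeats)"
```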

## Scripts
#### collect_genomevirus.pl

```
Usage: collect_genomevirus.pl --prefix --query [..]
Description: Collect all complete genomes using the provided ENTREZ query and align them.

--prefix
Filename output prefix. [mandatory]
--query
NCBI entrez query. [mandatory]
--after
minimum date to retrieve from (YYYY).
--before
maximum date to retrieve from (YYYY).
--norun
Disable the automatic run of R and the PCA and cluster analysis
--verbose
Becomes very chatty.
```

#### class_sequences.pl

```
Usage: class_sequences.pl --prefix --align --groups [..]
Description: Generate RT-LAMP primer sets for each group of genomes provided.

--prefix
Filename output prefix. [mandatory]
--groups
Path to a TSV file with the genome groups (e.g. output.group.tsv). [mandatory]
sequence_id grouping alt_grouping
sequence_1 1 1
sequence_2 1 2
sequence_3 2 3

--align
Alignment generated by collect_genomevirus.pl. [mandatory]
--realign
Force each group or group combination to be realigned prior to running LAVA.
--combinatory
Force to test group combinations (recommended).
--extra
Provide LAVA XML parameter file.
--strict
Force usage of strict LAMP parameters (see documentation).
--strict --strict
Force usage of even stricter LAMP parameters (see documentation).
--loose
Force usage of loose LAMP parameters (see documentation).
--loose --loose
Force usage of even looser LAMP parameters (see documentation).
--alt
Force usage of the alternative grouping column from the group file.
--verbose
Becomes very chatty.
```

#### map_lamp.pl

```
Usage: map_lamp.pl --align --primer [--primer ...] [--svg|--png]
Description: Generate a simple visualisation of the location of the RT-LAMP primer on the alignment.

--align
Alignment generated by collect_genomevirus.pl. [mandatory]
--primer
Path to the primer file generated by class_sequences.pl. Can be used multiple times for multiple files.
--svg
Produce a vectorial figure (SVG format).
--png
Provide a bitmap figure (PNG format).
--verbose
Becomes very chatty.
```

## Issues

If you have any problems with or questions about the scripts, please contact us through a [GitHub issue](https://github.com/pseudogene/lamp-virus/issues).
Any issue related to the scientific results themselves must be raised directly with the authors.

## Contributing

You are invited to contribute new features, fixes, or updates, large or small; we are always thrilled to receive pull requests, and do our best to process them as fast as we can.

## License and distribution

This code is distributed under the [GNU GPL license v3](http://www.gnu.org/licenses/gpl-3.0.html). The documentation, raw data and work are licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).