https://github.com/simonhmartin/asynt

Genome alignment and synteny plots

Host: GitHub
URL: https://github.com/simonhmartin/asynt
Owner: simonhmartin
License: gpl-3.0
Created: 2021-10-05T15:16:30.000Z (over 4 years ago)
Default Branch: master
Last Pushed: 2023-06-05T16:32:04.000Z (about 3 years ago)
Last Synced: 2023-06-05T17:30:06.256Z (about 3 years ago)
Language: R
Size: 810 KB
Stars: 13
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

- awesome-genome-visualization: asynt (Comparative)

README

# Asynt: R functions for exploring synteny using whole genome alignments

* Make diagonal 'dot' plots
* Plot alignment tracts between a pair of genomes
* Merge adjacent alignments into synteny 'blocks' for cleaner plots

See our [paper](https://doi.org/10.1098/rstb.2021.0207) for examples of the plots you can make (Figures 1 and S1).

## How to use this code

Make sure you have the R package `intervals` installed (e.g. `install.packages("intervals")`).

If you already have alignment coordinate files from minimap2 (recommended) or MUMmer (using nucmer and show-coords), you are ready to go.
Open the script `asynt_example_plots.R` in an interactive R session (e.g. RStudio) and work through it line by line to explore the kinds of plots you can make.

## Where do I get alignments from?

Make alignments between two assemblies (or of a single assembly against itself) using [minimap2](https://github.com/lh3/minimap2) or [MUMmer](https://mummer4.github.io/).

Here is an example command for minimap2:

`minimap2 -x asm20 reference.fa query.fasta | gzip > mm2asm20.paf.gz`

`-x asm20` uses presets suited for genomes up to 20% divergent.
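
To check what the command above produced, here is a minimal base-R sketch for reading a PAF file. It does not use asynt's own import functions (see `asynt_example_plots.R` for those). The helper name `read_paf` and the column names are assumptions made for illustration. Only the 12 mandatory PAF columns are kept, and the optional tags after them are ignored.

```r
# Hypothetical helper (not part of asynt): read the 12 mandatory PAF columns.
read_paf <- function(path) {
  # gzfile() reads both gzip-compressed and plain-text files
  con <- gzfile(path, "r")
  on.exit(close(con))
  lines <- readLines(con)
  lines <- lines[nzchar(lines)]
  stopifnot(length(lines) > 0)

  # Split on tabs and keep the first 12 fields of every record;
  # records may carry a variable number of optional tags after them
  fields <- strsplit(lines, "\t", fixed = TRUE)
  core <- do.call(rbind, lapply(fields, function(f) f[1:12]))
  paf <- as.data.frame(core, stringsAsFactors = FALSE)

  # Column names chosen for this sketch; "ref" is the PAF target sequence
  names(paf) <- c("query", "query_len", "query_start", "query_end", "strand",
                  "ref", "ref_len", "ref_start", "ref_end",
                  "n_match", "aln_len", "mapq")

  # Everything except the sequence names and the strand is numeric
  num_cols <- setdiff(names(paf), c("query", "strand", "ref"))
  paf[num_cols] <- lapply(paf[num_cols], as.numeric)
  paf
}

# Example usage, assuming the output file name from the command above:
# paf <- read_paf("mm2asm20.paf.gz")
# head(paf)
```

PAF coordinates are 0-based with an exclusive end, and start is always smaller than end, even for alignments on the `-` strand.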

## How does asynt infer synteny blocks?

There are some sophisticated tools that use probabilistic approaches for inferring synteny blocks. This is not one of those.

The algorithm has three steps:
1. Alignments are split into ‘sub-blocks’ that each correspond to a unique tract of the reference assembly.
2. Sub-blocks below a minimum size are discarded.
3. Adjacent sub-blocks that are in the same orientation and are below some threshold distance apart are merged to yield syntenic blocks.

These three steps can be performed iteratively to first identify regions of fine-scale synteny and then build these up into larger syntenic blocks (discarding short overlaps, small inversions, etc.).
The nature of this approach means that you will get a different result depending on what you use as the reference. If possible, use a reference that represents the ancestral state, such that your query genome is being represented as a new arrangement of ancestral blocks.
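
The filtering and merging steps can be illustrated with a simplified base-R sketch. This is not asynt's implementation. The function name `merge_blocks`, its parameters and the column names are hypothetical. The sketch assumes a single reference/query sequence pair and that step 1 has already been done, so the sub-blocks do not overlap on the reference. It also assumes that the distance threshold applies on both the reference and the query.

```r
# Hypothetical sketch of steps 2 and 3 (not asynt's own code).
# `aln` needs columns ref_start, ref_end, query_start, query_end
# (start < end) and strand ("+" or "-").
merge_blocks <- function(aln, min_size = 1000, max_gap = 10000) {
  cols <- c("ref_start", "ref_end", "query_start", "query_end", "strand")

  # Step 2: discard sub-blocks shorter than min_size on the reference
  keep <- aln$ref_end - aln$ref_start >= min_size
  aln <- aln[keep, cols, drop = FALSE]
  if (nrow(aln) == 0) return(aln)

  # Order the remaining sub-blocks along the reference
  aln <- aln[order(aln$ref_start), , drop = FALSE]
  n <- nrow(aln)

  # Step 3: a sub-block starts a new block unless it has the same
  # orientation as its predecessor and is close to it on both genomes
  new_block <- rep(TRUE, n)
  if (n > 1) {
    i <- 2:n
    same_strand <- aln$strand[i] == aln$strand[i - 1]
    ref_gap <- aln$ref_start[i] - aln$ref_end[i - 1]
    # On the "-" strand, query positions decrease along the reference
    query_gap <- ifelse(aln$strand[i] == "+",
                        aln$query_start[i] - aln$query_end[i - 1],
                        aln$query_start[i - 1] - aln$query_end[i])
    new_block[i] <- !(same_strand & ref_gap <= max_gap &
                        abs(query_gap) <= max_gap)
  }
  block_id <- cumsum(new_block)

  # Collapse each run of merged sub-blocks to its outer coordinates
  data.frame(
    ref_start   = as.numeric(tapply(aln$ref_start, block_id, min)),
    ref_end     = as.numeric(tapply(aln$ref_end, block_id, max)),
    query_start = as.numeric(tapply(aln$query_start, block_id, min)),
    query_end   = as.numeric(tapply(aln$query_end, block_id, max)),
    strand      = as.character(tapply(aln$strand, block_id,
                                      function(s) s[1]))
  )
}

# Toy input: the 200 bp sub-block is discarded, then the two "+" and
# the two "-" sub-blocks are each merged into one block
aln <- data.frame(
  ref_start   = c(0, 5200, 6000, 13000, 20500),
  ref_end     = c(5000, 5400, 12000, 20000, 26000),
  query_start = c(0, 90000, 5900, 30000, 24000),
  query_end   = c(5000, 90200, 11900, 37000, 29500),
  strand      = c("+", "+", "+", "-", "-")
)
blocks <- merge_blocks(aln, min_size = 1000, max_gap = 10000)
```

Because the output has the same columns as the input, the function can be applied again with larger thresholds, e.g. `merge_blocks(blocks, min_size = 50000, max_gap = 200000)`, to mimic the iterative use described above. The threshold values here are arbitrary.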
