Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lotharukpongjs/phylomapr
Get precomputed gene age maps (phylomaps) in R
bioinformatics gene-age genomics-data phylostratigraphy
- Host: GitHub
- URL: https://github.com/lotharukpongjs/phylomapr
- Owner: LotharukpongJS
- License: other
- Created: 2023-06-22T13:29:29.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-02T10:32:52.000Z (about 1 year ago)
- Last Synced: 2024-03-19T23:31:47.910Z (8 months ago)
- Topics: bioinformatics, gene-age, genomics-data, phylostratigraphy
- Language: R
- Homepage: https://lotharukpongjs.github.io/phylomapr/
- Size: 12.1 MB
- Stars: 3
- Watchers: 2
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# phylomapr
![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2FLotharukpongJS%2Fphylomapr&label=Visitors&countColor=%23263759&style=flat)

Gene founder events facilitate evolutionary innovations. `phylomapr` enables quick retrieval of precomputed gene age maps (phylomaps) in R. Gene age maps loaded from `phylomapr` integrate _seamlessly_ with [`myTAI`](https://github.com/drostlab/myTAI).
Furthermore, the [carbon footprint of computational work is on the rise](https://www.ebi.ac.uk/about/news/perspectives/greener-principles/). By serving precomputed maps, this package helps reduce that footprint for gene age inference.
## Installation
```r
# install biomartr
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ropensci/biomartr")
# install phylomapr from GitHub (requires the devtools package)
devtools::install_github("LotharukpongJS/phylomapr")
```
## Use Cases
### Retrieve gene age maps using `phylomapr`
Load the `phylomap` of _Apostichopus japonicus_ (Japanese sea cucumber), generated using [GenEra](https://github.com/josuebarrera/GenEra).
```r
# either
Aj.map <- phylomapr::Apostichopus_japonicus.PhyloMap
# or alternatively
library(phylomapr)
Aj.map <- Apostichopus_japonicus.PhyloMap
# inspect the first rows
head(Aj.map)
```
```
Phylostratum GeneID
1 2 tr|A0A0B6VS88|A0A0B6VS88_STIJA
2 1 tr|A0A0G2R1N3|A0A0G2R1N3_STIJA
3 1 tr|A0A0H4BK46|A0A0H4BK46_STIJA
4 3 tr|A0A0X7YCD7|A0A0X7YCD7_STIJA
5 1 tr|A0A1B2ZDN7|A0A1B2ZDN7_STIJA
6 2 tr|A0A1X9J403|A0A1X9J403_STIJA
```
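A quick way to get a feel for a phylomap is to count how many genes fall into each phylostratum. The sketch below uses a toy data frame (`toy.map`, a hypothetical name) that copies only the six rows printed above, so it runs without `phylomapr` installed; the same `table()` call should work on the full `Aj.map` object.

```r
# Toy phylomap mirroring the six rows shown above (not the full dataset)
toy.map <- data.frame(
  Phylostratum = c(2, 1, 1, 3, 1, 2),
  GeneID = c(
    "tr|A0A0B6VS88|A0A0B6VS88_STIJA",
    "tr|A0A0G2R1N3|A0A0G2R1N3_STIJA",
    "tr|A0A0H4BK46|A0A0H4BK46_STIJA",
    "tr|A0A0X7YCD7|A0A0X7YCD7_STIJA",
    "tr|A0A1B2ZDN7|A0A1B2ZDN7_STIJA",
    "tr|A0A1X9J403|A0A1X9J403_STIJA"
  ),
  stringsAsFactors = FALSE
)

# Number of genes assigned to each phylostratum
ps.counts <- table(toy.map$Phylostratum)
print(ps.counts)

# Same information as proportions of all genes in the map
ps.fraction <- prop.table(ps.counts)
print(ps.fraction)
```

On the real map, replacing `toy.map` with `Aj.map` gives the gene age distribution across all 30,032 genes.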
To get the data description, open the help page of the dataset:
```r
?Apostichopus_japonicus.PhyloMap
```
```r
Apostichopus_japonicus.PhyloMap    package:phylomapr    R Documentation

Phylomap of Apostichopus japonicus

Description:
     Gene ages inferred using GenEra on reference protein sequences from
     Uniprot proteomes. Note: DIAMOND was run using the ultra-sensitive
     mode.

Usage:
     Apostichopus_japonicus.PhyloMap

Format:
     A tibble with 30,032 rows and 2 variables:
     Phylostratum dbl Phylostratum (or gene age) assignment
     GeneID chr GeneID annotation from UniProt

Source:
```
### Loading gene age maps into `myTAI`
[`myTAI`](https://github.com/drostlab/myTAI) facilitates evolutionary transcriptomic studies.
Below are some ways in which gene age maps retrieved via `phylomapr` can be integrated _seamlessly_ into `myTAI`.
#### Plot the developmental hourglass (on simulated gene expression data)
This example uses simulated developmental gene expression of _Apostichopus japonicus_ (Japanese sea cucumber).
```r
Aj.map <- phylomapr::Apostichopus_japonicus.PhyloMap
```
Simulate developmental gene expression.
```r
# Set the random seed for reproducibility
set.seed(123)
# Generate log-normally distributed counts for each gene and developmental stage
# (the log-normal assumption is debatable), and create a data frame with the count table
Aj.ExpressionMatrix <- tibble::tibble(
GeneID = Aj.map$GeneID,
`24H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1),
`48H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1),
`72H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1)
)
```
```r
Aj.PES <- myTAI::MatchMap(Aj.map, Aj.ExpressionMatrix, remove.duplicates = FALSE, accumulate = NULL)
```
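As a rough illustration of what the matching step yields and of the quantity that is summarized from it, the sketch below joins a toy phylomap to a toy expression table on `GeneID` and then computes the transcriptome age index (TAI) per stage as the expression-weighted mean phylostratum. This is a base-R approximation for intuition only, not `myTAI`'s implementation; all object names (`toy.map`, `toy.expr`, `toy.pes`, `toy.tai`) and values are made up for the example.

```r
# Toy phylomap: Phylostratum first, GeneID second (same layout as Aj.map)
toy.map <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID = c("geneA", "geneB", "geneC", "geneD"),
  stringsAsFactors = FALSE
)

# Toy expression table: GeneID followed by one column per stage
toy.expr <- data.frame(
  GeneID = c("geneA", "geneB", "geneC", "geneD"),
  `24H` = c(10, 20, 30, 40),
  `48H` = c(40, 30, 20, 10),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Join on GeneID and order columns as Phylostratum, GeneID, stages
toy.pes <- merge(toy.map, toy.expr, by = "GeneID")
toy.pes <- toy.pes[, c("Phylostratum", "GeneID", "24H", "48H")]

# TAI per stage: sum(phylostratum * expression) / sum(expression)
stage.cols <- c("24H", "48H")
toy.tai <- sapply(stage.cols, function(s) {
  sum(toy.pes$Phylostratum * toy.pes[[s]]) / sum(toy.pes[[s]])
})
print(toy.tai)
```

Here the `24H` stage weights the youngest gene most heavily and therefore gets the higher index (2.1 versus 1.4 for `48H`).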
And test the hourglass on the simulated data.
```r
myTAI::PlotSignature(tidyr::drop_na(Aj.PES))
```
![](https://github.com/LotharukpongJS/phylomapr/assets/80110649/29c1866f-9abc-4657-bf6a-013570053090)
#### Next, transform the simulated gene expression data
Note: this requires `myTAI (version > 1.0.1.0000)`.
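For readers without a recent `myTAI`, the effect of the `myTAI::tf()` call used in this section can be approximated in base R. The sketch assumes that `pseudocount = 1` means 1 is added to every expression value before `log2` is applied, and that the first two columns (`Phylostratum`, `GeneID`) are left untouched; the toy table and its values are invented for illustration.

```r
# Toy phylo-expression set: two annotation columns, then one column per stage
toy.pes <- data.frame(
  Phylostratum = c(1, 2, 3),
  GeneID = c("geneA", "geneB", "geneC"),
  `24H` = c(0, 1, 7),
  `48H` = c(3, 15, 31),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Expression columns are everything except the two annotation columns
expr.cols <- setdiff(names(toy.pes), c("Phylostratum", "GeneID"))

# Assumed behaviour: add the pseudocount, then apply log2
toy.pes.log2 <- toy.pes
toy.pes.log2[expr.cols] <- lapply(toy.pes[expr.cols], function(x) log2(x + 1))
print(toy.pes.log2)
```

The pseudocount keeps zero counts finite: `log2(0 + 1)` is 0, whereas `log2(0)` would be `-Inf`.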
```r
Aj.PES.log2 <- myTAI::tf(tidyr::drop_na(Aj.PES), FUN = log2, pseudocount = 1)
hist(Aj.PES.log2$`24H`)
```
![](https://github.com/LotharukpongJS/phylomapr/assets/80110649/1c5ed279-2a13-48a9-af62-f2709ee16fda)
Compare this to the distribution of raw abundance (TPM).
```r
hist(Aj.PES$`24H`, breaks = 200)
```
![](https://github.com/LotharukpongJS/phylomapr/assets/80110649/a29b15a7-c269-427a-9848-acba6b56af9e)
```r
myTAI::PlotSignature(tidyr::drop_na(Aj.PES.log2))
```
![](https://github.com/LotharukpongJS/phylomapr/assets/80110649/144d0c68-54f8-4af2-be46-539f37fc5211)
## Tutorials
- [Gene names in different databases](https://lotharukpongjs.github.io/phylomapr/articles/GeneIDs.html): GeneIDs can differ between databases. This could be an issue when the gene age is estimated with one gene naming convention and the RNA-seq mapping is done with another. This tutorial shows how one could convert gene IDs (`convertID()`) between databases.
- [Adding phylomaps to `phylomapr`](https://lotharukpongjs.github.io/phylomapr/articles/Adding_phylomaps.html): Advanced gene age (phylo)mappers who ran their own gene age inference may want to contribute to `phylomapr`, which is at its core a collaborative effort. This tutorial shows how one could add new phylomaps to `phylomapr`.
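To illustrate the naming mismatch described in the first tutorial: the `GeneID` values in the _Apostichopus japonicus_ phylomap follow the UniProt FASTA header layout `db|Accession|EntryName`, while an RNA-seq quantification may be keyed by the bare accession. The sketch below splits such IDs with base R only. It is not the package's `convertID()` function, and the variable names are hypothetical.

```r
# Two GeneIDs taken from the phylomap output shown earlier
gene.ids <- c(
  "tr|A0A0B6VS88|A0A0B6VS88_STIJA",
  "tr|A0A0G2R1N3|A0A0G2R1N3_STIJA"
)

# Split on the literal "|" (fixed = TRUE avoids regex interpretation)
parts <- strsplit(gene.ids, "|", fixed = TRUE)

# Second field is the accession, third field is the entry name
accession <- vapply(parts, function(p) p[2], character(1))
entry.name <- vapply(parts, function(p) p[3], character(1))

# Lookup table that can be joined to data keyed by accession
id.table <- data.frame(
  GeneID = gene.ids,
  Accession = accession,
  EntryName = entry.name,
  stringsAsFactors = FALSE
)
print(id.table)
```

A table like `id.table` can then be merged with an expression matrix by `Accession` before matching it to the phylomap.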
## Citation
Citations are provided in the data description. Just put a `?` in front of the dataset name, for example `?Apostichopus_japonicus.PhyloMap`.
## Acknowledgement
I would like to thank several individuals for making this mini-project possible.
First, I would like to thank Hajk-Georg Drost for providing me with the intellectual environment that enabled this project.
Furthermore, I would like to thank Susana M. Coelho for hosting and facilitating this research, as well as the Max Planck Institute for Biology Tübingen and the Max Planck Society.
I also thank the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A532B, 031A533A, 031A533B, 031A534A, 031A535A, 031A537A, 031A537B, 031A537C, 031A537D, 031A538A).