https://github.com/neurogenomics/orthogene
𧬠o r t h o g e n e π§¬β¨β¨β¨β¨β¨β¨β¨ Interspecies gene mappingβ¨β¨β¨β¨β¨ π¦ π π± π π³ π π π π π πͺ± π πͺ° π π π π¦ π π π π¦ π π π π π π π π π π π π π π π π π 𦧠π π¦ π πββοΈ
https://github.com/neurogenomics/orthogene
animal-models bioconductor bioconductor-package bioinformatics biomedicine comparative-genomics evolutionary-biology genes genomics ontologies r r-package translational-research
Last synced: 4 months ago
JSON representation
𧬠o r t h o g e n e π§¬β¨β¨β¨β¨β¨β¨β¨ Interspecies gene mappingβ¨β¨β¨β¨β¨ π¦ π π± π π³ π π π π π πͺ± π πͺ° π π π π¦ π π π π¦ π π π π π π π π π π π π π π π π π 𦧠π π¦ π πββοΈ
- Host: GitHub
- URL: https://github.com/neurogenomics/orthogene
- Owner: neurogenomics
- Created: 2021-07-27T21:22:48.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-12-22T05:46:07.000Z (over 1 year ago)
- Last Synced: 2024-05-09T21:09:48.146Z (about 1 year ago)
- Topics: animal-models, bioconductor, bioconductor-package, bioinformatics, biomedicine, comparative-genomics, evolutionary-biology, genes, genomics, ontologies, r, r-package, translational-research
- Language: R
- Homepage: https://doi.org/doi:10.18129/B9.bioc.orthogene
- Size: 7.21 MB
- Stars: 35
- Watchers: 2
- Forks: 4
- Open Issues: 11
-
Metadata Files:
- Readme: README.Rmd
Awesome Lists containing this project
README
---
title: "`orthogene`: Interspecies gene mapping"
author: "`r rworkflows::use_badges(branch='main', add_bioc_release = TRUE, add_bioc_download_month = TRUE, add_bioc_download_rank = TRUE, add_bioc_download_total = TRUE)`"
date: "README updated: `r format( Sys.Date(), '%b-%d-%Y')`
"
output:
github_document
---```{r, echo=FALSE, include=FALSE}
pkg <- read.dcf("DESCRIPTION", fields = "Package")[1]
title <- read.dcf("DESCRIPTION", fields = "Title")[1]
description <- read.dcf("DESCRIPTION", fields = "Description")[1]
URL <- read.dcf('DESCRIPTION', fields = 'URL')[1]
owner <- tolower(strsplit(URL,"/")[[1]][4])
```
# Intro`r description`
In brief, `orthogene` lets you easily:- [**`convert_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#convert-orthologs)
- [**`map_species`** names onto standard taxonomic ontologies.](https://neurogenomics.github.io/orthogene/articles/orthogene#map-species)
- [**`report_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#report-orthologs)
- [**`map_genes`** onto standard ontologies](https://neurogenomics.github.io/orthogene/articles/orthogene#map-genes)
- [**`aggregate_mapped_genes`** in a matrix.](https://neurogenomics.github.io/orthogene/articles/orthogene#aggregate-mapped-genes)
- [get **`all_genes`** from any species.](https://neurogenomics.github.io/orthogene/articles/orthogene#get-all-genes)
- [**`infer_species`** from gene names.](https://neurogenomics.github.io/orthogene/articles/infer_species.html)
- [**`create_background`** gene lists based one, two, or more species.](https://neurogenomics.github.io/orthogene/reference/create_background.html)
- [**`get_silhouettes`** of each species from phylopic.](https://neurogenomics.github.io/orthogene/reference/get_silhouettes.html)
- [**`prepare_tree`** with evolutionary divergence times across >147,000 species.](https://neurogenomics.github.io/orthogene/reference/prepare_tree.html)## Citation
If you use ``r pkg``, please cite:> `r citation(pkg)$textVersion`
## [Documentation website](https://neurogenomics.github.io/orthogene/)
## [PDF manual](https://github.com/neurogenomics/orthogene/blob/main/inst/orthogene_1.5.1.pdf)# Installation
```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# orthogene is only available on Bioconductor>=3.14
if(BiocManager::version()<"3.14") BiocManager::install(update = TRUE, ask = FALSE)BiocManager::install("orthogene")
```## Docker
`orthogene` can also be installed via a [Docker](https://hub.docker.com/repository/docker/neurogenomicslab/orthogene) or [Singularity](https://sylabs.io/guides/2.6/user-guide/singularity_and_docker.html)
container with Rstudio pre-installed. Further [instructions provided here](https://neurogenomics.github.io/orthogene/articles/docker).# Methods
```{r setup}
library(orthogene)data("exp_mouse")
# Setting to "homologene" for the purposes of quick demonstration.
# We generally recommend using method="gprofiler" (default).
method <- "homologene"
```For most functions, `orthogene` lets users choose between different methods,
each with complementary strengths and weaknesses:
`"gprofiler"`, `"homologene"`, and `"babelgene"`In general, we recommend you use `"gprofiler"` when possible,
as it tends to be more comprehensive.While `"babelgene"` contains less species, it queries a wide variety
of orthology databases and can return a column "support_n" that tells
you how many databases support each ortholog gene mapping.
This can be helpful when you need a semi-quantitative
measure of mapping quality.It's also worth noting that for smaller gene sets,
the speed difference between these methods becomes negligible.```{r pros_cons, echo=FALSE}
pros_cons <- data.frame(
gprofiler=c("Reference organisms"="700+",
"Gene mappings"="More comprehensive",
"Updates"="Frequent",
"Orthology databases"=paste("Ensembl",
"HomoloGene",
"WormBase",sep = ", "),
"Data location"="Remote",
"Internet connection"="Required",
"Speed"="Slower"),
homologene=c("# reference organisms"="20+",
"Gene mappings"="Less comprehensive",
"Updates"="Less frequent",
"Orthology databases"="HomoloGene",
"Data location"="Local",
"Internet connection"="Not required",
"Speed"="Faster"),
babelgene=c("# reference organisms"="19 (but cannot convert between pairs of non-human species)",
"Gene mappings"="More comprehensive",
"Updates"="Less frequent",
"Orthology databases"="HGNC Comparison of Orthology Predictions (HCOP), which includes predictions from eggNOG, Ensembl Compara, HGNC, HomoloGene, Inparanoid, NCBI Gene Orthology, OMA, OrthoDB, OrthoMCL, Panther, PhylomeDB, TreeFam and ZFIN",
"Data location"="Local",
"Internet connection"="Not required",
"Speed"="Medium")
)
knitr::kable(pros_cons)
```# Quick example
## Convert orthologs
[`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html)
is very flexible with what users can supply as `gene_df`,
and can take a `data.frame`/`data.table`/`tibble`, (sparse) `matrix`,
or `list`/`vector` containing genes.Genes, transcripts, proteins, SNPs, or genomic ranges will be recognised in
most formats (HGNC, Ensembl, RefSeq, UniProt, etc.)
and can even be a mixture of different formats.All genes will be mapped to gene symbols, unless specified otherwise with the
`...` arguments (see `?orthogene::convert_orthologs` or [here
](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html)
for details).### Note on non-1:1 orthologs
A key feature of
[`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html)
is that it handles the issue of genes with many-to-many mappings across species.
This can occur due to evolutionary divergence, and the function of these genes
tend to be less conserved and less translatable.
Users can address this using different strategies via `non121_strategy=`.```{r convert_orthologs}
gene_df <- orthogene::convert_orthologs(gene_df = exp_mouse,
gene_input = "rownames",
gene_output = "rownames",
input_species = "mouse",
output_species = "human",
non121_strategy = "drop_both_species",
method = method)knitr::kable(as.matrix(head(gene_df)))
````convert_orthologs` is just one of the many useful functions in `orthogene`.
Please see the
[documentation website](https://neurogenomics.github.io/orthogene/articles/orthogene)
for the full vignette.# Additional resources
## [Hex sticker creation](https://github.com/neurogenomics/orthogene/blob/main/inst/hex/hexSticker.Rmd)
## [Benchmarking methods](https://github.com/neurogenomics/orthogene/blob/main/inst/benchmark/benchmarks.Rmd)
# Session Info
```{r Session Info}
utils::sessionInfo()
```
# Related projects
## Tools
- [`gprofiler2`](https://cran.r-project.org/web/packages/gprofiler2/vignettes/gprofiler2.html):
`orthogene` uses this package. `gprofiler2::gorth()` pulls from
[many orthology mapping databases](https://biit.cs.ut.ee/gprofiler/page/organism-list).- [`homologene`](https://github.com/oganm/homologene):
`orthogene` uses this package. Provides API access to NCBI
[HomoloGene](https://www.ncbi.nlm.nih.gov/homologene) database.- [`babelgene`](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html): `orthogene` uses this package. `babelgene::orthologs()` pulls from
[many orthology mapping databases](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html).- [`annotationTools`](https://www.bioconductor.org/packages/release/bioc/html/annotationTools.html):
For interspecies microarray data.- [`orthology`](https://www.leibniz-hki.de/en/orthology-r-package.html):
R package for ortholog mapping (deprecated?).- [`hpgltools::load_biomart_orthologs()`](https://rdrr.io/github/elsayed-lab/hpgltools/man/load_biomart_orthologs.html):
Helper function to get orthologs from biomart.- [`JustOrthologs`](https://github.com/ridgelab/JustOrthologs/):
Ortholog inference from multi-species genomic sequences.- [`orthologr`](https://github.com/drostlab/orthologr):
Ortholog inference from multi-species genomic sequences.- [`OrthoFinder`](https://github.com/davidemms/OrthoFinder):
Gene duplication event inference from multi-species genomics.## Databases
- [HomoloGene](https://www.ncbi.nlm.nih.gov/homologene):
NCBI database that the R package
[homologene](https://github.com/oganm/homologene) pulls from.- [gProfiler](https://biit.cs.ut.ee/gprofiler):
Web server for functional enrichment analysis and conversions of gene lists.- [OrtholoGene](http://orthologene.org/resources.html):
Compiled list of gene orthology resources.## Contact
### [Neurogenomics Lab](https://www.neurogenomics.co.uk/)UK Dementia Research Institute
Department of Brain Sciences
Faculty of Medicine
Imperial College London
[GitHub](https://github.com/neurogenomics)
[DockerHub](https://hub.docker.com/orgs/neurogenomicslab)