{"id":24847371,"url":"https://github.com/neurogenomics/orthogene","last_synced_at":"2025-10-14T18:31:49.078Z","repository":{"id":49782345,"uuid":"390130503","full_name":"neurogenomics/orthogene","owner":"neurogenomics","description":"🧬 o r t h o g e n e 🧬✨✨✨✨✨✨✨ Interspecies gene mapping✨✨✨✨✨   🦠 🔁 🌱 🔁 🌳 🔁 🍎 🔁 🍊 🔁 🪱 🔁 🪰 🔁 🐟 🔁 🦎 🔁 🐓 🔁 🦇 🔁 🐄 🔁 🐖 🔁 🐐 🔁 🐎 🔁 🐈 🔁 🐕 🔁 🐁 🔁 🐒 🔁 🦧 🔁 🦍 🔁 🏃‍♀️","archived":false,"fork":false,"pushed_at":"2025-09-27T17:58:31.000Z","size":8777,"stargazers_count":45,"open_issues_count":10,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-27T18:22:04.123Z","etag":null,"topics":["animal-models","bioconductor","bioconductor-package","bioinformatics","biomedicine","comparative-genomics","evolutionary-biology","genes","genomics","ontologies","r","r-package","translational-research"],"latest_commit_sha":null,"homepage":"https://doi.org/doi:10.18129/B9.bioc.orthogene","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neurogenomics.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-07-27T21:22:48.000Z","updated_at":"2025-09-27T17:58:35.000Z","dependencies_parsed_at":"2023-01-22T20:00:44.896Z","dependency_job_id":"ac88bd04-c317-4737-8469-2b44a5423d9a","html_url":"https://github.com/neurogenomics/orthogene","commit_stats":{"total_commits":265,"total_committers":5,"mean_commits":53.0,"dds":"0.13962264150943393","last_synced_commit":"c57a55abab9696794be1d5359c630acaf3e31871"}
,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/neurogenomics/orthogene","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurogenomics%2Forthogene","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurogenomics%2Forthogene/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurogenomics%2Forthogene/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurogenomics%2Forthogene/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neurogenomics","download_url":"https://codeload.github.com/neurogenomics/orthogene/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurogenomics%2Forthogene/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279020355,"owners_count":26086866,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["animal-models","bioconductor","bioconductor-package","bioinformatics","biomedicine","comparative-genomics","evolutionary-biology","genes","genomics","ontologies","r","r-package","translational-research"],"created_at":"2025-01-31T11:20:04.909Z","updated_at":"2025-10-14T18:31:48.093Z","avatar_url":"https://github.com
/neurogenomics.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\ntitle: \"`orthogene`: Interspecies gene mapping\"  \nauthor: \"`r rworkflows::use_badges(branch='main', add_bioc_release = TRUE, add_bioc_download_month = TRUE, add_bioc_download_rank = TRUE, add_bioc_download_total = TRUE)`\" \ndate: \"\u003ch4\u003eREADME updated: \u003ci\u003e`r format( Sys.Date(), '%b-%d-%Y')`\u003c/i\u003e\u003c/h4\u003e\"\noutput:\n  github_document\n---\n\n```{r, echo=FALSE, include=FALSE}\npkg \u003c- read.dcf(\"DESCRIPTION\", fields = \"Package\")[1]\ntitle \u003c- read.dcf(\"DESCRIPTION\", fields = \"Title\")[1]\ndescription \u003c- read.dcf(\"DESCRIPTION\", fields = \"Description\")[1]\nURL \u003c- read.dcf('DESCRIPTION', fields = 'URL')[1]\nowner \u003c- tolower(strsplit(URL,\"/\")[[1]][4])\n```\n \n# Intro \n\n`r description`\n \nIn brief, `orthogene` lets you easily: \n\n- [**`convert_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#convert-orthologs) \n- [**`map_species`** names onto standard taxonomic ontologies.](https://neurogenomics.github.io/orthogene/articles/orthogene#map-species)  \n- [**`report_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#report-orthologs) \n- [**`map_genes`** onto standard ontologies](https://neurogenomics.github.io/orthogene/articles/orthogene#map-genes) \n- [**`aggregate_mapped_genes`** in a matrix.](https://neurogenomics.github.io/orthogene/articles/orthogene#aggregate-mapped-genes)  \n- [get **`all_genes`** from any species.](https://neurogenomics.github.io/orthogene/articles/orthogene#get-all-genes) \n- [**`infer_species`** from gene names.](https://neurogenomics.github.io/orthogene/articles/infer_species.html)    \n- [**`create_background`** gene lists based one, two, or more species.](https://neurogenomics.github.io/orthogene/reference/create_background.html)    \n- [**`get_silhouettes`** of 
each species from phylopic.](https://neurogenomics.github.io/orthogene/reference/get_silhouettes.html)    \n- [**`prepare_tree`** with evolutionary divergence times across \u003e147,000 species.](https://neurogenomics.github.io/orthogene/reference/prepare_tree.html)    \n\n## Citation\n \nIf you use ``r pkg``, please cite: \n\n\u003c!-- Modify this by editing the file: inst/CITATION  --\u003e\n\u003e `r citation(pkg)$textVersion`\n\n\n## [Documentation website](https://neurogenomics.github.io/orthogene/) \n## [PDF manual](https://github.com/neurogenomics/orthogene/blob/main/inst/orthogene_1.5.1.pdf) \n\n# Installation\n\n```{r, eval=FALSE}\nif (!requireNamespace(\"BiocManager\", quietly = TRUE)) install.packages(\"BiocManager\")\n# orthogene is only available on Bioconductor\u003e=3.14\nif(BiocManager::version()\u003c\"3.14\") BiocManager::install(update = TRUE, ask = FALSE)\n\nBiocManager::install(\"orthogene\")\n```\n\n## Docker \n\n`orthogene` can also be installed via a  [Docker](https://hub.docker.com/repository/docker/neurogenomicslab/orthogene) or [Singularity](https://sylabs.io/guides/2.6/user-guide/singularity_and_docker.html)\ncontainer with Rstudio pre-installed. Further [instructions provided here](https://neurogenomics.github.io/orthogene/articles/docker).  \n\n\n# Methods \n\n```{r setup}\nlibrary(orthogene)\n\ndata(\"exp_mouse\")\n# Setting to \"homologene\" for the purposes of quick demonstration.\n# We generally recommend using method=\"gprofiler\" (default).\nmethod \u003c- \"homologene\"  \n```\n\nFor most functions, `orthogene` lets users choose between different methods,\neach with complementary strengths and weaknesses: \n`\"gprofiler\"`, `\"homologene\"`, and `\"babelgene\"`\n\nIn general, we recommend you use `\"gprofiler\"` when possible, \nas it tends to be more comprehensive. 
\n\nWhile `\"babelgene\"` contains fewer species, it queries a wide variety \nof orthology databases and can return a column \"support_n\" that tells \nyou how many databases support each ortholog gene mapping. \nThis can be helpful when you need a semi-quantitative\nmeasure of mapping quality.\n\nIt's also worth noting that for smaller gene sets, \nthe speed difference between these methods becomes negligible. \n\n```{r pros_cons, echo=FALSE}\npros_cons \u003c- data.frame(\n    gprofiler=c(\"Reference organisms\"=\"700+\",\n                \"Gene mappings\"=\"More comprehensive\",\n                \"Updates\"=\"Frequent\", \n                \"Orthology databases\"=paste(\"Ensembl\",\n                                            \"HomoloGene\",\n                                            \"WormBase\",sep = \", \"),\n                \"Data location\"=\"Remote\",\n                \"Internet connection\"=\"Required\",\n                \"Speed\"=\"Slower\"),\n                       \n   homologene=c(\"# reference organisms\"=\"20+\", \n                \"Gene mappings\"=\"Less comprehensive\",\n                \"Updates\"=\"Less frequent\",\n                \"Orthology databases\"=\"HomoloGene\",\n                \"Data location\"=\"Local\",\n                \"Internet connection\"=\"Not required\",\n                \"Speed\"=\"Faster\"),\n   \n    babelgene=c(\"# reference organisms\"=\"19 (but cannot convert between pairs of non-human species)\", \n                \"Gene mappings\"=\"More comprehensive\",\n                \"Updates\"=\"Less frequent\",\n                \"Orthology databases\"=\"HGNC Comparison of Orthology Predictions (HCOP), which includes predictions from eggNOG, Ensembl Compara, HGNC, HomoloGene, Inparanoid, NCBI Gene Orthology, OMA, OrthoDB, OrthoMCL, Panther, PhylomeDB, TreeFam and ZFIN\",\n                \"Data location\"=\"Local\",\n                \"Internet connection\"=\"Not required\",\n                \"Speed\"=\"Medium\")\n           
)\nknitr::kable(pros_cons)\n```\n\n\n\n# Quick example\n\n## Convert orthologs \n\n[`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) \nis very flexible with what users can supply as `gene_df`,\nand can take a `data.frame`/`data.table`/`tibble`, (sparse) `matrix`, \nor `list`/`vector` containing genes.\n\nGenes, transcripts, proteins, SNPs, or genomic ranges will be recognised in \nmost formats (HGNC, Ensembl, RefSeq, UniProt, etc.) \nand can even be a mixture of different formats. \n\nAll genes will be mapped to gene symbols, unless specified otherwise with the\n`...` arguments (see `?orthogene::convert_orthologs` or [here\n](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) \nfor details).  \n\n### Note on non-1:1 orthologs \n\nA key feature of \n[`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) \nis that it handles the issue of genes with many-to-many mappings across species. \nThis can occur due to evolutionary divergence, and the function of these genes \ntends to be less conserved and less translatable. \nUsers can address this using different strategies via `non121_strategy=`.\n\n```{r convert_orthologs}\ngene_df \u003c- orthogene::convert_orthologs(gene_df = exp_mouse,\n                                        gene_input = \"rownames\", \n                                        gene_output = \"rownames\", \n                                        input_species = \"mouse\",\n                                        output_species = \"human\",\n                                        non121_strategy = \"drop_both_species\",\n                                        method = method) \n\nknitr::kable(as.matrix(head(gene_df)))\n```\n\n`convert_orthologs` is just one of the many useful functions in `orthogene`. 
\nPlease see the\n[documentation website](https://neurogenomics.github.io/orthogene/articles/orthogene) \nfor the full vignette.\n\n\n# Additional resources \n\n## [Hex sticker creation](https://github.com/neurogenomics/orthogene/blob/main/inst/hex/hexSticker.Rmd)\n\n## [Benchmarking methods](https://github.com/neurogenomics/orthogene/blob/main/inst/benchmark/benchmarks.Rmd)\n\n\n# Session Info \n\n\u003cdetails\u003e \n\n```{r Session Info}\nutils::sessionInfo()\n```\n\n\u003c/details\u003e  \n\n\n# Related projects\n\n## Tools \n\n- [`gprofiler2`](https://cran.r-project.org/web/packages/gprofiler2/vignettes/gprofiler2.html): \n`orthogene` uses this package. `gprofiler2::gorth()` pulls from \n[many orthology mapping databases](https://biit.cs.ut.ee/gprofiler/page/organism-list). \n\n- [`homologene`](https://github.com/oganm/homologene): \n`orthogene` uses this package. Provides API access to NCBI\n[HomoloGene](https://www.ncbi.nlm.nih.gov/homologene) database. \n\n- [`babelgene`](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html): `orthogene` uses this package. `babelgene::orthologs()` pulls from \n[many orthology mapping databases](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html). \n\n- [`annotationTools`](https://www.bioconductor.org/packages/release/bioc/html/annotationTools.html): \nFor interspecies microarray data.  \n\n- [`orthology`](https://www.leibniz-hki.de/en/orthology-r-package.html): \nR package for ortholog mapping (deprecated?). \n\n- [`hpgltools::load_biomart_orthologs()`](https://rdrr.io/github/elsayed-lab/hpgltools/man/load_biomart_orthologs.html): \nHelper function to get orthologs from biomart. \n\n- [`JustOrthologs`](https://github.com/ridgelab/JustOrthologs/): \nOrtholog inference from multi-species genomic sequences. \n\n- [`orthologr`](https://github.com/drostlab/orthologr): \nOrtholog inference from multi-species genomic sequences.  
\n\n- [`OrthoFinder`](https://github.com/davidemms/OrthoFinder): \nGene duplication event inference from multi-species genomics.  \n\n\n## Databases  \n\n- [HomoloGene](https://www.ncbi.nlm.nih.gov/homologene): \nNCBI database that the R package \n[homologene](https://github.com/oganm/homologene) pulls from.   \n\n- [gProfiler](https://biit.cs.ut.ee/gprofiler): \nWeb server for functional enrichment analysis and conversions of gene lists.  \n\n- [OrtholoGene](http://orthologene.org/resources.html): \nCompiled list of gene orthology resources. \n\n\n## Contact\n \n### [Neurogenomics Lab](https://www.neurogenomics.co.uk/)\n\nUK Dementia Research Institute  \nDepartment of Brain Sciences  \nFaculty of Medicine  \nImperial College London   \n[GitHub](https://github.com/neurogenomics)  \n[DockerHub](https://hub.docker.com/orgs/neurogenomicslab)  \n\n\u003cbr\u003e\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurogenomics%2Forthogene","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneurogenomics%2Forthogene","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurogenomics%2Forthogene/lists"}