{"id":23188785,"url":"https://github.com/jasenfinch/metabolyser","last_synced_at":"2025-10-25T10:31:49.068Z","repository":{"id":21918067,"uuid":"88983134","full_name":"jasenfinch/metabolyseR","owner":"jasenfinch","description":"Methods for pre-treatment, modelling/data mining, and correlation analyses of metabolomics data","archived":false,"fork":false,"pushed_at":"2024-04-10T13:08:17.000Z","size":61616,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-18T11:15:35.858Z","etag":null,"topics":["correlation-analyses","data-mining","metabolomics-data","modelling","pre-treatment","r-package"],"latest_commit_sha":null,"homepage":"https://jasenfinch.github.io/metabolyseR/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jasenfinch.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2017-04-21T12:48:23.000Z","updated_at":"2024-08-22T14:46:58.000Z","dependencies_parsed_at":"2023-02-14T10:00:33.355Z","dependency_job_id":"5d25966e-05de-4a48-8458-9fca41b43ace","html_url":"https://github.com/jasenfinch/metabolyseR","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasenfinch%2FmetabolyseR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasenfinch%2FmetabolyseR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasenfinch%2FmetabolyseR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/jasenfinch%2FmetabolyseR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jasenfinch","download_url":"https://codeload.github.com/jasenfinch/metabolyseR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238124987,"owners_count":19420457,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["correlation-analyses","data-mining","metabolomics-data","modelling","pre-treatment","r-package"],"created_at":"2024-12-18T11:15:40.157Z","updated_at":"2025-10-25T10:31:43.755Z","avatar_url":"https://github.com/jasenfinch.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(collapse = TRUE, \n                      comment = \"#\u003e\",\n                      fig.align = 'center',\n                      fig.path = \"man/figures/README-\",\n                      message = FALSE,\n                      warning = FALSE)\n```\n\n# metabolyseR\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![R-CMD-check](https://github.com/jasenfinch/metabolyseR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jasenfinch/metabolyseR/actions/workflows/R-CMD-check.yaml)\n[![codecov](https://codecov.io/gh/jasenfinch/metabolyseR/branch/master/graph/badge.svg)](https://codecov.io/gh/jasenfinch/metabolyseR/branch/master) \n[![license](https://img.shields.io/badge/license-GNU%20GPL%20v3.0-blue.svg)](https://github.com/jasenfinch/metabolyseR/blob/master/DESCRIPTION)\n[![DOI](https://zenodo.org/badge/88983134.svg)](https://zenodo.org/badge/latestdoi/88983134)\n[![GitHub release](https://img.shields.io/github/release/jasenfinch/metabolyseR.svg)](https://GitHub.com/jasenfinch/metabolyseR/releases/)\n\u003c!-- badges: end --\u003e\n\n\u003e **A tool kit for pre-treatment, modelling, feature selection and correlation analyses of metabolomics data.**\n\n## Overview\n\nThis package provides a tool kit of methods for metabolomics analyses that includes: \n\n* data pre-treatment\n* multivariate and univariate modelling/data mining techniques\n* correlation analysis\n\n## Installation\n\nThe `metabolyseR` package can be installed from GitHub using the following:\n\n```r\nremotes::install_github('jasenfinch/metabolyseR')\n```\n\nThe package documentation can be browsed online at \u003chttps://jasenfinch.github.io/metabolyseR/\u003e; however, if users want to compile the vignettes locally, the following can be 
used.\n\n```r\nremotes::install_github('jasenfinch/metabolyseR',build_vignettes = TRUE,dependencies = TRUE)\n```\n\n## Learn more\n\nThe package documentation can be browsed online at \u003chttps://jasenfinch.github.io/metabolyseR/\u003e. \n\nIf this is your first time using `metabolyseR`, see the [Introduction](https://jasenfinch.github.io/metabolyseR/articles/metabolyseR.html) vignette or the quick start analysis below for information on how to get started.\n\nIf you believe you've found a bug in `metabolyseR`, please file an issue (with, if\npossible, a [reproducible example](https://reprex.tidyverse.org)) at\n\u003chttps://github.com/jasenfinch/metabolyseR/issues\u003e.\n\n## Quick start example analysis\n\nThis example analysis will use the `abr1` data set from the [metaboData](https://aberhrml.github.io/metaboData/) package. \nThis is nominal mass flow-injection mass spectrometry (FI-MS) fingerprinting data from a plant-pathogen infection time course experiment.\nThe analysis also uses the pipe `%\u003e%` from the [magrittr](https://magrittr.tidyverse.org/) package.\nFirst, load the necessary packages.\n\n```{r setup}\nlibrary(metabolyseR)\nlibrary(metaboData)\n```\n\nFor this example, we will use only the negative acquisition mode data (`abr1$neg`) and sample meta-information (`abr1$fact`).\nCreate an `AnalysisData` class object using the following:\n\n```{r analysis_data}\nd \u003c- analysisData(abr1$neg,abr1$fact)\n```\n\nThe data includes `r nSamples(d)` samples and `r nFeatures(d)` mass spectral features as shown below.\n\n```{r print_analysis_data}\nd\n```\n\nThe `clsAvailable()` function can be used to identify the columns available in our meta-information table. 
\n\n```{r}\nclsAvailable(d)\n```\n\nFor this analysis, we will be using the infection time course class information contained in the `day` column.\nThis can be extracted and the class frequencies tabulated using the following:\n\n```{r}\nd %\u003e%\n  clsExtract(cls = 'day') %\u003e%\n  table()\n```\n\nAs can be seen above, the experiment is made up of six infection time point classes: a healthy control class (`H`) and five infection time points (days `1-5`), each with 20 replicates. \n\nFor data pre-treatment before statistical analysis, a two-thirds maximum class occupancy filter can be applied.\nFeatures where the maximum proportion of non-missing data per class is above two-thirds are retained.\nA total ion count normalisation will also be applied.\n\n```{r pre_treat}\nd \u003c- d %\u003e%\n  occupancyMaximum(cls = 'day', occupancy = 2/3) %\u003e%\n  transformTICnorm()\n```\n\n```{r pre_treat_result}\nd\n```\n\nThis has reduced the data set to `r nFeatures(d)` relevant features.\n\nThe structure of the data can be visualised using both unsupervised and supervised methods. 
For instance, the first two principal components from a principal component analysis (PCA) of the data, with the sample points coloured by infection class, can be plotted using: \n\n```{r pca}\nplotPCA(d,cls = 'day',xAxis = 'PC1',yAxis = 'PC2')\n```\n\nSimilarly, multidimensional scaling (MDS) of sample proximity values from a supervised random forest classification model can be plotted, along with receiver operating characteristic (ROC) curves, using:\n\n```{r supervised_RF}\nplotSupervisedRF(d,cls = 'day')\n```\n\nA progression can clearly be seen from the earliest to latest infected time points.\n\nFor feature selection, one-way analysis of variance (ANOVA) can be performed for each feature to identify features significantly explanatory for the infection time point.\n\n```{r anova}\nanova_results \u003c- d %\u003e%\n  anova(cls = 'day')\n```\n\nA table of the features that are significantly explanatory at a Bonferroni-adjusted p value \u003c 0.05 can be extracted using:\n\n```{r explanatoty_features_extract}\nexplan_feat \u003c- explanatoryFeatures(anova_results,threshold = 0.05)\n```\n\n```{r,explanatory_features}\nexplan_feat\n```\n\nThe ANOVA has identified `r nrow(explan_feat)` features significantly explanatory over the infection time course.\nA heat map of the mean relative intensity for each class of these explanatory features can be plotted to visualise their trends between the infection time point classes.\n\n```{r rf_heatmap,fig.height=10,fig.width=5}\nplotExplanatoryHeatmap(anova_results,\n                       threshold = 0.05,\n                       featureNames = FALSE)\n```\n\nMany of the explanatory features can be seen to be most abundant in the final infection time point `5`.\n\nFinally, box plots of the trends of individual features can be plotted, such as the `N341` feature below.\n\n```{r feature_plot}\nplotFeature(anova_results,feature = 'N341',cls = 
'day')\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjasenfinch%2Fmetabolyser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjasenfinch%2Fmetabolyser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjasenfinch%2Fmetabolyser/lists"}