{"id":32111337,"url":"https://github.com/kechrislab/msprep","last_synced_at":"2025-10-20T14:24:51.796Z","repository":{"id":44454476,"uuid":"89963266","full_name":"KechrisLab/MSPrep","owner":"KechrisLab","description":"A processing pipeline for the summarization, normalization and diagnostics of mass spectrometry–based metabolomics data.","archived":false,"fork":false,"pushed_at":"2022-01-28T20:39:19.000Z","size":5430,"stargazers_count":10,"open_issues_count":2,"forks_count":3,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-09-08T13:58:42.359Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KechrisLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-01T21:09:55.000Z","updated_at":"2024-03-13T05:32:35.000Z","dependencies_parsed_at":"2022-07-28T22:48:47.752Z","dependency_job_id":null,"html_url":"https://github.com/KechrisLab/MSPrep","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/KechrisLab/MSPrep","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KechrisLab%2FMSPrep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KechrisLab%2FMSPrep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KechrisLab%2FMSPrep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KechrisLab%2FMSPrep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KechrisLab","download_url":"https://codeload.github.com/KechrisLab/MSPrep/tar.gz/refs/heads/mast
er","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KechrisLab%2FMSPrep/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280104867,"owners_count":26272833,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-20T02:00:06.978Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-20T14:24:48.487Z","updated_at":"2025-10-20T14:24:51.791Z","avatar_url":"https://github.com/KechrisLab.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"MSPrep\n======\n\n### Introduction\n\n`MSPrep` provides a convenient set of functionalities used in the pre-analytic\nprocessing pipeline for mass spectrometry based metabolomics data. Functions are\nincluded for the following processes commonly performed prior to analysis of\nsuch data:\n\n1. Summarization of technical replicates (if available)\n2. Filtering of metabolites\n3. Imputation of missing values\n4. Transformation, normalization, and batch correction\n\nOriginal manuscript published in\n[Bioinformatics](https://academic.oup.com/bioinformatics/article/30/1/133/236721),\nand package is hosted by [Bioconductor](https://bioconductor.org/packages/release/bioc/html/MSPrep.html).\n\nAdditional helpful links:\n1. 
[Vignette providing detailed instructions with examples](https://bioconductor.org/packages/release/bioc/vignettes/MSPrep/inst/doc/using_MSPrep.html)\n2. [Reference Manual describing function usage](https://bioconductor.org/packages/release/bioc/manuals/MSPrep/man/MSPrep.pdf)\n\n\n### Installation\n\nInstall via Bioconductor:\n\n    if (!requireNamespace(\"BiocManager\", quietly=TRUE))\n        install.packages(\"BiocManager\")\n\n    BiocManager::install(\"MSPrep\")\n\nInstall via GitHub:\n\n    if (!require(\"devtools\")) install.packages(\"devtools\")\n    devtools::install_github(\"KechrisLab/MSPrep\")\n\n### Examples\n\nTwo examples are provided below. For more detailed information, see the\npackage vignette, which can be accessed [via Bioconductor](https://bioconductor.org/packages/release/bioc/vignettes/MSPrep/inst/doc/using_MSPrep.html)\nor by running the following R command after installing the package:\n\n```s\nvignette(\"using_MSPrep\", package = \"MSPrep\")\n```\n\nThe following code loads the example data set, `msquant`, summarizes its\ntechnical replicates, keeps only metabolites that are\npresent in at least 80% of samples, imputes missing values using k-nearest neighbors,\napplies a log base ten transformation, and finally normalizes and batch corrects\nthe data set using quantile normalization and ComBat batch correction. 
Data is\nthen returned as a `data.frame`.\n\n```s\nlibrary(MSPrep)\ndata(msquant)\n\npreparedDF \u003c- msPrepare(msquant,\n                        minPropPresent = 1/3,\n                        missingValue = 1,\n                        filterPercent = 0.8,\n                        imputeMethod = \"knn\",\n                        transform = \"log10\",\n                        normalizeMethod = \"quantile + ComBat\",\n                        covariatesOfInterest = c(\"spike\"),\n                        compVars = c(\"mz\", \"rt\"),\n                        sampleVars = c(\"spike\", \"batch\", \"replicate\",\n                                       \"subject_id\"),\n                        colExtraText = \"Neutral_Operator_Dif_Pos_\",\n                        separator = \"_\")\n```\n\nThe second example uses the data set `COPD_131`. The raw data set can be found [at Metabolomics Workbench](https://www.metabolomicsworkbench.org/data/DRCCMetadata.php?Mode=Project\u0026ProjectID=PR000438). The code loads the data set,\nsummarizes its technical replicates, keeps only metabolites that are\npresent in at least 80% of samples, imputes missing values using BPCA imputation,\nand finally normalizes the data set using median normalization. 
Data is then\nreturned as a `SummarizedExperiment` by setting the argument\n`returnToSE = TRUE`.\n\n```s\nlibrary(MSPrep)\ndata(COPD_131)\n\npreparedSE \u003c- msPrepare(COPD_131,\n                        minPropPresent = 1/3,\n                        filterPercent = 0.8,\n                        missingValue = 0,\n                        imputeMethod = \"bpca\",\n                        nPcs = 3,\n                        normalizeMethod = \"median\",\n                        transform = \"none\",\n                        compVars = c(\"Mass\", \"Retention.Time\",\n                                     \"Compound.Name\"),\n                        sampleVars = c(\"subject_id\", \"replicate\"),\n                        colExtraText = \"X\",\n                        separator = \"_\",\n                        returnToSE = TRUE)\n```\n\n### Bug Reports\n\nReport bugs by opening a [new issue on the GitHub\nrepository](https://github.com/KechrisLab/MSPrep/issues/new).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkechrislab%2Fmsprep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkechrislab%2Fmsprep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkechrislab%2Fmsprep/lists"}