{"id":24527285,"url":"https://github.com/biobakery/melonnpan","last_synced_at":"2025-10-30T06:03:22.119Z","repository":{"id":41142674,"uuid":"80059641","full_name":"biobakery/melonnpan","owner":"biobakery","description":"Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles","archived":false,"fork":false,"pushed_at":"2024-03-26T18:07:51.000Z","size":3998,"stargazers_count":35,"open_issues_count":0,"forks_count":8,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-28T04:22:52.311Z","etag":null,"topics":["biobakery","bioconductor","machine-learning","metabolite-compounds","metabolite-prediction","metabolomics","metagenomics","microbial-communities","microbiome","multi-omics","public","tools"],"latest_commit_sha":null,"homepage":"http://huttenhower.sph.harvard.edu/melonnpan","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/biobakery.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-01-25T21:28:06.000Z","updated_at":"2025-02-12T06:15:03.000Z","dependencies_parsed_at":"2023-10-20T18:25:39.305Z","dependency_job_id":null,"html_url":"https://github.com/biobakery/melonnpan","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biobakery%2Fmelonnpan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biobakery%2Fmelonnpan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biobakery%2Fmelonnpan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/biobakery%2Fmelonnpan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/biobakery","download_url":"https://codeload.github.com/biobakery/melonnpan/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248905019,"owners_count":21180906,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["biobakery","bioconductor","machine-learning","metabolite-compounds","metabolite-prediction","metabolomics","metagenomics","microbial-communities","microbiome","multi-omics","public","tools"],"created_at":"2025-01-22T06:17:54.732Z","updated_at":"2025-10-30T06:03:22.113Z","avatar_url":"https://github.com/biobakery.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"MelonnPan - Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles\n================\nHimel Mallick\n2023-02-14\n\n-   [Introduction](#introduction)\n-   [How to Install](#how-to-install)\n    -   [From R](#from-r)\n    -   [From the command line](#from-the-command-line)\n-   [Usage](#usage)\n    -   [Input](#input)\n    -   [Output](#output)\n-   [References](#references)\n-   [Citation](#citation)\n\nIntroduction\n------------\n\nMelonnPan is a computational method for predicting metabolite compositions from microbiome sequencing data.\n\n![Overview of MelonnPan](./vignettes/Figure_0.jpg)\n\nMelonnPan is composed of two high-level workflows: `MelonnPan-Predict` and `MelonnPan-Train`.\n\nThe `MelonnPan-Predict` workflow takes a table of microbial 
sequence features (i.e., taxonomic or functional abundances on a per sample basis) as input, and outputs a predicted metabolomic table (i.e., relative abundances of metabolite compounds across samples).\n\nThe `MelonnPan-Train` workflow creates a weight matrix that links an optimal set of sequence features to a subset of predictable metabolites following rigorous internal validation, which is then used to generate a table of predicted metabolite compounds (i.e., relative abundances of metabolite compounds per sample). When sufficiently accurate, these predicted metabolite relative abundances can be used for downstream statistical analysis and end-to-end biomarker discovery.\n\nHow to Install\n--------------\n\nThere are two options for installing `MelonnPan`:\n\n### From R\n\nIn R, you can install `MelonnPan` using the `devtools` package as follows (execute from within a fresh R session):\n\n``` r\ninstall.packages('devtools') # Install devtools if not installed already\nlibrary(devtools) # Load devtools\ndevtools::install_github(\"biobakery/melonnpan\") # Install MelonnPan\n```\n\n### From the command line\n\nClone the repository using `git clone`, which downloads the package as its own directory called `melonnpan`.\n\n``` bash\ngit clone https://github.com/biobakery/melonnpan.git\n```\n\nThen, install MelonnPan using `R CMD INSTALL`.\n\n``` bash\nR CMD INSTALL melonnpan\n```\n\nUsage\n-----\n\nMelonnPan can be run from the command line or from within R. Both methods require the same arguments, have the same options, and use the same default settings. Check out the [MelonnPan tutorial](https://github.com/biobakery/biobakery/wiki/melonnpan) for an example application.\n\n-   The default `MelonnPan-Predict` function can be run by executing the script `predict_metabolites.R` from the command line or within R using the function `melonnpan.predict()`. 
Currently it uses a [pre-trained model](https://github.com/biobakery/melonnpan/blob/master/data/melonnpan.trained.model.txt) from the human gut based on UniRef90 gene families (functionally profiled by [HUMAnN2](http://huttenhower.sph.harvard.edu/humann2)), as described in Franzosa et al. (2019) and the original MelonnPan paper (Mallick et al., 2019), which is included in the package and can also be downloaded from the [`data/`](https://github.com/biobakery/melonnpan/blob/master/data) sub-directory (**melonnpan.trained.model.txt**).\n\n-   If you have paired metabolite and microbial sequencing data (possibly measured from the same biospecimen), you can also train a MelonnPan model by running the script `train_metabolites.R` from the command line or within R using the function `melonnpan.train()`.\n\n-   MelonnPan currently requires input data that is specified using UniRef90 gene families (functionally profiled by [HUMAnN2](http://huttenhower.sph.harvard.edu/humann2)). If you do not have functionally profiled UniRef90 gene families from the human gut or other environments, you may need to first train a MelonnPan model using the `MelonnPan-Train` workflow and supply the resulting weights to the `MelonnPan-Predict` module to get the relevant predictions.\n\n### Input\n\n-   `MelonnPan-Predict` workflow requires the following input:\n    -   a table of microbial sequence features' relative abundances (samples in rows)\n-   `MelonnPan-Train` workflow requires the following inputs:\n    -   a table of metabolite relative abundances (samples in rows)\n    -   a table of microbial sequence features' relative abundances (samples in rows)\n-   For a complete description of the possible parameters for specific `MelonnPan` functions and their default values and output, run the help within R with the `?` operator.\n\n### Output\n\n-   The `MelonnPan-Predict` workflow outputs the following:\n    -   **MelonnPan\\_Predicted\\_Metabolites.txt**: Predicted relative abundances of 
metabolites as determined by `MelonnPan-Predict`.\n    -   **MelonnPan\\_RTSI.txt**: Table summarizing RTSI scores per sample.\n-   Similarly, the `MelonnPan-Train` workflow outputs the following:\n    -   **MelonnPan\\_Training\\_Summary.txt**: Significant compounds list with per-compound prediction accuracy (correlation coefficient) and the associated p-value and q-value.\n    -   **MelonnPan\\_Trained\\_Metabolites.txt**: Predicted relative abundances of statistically significant metabolites as determined by `MelonnPan-Train`.\n    -   **MelonnPan\\_Trained\\_Weights.txt**: Table summarizing coefficient estimates (weights) per compound.\n\nReferences\n----------\n\nZou H, Hastie T (2005). Regularization and Variable Selection via the Elastic Net. *Journal of the Royal Statistical Society. Series B (Methodological)* 67(2):301–320.\n\nFranzosa EA et al. (2019). [Gut microbiome structure and metabolic activity in inflammatory bowel disease](https://www.ncbi.nlm.nih.gov/pubmed/30531976). *Nature Microbiology* 4(2):293–305.\n\nCitation\n--------\n\nMallick H, Franzosa EA, McIver LJ, Banerjee S, Sirota-Madi A, Kostic AD, Clish CB, Vlamakis H, Xavier R, Huttenhower C (2019). [Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences](https://www.ncbi.nlm.nih.gov/pubmed/31316056). *Nature Communications* 10(1):3136-3146.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiobakery%2Fmelonnpan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbiobakery%2Fmelonnpan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiobakery%2Fmelonnpan/lists"}