{"id":28190331,"url":"https://github.com/microbiome/microbiomedatasets","last_synced_at":"2025-07-04T15:34:14.272Z","repository":{"id":47230284,"uuid":"309655301","full_name":"microbiome/microbiomeDataSets","owner":"microbiome","description":"Experiment Hub based microbiome datasets","archived":false,"fork":false,"pushed_at":"2024-09-21T09:22:50.000Z","size":15241,"stargazers_count":6,"open_issues_count":1,"forks_count":4,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-16T10:12:34.106Z","etag":null,"topics":["bioconductor","experiment-hub","microbiome","microbiome-datasets"],"latest_commit_sha":null,"homepage":"https://bioconductor.org/packages/3.13/microbiomeDataSets/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/microbiome.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-03T10:49:11.000Z","updated_at":"2025-03-11T15:14:17.000Z","dependencies_parsed_at":"2024-06-26T14:42:06.596Z","dependency_job_id":null,"html_url":"https://github.com/microbiome/microbiomeDataSets","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/microbiome/microbiomeDataSets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microbiome%2FmicrobiomeDataSets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microbiome%2FmicrobiomeDataSets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microbiome%2FmicrobiomeDataSets/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/microbiome%2FmicrobiomeDataSets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/microbiome","download_url":"https://codeload.github.com/microbiome/microbiomeDataSets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microbiome%2FmicrobiomeDataSets/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263567991,"owners_count":23481595,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioconductor","experiment-hub","microbiome","microbiome-datasets"],"created_at":"2025-05-16T10:12:34.520Z","updated_at":"2025-07-04T15:34:14.259Z","avatar_url":"https://github.com/microbiome.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `microbiomeDataSets`\n\n\u003c!-- badges: start --\u003e\n\n\u003c!-- badges: end --\u003e\n\nThis R package is a collection of microbiome datasets originally published elsewhere. The data is available as [`TreeSummarizedExperiment`](https://doi.org/doi:10.18129/B9.bioc.TreeSummarizedExperiment) or [`MultiAssayExperiment`](https://doi.org/doi:10.18129/B9.bioc.MultiAssayExperiment) objects, and a list of available datasets can be retrieved via the `availableDataSets()` function.\n\nThe microbiomeDataSets package focuses mainly on non-human studies. The independent [curatedMetagenomicData](https://waldronlab.io/curatedMetagenomicData/index.html) package provides access to a large collection of standardized human microbiome studies in the same format. 
\n\nThe aim is to provide datasets for teaching, example workflows or comparative efforts. If you have a dataset that you would like to see in this package, please let us know and/or provide a PR for the dataset.\n\n\n# Contribution\n\nFeel free to contribute. Have a look at how existing datasets are\norganized and prepare your data accordingly. It is also good to get in\ntouch at the earliest convenience to discuss any issues.\n\n## Technical aspects\n\nLet's use a gitflow approach. Development should be\ndone against the `master` branch and then merged to `master` for the\nnext release.  (https://guides.github.com/introduction/flow/)\n\nResources on how data is added to and accessed from Bioconductor's ExperimentHub backend are available in the Bioconductor [ExperimentHub documentation](https://bioconductor.org/packages/release/bioc/vignettes/ExperimentHub/inst/doc/ExperimentHub.html) and in [Creating ExperimentHub Package](https://bioconductor.org/packages/release/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html).\n\nBasic steps:\n\n- Assemble a (Tree)SummarizedExperiment from the raw data \n\n- You can include the data creation script in inst/scripts/*-data-* (optional)\n\n- Save the individual data container components as rds files \n\n- Prepare the metadata file by creating a new\n  metadata-\u003cdataset-name\u003e.R in inst/scripts and running the script to\n  create inst/extdata/\u003cbioc.version.number\u003e/metadata-\u003cdataset-name\u003e.csv\n\n- Make sure that the metadata file passes the check by running a script like:\n  ExperimentHubData::makeExperimentHubMetadata(\"../microbiomeDataSets\",\"3.13/metadata-hintikka-xo.csv\")\n\n- The maintainer will upload the data through their AWS login. 
The folder structure must match the one\n  referenced in the metadata file; for example:\n  microbiomeDataSets/\u003cbioc.version.number\u003e/lahti-ml/coldata.rds \n\n- Follow the [instructions](https://bioconductor.org/packages/release/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html) (see Section 7)\n\n- Afterwards, the maintainer will push the new metadata to the\n  Bioconductor git repo and inform hubs@bioconductor that there is new\n  metadata. They will let us know when the upload is done.\n\n- In the meantime, prepare a loading function (see, e.g.,\n  microbiomeDataSets::LahtiMLData) and push it to the\n  Bioconductor git repo as well.\n\n- Bump the version (note that the version scheme is different)\n\n- For questions, have a look at the other datasets or check with us through [online\n  channels](microbiome.github.io)\n  \n# Code of conduct\n\nPlease note that the microbiomeDataSets project is released with a \n[Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html).\nBy contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrobiome%2Fmicrobiomedatasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicrobiome%2Fmicrobiomedatasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrobiome%2Fmicrobiomedatasets/lists"}