{"id":20938158,"url":"https://github.com/gibsramen/xebec","last_synced_at":"2025-05-13T22:31:38.541Z","repository":{"id":38215673,"uuid":"472926918","full_name":"gibsramen/xebec","owner":"gibsramen","description":"Snakemake pipeline for microbiome diversity effect size benchmarking","archived":false,"fork":false,"pushed_at":"2023-05-16T21:33:16.000Z","size":3799,"stargazers_count":5,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-17T14:13:50.586Z","etag":null,"topics":["bioinformatics","microbiome","pipeline","snakemake"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gibsramen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-03-22T20:27:18.000Z","updated_at":"2023-05-15T00:33:46.000Z","dependencies_parsed_at":"2023-12-15T20:18:26.701Z","dependency_job_id":null,"html_url":"https://github.com/gibsramen/xebec","commit_stats":{"total_commits":119,"total_committers":2,"mean_commits":59.5,"dds":0.07563025210084029,"last_synced_commit":"18ea29cc763f24b5ceabaa1473406beaa6cdd981"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gibsramen%2Fxebec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gibsramen%2Fxebec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gibsramen%2Fxebec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gibsramen%2Fxebec/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gibsramen","download_url":"https://codeload.github.com/gibsramen/xebec/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254036813,"owners_count":22003655,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","microbiome","pipeline","snakemake"],"created_at":"2024-11-18T22:49:32.539Z","updated_at":"2025-05-13T22:31:34.692Z","avatar_url":"https://github.com/gibsramen.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Main CI](https://github.com/gibsramen/xebec/actions/workflows/main_ci.yml/badge.svg)](https://github.com/gibsramen/xebec/actions/workflows/main_ci.yml)\n[![PyPI](https://img.shields.io/pypi/v/xebec.svg)](https://pypi.org/project/xebec)\n\n# xebec\n\nSnakemake pipeline for microbiome diversity effect size benchmarking\n\n**NOTE**: xebec is still under active development.\n\n## Installation\n\nTo use xebec, you will need several dependencies.\nWe recommend using [`mamba`](https://github.com/mamba-org/mamba) to install these packages when possible.\n\n```\nmamba install -c conda-forge -c bioconda biom-format h5py==3.1.0 scipy==1.8 numpy==1.23 snakemake pandas unifrac scikit-bio bokeh==3.1.0 unifrac-binaries jinja2\npip install \"evident\u003e=0.4.0\" \"gemelli\u003e=0.0.8\"\n```\n\nTo install xebec, run the following command from the command line:\n\n```\npip install xebec\n```\n\n## Usage\n\nIf you run `xebec --help`, you should see the following:\n\n```\n$ xebec --help\nUsage: 
xebec [OPTIONS]\n\nOptions:\n  --version                       Show the version and exit.\n  -ft, --feature-table PATH       Feature table in BIOM format.  [required]\n  -m, --metadata PATH             Sample metadata in TSV format.  [required]\n  -t, --tree PATH                 Phylogenetic tree in Newick format.\n                                  [required]\n  -o, --output PATH               Output workflow directory.  [required]\n  --max-category-levels INTEGER   Max number of levels in a category.\n                                  [default: 5]\n  --min-level-count INTEGER       Min number of samples per level per\n                                  category.  [default: 3]\n  --rarefy-percentile FLOAT       Percentile of sample depths at which to\n                                  rarefy.  [default: 10]\n  --n-pcoa-components INTEGER     Number of PCoA components to compuate.\n                                  [default: 3]\n  --validate-input / --no-validate-input\n                                  Whether to validate input before creating\n                                  workflow.  
[default: validate-input]\n  --help                          Show this message and exit.\n```\n\nTo create the workflow structure, pass in the filepaths for the feature table, sample metadata, and phylogenetic tree.\nYou must also pass in a path to a directory in which to create the workflow.\nAdditionally, you can provide parameters for determining how to process your sample metadata.\n\nAfter running this command, navigate inside the output directory you created.\nThere should be two subdirectories: `workflow/` and `config/`.\n\nTo start the pipeline, run the following command:\n\n```\nsnakemake --cores 1\n```\n\nYou should see the Snakemake pipeline start running the jobs.\nIf the pipeline runs successfully, the processed results will be located in `results/`.\nIncluded in the results are the concatenated effect size values as well as interactive plots summarizing the effect sizes for each metadata column for each diversity metric.\nThese plots are generated using [Bokeh](https://github.com/bokeh/bokeh) and can be visualized in any modern web browser.\n\n![Bokeh](https://raw.githubusercontent.com/gibsramen/xebec/main/imgs/bokeh.png)\n\n## Workflow Overview\n\nxebec performs four main steps, some of which have substeps.\n\n1. Process data (filter metadata, rarefaction)\n2. Run diversity analyses\n3. Calculate effect sizes (concatenate together)\n4. 
Generate visualizations\n\nAn overview of the DAG is shown below:\n\n![xebec DAG](https://raw.githubusercontent.com/gibsramen/xebec/main/imgs/dag.png)\n\n## Configuration\n\n### Diversity Metrics\n\nxebec lets you configure which alpha and beta diversity metrics are included in the workflow.\nTo add or remove metrics, modify the `config/alpha_div_metrics.yml` and `config/beta_div_metrics.yml` files.\nFor alpha diversity, any metric that can be passed into `skbio.alpha_diversity` should work.\nFor beta diversity, any non-phylogenetic metric that can be passed into `skbio.beta_diversity` should work.\nValid phylogenetic beta diversity metrics are those that can be passed into [Striped UniFrac](https://github.com/biocore/unifrac).\nMake sure that any additional diversity metrics are annotated with `phylo` or `non_phylo` so xebec knows how to process them.\n\n### Snakemake Options\n\nThe xebec workflow can be decorated with many configuration options available in Snakemake, including resource usage and HPC scheduling.\nWe recommend reading through the [Snakemake documentation](https://snakemake.readthedocs.io/en/stable/index.html) for details on these options.\nNote that some of these options may require creating new configuration files.\n\n## Issues\n\nIf you have any issues with xebec, please open a GitHub issue or pull request.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgibsramen%2Fxebec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgibsramen%2Fxebec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgibsramen%2Fxebec/lists"}