{"id":48402078,"url":"https://github.com/vineetver/favor-cli","last_synced_at":"2026-04-29T03:01:39.308Z","repository":{"id":349636806,"uuid":"1202350895","full_name":"vineetver/favor-cli","owner":"vineetver","description":"From raw variants to biological mechanisms in one tool","archived":false,"fork":false,"pushed_at":"2026-04-27T21:03:33.000Z","size":1222,"stargazers_count":0,"open_issues_count":38,"forks_count":1,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-27T23:08:08.033Z","etag":null,"topics":["annotation","association-testing","bioinformatics","genomics","rare-variant","rust","staar","whole-genome-sequencing"],"latest_commit_sha":null,"homepage":"https://github.com/vineetver/favor-cli","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vineetver.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-05T23:34:21.000Z","updated_at":"2026-04-27T21:03:38.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/vineetver/favor-cli","commit_stats":null,"previous_names":["vineetver/favor-cli","vineetver/cohort-cli"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/vineetver/favor-cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vineetver%2Ffavor-cli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vineetver%2Ffavor-cli/tags","releases_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vineetver%2Ffavor-cli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vineetver%2Ffavor-cli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vineetver","download_url":"https://codeload.github.com/vineetver/favor-cli/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vineetver%2Ffavor-cli/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32408446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T02:37:21.628Z","status":"ssl_error","status_checked_at":"2026-04-29T02:36:50.947Z","response_time":110,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation","association-testing","bioinformatics","genomics","rare-variant","rust","staar","whole-genome-sequencing"],"created_at":"2026-04-06T02:14:54.763Z","updated_at":"2026-04-29T03:01:39.301Z","avatar_url":"https://github.com/vineetver.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ch1 align=\"center\"\u003eFAVOR CLI\u003c/h1\u003e\n  \u003cp align=\"center\"\u003e\n    Raw variants in. Rare-variant results out.\n    \u003cbr /\u003e\n    \u003cstrong\u003eAnnotate. Enrich. Analyze. 
Interpret.\u003c/strong\u003e\n    \u003cbr /\u003e\n    \u003cbr /\u003e\n    \u003ca href=\"#install\"\u003eInstall\u003c/a\u003e \u0026middot; \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e \u0026middot; \u003ca href=\"#commands\"\u003eCommands\u003c/a\u003e \u0026middot; \u003ca href=\"#roadmap\"\u003eRoadmap\u003c/a\u003e \u0026middot; \u003ca href=\"#citation\"\u003eCitation\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/vineetver/favor-cli/actions/workflows/ci.yml?query=branch%3Amaster\"\u003e\u003cimg src=\"https://github.com/vineetver/favor-cli/actions/workflows/ci.yml/badge.svg?branch=master\u0026event=push\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/vineetver/favor-cli/releases/latest\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/vineetver/favor-cli?color=blue\" alt=\"Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-GPL--3.0-blue\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/rust-stable-orange\" alt=\"Rust\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/platform-linux%20%7C%20macos-lightgrey\" alt=\"Platform\"\u003e\n\u003c/p\u003e\n\n---\n\n\u003e **Pre-1.0.** Commands and interfaces may change between releases.\n\n## Install\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/vineetver/favor-cli/master/install.sh | sh\n```\n\n## Quick Start\n\n```bash\n# 1. configure: point at a data directory + choose annotation tier\nfavor setup --root /data/favor --tier base\n\n# 2. pull annotation data (~200 GB base, ~508 GB full)\nfavor data pull\n\n# 3. ingest and annotate variants\nfavor ingest variants.vcf.gz\nfavor annotate variants.ingested\n\n# 4. 
run STAAR rare-variant association\nfavor staar --genotypes cohort.vcf.gz --phenotype pheno.tsv \\\n  --trait-name LDL --covariates age,sex,PC1,PC2 \\\n  --annotations variants.annotated\n```\n\n## Commands\n\n| Command | What it does |\n|---------|-------------|\n| `favor setup` | Configure data root, annotation tier, environment |\n| `favor data pull` | Download annotation parquets and optional packs |\n| `favor ingest` | Normalize VCF/TSV/CSV into canonical parquet variant sets |\n| `favor annotate` | Join variants against FAVOR base or full annotations |\n| `favor enrich` | Overlay tissue-specific eQTL, regulatory, enhancer-gene data |\n| `favor staar` | STAAR rare-variant association testing |\n| `favor meta-staar` | Cross-study meta-analysis from summary statistics |\n| `favor schema` | Inspect annotation table columns and types |\n| `favor manifest` | Show installed data and available commands |\n\nUse `--format json` for machine-readable output. Use `--dry-run` before heavy computation.\n\n## Data layout\n\nFAVOR CLI uses two separate storage areas:\n\n**Data root** (`--root` during setup) holds annotation parquets shared across projects:\n\n```\n/data/favor/\n  base/chromosome=*/sorted.parquet      # base tier (~200 GB)\n  full/chromosome=*/sorted.parquet      # full tier (~508 GB)\n  tissue/                               # optional enrichment packs\n    reference/                          #   gene index, cCRE regions (40 MB, always installed)\n    rollups/                            #   gene-level summaries (49 MB, always installed)\n    variant_in_region/                  #   variant-region junction (155 GB, always installed)\n    variant_eqtl/                       #   GTEx eQTL (3 GB, optional)\n    region_ccre_tissue_signals/         #   ENCODE regulatory (18 GB, optional)\n    ...\n```\n\n**Project store** (`.cohort/` in your working directory) holds per-project data:\n\n```\nmy_study/\n  .cohort/\n    cohorts/\u003cid\u003e/                       # 
built by favor ingest or favor staar\n      manifest.json\n      samples.txt\n      chromosome=*/\n        sparse_g.bin                    # sparse genotype matrix (mmap'd)\n        variants.parquet                # variant metadata + STAAR weights\n        membership.parquet              # gene-variant assignments\n    cache/score_cache/                  # reused across mask/MAF reruns\n    annotations/refs.toml               # attached annotation databases\n```\n\nThe store root is resolved as: `--store-path` flag \u003e `FAVOR_STORE` env \u003e walk up for `.cohort/` \u003e `\u003ccwd\u003e/.cohort/`.\n\nSee [Setup guide](docs/setup.md) for detailed configuration, pack selection, HPC tips, and working directory organization.\n\n## Resource requirements\n\nTested on UK Biobank (UKB) exome chr22 (~200K samples, ~400K variants, ~17K of them rare) with 64 GB of RAM. Whole-genome runs have not yet been tested.\n\n```text\nsamples    RAM       notes\n───────    ──────    ─────────────────────────────\n 10K       32 GB     comfortable\n200K       64 GB     tested (UKB exome chr22)\n```\n\nMemory, threads, and temp directory are auto-detected from SLURM and cgroup. 
Override with:\n\n```text\nSLURM_MEM_PER_NODE     memory pool\nFAVOR_KINSHIP_MEM_GB   kinship budget (default 16 GB)\nTMPDIR                 scratch space\n```\n\n## Docs\n\n- **[Setup guide](docs/setup.md)** - installation, configuration, data management, HPC best practices\n- [Ingest](docs/ingest.md) - VCF ingest patterns, preflight, throughput\n- [Genotype store](docs/storage.md) - sparse genotype store for rare-variant analysis\n- [STAAR](docs/staar.md) - null model, score test, masks, outputs, meta-analysis\n- [Validation](docs/validation.md) - statistical accuracy vs R reference\n- [Statistical divergences](docs/statistical-divergences.md) - known differences from R STAAR/SKAT and why\n- [Performance](docs/performance.md) - benchmarks and optimization roadmap\n- [Agent reference](AGENTS.md) - machine interface for LLM agents\n\n## Roadmap\n\n| Milestone | Focus |\n|-----------|-------|\n| [v0.2.0 - STAAR hardening](https://github.com/vineetver/favor-cli/milestone/1) | GRM, score validation, multi-VCF input, performance profiling |\n| [v0.3.0 - MetaSTAAR](https://github.com/vineetver/favor-cli/milestone/2) | cross-biobank meta-analysis, allele flip, conditional, effect sizes |\n| [v0.4.0 - Interpret](https://github.com/vineetver/favor-cli/milestone/3) | variant interpretation, fine-mapping, colocalization, V2G, tiers |\n| [v0.5.0 - memory and thread pool overhaul](https://github.com/vineetver/favor-cli/milestone/5) | one compute handle, bounded scratch, machine-visible resource control |\n| [v0.6.0 - storage and query engine](https://github.com/vineetver/favor-cli/milestone/6) | store format, query paths, incremental ingest, cloud I/O, agent-friendly queries |\n| [v1.0.0 - Production](https://github.com/vineetver/favor-cli/milestone/4) | orchestration, provenance, QC, full test suite |\n\n## Citation\n\nFAVOR CLI implements the [STAAR](https://github.com/xihaoli/STAARpipeline) framework and the [FAVOR](https://favor.genohub.org) annotation database. 
If you use this tool, please cite:\n\n\u003e Li Z\\*, Li X\\*, Zhou H, et al. **A framework for detecting noncoding rare variant associations of large-scale whole-genome sequencing studies.** *Nature Methods*, 19(12), 1599-1611 (2022). [DOI: 10.1038/s41592-022-01640-x](https://doi.org/10.1038/s41592-022-01640-x)\n\n\u003e Li X\\*, Li Z\\*, Zhou H, et al. **Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale.** *Nature Genetics*, 52(9), 969-983 (2020). [DOI: 10.1038/s41588-020-0676-4](https://doi.org/10.1038/s41588-020-0676-4)\n\n\u003e Zhou H, Verma V, Li X, et al. **FAVOR 2.0: A reengineered functional annotation of variants online resource for interpreting genomic variation.** *Nucleic Acids Research*, 54(D1), D1405-D1414 (2026). [DOI: 10.1093/nar/gkaf1217](https://doi.org/10.1093/nar/gkaf1217)\n\n\u003e Zhou H, Arapoglou T, Li X, et al. **FAVOR: functional annotation of variants online resource and annotator for variation across the human genome.** *Nucleic Acids Research*, 51(D1), D1300-D1311 (2023). [DOI: 10.1093/nar/gkac966](https://doi.org/10.1093/nar/gkac966)\n\n\u003e Li TC, Zhou H, Verma V, et al. **FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations.** *Bioinformatics Advances*, 4(1), vbae143 (2024). [DOI: 10.1093/bioadv/vbae143](https://doi.org/10.1093/bioadv/vbae143)\n\n## License\n\nGPL-3.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvineetver%2Ffavor-cli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvineetver%2Ffavor-cli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvineetver%2Ffavor-cli/lists"}