{"id":16320087,"url":"https://github.com/helenalc/type-state","last_synced_at":"2025-05-14T07:12:31.179Z","repository":{"id":175831193,"uuid":"627809556","full_name":"HelenaLC/type-state","owner":"HelenaLC","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-20T10:19:30.000Z","size":312104,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-17T01:29:44.683Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HelenaLC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-04-14T08:47:23.000Z","updated_at":"2024-12-04T17:17:21.000Z","dependencies_parsed_at":"2024-03-23T10:29:44.449Z","dependency_job_id":null,"html_url":"https://github.com/HelenaLC/type-state","commit_stats":null,"previous_names":["helenalc/type-state"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HelenaLC%2Ftype-state","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HelenaLC%2Ftype-state/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HelenaLC%2Ftype-state/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HelenaLC%2Ftype-state/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HelenaLC","download_url":"https://codeload.github.com/HelenaLC/type-state/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind
":"github","repositories_count":254092836,"owners_count":22013294,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-10T22:29:03.664Z","updated_at":"2025-05-14T07:12:26.163Z","avatar_url":"https://github.com/HelenaLC.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"### setup\n\n- workflow was implemented and last executed successfully with\u003cbr\u003e\n  **R v4.4.1 with Bioc 3.20, and Python v3.11.3 with Snakemake v7.26.0**\n- R version and library have to be specified in the `config.yaml` file  \n  (e.g., `R: \"R_LIBS_USER=/path/to/library /path/to/R/executable\"`)\n- `.Rprofile` is used for handling and printing command line arguments\n- `logs/` capture `.Rout` files from `R CMD BATCH` executions\n- `data/` contains any synthetic and real data\n- intermediate results are generated in `outs/` \n- visualizations are generated in `plts/`\n\n### workflow\n\n- `\u003cx\u003e` denotes a wildcard, namely: `t`ype, `s`tate, `b`atch,  \n  `sim`ulation, `sco`re, `sel`ection, `sta`tistic,   \n  `das` = differential state analysis method\n\n- `00-get_sim/dat.R`\n  - **out:** for simulations, `data/sim/00-raw/t\u003ct\u003e,s\u003cs\u003e,b\u003cb\u003e.rds`,\u003cbr\u003e\n    for real data, `data/dat/00-raw/\u003cdid\u003e.rds` (`\u003cdid\u003e` = dataset identifier)\n  - synthetic data generation (`splatter::splatPopSimulate()`)\n  - hereafter, `t\u003ct\u003e,s\u003cs\u003e,b\u003cb\u003e` = `\u003csim\u003e`\n\n- `01-pro_sim/dat.R`\n  - **in:** `data/sim|dat/00-raw/\u003csim|dat\u003e.rds`\u003cbr\u003e\n    **out:** 
`data/sim|dat/01-fil/\u003csim|dat\u003e.rds`\n  - minimal filtering keeping genes with count \u003e 1  \n    in ≥ 10 cells, and cells with ≥ 10 detected genes\n  - log-library size normalization (`scater::logNormCounts()`)\n  - highly variable gene (HVG) selection (`scran::modelGeneVar()`)\n  - principal component analysis (PCA) using HVGs (`scater::runPCA()`)\n\n- `02-sco.R`\n  - **in:** `data/sim|dat/01-fil/\u003csim|dat\u003e.rds`\u003cbr\u003e\n    **out:** `outs/sim|dat/sco-\u003csim|dat\u003e,\u003csco\u003e.rds`\n  - source method from one of `02-sco-\u003csco\u003e.R`\n  - compute gene-level metrics to quantify type-/state-specificity \n\n- `03-sel.R`\n  - **in:** `outs/sim|dat/sco-\u003csim|dat\u003e,\u003csco\u003e.rds`\u003cbr\u003e\n    **out:** `outs/sim|dat/sel-\u003csim|dat\u003e,\u003csco\u003e.rds`\n  - source method from one of `03-sel-\u003csel\u003e.R`\n  - select genes for reprocessing\n\n- `04-rep.R`\n  - **in:** `outs/sim|dat/sco-\u003csim|dat\u003e,\u003csco\u003e.rds`\u003cbr\u003e\n    **out:** `data/sim|dat/02-rep/\u003csim|dat\u003e,\u003csel\u003e.rds`\n  - data reprocessing (PCA, clustering, reduction)\n  \n- `05-sta.R`\n  - **in:** `data/sim|dat/02-rep/\u003csim|dat\u003e,\u003csel\u003e.rds`\u003cbr\u003e\n    **out:** `outs/sim|dat/sta-\u003csim|dat\u003e,\u003csel\u003e,\u003csta\u003e.rds`\n  - source method from one of `05-sta-\u003csta\u003e.R`\n  - compute evaluation statistics\n\n- `06-das.R`\n  - **in:** `data/sim|dat/02-rep/\u003csim|dat\u003e,\u003csel\u003e.rds`\u003cbr\u003e\n    **out:** `outs/sim|dat/das-\u003csim|dat\u003e,\u003csel\u003e,\u003cdas\u003e.rds`\n  - source method from one of `06-das-\u003cdas\u003e.R`\n  - perform differential state analysis (DSA)\n\n- `07-eva.R`\n  - standalone script applied to experimental data only\n  - collects results across all feature selection strategies,\u003cbr\u003e\n    selects the top-ranked [10, 20, ..., 90\\%] of features, and recomputes\u003cbr\u003e\n    evaluation 
statistics for accordingly reprocessed data (PCA, clustering)\n\n- `08-plt_\u003cout\u003e-\u003cplt\u003e.R`\n  - **in:** `outs/sim/\u003cout\u003e.rds`\u003cbr\u003e\n    **out:** `plt/sim/\u003cout\u003e-\u003cplt\u003e.pdf`\n  - e.g., `08-plt_das-F1.R` collects all DSA results\u003cbr\u003e\n    (`outs/sim/das-\u003csim\u003e,\u003csel\u003e,\u003cdas\u003e.rds`) and plots F1 scores\n  - visualization of synthetic data analysis results\n\n- `08-qlt_\u003cout\u003e-\u003cqlt\u003e.R`\n  - **in:** `outs/dat/\u003cout\u003e.rds`\u003cbr\u003e\n    **out:** `plt/dat/\u003cout\u003e-\u003cqlt\u003e.pdf`\n  - visualization of experimental data analysis results\n  \n- `09-aes.R`\n  - sourced to fix the order of feature scores (`SCO`),\u003cbr\u003e\n    ground truth-based (`DES`) and other selections (`SEL`),\u003cbr\u003e\n    and differential state analysis methods (`DAS`) across plots\n\n- `10-session_info.R`\n  - lists and may be used to install all R packages used\u003cbr\u003e\n    (across CRAN, GitHub, and Bioconductor), and writes the\u003cbr\u003e\n    corresponding `sessionInfo()` output to `session_info.txt`","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelenalc%2Ftype-state","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhelenalc%2Ftype-state","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelenalc%2Ftype-state/lists"}