{"id":49178608,"url":"https://github.com/BoevaLab/CancerFoundation","last_synced_at":"2026-05-09T07:01:25.252Z","repository":{"id":261053725,"uuid":"871630009","full_name":"BoevaLab/CancerFoundation","owner":"BoevaLab","description":"CancerFoundation: A single-cell RNA sequencing foundation model to decipher drug resistance in cancer","archived":false,"fork":false,"pushed_at":"2025-09-12T11:51:44.000Z","size":4049,"stargazers_count":30,"open_issues_count":1,"forks_count":6,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-23T23:32:09.544Z","etag":null,"topics":["cancer","foundation-model","single-cell"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BoevaLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-10-12T13:53:50.000Z","updated_at":"2026-04-22T09:52:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"efa28e77-30ee-4644-aadb-207842e02348","html_url":"https://github.com/BoevaLab/CancerFoundation","commit_stats":null,"previous_names":["boevalab/cancerfoundation"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/BoevaLab/CancerFoundation","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BoevaLab%2FCancerFoundation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BoevaLab%2FCancerFoundation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/BoevaLab%2FCancerFoundation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BoevaLab%2FCancerFoundation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BoevaLab","download_url":"https://codeload.github.com/BoevaLab/CancerFoundation/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BoevaLab%2FCancerFoundation/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32810381,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"online","status_checked_at":"2026-05-09T02:00:06.633Z","response_time":123,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cancer","foundation-model","single-cell"],"created_at":"2026-04-23T00:00:40.604Z","updated_at":"2026-05-09T07:01:25.247Z","avatar_url":"https://github.com/BoevaLab.png","language":"Jupyter Notebook","funding_links":[],"categories":["Machine Learning Tasks and Models"],"sub_categories":["Foundation Models"],"readme":"# CancerFoundation: A single-cell RNA sequencing foundation model to decipher drug resistance in cancer\n\n[![Preprint](https://img.shields.io/badge/preprint-available-brightgreen)](https://www.biorxiv.org/content/10.1101/2024.11.01.621087v1) 
\u0026nbsp;\n[![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/BoevaLab/CancerFoundation/blob/main/LICENSE)\n\nWe present **CancerFoundation**, a novel single-cell RNA-seq foundation model (scFM) trained exclusively on malignant cells. Despite being trained on only one million total cells, a fraction of the data used by existing models, CancerFoundation outperforms other scFMs in key tasks such as zero-shot batch integration and drug response prediction. During training, we employ tissue and technology-aware oversampling and domain-invariant training to enhance performance on underrepresented cancer types and sequencing technologies. We propose survival prediction as a new downstream task to evaluate the generalizability of single-cell foundation models to bulk RNA data and their applicability to patient stratification. CancerFoundation demonstrates superior batch integration performance and shows significant improvements in predicting drug responses for both unseen cell lines and drugs. These results highlight the potential of focused, smaller foundation models in advancing drug discovery and our understanding of cancer biology.\n\n## Installation\n\n### Prerequisites\n\nMake sure you have [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) installed on your machine.\n\n### Step-by-Step Guide\n\n1. **Clone the repository**:\n\n   ```bash\n   git clone https://github.com/BoevaLab/CancerFoundation.git\n   cd CancerFoundation\n   ```\n2. **Create the Conda environment**:\n   ```bash\n   conda env create -f environment.yml\n   ```\n3. **Activate environment**:\n   ```bash\n   conda activate cancerfoundation\n   ```\n4. 
**Download the pretrained model weights**:\n\n   Please download the pretrained model from [this link](https://polybox.ethz.ch/index.php/s/pZR9VH7uEHwO5CL), unzip it, and place it in the ```model/assets``` directory.\n\n## Generate embeddings\nSee ```tutorial/embeddings_tutorial.ipynb``` for a tutorial on generating CancerFoundation embeddings for your scRNA-seq data.\n\n## Drug response prediction\nRefer to ```drug_response_prediction/README.md``` for instructions on performing drug response prediction.\n\n## Zero-shot batch integration\nRefer to ```zero_shot_batch_integration/README.md``` for instructions on performing and evaluating zero-shot batch integration.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBoevaLab%2FCancerFoundation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBoevaLab%2FCancerFoundation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBoevaLab%2FCancerFoundation/lists"}