{"id":13709241,"url":"https://github.com/georg-wolflein/good-features","last_synced_at":"2026-01-16T10:42:55.206Z","repository":{"id":203766513,"uuid":"688063188","full_name":"georg-wolflein/good-features","owner":"georg-wolflein","description":"Official code for the paper \"A Good Feature Extractor Is All You Need for Weakly Supervised Pathology Slide Classification\"","archived":false,"fork":false,"pushed_at":"2024-08-17T21:32:11.000Z","size":172655,"stargazers_count":18,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-01-01T08:21:18.072Z","etag":null,"topics":["cancer","computational-pathology","computer-vision","deep-learning","machine-learning","self-supervised-learning"],"latest_commit_sha":null,"homepage":"https://georg.woelflein.eu/good-features","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/georg-wolflein.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-06T15:14:35.000Z","updated_at":"2025-03-24T14:20:51.000Z","dependencies_parsed_at":"2024-11-13T19:43:31.495Z","dependency_job_id":null,"html_url":"https://github.com/georg-wolflein/good-features","commit_stats":null,"previous_names":["georg-wolflein/histaug","georg-wolflein/good-features"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/georg-wolflein/good-features","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georg-wolflein%2Fgood-features","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georg-wolflein%2Fgood-features/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georg-wolflein%2Fgood-features/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georg-wolflein%2Fgood-features/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/georg-wolflein","download_url":"https://codeload.github.com/georg-wolflein/good-features/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/georg-wolflein%2Fgood-features/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478050,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T06:30:42.265Z","status":"ssl_error","status_checked_at":"2026-01-16T06:30:16.248Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cancer","computational-pathology","computer-vision","deep-learning","machine-learning","self-supervised-learning"],"created_at":"2024-08-02T23:00:37.068Z","updated_at":"2026-01-16T10:42:55.198Z","avatar_url":"https://github.com/georg-wolflein.png","language":"Jupyter Notebook","funding_links":[],"categories":["Publications"],"sub_categories":["Papers"],"readme":"\u003cdiv align=\"center\"\u003e\n\u003ch1\u003eA Good Feature Extractor Is All You Need\u003c/h1\u003e\n\u003c/div\u003e\n\nThis repository contains the official code for the paper:\n\n\u003e [**A Good Feature Extractor Is All You Need for Weakly Supervised Pathology Slide Classification**](https://arxiv.org/abs/2311.11772)  \n\u003e Georg Wölflein, Dyke Ferber, Asier Rabasco Meneghetti, Omar S. M. El Nahhas, Daniel Truhn, Zunamys I. Carrero, David J. Harrison, Ognjen Arandjelović and Jakob N. Kather  \n\u003e _arXiv_, Nov 2023.\n\n\u003cdetails\u003e\n\u003csummary\u003eRead full abstract.\u003c/summary\u003e\nStain normalisation is thought to be a crucial preprocessing step in computational pathology pipelines. We question this belief in the context of weakly supervised whole slide image classification, motivated by the emergence of powerful feature extractors trained using self-supervised learning on diverse pathology datasets. To this end, we performed the most comprehensive evaluation of publicly available pathology feature extractors to date, involving more than 8,000 training runs across nine tasks, five datasets, three downstream architectures, and various preprocessing setups. Notably, we find that omitting stain normalisation and image augmentations does not compromise downstream slide-level classification performance, while incurring substantial savings in memory and compute. Using a new evaluation metric that facilitates relative downstream performance comparison, we identify the best publicly available extractors, and show that their latent spaces are remarkably robust to variations in stain and augmentations like rotation. Contrary to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level biomarker prediction tasks in a weakly supervised setting with external validation cohorts. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors.\n\u003c/details\u003e\n\n## Main results\n\n\u003cdiv align=\"center\"\u003e\u003cimg src=\"https://github.com/georg-wolflein/good-features/raw/master/assets/performance_comparison.png\" width=\"500\"\u003e\u003c/img\u003e\u003c/div\u003e\n\n- We compare 14 feature extractors, and find that [UNI](https://www.nature.com/articles/s41591-024-02857-3), [CTransPath](https://github.com/Xiyue-Wang/TransPath) and [Lunit's DINO](https://github.com/lunit-io/benchmark-ssl-pathology) produce the best representations for downstream weakly supervised slide classification tasks.\n- We show that stain normalisation and image augmentations can be omitted without compromising downstream performance.\n\n\u003e [!NOTE]\n\u003e _June 2024:_ We released an extended version of our [preprint](https://arxiv.org/abs/2311.11772v5) that includes two additional feature extractors ([UNI](https://www.nature.com/articles/s41591-024-02857-3) and ViT-L), alongside extensive additional experiments at $20\\times$ magnification (to complement the original set of experiments at $\\approx 9\\times$ magnification).\n\n\u003e [!NOTE]\n\u003e _March 2024:_ We updated our [preprint](https://arxiv.org/abs/2311.11772v4) to include two additional feature extractors: Phikon-Teacher and Lunit-MoCo.\n\n## Overview\n\n![](assets/overview.png)\n\n## Citing\n\nIf you find this useful, please cite:\n\n```bibtex\n@misc{wolflein2023good,\n    title   = {A Good Feature Extractor Is All You Need for Weakly Supervised Pathology Slide Classification},\n    author  = {W\\\"{o}lflein, Georg and Ferber, Dyke and Meneghetti, Asier Rabasco and El Nahhas, Omar S. M. and Truhn, Daniel and Carrero, Zunamys I. and Harrison, David J. and Arandjelovi\\'{c}, Ognjen and Kather, Jakob N.},\n    journal = {arXiv:2311.11772},\n    year    = {2023},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeorg-wolflein%2Fgood-features","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeorg-wolflein%2Fgood-features","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeorg-wolflein%2Fgood-features/lists"}