{"id":13708863,"url":"https://github.com/mahmoodlab/UNI","last_synced_at":"2025-05-06T15:31:18.518Z","repository":{"id":228614469,"uuid":"751712341","full_name":"mahmoodlab/UNI","owner":"mahmoodlab","description":"Pathology Foundation Model - Nature Medicine","archived":false,"fork":false,"pushed_at":"2025-03-26T04:45:41.000Z","size":7280,"stargazers_count":451,"open_issues_count":23,"forks_count":65,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-24T04:02:43.558Z","etag":null,"topics":["computational-pathology","digital-pathology","foundation","foundation-model","histopathology","mahmoodlab","mass-100k","nature-medicine","pathology","pathology-dinov2","pathology-fm","pathology-foundation","pathology-foundation-model","pathology-self-supervised","quantitative-pathology","uni","uni-foundation-model"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mahmoodlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-02T06:49:11.000Z","updated_at":"2025-04-23T14:03:00.000Z","dependencies_parsed_at":"2024-07-16T15:25:05.778Z","dependency_job_id":"c29f6412-abd6-4d76-aeea-84724f55a834","html_url":"https://github.com/mahmoodlab/UNI","commit_stats":null,"previous_names":["mahmoodlab/uni"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahmoodlab%2FUNI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahmoodlab%2FUNI/tags","re
leases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahmoodlab%2FUNI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahmoodlab%2FUNI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mahmoodlab","download_url":"https://codeload.github.com/mahmoodlab/UNI/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252712880,"owners_count":21792384,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computational-pathology","digital-pathology","foundation","foundation-model","histopathology","mahmoodlab","mass-100k","nature-medicine","pathology","pathology-dinov2","pathology-fm","pathology-foundation","pathology-foundation-model","pathology-self-supervised","quantitative-pathology","uni","uni-foundation-model"],"created_at":"2024-08-02T23:00:33.558Z","updated_at":"2025-05-06T15:31:18.510Z","avatar_url":"https://github.com/mahmoodlab.png","language":"Jupyter Notebook","funding_links":[],"categories":["Software","Machine Learning Tasks and Models","🔬 Domain-Specific Applications"],"sub_categories":["Foundation Model","Foundation Models","🧬 Biology \u0026 Medicine"],"readme":"# UNI \n\n## Towards a General-Purpose Foundation Model for Computational Pathology\n*Nature Medicine* \u003cimg src=\".github/uni.jpg\" width=\"300px\" align=\"right\" /\u003e\n\n[Journal Link](https://www.nature.com/articles/s41591-024-02857-3) | [Open Access Read Link](https://rdcu.be/dBMgh) | [Download Models](#model-weights) | [Download Pre-extracted 
Embeddings](#pre-extracted-embeddings) | [Cite](#reference) \n\n### Updates\n- 3/20/2025: [One-year overview of UNI \u0026 CONCH](https://www.linkedin.com/posts/faisalmmd_its-been-one-year-since-we-release-uni-and-activity-7308523636250820608-NedR?utm_source=share\u0026utm_medium=member_desktop\u0026rcm=ACoAAAtTgDUBogopLVJVJOF9wEPZNmx4mbyt4OI) written by our team, with an updated table of research applications.\n- 3/6/2025: [Blog Post from Meta AI](https://ai.meta.com/blog/mahmood-lab-human-pathology-dinov2/) on our development of UNI using DINOv2.\n- **01/14/2025: Release of UNI 2, trained on over 200 million pathology H\u0026E and IHC images sampled from 350+ thousand diverse whole slide images. [UNI 2 model weights](https://huggingface.co/MahmoodLab/UNI2-h), benchmark results and [25k+ pre-extracted WSI embeddings from TCGA, CPTAC, and PANDA](https://huggingface.co/datasets/MahmoodLab/UNI2-h-features) are released.**\n- 12/17/2024: [Research Highlight from Nature Medicine](https://www.nature.com/articles/s43018-024-00837-7) on UNI \u0026 CONCH for clinical oncology.\n- 03/19/2024: UNI is published! Model weights and initial benchmark results are released.\n\nUnfamiliar with UNI? 
Please refer to the original README ([here](./README_old.md)) for more details or refer to the accompanying Nature Medicine study ([here](https://www.nature.com/articles/s41591-024-02857-3)).\n\n\n## Model weights\n| Model Name    | Release Date | Model Architecture | Download Link            |\n|---------------------|--------------|---------------------|-------------------------------------------------------------|\n| UNI2-h      |   01/2025        | ViT-h/14-reg8               | [HF Link](https://huggingface.co/MahmoodLab/UNI2-h) |\n| UNI          |   03/2024        | ViT-l/16                 | [HF Link](https://huggingface.co/MahmoodLab/uni)  |\n\n## Research Applications using UNI \u0026 CONCH\n\u003cdetails\u003e\n  \u003csummary\u003e\n    \u003cb\u003eLast Updated 3/20/2025\u003c/b\u003e\n  \u003c/summary\u003e\n\n| Paper Name   | Year | Publication  |\n|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|------|\n| [A self-supervised framework for learning whole slide representations](https://arxiv.org/abs/2402.06188)                                             | 2024 | arXiv:2402.06188                                                   |\n| [Honeybee: a scalable modular framework for creating multimodal oncology datasets with foundational embedding models](https://arxiv.org/abs/2405.07460) | 2024 | arXiv:2405.07460                                                   |\n| [Combining graph neural network and mamba to capture local and global tissue spatial relationships in whole slide images](https://arxiv.org/abs/2406.04377) | 2024 | arXiv:2406.04377                                                   |\n| [STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics](https://arxiv.org/abs/2406.06393)                         | 2024 | arXiv:2406.06393                                                   |\n| 
[Embedding-based multimodal learning on pan-squamous cell carcinomas for improved survival outcomes](https://arxiv.org/abs/2406.08521)               | 2024 | arXiv:2406.08521                                                   |\n| [A clinical benchmark of public self-supervised pathology foundation models](https://arxiv.org/abs/2407.06508v1)                                     | 2024 | arXiv:2407.06508v1                                                |\n| [Path-SAM2: Transfer SAM2 for digital pathology semantic segmentation](https://arxiv.org/abs/2408.03651)                                             | 2024 | arXiv:2408.03651                                                   |\n| [Benchmarking foundation models as feature extractors for weakly-supervised computational pathology](https://arxiv.org/abs/2408.15823)               | 2024 | arXiv:2408.15823                                                   |\n| [Pediatric brain tumor classification using digital histopathology and deep learning: evaluation of SOTA methods on a multi-center Swedish cohort](https://arxiv.org/abs/2409.01330) | 2024 | arXiv:2409.01330                                                   |\n| [Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval](https://arxiv.org/abs/2409.09430) | 2024 | arXiv:2409.09430                                                   |\n| [Evaluating Deep Regression Models for WSI-Based Gene-Expression Prediction](https://arxiv.org/abs/2410.00945)                                       | 2024 | arXiv:2410.00945                                                   |\n| [Deep Learning for Fetal Inflammatory Response Diagnosis in the Umbilical Cord](https://arxiv.org/abs/2411.09767)                                    | 2024 | arXiv:2411.09767                                                   |\n| [Diagnostic Text-guided Representation Learning in Hierarchical Classification for Pathological Whole 
Slide Image](https://arxiv.org/abs/2411.10709) | 2024 | arXiv:2411.10709                                                   |\n| [Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining](https://arxiv.org/abs/2411.11613)                | 2024 | arXiv:2411.11613                                                   |\n| [FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification](https://arxiv.org/abs/2411.14743)             | 2024 | arXiv:2411.14743                                                   |\n| [RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency](https://arxiv.org/abs/2411.15076)           | 2024 | arXiv:2411.15076                                                   |\n| [ST-Align: A Multimodal Foundation Model for Image-Gene Alignment in Spatial Transcriptomics](https://arxiv.org/abs/2411.16793)                     | 2024 | arXiv:2411.16793                                                   |\n| [Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology](https://arxiv.org/abs/2411.17418)        | 2024 | arXiv:2411.17418                                                   |\n| [Multimodal whole slide foundation model for pathology](https://arxiv.org/abs/2411.19666)                                                            | 2024 | arXiv:2411.19666                                                   |\n| [GCUNet: A GNN-Based Contextual Learning Network for Tertiary Lymphoid Structure Semantic Segmentation in Whole Slide Image](https://arxiv.org/abs/2412.06129) | 2024 | arXiv:2412.06129                                                   |\n| [A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction](https://arxiv.org/abs/2412.07136)                 | 2024 | arXiv:2412.07136                                                   |\n| [From Histopathology 
Images to Cell Clouds: Learning Slide Representations with Hierarchical Cell Transformer](https://arxiv.org/abs/2412.16715)     | 2024 | arXiv:2412.16715                                                   |\n| [Vision-language models do not understand negation](https://arxiv.org/abs/2501.09425)                                                                | 2025 | arXiv:2501.09425                                                   |\n| [Prior Knowledge Injection into Deep Learning Models Predicting Gene Expression from Whole Slide Images](https://arxiv.org/abs/2501.14056)          | 2025 | arXiv:2501.14056                                                   |\n| [Molecular-driven Foundation Model for Oncologic Pathology](https://arxiv.org/abs/2501.16652)                                                        | 2025 | arXiv:2501.16652                                                   |\n| [Dynamic Hypergraph Representation for Bone Metastasis Cancer Analysis](https://arxiv.org/abs/2501.16787)                                            | 2025 | arXiv:2501.16787                                                   |\n| [Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions](https://arxiv.org/abs/2502.19293)             | 2025 | arXiv:2502.19293                                                   |\n| [DELST: Dual Entailment Learning for Hyperbolic Image-Gene Pretraining in Spatial Transcriptomics](https://arxiv.org/abs/2503.00804)                 | 2025 | arXiv:2503.00804                                                   |\n| [Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion](https://arxiv.org/abs/2503.00925)                          | 2025 | arXiv:2503.00925                                                   |\n| [CrossFusion: A Multi-Scale Cross-Attention Convolutional Fusion Model for Cancer Survival Prediction](https://arxiv.org/abs/2503.02064)             | 2025 | arXiv:2503.02064        
                                           |\n| [Adaptive Prototype Learning for Multimodal Cancer Survival Analysis](https://arxiv.org/abs/2503.04643)                                              | 2025 | arXiv:2503.04643                                                   |\n| [ecPath detects ecDNA in tumors from histopathology images](https://www.biorxiv.org/content/10.1101/2024.11.13.623494v1.abstract)                    | 2024 | bioRxiv:2024.11.13.623494v1                                        |\n| [Contrastive Learning for Omics-guided Whole-slide Visual Embedding Representation](https://www.biorxiv.org/content/10.1101/2025.01.12.632280.abstract) | 2025 | bioRxiv:2025.01.12.632280                                          |\n| [Multi-modal Disentanglement of Spatial Transcriptomics and Histopathology Imaging](https://www.biorxiv.org/content/10.1101/2025.02.19.638201v1)     | 2025 | bioRxiv:2025.02.19.638201v1                                       |\n| [High-Parameter Spatial Multi-Omics through Histology-Anchored Integration](https://www.biorxiv.org/content/10.1101/2025.02.23.639721v1)             | 2025 | bioRxiv:2025.02.23.639721v1                                       |\n| [Weakly-supervised deep learning models enable HER2-low prediction from H\u0026E stained slides](https://breast-cancer-research.biomedcentral.com/articles/10.1186/s13058-024-01863-0) | 2024 | Breast Cancer Research                                            |\n| [2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image](https://arxiv.org/abs/2412.00678)  | 2025 | Computer Vision \u0026 Pattern Recognition (CVPR)                       |\n| [Transcriptomics-guided slide representation learning in computational pathology](https://openaccess.thecvf.com/content/CVPR2024/html/Jaume_Transcriptomics-guided_Slide_Representation_Learning_in_Computational_Pathology_CVPR_2024_paper.html) | 2024 | Computer Vision \u0026 Pattern Recognition (CVPR)   
                    |\n| [Morphological prototyping for unsupervised slide representation learning in computational pathology](https://openaccess.thecvf.com/content/CVPR2024/html/Song_Morphological_Prototyping_for_Unsupervised_Slide_Representation_Learning_in_Computational_Pathology_CVPR_2024_paper.html) | 2024 | Computer Vision \u0026 Pattern Recognition (CVPR)                       |\n| [Development and validation of novel deep learning-based models for cancer histopathology image](https://openarchive.ki.se/articles/thesis/Development_and_validation_of_novel_deep_learning-_based_models_for_cancer_histopathology_image/27291567) | 2024 | Doctoral dissertation (Karolinska Institutet)                      |\n| [Multistain pretraining for slide representation learning in pathology](https://eccv.ecva.net/virtual/2024/poster/429)                               | 2024 | European Conference on Computer Vision (ECCV)                      |\n| [Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology](https://openreview.net/forum?id=trj2Jq8riA) | 2025 | International Conference on Learning Representations (ICLR)        |\n| [Multimodal prototyping for cancer survival prediction](https://proceedings.mlr.press/v235/song24b.html)                                            | 2024 | International Conference on Machine Learning (ICML)                |\n| [High-resolution spatial transcriptomics from histology images using histosge](https://arxiv.org/abs/2407.20518)                                     | 2024 | International Conference on Bioinformatics and Biomedicine (BIBM)  |\n| [Multi-resolution histopathology patch graphs for ovarian cancer subtyping](https://link.springer.com/chapter/10.1007/978-3-031-83243-7_7)           | 2024 | International Workshop on Graphs in Biomedical Image Analysis      |\n| [Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion 
Models](https://arxiv.org/abs/2501.01932) | 2025 | International Symposium on Biomedical Imaging (ISBI)               |\n| [1250 H\u0026E-based cell prediction multi-classification models to capture morphologically distinct subpopulations of CD8+ T cells](https://jitc.bmj.com/content/12/Suppl_2/A1399) | 2024 | Journal for ImmunoTherapy of Cancer                                |\n| [Liver fibrosis classification on trichrome histology slides using weakly supervised learning in children and young adults](https://www.sciencedirect.com/science/article/pii/S2153353924000555) | 2025 | Journal of Pathology Informatics                                   |\n| [Winners of the 2024 Tuberculosis Detection Competition](https://www.linkedin.com/posts/zsoltbedohazi_winners-of-the-2024-tuberculosis-detection-activity-7186281385572065280-zpOq) | 2024 | LinkedIn post                                                      |\n| [Model-based cleaning of the QUILT-1M pathology dataset for text-conditional image synthesis](https://openreview.net/forum?id=m7wYKrUjzV)             | 2024 | Medical Imaging with Deep Learning                                 |\n| [Generating highly accurate pathology reports from gigapixel whole slide images with HistoGPT](https://www.medrxiv.org/content/10.1101/2024.03.15.24304211v2) | 2024 | medRxiv:2024.03.15.24304211v2                                     |\n| [HIBRID: Histology and ct-DNA based Risk-stratification with Deep Learning](https://www.medrxiv.org/content/10.1101/2024.07.23.24310822.abstract)      | 2024 | medRxiv:2024.07.23.24310822                                       |\n| [\"SurvivMIL: A Multimodal, Multiple Instance Learning Pipeline for Survival Outcome of Neuroblastoma Patients\"](https://proceedings.mlr.press/v254/naidoo24a.html) | 2024 | MICCAI Workshop on Computational Pathology with Multimodal Data (COMPAYL) |\n| [Early Fusion of H\u0026E and IHC Histology Images for Pediatric Brain Tumor 
Classification](https://openreview.net/forum?id=PHtzsqDi0n)                  | 2024 | MICCAI Workshop on Computational Pathology with Multimodal Data (COMPAYL) |\n| [Fluoroformer: Scaling multiple instance learning to multiplexed images via attention-based channel fusion](https://arxiv.org/abs/2411.08975)        | 2024 | ML4H symposium                                                     |\n| [Harnessing transcriptional regulation of alternative end-joining to predict cancer treatment](https://academic.oup.com/narcancer/article/7/1/zcaf007/8063268) | 2025 | NAR Cancer                                                         |\n| [A multimodal generative AI copilot for human pathology](https://www.nature.com/articles/s41586-024-07618-3)                                          | 2024 | Nature                                                             |\n| [Digital profiling of gene expression from histology images with linearized attention](https://www.nature.com/articles/s41467-024-54182-5)           | 2024 | Nature Communications                                             |\n| [Demographic bias in misdiagnosis by computational pathology models](https://www.nature.com/articles/s41591-024-02885-z)                             | 2024 | Nature Medicine                                                    |\n| [Hest-1k: A dataset for spatial transcriptomics and histology image analysis](https://proceedings.neurips.cc/paper_files/paper/2024/hash/60a899cc31f763be0bde781a75e04458-Abstract-Datasets_and_Benchmarks_Track.html) | 2024 | Advances in Neural Information Processing Systems                  |\n| [Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis](https://openreview.net/forum?id=f3oHNyqd83)                   | 2024 | Advances in Neural Information Processing Systems                  |\n| [Leveraging tumor heterogeneity: Heterogeneous graph representation learning for cancer survival prediction in whole slide 
images](https://proceedings.neurips.cc/paper_files/paper/2024/hash/760341adc5632de3f1cf2e8d22215a93-Abstract-Conference.html) | 2024 | Advances in Neural Information Processing Systems                  |\n| [Going Beyond H\u0026E and Oncology: How Do Histopathology Foundation Models Perform for Multi-stain IHC and Immunology?](https://arxiv.org/abs/2410.21560) | 2024 | NeurIPS Workshop on Advancements In Medical Foundation Models      |\n| [Histopathology and proteomics are synergistic for high-grade serous ovarian cancer platinum response prediction](https://www.nature.com/articles/s41698-025-00808-w) | 2025 | npj Precision Oncology                                             |\n| [Deep learning for predicting prognostic consensus molecular subtypes in cervical cancer from histology images](https://www.nature.com/articles/s41698-024-00778-5) | 2025 | npj Precision Oncology                                             |\n| [Integrated multicenter deep learning system for prognostic prediction in bladder cancer](https://www.nature.com/articles/s41698-024-00731-6)        | 2024 | npj Precision Oncology                                             |\n| [Predicting the tumor microenvironment composition and immunotherapy response in non-small cell lung cancer from digital histopathology images](https://www.nature.com/articles/s41698-024-00765-w) | 2024 | npj Precision Oncology                                             |\n| [Artificial intelligence-based morphologic classification and molecular characterization of neuroblastic tumors from digital histopathology](https://www.nature.com/articles/s41698-024-00745-0) | 2024 | npj Precision Oncology                                             |\n| [Deep Learning-Enabled Integration of Histology and Transcriptomics for Tissue Spatial Profile Analysis](https://spj.science.org/doi/10.34133/research.0568) | 2025 | spj Research                                                       |\n| [Validation of histopathology foundation 
models through whole slide image retrieval](https://www.nature.com/articles/s41598-025-88545-9)             | 2025 | Scientific Reports                                                 |\n| [Deep Learning Framework for Classifying Whole-slide Multiplex Immunofluorescence Images to Predict Immunotherapy Response in Melanoma Patients](https://www.techrxiv.org/doi/full/10.36227/techrxiv.173496563.35713571) | 2024 | TechRxiv:10.36227/techrxiv.173496563.35713571                      |\n| [Deep learning-based lymph node metastasis status predicts prognosis from muscle-invasive bladder cancer histopathology](https://link.springer.com/article/10.1007/s00345-025-05440-8) | 2025 | World Journal of Urology                                           |\n\u003c/details\u003e\n\n## Pre-extracted Embeddings\nTo facilitate downstream tasks, we provide pre-extracted embeddings for the UNI 2 model (UNI2-h) for TCGA, CPTAC and PANDA, which can be downloaded [here](https://huggingface.co/datasets/MahmoodLab/UNI2-h-features).\n\n## Benchmarking UNI 2\n\n### ROI Benchmarks\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth\u003eModel name\u003c/th\u003e\n      \u003cth\u003ePretraining\u003c/th\u003e\n      \u003cth\u003eModel size\u003c/th\u003e\n      \u003cth\u003eHEST (Regression, Public)\u003c/th\u003e\n      \u003cth\u003eCRC-100K-Raw (9 classes, Public)\u003c/th\u003e\n      \u003cth\u003eTCGA Uniform Tumor (32 classes, Public)\u003c/th\u003e\n      \u003cth\u003eC17-WILDS (2 classes, Public)\u003c/th\u003e\n      \u003cth\u003eKather MSI (2 classes, Public)\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eUNI\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-l/16\u003c/td\u003e\n      \u003ctd\u003e0.386\u003c/td\u003e\n      \u003ctd\u003e0.925\u003c/td\u003e\n      \u003ctd\u003e0.595\u003c/td\u003e\n      \u003ctd\u003e0.972\u003c/td\u003e\n      
\u003ctd\u003e0.679\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"8\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cstrong\u003eUNI2-h\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.414\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.957\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.675\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.977\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.722\u003c/strong\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eVirchow 2\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e0.398\u003c/td\u003e\n      \u003ctd\u003e0.952\u003c/td\u003e\n      \u003ctd\u003e0.620\u003c/td\u003e\n      \u003ctd\u003e0.975\u003c/td\u003e\n      \u003ctd\u003e0.713\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eVirchow\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e0.398\u003c/td\u003e\n      \u003ctd\u003e0.919\u003c/td\u003e\n      \u003ctd\u003e0.544\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.977\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e0.670\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"8\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cstrong\u003eUNI2-g-preview\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.416\u003c/strong\u003e\u003c/td\u003e\n      
\u003ctd\u003e\u003cstrong\u003e0.949\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.690\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.985\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.725\u003c/strong\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eh-optimus\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e0.415\u003c/td\u003e\n      \u003ctd\u003e0.930\u003c/td\u003e\n      \u003ctd\u003e0.647\u003c/td\u003e\n      \u003ctd\u003e0.970\u003c/td\u003e\n      \u003ctd\u003e0.707\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eProv-GigaPath\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e0.385\u003c/td\u003e\n      \u003ctd\u003e0.929\u003c/td\u003e\n      \u003ctd\u003e0.593\u003c/td\u003e\n      \u003ctd\u003e0.961\u003c/td\u003e\n      \u003ctd\u003e0.693\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"8\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eCONCH\u003c/td\u003e\n      \u003ctd\u003eVision-language\u003c/td\u003e\n      \u003ctd\u003eViT-b/16\u003c/td\u003e\n      \u003ctd\u003e0.371\u003c/td\u003e\n      \u003ctd\u003e0.941\u003c/td\u003e\n      \u003ctd\u003e0.556\u003c/td\u003e\n      \u003ctd\u003e0.967\u003c/td\u003e\n      \u003ctd\u003e0.685\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eMUSK\u003c/td\u003e\n      \u003ctd\u003eVision-language\u003c/td\u003e\n      \u003ctd\u003eViT-l/16\u003c/td\u003e\n      \u003ctd\u003e0.346\u003c/td\u003e\n      \u003ctd\u003e0.913\u003c/td\u003e\n      \u003ctd\u003e0.464\u003c/td\u003e\n      \u003ctd\u003e0.954\u003c/td\u003e\n      \u003ctd\u003e0.666\u003c/td\u003e\n    \u003c/tr\u003e\n  
\u003c/tbody\u003e\n\u003c/table\u003e\n\n### Slide Benchmarks\n\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth\u003eModel name\u003c/th\u003e\n      \u003cth\u003ePretraining\u003c/th\u003e\n      \u003cth\u003eModel size\u003c/th\u003e\n      \u003cth\u003eEBRAINS (30 classes, Public)\u003c/th\u003e\n      \u003cth\u003ePANDA (5 classes, Public)\u003c/th\u003e\n      \u003cth\u003eIHC ER / PR Assess. (6 classes, Internal)\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eUNI\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-l/16\u003c/td\u003e\n      \u003ctd\u003e0.682\u003c/td\u003e\n      \u003ctd\u003e0.944\u003c/td\u003e\n      \u003ctd\u003e0.776\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"6\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cstrong\u003eUNI2-h\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.711\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.946\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e0.794\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eVirchow 2\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e0.691\u003c/td\u003e\n      \u003ctd\u003e0.931\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.808\u003c/strong\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eVirchow\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-h/14\u003c/td\u003e\n      \u003ctd\u003e0.681\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.946\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e0.756\u003c/td\u003e\n    
\u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"6\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cstrong\u003eUNI2-g-preview\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.746\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.953\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.795\u003c/strong\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eh-optimus\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e0.726\u003c/td\u003e\n      \u003ctd\u003e\u003cstrong\u003e0.953\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e0.761\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eProv-GigaPath\u003c/td\u003e\n      \u003ctd\u003eVision\u003c/td\u003e\n      \u003ctd\u003eViT-g/14\u003c/td\u003e\n      \u003ctd\u003e0.687\u003c/td\u003e\n      \u003ctd\u003e0.944\u003c/td\u003e\n      \u003ctd\u003e0.775\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd colspan=\"6\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eCONCH\u003c/td\u003e\n      \u003ctd\u003eVision-language\u003c/td\u003e\n      \u003ctd\u003eViT-b/16\u003c/td\u003e\n      \u003ctd\u003e0.689\u003c/td\u003e\n      \u003ctd\u003e0.934\u003c/td\u003e\n      \u003ctd\u003e0.794\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eMUSK\u003c/td\u003e\n      \u003ctd\u003eVision-language\u003c/td\u003e\n      \u003ctd\u003eViT-l/16\u003c/td\u003e\n      \u003ctd\u003e0.660\u003c/td\u003e\n      \u003ctd\u003e0.923\u003c/td\u003e\n      \u003ctd\u003e0.764\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\nIn each task, for each model, we sweep over 3 
learning rates (1e-5, 5e-5, 1e-4) and report the test performance corresponding to the best-performing model on the validation set.\n\nFor all assessments, all models are evaluated using the global representation (e.g. CLS token) without test-time augmentation.\n\n## Installation\nFirst, clone the repo and `cd` into the directory:\n```shell\ngit clone https://github.com/mahmoodlab/UNI.git\ncd UNI\n```\nThen create a conda env and install the dependencies:\n```shell\nconda create -n UNI python=3.10 -y\nconda activate UNI\npip install -e .\n```\n\n\n### 1. Getting access\nRequest access to the model weights from the Hugging Face model page using the links provided in the [Model Weights](#model-weights) section. You will need to log in to Hugging Face to download the model weights. \n\n\n### 2. Downloading weights + Creating model\nFollowing authentication (using ```huggingface_hub```), the pretrained checkpoints and image transforms for UNI can be directly loaded using the [timm](https://huggingface.co//github/hub/en/timm) library. 
This method automatically downloads the model weights to the [huggingface_hub cache](https://huggingface.co//github/huggingface_hub/en/guides/manage-cache) in your home directory, which ```timm``` will automatically find when using the commands below:\n\n```python\nimport timm\nimport torch\nfrom timm.data import resolve_data_config\nfrom timm.data.transforms_factory import create_transform\nfrom huggingface_hub import login\n\nlogin()  # login with your User Access Token, found at https://huggingface.co/settings/tokens\n\n# pretrained=True needed to load UNI weights (and download weights for the first time)\n# using UNI2-h as example\ntimm_kwargs = {\n   'img_size': 224, \n   'patch_size': 14, \n   'depth': 24,\n   'num_heads': 24,\n   'init_values': 1e-5, \n   'embed_dim': 1536,\n   'mlp_ratio': 2.66667*2,\n   'num_classes': 0, \n   'no_embed_class': True,\n   'mlp_layer': timm.layers.SwiGLUPacked, \n   'act_layer': torch.nn.SiLU, \n   'reg_tokens': 8, \n   'dynamic_img_size': True\n  }\nmodel = timm.create_model(\"hf-hub:MahmoodLab/UNI2-h\", pretrained=True, **timm_kwargs)\ntransform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))\nmodel.eval()\n```\n\nYou can also download the model weights to a specified checkpoint location in your local directory. The ```timm``` library is still used for defining the model architecture (e.g. custom ViT-H/14). 
Pretrained weights and image transforms for UNI need to be manually loaded and defined.\n```python\nimport os\nimport torch\nfrom torchvision import transforms\nimport timm\nfrom huggingface_hub import login, hf_hub_download\n\nlogin()  # login with your User Access Token, found at https://huggingface.co/settings/tokens\n\nlocal_dir = \"../assets/ckpts/uni2-h/\"\nos.makedirs(local_dir, exist_ok=True)  # create directory if it does not exist\nhf_hub_download(\"MahmoodLab/UNI2-h\", filename=\"pytorch_model.bin\", local_dir=local_dir, force_download=True)\ntimm_kwargs = {\n   'model_name': 'vit_giant_patch14_224',\n   'img_size': 224, \n   'patch_size': 14, \n   'depth': 24,\n   'num_heads': 24,\n   'init_values': 1e-5, \n   'embed_dim': 1536,\n   'mlp_ratio': 2.66667*2,\n   'num_classes': 0, \n   'no_embed_class': True,\n   'mlp_layer': timm.layers.SwiGLUPacked, \n   'act_layer': torch.nn.SiLU, \n   'reg_tokens': 8, \n   'dynamic_img_size': True\n  }\nmodel = timm.create_model(**timm_kwargs)\nmodel.load_state_dict(torch.load(os.path.join(local_dir, \"pytorch_model.bin\"), map_location=\"cpu\"), strict=True)\ntransform = transforms.Compose(\n [\n  transforms.Resize(224),\n  transforms.CenterCrop(224),\n  transforms.ToTensor(),\n  transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),\n ]\n)\nmodel.eval()\n```\n\nThe function `get_encoder` performs the commands above, downloading the checkpoint to the `./assets/ckpts/` path relative to this GitHub repository.\n```python\nimport torch\nfrom uni import get_encoder\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")  # use the GPU when one is available\nmodel, transform = get_encoder(enc_name='uni2-h', device=device)\n```\n\n### 3. 
Running Inference\n\nYou can use the UNI pretrained encoder to extract features from histopathology ROIs, as follows:\n\n```python\nfrom PIL import Image\nimage = Image.open(\"uni.jpg\")\nimage = transform(image).unsqueeze(dim=0) # Image (torch.Tensor) with shape [1, 3, 224, 224] following image resizing and normalization (ImageNet parameters)\nwith torch.inference_mode():\n feature_emb = model(image) # Extracted features (torch.Tensor) with shape [1, 1536]\n```\n\nThese pre-extracted features can then be used for ROI classification (via linear probing), slide classification (via multiple instance learning), and other machine learning settings.\n\n\n## Overview of specific usages\nWe provide high-level functions for loading the model and using it for inference. For model loading, the function `get_encoder` performs the commands above in Step 2, downloading the checkpoint to the `./assets/ckpts/` path relative to this GitHub repository.\n```python\nfrom uni import get_encoder\nmodel, transform = get_encoder(enc_name='uni2-h', device=device)\n```\n\nFor inference:\n```python\nfrom uni.downstream.extract_patch_features import extract_patch_features_from_dataloader\nfrom uni.downstream.eval_patch_features.linear_probe import eval_linear_probe\nfrom uni.downstream.eval_patch_features.fewshot import eval_knn, eval_fewshot\nfrom uni.downstream.eval_patch_features.protonet import ProtoNet, prototype_topk_vote\n```\nRefer to the notebook below for detailed examples.\n\n### More detailed starter code for loading / using the model:\nSee [**./notebooks/uni_walkthrough.ipynb**](notebooks/uni_walkthrough.ipynb) to get started with loading and using the model to create embeddings, and for example code for extracting ROI features and performing ROI classification / retrieval.\n\n## License and Terms of Use\n\nⓒ Mahmood Lab. 
The models and associated code are released under the [CC-BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en) license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the UNI models and their derivatives, which include models trained on outputs from the UNI models or datasets created from the UNI models, is prohibited and requires prior approval. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading the models, you agree not to distribute, publish or reproduce a copy of the models. If another user within your organization wishes to use the UNI models, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying models. If you are a commercial entity, please contact the corresponding author or the Mass General Brigham Innovation Office.\n\n\n## Acknowledgements\nThe project was built on top of amazing repositories such as [ViT](https://github.com/google-research/big_vision), [DINOv2](https://github.com/facebookresearch/dinov2), [LGSSL](https://github.com/mbanani/lgssl), and [Timm](https://github.com/huggingface/pytorch-image-models/) (ViT model implementation). We thank the authors and developers for their contributions. \n\n\n## Reference\nIf you find our work useful in your research or if you use parts of this code, please consider citing our [paper](https://www.nature.com/articles/s41591-024-02857-3):\n\nChen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., et al. Towards a general-purpose foundation model for computational pathology. Nat Med (2024). 
https://doi.org/10.1038/s41591-024-02857-3\n\n```\n@article{chen2024uni,\n  title={Towards a General-Purpose Foundation Model for Computational Pathology},\n  author={Chen, Richard J and Ding, Tong and Lu, Ming Y and Williamson, Drew FK and Jaume, Guillaume and Chen, Bowen and Zhang, Andrew and Shao, Daniel and Song, Andrew H and Shaban, Muhammad and others},\n  journal={Nature Medicine},\n  publisher={Nature Publishing Group},\n  year={2024}\n}\n```\n\n\u003cimg src=.github/joint_logo.jpg\u003e \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmahmoodlab%2FUNI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmahmoodlab%2FUNI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmahmoodlab%2FUNI/lists"}