# Awesome Self-Supervised Vision [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

Self-supervised vision learning is taking the field by storm! Let's keep track of all the works before it gets too late! Only papers from 2020 onwards are included; earlier papers can be found in the other awesome repos listed at the end.

If you find an overlooked paper, please open an issue or pull request and provide the paper in this format:
```
- **[]** Paper Name [[pdf]]() [[code]]()
```

Note: most pretrained models can be found on the [Hugging Face Hub](https://huggingface.co/models). You can also use them directly through the [Transformers](https://huggingface.co/docs/transformers/index) library as long as the architecture is supported; for example, you can load [DINO](https://huggingface.co/models?sort=trending&search=dino) weights into a [ViTModel](https://huggingface.co/docs/transformers/model_doc/vit), as sketched below.
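A minimal sketch of that workflow, assuming a recent `transformers` install and the `facebook/dino-vitb16` checkpoint (the image URL is just an example):

```python
from transformers import ViTImageProcessor, ViTModel
from PIL import Image
import requests

# DINO-pretrained ViT-B/16 weights loaded through the generic ViTModel class.
processor = ViTImageProcessor.from_pretrained("facebook/dino-vitb16")
model = ViTModel.from_pretrained("facebook/dino-vitb16")

# Any RGB image works; this one is a common example image.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
features = outputs.last_hidden_state  # (1, num_patches + 1, hidden_size)
```

Older `transformers` versions expose the same checkpoint through `ViTFeatureExtractor` instead of `ViTImageProcessor`.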

## Papers

- **[SimCLR]** A Simple Framework for Contrastive Learning of Visual Representations [[pdf]](https://arxiv.org/pdf/2002.05709.pdf) [[code]](https://github.com/google-research/simclr) [[code]](https://github.com/leftthomas/SimCLR) [[code]](https://github.com/ae-foster/pytorch-simclr) [[code]](https://github.com/sthalles/SimCLR) [[code]](https://github.com/AndrewAtanov/simclr-pytorch) [[code]](https://github.com/tonylins/simclr-converter) [[video]](https://www.youtube.com/watch?v=a7-qwwAFs_s&t=215s) [[video]](https://www.youtube.com/watch?v=YZgeWsuyRH8&ab_channel=AIBites) *(a minimal NT-Xent sketch appears at the end of this list)*
- **[SimCLRv2]** Big Self-Supervised Models are Strong Semi-Supervised Learners [[pdf]](https://arxiv.org/pdf/2006.10029.pdf) [[code]](https://github.com/google-research/simclr) [[code]](https://github.com/Separius/SimCLRv2-Pytorch) [[video]](https://www.youtube.com/watch?v=2lkUNDZld-4&ab_channel=YannicKilcher)
- **[BYOL]** Bootstrap your own latent: A new approach to self-supervised Learning [[pdf]](https://arxiv.org/pdf/2006.07733.pdf) [[code]](https://github.com/sthalles/PyTorch-BYOL) [[code]](https://github.com/lucidrains/byol-pytorch) [[video]](https://www.youtube.com/watch?v=YPfUiOMYOEE&t=1813s&ab_channel=YannicKilcher)
- **[BYOL does not work]** Understanding Self-Supervised and Contrastive Learning with BYOL [[blog]](https://generallyintelligent.com/research/2020-08-24-understanding-self-supervised-contrastive-learning/)
- **[BYOL works!]** BYOL works even without batch statistics [[pdf]](https://arxiv.org/pdf/2010.10241.pdf)
- **[C-BYOL]** Compressive Visual Representations [[pdf]](https://arxiv.org/pdf/2109.12909v3.pdf) [[code]](https://github.com/google-research/compressive-visual-representations)
- **[DeepCluster]** Deep Clustering for Unsupervised Learning of Visual Features [[pdf]](https://arxiv.org/pdf/1807.05520.pdf) [[code]](https://github.com/facebookresearch/deepcluster)
- **[DeeperCluster]** Unsupervised Pre-Training of Image Features on Non-Curated Data [[pdf]](https://arxiv.org/pdf/1905.01278v3.pdf) [[code]](https://github.com/facebookresearch/DeeperCluster)
- **[SWAV]** Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [[pdf]](https://arxiv.org/pdf/2006.09882.pdf) [[code]](https://github.com/facebookresearch/swav) [[video]](https://www.youtube.com/watch?v=7QmsTleiRLs&t=4s&ab_channel=PyTorchLightning) [[video]](https://www.youtube.com/watch?v=t8gr9N7kmUk&ab_channel=Cheng-YangFu)
- **[SimSiam]** Exploring Simple Siamese Representation Learning [[pdf]](https://arxiv.org/pdf/2011.10566.pdf) [[code]](https://github.com/facebookresearch/simsiam) [[code]](https://github.com/leftthomas/SimSiam)
- **[Barlow Twins]** Self-Supervised Learning via Redundancy Reduction [[pdf]](https://arxiv.org/pdf/2103.03230.pdf) [[code]](https://github.com/facebookresearch/barlowtwins) [[code]](https://github.com/IgorSusmelj/barlowtwins)
- **[MoCo]** Momentum Contrast for Unsupervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/1911.05722.pdf) [[code]](https://github.com/facebookresearch/moco) [[code]](https://github.com/leftthomas/MoCo) [[colab]](https://colab.research.google.com/github/facebookresearch/moco/blob/colab-notebook/colab/moco_cifar10_demo.ipynb)
- **[MoCo v2]** Improved Baselines with Momentum Contrastive Learning [[pdf]](https://arxiv.org/pdf/2003.04297.pdf) [[code]](https://github.com/facebookresearch/moco)
- **[MoCo v3]** An Empirical Study of Training Self-Supervised Vision Transformers [[pdf]](https://arxiv.org/pdf/2104.02057.pdf) [[code]](https://github.com/facebookresearch/moco-v3)
- **[DINO]** Emerging Properties in Self-Supervised Vision Transformers [[pdf]](https://arxiv.org/pdf/2104.14294.pdf) [[code]](https://github.com/facebookresearch/dino) [[video]](https://www.youtube.com/watch?v=h3ij3F3cPIk&t=11s&ab_channel=YannicKilcher) [[video]](https://www.youtube.com/watch?v=BFivrO_PXt4&ab_channel=TheAIEpiphany) [[video]](https://www.youtube.com/watch?v=yDXdIR7XUxI&t=154s&ab_channel=AIBites) [[video-code]](https://www.youtube.com/watch?v=hNf6RNHKnE4&t=19s&ab_channel=TheAIEpiphany) [[video-code]](https://www.youtube.com/watch?v=psmMEWKk4Uk&t=1082s&ab_channel=mildlyoverfitted)
- **[DINOv2]** Learning Robust Visual Features without Supervision [[pdf]](https://arxiv.org/pdf/2304.07193.pdf) [[pdf]](https://openreview.net/pdf?id=2dnO3LLiJ1) [[code]](https://github.com/facebookresearch/dinov2) [[code]](https://huggingface.co/docs/transformers/main/en/model_doc/dinov2) [[blog]](https://ai.meta.com/blog/dino-v2-computer-vision-self-supervised-learning/) [[demo]](https://dinov2.metademolab.com/) [[hf notebook]](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DINOv2)
- **[TWIST]** Self-Supervised Learning by Estimating Twin Class Distributions [[pdf]](https://arxiv.org/pdf/2110.07402.pdf) [[code]](https://github.com/bytedance/TWIST)
- **[EsViT]** Efficient Self-supervised Vision Transformers for Representation Learning [[pdf]](https://arxiv.org/pdf/2106.09785.pdf) [[code]](https://github.com/microsoft/esvit)
- **[iBOT]** Image BERT Pre-Training with Online Tokenizer [[pdf]](https://arxiv.org/pdf/2111.07832.pdf) [[code]](https://github.com/bytedance/ibot)
- **[SiT]** Self-supervised vIsion Transformer [[pdf]](https://arxiv.org/pdf/2104.03602.pdf) [[code]](https://github.com/Sara-Ahmed/SiT)
- **[Asym-Siam]** On the Importance of Asymmetry for Siamese Representation Learning [[pdf]](https://arxiv.org/pdf/2204.00613.pdf) [[code]](https://github.com/facebookresearch/asym-siam)
- **[DetCon]** Efficient Visual Pretraining with Contrastive Detection [[pdf]](https://arxiv.org/pdf/2103.10957.pdf) [[code]](https://github.com/isaaccorley/detcon-pytorch) [[video]](https://www.youtube.com/watch?v=oPfu_Ec5u60&t=1225s&ab_channel=TheAIEpiphany)
- **[MoBY]** Self-Supervised Learning with Swin Transformers [[pdf]](https://arxiv.org/pdf/2105.04553.pdf) [[code]](https://github.com/SwinTransformer/Transformer-SSL)
- **[CARE]** Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2110.05340.pdf) [[code]](https://github.com/ChongjianGE/CARE)
- **[ContrastiveCrop]** Crafting Better Contrastive Views for Siamese Representation Learning [[pdf]](https://arxiv.org/pdf/2202.03278.pdf) [[code]](https://github.com/xyupeng/ContrastiveCrop)
- **[SDMP]** A Simple Data Mixing Prior for Improving Self-Supervised Learning [[pdf]](https://cihangxie.github.io/data/SDMP.pdf) [[code]](https://github.com/OliverRensu/SDMP)
- **[ImageGPT]** Generative Pretraining from Pixels [[pdf]](https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf) [[code]](https://github.com/openai/image-gpt) [[code]](https://github.com/karpathy/minGPT) [[code]](https://github.com/teddykoker/image-gpt) [[website]](https://openai.com/blog/image-gpt/) [[video]](https://www.youtube.com/watch?v=YBlNQK0Ao6g&ab_channel=YannicKilcher)
- **[ReSSL]** Relational Self-Supervised Learning with Weak Augmentation [[pdf]](https://arxiv.org/pdf/2107.09282.pdf) [[code]](https://github.com/KyleZheng1997/ReSSL)
- **[DCL]** Decoupled Contrastive Learning [[pdf]](https://arxiv.org/pdf/2110.06848.pdf) [[code]](https://github.com/raminnakhli/Decoupled-Contrastive-Learning)
- **[LEWEL]** Learning Where to Learn in Cross-View Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2203.14898.pdf) [[code]](https://github.com/LayneH/LEWEL)
- **[MSF]** Mean Shift for Self-Supervised Learning [[pdf]](https://www.csee.umbc.edu/~hpirsiav/papers/MSF_iccv21.pdf) [[code]](https://github.com/UMBCvision/MSF)
- **[SWAG]** Revisiting Weakly Supervised Pre-Training of Visual Perception Models [[pdf]](https://arxiv.org/pdf/2201.08371.pdf) [[code]](https://github.com/facebookresearch/SWAG)
- **[ISD]** Self-Supervised Learning by Iterative Similarity Distillation [[pdf]](https://www.csee.umbc.edu/~hpirsiav/papers/ISD_iccv21.pdf) [[code]](https://github.com/UMBCvision/ISD)
- **[Self-Label]** Self-labelling via simultaneous clustering and representation learning [[pdf]](https://arxiv.org/pdf/1911.05371.pdf) [[code]](https://github.com/yukimasano/self-label)
- **[InfoCL]** Rethinking Minimal Sufficient Representation in Contrastive Learning [[pdf]](https://arxiv.org/pdf/2203.07004.pdf) [[code]](https://github.com/Haoqing-Wang/InfoCL)
- **[DenseCL]** Dense Contrastive Learning for Self-Supervised Visual Pre-Training [[pdf]](https://arxiv.org/pdf/2011.09157.pdf) [[code]](https://github.com/WXinlong/DenseCL)
- **[FlatNCE]** Breaking The log-K Curse On Contrastive Learners With FlatNCE [[pdf]](https://arxiv.org/pdf/2107.01152.pdf) [[code]](https://github.com/Junya-Chen/FlatCLR)
- **[ARB]** Align Representations with Base: A New Approach to Self-Supervised Learning [[pdf]](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhang_Align_Representations_With_Base_A_New_Approach_to_Self-Supervised_Learning_CVPR_2022_paper.pdf)
- **[SelfPatch]** Patch-level Representation Learning for Self-supervised Vision Transformers [[pdf]](https://arxiv.org/pdf/2206.07990.pdf) [[code]](https://github.com/alinlab/SelfPatch)
- **[EMAN]** Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning [[pdf]](https://arxiv.org/pdf/2101.08482.pdf) [[code]](https://github.com/amazon-research/exponential-moving-average-normalization)
- **[MPL]** Meta Pseudo Labels [[pdf]](https://arxiv.org/pdf/2003.10580.pdf) [[code]](https://github.com/kekmodel/MPL-pytorch)
- **[RINCE]** Robust Contrastive Learning against Noisy Views [[pdf]](https://arxiv.org/pdf/2201.04309.pdf) [[code]](https://github.com/chingyaoc/RINCE)
- **[CoKe]** Unsupervised Visual Representation Learning by Online Constrained K-Means [[pdf]](https://arxiv.org/pdf/2105.11527.pdf) [[code]](https://github.com/idstcv/CoKe)
- **[ReSim]** Region Similarity Representation Learning [[pdf]](https://arxiv.org/pdf/2103.12902.pdf) [[code]](https://github.com/Tete-Xiao/ReSim)
- **[CAST]** Learning to Localize Improves Self-Supervised Representations [[pdf]](https://arxiv.org/pdf/2012.04630.pdf)
- **[LoGo]** Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy [[pdf]](https://arxiv.org/pdf/2203.17205.pdf) [[code]](https://github.com/ztt1024/LoGo-SSL)
- **[CsMl]** Hierarchical Semantic Alignment for Contrastive Representation Learning [[pdf]](https://arxiv.org/pdf/2012.02733.pdf)
- **[SetSim]** Exploring Set Similarity for Dense Self-supervised Representation Learning [[pdf]](https://arxiv.org/pdf/2107.08712.pdf)
- **[UniVIP]** A Unified Framework for Self-Supervised Visual Pre-training [[pdf]](https://arxiv.org/pdf/2203.06965.pdf)
- **[Dual Temperature]** Towards Understanding and Simplifying MoCo [[pdf]](https://arxiv.org/pdf/2203.17248.pdf) [[code]](https://github.com/ChaoningZhang/Dual-temperature)
- **[DATA]** Domain-Aware and Task-Aware Self-supervised Learning [[pdf]](https://arxiv.org/pdf/2203.09041.pdf) [[code]](https://github.com/GAIA-vision/GAIA-ssl)
- **[SaGe]** Semantic-Aware Generation for Self-Supervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2111.13163.pdf) [[code]](https://github.com/sunsmarterjie/SaGe)
- **[MST]** Masked Self-Supervised Transformer for Visual Representation [[pdf]](https://arxiv.org/pdf/2106.05656.pdf)
- **[IP-IRM]** Self-Supervised Learning Disentangled Group Representation as Feature [[pdf]](https://arxiv.org/pdf/2110.15255.pdf) [[code]](https://github.com/Wangt-CN/IP-IRM)
- **[SSL-HSIC]** Self-Supervised Learning with Kernel Dependence Maximization [[pdf]](https://arxiv.org/pdf/2106.08320.pdf) [[code]](https://github.com/deepmind/ssl_hsic)
- **[JigClu]** Jigsaw Clustering for Unsupervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2104.00323.pdf) [[code]](https://github.com/dvlab-research/JigsawClustering)
- **[SelfAugment]** Automatic Augmentation Policies for Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2009.07724.pdf) [[code]](https://github.com/cjrd/selfaugment)
- **[ProtoNCE]** Prototypical Contrastive Learning of Unsupervised Representations [[pdf]](https://arxiv.org/pdf/2005.04966.pdf) [[code]](https://github.com/salesforce/PCL)
- **[OBoW]** Online Bag-of-Visual-Words Generation for Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2012.11552.pdf) [[code]](https://github.com/valeoai/obow)
- **[SEERv1]** Self-supervised Pretraining of Visual Features in the Wild [[pdf]](https://arxiv.org/pdf/2103.01988.pdf) [[code]](https://github.com/facebookresearch/vissl/tree/main/projects/SEER)
- **[SEERv2]** Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision [[pdf]](https://arxiv.org/pdf/2202.08360v2.pdf) [[code]](https://github.com/facebookresearch/vissl/tree/main/projects/SEER)
- **[CLSA]** Contrastive Learning with Stronger Augmentations [[pdf]](https://arxiv.org/pdf/2104.07713v1.pdf) [[code]](https://github.com/maple-research-lab/CLSA)
- **[VICReg]** Variance-Invariance-Covariance Regularization for Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2105.04906.pdf)
- **[VQ-VAE]** Neural Discrete Representation Learning [[pdf]](https://arxiv.org/pdf/1711.00937.pdf) [[code]](https://github.com/zalandoresearch/pytorch-vq-vae) [[code]](https://github.com/lucidrains/vector-quantize-pytorch) [[code]](https://github.com/karpathy/deep-vector-quantization) [[code]](https://github.com/openai/DALL-E) [[code]](https://github.com/ritheshkumar95/pytorch-vqvae) [[code]](https://github.com/nadavbh12/VQ-VAE) [[code]](https://github.com/nakosung/VQ-VAE) [[code]](https://juliusruseckas.github.io/ml/vq-vae.html) [[code]](https://github.com/AntixK/PyTorch-VAE/blob/master/models/vq_vae.py) [[colab]](https://colab.research.google.com/github/zalandoresearch/pytorch-vq-vae/blob/master/vq-vae.ipynb) [[video]](https://www.youtube.com/watch?v=VZFVUrYcig0&ab_channel=TheAIEpiphany) [[blog]](https://ml.berkeley.edu/blog/posts/vq-vae/)
- **[VQ-VAE-2]** Generating Diverse High-Fidelity Images with VQ-VAE-2 [[pdf]](https://arxiv.org/pdf/1906.00446.pdf) [[code]](https://github.com/rosinality/vq-vae-2-pytorch) [[code]](https://github.com/vvvm23/vqvae-2) [[code]](https://github.com/lucidrains/vector-quantize-pytorch)
- **[VQ-GAN]** Taming Transformers for High-Resolution Image Synthesis [[pdf]](https://arxiv.org/pdf/2012.09841.pdf) [[code]](https://github.com/CompVis/taming-transformers) [[code]](https://github.com/dome272/VQGAN-pytorch) [[website]](https://compvis.github.io/taming-transformers/) [[video]](https://www.youtube.com/watch?v=j2PXES-liuc&t=2s&ab_channel=TheAIEpiphany) [[video]](https://www.youtube.com/watch?v=wcqLFDXaDO8&ab_channel=Outlier) [[video]](https://www.youtube.com/watch?v=-wDSDtIAyWQ&ab_channel=GradientDude) [[video code]](https://www.youtube.com/watch?v=_Br5WRwUz_U&ab_channel=Outlier) [[blog]](https://ljvmiranda921.github.io/notebook/2021/08/08/clip-vqgan/) [[ViT-VQGAN]](https://arxiv.org/pdf/2110.04627.pdf)
- **[CycleGAN]** Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks [[pdf]](https://arxiv.org/pdf/1703.10593.pdf) [[website]](https://junyanz.github.io/CycleGAN/) [[code]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[code]](https://github.com/eriklindernoren/PyTorch-GAN/tree/master/implementations/cyclegan) [[code]](https://github.com/leftthomas/CycleGAN) [[code]](https://nn.labml.ai/gan/cycle_gan/index.html)
- **[Restormer]** Efficient Transformer for High-Resolution Image Restoration [[pdf]](https://arxiv.org/pdf/2111.09881.pdf) [[code]](https://github.com/swz30/restormer) [[code]](https://github.com/leftthomas/Restormer) [[colab]](https://colab.research.google.com/drive/1C2818h7KnjNv4R1sabe14_AYL7lWhmu6?usp=sharing)
- **[MC-SSL0.0]** Towards Multi-Concept Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2111.15340.pdf)
- **[t-ReX]** Improving the Generalization of Supervised Models [[pdf]](https://arxiv.org/pdf/2206.15369.pdf) [[website]](https://europe.naverlabs.com/research/computer-vision/improving-the-generalization-of-supervised-models/)
- **[HCSC]** Hierarchical Contrastive Selective Coding [[pdf]](https://arxiv.org/pdf/2202.00455.pdf) [[code]](https://github.com/hirl-team/HCSC)
- **[Mugs]** A Multi-Granular Self-Supervised Learning Framework [[pdf]](https://arxiv.org/pdf/2203.14415v1.pdf) [[code]](https://github.com/sail-sg/mugs)
- **[BatchFormer]** Learning to Explore Sample Relationships for Robust Representation Learning [[pdf]](https://arxiv.org/pdf/2203.01522.pdf) [[code]](https://github.com/zhihou7/BatchFormer)
- **[RELICv2]** Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? [[pdf]](https://arxiv.org/pdf/2201.05119v1.pdf)
- **[CaCo]** Both Positive and Negative Samples are Directly Learnable via Cooperative-adversarial Contrastive Learning [[pdf]](https://arxiv.org/pdf/2203.14370v1.pdf) [[code]](https://github.com/maple-research-lab/caco)
- **[DnC]** Divide and Contrast: Self-supervised Learning from Uncurated Data [[pdf]](https://arxiv.org/pdf/2105.08054v1.pdf)
- **[LIFT]** Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks [[pdf]](https://arxiv.org/pdf/2206.06565.pdf) [[code]](https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning)
- **[VICReg]** What Do We Maximize in Self-Supervised Learning? [[pdf]](https://arxiv.org/pdf/2207.10081v1.pdf)
- **[VICRegL]** Self-Supervised Learning of Local Visual Features [[pdf]](https://arxiv.org/pdf/2210.01571.pdf) [[code]](https://github.com/facebookresearch/VICRegL)
- **[Propagate Yourself]** Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2011.10043.pdf) [[code]](https://github.com/zdaxie/PixPro)
- **[SDCLR]** Self-Damaging Contrastive Learning [[pdf]](https://arxiv.org/pdf/2106.02990.pdf) [[code]](https://github.com/VITA-Group/SDCLR)
- **[I-JEPA]** Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture [[pdf]](https://arxiv.org/pdf/2301.08243.pdf) [[code]](https://github.com/facebookresearch/ijepa) [[blog]](https://ai.facebook.com/blog/yann-lecun-ai-model-i-jepa/)
- **[CorInfoMax]** Self-Supervised Learning with an Information Maximization Criterion [[pdf]](https://arxiv.org/pdf/2209.07999.pdf)
- **[All4One]** Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction [[pdf]](https://arxiv.org/pdf/2303.09417.pdf)
- **[SimDis]** Simple Distillation Baselines for Improving Small Self-supervised Models [[pdf]](https://arxiv.org/pdf/2106.11304.pdf) [[code]](https://github.com/JindongGu/SimDis)
- **[MOKD]** Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2304.06461.pdf)
- **[SiameseIM]** Siamese Image Modeling for Self-Supervised Vision Representation Learning [[pdf]](https://arxiv.org/pdf/2206.01204.pdf) [[code]](https://github.com/fundamentalvision/Siamese-Image-Modeling)
- **[MixedAE]** Mixed Autoencoder for Self-supervised Visual Representation Learning [[pdf]](https://arxiv.org/pdf/2303.17152.pdf)
- **[CIM]** Correlational Image Modeling for Self-Supervised Visual Pre-Training [[pdf]](https://arxiv.org/pdf/2303.12670.pdf) [[code]](https://github.com/weivision/Correlational-Image-Modeling)
- **[VD]** Variable Discretization for Self-Supervised Learning [[pdf]](https://openreview.net/pdf?id=p7DIDSzT8x)
- Evolved Part Masking for Self-Supervised Learning [[pdf]](https://openaccess.thecvf.com/content/CVPR2023/papers/Feng_Evolved_Part_Masking_for_Self-Supervised_Learning_CVPR_2023_paper.pdf)
- Semi-supervised learning made simple with self-supervised clustering [[pdf]](https://arxiv.org/pdf/2306.07483.pdf)
- Self-Supervised Relational Reasoning for Representation Learning [[pdf]](https://arxiv.org/pdf/2006.05849.pdf) [[code]](https://github.com/mpatacchiola/self-supervised-relational-reasoning)
- Solving Inefficiency of Self-supervised Representation Learning [[pdf]](https://arxiv.org/pdf/2104.08760v3.pdf) [[code]](https://github.com/wanggrun/triplet)
- Self-training with Noisy Student improves ImageNet classification [[pdf]](https://arxiv.org/pdf/1911.04252.pdf) [[video]](https://www.youtube.com/watch?v=q7PjrmGNx5A&t=1565s&ab_channel=YannicKilcher)
- Directional Self-supervised Learning for Heavy Image Augmentations [[pdf]](https://arxiv.org/pdf/2110.13555.pdf)
- Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework [[pdf]](https://arxiv.org/pdf/2112.05141.pdf)
- Representation Learning via Invariant Causal Mechanisms [[pdf]](https://arxiv.org/pdf/2010.07922.pdf)
- Hard Negative Mixing for Contrastive Learning [[pdf]](https://arxiv.org/pdf/2010.01028.pdf) [[website]](https://europe.naverlabs.com/research/computer-vision/mochi/)
- Robust Contrastive Learning Using Negative Samples with Diminished Semantics [[pdf]](https://arxiv.org/pdf/2110.14189.pdf) [[code]](https://github.com/SongweiGe/Contrastive-Learning-with-Non-Semantic-Negatives)
- Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views [[pdf]](https://arxiv.org/pdf/2206.00227.pdf)
- Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks [[pdf]](https://arxiv.org/pdf/2205.15173.pdf)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [[pdf]](https://arxiv.org/pdf/2303.01669.pdf) [[code]](https://github.com/GANPerf/LCR)
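
Many of the contrastive methods listed above (SimCLR, MoCo, and their descendants) optimize a variant of the InfoNCE objective over two augmented views of each image. As a reference point, here is a minimal NT-Xent sketch; it is illustrative only and not any paper's official implementation:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent loss. z1, z2: (N, D) projections of two augmented views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                     # (2N, D)
    sim = z @ z.t() / temperature                      # pairwise cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
    # The positive for each row is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

In SimCLR this loss is applied to the output of a small MLP projection head on top of the encoder, with temperatures typically in the 0.1–0.5 range.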

## Masked Image Pretraining
- **[BEiT]** BERT Pre-Training of Image Transformers [[pdf]](https://arxiv.org/pdf/2106.08254.pdf) [[code]](https://github.com/microsoft/unilm/tree/master/beit)
- **[BEiT v2]** Masked Image Modeling with Vector-Quantized Visual Tokenizers [[pdf]](https://arxiv.org/pdf/2208.06366.pdf) [[code]](https://github.com/microsoft/unilm/tree/master/beit2)
- **[MAE]** Masked Autoencoders Are Scalable Vision Learners [[pdf]](https://arxiv.org/pdf/2111.06377.pdf) [[code]](https://github.com/facebookresearch/mae) [[code]](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-pretraining) [[code]](https://github.com/pengzhiliang/MAE-pytorch) [[code]](https://github.com/FlyEgle/MAE-pytorch) [[hf notebook]](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/ViTMAE/ViT_MAE_visualization_demo.ipynb) *(a minimal random-masking sketch follows this list)*
- **[ConvNeXt v2]** Co-designing and Scaling ConvNets with Masked Autoencoders [[pdf]](https://arxiv.org/pdf/2301.00808.pdf) [[code]](https://github.com/facebookresearch/ConvNeXt-V2)
- **[SparK]** Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling [[pdf]](https://arxiv.org/pdf/2301.03580.pdf) [[code]](https://github.com/keyu-tian/SparK)
- **[MaskFeat]** Masked Feature Prediction for Self-Supervised Visual Pre-Training [[pdf]](https://arxiv.org/pdf/2112.09133.pdf)
- **[SimMIM]** A Simple Framework for Masked Image Modeling [[pdf]](https://arxiv.org/pdf/2111.09886.pdf) [[code]](https://github.com/microsoft/SimMIM) [[code]](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-pretraining)
- **[GMML]** GMML is All you Need [[pdf]](https://arxiv.org/pdf/2205.14986.pdf) [[code]](https://github.com/Sara-Ahmed/GMML)
- [Awesome Masked Autoencoders](https://github.com/EdisonLeeeee/Awesome-Masked-Autoencoders)
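
The masked-image-modeling papers above (BEiT, MAE, SimMIM, ...) all start by hiding a large fraction of patch tokens and training the network to reconstruct the missing content. A minimal MAE-style random-masking sketch, assuming patch embeddings of shape `(B, N, D)`; the 75% mask ratio follows the MAE paper, but the function itself is illustrative rather than the official implementation:

```python
import torch

def random_masking(patches, mask_ratio=0.75):
    """Randomly drop a fraction of patch tokens and remember how to restore the order."""
    B, N, D = patches.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N, device=patches.device)   # one random score per patch
    ids_shuffle = noise.argsort(dim=1)                # random permutation of patches
    ids_restore = ids_shuffle.argsort(dim=1)          # inverse permutation for the decoder
    ids_keep = ids_shuffle[:, :n_keep]
    kept = torch.gather(patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N, device=patches.device)    # 0 = kept, 1 = masked
    mask[:, :n_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)         # back to the original patch order
    return kept, mask, ids_restore
```

The encoder then processes only the kept tokens, which is what makes MAE-style pretraining cheap; the decoder re-inserts mask tokens at the positions given by `ids_restore`.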

## Unsupervised Segmentation With/Using Self-Supervised Models
- **[TransCAM]** Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation [[pdf]](https://arxiv.org/pdf/2203.07239.pdf) [[code]](https://github.com/liruiwen/TransCAM)
- **[GroupViT]** Semantic Segmentation Emerges from Text Supervision [[pdf]](https://arxiv.org/pdf/2202.11094.pdf) [[code]](https://github.com/NVlabs/GroupViT)
- **[LOST]** Localizing Objects with Self-Supervised Transformers and no Labels [[pdf]](https://arxiv.org/pdf/2109.14279.pdf) [[code]](https://github.com/valeoai/LOST)
- **[MaskContrast]** Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [[pdf]](https://arxiv.org/pdf/2102.06191.pdf) [[code]](https://github.com/wvangansbeke/Unsupervised-Semantic-Segmentation)
- **[MaskDistill]** Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation [[pdf]](https://arxiv.org/pdf/2206.06363v1.pdf) [[code]](https://github.com/wvangansbeke/MaskDistill)
- **[Leopart]** Self-Supervised Learning of Object Parts for Semantic Segmentation [[pdf]](https://arxiv.org/pdf/2204.13101v2.pdf) [[code]](https://github.com/mkuuwaujinga/leopart)
- **[TokenCut]** Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut [[pdf]](https://arxiv.org/pdf/2202.11539.pdf) [[code]](https://github.com/YangtaoWANG95/TokenCut) [[colab]](https://colab.research.google.com/github/YangtaoWANG95/TokenCut/blob/master/inference_demo.ipynb) [[website]](https://www.m-psi.fr/Papers/TokenCut2022/)
- Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization [[pdf]](https://arxiv.org/pdf/2205.07839.pdf) [[code]](https://github.com/lukemelas/deep-spectral-segmentation) [[demo]](https://huggingface.co/spaces/lukemelas/deep-spectral-segmentation) [[website]](https://lukemelas.github.io/deep-spectral-segmentation/)

## Review Papers
- A Cookbook of Self-Supervised Learning [[pdf]](https://arxiv.org/pdf/2304.12210.pdf)
- Self-Supervised Representation Learning: Introduction, Advances and Challenges [[pdf]](https://arxiv.org/pdf/2110.09327.pdf)
- Self-supervised Learning: Generative or Contrastive [[pdf]](https://arxiv.org/pdf/2006.08218.pdf)
- Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey [[pdf]](https://arxiv.org/pdf/1902.06162.pdf)
- A Survey on Contrastive Self-supervised Learning [[pdf]](https://arxiv.org/pdf/2011.00362.pdf)

## Libraries
- [VISSL](https://github.com/facebookresearch/vissl)
- [mmselfsup](https://github.com/open-mmlab/mmselfsup)
- [Lightly](https://github.com/lightly-ai/lightly)
- [solo-learn](https://github.com/vturrisi/solo-learn)
- [benchmark_VAE](https://github.com/clementchadebec/benchmark_VAE)
- [unilm](https://github.com/microsoft/unilm)

## Other Awesomes
- [jason718](https://github.com/jason718/awesome-self-supervised-learning)
- [dev-sungman](https://github.com/dev-sungman/Awesome-Self-Supervised-Papers)
- [asheeshcric](https://github.com/asheeshcric/awesome-contrastive-self-supervised-learning)

## Some Nice Resources
- [Stanford CS231n slides](http://cs231n.stanford.edu/slides/2022/lecture_14_jiajun.pdf)
- [OpenAI NeurIPS Tutorial](https://nips.cc/media/neurips-2021/Slides/21895.pdf)
- [Amit Chaudhary](https://amitness.com/archives/)
- [AI Summer](https://theaisummer.com/topics/unsupervised-learning/)
- Lil'Log [[Self-Supervised Representation Learning]](https://lilianweng.github.io/posts/2019-11-10-self-supervised/) [[Contrastive Representation Learning]](https://lilianweng.github.io/posts/2021-05-31-contrastive/) [[Semi-Supervised Learning]](https://lilianweng.github.io/posts/2021-12-05-semi-supervised/) [[Active Learning]](https://lilianweng.github.io/posts/2022-02-20-active-learning/) [[Data Generation]](https://lilianweng.github.io/posts/2022-04-15-data-gen/)
- Optimal Transport and Hungarian Algorithm [[blog]](https://michielstock.github.io/posts/2017/2017-11-5-OptimalTransport/) [[blog]](https://towardsdatascience.com/optimal-transport-a-hidden-gem-that-empowers-todays-machine-learning-2609bbf67e59) [[blog]](https://leimao.github.io/blog/Hungarian-Matching-Algorithm/) [[blog]](https://brilliant.org/wiki/hungarian-matching/) [[blog]](https://www.topcoder.com/thrive/articles/Assignment%20Problem%20and%20Hungarian%20Algorithm) [[blog]](https://medium.com/@riya.tendulkar/the-assignment-problem-using-hungarian-algorithm-4f105729af18)
- Gumbel-Softmax [[pdf]](https://arxiv.org/pdf/1611.01144.pdf) [[pytorch]](https://pytorch.org/docs/stable/generated/torch.nn.functional.gumbel_softmax.html) [[code]](https://github.com/shaabhishek/gumbel-softmax-pytorch) [[blog and code]](https://neptune.ai/blog/gumbel-softmax-loss-function-guide-how-to-implement-it-in-pytorch) [[blog]](https://sassafras13.github.io/GumbelSoftmax/) [[blog]](https://towardsdatascience.com/synthetic-data-with-gumbel-softmax-activations-49e990168565) [[gumbel-softmax vae]](https://github.com/YongfeiYan/Gumbel_Softmax_VAE)
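
Both of the topics above show up inside self-supervised pipelines: assignment problems appear in clustering-based methods and set prediction, and Gumbel-Softmax appears in discrete visual tokenizers. A minimal usage sketch of the linked tools (SciPy's `linear_sum_assignment` and PyTorch's `gumbel_softmax`); the toy tensors are just for illustration:

```python
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

# Hungarian matching: minimal-cost one-to-one assignment on a toy cost matrix.
cost = torch.rand(4, 4)
rows, cols = linear_sum_assignment(cost.numpy())
print(list(zip(rows.tolist(), cols.tolist())))         # matched (row, column) pairs

# Gumbel-Softmax: approximately discrete samples that remain differentiable.
logits = torch.randn(2, 10, requires_grad=True)
soft = F.gumbel_softmax(logits, tau=1.0, hard=False)   # relaxed one-hot samples
hard = F.gumbel_softmax(logits, tau=1.0, hard=True)    # straight-through one-hot samples
```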