# awesome-self-supervised-learning

A curated list of papers on self-supervised representation learning.

https://github.com/lightly-ai/awesome-self-supervised-learning
## 2023
- [Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution](https://proceedings.neurips.cc/paper_files/paper/2023/file/06ea400b9b7cfce6428ec27a371632eb-Paper-Conference.pdf) (NeurIPS 2023)
- [A Cookbook of Self-Supervised Learning](https://arxiv.org/abs/2304.12210)
- [Masked Autoencoders Enable Efficient Knowledge Distillers](https://arxiv.org/abs/2208.12256) [[Google Drive]](https://drive.google.com/file/d/1bzuOab5fvKK7jpxv5bMoGk1gW446SCUL/view?usp=sharing)
- [Understanding and Generalizing Contrastive Learning from the Inverse Optimal Transport Perspective](https://openreview.net/forum?id=DBlWCsOy94) (ICML 2023) [[Google Drive]](https://drive.google.com/file/d/1hBEy-yh_KtkqY3rjeato-Cuo6ITzhowr/view?usp=sharing)
- [CycleCL: Self-supervised Learning for Periodic Videos](https://arxiv.org/abs/2311.03402) [[Google Drive]](https://drive.google.com/file/d/1BDC891HX_JxF84UK_x8RKgHZockJqQFU/view?usp=sharing)
- [Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data](https://arxiv.org/abs/2303.13664) [[Google Drive]](https://drive.google.com/file/d/1RabJuwtOevH9hg9wuFTN4z8y4gjQxCT_/view?usp=sharing)
- [Reverse Engineering Self-Supervised Learning](https://arxiv.org/abs/2305.15614) [[Google Drive]](https://drive.google.com/file/d/1KsqV9_HE0y0EwlNivUdZPKqqCkdM-4HB/view?usp=sharing)
- [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193) [[Google Drive]](https://drive.google.com/file/d/11szszgtsYESO3QF8jkFsLFTVtN797uH2/view?usp=sharing)
- [Segment Anything](https://arxiv.org/abs/2304.02643) [[Google Drive]](https://drive.google.com/file/d/18yPuL8J6boi5pB1NRO6VAUbYEwmI3tFo/view?usp=sharing)
- [Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture](https://arxiv.org/abs/2301.08243) [[Google Drive]](https://drive.google.com/file/d/1l5nHxqqbv7o3ESw3DLBqgJyXILJ0FgH6/view?usp=sharing)
- [Self-supervised Object-Centric Learning for Videos](https://arxiv.org/abs/2310.06907)
- [An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization](https://proceedings.neurips.cc/paper_files/paper/2023/file/6b1d4c03391b0aa6ddde0b807a78c950-Paper-Conference.pdf) (NeurIPS 2023)
- [The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning](https://arxiv.org/abs/2307.10907) [[code]](https://github.com/apple/ml-entropy-reconstruction)
- [Fast Segment Anything](https://arxiv.org/abs/2306.12156) [[code]](https://github.com/CASIA-IVA-Lab/FastSAM)
- [Faster Segment Anything: Towards Lightweight SAM for Mobile Applications](https://arxiv.org/abs/2306.14289) [[code]](https://github.com/ChaoningZhang/MobileSAM)
- [What Do Self-Supervised Vision Transformers Learn?](https://arxiv.org/abs/2305.00729) [[code]](https://github.com/naver-ai/cl-vs-mim)
- [Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need](https://arxiv.org/abs/2303.15256)
- [EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything](https://arxiv.org/abs/2312.00863) [[code]](https://github.com/yformer/EfficientSAM)
- [DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions](https://arxiv.org/abs/2309.03576) [[code]](https://github.com/Haochen-Wang409/DropPos)
- [VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_VideoMAE_V2_Scaling_Video_Masked_Autoencoders_With_Dual_Masking_CVPR_2023_paper.pdf) (CVPR 2023)
- [MGMAE: Motion Guided Masking for Video Masked Autoencoding](https://openaccess.thecvf.com/content/ICCV2023/papers/Huang_MGMAE_Motion_Guided_Masking_for_Video_Masked_Autoencoding_ICCV_2023_paper.pdf) (ICCV 2023) [[code]](https://github.com/MCG-NJU/MGMAE)
- [Improved baselines for vision-language pre-training](https://arxiv.org/abs/2305.08675) [[code]](https://github.com/facebookresearch/clip-rocket) [[Google Drive]](https://drive.google.com/file/d/1CNLvxt1jri7chCGy2ZqXBzDwPko0s6QP/view?usp=sharing)
## 2024
- [Scalable Pre-training of Large Autoregressive Image Models](https://arxiv.org/abs/2401.08541) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/aim.ipynb)
- [SAM 2: Segment Anything in Images and Videos](https://arxiv.org/abs/2408.00714) [[Google Drive]](https://drive.google.com/file/d/1kWvZclajy7z3ize2KNCLzCfvZN2pDien/view?usp=sharing)
- [Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach](https://arxiv.org/abs/2405.15613)
- [GLID: Pre-training a Generalist Encoder-Decoder Vision Model](https://arxiv.org/abs/2404.07603) [[Google Drive]](https://drive.google.com/file/d/1CEaZ00z-0hqGKp5cTN8fxP6tsHiHkFye/view?usp=sharing)
- [You Don't Need Data-Augmentation in Self-Supervised Learning](https://arxiv.org/abs/2406.09294)
- [Occam's Razor for Self Supervised Learning: What is Sufficient to Learn Good Representations?](https://arxiv.org/abs/2406.10743)
- [Asymmetric Masked Distillation for Pre-Training Small Foundation Models](https://openaccess.thecvf.com/content/CVPR2024/papers/Zhao_Asymmetric_Masked_Distillation_for_Pre-Training_Small_Foundation_Models_CVPR_2024_paper.pdf) (CVPR 2024) [[code]](https://github.com/MCG-NJU/AMD)
- [Revisiting Feature Prediction for Learning Visual Representations from Video](https://arxiv.org/abs/2404.08471) [[code]](https://github.com/facebookresearch/jepa)
- [ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning](https://arxiv.org/abs/2405.15160)
- [Rethinking Patch Dependence for Masked Autoencoders](https://arxiv.org/abs/2401.14391) [[code]](https://github.com/TonyLianLong/CrossMAE)
## 2021
- [Decoupled Contrastive Learning](https://arxiv.org/abs/2110.06848) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/dcl.ipynb)
- [Dense Contrastive Learning for Self-Supervised Visual Pre-Training](https://arxiv.org/abs/2011.09157) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/densecl.ipynb)
- [Emerging Properties in Self-Supervised Vision Transformers](https://arxiv.org/abs/2104.14294) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/dino.ipynb)
- [Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/mae.ipynb)
- [With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations](https://arxiv.org/abs/2104.14548) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/nnclr.ipynb)
- [SimMIM: A Simple Framework for Masked Image Modeling](https://arxiv.org/abs/2111.09886) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/simmim.ipynb)
- [Exploring Simple Siamese Representation Learning](https://arxiv.org/abs/2011.10566) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/simsiam.ipynb)
- [Barlow Twins: Self-Supervised Learning via Redundancy Reduction](https://arxiv.org/abs/2103.03230) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/barlowtwins.ipynb)
- [When Does Contrastive Visual Representation Learning Work?](https://arxiv.org/abs/2105.05837)
- [Efficient Visual Pretraining with Contrastive Detection](https://arxiv.org/abs/2103.10957)
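The masked-image-modeling papers in this section (MAE, SimMIM) all rest on the same primitive: masking a large random subset of image patches per sample and reconstructing them. A minimal NumPy sketch of MAE-style per-sample random masking, for orientation only; function and variable names are illustrative and not taken from any repository listed here:

```python
import numpy as np

def random_masking(patches, mask_ratio=0.75, rng=None):
    """MAE-style random masking (simplified sketch).

    patches: (N, L, D) array of patch embeddings.
    Returns the visible patches, a binary mask (1 = masked, 0 = visible),
    and the per-sample shuffle order used to restore positions later.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, l, _ = patches.shape
    len_keep = int(l * (1 - mask_ratio))
    noise = rng.random((n, l))                   # uniform noise per patch
    ids_shuffle = np.argsort(noise, axis=1)      # random permutation per sample
    ids_keep = ids_shuffle[:, :len_keep]         # indices of visible patches
    visible = np.take_along_axis(patches, ids_keep[:, :, None], axis=1)
    mask = np.ones((n, l))
    np.put_along_axis(mask, ids_keep, 0.0, axis=1)
    return visible, mask, ids_shuffle
```

With the default ratio of 0.75, only a quarter of the patches are encoded, which is where much of MAE's pre-training efficiency comes from.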
## 2022
- [Unsupervised Visual Representation Learning by Synchronous Momentum Grouping](https://arxiv.org/abs/2207.06167) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/smog.ipynb)
- [Masked Siamese Networks for Label-Efficient Learning](https://arxiv.org/abs/2204.07141) [[Google Drive]](https://drive.google.com/file/d/15WGpYpxy4_1a927RWrmlkeJohZDznN8e/view?usp=sharing) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/msn.ipynb)
- [The Hidden Uniform Cluster Prior in Self-Supervised Learning](https://arxiv.org/abs/2210.07277) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/pmsn.ipynb)
- [TiCo: Transformation Invariance and Covariance Contrast for Self-Supervised Visual Representation Learning](https://arxiv.org/abs/2206.10698) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/tico.ipynb)
- [VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning](https://arxiv.org/abs/2105.04906) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/vicreg.ipynb)
- [VICRegL: Self-Supervised Learning of Local Visual Features](https://arxiv.org/abs/2210.01571) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/vicregl.ipynb)
- [VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training](https://arxiv.org/abs/2203.12602) [[Google Drive]](https://drive.google.com/file/d/1F0oyiyyxCKzWS9Gv8TssHxaCMFnAoxfb/view?usp=sharing)
- [Improving Visual Representation Learning through Perceptual Understanding](https://arxiv.org/abs/2212.14504) [[Google Drive]](https://drive.google.com/file/d/1n4Y0iiM368RaPxPg6qvsfACguaolFnhf/view?usp=sharing)
- [RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank](https://arxiv.org/abs/2210.02885) [[Google Drive]](https://drive.google.com/file/d/1cEP1_G2wMM3-AMMrdntGN6Fq1E5qwPi1/view?usp=sharing)
- [A Closer Look at Self-Supervised Lightweight Vision Transformers](https://arxiv.org/abs/2205.14443) [[code]](https://github.com/wangsr126/mae-lite)
- [Beyond neural scaling laws: beating power law scaling via data pruning](https://arxiv.org/abs/2206.14486) [[code]](https://github.com/rgeirhos/dataset-pruning-metrics)
- [A simple, efficient and scalable contrastive masked autoencoder for learning visual representations](https://arxiv.org/abs/2210.16870)
- [Masked Autoencoders are Robust Data Augmentors](https://arxiv.org/abs/2206.04846)
- [Is Self-Supervised Learning More Robust Than Supervised Learning?](https://arxiv.org/abs/2206.05259)
- [Can CNNs Be More Robust Than Transformers?](https://arxiv.org/abs/2206.03452) [[code]](https://github.com/UCSC-VLAA/RobustCNN)
- [Patch-level Representation Learning for Self-supervised Vision Transformers](https://arxiv.org/abs/2206.07990) [[code]](https://github.com/alinlab/selfpatch)
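VICReg, listed in this section, regularizes the two embedding views with three terms: an invariance term (MSE between views), a variance hinge that keeps each embedding dimension's standard deviation above a margin, and a covariance penalty on off-diagonal entries that decorrelates dimensions. A simplified NumPy sketch of the three terms; this is not the authors' implementation, and the loss weights that combine the terms are omitted:

```python
import numpy as np

def vicreg_terms(z1, z2, eps=1e-4):
    """Return the (invariance, variance, covariance) terms of VICReg
    for two batches of embeddings z1, z2 with shape (N, D)."""
    # Invariance: mean squared error between the two views.
    inv = np.mean((z1 - z2) ** 2)

    def var_term(z):
        # Hinge on per-dimension std: push every dimension's std above 1.
        std = np.sqrt(z.var(axis=0) + eps)
        return np.mean(np.maximum(0.0, 1.0 - std))

    def cov_term(z):
        # Penalize squared off-diagonal entries of the covariance matrix.
        zc = z - z.mean(axis=0)
        cov = (zc.T @ zc) / (len(z) - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return (off_diag ** 2).sum() / z.shape[1]

    var = var_term(z1) + var_term(z2)
    cov = cov_term(z1) + cov_term(z2)
    return inv, var, cov
```

The variance and covariance terms are what prevent the trivial collapsed solution without needing negatives, momentum encoders, or stop-gradients.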
## 2020
- [Bootstrap your own latent: A new approach to self-supervised Learning](https://arxiv.org/abs/2006.07733) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/byol.ipynb)
- [A Simple Framework for Contrastive Learning of Visual Representations](https://arxiv.org/abs/2002.05709) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/simclr.ipynb)
- [Unsupervised Learning of Visual Features by Contrasting Cluster Assignments](https://arxiv.org/abs/2006.09882) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/swav.ipynb)
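The contrastive methods in this list (SimCLR here, MoCo below) share the InfoNCE family of objectives: two augmented views of the same image are pulled together while the other samples in the batch act as negatives. A minimal NumPy sketch of SimCLR's NT-Xent loss, written for clarity rather than speed; it is illustrative, not the reference implementation:

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) over a batch.

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    """
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity prep
    sim = z @ z.T / temperature                       # (2N, 2N) scaled sims
    n = z1.shape[0]
    # The positive for row i is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # Cross-entropy per row with the positive as the target class.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return np.mean(logsumexp - sim[np.arange(2 * n), pos])
```

Because the positive logit is one of the terms inside the log-sum-exp, the loss is non-negative and shrinks as matched views become more similar than the in-batch negatives.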
## 2019
- [Momentum Contrast for Unsupervised Visual Representation Learning](https://arxiv.org/abs/1911.05722) [[Colab]](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/moco.ipynb)
## 2018
- [Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination](https://arxiv.org/abs/1805.01978)
## 2016
- [Context Encoders: Feature Learning by Inpainting](https://arxiv.org/abs/1604.07379)
## Uncategorized
- Lightly**Train**
- Lightly**SSL** - A focused collection of state-of-the-art self-supervised training methods.