Awesome-Self-Supervised-Papers
Paper bank for Self-Supervised Learning
https://github.com/dev-sungman/Awesome-Self-Supervised-Papers
Pretraining / Feature / Representation
Contrastive Learning
- Dimensionality Reduction by Learning an Invariant Mapping
- Representation Learning with Contrastive Predictive Coding (CPC)
- Momentum Contrast for Unsupervised Visual Representation Learning (MoCo)
- Data-Efficient Image Recognition with Contrastive Predictive Coding (CPC v2)
- Contrastive Multiview Coding (CMC)
- A Simple Framework for Contrastive Learning of Visual Representations (SimCLR)
- Improved Baselines with Momentum Contrastive Learning (MoCo v2)
- Rethinking Image Mixture for Unsupervised Visual Representation Learning
- Feature Lenses: Plug-and-play Neural Modules for Transformation-Invariant Visual Representations
- Big Self-Supervised Models are Strong Semi-Supervised Learners (SimCLR v2)
- Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (BYOL)
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments (SwAV)
- What Should Not Be Contrastive in Contrastive Learning
- Debiased Contrastive Learning
- A Framework for Contrastive Self-Supervised Learning and Designing a New Approach
- Self-Supervised Representation Learning via Adaptive Hard-Positive Mining (77.3%)
- Contrastive Representation Learning: A Framework and Review
- EqCo: Equivalent Rules for Self-Supervised Contrastive Learning
- Hard Negative Mixing for Contrastive Learning
- Exploring Simple Siamese Representation Learning (SimSiam)
- Are all negatives created equal in contrastive instance discrimination?
- Big Self-Supervised Models Advance Medical Image Classification (CheXpert, ResNet-152 (2x))
- Contrastive Learning Inverts the Data Generating Process
- Barlow Twins: Self-Supervised Learning via Redundancy Reduction
- An Empirical Study of Training Self-Supervised Vision Transformers
- Unsupervised Learning of Dense Visual Representations
- Dense Contrastive Learning for Self-Supervised Visual Pre-Training
- Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
- Instance Localization for Self-supervised Detection Pretraining
- Spatially Consistent Representation Learning
- Efficient Visual Pretraining with Contrastive Detection
- Aligning Pretraining for Detection via Object-Level Contrastive Learning
- Self-supervised Pretraining of Visual Features in the Wild
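Most methods in this list (SimCLR, MoCo, CPC and their variants) share one training signal: pull two augmented views of the same image together and push all other images away with an InfoNCE-style loss. Below is a minimal NT-Xent sketch in PyTorch, intended only as an illustration of that shared idea; the function name, temperature, and toy inputs are assumptions, not any single paper's official code.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """SimCLR-style NT-Xent: two views per image; every other sample acts as a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)             # (2N, D) stacked projections
    sim = z @ z.t() / temperature               # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))           # an embedding is never its own positive
    n = z1.size(0)
    # The positive for row i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Toy usage: random tensors stand in for an encoder's projection outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent_loss(z1, z2).item())
```

Negative-centric papers above (Hard Negative Mixing, Debiased Contrastive Learning) modify how the denominator of this loss is populated, while BYOL, SimSiam, and SwAV replace the explicit negatives entirely.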
Image Transformation
- Colorful Image Colorization (Colorization)
- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
- Unsupervised Feature Learning via Non-Parametric Instance Discrimination (NPID, NPID++)
- Boosting Self-Supervised Learning via Knowledge Transfer (Jigsaw++)
- Self-Supervised Learning of Pretext-Invariant Representations (PIRL)
- Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics
- Multi-modal Self-Supervision from Generalized Data Transformations
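These transformation-based pretext tasks (colorization, jigsaw puzzles, PIRL) get their labels for free from the transformation itself. The NumPy sketch below shows a jigsaw-style training pair: shuffle 3x3 tiles with a permutation drawn from a fixed set and let the label be the permutation index. The grid size, permutation count, and helper name are illustrative assumptions, not a specific paper's protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
# A small fixed permutation set; the jigsaw papers use on the order of 100
# maximally distinct permutations, here we just sample a few for illustration.
PERMUTATIONS = [rng.permutation(9) for _ in range(10)]

def jigsaw_example(image: np.ndarray, grid: int = 3):
    """Return (shuffled_tiles, permutation_index) as a self-supervised training pair."""
    h, w = image.shape[:2]
    th, tw = h // grid, w // grid
    tiles = [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
             for r in range(grid) for c in range(grid)]
    label = int(rng.integers(len(PERMUTATIONS)))
    shuffled = [tiles[i] for i in PERMUTATIONS[label]]
    return np.stack(shuffled), label  # a classifier is trained to predict `label`

tiles, label = jigsaw_example(np.zeros((96, 96, 3), dtype=np.float32))
print(tiles.shape, label)  # (9, 32, 32, 3) plus the permutation index
```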
Self-supervised learning with Knowledge Distillation
- CompRess: Self-Supervised Learning by Compressing Representations
- SEED: Self-Supervised Distillation for Visual Representation
- DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning
- Distill on the Go: Online knowledge distillation in self-supervised learning
- Emerging Properties in Self-Supervised Vision Transformers
- iBOT: Image BERT Pre-Training with Online Tokenizer
- Simple Distillation Baselines for Improving Small Self-supervised Models
- Bag of Instances Aggregation Boosts Self-supervised Learning
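Most entries in this group (CompRess, SEED, DisCo) distill a large self-supervised teacher into a small student without labels, typically by making the student reproduce the teacher's similarity distribution over a set of anchor embeddings. A hedged PyTorch sketch of that idea follows; the queue, temperatures, and shapes are illustrative assumptions rather than any paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def similarity_distillation_loss(student_z, teacher_z, queue,
                                 t_student=0.2, t_teacher=0.07):
    """Student mimics the teacher's softmax similarities against a shared queue."""
    student_z = F.normalize(student_z, dim=1)
    teacher_z = F.normalize(teacher_z, dim=1)
    queue = F.normalize(queue, dim=1)                      # (K, D) anchor embeddings
    p_teacher = F.softmax(teacher_z @ queue.t() / t_teacher, dim=1)
    log_p_student = F.log_softmax(student_z @ queue.t() / t_student, dim=1)
    # Cross-entropy between the two distributions; the teacher provides no gradient.
    return -(p_teacher.detach() * log_p_student).sum(dim=1).mean()

# Toy usage: random embeddings stand in for teacher/student encoder outputs.
loss = similarity_distillation_loss(torch.randn(16, 64), torch.randn(16, 64),
                                    queue=torch.randn(1024, 64))
print(loss.item())
```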
Others (in Pretraining / Feature / Representation)
- Unsupervised Representation Learning by Predicting Image Rotations
- Mutual Information Neural Estimation
- Wasserstein Dependency Measure for Representation Learning
- Learning Deep Representations by Mutual Information Estimation and Maximization
- Local Aggregation for Unsupervised Learning of Visual Embeddings
- Learning Representations by Maximizing Mutual Information Across Views
- Large Scale Adversarial Representation Learning (BigBiGAN)
- On Mutual Information Maximization for Representation Learning
- How Useful is Self-Supervised Pretraining for Visual Tasks?
- Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
- Self-Labeling via Simultaneous Clustering and Representation Learning
- Big Transfer (BiT): General Visual Representation Learning
- Evaluating Self-Supervised Pretraining Without Using Labels
- Understanding Self-Supervised Learning with Dual Deep Networks
- Representation Learning via Invariant Causal Mechanisms
- Rethinking Pre-training and Self-training
- Self-Tuning for Data-Efficient Deep Learning
- Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction
- Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification
- Improving Unsupervised Image Clustering With Robust Learning
- How Well Do Self-Supervised Models Transfer?
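The rotation-prediction entry at the top of this group is one of the simplest pretext tasks: rotate each image by 0/90/180/270 degrees and train a 4-way classifier to recognize the rotation. A minimal NumPy sketch of how such training pairs can be generated is shown below; the helper name is ours, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotation_example(image: np.ndarray):
    """Return (rotated_image, k) where k in {0, 1, 2, 3} encodes a 0/90/180/270 deg rotation."""
    k = int(rng.integers(4))
    return np.rot90(image, k=k, axes=(0, 1)).copy(), k  # the label comes for free

rotated, label = rotation_example(np.zeros((32, 32, 3), dtype=np.float32))
print(rotated.shape, label)  # a 4-way classification target with no human annotation
```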
Identification / Verification / Classification / Recognition
- Real-world Person Re-Identification via Degradation Invariance Learning (CUHK03: 85.7% R@1)
- Spatially Attentive Output Layer for Image Classification
- Look-into-Object: Self-supervised Structure Modeling for Object Recognition (Top-1 err: 22.87)
Segmentation / Depth Estimation
- Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
- Towards Better Generalization: Joint Depth-Pose Learning without PoseNet
- Monocular Depth Estimation with Self-supervised Instance Adaptation
- Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera
- Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision (Cityscapes: mIoU 46.3)
- D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
- Self-Supervised Human Depth Estimation from Monocular Videos
- Calibrating Self-supervised Monocular Depth Estimation
Detection / Localization
Generation
- StyleRig: Rigging StyleGAN for 3D Control over Portrait Images
- From Inference to Generation: End-to-End Fully Self-Supervised Generation of Human Face from Speech
- Neutral Face Game Character Auto-Creation via PokerFace-GAN
- Self-Supervised Variational Auto-Encoders
Video
- A Review on Deep Learning Techniques for Video Prediction
- Self-Supervised Learning of Video-Induced Visual Invariances
- Video Representation Learning by Recognizing Temporal Transformations (UCF101)
- Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework (UCF101)
- Space-Time Correspondence as a Contrastive Random Walk
Others
- Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching
- Self-Supervised Viewpoint Learning From Image Collections
- Self-Supervised Scene De-occlusion
- Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
- SpeedNet: Learning the Speediness in Videos
- Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
- MVP: Unified Motion and Visual Self-Supervised Learning for Large-Scale Robotic Navigation
- Active Perception and Representation for Robotic Manipulation
- Supervised Contrastive Learning
- Learning from Scale-Invariant Examples for Domain Adaptation in Semantic Segmentation
- On the Effectiveness of Image Rotation for Open Set Domain Adaptation
- LIMP: Learning Latent Shape Representations with Metric Preservation Priors
- Learning to Scale Multilingual Representations for Vision-Language Tasks (MSCOCO: 81.5)
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
- Knowledge Distillation Meets Self-Supervision (MobileNetV2: 72.57% Top-1)
- Fast and Robust Face-to-Parameter Translation for Game Character Auto-Creation
- Domain-invariant Similarity Activation Map Metric Learning for Retrieval-based Long-term Visual Localization
- Self-Supervised Learning for Large-Scale Unsupervised Image Clustering
- SSD: A Unified Framework for Self-Supervised Outlier Detection
- Improving BERT with Self-Supervised Attention
- PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation
- TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
- ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations
- Learning to Compare for Better Training and Evaluation of Open Domain Natural Language Generation Models
- Contrastive Self-Supervised Learning for Commonsense Reasoning
- vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
- Effectiveness of Self-Supervised Pre-Training for Speech Recognition
- Generative Pre-Training for Speech with Autoregressive Predictive Coding
- Distilled Semantics for Comprehensive Scene Understanding from Videos
- Jointly Fine-Tuning “BERT-like” Self Supervised Models to Improve Multimodal Speech Emotion Recognition
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions
Graph
- Contrastive Self-supervised Learning for Graph Classification
- Towards Robust Graph Contrastive Learning
Reinforcement Learning
- Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning (catch: 821±17, Random Initialization / DrQ+PSEs)