
# Advanced Deep Learning @ KAIST

## Course Information
**Instructor:** Sung Ju Hwang ([email protected])
**TAs:** Seul Lee ([email protected]) and Jaehyeong Jo ([email protected])

This is an on/offline hybrid course.
Building Number 9, Room 9201 (Instructor), 2nd floor (TAs)
Office hours: By appointment only.

### Grading Policy
* **Absolute Grading**
* Paper Presentation: 20%
* Attendance and Participation: 20%
* Project: 60%

## Tentative Schedule

| Dates | Topic |
|---|---|
|8/29| Course Introduction |
|9/1| Review of Deep Learning Basics (Video Lecture) |
|9/6| Vision Transformers (Lecture) |
|9/8| Vision Transformers / Self-Supervised Learning (Lecture) |
|9/13| Self-Supervised Learning (Lecture) |
|9/15| Self-Supervised Learning (Presentation) |
|9/20| Bayesian Deep Learning - Bayesian ML Basics, Bayesian Neural Networks (Lecture) |
|9/22| Bayesian Deep Learning - Bayesian Approximations, Uncertainties in Prediction (Lecture) |
|9/27| Bayesian Deep Learning - MCMC Sampling for Bayesian Inference, Neural Processes (Lecture) |
|9/29| Bayesian Deep Learning (Presentation) |
|10/4| Deep Generative Models - Advanced GANs (Lecture) |
|10/6| Deep Generative Models - Advanced GANs (Presentation) **Initial Proposal Due**|
|10/11| Deep Generative Models - Diffusion Models (Lecture) |
|10/13| Deep Generative Models - Diffusion Models (Lecture) |
|10/18| Deep Generative Models - Diffusion Models (Presentation) |
|10/20| **Mid-term Presentation** |
|10/25| Large Language Models (Lecture) |
|10/27| Multimodal Generative Models (Lecture) |
|11/1| Large Language Models and Multimodal Generative Models (Presentation) |
|11/3| Deep Reinforcement Learning - Deep RL Basics (Lecture) |
|11/8| Deep Reinforcement Learning - Policy-based RL, Model-based RL (Lecture) |
|11/10| Deep Reinforcement Learning - Offline RL, Exploration, RL via Sequence Modeling (Lecture) |
|11/15| Deep Reinforcement Learning (Presentation) |
|11/17| Meta Learning (Lecture) |
|11/22| Meta Learning (Presentation) |
|11/24| Continual Learning (Lecture) |
|11/29| Continual Learning (Presentation) |
|12/1| Robust Deep Learning (Lecture) |
|12/6| Robust Deep Learning (Presentation) |
|12/8| Deep Graph Learning (Lecture) |
|12/13| Deep Graph Learning (Presentation) |
|12/15| **Final Presentation** |

## Reading List

### Vision Transformers
[[Dosovitskiy et al. 21]]( An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021.
[[Touvron et al. 21]]( Training Data-efficient Image transformers & Distillation through Attention, ICML 2021.
[[Liu et al. 21]]( Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, ICCV 2021.
[[Wu et al. 21]]( CvT: Introducing Convolutions to Vision Transformers, ICCV 2021.
[[Dai et al. 21]]( CoAtNet: Marrying Convolution and Attention for All Data Sizes, NeurIPS 2021.
[[Yang et al. 21]]( Focal Attention for Long-Range Interactions in Vision Transformers, NeurIPS 2021.
[[El-Nouby et al. 21]]( XCiT: Cross-Covariance Image Transformers, NeurIPS 2021.
[[Li et al. 22]]( MViTv2: Improved Multiscale Vision Transformers for Classification and Detection, CVPR 2022.
[[Lee et al. 22]]( MPViT : Multi-Path Vision Transformer for Dense Prediction, CVPR 2022.
[[Liu et al. 22]]( A ConvNet for the 2020s, CVPR 2022.

### Self-Supervised Learning

[[Dosovitskiy et al. 14]]( Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS 2014.
[[Pathak et al. 16]]( Context Encoders: Feature Learning by Inpainting, CVPR 2016.
[[Noroozi and Favaro 16]]( Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016.
[[Gidaris et al. 18]]( Unsupervised Representation Learning by Predicting Image Rotations, ICLR 2018.
[[He et al. 20]]( Momentum Contrast for Unsupervised Visual Representation Learning, CVPR 2020.
[[Chen et al. 20]]( A Simple Framework for Contrastive Learning of Visual Representations, ICML 2020.
[[Mikolov et al. 13]]( Efficient Estimation of Word Representations in Vector Space, ICLR 2013.
[[Devlin et al. 19]]( BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019.
[[Clark et al. 20]]( ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, ICLR 2020.
[[Hu et al. 20]]( Strategies for Pre-training Graph Neural Networks, ICLR 2020.
[[Chen et al. 20]]( Generative Pretraining from Pixels, ICML 2020.
[[Laskin et al. 20]]( CURL: Contrastive Unsupervised Representations for Reinforcement Learning, ICML 2020.
[[Grill et al. 20]]( Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, NeurIPS 2020.
[[Chen et al. 20]]( Big Self-Supervised Models are Strong Semi-Supervised Learners, NeurIPS 2020.
[[Chen and He. 21]]( Exploring Simple Siamese Representation Learning, CVPR 2021.
[[Tian et al. 21]]( Understanding Self-Supervised Learning Dynamics without Contrastive Pairs, ICML 2021.
[[Caron et al. 21]]( Emerging Properties in Self-Supervised Vision Transformers, ICCV 2021.
[[Liu et al. 22]]( Self-supervised Learning is More Robust to Dataset Imbalance, ICLR 2022.
[[Bao et al. 22]]( BEiT: BERT Pre-Training of Image Transformers, ICLR 2022.
[[He et al. 22]]( Masked Autoencoders are Scalable Vision Learners, CVPR 2022.
[[Liu et al. 22]]( Improving Contrastive Learning with Model Augmentation, arXiv preprint, 2022.
[[Touvron et al. 22]]( DeiT III: Revenge of the ViT, arXiv preprint, 2022.

### Bayesian Deep Learning
[[Kingma and Welling 14]]( Auto-Encoding Variational Bayes, ICLR 2014.
[[Kingma et al. 15]]( Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[[Blundell et al. 15]]( Weight Uncertainty in Neural Networks, ICML 2015.
[[Gal and Ghahramani 16]]( Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.
[[Liu et al. 16]]( Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, NIPS 2016.
[[Mandt et al. 17]]( Stochastic Gradient Descent as Approximate Bayesian Inference, JMLR 2017.
[[Kendall and Gal 17]]( What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, NIPS 2017.
[[Gal et al. 17]]( Concrete Dropout, NIPS 2017.
[[Gal et al. 17]]( Deep Bayesian Active Learning with Image Data, ICML 2017.
[[Teye et al. 18]]( Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.
[[Garnelo et al. 18]]( Conditional Neural Processes, ICML 2018.
[[Kim et al. 19]](http:// Attentive Neural Processes, ICLR 2019.
[[Sun et al. 19]]( Functional Variational Bayesian Neural Networks, ICLR 2019.
[[Louizos et al. 19]]( The Functional Neural Process, NeurIPS 2019.
[[Zhang et al. 20]]( Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning, ICLR 2020.
[[Amersfoort et al. 20]]( Uncertainty Estimation Using a Single Deep Deterministic Neural Network, ICML 2020.
[[Dusenberry et al. 20]]( Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors, ICML 2020.
[[Wenzel et al. 20]]( How Good is the Bayes Posterior in Deep Neural Networks Really?, ICML 2020.
[[Lee et al. 20]]( Bootstrapping Neural Processes, NeurIPS 2020.
[[Wilson et al. 20]]( Bayesian Deep Learning and a Probabilistic Perspective of Generalization, NeurIPS 2020.
[[Izmailov et al. 21]]( What Are Bayesian Neural Network Posteriors Really Like?, ICML 2021.
[[Daxberger et al. 21]]( Bayesian Deep Learning via Subnetwork Inference, ICML 2021.
[[Fortuin et al. 22]]( Bayesian Neural Network Priors Revisited, ICLR 2022.
[[Muller et al. 22]]( Transformers Can Do Bayesian Inference, ICLR 2022.
[[Nguyen and Grover 22]]( Transformer Neural Processes, ICML 2022.
[[Nazaret and Blei 22]]( Variational Inference for Infinitely Deep Neural Networks, ICML 2022.
[[Lotfi et al. 22]]( Bayesian Model Selection, the Marginal Likelihood, and Generalization, ICML 2022.
[[Alexos et al. 22]]( Structured Stochastic Gradient MCMC, ICML 2022.

### Deep Generative Models
#### VAEs, Autoregressive and Flow-Based Generative Models
[[Rezende and Mohamed 15]]( Variational Inference with Normalizing Flows, ICML 2015.
[[Germain et al. 15]]( MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.
[[Kingma et al. 16]]( Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.
[[Oord et al. 16]]( Pixel Recurrent Neural Networks, ICML 2016.
[[Dinh et al. 17]]( Density Estimation Using Real NVP, ICLR 2017.
[[Papamakarios et al. 17]]( Masked Autoregressive Flow for Density Estimation, NIPS 2017.
[[Huang et al. 18]]( Neural Autoregressive Flows, ICML 2018.
[[Kingma and Dhariwal 18]]( Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.
[[Ho et al. 19]]( Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.
[[Chen et al. 19]]( Residual Flows for Invertible Generative Modeling, NeurIPS 2019.
[[Tran et al. 19]]( Discrete Flows: Invertible Generative Models of Discrete Data, NeurIPS 2019.
[[Ping et al. 20]]( WaveFlow: A Compact Flow-based Model for Raw Audio, ICML 2020.
[[Vahdat and Kautz 20]]( NVAE: A Deep Hierarchical Variational Autoencoder, NeurIPS 2020.
[[Ho et al. 20]]( Denoising Diffusion Probabilistic Models, NeurIPS 2020.
[[Song et al. 21]]( Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021.
[[Kosiorek et al. 21]]( NeRF-VAE: A Geometry Aware 3D Scene Generative Model, ICML 2021.

#### Generative Adversarial Networks
[[Goodfellow et al. 14]]( Generative Adversarial Nets, NIPS 2014.
[[Radford et al. 15]]( Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[[Chen et al. 16]]( InfoGAN: Interpreting Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[[Arjovsky et al. 17]]( Wasserstein Generative Adversarial Networks, ICML 2017.
[[Zhu et al. 17]]( Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[[Zhang et al. 17]]( Adversarial Feature Matching for Text Generation, ICML 2017.
[[Karras et al. 18]]( Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.
[[Choi et al. 18]]( StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018.
[[Brock et al. 19]]( Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[[Karras et al. 19]]( A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[[Karras et al. 20]]( Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.
[[Sinha et al. 20]]( Small-GAN: Speeding up GAN Training using Core-Sets, ICML 2020.
[[Karras et al. 20]]( Training Generative Adversarial Networks with Limited Data, NeurIPS 2020.
[[Liu et al. 21]]( Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis, ICLR 2021.
[[Esser et al. 21]]( Taming Transformers for High-Resolution Image Synthesis, CVPR 2021.
[[Hudson and Zitnick 21]]( Generative Adversarial Transformers, ICML 2021.
[[Karras et al. 21]]( Alias-Free Generative Adversarial Networks, NeurIPS 2021.
[[Skorokhodov et al. 22]]( StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2, CVPR 2022.
[[Lin et al. 22]]( InfinityGAN: Towards Infinite-Pixel Image Synthesis, ICLR 2022.
[[Lee et al. 22]]( ViTGAN: Training GANs with Vision Transformers, ICLR 2022.
[[Yu et al. 22]]( Vector-Quantized Image Modeling with Improved VQGAN, ICLR 2022.
[[Franceschi et al. 22]]( A Neural Tangent Kernel Perspective of GANs, ICML 2022.

#### Diffusion Models
[[Song and Ermon 19]]( Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS 2019.
[[Song and Ermon 20]]( Improved Techniques for Training Score-Based Generative Models, NeurIPS 2020.
[[Ho et al. 20]]( Denoising Diffusion Probabilistic Models, NeurIPS 2020.
[[Song et al. 21]]( Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021.
[[Nichol and Dhariwal 21]]( Improved Denoising Diffusion Probabilistic Models, ICML 2021.
[[Vahdat et al. 21]]( Score-based Generative Modeling in Latent Space, NeurIPS 2021.
[[Dhariwal and Nichol 21]]( Diffusion Models Beat GANs on Image Synthesis, NeurIPS 2021.
[[De Bortoli et al. 21]]( Diffusion Schrodinger Bridge with Application to Score-Based Generative Modeling, NeurIPS 2021.
[[Ho and Salimans 22]]( Classifier-Free Diffusion Guidance, arXiv preprint, 2022.
[[Dockhorn et al. 22]]( Score-Based Generative Modeling with Critically-Damped Langevin Diffusion, ICLR 2022.
[[Salimans and Ho 22]]( Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022.
[[Chen et al. 22]]( Likelihood Training of Schrodinger Bridge using Forward-Backward SDEs Theory, ICLR 2022.

### Deep Reinforcement Learning
[[Mnih et al. 13]]( Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.
[[Silver et al. 14]]( Deterministic Policy Gradient Algorithms, ICML 2014.
[[Schulman et al. 15]]( Trust Region Policy Optimization, ICML 2015.
[[Lillicrap et al. 16]]( Continuous Control with Deep Reinforcement Learning, ICLR 2016.
[[Schaul et al. 16]]( Prioritized Experience Replay, ICLR 2016.
[[Wang et al. 16]]( Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.
[[Mnih et al. 16]]( Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.
[[Schulman et al. 17]]( Proximal Policy Optimization Algorithms, arXiv preprint, 2017.
[[Nachum et al. 18]]( Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.
[[Ha et al. 18]]( Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.
[[Burda et al. 19]]( Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.
[[Vinyals et al. 19]]( Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.
[[Bellemare et al. 19]]( A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.
[[Janner et al. 19]]( When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.
[[Fellows et al. 19]]( VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.
[[Kumar et al. 19]]( Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, NeurIPS 2019.
[[Kaiser et al. 20]]( Model Based Reinforcement Learning for Atari, ICLR 2020.
[[Agarwal et al. 20]]( An Optimistic Perspective on Offline Reinforcement Learning, ICML 2020.
[[Lee et al. 20]]( Batch Reinforcement Learning with Hyperparameter Gradients, ICML 2020.
[[Kumar et al. 20]]( Conservative Q-Learning for Offline Reinforcement Learning, ICML 2020.
[[Yarats et al. 21]]( Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, ICLR 2021.
[[Chen et al. 21]]( Decision Transformer: Reinforcement Learning via Sequence Modeling, NeurIPS 2021.
[[Mai et al. 22]]( Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation, ICLR 2022.
[[Furuta et al. 22]]( Generalized Decision Transformer for Offline Hindsight Information Matching, ICLR 2022.
[[Oh et al. 22]]( Model-augmented Prioritized Experience Replay, ICLR 2022.
[[Rengarajan et al. 22]]( Reinforcement Learning with Sparse Rewards Using Guidance from Offline Demonstration, ICLR 2022.
[[Patil et al. 22]]( Align-RUDDER: Learning from Few Demonstrations by Reward Redistribution, ICML 2022.
[[Goyal et al. 22]]( Retrieval Augmented Reinforcement Learning, ICML 2022.
[[Reed et al. 22]]( A Generalist Agent, arXiv preprint, 2022.

### Memory and Computation-Efficient Deep Learning
[[Han et al. 15]]( Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.
[[Wen et al. 16]]( Learning Structured Sparsity in Deep Neural Networks, NIPS 2016.
[[Han et al. 16]]( Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016.
[[Molchanov et al. 17]]( Variational Dropout Sparsifies Deep Neural Networks, ICML 2017.
[[Louizos et al. 17]]( Bayesian Compression for Deep Learning, NIPS 2017.
[[Louizos et al. 18]]( Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.
[[Howard et al. 18]]( MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, CVPR 2018.
[[Frankle and Carbin 19]](https:// The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.
[[Lee et al. 19]]( SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.
[[Liu et al. 19]]( Rethinking the Value of Network Pruning, ICLR 2019.
[[Jung et al. 19]]( Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.
[[Morcos et al. 19]]( One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.
[[Renda et al. 20]]( Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.
[[Frankle et al. 20]]( Linear Mode Connectivity and the Lottery Ticket Hypothesis, ICML 2020.
[[Tanaka et al. 20]]( Pruning Neural Networks without Any Data by Iteratively Conserving Synaptic Flow, NeurIPS 2020.
[[van Baalen et al. 20]]( Bayesian Bits: Unifying Quantization and Pruning, NeurIPS 2020.
[[de Jorge et al. 21]]( Progressive Skeletonization: Trimming more fat from a network at initialization, ICLR 2021.
[[Stock et al. 21]]( Training with Quantization Noise for Extreme Model Compression, ICLR 2021.
[[Lee et al. 21]]( Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization, ICCV 2021.

### Meta Learning
[[Santoro et al. 16]]( Meta-Learning with Memory-Augmented Neural Networks, ICML 2016.
[[Vinyals et al. 16]]( Matching Networks for One Shot Learning, NIPS 2016.
[[Edwards and Storkey 17]]( Towards a Neural Statistician, ICLR 2017.
[[Finn et al. 17]]( Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017.
[[Snell et al. 17]]( Prototypical Networks for Few-shot Learning, NIPS 2017.
[[Nichol et al. 18]]( On First-Order Meta-learning Algorithms, arXiv preprint, 2018.
[[Lee and Choi 18]]( Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.
[[Liu et al. 19]]( Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019.
[[Gordon et al. 19]]( Meta-Learning Probabilistic Inference for Prediction, ICLR 2019.
[[Ravi and Beatson 19]]( Amortized Bayesian Meta-Learning, ICLR 2019.
[[Rakelly et al. 19]]( Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.
[[Shu et al. 19]]( Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS 2019.
[[Finn et al. 19]]( Online Meta-Learning, ICML 2019.
[[Lee et al. 20]]( Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020.
[[Yin et al. 20]]( Meta-Learning without Memorization, ICLR 2020.
[[Raghu et al. 20]]( Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML, ICLR 2020.
[[Iakovleva et al. 20]]( Meta-Learning with Shared Amortized Variational Inference, ICML 2020.
[[Bronskill et al. 20]]( TaskNorm: Rethinking Batch Normalization for Meta-Learning, ICML 2020.
[[Rajendran et al. 20]]( Meta-Learning Requires Meta-Augmentation, NeurIPS 2020.
[[Lee et al. 21]]( Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning, ICLR 2021.
[[Shin et al. 21]]( Large-Scale Meta-Learning with Continual Trajectory Shifting, ICML 2021.
[[Acar et al. 21]]( Memory Efficient Online Meta Learning, ICML 2021.
[[Lee et al. 22]]( Online Hyperparameter Meta-Learning with Hypergradient Distillation, ICLR 2022.
[[Flennerhag et al. 22]]( Bootstrapped Meta-Learning, ICLR 2022.
[[Yao et al. 22]]( Meta-Learning with Fewer Tasks through Task Interpolation, ICLR 2022.
[[Guan and Lu 22]]( Task Relatedness-Based Generalization Bounds for Meta Learning, ICLR 2022.

### Continual Learning
[[Rusu et al. 16]]( Progressive Neural Networks, arXiv preprint, 2016.
[[Kirkpatrick et al. 17]]( Overcoming catastrophic forgetting in neural networks, PNAS 2017.
[[Lee et al. 17]]( Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017.
[[Shin et al. 17]]( Continual Learning with Deep Generative Replay, NIPS 2017.
[[Lopez-Paz and Ranzato 17]]( Gradient Episodic Memory for Continual Learning, NIPS 2017.
[[Yoon et al. 18]]( Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.
[[Nguyen et al. 18]]( Variational Continual Learning, ICLR 2018.
[[Schwarz et al. 18]]( Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.
[[Chaudhry et al. 19]]( Efficient Lifelong Learning with A-GEM, ICLR 2019.
[[Rao et al. 19]]( Continual Unsupervised Representation Learning, NeurIPS 2019.
[[Rolnick et al. 19]]( Experience Replay for Continual Learning, NeurIPS 2019.
[[Jerfel et al. 20]]( Reconciling Meta-Learning and Continual Learning with Online Mixtures of Tasks, NeurIPS 2019.
[[Yoon et al. 20]]( Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.
[[Ramasesh et al. 20]]( Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics, Continual Learning Workshop, ICML 2020.
[[Borsos et al. 20]]( Coresets via Bilevel Optimization for Continual Learning and Streaming, NeurIPS 2020.
[[Mirzadeh et al. 20]]( Understanding the Role of Training Regimes in Continual Learning, NeurIPS 2020.
[[Saha et al. 21]]( Gradient Projection Memory for Continual Learning, ICLR 2021.
[[Veniat et al. 21]]( Efficient Continual Learning with Modular Networks and Task-Driven Priors, ICLR 2021.
[[Madaan et al. 22]]( Representational Continuity for Unsupervised Continual Learning, ICLR 2022.
[[Yoon et al. 22]]( Online Coreset Selection for Rehearsal-based Continual Learning, ICLR 2022.
[[Lin et al. 22]]( TRGP: Trust Region Gradient Projection for Continual Learning, ICLR 2022.
[[Wang et al. 22]]( Improving Task-free Continual Learning by Distributionally Robust Memory Evolution, ICML 2022.
[[Kang et al. 22]]( Forget-free Continual Learning with Winning Subnetworks, ICML 2022.

### Interpretable Deep Learning
[[Ribeiro et al. 16]]( "Why Should I Trust You?" Explaining the Predictions of Any Classifier, KDD 2016.
[[Kim et al. 16]]( Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016.
[[Choi et al. 16]]( RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016.
[[Koh et al. 17]]( Understanding Black-box Predictions via Influence Functions, ICML 2017.
[[Bau et al. 17]]( Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017.
[[Selvaraju et al. 17]]( Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, ICCV 2017.
[[Kim et al. 18]]( Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.
[[Heo et al. 18]]( Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.
[[Bau et al. 19]]( GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.
[[Ghorbani et al. 19]]( Towards Automatic Concept-based Explanations, NeurIPS 2019.
[[Coenen et al. 19]]( Visualizing and Measuring the Geometry of BERT, NeurIPS 2019.
[[Heo et al. 20]]( Cost-Effective Interactive Attention Learning with Neural Attention Processes, ICML 2020.
[[Agarwal et al. 20]]( Neural Additive Models: Interpretable Machine Learning with Neural Nets, arXiv preprint, 2020.

### Reliable Deep Learning
[[Guo et al. 17]]( On Calibration of Modern Neural Networks, ICML 2017.
[[Lakshminarayanan et al. 17]]( Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.
[[Liang et al. 18]]( Enhancing the Reliability of Out-of-distribution Image Detection in Neural Networks, ICLR 2018.
[[Lee et al. 18]]( Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.
[[Kuleshov et al. 18]]( Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.
[[Jiang et al. 18]]( To Trust Or Not To Trust A Classifier, NeurIPS 2018.
[[Madras et al. 18]]( Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.
[[Maddox et al. 19]]( A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.
[[Kull et al. 19]]( Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.
[[Thulasidasan et al. 19]]( On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.
[[Ovadia et al. 19]]( Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.
[[Hendrycks et al. 20]]( AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.
[[Filos et al. 20]]( Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, ICML 2020.

### Robust Deep Learning
[[Szegedy et al. 14]]( Intriguing Properties of Neural Networks, ICLR 2014.
[[Goodfellow et al. 15]]( Explaining and Harnessing Adversarial Examples, ICLR 2015.
[[Kurakin et al. 17]]( Adversarial Machine Learning at Scale, ICLR 2017.
[[Madry et al. 18]]( Towards Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.
[[Eykholt et al. 18]]( Robust Physical-World Attacks on Deep Learning Visual Classification, CVPR 2018.
[[Athalye et al. 18]]( Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.
[[Zhang et al. 19]]( Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.
[[Carmon et al. 19]]( Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.
[[Ilyas et al. 19]]( Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.
[[Li et al. 19]]( Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.
[[Tramèr and Boneh 19]]( Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.
[[Shafahi et al. 19]]( Adversarial Training for Free!, NeurIPS 2019.
[[Wong et al. 20]]( Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.
[[Madaan et al. 20]]( Adversarial Neural Pruning with Latent Vulnerability Suppression, ICML 2020.
[[Croce and Hein 20]]( Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks, ICML 2020.
[[Maini et al. 20]]( Adversarial Robustness Against the Union of Multiple Perturbation Models, ICML 2020.
[[Kim et al. 20]]( Adversarial Self-Supervised Contrastive Learning, NeurIPS 2020.
[[Wu et al. 20]]( Adversarial Weight Perturbation Helps Robust Generalization, NeurIPS 2020.
[[Laidlaw et al. 21]]( Perceptual Adversarial Robustness: Defense Against Unseen Threat Models, ICLR 2021.
[[Pang et al. 21]]( Bag of Tricks for Adversarial Training, ICLR 2021.
[[Madaan et al. 21]]( Learning to Generate Noise for Multi-Attack Robustness, ICML 2021.
[[Mladenovic et al. 22]]( Online Adversarial Attacks, ICLR 2022.
[[Zhang et al. 22]]( How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective, ICLR 2022.
[[Carlini and Terzis 22]]( Poisoning and Backdooring Contrastive Learning, ICLR 2022.
[[Croce et al. 22]]( Evaluating the Adversarial Robustness of Adaptive Test-time Defenses, ICML 2022.
[[Zhou et al. 22]]( Understanding the Robustness in Vision Transformers, ICML 2022.

### Graph Neural Networks
[[Li et al. 16]]( Gated Graph Sequence Neural Networks, ICLR 2016.
[[Hamilton et al. 17]]( Inductive Representation Learning on Large Graphs, NIPS 2017.
[[Kipf and Welling 17]]( Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.
[[Velickovic et al. 18]]( Graph Attention Networks, ICLR 2018.
[[Ying et al. 18]]( Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.
[[Xu et al. 19]]( How Powerful are Graph Neural Networks?, ICLR 2019.
[[Maron et al. 19]]( Provably Powerful Graph Networks, NeurIPS 2019.
[[Yun et al. 19]]( Graph Transformer Networks, NeurIPS 2019.
[[Loukas 20]]( What Graph Neural Networks Cannot Learn: Depth vs Width, ICLR 2020.
[[Bianchi et al. 20]]( Spectral Clustering with Graph Neural Networks for Graph Pooling, ICML 2020.
[[Xhonneux et al. 20]]( Continuous Graph Neural Networks, ICML 2020.
[[Garg et al. 20]]( Generalization and Representational Limits of Graph Neural Networks, ICML 2020.
[[Baek et al. 21]]( Accurate Learning of Graph Representations with Graph Multiset Pooling, ICLR 2021.
[[Liu et al. 21]]( Elastic Graph Neural Networks, ICML 2021.
[[Li et al. 21]]( Training Graph Neural Networks with 1000 Layers, ICML 2021.
[[Jo et al. 21]]( Edge Representation Learning with Hypergraphs, NeurIPS 2021.
[[Guo et al. 22]]( Data-Efficient Graph Grammar Learning for Molecular Generation, ICLR 2022.
[[Geerts et al. 22]]( Expressiveness and Approximation Properties of Graph Neural Networks, ICLR 2022.
[[Bevilacqua et al. 22]]( Equivariant Subgraph Aggregation Networks, ICLR 2022.
[[Jo et al. 22]]( Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations, ICML 2022.
[[Hoogeboom et al. 22]]( Equivariant Diffusion for Molecule Generation in 3D, ICML 2022.

### Federated Learning
[[Konečný et al. 16]]( Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv preprint, 2016.
[[Konečný et al. 16]]( Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.
[[McMahan et al. 17]]( Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.
[[Smith et al. 17]]( Federated Multi-Task Learning, NIPS 2017.
[[Li et al. 20]]( Federated Optimization in Heterogeneous Networks, MLSys 2020.
[[Yurochkin et al. 19]]( Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.
[[Bonawitz et al. 19]]( Towards Federated Learning at Scale: System Design, MLSys 2019.
[[Wang et al. 20]]( Federated Learning with Matched Averaging, ICLR 2020.
[[Li et al. 20]]( On the Convergence of FedAvg on Non-IID data, ICLR 2020.
[[Karimireddy et al. 20]]( SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, ICML 2020.
[[Hamer et al. 20]]( FedBoost: Communication-Efficient Algorithms for Federated Learning, ICML 2020.
[[Rothchild et al. 20]]( FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML 2020.
[[Fallah et al. 21]]( Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach, NeurIPS 2020.
[[Reddi et al. 21]]( Adaptive Federated Optimization, ICLR 2021.
[[Jeong et al. 21]]( Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning, ICLR 2021.
[[Yoon et al. 21]]( Federated Continual Learning with Weighted Inter-client Transfer, ICML 2021.
[[Li et al. 21]]( Ditto: Fair and Robust Federated Learning Through Personalization, ICML 2021.

### Neural Architecture Search
[[Zoph and Le 17]]( Neural Architecture Search with Reinforcement Learning, ICLR 2017.
[[Baker et al. 17]]( Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.
[[Real et al. 17]]( Large-Scale Evolution of Image Classifiers, ICML 2017.
[[Liu et al. 18]]( Hierarchical Representations for Efficient Architecture Search, ICLR 2018.
[[Pham et al. 18]]( Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.
[[Luo et al. 18]]( Neural Architecture Optimization, NeurIPS 2018.
[[Liu et al. 19]]( DARTS: Differentiable Architecture Search, ICLR 2019.
[[Tan et al. 19]]( MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.
[[Cai et al. 19]]( ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.
[[Zhou et al. 19]]( BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.
[[Tan and Le 19]]( EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
[[Guo et al. 19]]( NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.
[[Chen et al. 19]]( DetNAS: Backbone Search for Object Detection, NeurIPS 2019.
[[Dong and Yang 20]]( NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.
[[Zela et al. 20]]( Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.
[[Cai et al. 20]]( Once-for-All: Train One Network and Specialize it for Efficient Deployment, ICLR 2020.
[[Such et al. 20]]( Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, ICML 2020.
[[Liu et al. 20]]( Are Labels Necessary for Neural Architecture Search?, ECCV 2020.
[[Dudziak et al. 20]]( BRP-NAS: Prediction-based NAS using GCNs, NeurIPS 2020.
[[Li et al. 20]]( Neural Architecture Search in A Proxy Validation Loss Landscape, ICML 2020.
[[Lee et al. 21]]( Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets, ICLR 2021.
[[Mellor et al. 21]]( Neural Architecture Search without Training, ICML 2021.

### Large Language Models
[[Shoeybi et al. 19]]( Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, arXiv preprint, 2019.
[[Raffel et al. 20]]( Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, JMLR 2020.
[[Gururangan et al. 20]]( Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks, ACL 2020.
[[Brown et al. 20]]( Language Models are Few-shot Learners, NeurIPS 2020.
[[Rae et al. 21]]( Scaling Language Models: Methods, Analysis & Insights from Training Gopher, arXiv preprint, 2021.
[[Thoppilan et al. 22]]( LaMDA: Language Models for Dialog Applications, arXiv preprint, 2022.
[[Wei et al. 22]]( Finetuned Language Models Are Zero-Shot Learners, ICLR 2022.
[[Wang et al. 22]]( Language Modeling via Stochastic Processes, ICLR 2022.
[[Alayrac et al. 22]]( Flamingo: a Visual Language Model for Few-Shot Learning, arXiv preprint, 2022.
[[Chowdhery et al. 22]]( PaLM: Scaling Language Modeling with Pathways, arXiv preprint, 2022.
[[Wei et al. 22]]( Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022.

### Multimodal Generative Models
[[Li et al. 19]]( Controllable Text-to-Image Generation, NeurIPS 2019.
[[Ramesh et al. 21]]( Zero-Shot Text-to-Image Generation, ICML 2021.
[[Radford et al. 21]]( Learning Transferable Visual Models From Natural Language Supervision, ICML 2021.
[[Ding et al. 21]]( CogView: Mastering Text-to-Image Generation via Transformers, NeurIPS 2021.
[[Zou et al. 22]]( Towards Language-Free Training for Text-to-Image Generation, CVPR 2022.
[[Rombach et al. 22]]( High-Resolution Image Synthesis with Latent Diffusion Models, CVPR 2022.
[[Nichol et al. 22]]( GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models, ICML 2022.
[[Saharia et al. 22]]( Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, arXiv preprint, 2022.
[[Yu et al. 22]]( Scaling Autoregressive Models for Content-Rich Text-to-Image Generation, arXiv preprint, 2022.