# Feature Learning in Deep Learning Theory Reading Group

## Introduction

Welcome to the GitHub repository of our Feature Learning in Deep Learning Theory Reading Group! This group is dedicated to the study, discussion, and understanding of feature learning concepts and techniques in the field of Deep Learning.

## Objective

Our objective is to bring together researchers, professionals, students, and anyone interested in feature learning, to learn from each other, discuss recent advancements and challenges, and contribute to the knowledge pool of Deep Learning Theory.

## Participation

We warmly invite anyone interested to join us. To participate:

1. **Follow this Repository**: Keep up to date with the reading materials we will be discussing.
2. **Discussion**: Participate in discussions on the `Issues` tab. Each paper will have a dedicated issue where the discussion will take place.

## Reading List

The reading list is updated on a weekly or bi-weekly basis with the papers and articles we plan to discuss. It is organized by topic below.

### Classification

- Towards Understanding **Ensemble**, **Knowledge Distillation** and **Self-Distillation** in Deep Learning, *ICLR 2023*. [(link)](https://arxiv.org/abs/2012.09816)

Zeyuan Allen-Zhu, Yuanzhi Li

- Feature purification: How **adversarial training** performs robust deep learning, *FOCS 2021*. [(link)](https://arxiv.org/abs/2005.10190)

Zeyuan Allen-Zhu, Yuanzhi Li

- Toward understanding the feature learning process of self-supervised **contrastive learning**, *ICML 2021*. [(link)](https://arxiv.org/abs/2105.15134)

Zixin Wen, Yuanzhi Li

- Benign Overfitting in Two-layer **Convolutional Neural Networks**, *NeurIPS 2022*. [(link)](https://arxiv.org/abs/2202.06526) [(video)](https://www.youtube.com/watch?v=n_F17KVDQHI)

Yuan Cao, Zixiang Chen, Mikhail Belkin, Quanquan Gu

- **Graph Neural Networks** Provably Benefit from Structural Information: A Feature Learning Perspective, *ICML 2023 Workshop, Contributed Talk*. [(link)](https://arxiv.org/abs/2306.13926)

Wei Huang, Yuan Cao, Haonan Wang, Xin Cao, Taiji Suzuki.

- Towards Understanding **Mixture of Experts** in Deep Learning, *NeurIPS 2022*. [(link)](https://arxiv.org/abs/2208.02813)

Zixiang Chen, Yihe Deng, Yue Wu, Quanquan Gu, Yuanzhi Li

- Understanding the Generalization of **Adam** in Learning Neural Networks with Proper Regularization, *ICLR 2023* [(link)](https://arxiv.org/abs/2108.11371)

Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

- Towards Understanding Feature Learning in **Out-of-Distribution** Generalization, *NeurIPS 2023* [(link)](https://arxiv.org/abs/2304.11327)

Wei Huang*, Yongqiang Chen*, Kaiwen Zhou*, Yatao Bian, Bo Han, James Cheng

- Benign Overfitting for Two-layer **ReLU Networks**, *ICML 2023*. [(link)](https://arxiv.org/pdf/2303.04145.pdf)

Yiwen Kou*, Zixiang Chen*, Yuanzhou Chen, Quanquan Gu

- **Vision Transformers** provably learn spatial structure, *NeurIPS 2022*. [(link)](https://arxiv.org/abs/2210.09221)

Samy Jelassi, Michael E. Sander, Yuanzhi Li

- **Data Augmentation** as Feature Manipulation, *ICML 2022*. [(link)](https://arxiv.org/abs/2203.01572)

Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar

- Towards understanding how **momentum** improves generalization in deep learning, *ICML 2022*. [(link)](https://arxiv.org/abs/2207.05931)

Samy Jelassi, Yuanzhi Li

- The Benefits of **Mixup** for Feature Learning. [(link)](https://arxiv.org/abs/2303.08433)

Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

- **Pruning Before Training** May Improve Generalization, Provably. [(link)](https://arxiv.org/abs/2301.00335)

Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang

- A Theoretical Understanding of Shallow **Vision Transformers**: Learning, Generalization, and Sample Complexity, *ICLR 2023*. [(link)](https://arxiv.org/abs/2302.06015)

Hongkang Li, Meng Wang, Sijia Liu, Pin-yu Chen

- Provably Learning Diverse Features in Multi-View Data with Midpoint **Mixup**, *ICML 2023*. [(link)](https://proceedings.mlr.press/v202/chidambaram23a/chidambaram23a.pdf)

Muthu Chidambaram, Xiang Wang, Chenwei Wu, Rong Ge

- How Does **Semi-supervised** Learning with Pseudo-Labelers Work? A Case Study, *ICLR 2023*. [(link)](https://openreview.net/forum?id=Dzmd-Cc8OI)

Yiwen Kou, Zixiang Chen, Yuan Cao, Quanquan Gu

- Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels, *NeurIPS 2021*. [(link)](https://proceedings.neurips.cc/paper/2021/hash/d064bf1ad039ff366564f352226e7640-Abstract.html)

Stefani Karp, Ezra Winston, Yuanzhi Li, Aarti Singh

- Provable Guarantees for Neural Networks via Gradient Feature Learning, *NeurIPS 2023*.

Zhenmei Shi*, Junyi Wei*, Yingyu Liang

- Robust Learning with Progressive Data Expansion Against **Spurious Correlation**, *NeurIPS 2023*. [(link)](https://arxiv.org/abs/2306.04949)

Yihe Deng, Yu Yang, Baharan Mirzasoleiman, Quanquan Gu

- Understanding Transferable Representation Learning and Zero-shot Transfer in **CLIP**, [(link)](https://arxiv.org/pdf/2310.00927.pdf)

Zixiang Chen*, Yihe Deng*, Yuanzhi Li, Quanquan Gu

- Why Does **Sharpness-Aware Minimization** Generalize Better Than SGD? *NeurIPS 2023*, [(link)](https://nips.cc/virtual/2023/poster/72901)

Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu

- Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for **Noisy Linear Data**, *COLT 2022*, [(link)](https://proceedings.mlr.press/v178/frei22a/frei22a.pdf)

Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

- Random Feature Amplification: Feature Learning and Generalization in Neural Networks, *JMLR 2023*, [(link)](https://arxiv.org/abs/2202.07626)

Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett
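
Many of the analyses above (e.g., the benign-overfitting, Mixup, and data-augmentation papers) study variants of a signal-plus-noise patch data model, in which each input contains one patch carrying a label-aligned signal vector and the remaining patches carry pure Gaussian noise. The sketch below is a minimal generator of that flavor; the two-patch layout, dimensions, and signal strength are illustrative assumptions rather than the exact setting of any single paper.

```python
import numpy as np

def signal_noise_data(n, d, signal_strength=5.0, noise_std=1.0, seed=0):
    """Toy signal-plus-noise patch data (illustrative only).

    Each example has two patches of dimension d:
      - a signal patch  y * mu   (mu is a fixed direction, y in {-1, +1})
      - a noise patch   xi ~ N(0, noise_std^2 I)
    Returns X with shape (n, 2, d) and labels y with shape (n,).
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(d)
    mu[0] = signal_strength                  # fixed signal direction
    y = rng.choice([-1.0, 1.0], size=n)      # Rademacher labels
    signal = y[:, None] * mu[None, :]        # (n, d) signal patches
    noise = noise_std * rng.standard_normal((n, d))
    X = np.stack([signal, noise], axis=1)    # (n, 2, d)
    return X, y

X, y = signal_noise_data(n=8, d=16)
print(X.shape, y.shape)  # (8, 2, 16) (8,)
```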

### K-parities (XOR)

- Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data. [(link)](https://arxiv.org/abs/2310.02541)

Zhiwei Xu, Yutong Wang, Spencer Frei, Gal Vardi, Wei Hu

- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data. [(link)](https://arxiv.org/abs/2310.01975)

Xuran Meng, Difan Zou, Yuan Cao

- SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem. [(link)](https://arxiv.org/abs/2309.15111)

Margalit Glasgow
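
The papers in this subsection study XOR-style cluster data: four Gaussian clusters centered at ±μ₁ and ±μ₂ with μ₁ ⊥ μ₂, where the ±μ₁ clusters share one label and the ±μ₂ clusters the other, so no linear classifier can separate the classes. A rough generator under these assumptions (the specific norms and dimensions below are arbitrary, not taken from any one paper):

```python
import numpy as np

def xor_cluster_data(n, d, mean_norm=4.0, noise_std=1.0, seed=0):
    """Toy XOR cluster data (illustrative assumptions only).

    Clusters at +/- mu1 get label +1; clusters at +/- mu2 get label -1,
    with mu1 orthogonal to mu2, so the classes are not linearly separable.
    """
    rng = np.random.default_rng(seed)
    mu1, mu2 = np.zeros(d), np.zeros(d)
    mu1[0], mu2[1] = mean_norm, mean_norm        # orthogonal cluster means
    y = rng.choice([-1.0, 1.0], size=n)          # class label
    sign = rng.choice([-1.0, 1.0], size=n)       # which of the two clusters within a class
    centers = np.where(y[:, None] > 0, sign[:, None] * mu1, sign[:, None] * mu2)
    X = centers + noise_std * rng.standard_normal((n, d))
    return X, y

X, y = xor_cluster_data(n=8, d=16)
print(X.shape, y.shape)  # (8, 16) (8,)
```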

### Mean-Field Theory

- Feature learning via mean-field Langevin dynamics: classifying sparse parities and beyond, *NeurIPS 2023*

Taiji Suzuki, Denny Wu, Kazusato Oko, Atsushi Nitanda
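
The sparse-parity problem referenced in the title asks a network to predict the product of the signs of k unknown coordinates of a boolean input, a classic target that requires learning the right features. A quick data generator for reference (the choices of k and d are arbitrary):

```python
import numpy as np

def sparse_parity_data(n, d, k=3, seed=0):
    """k-sparse parity over {-1, +1}^d: the label is the product of k fixed coordinates."""
    rng = np.random.default_rng(seed)
    support = rng.choice(d, size=k, replace=False)   # hidden relevant coordinates
    X = rng.choice([-1.0, 1.0], size=(n, d))
    y = np.prod(X[:, support], axis=1)
    return X, y, support

X, y, support = sparse_parity_data(n=8, d=20, k=3)
print(X.shape, y.shape, support)
```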

### Regression

- Feature Learning in Infinite-Width Neural Networks, *ICML 2021*. [(link)](https://arxiv.org/abs/2011.14522)

Greg Yang, Edward J. Hu

- High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation, *NeurIPS 2022*. [(link)](https://arxiv.org/abs/2205.01445)

Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang

- Gradient-Based Feature Learning under Structured Data, *NeurIPS 2023*. [(link)](https://arxiv.org/abs/2309.03843)

Alireza Mousavi-Hosseini, Denny Wu, Taiji Suzuki, Murat A Erdogdu

- Neural Networks can Learn Representations with Gradient Descent, *COLT 2022*. [(link)](https://arxiv.org/abs/2206.15144)

Alex Damian, Jason D. Lee, Mahdi Soltanolkotabi

- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks. [(link)](https://arxiv.org/abs/2305.06986)

Eshaan Nichani, Alex Damian, Jason D. Lee

- The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks, *COLT 2022*. [(link)](https://arxiv.org/abs/2202.08658)

Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz

- Neural Networks Efficiently Learn Low-Dimensional Representations with SGD, *ICLR 2023*. [(link)](https://arxiv.org/abs/2209.14863)

Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, Murat A. Erdogdu

- Learning Two-Layer Neural Networks, One (Giant) Step at a Time [(link)](https://arxiv.org/abs/2305.18270)

Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
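
A recurring theme in this section is that a single (large) gradient step on the first layer already moves the hidden weights toward the low-dimensional target direction, e.g. in a single-index model where y depends on x only through ⟨w*, x⟩. The numpy sketch below tracks that alignment before and after one step; the width, step size, and activation are illustrative choices, not the scaling regime of any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 50, 100                    # samples, input dim, hidden width (illustrative)
w_star = np.zeros(d)
w_star[0] = 1.0                            # hidden single-index direction

# Single-index data: the label depends on x only through <w_star, x>.
X = rng.standard_normal((n, d))
y = np.maximum(X @ w_star, 0.0)

# Two-layer ReLU network f(x) = (1/m) * a^T relu(W x), second layer frozen.
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=m)

def alignment(W):
    """Mean |cosine similarity| between the rows of W and the target direction."""
    cos = (W @ w_star) / (np.linalg.norm(W, axis=1) * np.linalg.norm(w_star))
    return float(np.abs(cos).mean())

print("alignment at init:", alignment(W))

# One gradient step on the squared loss with respect to the first layer only.
pre = X @ W.T                              # (n, m) pre-activations
out = (np.maximum(pre, 0.0) @ a) / m       # (n,) network outputs
resid = out - y
grad_W = (a / m)[:, None] * (((resid[:, None] * (pre > 0)).T @ X) / n)  # (m, d)
eta = 50.0                                 # aggressive step size, as in the one-step analyses
print("alignment after one step:", alignment(W - eta * grad_W))
```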

### LLM

- Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer, *NeurIPS 2023* [(link)](https://arxiv.org/abs/2305.16380)

Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du

- JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention

Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du

- On the Role of Attention in Prompt-tuning, *ICML 2023* [(link)](https://arxiv.org/pdf/2306.03435.pdf)

Samet Oymak*, Ankit Singh Rawat*, Mahdi Soltanolkotabi*, Christos Thrampoulidis*

- In-Context Convergence of Transformers, *NeurIPS 2023 workshop* [(link)](https://arxiv.org/abs/2310.05249)

Yu Huang, Yuan Cheng, Yingbin Liang
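
The works in this section analyze one-layer or otherwise shallow attention models. For reference, here is a minimal single-head self-attention forward pass in plain numpy; the dimensions are arbitrary, and positional encodings, masking, and multi-head structure are omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: each row of X is a token embedding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])     # (T, T) attention logits
    return softmax(scores, axis=-1) @ V        # (T, d_v) token compositions

rng = np.random.default_rng(0)
T, d, d_k = 6, 16, 8                           # tokens, embed dim, head dim (arbitrary)
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (6, 8)
```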

## Contact

For any queries, please open an issue or feel free to reach out to us via email at [email protected]

## Code of Conduct

We aim to maintain a respectful and inclusive environment for everyone, and we expect all participants to uphold this standard.

We look forward to your active participation and happy reading!