# Neural Tangent Kernel Papers
This list collects papers that adopt the Neural Tangent Kernel (NTK) as a main theme or core idea.
*NOTE:* If there are any papers I have missed, please feel free to [raise an issue](https://github.com/kwignb/NeuralTangentKernel-Papers/issues).
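
For readers new to the topic, below is a minimal, self-contained sketch (not taken from any paper in this list) of the object most of these works study: the empirical NTK $\Theta(x_1, x_2) = J(x_1)\,J(x_2)^\top$, where $J$ is the Jacobian of the network output with respect to all parameters. The tiny MLP, its sizes, and all function names are illustrative assumptions; dedicated libraries such as [neural-tangents](https://github.com/google/neural-tangents) (listed below) compute finite- and infinite-width kernels far more efficiently.

```python
# Minimal empirical-NTK sketch in JAX; all names here are illustrative.
import jax
import jax.numpy as jnp

def init_params(key, sizes=(3, 16, 1)):
    # Random weights and biases for a small fully connected network.
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, wkey = jax.random.split(key)
        params.append((jax.random.normal(wkey, (din, dout)) / jnp.sqrt(din),
                       jnp.zeros(dout)))
    return params

def mlp(params, x):
    # Forward pass: tanh hidden layers, scalar output per example.
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return (x @ w + b).squeeze(-1)

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = J(x1) J(x2)^T, with J the Jacobian of the network
    # output w.r.t. every parameter, flattened and concatenated.
    j1 = jax.jacobian(mlp)(params, x1)
    j2 = jax.jacobian(mlp)(params, x2)
    flatten = lambda j: jnp.concatenate(
        [leaf.reshape(leaf.shape[0], -1) for leaf in jax.tree_util.tree_leaves(j)],
        axis=1)
    return flatten(j1) @ flatten(j2).T

params = init_params(jax.random.PRNGKey(0))
x = jax.random.normal(jax.random.PRNGKey(1), (5, 3))
print(empirical_ntk(params, x, x).shape)  # (5, 5) kernel matrix
```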

## 2024
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models | ICLR | [PDF](https://arxiv.org/pdf/2305.14585.pdf) | [CODE](https://github.com/pnnl/projection_ntk) |
| PINNACLE: PINN Adaptive ColLocation and Experimental points selection | ICLR | [PDF](https://openreview.net/pdf?id=GzNaCp6Vcg) | - |
| On the Foundations of Shortcut Learning | ICLR | [PDF](https://arxiv.org/pdf/2310.16228.pdf) | - |
| Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation | ICLR | [PDF](https://arxiv.org/pdf/2302.01428.pdf) | - |
| Sample Relationship from Learning Dynamics Matters for Generalisation | ICLR | [PDF](https://arxiv.org/pdf/2401.08808.pdf) | - |
| Robust NAS benchmark under adversarial training: assessment, theory, and beyond | ICLR | [PDF](https://openreview.net/pdf?id=cdUpf6t6LZ) | - |
| Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach | ICLR | [PDF](https://arxiv.org/pdf/2310.06112.pdf) | [CODE](https://github.com/fshp971/adv-ntk) |
| Heterogeneous Personalized Federated Learning by Local-Global Updates Mixing via Convergence Rate | ICLR | [PDF](https://openreview.net/pdf?id=7pWRLDBAtc) | - |
| Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization | ICLR | [PDF](https://arxiv.org/pdf/2401.15604.pdf) | - |
| Grokking as the Transition from Lazy to Rich Training Dynamics | ICLR | [PDF](https://arxiv.org/pdf/2310.06110.pdf) | - |
| Generalization of Deep ResNets in the Mean-Field Regime | ICLR | [PDF](https://openreview.net/pdf?id=tMzPZTvz2H) | - |

## 2023
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models | NeurIPS | [PDF](https://arxiv.org/pdf/2305.12827.pdf) | [CODE](https://github.com/gortizji/tangent_task_arithmetic) |
| Deep Learning with Kernels through RKHM and the Perron–Frobenius Operator | NeurIPS | [PDF](https://arxiv.org/pdf/2305.13588.pdf) | - |
| A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression | NeurIPS | [PDF](https://arxiv.org/pdf/2310.00987.pdf) | - |
| Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs | NeurIPS | [PDF](https://web.stanford.edu/~pilanci/papers/fixing_the_ntk.pdf) | - |
| Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time | NeurIPS | [PDF](https://arxiv.org/pdf/2306.16361.pdf) | - |
| Feature-Learning Networks Are Consistent Across Widths At Realistic Scales | NeurIPS | [PDF](https://arxiv.org/pdf/2305.18411.pdf) | - |
| Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2304.03408.pdf) | [CODE](https://github.com/Pehlevan-Group/dmft_fluctuations) |
| Spectral Evolution and Invariance in Linear-width Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2211.06506.pdf) | - |
| Analyzing Generalization of Neural Networks through Loss Path Kernels | NeurIPS | [PDF](https://openreview.net/pdf?id=8Ba7VJ7xiM) | - |
| Neural (Tangent Kernel) Collapse | NeurIPS | [PDF](https://arxiv.org/pdf/2305.16427.pdf) | - |
| Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension | NeurIPS | [PDF](https://arxiv.org/pdf/2305.14077.pdf) | [CODE](https://github.com/moritzhaas/mind-the-spikes) |
| A PAC-Bayesian Perspective on the Interpolating Information Criterion | NeurIPS-W | [PDF](https://arxiv.org/pdf/2311.07013.pdf) | - |
| A Kernel Perspective of Skip Connections in Convolutional Networks | ICLR | [PDF](https://arxiv.org/pdf/2211.14810.pdf) | - |
| Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel | ICLR | [PDF](https://arxiv.org/pdf/2209.15208.pdf) | - |
| Symmetric Pruning in Quantum Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/2208.14057.pdf) | - |
| The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/2210.02157.pdf) | - |
| Few-shot Backdoor Attacks via Neural Tangent Kernels | ICLR | [PDF](https://arxiv.org/pdf/2210.05929.pdf) | - |
| Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel | ICLR | [PDF](https://arxiv.org/pdf/2205.12904.pdf) | - |
| Supervision Complexity and its Role in Knowledge Distillation | ICLR | [PDF](https://arxiv.org/pdf/2301.12245.pdf) | - |
| NTK-SAP: Improving Neural Network Pruning By Aligning Training Dynamics | ICLR | [PDF](https://openreview.net/pdf?id=-5EWhW_4qWP) | [CODE](https://github.com/YiteWang/NTK-SAP) |
| Tuning Frequency Bias in Neural Network Training with Nonuniform Data | ICLR | [PDF](https://arxiv.org/pdf/2205.14300.pdf) | - |
| Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth | ICLR | [PDF](https://arxiv.org/pdf/2211.14503.pdf) | - |
| Characterizing the spectrum of the NTK via a power series expansion | ICLR | [PDF](https://arxiv.org/pdf/2211.07844.pdf) | [CODE](https://github.com/bbowman223/data_ntk) |
| Adaptive Optimization in the $\infty$-Width Limit | ICLR | [PDF](https://openreview.net/pdf?id=zgVDqw9ZUES) | - |
| Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization | ICLR | [PDF](https://arxiv.org/pdf/2108.11371.pdf) | - |
| The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes | ICLR | [PDF](https://arxiv.org/pdf/2212.12147.pdf) | - |
| Restricted Strong Convexity of Deep Learning Models with Smooth Activations | ICLR | [PDF](https://arxiv.org/pdf/2209.15106.pdf) | - |
| Feature selection and low test error in shallow low-rotation ReLU networks | ICLR | [PDF](https://arxiv.org/pdf/2208.02789.pdf) | - |
| Exploring Active 3D Object Detection from a Generalization Perspective | ICLR | [PDF](https://arxiv.org/pdf/2301.09249.pdf) | [CODE](https://github.com/Luoyadan/CRB-active-3Ddet) |
| On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks | AISTATS | [PDF](https://proceedings.mlr.press/v206/yang23b/yang23b.pdf) | - |
| Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks | AISTATS | [PDF](https://proceedings.mlr.press/v206/zhang23d/zhang23d.pdf) | - |
| Regularize Implicit Neural Representation by Itself | CVPR | [PDF](https://arxiv.org/pdf/2303.15484.pdf) | - |
| WIRE: Wavelet Implicit Neural Representations | CVPR | [PDF](https://arxiv.org/pdf/2301.05187.pdf) | [CODE](https://github.com/vishwa91/wire) |
| Regularizing Second-Order Influences for Continual Learning | CVPR | [PDF](https://arxiv.org/pdf/2304.10177.pdf) | [CODE](https://github.com/feifeiobama/InfluenceCL) |
| Multiplicative Fourier Level of Detail | CVPR | [PDF](https://openaccess.thecvf.com/content/CVPR2023/papers/Dou_Multiplicative_Fourier_Level_of_Detail_CVPR_2023_paper.pdf) | - |
| KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection | ICCV | [PDF](https://openaccess.thecvf.com/content/ICCV2023/papers/Luo_KECOR_Kernel_Coding_Rate_Maximization_for_Active_3D_Object_Detection_ICCV_2023_paper.pdf) | [CODE](https://github.com/Luoyadan/KECOR-active-3Ddet) |
| TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning | ICCV-W | [PDF](https://arxiv.org/pdf/2206.08492.pdf) | - |
| A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel | ICML | [PDF](https://arxiv.org/pdf/2206.12543v1.pdf) | - |
| Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels | ICML | [PDF](https://arxiv.org/pdf/2306.03968.pdf) | [CODE](https://github.com/AlexImmer/ntk-marglik) |
| Graph Neural Tangent Kernel: Convergence on Large Graphs | ICML | [PDF](https://arxiv.org/pdf/2301.10808.pdf) | - |
| Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels | ICML | [PDF](https://arxiv.org/pdf/2302.01629.pdf) | [CODE](https://github.com/simone-bombari/beyond-universal-robustness) |
| Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels | ICML | [PDF](https://arxiv.org/pdf/2303.14844.pdf) | - |
| Benign Overfitting in Deep Neural Networks under Lazy Training | ICML | [PDF](https://arxiv.org/pdf/2305.19377.pdf) | - |
| Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space | ICML | [PDF](https://openreview.net/pdf?id=nCukQnbhp5) | - |
| A Kernel-Based View of Language Model Fine-Tuning | ICML | [PDF](https://proceedings.mlr.press/v202/malladi23a/malladi23a.pdf) | - |
| Combinatorial Neural Bandits | ICML | [PDF](https://arxiv.org/pdf/2306.00242.pdf) | - |
| What Can Be Learnt With Wide Convolutional Neural Networks? | ICML | [PDF](https://arxiv.org/pdf/2208.01003.pdf) | [CODE](https://github.com/pcsl-epfl/convolutional_neural_kernels) |
| Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits | AAAI | [PDF](https://arxiv.org/pdf/2203.04192.pdf) | - |
| Neural tangent kernel at initialization: linear width suffices | UAI | [PDF](https://proceedings.mlr.press/v216/banerjee23a/banerjee23a.pdf) | - |
| Kernel Ridge Regression-Based Graph Dataset Distillation | SIGKDD | [PDF](https://dl.acm.org/doi/pdf/10.1145/3580305.3599398) | [CODE](https://github.com/pricexu/KIDD) |
| Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection | TMLR | [PDF](https://openreview.net/pdf?id=nEX2q5B2RQ) | - |
| The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks | TMLR | [PDF](https://arxiv.org/pdf/2110.03922.pdf) | [CODE](https://github.com/james-simon/eigenlearning) |
| Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | TMLR | [PDF](https://openreview.net/pdf?id=Y3saBb7mCE) | - |
| Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel | TMLR | [PDF](https://openreview.net/pdf?id=xgYgDEof29) | - |
| A Framework and Benchmark for Deep Batch Active Learning for Regression | JMLR | [PDF](https://arxiv.org/pdf/2203.09410.pdf) | [CODE](https://github.com/dholzmueller/bmdal_rega) |
| A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime | IEEE | [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10213447) | - |
| The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning | QE | [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10158043) | - |
| NeuralBO: A Black-box Optimization Algorithm using Deep Neural Networks | NC | [PDF](https://arxiv.org/pdf/2303.01682.pdf) | - |
| Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel | NN | [PDF](https://arxiv.org/pdf/2202.05254.pdf) | [CODE](https://github.com/kwignb/RandomNeuralField) |
| Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations | CMAME | [PDF](https://arxiv.org/pdf/2304.06234.pdf) | - |
| A non-gradient method for solving elliptic partial differential equations with deep neural networks | JoCP | [PDF](https://ins.sjtu.edu.cn/people/xuzhiqin/pub/nongradpde.pdf) | - |
| Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism | JoCP | [PDF](https://arxiv.org/pdf/2009.04544.pdf) | - |
| Towards a phenomenological understanding of neural networks: data | MLST | [PDF](https://arxiv.org/pdf/2305.00995.pdf) | - |
| Weighted Neural Tangent Kernel: A Generalized and Improved Network-Induced Kernel | ML | [PDF](https://arxiv.org/pdf/2103.11558.pdf) | [CODE](https://github.com/ASTAugustin/WNTK_Machine_Learning) |
| Tensor Programs IVb: Adaptive Optimization in the ∞-Width Limit | arXiv | [PDF](https://arxiv.org/pdf/2308.01814.pdf) | - |

## 2022
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Generalization Properties of NAS under Activation and Skip Connection Search | NeurIPS | [PDF](https://arxiv.org/pdf/2209.07238.pdf) | - |
| Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study | NeurIPS | [PDF](https://arxiv.org/pdf/2209.07736.pdf) | [CODE](https://github.com/LIONS-EPFL/pntk) |
| Graph Neural Network Bandits | NeurIPS | [PDF](https://arxiv.org/pdf/2207.06456.pdf) | - |
| Lossless Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | NeurIPS | [PDF](https://zhenyu-liao.github.io/pdf/conf/RMT4DeepCompress_nips22.pdf) | - |
| GraphQNTK: Quantum Neural Tangent Kernel for Graph Data | NeurIPS | [PDF](https://openreview.net/pdf?id=RBhIkQRpzFK) | [CODE](https://github.com/abel1231/graphQNTK) |
| Evolution of Neural Tangent Kernels under Benign and Adversarial Training | NeurIPS | [PDF](https://arxiv.org/pdf/2210.12030.pdf) | [CODE](https://github.com/yolky/adversarial_ntk_evolution) |
| TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/2207.06343v2.pdf) | [CODE](https://github.com/yaodongyu/TCT) |
| Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/2206.12569v1.pdf) | [CODE](https://github.com/mohamad-amin/ntk-lookahead-active-learning) |
| Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel | NeurIPS | [PDF](https://arxiv.org/pdf/2210.09818v1.pdf) | [CODE](https://github.com/seijin-kobayashi/disentangle-predvar) |
| On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model | NeurIPS | [PDF](https://arxiv.org/pdf/2206.02047v1.pdf) | - |
| What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? | NeurIPS | [PDF](https://arxiv.org/pdf/2210.05577v1.pdf) | - |
| On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/2203.09255v1.pdf) | - |
| Fast Neural Kernel Embeddings for General Activations | NeurIPS | [PDF](https://arxiv.org/pdf/2209.04121.pdf) | [CODE](https://github.com/insuhan/ntk_activations) |
| Bidirectional Learning for Offline Infinite-width Model-based Optimization | NeurIPS | [PDF](https://arxiv.org/pdf/2209.07507v4.pdf) | - |
| Infinite Recommendation Networks: A Data-Centric Approach | NeurIPS | [PDF](https://arxiv.org/pdf/2206.02626.pdf) | [CODE1](https://github.com/noveens/infinite_ae_cf) [CODE2](https://github.com/noveens/distill_cf) |
| Distribution-Informed Neural Networks for Domain Adaptation Regression | NeurIPS | [PDF](https://openreview.net/pdf?id=8hoDLRLtl9h) | - |
| Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2205.09653.pdf) | - |
| Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime | NeurIPS | [PDF](https://arxiv.org/pdf/2206.02927.pdf) | [CODE](https://github.com/bbowman223/deepspec) |
| Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) | NeurIPS | [PDF](https://arxiv.org/pdf/2209.07263.pdf) | - |
| Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture | NeurIPS | [PDF](https://arxiv.org/pdf/2205.11786.pdf) | - |
| A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity | NeurIPS | [PDF](https://arxiv.org/pdf/2104.03525.pdf) | - |
| NFT-K: Non-Fungible Tangent Kernels | ICASSP | [PDF](https://arxiv.org/pdf/2110.04945.pdf) | [CODE](https://gitlab.com/cjbarb7/icassp2022) |
| Label Propagation Across Graphs: Node Classification Using Graph Neural Tangent Kernels | ICASSP | [PDF](https://arxiv.org/pdf/2110.03763.pdf) | - |
| A Neural Tangent Kernel Perspective of Infinite Tree Ensembles | ICLR | [PDF](https://arxiv.org/pdf/2109.04983.pdf) | - |
| Neural Networks as Kernel Learners: The Silent Alignment Effect | ICLR | [PDF](https://arxiv.org/pdf/2111.00034.pdf) | - |
| Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective | ICLR | [PDF](https://arxiv.org/pdf/2103.03113.pdf) | - |
| Overcoming The Spectral Bias of Neural Value Approximation | ICLR | [PDF](https://arxiv.org/pdf/2206.04672.pdf) | [CODE](https://www.episodeyang.com/ffn/) |
| Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features | ICLR | [PDF](https://openreview.net/pdf?id=tUMr0Iox8XW) | [CODE](https://github.com/santacml/pilim) |
| Learning Neural Contextual Bandits Through Perturbed Rewards | ICLR | [PDF](https://arxiv.org/pdf/2201.09910.pdf) | - |
| Learning Curves for Continual Learning in Neural Networks: Self-knowledge Transfer and Forgetting | ICLR | [PDF](https://arxiv.org/pdf/2112.01653.pdf) | - |
| The Spectral Bias of Polynomial Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/2202.13473.pdf) | - |
| On Feature Learning in Neural Networks with Global Convergence Guarantees | ICLR | [PDF](https://arxiv.org/pdf/2204.10782.pdf) | - |
| Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/2201.04738.pdf) | - |
| Eigenspace Restructuring: A Principle of Space and Frequency in Neural Networks | COLT | [PDF](https://proceedings.mlr.press/v178/xiao22a/xiao22a.pdf) | - |
| Neural Networks can Learn Representations with Gradient Descent | COLT | [PDF](https://arxiv.org/pdf/2206.15144.pdf) | - |
| Neural Contextual Bandits without Regret | AISTATS | [PDF](https://arxiv.org/pdf/2107.03144.pdf) | - |
| Finding Dynamics Preserving Adversarial Winning Tickets | AISTATS | [PDF](https://proceedings.mlr.press/v151/shi22a/shi22a.pdf) | - |
| Embedded Ensembles: Infinite Width Limit and Operating Regimes | AISTATS | [PDF](https://proceedings.mlr.press/v151/velikanov22a/velikanov22a.pdf) | - |
| Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning | CVPR | [PDF](https://arxiv.org/pdf/2203.09137.pdf) | [CODE](https://github.com/YiteWang/MetaNTK-NAS) |
| Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? | CVPR | [PDF](https://arxiv.org/pdf/2203.14577.pdf) | [CODE](https://github.com/nutellamok/DemystifyingNTK) |
| A Structured Dictionary Perspective on Implicit Neural Representations | CVPR | [PDF](https://arxiv.org/pdf/2112.01917.pdf) | [CODE](https://github.com/gortizji/inr_dictionaries) |
| NL-FFC: Non-Local Fast Fourier Convolution for Image Super Resolution | CVPR-W | [PDF](https://openaccess.thecvf.com/content/CVPR2022W/NTIRE/papers/Sinha_NL-FFC_Non-Local_Fast_Fourier_Convolution_for_Image_Super_Resolution_CVPRW_2022_paper.pdf) | [CODE](https://github.com/gortizji/inr_dictionaries) |
| Intrinsic Neural Fields: Learning Functions on Manifolds | ECCV | [PDF](https://vision.in.tum.de/_media/spezial/bib/koestler2022intrinsic.pdf) | - |
| Random Gegenbauer Features for Scalable Kernel Methods | ICML | [PDF](https://arxiv.org/pdf/2202.03474.pdf) | - |
| Fast Finite Width Neural Tangent Kernel | ICML | [PDF](https://proceedings.mlr.press/v162/novak22a/novak22a.pdf) | [CODE](https://github.com/google/neural-tangents) |
| A Neural Tangent Kernel Perspective of GANs | ICML | [PDF](https://proceedings.mlr.press/v162/franceschi22a/franceschi22a.pdf) | [CODE](https://github.com/emited/gantk2) |
| Neural Tangent Kernel Empowered Federated Learning | ICML | [PDF](https://proceedings.mlr.press/v162/yue22a/yue22a.pdf) | - |
| Reverse Engineering the Neural Tangent Kernel | ICML | [PDF](https://proceedings.mlr.press/v162/simon22a/simon22a.pdf) | [CODE](https://github.com/james-simon/reverse-engineering) |
| How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective | ICML | [PDF](https://arxiv.org/pdf/2106.08453.pdf) | [CODE](https://github.com/FieteLab/Wide-Network-Alignment) |
| Bounding the Width of Neural Networks via Coupled Initialization – A Worst Case Analysis – | ICML | [PDF](https://proceedings.mlr.press/v162/woodruff22a/woodruff22a.pdf) | - |
| Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time | ICML | [PDF](https://proceedings.mlr.press/v162/munteanu22a/munteanu22a.pdf) | - |
| Lazy Estimation of Variable Importance for Large Neural Networks | ICML | [PDF](https://proceedings.mlr.press/v162/gao22h/gao22h.pdf) | - |
| DAVINZ: Data Valuation using Deep Neural Networks at Initialization | ICML | [PDF](https://proceedings.mlr.press/v162/wu22j/wu22j.pdf) | - |
| Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization | ICML | [PDF](https://arxiv.org/pdf/2202.00553.pdf) | [CODE](https://github.com/mselezniova/ntk_beyond_limit) |
| NeuralEF: Deconstructing Kernels by Deep Neural Networks | ICML | [PDF](https://arxiv.org/pdf/2205.00165.pdf) | [CODE](https://github.com/thudzj/neuraleigenfunction) |
| Feature Learning and Signal Propagation in Deep Neural Networks | ICML | [PDF](https://arxiv.org/pdf/2110.11749.pdf) | - |
| More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize | ICML | [PDF](https://arxiv.org/pdf/2203.06176.pdf) | [CODE](https://github.com/aw31/empirical-ntks) |
| Fast Graph Neural Tangent Kernel via Kronecker Sketching | AAAI | [PDF](https://arxiv.org/pdf/2112.02446.pdf) | - |
| Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime | AAAI | [PDF](https://www.aaai.org/AAAI22Papers/AAAI-5153.ZhangR.pdf) | - |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | UAI | [PDF](https://arxiv.org/pdf/2006.13645.pdf) | - |
| Feature Learning and Random Features in Standard Finite-Width Convolutional Neural Networks: An Empirical Study | UAI | [PDF](https://openreview.net/pdf?id=ScIEZdIiqe5) | - |
| Out of Distribution Detection via Neural Network Anchoring | ACML | [PDF](https://arxiv.org/pdf/2207.04125.pdf) | [CODE](https://github.com/LLNL/AMP) |
| Learning Neural Ranking Models Online from Implicit User Feedback | WWW | [PDF](https://arxiv.org/pdf/2201.06658.pdf) | - |
| Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes | CoRL | [PDF](https://arxiv.org/pdf/2109.09690.pdf) | - |
| When and why PINNs fail to train: A neural tangent kernel perspective | JoCP | [PDF](https://arxiv.org/pdf/2007.14527.pdf) | [CODE](https://github.com/PredictiveIntelligenceLab/PINNsNTK) |
| How Neural Architectures Affect Deep Learning for Communication Networks? | ICC | [PDF](https://arxiv.org/pdf/2111.02215.pdf) | - |
| Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | ACHA | [PDF](https://arxiv.org/pdf/2003.00307.pdf) | - |
| Feature Purification: How Adversarial Training Performs Robust Deep Learning | FOCS | [PDF](https://arxiv.org/pdf/2005.10190.pdf) | - |
| Kernel-Based Smoothness Analysis of Residual Networks | MSML | [PDF](https://arxiv.org/pdf/2009.10008.pdf) | - |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | MSML | [PDF](https://proceedings.mlr.press/v145/seleznova22a/seleznova22a.pdf) | - |
| The Training Response Law Explains How Deep Neural Networks Learn | IoP | [PDF](https://arxiv.org/pdf/2204.07291.pdf) | - |
| Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks | PNAS | [PDF](https://arxiv.org/pdf/2108.00131.pdf) | [CODE](https://github.com/uhlerlab/ntk_matrix_completion) |
| Representation Learning via Quantum Neural Tangent Kernels | PRX Quantum | [PDF](https://arxiv.org/pdf/2111.04225.pdf) | - |
| TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models | arXiv | [PDF](https://arxiv.org/pdf/2205.12372.pdf) | [CODE](https://github.com/pnnl/torchntk) |
| Neural Tangent Kernel Analysis of Shallow α-Stable ReLU Neural Networks | arXiv | [PDF](https://arxiv.org/pdf/2206.08065.pdf) | - |
| Neural Tangent Kernel: A Survey | arXiv | [PDF](https://arxiv.org/pdf/2208.13614.pdf) | - |

## 2021
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Neural Tangent Kernel Maximum Mean Discrepancy | NeurIPS | [PDF](https://arxiv.org/pdf/2106.03227.pdf) | - |
| DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel | NeurIPS | [PDF](https://arxiv.org/pdf/2106.05710.pdf) | - |
| Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel | NeurIPS | [PDF](https://arxiv.org/pdf/2107.12723.pdf) | - |
| Scaling Neural Tangent Kernels via Sketching and Random Features | NeurIPS | [PDF](https://arxiv.org/pdf/2106.07880.pdf) | - |
| Dataset Distillation with Infinitely Wide Convolutional Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2107.13034.pdf) | - |
| On the Equivalence between Neural Network and Support Vector Machine | NeurIPS | [PDF](https://arxiv.org/pdf/2111.06063.pdf) | [CODE](https://github.com/leslie-CH/equiv-nn-svm) |
| Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels | NeurIPS | [PDF](https://proceedings.neurips.cc/paper/2021/file/d064bf1ad039ff366564f352226e7640-Paper.pdf) | [CODE](https://github.com/skarp/local-signal-adaptivity) |
| Explicit Loss Asymptotics in the Gradient Descent Training of Neural Networks | NeurIPS | [PDF](https://proceedings.neurips.cc/paper/2021/file/14faf969228fc18fcd4fcf59437b0c97-Paper.pdf) | - |
| Kernelized Heterogeneous Risk Minimization | NeurIPS | [PDF](https://arxiv.org/pdf/2110.12425.pdf) | [CODE](https://github.com/LJSthu/Kernelized-HRM) |
| An Empirical Study of Neural Kernel Bandits | NeurIPS-W | [PDF](https://arxiv.org/pdf/2111.03543.pdf) | - |
| The Curse of Depth in Kernel Regime | NeurIPS-W | [PDF](https://proceedings.mlr.press/v163/hayou22a/hayou22a.pdf) | - |
| Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels | ICASSP | [PDF](https://arxiv.org/pdf/2010.13975.pdf) | [CODE](https://github.com/dlej/MASK) |
| The Dynamics of Gradient Descent for Overparametrized Neural Networks | L4DC | [PDF](https://arxiv.org/pdf/2105.06569.pdf) | - |
| The Recurrent Neural Tangent Kernel | ICLR | [PDF](https://openreview.net/pdf?id=3T9iFICe0Y9) | - |
| Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | ICLR | [PDF](https://openreview.net/pdf?id=vK9WrZ0QYQ) | - |
| Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime | ICLR | [PDF](https://arxiv.org/pdf/2006.12297.pdf) | - |
| Meta-Learning with Neural Tangent Kernels | ICLR | [PDF](https://arxiv.org/pdf/2102.03909.pdf) | - |
| How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/2009.11848.pdf) | - |
| Deep Networks and the Multiple Manifold Problem | ICLR | [PDF](https://arxiv.org/pdf/2008.11245.pdf) | - |
| Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective | ICLR | [PDF](https://arxiv.org/pdf/2102.11535.pdf) | [CODE](https://github.com/VITA-Group/TENAS) |
| Neural Thompson Sampling | ICLR | [PDF](https://arxiv.org/pdf/2010.00827.pdf) | - |
| Deep Equals Shallow for ReLU Networks in Kernel Regimes | ICLR | [PDF](https://arxiv.org/pdf/2009.14397.pdf) | - |
| A Recipe for Global Convergence Guarantee in Deep Neural Networks | AAAI | [PDF](https://arxiv.org/pdf/2104.05785.pdf) | - |
| A Deep Conditioning Treatment of Neural Networks | ALT | [PDF](https://arxiv.org/pdf/2002.01523.pdf) | - |
| Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping | COLT | [PDF](https://arxiv.org/pdf/2107.05341.pdf) | - |
| Learning with invariances in random features and kernel models | COLT | [PDF](https://arxiv.org/pdf/2102.13219.pdf) | - |
| Implicit Regularization via Neural Feature Alignment | AISTATS | [PDF](https://arxiv.org/pdf/2008.00938.pdf) | [CODE](https://github.com/tfjgeorge/ntk_alignment) |
| Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network | AISTATS | [PDF](https://arxiv.org/pdf/2007.02486.pdf) | - |
| One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks | AISTATS | [PDF](https://arxiv.org/pdf/2105.00262.pdf) | - |
| Fast Adaptation with Linearized Neural Networks | AISTATS | [PDF](https://arxiv.org/pdf/2103.01439.pdf) | [CODE](https://github.com/amzn/xfer/tree/master/finite_ntk) |
| Fast Learning in Reproducing Kernel Kreın Spaces via Signed Measures | AISTATS | [PDF](https://arxiv.org/pdf/2006.00247.pdf) | - |
| Stable ResNet | AISTATS | [PDF](https://arxiv.org/pdf/2010.12859.pdf) | - |
| A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | AISTATS | [PDF](https://arxiv.org/pdf/2010.13165.pdf) | - |
| Can We Characterize Tasks Without Labels or Features? | CVPR | [PDF](https://openaccess.thecvf.com/content/CVPR2021/papers/Wallace_Can_We_Characterize_Tasks_Without_Labels_or_Features_CVPR_2021_paper.pdf) | [CODE](https://github.com/BramSW/task_characterization_cvpr_2021) |
| The Neural Tangent Link Between CNN Denoisers and Non-Local Filters | CVPR | [PDF](https://arxiv.org/pdf/2006.02379.pdf) | [CODE](https://gitlab.com/Tachella/neural_tangent_denoiser) |
| Nerfies: Deformable Neural Radiance Fields | ICCV | [PDF](https://arxiv.org/pdf/2011.12948.pdf) | [CODE](https://github.com/google/nerfies) |
| Kernel Methods in Hyperbolic Spaces | ICCV | [PDF](https://openaccess.thecvf.com/content/ICCV2021/papers/Fang_Kernel_Methods_in_Hyperbolic_Spaces_ICCV_2021_paper.pdf) | - |
| Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks | ICML | [PDF](https://arxiv.org/pdf/2012.11654.pdf) | - |
| On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models | ICML | [PDF](https://arxiv.org/pdf/2103.05243.pdf) | - |
| Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics | ICML | [PDF](https://arxiv.org/pdf/2105.03703.pdf) | - |
| Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks | ICML | [PDF](https://arxiv.org/pdf/2011.14522.pdf) | [CODE](https://github.com/edwardjhu/TP4) |
| FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis | ICML | [PDF](https://arxiv.org/pdf/2105.05001.pdf) | - |
| On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent | ICML | [PDF](https://arxiv.org/pdf/2102.09769.pdf) | - |
| On Monotonic Linear Interpolation of Neural Network Parameters | ICML | [PDF](https://arxiv.org/pdf/2104.11044.pdf) | - |
| Uniform Convergence, Adversarial Spheres and a Simple Remedy | ICML | [PDF](https://arxiv.org/pdf/2105.03491.pdf) | - |
| Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels | ICML | [PDF](https://arxiv.org/pdf/2103.01210.pdf) | - |
| Efficient Statistical Tests: A Neural Tangent Kernel Approach | ICML | [PDF](http://proceedings.mlr.press/v139/jia21a/jia21a.pdf) | - |
| Neural Tangent Generalization Attacks | ICML | [PDF](http://proceedings.mlr.press/v139/yuan21b/yuan21b.pdf) | [CODE](https://github.com/lionelmessi6410/ntga) |
| On the Random Conjugate Kernel and Neural Tangent Kernel | ICML | [PDF](http://proceedings.mlr.press/v139/hu21b/hu21b.pdf) | - |
| Generalization Guarantees for Neural Architecture Search with Train-Validation Split | ICML | [PDF](https://arxiv.org/pdf/2104.14132.pdf) | - |
| Tilting the playing field: Dynamical loss functions for machine learning | ICML | [PDF](http://proceedings.mlr.press/v139/ruiz-garcia21a/ruiz-garcia21a.pdf) | [CODE](https://github.com/miguel-rg/dynamical-loss-functions) |
| PHEW : Constructing Sparse Networks that Learn Fast and Generalize Well Without Training Data | ICML | [PDF](http://proceedings.mlr.press/v139/patil21a/patil21a.pdf) | - |
| On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization | IJCAI | [PDF](https://arxiv.org/pdf/2004.05867.pdf) | [CODE](https://github.com/WeiHuang05/Neural-Tangent-Kernel-with-Orthogonal-Initialization) |
| Towards Understanding the Spectral Bias of Deep Learning | IJCAI | [PDF](https://arxiv.org/pdf/1912.01198.pdf) | - |
| On Random Kernels of Residual Architectures | UAI | [PDF](https://arxiv.org/pdf/2001.10460.pdf) | - |
| How Shrinking Gradient Noise Helps the Performance of Neural Networks | ICBD | [PDF](https://www.researchgate.net/profile/Zhun-Deng/publication/356891225_The_Role_of_Gradient_Noise_in_the_Optimization_of_Neural_Networks/links/61b1729c4d7ff64f053691b1/The-Role-of-Gradient-Noise-in-the-Optimization-of-Neural-Networks.pdf) | - |
| Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective | ACM TOG | [PDF](https://arxiv.org/pdf/2104.09023.pdf) | - |
| Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis | TIT | [PDF](https://arxiv.org/pdf/1911.11983.pdf) | - |
| Reinforcement Learning via Gaussian Processes with Neural Network Dual Kernels | CoG | [PDF](https://arxiv.org/pdf/2004.05198.pdf) | - |
| Mathematical Models of Overparameterized Neural Networks | IEEE | [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9326403) | - |
| A Feature Fusion Based Indicator for Training-Free Neural Architecture Search | IEEE | [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9548935) | - |
| Pathological spectra of the Fisher information metric and its variants in deep neural networks | NC | [PDF](https://arxiv.org/pdf/1910.05992.pdf) | - |
| Linearized two-layers neural networks in high dimension | Ann. Statist. | [PDF](https://arxiv.org/pdf/1904.12191.pdf) | - |
| Geometric compression of invariant manifolds in neural nets | J. Stat. Mech. | [PDF](https://www.researchgate.net/profile/Leonardo-Petrini-2/publication/343150406_Compressing_invariant_manifolds_in_neural_nets/links/602e34cda6fdcc37a8339aff/Compressing-invariant-manifolds-in-neural-nets.pdf) | [CODE](https://github.com/mariogeiger/feature_lazy/tree/compressing_invariant_manifolds) |
| A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks | arXiv | [PDF](https://arxiv.org/pdf/2101.04243.pdf) | - |
| Learning with Neural Tangent Kernels in Near Input Sparsity Time | arXiv | [PDF](https://arxiv.org/pdf/2104.00415.pdf) | - |
| Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks | arXiv | [PDF](https://arxiv.org/pdf/2104.03093.pdf) | - |
| Properties of the After Kernel | arXiv | [PDF](https://arxiv.org/pdf/2105.10585.pdf) | [CODE](https://github.com/google-research/google-research/tree/master/after_kernel) |

## 2020
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations | ECCV | [PDF](https://arxiv.org/pdf/2003.02960.pdf) | - |
| Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? — A Neural Tangent Kernel Perspective | NeurIPS | [PDF](https://arxiv.org/pdf/2002.06262.pdf) | - |
| Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity | NeurIPS | [PDF](https://arxiv.org/pdf/2010.11775.pdf) | [CODE](https://github.com/HornHehhf/LANTK) |
| Finite Versus Infinite Neural Networks: an Empirical Study | NeurIPS | [PDF](https://arxiv.org/pdf/2007.15801.pdf) | - |
| On the linearity of large non-linear models: when and why the tangent kernel is constant | NeurIPS | [PDF](https://arxiv.org/pdf/2010.01092.pdf) | - |
| On the Similarity between the Laplace and Neural Tangent Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/2007.01580.pdf) | - |
| A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2002.04026.pdf) | - |
| Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics | NeurIPS | [PDF](https://arxiv.org/pdf/2007.05824.pdf) | - |
| Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains | NeurIPS | [PDF](https://arxiv.org/pdf/2006.10739.pdf) | [CODE](https://github.com/tancik/fourier-feature-networks) |
| Network size and weights size for memorization with two-layers neural networks | NeurIPS | [PDF](https://arxiv.org/pdf/2006.02855.pdf) | - |
| Neural Networks Learning and Memorization with (almost) no Over-Parameterization | NeurIPS | [PDF](https://arxiv.org/pdf/1911.09873.pdf) | - |
| Towards Understanding Hierarchical Learning: Benefits of Neural Representations | NeurIPS | [PDF](https://arxiv.org/pdf/2006.13436.pdf) | - |
| Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher | NeurIPS | [PDF](https://arxiv.org/pdf/2010.10090.pdf) | - |
| On Infinite-Width Hypernetworks | NeurIPS | [PDF](https://arxiv.org/pdf/2003.12193.pdf) | - |
| Predicting Training Time Without Training | NeurIPS | [PDF](https://arxiv.org/pdf/2008.12478.pdf) | - |
| Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel | NeurIPS | [PDF](https://arxiv.org/pdf/2010.15110.pdf) | - |
| Spectra of the Conjugate Kernel and Neural Tangent Kernel for Linear-Width Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/2005.11879.pdf) | - |
| Kernel and Rich Regimes in Overparametrized Models | COLT | [PDF](https://arxiv.org/pdf/2002.09277.pdf) | - |
| Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK | COLT | [PDF](https://arxiv.org/pdf/2007.04596.pdf) | - |
| Finite Depth and Width Corrections to the Neural Tangent Kernel | ICLR | [PDF](https://openreview.net/pdf?id=SJgndT4KwB) | - |
| Neural tangent kernels, transportation mappings, and universal approximation | ICLR | [PDF](https://arxiv.org/pdf/1910.06956.pdf) | - |
| Neural Tangents: Fast and Easy Infinite Neural Networks in Python | ICLR | [PDF](https://arxiv.org/pdf/1912.02803.pdf) | [CODE](https://github.com/google/neural-tangents) |
| Picking Winning Tickets Before Training by Preserving Gradient Flow | ICLR | [PDF](https://arxiv.org/pdf/1910.01663.pdf) | [CODE](https://github.com/alecwangcq/GraSP) |
| Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory | ICLR | [PDF](https://arxiv.org/pdf/1910.00359.pdf) | - |
| Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee | ICLR | [PDF](https://arxiv.org/pdf/1905.11368.pdf) | - |
| The asymptotic spectrum of the Hessian of DNN throughout training | ICLR | [PDF](https://arxiv.org/pdf/1910.02875.pdf) | - |
| Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks | ICLR | [PDF](https://arxiv.org/pdf/2002.07376.pdf) | [CODE](https://github.com/LeoYu/neural-tangent-kernel-UCI) |
| Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks | ICLR | [PDF](https://arxiv.org/pdf/1910.01619.pdf) | - |
| Asymptotics of Wide Networks from Feynman Diagrams | ICLR | [PDF](https://arxiv.org/pdf/1909.11304.pdf) | - |
| The equivalence between Stein variational gradient descent and black-box variational inference | ICLR-W | [PDF](https://arxiv.org/pdf/2004.01822.pdf) | - |
| Neural Kernels Without Tangents | ICML | [PDF](https://arxiv.org/pdf/2003.02237.pdf) | [CODE](https://github.com/modestyachts/neural_kernels_code) |
| The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization | ICML | [PDF](https://arxiv.org/pdf/2008.06786.pdf) | - |
| Dynamics of Deep Neural Networks and Neural Tangent Hierarchy | ICML | [PDF](https://arxiv.org/pdf/1909.08156.pdf) | - |
| Disentangling Trainability and Generalization in Deep Neural Networks | ICML | [PDF](https://arxiv.org/pdf/1912.13053.pdf) | - |
| Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | ICML | [PDF](https://arxiv.org/pdf/2002.02561.pdf) | [CODE](https://github.com/Pehlevan-Group/NTK_Learning_Curves) |
| Finding trainable sparse networks through Neural Tangent Transfer | ICML | [PDF](https://arxiv.org/pdf/2006.08228.pdf) | [CODE](https://github.com/fmi-basel/neural-tangent-transfer) |
| Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | ICML | [PDF](https://arxiv.org/pdf/2006.16540.pdf) | - |
| Neural Contextual Bandits with UCB-based Exploration | ICML | [PDF](https://arxiv.org/pdf/1911.04462.pdf) | - |
| Optimization Theory for ReLU Neural Networks Trained with Normalization Layers | ICML | [PDF](https://arxiv.org/pdf/2006.06878.pdf) | - |
| Towards a General Theory of Infinite-Width Limits of Neural Classifiers | ICML | [PDF](https://arxiv.org/pdf/2003.05884.pdf) | - |
| Generalisation guarantees for continual learning with orthogonal gradient descent | ICML-W | [PDF](https://arxiv.org/pdf/2006.11942.pdf) | [CODE](https://github.com/MehdiAbbanaBennani/continual-learning-ogdplus) |
| Neural Spectrum Alignment: Empirical Study | ICANN | [PDF](https://arxiv.org/pdf/1910.08720.pdf) | - |
| A type of generalization error induced by initialization in deep neural networks | MSML | [PDF](https://arxiv.org/pdf/1905.07777.pdf) | - |
| Disentangling feature and lazy training in deep neural networks | J. Stat. Mech. | [PDF](https://arxiv.org/pdf/1906.08034.pdf) | [CODE](https://github.com/mariogeiger/feature_lazy/tree/article) |
| Scaling description of generalization with number of parameters in deep learning | J. Stat. Mech. | [PDF](https://arxiv.org/pdf/1901.01608.pdf) | [CODE](https://github.com/mariogeiger/feature_lazy/tree/article) |
| Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective | NC | [PDF](https://arxiv.org/pdf/2001.06931.pdf) | - |
| Kolmogorov Width Decay and Poor Approximation in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels | RMS | [PDF](https://arxiv.org/pdf/2005.10807.pdf) | - |
| On the infinite width limit of neural networks with a standard parameterization | arXiv | [PDF](https://arxiv.org/pdf/2001.07301.pdf) | [CODE](https://github.com/google/neural-tangents) |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | arXiv | [PDF](https://arxiv.org/pdf/2006.13645.pdf) | - |
| Infinite-Width Neural Networks for Any Architecture: Reference Implementations | arXiv | [PDF](https://arxiv.org/pdf/2006.14548.pdf) | [CODE](https://github.com/thegregyang/NTK4A) |
| Every Model Learned by Gradient Descent Is Approximately a Kernel Machine | arXiv | [PDF](https://arxiv.org/pdf/2012.00152.pdf) | - |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | arXiv | [PDF](https://arxiv.org/pdf/2012.04477.pdf) | - |
| Scalable Neural Tangent Kernel of Recurrent Architectures | arXiv | [PDF](https://arxiv.org/pdf/2012.04859.pdf) | [CODE](https://github.com/moonlightlane/RNTK_UCI) |
| Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | arXiv | [PDF](https://arxiv.org/pdf/2012.09816.pdf) | - |

## 2019
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel | NeurIPS | [PDF](https://arxiv.org/pdf/1810.05369.pdf) | - |
| Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent | NeurIPS | [PDF](https://arxiv.org/pdf/1902.06720.pdf) | [CODE](https://github.com/google/neural-tangents) |
| On Exact Computation with an Infinitely Wide Neural Net | NeurIPS | [PDF](https://arxiv.org/pdf/1904.11955.pdf) | [CODE](https://github.com/ruosongwang/cntk) |
| Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/1905.13192.pdf) | [CODE](https://github.com/KangchengHou/gntk) |
| On the Inductive Bias of Neural Tangent Kernels | NeurIPS | [PDF](https://arxiv.org/pdf/1905.12173.pdf) | [CODE](https://github.com/albietz/ckn_kernel) |
| Convergence of Adversarial Training in Overparametrized Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/1906.07916.pdf) | - |
| Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/1905.13210.pdf) | - |
| Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers | NeurIPS | [PDF](https://arxiv.org/pdf/1811.04918.pdf) | - |
| Limitations of Lazy Training of Two-layers Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/1906.08899.pdf) | - |
| The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies | NeurIPS | [PDF](https://arxiv.org/pdf/1906.00425.pdf) | [CODE](https://github.com/ykasten/Convergence-Rate-NN-Different-Frequencies) |
| On Lazy Training in Differentiable Programming | NeurIPS | [PDF](https://arxiv.org/pdf/1812.07956.pdf) | - |
| Information in Infinite Ensembles of Infinitely-Wide Neural Networks | AABI | [PDF](https://arxiv.org/pdf/1911.09189.pdf) | - |
| Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation | arXiv | [PDF](https://arxiv.org/pdf/1902.04760.pdf) | - |
| Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | arXiv | [PDF](https://arxiv.org/pdf/1905.09870.pdf) | - |
| Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems | arXiv | [PDF](https://arxiv.org/pdf/1905.11675.pdf) | - |
| Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks | arXiv | [PDF](https://arxiv.org/pdf/1905.13654.pdf) | - |
| Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts | arXiv | [PDF](https://arxiv.org/pdf/1907.05715.pdf) | - |
| A Fine-Grained Spectral Perspective on Neural Networks | arXiv | [PDF](https://arxiv.org/pdf/1907.10599.pdf) | [CODE](https://github.com/thegregyang/NNspectra) |
| Enhanced Convolutional Neural Tangent Kernels | arXiv | [PDF](https://arxiv.org/pdf/1911.00809.pdf) | - |

## 2018
| Title | Venue | PDF | CODE |
| :-----|:-----:|:---:|:----:|
| Neural Tangent Kernel: Convergence and Generalization in Neural Networks | NeurIPS | [PDF](https://arxiv.org/pdf/1806.07572.pdf) | - |