https://github.com/digantamisra98/library
Paper reading list
abstract-algebra computer-vision continual-learning deep-learning machine-learning neural-networks nonlinear-dynamics optimization theory
- Host: GitHub
- URL: https://github.com/digantamisra98/library
- Owner: digantamisra98
- Created: 2020-07-09T02:35:01.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-05T15:47:07.000Z (over 5 years ago)
- Last Synced: 2025-02-01T07:16:22.765Z (over 1 year ago)
- Topics: abstract-algebra, computer-vision, continual-learning, deep-learning, machine-learning, neural-networks, nonlinear-dynamics, optimization, theory
- Size: 46.9 KB
- Stars: 16
- Watchers: 5
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Library
### Deep Learning:
- [LCA: Loss Change Allocation for Neural Network Training](https://arxiv.org/abs/1909.01440)
- [Asymptotics of Wide Networks from Feynman Diagrams](https://arxiv.org/abs/1909.11304)
- [Neural networks and physical systems with emergent collective computational abilities](https://www.pnas.org/content/79/8/2554)
- [Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes](https://arxiv.org/abs/1903.08778)
- [Adversarial Robustness Through Local Lipschitzness](https://arxiv.org/abs/2003.02460)
- [Lagrangian Neural Networks](https://arxiv.org/abs/2003.04630)
- [Inherent Weight Normalization in Stochastic Neural Networks](https://openreview.net/forum?id=H1xDPEBx8r)
- [Neural Arithmetic Units](https://openreview.net/forum?id=H1gNOeHKPS)
- [Information Theory, Inference and Learning Algorithms](https://books.google.co.in/books/about/Information_Theory_Inference_and_Learnin.html?id=AKuMj4PN_EMC&printsec=frontcover&source=kp_read_button&redir_esc=y#v=onepage&q&f=false)
- [Intriguing properties of neural networks](https://arxiv.org/abs/1312.6199)
- [An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks](https://arxiv.org/abs/2005.08027)
- [Rigging the Lottery: Making All Tickets Winners](https://arxiv.org/abs/1911.11134)
#### Mean Field Theory/ EOC/ Dynamic Isometry:
- [Deep Information Propagation](https://arxiv.org/abs/1611.01232)
- [Exponential expressivity in deep neural networks through transient chaos](https://arxiv.org/abs/1606.05340)
- [Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice](https://arxiv.org/abs/1711.04735)
- [Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function](https://arxiv.org/pdf/1809.08848v3.pdf)
- [Mean Field Residual Networks: On the Edge of Chaos](https://arxiv.org/abs/1712.08969)
- [Mean Field Theory of Activation Functions in Deep Neural Networks](https://arxiv.org/abs/1805.08786)
- [Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks](https://arxiv.org/abs/1806.05393)
- [On the Impact of the Activation Function on Deep Neural Networks Training](https://arxiv.org/abs/1902.06853)
- [Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks](https://arxiv.org/abs/1806.05394)
- [Disentangling trainability and generalization in deep learning](https://arxiv.org/abs/1912.13053)
- [A Mean Field View of the Landscape of Two-Layers Neural Networks](https://arxiv.org/abs/1804.06561)
- [A Mean Field Theory of Batch Normalization](https://openreview.net/forum?id=SyMDXnCcF7)
- [Statistical field theory for neural networks](https://arxiv.org/abs/1901.10416)
- [A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth](https://arxiv.org/abs/2003.05508)
- [Towards understanding the true loss surface of deep neural networks using random matrix theory and iterative spectral methods](https://openreview.net/forum?id=H1gza2NtwH)
#### Optimization/ Line Search/ Wolfe's Theorem:
- [Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration](https://arxiv.org/pdf/1807.06766.pdf)
- [Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence](https://arxiv.org/pdf/2002.10542.pdf)
- [Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates](https://arxiv.org/pdf/1905.09997.pdf)
- [Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition](https://arxiv.org/pdf/1608.04636.pdf)
- [On the distance between two neural networks and the stability of learning](https://arxiv.org/abs/2002.03432)
- [The large learning rate phase of deep learning: the catapult mechanism](https://arxiv.org/pdf/2003.02218.pdf)
- [A Fine-Grained Spectral Perspective on Neural Networks](https://arxiv.org/abs/1907.10599)
- [The Geometry of Sign Gradient Descent](https://arxiv.org/abs/2002.08056)
- [The Break-Even Point on Optimization Trajectories of Deep Neural Networks](https://arxiv.org/abs/2002.09572)
- [Quasi-hyperbolic momentum and Adam for deep learning](https://openreview.net/forum?id=S1fUpoR5FQ)
- [A new regret analysis for Adam-type algorithms](http://arxiv-export-lb.library.cornell.edu/abs/2003.09729)
- [Disentangling Adaptive Gradient Methods from Learning Rates](https://arxiv.org/abs/2002.11803)
- [Stochastic Flows and Geometric Optimization on the Orthogonal Group](https://arxiv.org/abs/2003.13563)
- [Adaptive Multi-level Hyper-gradient Descent](https://arxiv.org/abs/2008.07277)
##### Bonus:
- [Convex optimization: Gradient Methods and Online Learning](https://sites.cs.ucsb.edu/~yuxiangw/classes/CS292A-2019Spring/)
#### Non-Linear Dynamics:
- [Regularizing activations in neural networks via distribution matching with the Wasserstein metric](https://openreview.net/pdf?id=rygwLgrYPB)
- [Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem](https://arxiv.org/abs/1912.04378)
- [Effect of Activation Functions on the Training of Overparametrized Neural Nets](https://arxiv.org/abs/1908.05660v4)
- [Implicit Neural Representations with Periodic Activation Functions](https://arxiv.org/abs/2006.09661)
- [Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem](https://arxiv.org/pdf/1812.05720.pdf)
- [Small nonlinearities in activation functions create bad local minima in neural networks](https://openreview.net/forum?id=rke_YiRct7)
- [Tempered Sigmoid Activations for Deep Learning with Differential Privacy](https://arxiv.org/abs/2007.14191)
- [Neural Networks Fail to Learn Periodic Functions and How to Fix It](https://arxiv.org/abs/2006.08195)
### Computer Vision:
- [Making Convolutional Networks Shift-Invariant Again](https://arxiv.org/abs/1904.11486)
- [GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing](https://arxiv.org/abs/1908.03245)
- [Butterfly Transform: An Efficient FFT Based Neural Architecture Design](https://arxiv.org/abs/1906.02256)
- [ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network](https://arxiv.org/pdf/2007.00992.pdf)
- [Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains](https://arxiv.org/abs/2006.10739)
- [Learning One Convolutional Layer with Overlapping Patches](https://arxiv.org/abs/1802.02547)
- [Batch-Shaping for Learning Conditional Channel Gated Networks](https://arxiv.org/abs/1907.06627)
- [Convolutional Networks with Adaptive Inference Graphs](https://arxiv.org/abs/1711.11503)
- [The Singular Values of Convolutional Layers](https://openreview.net/forum?id=rJevYoA9Fm)
- [Rendering Natural Camera Bokeh Effect with Deep Learning](https://arxiv.org/abs/2006.05698)
- [Towards Learning Convolutions from Scratch](https://arxiv.org/abs/2007.13657)
- [Feature Products Yield Efficient Networks](https://arxiv.org/abs/2008.07930)
### Incremental Learning/ Continual Learning/ Lifelong Learning:
- [Conditional Channel Gated Networks for Task-Aware Continual Learning](https://arxiv.org/abs/2004.00070)
- [Supermasks in Superposition](https://arxiv.org/pdf/2006.14769.pdf)
### Mathematics (Mostly Abstract Algebra/ Topology/ Statistical Mechanics):
- [Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning](https://www.cis.upenn.edu/~jean/math-deep.pdf)
- [Algebra (Artin)](https://solisinvicti.com/books/TheOlympiad/Books/AlgebraArtin.pdf)
- [Contemporary Abstract Algebra](https://people.clas.ufl.edu/cmcyr/files/Abstract-Algebra-Text_Gallian-e8.pdf)
- [Statistical Mechanics of Deep Learning](https://www.annualreviews.org/doi/full/10.1146/annurev-conmatphys-031119-050745)
- [Linear Algebra](http://joshua.smcvt.edu/linearalgebra/)
- [Linear Algebra Done Right](https://link.springer.com/book/10.1007/978-3-319-11080-6)
### Immediate:
- [A Simple Framework for Contrastive Learning of Visual Representations](https://arxiv.org/abs/2002.05709)
- [Self-supervised Label Augmentation via Input Transformations](https://arxiv.org/abs/1910.05872)
- [On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them](https://arxiv.org/abs/2006.08403)
- [Structured Convolutions for Efficient Neural Network Design](https://deepai.org/publication/structured-convolutions-for-efficient-neural-network-design)
- [Tensor Programs III: Neural Matrix Laws](https://arxiv.org/abs/2009.10685)
- [An Investigation into Neural Net Optimization via Hessian Eigenvalue Density](https://arxiv.org/abs/1901.10159)
- [The Hardware Lottery](https://arxiv.org/abs/2009.06489)
- [Tensor Programs II: Neural Tangent Kernel for Any Architecture](https://arxiv.org/abs/2006.14548)
- [PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks](https://arxiv.org/abs/2007.11752)
- [Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators](https://arxiv.org/abs/2006.11469)
- [Hypersolvers: Toward Fast Continuous-Depth Models](https://arxiv.org/abs/2007.09601)
- [Residual Feature Distillation Network for Lightweight Image Super-Resolution](https://arxiv.org/abs/2009.11551)
- [SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness](https://arxiv.org/abs/2009.10195)
- [HyperNetworks](https://arxiv.org/abs/1609.09106)
- [Understanding the Role of Individual Units in a Deep Neural Network](https://arxiv.org/abs/2009.05041)