https://github.com/digantamisra98/library

Paper reading list
abstract-algebra computer-vision continual-learning deep-learning machine-learning neural-networks nonlinear-dynamics optimization theory
Last synced: 6 months ago
Host: GitHub
URL: https://github.com/digantamisra98/library
Owner: digantamisra98
Created: 2020-07-09T02:35:01.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2020-10-05T15:47:07.000Z (almost 6 years ago)
Last Synced: 2025-02-01T07:16:22.765Z (over 1 year ago)
Topics: abstract-algebra, computer-vision, continual-learning, deep-learning, machine-learning, neural-networks, nonlinear-dynamics, optimization, theory
Homepage:
Size: 46.9 KB
Stars: 16
Watchers: 5
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Library

### Deep Learning: 

- [LCA: Loss Change Allocation for Neural Network Training](https://arxiv.org/abs/1909.01440)

- [Asymptotics of Wide Networks from Feynman Diagrams](https://arxiv.org/abs/1909.11304)

- [Neural networks and physical systems with emergent collective computational abilities](https://www.pnas.org/content/79/8/2554)

- [Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes](https://arxiv.org/abs/1903.08778)

- [Adversarial Robustness Through Local Lipschitzness](https://arxiv.org/abs/2003.02460)

- [Lagrangian Neural Networks](https://arxiv.org/abs/2003.04630)

- [Inherent Weight Normalization in Stochastic Neural Networks](https://openreview.net/forum?id=H1xDPEBx8r)

- [Neural Arithmetic Units](https://openreview.net/forum?id=H1gNOeHKPS)

- [Information Theory, Inference and Learning Algorithms](https://books.google.co.in/books/about/Information_Theory_Inference_and_Learnin.html?id=AKuMj4PN_EMC&printsec=frontcover&source=kp_read_button&redir_esc=y#v=onepage&q&f=false)

- [Intriguing properties of neural networks](https://arxiv.org/abs/1312.6199)

- [An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks](https://arxiv.org/abs/2005.08027)

- [Rigging the Lottery: Making All Tickets Winners](https://arxiv.org/abs/1911.11134)

#### Mean Field Theory/ EOC/ Dynamical Isometry:

- [Deep Information Propagation](https://arxiv.org/abs/1611.01232)

- [Exponential expressivity in deep neural networks through transient chaos](https://arxiv.org/abs/1606.05340)

- [Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice](https://arxiv.org/abs/1711.04735)

- [Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function](https://arxiv.org/pdf/1809.08848v3.pdf)

- [Mean Field Residual Networks: On the Edge of Chaos](https://arxiv.org/abs/1712.08969)

- [Mean Field Theory of Activation Functions in Deep Neural Networks](https://arxiv.org/abs/1805.08786)

- [Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks](https://arxiv.org/abs/1806.05393)

- [On the Impact of the Activation Function on Deep Neural Networks Training](https://arxiv.org/abs/1902.06853)

- [Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks](https://arxiv.org/abs/1806.05394)

- [Disentangling trainability and generalization in deep learning](https://arxiv.org/abs/1912.13053)

- [A Mean Field View of the Landscape of Two-Layers Neural Networks](https://arxiv.org/abs/1804.06561)

- [A Mean Field Theory of Batch Normalization](https://openreview.net/forum?id=SyMDXnCcF7)

- [Statistical field theory for neural networks](https://arxiv.org/abs/1901.10416)

- [A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth](https://arxiv.org/abs/2003.05508)

- [Towards understanding the true loss surface of deep neural networks using random matrix theory and iterative spectral methods](https://openreview.net/forum?id=H1gza2NtwH)

#### Optimization/ Line Search/ Wolfe's Theorem:

- [Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration](https://arxiv.org/pdf/1807.06766.pdf)

- [Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence](https://arxiv.org/pdf/2002.10542.pdf)

- [Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates](https://arxiv.org/pdf/1905.09997.pdf)

- [Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition](https://arxiv.org/pdf/1608.04636.pdf)

- [On the distance between two neural networks and the stability of learning](https://arxiv.org/abs/2002.03432)

- [The large learning rate phase of deep learning: the catapult mechanism](https://arxiv.org/pdf/2003.02218.pdf)

- [Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates](https://papers.nips.cc/paper/8630-painless-stochastic-gradient-interpolation-line-search-and-convergence-rates.pdf)

- [A Fine-Grained Spectral Perspective on Neural Networks](https://arxiv.org/abs/1907.10599)

- [The Geometry of Sign Gradient Descent](https://arxiv.org/abs/2002.08056)

- [The Break-Even Point on Optimization Trajectories of Deep Neural Networks](https://arxiv.org/abs/2002.09572)

- [Quasi-hyperbolic momentum and Adam for deep learning](https://openreview.net/forum?id=S1fUpoR5FQ)

- [A new regret analysis for Adam-type algorithms](http://arxiv-export-lb.library.cornell.edu/abs/2003.09729)

- [Disentangling Adaptive Gradient Methods from Learning Rates](https://arxiv.org/abs/2002.11803)

- [Stochastic Flows and Geometric Optimization on the Orthogonal Group](https://arxiv.org/abs/2003.13563)

- [Adaptive Multi-level Hyper-gradient Descent](https://arxiv.org/abs/2008.07277)

##### Bonus:

- [Convex optimization: Gradient Methods and Online Learning](https://sites.cs.ucsb.edu/~yuxiangw/classes/CS292A-2019Spring/)

#### Non-Linear Dynamics: 

- [Regularizing activations in neural networks via distribution matching with the Wasserstein metric](https://openreview.net/pdf?id=rygwLgrYPB)

- [Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem](https://arxiv.org/abs/1912.04378)

- [Effect of Activation Functions on the Training of Overparametrized Neural Nets](https://arxiv.org/abs/1908.05660v4)

- [Implicit Neural Representations with Periodic Activation Functions](https://arxiv.org/abs/2006.09661)

- [Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem](https://arxiv.org/pdf/1812.05720.pdf)

- [Small nonlinearities in activation functions create bad local minima in neural networks](https://openreview.net/forum?id=rke_YiRct7)

- [Tempered Sigmoid Activations for Deep Learning with Differential Privacy](https://arxiv.org/abs/2007.14191)

- [Neural Networks Fail to Learn Periodic Functions and How to Fix It](https://arxiv.org/abs/2006.08195)

### Computer Vision:

- [Making Convolutional Networks Shift-Invariant Again](https://arxiv.org/abs/1904.11486)

- [GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing](https://arxiv.org/abs/1908.03245)

- [Butterfly Transform: An Efficient FFT Based Neural Architecture Design](https://arxiv.org/abs/1906.02256)

- [ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network](https://arxiv.org/pdf/2007.00992.pdf)

- [Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains](https://arxiv.org/abs/2006.10739)

- [Learning One Convolutional Layer with Overlapping Patches](https://arxiv.org/abs/1802.02547)

- [Batch-Shaping for Learning Conditional Channel Gated Networks](https://arxiv.org/abs/1907.06627)

- [Convolutional Networks with Adaptive Inference Graphs](https://arxiv.org/abs/1711.11503)

- [The Singular Values of Convolutional Layers](https://openreview.net/forum?id=rJevYoA9Fm)

- [Rendering Natural Camera Bokeh Effect with Deep Learning](https://arxiv.org/abs/2006.05698)

- [Towards Learning Convolutions from Scratch](https://arxiv.org/abs/2007.13657)

- [Feature Products Yield Efficient Networks](https://arxiv.org/abs/2008.07930)

### Incremental Learning/ Continual Learning/ Lifelong Learning:

- [Conditional Channel Gated Networks for Task-Aware Continual Learning](https://arxiv.org/abs/2004.00070)

- [Supermasks in Superposition](https://arxiv.org/pdf/2006.14769.pdf)

### Mathematics (Mostly Abstract Algebra/ Topology/ Statistical Mechanics): 

- [Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning](https://www.cis.upenn.edu/~jean/math-deep.pdf)

- [Algebra](https://solisinvicti.com/books/TheOlympiad/Books/AlgebraArtin.pdf)

- [Contemporary Abstract Algebra](https://people.clas.ufl.edu/cmcyr/files/Abstract-Algebra-Text_Gallian-e8.pdf)

- [Statistical Mechanics of Deep Learning](https://www.annualreviews.org/doi/full/10.1146/annurev-conmatphys-031119-050745)

- [Linear Algebra](http://joshua.smcvt.edu/linearalgebra/)

- [Linear Algebra Done Right](https://link.springer.com/book/10.1007/978-3-319-11080-6)

### Immediate:

- [A Simple Framework for Contrastive Learning of Visual Representations](https://arxiv.org/abs/2002.05709)

- [Self-supervised Label Augmentation via Input Transformations](https://arxiv.org/abs/1910.05872)

- [On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them](https://arxiv.org/abs/2006.08403)

- [Structured Convolutions for Efficient Neural Network Design](https://deepai.org/publication/structured-convolutions-for-efficient-neural-network-design)

- [Tensor Programs III: Neural Matrix Laws](https://arxiv.org/abs/2009.10685)

- [An Investigation into Neural Net Optimization via Hessian Eigenvalue Density](https://arxiv.org/abs/1901.10159)

- [The Hardware Lottery](https://arxiv.org/abs/2009.06489)

- [Tensor Programs II: Neural Tangent Kernel for Any Architecture](https://arxiv.org/abs/2006.14548)

- [PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks](https://arxiv.org/abs/2007.11752)

- [Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators](https://arxiv.org/abs/2006.11469)

- [Hypersolvers: Toward Fast Continuous-Depth Models](https://arxiv.org/abs/2007.09601)

- [Residual Feature Distillation Network for Lightweight Image Super-Resolution](https://arxiv.org/abs/2009.11551)

- [SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness](https://arxiv.org/abs/2009.10195)

- [HyperNetworks](https://arxiv.org/abs/1609.09106)

- [Understanding the Role of Individual Units in a Deep Neural Network](https://arxiv.org/abs/2009.05041)