Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/donutloop/machine-learning-research-papers
Collection of machine learning research paper references
- Host: GitHub
- URL: https://github.com/donutloop/machine-learning-research-papers
- Owner: donutloop
- License: mit
- Created: 2018-09-29T09:16:46.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-06-30T08:57:10.000Z (6 months ago)
- Last Synced: 2024-08-10T14:13:17.993Z (5 months ago)
- Topics: deep-learning, deep-neural-networks, gradient-descent, machine-learning, research-paper
- Size: 55.7 KB
- Stars: 23
- Watchers: 6
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Machine learning research papers
Collection of machine learning research paper references
## LLM (Large language model)
* [Self-Rewarding Language Models](https://arxiv.org/pdf/2401.10020.pdf)
* [Meta Large Language Model Compiler: Foundation Models of Compiler Optimization](https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization)
## Math
* [A Beginner's Guide to the Mathematics of Neural Networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.3556&rep=rep1&type=pdf&fbclid=IwAR3OWInStoLwXtfjglO2XeQj1X7NNHBKPzzEou4At4GeYVGpx_zDkUEliz4)
* [Mathematics of Deep Learning](https://arxiv.org/abs/1712.04741)
* [The Matrix Calculus You Need For Deep Learning](https://arxiv.org/abs/1802.01528)
* [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285)
* [Deep Learning: An Introduction for Applied Mathematicians](https://arxiv.org/abs/1801.05894) - page 23
## Deep learning
* [Recent Advances in Deep Learning: An Overview](https://arxiv.org/abs/1807.08169)
* [Deep learning review](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf)
* [Understanding deep learning requires rethinking generalization](https://arxiv.org/abs/1611.03530)
* [Learning the Number of Neurons in Deep Networks](https://arxiv.org/abs/1611.06321)
* [Lifelong Learning with Dynamically Expandable Networks](https://arxiv.org/abs/1708.01547)
* [Dropout: a simple way to prevent neural networks from overfitting](http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)
## GAN
* [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1612.03242)
* [Self-Attention Generative Adversarial Networks](https://arxiv.org/abs/1805.08318)
## Neuro evolution
* [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578)
* [Large-Scale Evolution of Image Classifiers](https://arxiv.org/pdf/1703.01041.pdf)
* [AutoAugment: Learning Augmentation Policies from Data](https://arxiv.org/abs/1805.09501)
* [Designing Neural Network Architectures using Reinforcement Learning](https://arxiv.org/abs/1611.02167)
* [Learning Transferable Architectures for Scalable Image Recognition](https://arxiv.org/abs/1707.07012)
* [Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning](https://arxiv.org/abs/1712.06567)
* [MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks](https://arxiv.org/abs/1711.06798)
## Gradient descent
* [An overview of gradient descent optimization algorithms](https://arxiv.org/abs/1609.04747)
## Word embedding
* [Distributed Representations of Words and Phrases and their Compositionality](https://arxiv.org/abs/1310.4546)
* [Linguistic Regularities in Continuous Space Word Representations](https://www.aclweb.org/anthology/N13-1090)
* [A Neural Probabilistic Language Model](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
* [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/pubs/glove.pdf)
* [Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)
* [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://arxiv.org/abs/1607.06520)
* [FastText.zip: Compressing text classification models](https://arxiv.org/abs/1612.03651)
* [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)
## CNN
* [Siamese Neural Networks for One-shot Image Recognition](https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf)
* [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
* [Multi-column Deep Neural Networks for Image Classification](https://arxiv.org/abs/1202.2745)
* [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
* [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567)
* [Deep residual learning for image recognition](https://arxiv.org/abs/1512.03385)
* [Network In Network](https://arxiv.org/pdf/1312.4400.pdf)
* [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)
* [OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks](https://arxiv.org/pdf/1312.6229.pdf)
* [You Only Look Once: Unified, Real-Time Object Detection](https://arxiv.org/abs/1506.02640)
* [FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/pdf/1503.03832.pdf)
* [Visualizing and Understanding Convolutional Networks](https://arxiv.org/abs/1311.2901)
* [A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576)
* [Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122)
* [Deformable Convolutional Networks](https://arxiv.org/abs/1703.06211)
* [Deep Photo Style Transfer](https://arxiv.org/abs/1703.07511)
* [Wide Residual Networks](https://arxiv.org/abs/1605.07146)
* [WaveNet: A Generative Model for Raw Audio](https://arxiv.org/abs/1609.03499)
* [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)
* [Resnet in Resnet: Generalizing Residual Architectures](https://arxiv.org/abs/1603.08029)
## RL
* [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm](https://arxiv.org/pdf/1712.01815.pdf)
* [RL Overview](https://arxiv.org/abs/1701.07274)
## GRU
* [Gated Feedback Recurrent Neural Networks](https://arxiv.org/abs/1502.02367)
## RNN
* [DRAW: A Recurrent Neural Network For Image Generation](https://arxiv.org/abs/1502.04623)
* [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)
* [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling](https://arxiv.org/pdf/1412.3555.pdf)
* [Sequence to Sequence Learning with Neural Networks](https://arxiv.org/abs/1409.3215)
* [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](https://arxiv.org/abs/1406.1078)
* [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473)
* [SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning](https://arxiv.org/abs/1711.04436)
## Graph & Neural networks
* [Relational inductive biases, deep learning, and graph networks](https://arxiv.org/abs/1806.01261)
* [Interaction Networks for Learning about Objects, Relations and Physics](https://arxiv.org/pdf/1612.00222.pdf)
* [Graph neural networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1015.7227&rep=rep1&type=pdf) - Page 7
* [Recurrent Relational Networks](https://arxiv.org/abs/1711.08028)
* [Graph Capsule Convolutional Neural Networks](https://arxiv.org/abs/1805.08090)
* [Graph Neural Networks for Ranking Web Pages](https://www.researchgate.net/publication/221158677_Graph_Neural_Networks_for_Ranking_Web_Pages)
* [Graph Convolutional Neural Networks for Web-Scale Recommender Systems](https://arxiv.org/abs/1806.01973)
## Neural Module Networks
* [Neural Module Networks](https://arxiv.org/abs/1511.02799)
* [End-To-End Memory Networks](https://arxiv.org/pdf/1503.08895.pdf)
* [Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)](https://arxiv.org/abs/1412.6632)
* [Show and Tell: A Neural Image Caption Generator](https://arxiv.org/abs/1411.4555)
## Memory Networks
* [Memory Networks](https://arxiv.org/pdf/1410.3916.pdf)
## General Models
* [One Model To Learn Them All](https://arxiv.org/abs/1706.05137)
## Neural Programmer-Interpreters
* [Neural Programmer-Interpreters](https://arxiv.org/abs/1511.06279)
* [Learning Simple Algorithms from Examples](https://arxiv.org/abs/1511.07275)
* [pix2code: Generating Code from a Graphical User Interface Screenshot](https://arxiv.org/abs/1705.07962)
* [DeepCoder: Learning to Write Programs](https://arxiv.org/abs/1611.01989)
* [A deep language model for software code](https://arxiv.org/abs/1608.02715v1)
* [Tree-to-tree Neural Networks for Program Translation](https://arxiv.org/abs/1802.03691)
* [Unsupervised Translation of Programming Languages](https://arxiv.org/abs/2006.03511)
* [TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation](https://arxiv.org/abs/1810.02720)
* [TransCoder-IR: Code Translation with Compiler Representations](https://arxiv.org/abs/2207.03578)
## Database
* [SageDB: A Learned Database System](http://cidrdb.org/cidr2019/papers/p117-kraska-cidr19.pdf)
## Cache
* [Feedforward Neural Networks for Caching: Enough or Too Much?](https://arxiv.org/abs/1810.06930)
## Activations
* [Maxout networks](https://arxiv.org/pdf/1302.4389v4.pdf)
## Other
* [Event detection in Twitter: A keyword volume approach](https://arxiv.org/abs/1901.00570)
* [Bagging](https://www.stat.berkeley.edu/~breiman/bagging.pdf)
* [Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security](https://www.researchgate.net/publication/317919491_Stack_Overflow_Considered_Harmful_The_Impact_of_CopyPaste_on_Android_Application_Security)
* [DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web](http://www.vldb.org/pvldb/vol8/p2194-qiu.pdf)
## Robotics
* [End-to-End Learning of Semantic Grasping](https://arxiv.org/abs/1707.01932)
## Machine learning (Articles)
* [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
* [Conv Nets: A Modular Perspective](https://colah.github.io/posts/2014-07-Conv-Nets-Modular)
* [Understanding Convolutions](http://colah.github.io/posts/2014-07-Understanding-Convolutions/)
## Machine learning (Books)
* [Understanding Machine Learning: From Theory to Algorithms](https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf)