{"id":13993200,"url":"https://github.com/donutloop/machine-learning-research-papers","last_synced_at":"2026-01-26T06:33:02.782Z","repository":{"id":87159634,"uuid":"150847360","full_name":"donutloop/machine-learning-research-papers","owner":"donutloop","description":"Collection of machine learning research paper references ","archived":false,"fork":false,"pushed_at":"2024-06-30T08:57:10.000Z","size":57,"stargazers_count":23,"open_issues_count":0,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-08-10T14:13:17.993Z","etag":null,"topics":["deep-learning","deep-neural-networks","gradient-descent","machine-learning","research-paper"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/donutloop.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2018-09-29T09:16:46.000Z","updated_at":"2024-07-29T08:56:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"b40f7def-1509-4ad6-9df5-86292faa3e76","html_url":"https://github.com/donutloop/machine-learning-research-papers","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donutloop%2Fmachine-learning-research-papers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donutloop%2Fmachine-learning-research-papers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donutloop%2Fmachine-learning-research-papers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/donutloop%2Fmachine-learning-research-papers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/donutloop","download_url":"https://codeload.github.com/donutloop/machine-learning-research-papers/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227143417,"owners_count":17737154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-neural-networks","gradient-descent","machine-learning","research-paper"],"created_at":"2024-08-09T14:02:16.147Z","updated_at":"2026-01-26T06:33:02.774Z","avatar_url":"https://github.com/donutloop.png","language":null,"funding_links":[],"categories":["Others"],"sub_categories":[],"readme":"# Machine learning research papers\n\nCollection of machine learning research paper references \n\n### LLM (Large language model)\n\n* [Self-Rewarding Language Models](https://arxiv.org/pdf/2401.10020.pdf)\n* [Meta Large Language Model Compiler: Foundation Models of Compiler Optimization](https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization)\n\n## Math\n\n* [A Beginner's Guide to the Mathematics of Neural Networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.3556\u0026rep=rep1\u0026type=pdf\u0026fbclid=IwAR3OWInStoLwXtfjglO2XeQj1X7NNHBKPzzEou4At4GeYVGpx_zDkUEliz4)\n* [Mathematics of Deep Learning](https://arxiv.org/abs/1712.04741)\n* [The Matrix Calculus You Need For Deep Learning](https://arxiv.org/abs/1802.01528)\n* [A guide to convolution arithmetic 
for deep learning](https://arxiv.org/abs/1603.07285)\n* [Deep Learning: An Introduction for Applied Mathematicians](https://arxiv.org/abs/1801.05894) - page 23\n\n## Deep learning\n\n* [Recent Advances in Deep Learning: An Overview](https://arxiv.org/abs/1807.08169)\n* [Deep learning review](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf)\n* [Understanding deep learning requires rethinking generalization](https://arxiv.org/abs/1611.03530)\n* [Learning the Number of Neurons in Deep Networks](https://arxiv.org/abs/1611.06321)\n* [Lifelong Learning with Dynamically Expandable Networks](https://arxiv.org/abs/1708.01547)\n* [Dropout: a simple way to prevent neural networks from overfitting](http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)\n* [Self-Attentive Pooling for Efficient Deep Learning](https://arxiv.org/abs/2209.07659)\n\n## GAN\n\n* [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1612.03242)\n* [Self-Attention Generative Adversarial Networks](https://arxiv.org/abs/1805.08318)\n\n## Neuro evolution\n\n* [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578)\n* [Large-Scale Evolution of Image Classifiers](https://arxiv.org/pdf/1703.01041.pdf)\n* [AutoAugment: Learning Augmentation Policies from Data](https://arxiv.org/abs/1805.09501)\n* [Designing Neural Network Architectures using Reinforcement Learning](https://arxiv.org/abs/1611.02167)\n* [Learning Transferable Architectures for Scalable Image Recognition](https://arxiv.org/abs/1707.07012)\n* [Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning](https://arxiv.org/abs/1712.06567)\n* [MorphNet: Fast \u0026 Simple Resource-Constrained Structure Learning of Deep\nNetworks](https://arxiv.org/abs/1711.06798)\n\n## Gradient descent\n\n* [An overview of gradient descent optimization 
algorithms](https://arxiv.org/abs/1609.04747)\n\n## Word embedding \n\n* [Distributed Representations of Words and Phrases and their Compositionality](https://arxiv.org/abs/1310.4546)\n* [Linguistic Regularities in Continuous Space Word Representations](https://www.aclweb.org/anthology/N13-1090)\n* [A Neural Probabilistic Language Model](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)\n* [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/pubs/glove.pdf)\n* [Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)\n* [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://arxiv.org/abs/1607.06520)\n* [FastText.zip: Compressing text classification models](https://arxiv.org/abs/1612.03651)\n* [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)\n\n## CNN\n\n* [Siamese Neural Networks for One-shot Image Recognition](https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf)\n* [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)\n* [Multi-column Deep Neural Networks for Image Classification](https://arxiv.org/abs/1202.2745)\n* [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n* [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567)\n* [Deep residual learning for image recognition](https://arxiv.org/abs/1512.03385)\n* [Network In Network](https://arxiv.org/pdf/1312.4400.pdf)\n* [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)\n* [OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks](https://arxiv.org/pdf/1312.6229.pdf)\n* [You Only Look Once: Unified, Real-Time Object Detection](https://arxiv.org/abs/1506.02640)\n* [FaceNet: 
A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/pdf/1503.03832.pdf)\n* [Visualizing and Understanding Convolutional Networks](https://arxiv.org/abs/1311.2901)\n* [A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576)\n* [Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122)\n* [Deformable Convolutional Networks](https://arxiv.org/abs/1703.06211)\n* [Deep Photo Style Transfer](https://arxiv.org/abs/1703.07511)\n* [Wide Residual Networks](https://arxiv.org/abs/1605.07146)\n* [WaveNet: A Generative Model for Raw Audio](https://arxiv.org/abs/1609.03499)\n* [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)\n* [Resnet in Resnet: Generalizing Residual Architectures](https://arxiv.org/abs/1603.08029)\n\n## RL\n \n* [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm](https://arxiv.org/pdf/1712.01815.pdf)\n* [RL Overview](https://arxiv.org/abs/1701.07274)\n\n## GRU\n\n* [Gated Feedback Recurrent Neural Networks](https://arxiv.org/abs/1502.02367)\n \n## RNN\n\n* [DRAW: A Recurrent Neural Network For Image Generation](https://arxiv.org/abs/1502.04623)\n* [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)\n* [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling](https://arxiv.org/pdf/1412.3555.pdf)\n* [Sequence to Sequence Learning with Neural Networks](https://arxiv.org/abs/1409.3215)\n* [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](https://arxiv.org/abs/1406.1078)\n* [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473)\n* [SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning](https://arxiv.org/abs/1711.04436)\n\n## Graph \u0026 Neural networks\n\n* [Relational inductive biases, deep learning, and graph networks](https://arxiv.org/abs/1806.01261)\n* 
[Interaction Networks for Learning about Objects, Relations and Physics](https://arxiv.org/pdf/1612.00222.pdf)\n* [Graph neural networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1015.7227\u0026rep=rep1\u0026type=pdf) - Page 7\n* [Recurrent Relational Networks](https://arxiv.org/abs/1711.08028)\n* [Graph Capsule Convolutional Neural Networks](https://arxiv.org/abs/1805.08090)\n* [Graph Neural Networks for Ranking Web Pages](https://www.researchgate.net/publication/221158677_Graph_Neural_Networks_for_Ranking_Web_Pages)\n* [Graph Convolutional Neural Networks for Web-Scale Recommender Systems](https://arxiv.org/abs/1806.01973)\n\n## Neural Module Networks\n\n* [Neural Module Networks](https://arxiv.org/abs/1511.02799)\n* [End-To-End Memory Networks](https://arxiv.org/pdf/1503.08895.pdf)\n* [Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)](https://arxiv.org/abs/1412.6632)\n* [Show and Tell: A Neural Image Caption Generator](https://arxiv.org/abs/1411.4555)\n\n## Memory Networks \n\n* [Memory Networks](https://arxiv.org/pdf/1410.3916.pdf)\n\n## General Models\n\n* [One Model To Learn Them All](https://arxiv.org/abs/1706.05137)\n\n## Neural Programmer-Interpreters\n\n* [Neural Programmer-Interpreters](https://arxiv.org/abs/1511.06279)\n* [Learning Simple Algorithms from Examples](https://arxiv.org/abs/1511.07275)\n* [pix2code: Generating Code from a Graphical User Interface Screenshot](https://arxiv.org/abs/1705.07962)\n* [DeepCoder: Learning to Write Programs](https://arxiv.org/abs/1611.01989)\n* [A deep language model for software code](https://arxiv.org/abs/1608.02715v1)\n* [Tree-to-tree Neural Networks for Program Translation](https://arxiv.org/abs/1802.03691)\n* [Unsupervised Translation of Programming Languages](https://arxiv.org/abs/2006.03511)\n* [TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation](https://arxiv.org/abs/1810.02720)\n* [TransCoder-IR: Code Translation with Compiler 
Representations](https://arxiv.org/abs/2207.03578)\n\n## Database\n\n* [SageDB: A Learned Database System](http://cidrdb.org/cidr2019/papers/p117-kraska-cidr19.pdf)\n\n## Cache \n\n* [Feedforward Neural Networks for Caching: Enough or Too Much?](https://arxiv.org/abs/1810.06930)\n\n## Activations\n\n* [Maxout networks](https://arxiv.org/pdf/1302.4389v4.pdf)\n\n## Other\n\n* [Event detection in Twitter: A keyword volume approach](https://arxiv.org/abs/1901.00570)\n* [Bagging](https://www.stat.berkeley.edu/~breiman/bagging.pdf)\n* [Stack Overflow Considered Harmful? The Impact of Copy\u0026Paste on Android Application Security](https://www.researchgate.net/publication/317919491_Stack_Overflow_Considered_Harmful_The_Impact_of_CopyPaste_on_Android_Application_Security)\n* [DEXTER: Large-Scale Discovery and Extraction of Product\nSpecifications on the Web](http://www.vldb.org/pvldb/vol8/p2194-qiu.pdf)\n\n## Robotics\n\n* [End-to-End Learning of Semantic Grasping](https://arxiv.org/abs/1707.01932)\n\n## Machine learning (Articles)\n\n* [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)\n* [Conv Nets: A Modular Perspective](https://colah.github.io/posts/2014-07-Conv-Nets-Modular)\n* [Understanding Convolutions](http://colah.github.io/posts/2014-07-Understanding-Convolutions/)\n\n## Machine learning (Books)\n\n* [Understanding machine learning theory algorithms](https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonutloop%2Fmachine-learning-research-papers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdonutloop%2Fmachine-learning-research-papers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonutloop%2Fmachine-learning-research-papers/lists"}