{"id":13477869,"url":"https://github.com/dennybritz/deeplearning-papernotes","last_synced_at":"2025-03-25T10:42:41.480Z","repository":{"id":49523613,"uuid":"48293060","full_name":"dennybritz/deeplearning-papernotes","owner":"dennybritz","description":"Summaries and notes on Deep Learning research papers","archived":false,"fork":false,"pushed_at":"2018-02-13T01:04:02.000Z","size":411,"stargazers_count":4417,"open_issues_count":6,"forks_count":905,"subscribers_count":758,"default_branch":"master","last_synced_at":"2025-01-30T09:43:24.706Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dennybritz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-12-19T18:34:21.000Z","updated_at":"2025-01-26T03:53:00.000Z","dependencies_parsed_at":"2022-09-18T23:02:51.386Z","dependency_job_id":null,"html_url":"https://github.com/dennybritz/deeplearning-papernotes","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dennybritz%2Fdeeplearning-papernotes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dennybritz%2Fdeeplearning-papernotes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dennybritz%2Fdeeplearning-papernotes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dennybritz%2Fdeeplearning-papernotes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dennybritz","download_url":"https://codeload.github.com/dennybritz/deeplearni
ng-papernotes/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245449537,"owners_count":20617187,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T16:01:48.762Z","updated_at":"2025-03-25T10:42:41.452Z","avatar_url":"https://github.com/dennybritz.png","language":null,"funding_links":[],"categories":["Others","3. Paper","Deep Learning Repositories","Related","Other useful related lists and resources","Machine Learning","论文集合","Deep Learning ##","📚 Project Purpose"],"sub_categories":["2.2 Blog","General","深度学习","Design Patterns ###","Machine Learning (Intermediate-Level"],"readme":"#### 2018-02\n\n- The Matrix Calculus You Need For Deep Learning [[arXiv](https://arxiv.org/abs/1802.01528v2)]\n- Regularized Evolution for Image Classifier Architecture Search [[arXiv](https://arxiv.org/abs/1802.01548)]\n- Online Learning: A Comprehensive Survey [[arXiv](https://arxiv.org/abs/1802.02871)]\n- Visual Interpretability for Deep Learning: a Survey [[arXiv](https://arxiv.org/abs/1802.00614)]\n- Behavior is Everything – Towards Representing Concepts with Sensorimotor Contingencies [[paper](https://www.vicarious.com/wp-content/uploads/2018/01/AAAI18-pixelworld.pdf)] [[article](https://www.vicarious.com/2018/02/07/learning-concepts-through-sensorimotor-interactions/)] [[code](https://github.com/vicariousinc/pixelworld)]\n- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures [[arXiv](https://arxiv.org/abs/1802.01561)] 
[[article](https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/)] [[code](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/dmlab30)]\n- DeepType: Multilingual Entity Linking by Neural Type System Evolution [[arXiv](https://arxiv.org/abs/1802.01021)] [[article](https://blog.openai.com/discovering-types-for-entity-disambiguation/)] [[code](https://github.com/openai/deeptype)]\n- DensePose: Dense Human Pose Estimation In The Wild [[arXiv](https://arxiv.org/abs/1802.00434)] [[article](http://densepose.org/)]\n\n#### 2018-01\n\n- Nested LSTMs [[arXiv](https://arxiv.org/abs/1801.10308)]\n- Generating Wikipedia by Summarizing Long Sequences [[arXiv](https://arxiv.org/abs/1801.10198)]\n- Scalable and accurate deep learning for electronic health records [[arXiv](https://arxiv.org/abs/1801.07860)]\n- Kernel Feature Selection via Conditional Covariance Minimization [[NIPS paper](https://papers.nips.cc/paper/7270-kernel-feature-selection-via-conditional-covariance-minimization.pdf)] [[article](http://bair.berkeley.edu/blog/2018/01/23/kernels/)] [[code](https://github.com/Jianbo-Lab/CCM)]\n- Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents [[arXiv](https://arxiv.org/abs/1801.08116)] [[article](https://deepmind.com/blog/open-sourcing-psychlab/)] [[code](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/psychlab)]\n- Fine-tuned Language Models for Text Classification [[arXiv](https://arxiv.org/abs/1801.06146)] [[code]()] (soon)\n- Deep Learning: An Introduction for Applied Mathematicians [[arXiv](https://arxiv.org/abs/1801.05894v1)]\n- Innateness, AlphaZero, and Artificial Intelligence [[arXiv](https://arxiv.org/abs/1801.05667)]\n- Can Computers Create Art? 
[[arXiv](https://arxiv.org/abs/1801.04486)]\n- eCommerceGAN : A Generative Adversarial Network for E-commerce [[arXiv](https://arxiv.org/abs/1801.03244)]\n- Expected Policy Gradients for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1801.03326)]\n- DroNet: Learning to Fly by Driving [[UZH docs](http://rpg.ifi.uzh.ch/docs/RAL18_Loquercio.pdf)] [[article](http://rpg.ifi.uzh.ch/dronet.html)] [[code](https://github.com/uzh-rpg/rpg_public_dronet)]\n- Symmetric Decomposition of Asymmetric Games [[Scientific Reports](https://www.nature.com/articles/s41598-018-19194-4)] [[article](https://deepmind.com/blog/game-theory-insights-asymmetric-multi-agent-games/)]\n- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor [[arXiv](https://arxiv.org/abs/1801.01290)] [[code](https://github.com/haarnoja/sac)]\n- SBNet: Sparse Blocks Network for Fast Inference [[arXiv](https://arxiv.org/pdf/1801.02108.pdf)] [[article](https://eng.uber.com/sbnet/)] [[code](https://github.com/uber/sbnet)]\n- DeepMind Control Suite [[arXiv](https://arxiv.org/abs/1801.00690)] [[code](https://github.com/deepmind/dm_control)]\n- Deep Learning: A Critical Appraisal [[arXiv](https://arxiv.org/abs/1801.00631)]\n\n\n#### 2017-12\n\n- Adversarial Patch [[arXiv](https://arxiv.org/abs/1712.09665)]\n- CNN Is All You Need [[arXiv](https://arxiv.org/abs/1712.09662)]\n- Learning Robot Objectives from Physical Human Interaction [[paper](http://proceedings.mlr.press/v78/bajcsy17a/bajcsy17a.pdf)] [[article](http://bair.berkeley.edu/blog/2018/02/06/phri/)]\n- The NarrativeQA Reading Comprehension Challenge [[arXiv](https://arxiv.org/abs/1712.07040v1)] [[dataset](https://github.com/deepmind/narrativeqa)]\n- Objects that Sound [[arXiv](https://arxiv.org/abs/1712.06651)]\n- Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions [[arXiv](https://arxiv.org/abs/1712.05884)] 
[[article](https://research.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html)] [[article2](https://google.github.io/tacotron/publications/tacotron2/index.html)]\n- Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1712.06567)] [[article](https://eng.uber.com/deep-neuroevolution/)] [[code](https://github.com/uber-common/deep-neuroevolution)]\n- Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents [[arXiv](https://arxiv.org/abs/1712.06560)] [[article](https://eng.uber.com/deep-neuroevolution/)] [[code](https://github.com/uber-common/deep-neuroevolution)]\n- Superhuman AI for heads-up no-limit poker: Libratus beats top professionals [[Science](http://science.sciencemag.org/content/early/2017/12/15/science.aao1733)]\n- Mathematics of Deep Learning [[arXiv](https://arxiv.org/abs/1712.04741)]\n- State-of-the-art Speech Recognition With Sequence-to-Sequence Models [[arXiv](https://arxiv.org/abs/1712.01769)] [[article](https://research.googleblog.com/2017/12/improving-end-to-end-models-for-speech.html)]\n- Peephole: Predicting Network Performance Before Training [[arXiv](https://arxiv.org/abs/1712.03351)]\n- Deliberation Network: Pushing the frontiers of neural machine translation [[Research at Microsoft](https://www.microsoft.com/en-us/research/publication/deliberation-networks-sequence-generation-beyond-one-pass-decoding/)] [[article](https://www.microsoft.com/en-us/research/blog/deliberation-networks/)]\n- GPU Kernels for Block-Sparse Weights [[Research at OpenAI](https://s3-us-west-2.amazonaws.com/openai-assets/blocksparse/blocksparsepaper.pdf)] [[article](https://blog.openai.com/block-sparse-gpu-kernels/)] [[code](https://github.com/openai/blocksparse)]\n- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm 
[[arXiv](https://arxiv.org/abs/1712.01815)]\n- Deep Learning Scaling is Predictable, Empirically [[arXiv](https://arxiv.org/abs/1712.00409)] [[article](http://research.baidu.com/deep-learning-scaling-predictable-empirically/)]\n\n#### 2017-11\n\n- High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [[arXiv](https://arxiv.org/abs/1711.11585)] [[article](https://tcwang0509.github.io/pix2pixHD/)] [[code](https://github.com/NVIDIA/pix2pixHD)]\n- StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [[arXiv](https://arxiv.org/abs/1711.09020)] [[code](https://github.com/yunjey/StarGAN/)]\n- Population Based Training of Neural Networks [[arXiv](https://arxiv.org/abs/1711.09846)] [[article](https://deepmind.com/blog/population-based-training-neural-networks/)]\n- Distilling a Neural Network Into a Soft Decision Tree [[arXiv](https://arxiv.org/abs/1711.09784)]\n- Neural Text Generation: A Practical Guide [[arXiv](https://arxiv.org/abs/1711.09534)]\n- Parallel WaveNet: Fast High-Fidelity Speech Synthesis [[DeepMind documents](https://deepmind.com/documents/131/Distilling_WaveNet.pdf)] [[article](https://deepmind.com/blog/high-fidelity-speech-synthesis-wavenet/)]\n- CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning [[arXiv](https://arxiv.org/abs/1711.05225)] [[article](https://stanfordmlgroup.github.io/projects/chexnet/)]\n- Non-local Neural Networks [[arXiv](https://arxiv.org/abs/1711.07971)]\n- Deep Image Prior [[paper](https://sites.skoltech.ru/app/data/uploads/sites/25/2017/11/deep_image_prior.pdf)] [[article](https://dmitryulyanov.github.io/deep_image_prior)] [[code](https://github.com/DmitryUlyanov/deep-image-prior)]\n- Online Deep Learning: Learning Deep Neural Networks on the Fly [[arXiv](https://arxiv.org/abs/1711.03705)]\n- Learning Explanatory Rules from Noisy Data [[arXiv](https://arxiv.org/abs/1711.04574)]\n- Improving Palliative Care with Deep Learning 
[[arXiv](https://arxiv.org/abs/1711.06402)] [[article](https://stanfordmlgroup.github.io/projects/improving-palliative-care/)]\n- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [[arXiv](https://arxiv.org/abs/1711.06396)]\n- Weighted Transformer Network for Machine Translation [[arXiv](https://arxiv.org/abs/1711.02132)] [[article](https://einstein.ai/research/weighted-transformer)]\n- Non-Autoregressive Neural Machine Translation [[arXiv](https://arxiv.org/abs/1711.02281)] [[article](https://einstein.ai/research/non-autoregressive-neural-machine-translation)]\n- Block-Sparse Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1711.02782)]\n- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning [[arXiv](https://arxiv.org/abs/1711.00832)]\n- Neural Discrete Representation Learning [[arXiv](https://arxiv.org/abs/1711.00937)] [[article](https://avdnoord.github.io/homepage/vqvae/)]\n- Don't Decay the Learning Rate, Increase the Batch Size [[arXiv](https://arxiv.org/abs/1711.00489)]\n- Hierarchical Representations for Efficient Architecture Search [[arXiv](https://arxiv.org/abs/1711.00436)]\n\n#### 2017-10\n\n- Unsupervised Machine Translation Using Monolingual Corpora Only [[arXiv](https://arxiv.org/abs/1711.00043)]\n- Dynamic Routing Between Capsules [[arXiv](https://arxiv.org/abs/1710.09829)]\n- A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [[Science](http://science.sciencemag.org/content/early/2017/10/26/science.aag2612.full)] [[article](https://www.vicarious.com/2017/10/26/common-sense-cortex-and-captcha/)] [[code](https://github.com/vicariousinc/science_rcn)]\n- Understanding Grounded Language Learning Agents [[arXiv](https://arxiv.org/abs/1710.09867)]\n- Planning, Fast and Slow: A Framework for Adaptive Real-Time Safe Trajectory Planning [[arXiv](https://arxiv.org/abs/1710.04731)] [[article](http://bair.berkeley.edu/blog/2017/12/05/fastrack/)] 
[[code](https://github.com/HJReachability)] (soon)\n- Malware Detection by Eating a Whole EXE [[arXiv](https://arxiv.org/abs/1710.09435)] [[article](https://devblogs.nvidia.com/malware-detection-neural-networks/)]\n- Progressive Growing of GANs for Improved Quality, Stability, and Variation [[Research at Nvidia](http://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of//karras2017gan-paper.pdf)] [[article](http://research.nvidia.com/publication/2017-10_Progressive-Growing-of)] [[code](https://github.com/tkarras/progressive_growing_of_gans)]\n- Meta Learning Shared Hierarchies [[arXiv](https://arxiv.org/abs/1710.09767)] [[article](https://blog.openai.com/learning-a-hierarchy/)] [[code](https://github.com/openai/mlsh)]\n- Deep Voice 3: 2000-Speaker Neural Text-to-Speech [[arXiv](http://research.baidu.com/deep-voice-3-2000-speaker-neural-text-speech/)] [[article](http://research.baidu.com/deep-voice-3-2000-speaker-neural-text-speech/)]\n- AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions [[arXiv](https://arxiv.org/abs/1705.08421)] [[article](https://research.googleblog.com/2017/10/announcing-ava-finely-labeled-video.html)] [[dataset](https://research.google.com/ava/)]\n-  Mastering the game of Go without Human Knowledge [[Nature](https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ)] [[article](https://deepmind.com/blog/alphago-zero-learning-scratch/)]\n-  Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [[arXiv](https://arxiv.org/abs/1710.06537)] [[article](https://blog.openai.com/generalizing-from-simulation/)]\n-  Asymmetric Actor Critic for Image-Based Robot Learning [[arXiv](https://arxiv.org/abs/1710.06542)] [[article](https://blog.openai.com/generalizing-from-simulation/)]\n-  A systematic study of the class imbalance problem in convolutional neural 
networks [[arXiv](https://arxiv.org/abs/1710.05381)]\n-  Generalization in Deep Learning [[arXiv](https://arxiv.org/abs/1710.05468)]\n- Swish: a Self-Gated Activation Function [[arXiv](https://arxiv.org/abs/1710.05941)]\n- Emergent Translation in Multi-Agent Communication [[arXiv](https://arxiv.org/abs/1710.06922)]\n- SLING: A framework for frame semantic parsing [[arXiv](https://arxiv.org/abs/1710.07032)] [[article](https://research.googleblog.com/2017/11/sling-natural-language-frame-semantic.html)] [[code](https://github.com/google/sling)]\n- Meta-Learning for Wrestling [[arXiv](https://arxiv.org/abs/1710.03641)] [[article](https://blog.openai.com/meta-learning-for-wrestling/)] [[code](https://github.com/openai/robosumo)]\n- Mixed Precision Training [[arXiv](https://arxiv.org/abs/1710.03740)] [[article](http://research.baidu.com/mixed-precision-training/)] [[article2](https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/)] [[code/docs](http://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html)]\n- Generative Adversarial Networks: An Overview [[arXiv](https://arxiv.org/abs/1710.07035)]\n- Emergent Complexity via Multi-Agent Competition [[arXiv](https://arxiv.org/abs/1710.03748)] [[article](https://blog.openai.com/competitive-self-play/)] [[code](https://github.com/openai/multiagent-competition)]\n- Deep Lattice Networks and Partial Monotonic Functions [[Research at Google](https://research.google.com/pubs/pub46327.html)] [[article](https://research.googleblog.com/2017/10/tensorflow-lattice-flexibility.html)] [[code](https://github.com/tensorflow/lattice)]\n- The IIT Bombay English-Hindi Parallel Corpus [[arXiv](https://arxiv.org/abs/1710.02855)] [[article](http://www.cfilt.iitb.ac.in/iitb_parallel/)]\n- Rainbow: Combining Improvements in Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1710.02298)]\n- Lifelong Learning With Dynamically Expandable Networks [[arXiv](https://arxiv.org/abs/1708.01547)]\n- 
Variational Inference \u0026 Deep Learning: A New Synthesis (Thesis) [[dropbox](https://www.dropbox.com/s/v6ua3d9yt44vgb3/cover_and_thesis.pdf)]\n- Neural Task Programming: Learning to Generalize Across Hierarchical Tasks [[arXiv](https://arxiv.org/abs/1710.01813)]\n- Neural Color Transfer between Images [[arXiv](https://arxiv.org/abs/1710.00756)]\n- The hippocampus as a predictive map [[bioRxiv](https://www.biorxiv.org/content/biorxiv/early/2017/07/25/097170.full.pdf)] [[article](https://deepmind.com/blog/hippocampus-predictive-map/)]\n- Scalable and accurate deep learning for electronic health records [[arXiv](https://arxiv.org/abs/1801.07860)]\n\n#### 2017-09\n\n- Variational Memory Addressing in Generative Models [[arXiv](https://arxiv.org/abs/1709.07116)]\n- Overcoming Exploration in Reinforcement Learning with Demonstrations [[arXiv](https://arxiv.org/abs/1709.10089)]\n- A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement [[arXiv](https://arxiv.org/abs/1709.08243)] [[article](https://people.xiph.org/~jm/demo/rnnoise/)] [[code](https://github.com/xiph/rnnoise/)]\n- ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases [[CVF](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf)] [[article](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community)] [[dataset](https://nihcc.app.box.com/v/ChestXray-NIHCC)]\n- NIMA: Neural Image Assessment [[arXiv](https://arxiv.org/abs/1709.05424)] [[article](https://research.googleblog.com/2017/12/introducing-nima-neural-image-assessment.html)]\n- Generating Sentences by Editing Prototypes [[arXiv](https://arxiv.org/abs/1709.08878)] [[code](https://github.com/kelvinguu/neural-editor)]\n- The Consciousness Prior [[arXiv](https://arxiv.org/abs/1709.08568)]\n- 
StarSpace: Embed All The Things! [[arXiv](https://arxiv.org/abs/1709.03856)] [[code](https://github.com/facebookresearch/Starspace)]\n- Neural Optimizer Search with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1709.07417)]\n- Dynamic Evaluation of Neural Sequence Models [[arXiv](https://arxiv.org/abs/1709.07432)]\n- Neural Machine Translation [[arXiv](https://arxiv.org/abs/1709.07809)]\n- Matterport3D: Learning from RGB-D Data in Indoor Environments [[arXiv](https://arxiv.org/abs/1709.06158)] [[article](https://niessner.github.io/Matterport/)] [[article2](https://hackernoon.com/announcing-the-matterport3d-research-dataset-815cae932939)] [[code](https://github.com/niessner/Matterport)]\n- Deep Reinforcement Learning that Matters [[arXiv](https://arxiv.org/abs/1709.06560)] [[code](https://github.com/Breakend/DeepReinforcementLearningThatMatters)]\n- The Uncertainty Bellman Equation and Exploration [[arXiv](https://arxiv.org/abs/1709.05380)]\n- WESPE: Weakly Supervised Photo Enhancer for Digital Cameras [[arXiv](https://arxiv.org/abs/1709.01118)] [[article](http://people.ee.ethz.ch/~ihnatova/wespe.html)]\n- Globally Normalized Reader [[arXiv](https://arxiv.org/abs/1709.02828)] [[article](http://research.baidu.com/gnr/)] [[code](https://github.com/baidu-research/GloballyNormalizedReader)]\n- A Brief Introduction to Machine Learning for Engineers [[arXiv](https://arxiv.org/abs/1709.02840)]\n- Learning with Opponent-Learning Awareness [[arXiv](https://arxiv.org/abs/1709.04326)] [[article](https://blog.openai.com/learning-to-model-other-minds/)]\n- A Deep Reinforcement Learning Chatbot [[arXiv](https://arxiv.org/abs/1709.02349)]\n- Squeeze-and-Excitation Networks [[arXiv](https://arxiv.org/abs/1709.01507)]\n- Efficient Methods and Hardware for Deep Learning (Thesis) [[Stanford Digital Repository](https://purl.stanford.edu/qf934gh3708)]\n\n#### 2017-08\n\n- Design and Analysis of the NIPS 2016 Review Process [[arXiv](https://arxiv.org/abs/1708.09794)]\n- Fast 
Automated Analysis of Strong Gravitational Lenses with Convolutional Neural Networks [[arXiv](https://arxiv.org/abs/1708.08842)] [[article](http://www.symmetrymagazine.org/article/neural-networks-meet-space)]\n- TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow [[white paper](https://drive.google.com/file/d/0B20Yn-GSaVHGMVlPanRTRlNIRlk/view)] [[code](https://github.com/tensorflow/agents)]\n- Automated Crowdturfing Attacks and Defenses in Online Review Systems [[arXiv](https://arxiv.org/abs/1708.08151)]\n- Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning [[arXiv](https://arxiv.org/abs/1708.02596)] [[article](http://bair.berkeley.edu/blog/2017/11/30/model-based-rl/)] [[code](https://github.com/nagaban2/nn_dynamics)]\n- Deep Learning for Video Game Playing [[arXiv](https://arxiv.org/abs/1708.07902)]\n- Deep \u0026 Cross Network for Ad Click Predictions [[arXiv](https://arxiv.org/abs/1708.05123)]\n- Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms [[arXiv](https://arxiv.org/abs/1708.07747)] [[code](https://github.com/zalandoresearch/fashion-mnist)]\n- Multi-task Self-Supervised Visual Learning [[arXiv](https://arxiv.org/abs/1708.07860)]\n- Learning a Multi-View Stereo Machine [[arXiv](https://arxiv.org/abs/1708.05375)] [[article](http://bair.berkeley.edu/blog/2017/09/05/unified-3d/)] [[code]()] (soon)\n- Twin Networks: Using the Future as a Regularizer [[arXiv](https://arxiv.org/abs/1708.06742)]\n- A Brief Survey of Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1708.05866)]\n- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [[arXiv](https://arxiv.org/abs/1708.05144)] [[code](https://github.com/openai/baselines)]\n- On the Effectiveness of Visible Watermarks [[CVPR](http://openaccess.thecvf.com/content_cvpr_2017/papers/Dekel_On_the_Effectiveness_CVPR_2017_paper.pdf)] 
[[article](https://research.googleblog.com/2017/08/making-visible-watermarks-more-effective.html)]\n- Practical Network Blocks Design with Q-Learning [[arXiv](https://arxiv.org/abs/1708.05552)]\n- On Ensuring that Intelligent Machines Are Well-Behaved [[arXiv](https://arxiv.org/abs/1708.05448)]\n- Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control [[arXiv](https://arxiv.org/abs/1708.04133)] [[code](https://github.com/Breakend/ReproducibilityInContinuousPolicyGradientMethods)]\n- Training Deep AutoEncoders for Collaborative Filtering [[arXiv](https://arxiv.org/abs/1708.01715)] [[code](https://github.com/NVIDIA/DeepRecommender)]\n- Learning to Perform a Perched Landing on the Ground Using Deep Reinforcement Learning [[nature](https://link.springer.com/epdf/10.1007/s10846-017-0696-1?author_access_token=BEvJgzY3QauUddBuQAus2ve4RwlQNchNByi7wbcMAY5xhRRqI6HVNnXt8Pgp850SnuV5ue6mUo3Jc7FIP5FgLmqk34Wob3oqyuGtkg7E_1T0dg02IYhfY-3dvb8R9zEmaGzTogYCIXm4O4vZ_tSGnA%3D%3D)]\n- Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification [[arXiv](https://arxiv.org/abs/1708.03805)] [[article](http://research.baidu.com/spatial-temporal-modeling-framework-large-scale-video-understanding/)]\n- Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning [[arXiv](https://arxiv.org/abs/1708.02190)]\n- Neural Expectation Maximization [[arXiv](https://arxiv.org/abs/1708.03498)] [[code](https://github.com/sjoerdvansteenkiste/)]\n- Google Vizier: A Service for Black-Box Optimization [[Research at Google](https://research.google.com/pubs/pub46180.html)]\n- STARDATA: A StarCraft AI Research Dataset [[arXiv](https://arxiv.org/abs/1708.02139)] [[code](https://github.com/TorchCraft/StarData)]\n- Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm [[arXiv](https://arxiv.org/abs/1708.00524)] 
[[code](https://github.com/bfelbo/deepmoji)] [[article](https://www.media.mit.edu/posts/what-can-we-learn-from-emojis/)]\n- Natural Language Processing with Small Feed-Forward Networks [[arXiv](https://arxiv.org/abs/1708.00214)]\n\n#### 2017-07\n\n- Photographic Image Synthesis with Cascaded Refinement Networks [[arXiv](https://arxiv.org/abs/1707.09405)] [[code](https://github.com/CQFIO/PhotographicImageSynthesis)]\n- StarCraft II: A New Challenge for Reinforcement Learning [[DeepMind Documents](https://deepmind.com/documents/110/sc2le.pdf)] [[code](https://github.com/deepmind/pysc2)] [[article](https://deepmind.com/blog/deepmind-and-blizzard-open-starcraft-ii-ai-research-environment/)]\n- Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards [[arXiv](https://arxiv.org/abs/1707.08817)]\n- Reinforcement Learning with Deep Energy-Based Policies [[arXiv](https://arxiv.org/abs/1702.08165)] [[article](http://bair.berkeley.edu/blog/2017/10/06/soft-q-learning/)] [[code](https://github.com/haarnoja/softqlearning)]\n- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.08475)]\n- Synthesizing Robust Adversarial Examples [[arXiv](https://arxiv.org/abs/1707.07397)] [[article](http://www.labsix.org/physical-objects-that-fool-neural-nets/)] [[code]()] (Soon)\n- Voice Synthesis for in-the-Wild Speakers via a Phonological Loop [[arXiv](https://arxiv.org/abs/1707.06588)] [[code](https://github.com/facebookresearch/loop)] [[article](https://ytaigman.github.io/loop/)]\n- Eyemotion: Classifying facial expressions in VR using eye-tracking cameras [[arXiv](https://arxiv.org/abs/1707.07204)] [[article](https://research.googleblog.com/2017/07/expressions-in-virtual-reality.html)]\n- A Distributional Perspective on Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.06887)] [[article](https://deepmind.com/blog/going-beyond-average-reinforcement-learning/)] 
[[video](https://vimeo.com/235922311)]\n- On the State of the Art of Evaluation in Neural Language Models [[arXiv](https://arxiv.org/abs/1707.05589)]\n- Optimizing the Latent Space of Generative Networks [[arXiv](https://arxiv.org/abs/1707.05776)]\n- Neuroscience-Inspired Artificial Intelligence [[Neuron](http://www.cell.com/neuron/fulltext/S0896-6273(17)30509-3?_returnURL=http%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0896627317305093%3Fshowall%3Dtrue)] [[article](https://deepmind.com/blog/ai-and-neuroscience-virtuous-circle/)]\n- Learning Transferable Architectures for Scalable Image Recognition [[arXiv](https://arxiv.org/abs/1707.07012)]\n- Reverse Curriculum Generation for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.05300)]\n- Imagination-Augmented Agents for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.06203)] [[article](https://deepmind.com/blog/agents-imagine-and-plan/)]\n- Learning model-based planning from scratch [[arXiv](https://arxiv.org/abs/1707.06170)] [[article](https://deepmind.com/blog/agents-imagine-and-plan/)]\n- Proximal Policy Optimization Algorithms [[AWSS3](https://openai-public.s3-us-west-2.amazonaws.com/blog/2017-07/ppo/ppo-arxiv.pdf)] [[code](https://github.com/openai/baselines)]\n- Automatic Recognition of Deceptive Facial Expressions of Emotion [[arXiv](https://arxiv.org/abs/1707.04061)]\n- Distral: Robust Multitask Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.04175)]\n- Creatism: A deep-learning photographer capable of creating professional work [[arXiv](https://arxiv.org/abs/1707.03491)] [[article](https://research.googleblog.com/2017/07/using-deep-learning-to-create.html)]\n- SCAN: Learning Abstract Hierarchical Compositional Visual Concepts [[arXiv](https://arxiv.org/abs/1707.03389)] [[article](https://deepmind.com/blog/imagine-creating-new-visual-concepts-recombining-familiar-ones/)]\n- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era 
[[arXiv](https://arxiv.org/abs/1707.02968)] [[article](https://research.googleblog.com/2017/07/revisiting-unreasonable-effectiveness.html)]\n- The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously [[arXiv](https://arxiv.org/abs/1707.03300)]\n- Deep Bilateral Learning for Real-Time Image Enhancement [[arXiv](https://arxiv.org/abs/1707.02880)] [[code](https://github.com/mgharbi/hdrnet)] [[article](https://groups.csail.mit.edu/graphics/hdrnet/)]\n- Emergence of Locomotion Behaviours in Rich Environments [[arXiv](https://arxiv.org/abs/1707.02286)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]\n- Learning human behaviors from motion capture by adversarial imitation [[arXiv](https://arxiv.org/abs/1707.02201)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]\n- Robust Imitation of Diverse Behaviors [[arXiv](https://deepmind.com/documents/95/diverse_arxiv.pdf)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]\n- [Hindsight Experience Replay](notes/hindsight-ep.md) [[arXiv](https://arxiv.org/abs/1707.01495)]\n- Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks [[arXiv](https://arxiv.org/abs/1707.01836)] [[article](https://stanfordmlgroup.github.io/projects/ecg/)]\n- End-to-End Learning of Semantic Grasping [[arXiv](https://arxiv.org/abs/1707.01932)]\n- ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games [[arXiv](https://arxiv.org/abs/1707.01067)] [[code](https://github.com/facebookresearch/ELF)] [[article](https://code.facebook.com/posts/132985767285406/introducing-elf-an-extensive-lightweight-and-flexible-platform-for-game-research/)]\n\n#### 2017-06\n\n- [Noisy Networks for Exploration](notes/noisy-networks-4-exploration.md) [[arXiv](https://arxiv.org/abs/1706.10295)]\n- Do GANs actually learn the distribution? 
An empirical study [[arXiv](https://arxiv.org/abs/1706.08224)]\n- Gradient Episodic Memory for Continuum Learning [[arXiv](https://arxiv.org/abs/1706.08840)]\n- Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog [[arXiv](https://arxiv.org/abs/1706.08502)] [[code](https://github.com/batra-mlp-lab/lang-emerge)]\n- Deep Interest Network for Click-Through Rate Prediction [[arXiv](https://arxiv.org/abs/1706.06978)]\n- Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study [[arXiv](https://arxiv.org/abs/1706.08606)] [[article](https://deepmind.com/blog/cognitive-psychology/)]\n- Structure Learning in Motor Control: A Deep Reinforcement Learning Model [[arXiv](https://arxiv.org/abs/1706.06827)]\n- Programmable Agents [[arXiv](https://arxiv.org/abs/1706.06383)]\n- Grounded Language Learning in a Simulated 3D World [[arXiv](https://arxiv.org/abs/1706.06551)]\n- Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics [[arXiv](https://arxiv.org/abs/1706.04317)]\n- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability [[arXiv](https://arxiv.org/abs/1706.05806)] [[article](https://research.googleblog.com/2017/11/interpreting-deep-neural-networks-with.html)] [[code](https://github.com/google/svcca)]\n- One Model To Learn Them All [[arXiv](https://arxiv.org/abs/1706.05137)] [[code](https://github.com/tensorflow/tensor2tensor)] [[article](https://research.googleblog.com/2017/06/multimodel-multi-task-machine-learning.html)]\n- Hybrid Reward Architecture for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1706.04208)]\n- Expected Policy Gradients [[arXiv](https://arxiv.org/abs/1706.05374)]\n- Variational Approaches for Auto-Encoding Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1706.04987)]\n- Deal or No Deal? 
End-to-End Learning for Negotiation Dialogues [[S3AWS](https://s3.amazonaws.com/end-to-end-negotiator/end-to-end-negotiator.pdf)] [[code](https://github.com/facebookresearch/end-to-end-negotiator)] [[article](https://code.facebook.com/posts/1686672014972296/deal-or-no-deal-training-ai-bots-to-negotiate/)]\n- Attention Is All You Need [[arXiv](https://arxiv.org/abs/1706.03762)] [[code](https://github.com/tensorflow/tensor2tensor)] [[article](https://research.googleblog.com/2017/08/transformer-novel-neural-network.html)]\n- Sobolev Training for Neural Networks [[arXiv](https://arxiv.org/abs/1706.04859)]\n- YellowFin and the Art of Momentum Tuning [[arXiv](https://arxiv.org/abs/1706.03471)] [[code](https://github.com/JianGoForIt/YellowFin)] [[article](http://dawn.cs.stanford.edu/2017/07/05/yellowfin/)]\n- Forward Thinking: Building and Training Neural Networks One Layer at a Time [[arXiv](https://arxiv.org/abs/1706.02480)]\n- Depthwise Separable Convolutions for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1706.03059)] [[code](https://github.com/tensorflow/tensor2tensor)]\n- Parameter Space Noise for Exploration [[arXiv](https://arxiv.org/abs/1706.01905)] [[code](https://github.com/openai/baselines)] [[article](https://blog.openai.com/better-exploration-with-parameter-noise/)]\n- Deep Reinforcement Learning from human preferences [[arXiv](https://arxiv.org/abs/1706.03741)] [[article](https://blog.openai.com/deep-reinforcement-learning-from-human-preferences/)]\n- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [[arXiv](https://arxiv.org/abs/1706.02275)] [[code](https://github.com/openai/multiagent-particle-envs)]\n- Self-Normalizing Neural Networks [[arXiv](https://arxiv.org/abs/1706.02515)] [[code](https://github.com/bioinf-jku/SNNs)]\n- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour [[arXiv](https://arxiv.org/abs/1706.02677)]\n- A simple neural network module for relational reasoning 
[[arXiv](https://arxiv.org/abs/1706.01427)] [[article](https://deepmind.com/blog/neural-approach-relational-reasoning/)]\n- Visual Interaction Networks [[arXiv](https://arxiv.org/abs/1706.01433)] [[article](https://deepmind.com/blog/neural-approach-relational-reasoning/)]\n\n#### 2017-05\n\n- Supervised Learning of Universal Sentence Representations from Natural Language Inference Data [[arXiv](https://arxiv.org/abs/1705.02364)]  [[code](https://github.com/facebookresearch/InferSent)]\n- pix2code: Generating Code from a Graphical User Interface Screenshot [[arXiv](https://arxiv.org/abs/1705.07962)] [[article](https://uizard.io/research#pix2code)] [[code](https://github.com/tonybeltramelli/pix2code)]\n- The Cramer Distance as a Solution to Biased Wasserstein Gradients [[arXiv](https://arxiv.org/abs/1705.10743)]\n- Reinforcement Learning with a Corrupted Reward Channel [[arXiv](https://arxiv.org/abs/1705.08417)]\n- Dilated Residual Networks [[arXiv](https://arxiv.org/abs/1705.09914)] [[code](https://github.com/fyu/drn)]\n- Bayesian GAN [[arXiv](https://arxiv.org/abs/1705.09558)] [[code](https://github.com/andrewgordonwilson/bayesgan/)]\n- Gradient Descent Can Take Exponential Time to Escape Saddle Points [[arXiv](https://arxiv.org/abs/1705.10412)] [[article](http://bair.berkeley.edu/blog/2017/08/31/saddle-efficiency/)]\n- Thinking Fast and Slow with Deep Learning and Tree Search [[arXiv]()]\n- ParlAI: A Dialog Research Software Platform [[arXiv](https://arxiv.org/abs/1705.06476)] [[code](https://github.com/facebookresearch/ParlAI)] [[article](https://code.facebook.com/posts/266433647155520/parlai-a-new-software-platform-for-dialog-research/)]\n- Semantically Decomposing the Latent Spaces of Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1705.07904)] [[article](https://aws.amazon.com/blogs/ai/combining-deep-learning-networks-gan-and-siamese-to-generate-high-quality-life-like-images/)]\n- Look, Listen and Learn 
[[arXiv](https://arxiv.org/abs/1705.08168)]\n- Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [[arXiv](https://arxiv.org/abs/1705.07750)] [[code](https://github.com/deepmind/kinetics-i3d)]\n- Convolutional Sequence to Sequence Learning [[arXiv](https://arxiv.org/abs/1705.03122)] [[code](https://github.com/facebookresearch/fairseq)] [[code2](https://github.com/facebookresearch/fairseq-py)] [[article](https://code.facebook.com/posts/1978007565818999/a-novel-approach-to-neural-machine-translation/)]\n- The Kinetics Human Action Video Dataset [[arXiv](https://arxiv.org/abs/1705.06950)] [[article](https://deepmind.com/research/open-source/open-source-datasets/kinetics/)]\n- Safe and Nested Subgame Solving for Imperfect-Information Games [[arXiv](https://arxiv.org/abs/1705.02955)]\n- Discrete Sequential Prediction of Continuous Actions for Deep RL [[arXiv](https://arxiv.org/abs/1705.05035)]\n- Metacontrol for Adaptive Imagination-Based Optimization [[arXiv](https://arxiv.org/abs/1705.02670)]\n- Efficient Parallel Methods for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1705.04862)]\n- Real-Time Adaptive Image Compression [[arXiv](https://arxiv.org/abs/1705.05823)]\n\n#### 2017-04\n\n- General Video Game AI: Learning from Screen Capture [[arXiv](https://arxiv.org/abs/1704.06945)]\n- Learning to Skim Text [[arXiv](https://arxiv.org/abs/1704.06877)]\n- Get To The Point: Summarization with Pointer-Generator Networks [[arXiv](https://arxiv.org/abs/1704.04368)] [[code](https://github.com/abisee/pointer-generator)] [[article](http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html)]\n- Adversarial Neural Machine Translation [[arXiv](https://arxiv.org/abs/1704.06933)]\n- [Deep Q-learning from Demonstrations](notes/dqn-demonstrations.md) [[arXiv](https://arxiv.org/abs/1704.03732)]\n- Learning from Demonstrations for Real World Reinforcement Learning [[arXiv](https://arxiv.org/abs/1704.03732)]\n- DSLR-Quality Photos on 
Mobile Devices with Deep Convolutional Networks [[arXiv](https://arxiv.org/abs/1704.02470)] [[article](http://people.ee.ethz.ch/~ihnatova/)] [[code](https://github.com/aiff22/DPED)]\n- A Neural Representation of Sketch Drawings [[arXiv](https://arxiv.org/abs/1704.03477)] [[code](https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn)] [[article](https://research.googleblog.com/2017/04/teaching-machines-to-draw.html)]\n- Automated Curriculum Learning for Neural Networks [[arXiv](https://arxiv.org/abs/1704.03003)]\n- Hierarchical Surface Prediction for 3D Object Reconstruction [[arXiv](https://arxiv.org/abs/1704.00710)] [[article](http://bair.berkeley.edu/blog/2017/08/23/high-quality-3d-obj-reconstruction/)]\n- Neural Message Passing for Quantum Chemistry [[arXiv](https://arxiv.org/abs/1704.01212)]\n- Learning to Generate Reviews and Discovering Sentiment [[arXiv](https://arxiv.org/abs/1704.01444)] [[code](https://github.com/openai/generating-reviews-discovering-sentiment)]\n- Best Practices for Applying Deep Learning to Novel Applications [[arXiv](https://arxiv.org/abs/1704.01568)]\n\n#### 2017-03\n\n- Improved Training of Wasserstein GANs [[arXiv](https://arxiv.org/abs/1704.00028)]\n- Evolution Strategies as a Scalable Alternative to Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.03864)]\n- Controllable Text Generation [[arXiv](https://arxiv.org/abs/1703.00955)]\n- Neural Episodic Control [[arXiv](https://arxiv.org/abs/1703.01988)]\n- [A Structured Self-attentive Sentence Embedding](notes/self_attention_embedding.md) [[arXiv](https://arxiv.org/abs/1703.03130)]\n- Multi-step Reinforcement Learning: A Unifying Algorithm [[arXiv](https://arxiv.org/abs/1703.01327)]\n- Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG [[arXiv](https://arxiv.org/abs/1703.05051)]\n- FaSTrack: a Modular Framework for Fast and Guaranteed Safe Motion Planning 
[[arXiv](https://arxiv.org/abs/1703.07373)] [[article](http://bair.berkeley.edu/blog/2017/12/05/fastrack/)] [[article2](http://sylviaherbert.com/fastrack/)]\n- Massive Exploration of Neural Machine Translation Architectures [[arXiv](https://arxiv.org/abs/1703.03906)] [[code](https://github.com/google/seq2seq)]\n- Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression [[arXiv](https://arxiv.org/abs/1703.07834)] [[article](http://aaronsplace.co.uk/papers/jackson2017recon/)] [[code](https://github.com/AaronJackson/vrn)]\n- Minimax Regret Bounds for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.05449)]\n- Sharp Minima Can Generalize For Deep Nets [[arXiv](https://arxiv.org/abs/1703.04933)]\n- Parallel Multiscale Autoregressive Density Estimation [[arXiv](https://arxiv.org/abs/1703.03664)]\n- Neural Machine Translation and Sequence-to-sequence Models: A Tutorial [[arXiv](https://arxiv.org/abs/1703.01619)]\n- Large-Scale Evolution of Image Classifiers [[arXiv](https://arxiv.org/abs/1703.01041)]\n- FeUdal Networks for Hierarchical Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.01161)]\n- Evolving Deep Neural Networks [[arXiv](https://arxiv.org/abs/1703.00548)]\n- How to Escape Saddle Points Efficiently [[arXiv](https://arxiv.org/abs/1703.00887)] [[article](http://bair.berkeley.edu/blog/2017/08/31/saddle-efficiency/)]\n- Opening the Black Box of Deep Neural Networks via Information [[arXiv](https://arxiv.org/abs/1703.00810)] [[video](https://youtu.be/bLqJHjXihK8)]\n- Understanding Synthetic Gradients and Decoupled Neural Interfaces [[arXiv](https://arxiv.org/abs/1703.00522)]\n- Learning to Optimize Neural Nets [[arXiv](https://arxiv.org/abs/1703.00441)] [[article](http://bair.berkeley.edu/blog/2017/09/12/learning-to-optimize-with-rl/)]\n\n\n#### 2017-02\n\n- The Shattered Gradients Problem: If resnets are the answer, then what is the question? 
[[arXiv](https://arxiv.org/abs/1702.08591)]\n- Neural Map: Structured Memory for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.08360)]\n- Bridging the Gap Between Value and Policy Based Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.08892)]\n- Deep Voice: Real-time Neural Text-to-Speech [[arXiv](https://arxiv.org/abs/1702.07825)]\n- Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.06230)]\n- The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI [[arXiv](https://arxiv.org/abs/1702.05663)]\n- Learning to Parse and Translate Improves Neural Machine Translation [[arXiv](https://arxiv.org/abs/1702.03525)]\n- All-but-the-Top: Simple and Effective Postprocessing for Word Representations [[arXiv](https://arxiv.org/abs/1702.01417)]\n- Deep Learning with Dynamic Computation Graphs [[arXiv](https://arxiv.org/abs/1702.02181)]\n- Skip Connections as Effective Symmetry-Breaking [[arXiv](https://arxiv.org/abs/1701.09175)]\n- Semi-Supervised QA with Generative Domain-Adaptive Nets [[arXiv](https://arxiv.org/abs/1702.02206)]\n\n#### 2017-01\n\n- Wasserstein GAN [[arXiv](https://arxiv.org/abs/1701.07875)]\n- Deep Reinforcement Learning: An Overview [[arXiv](https://arxiv.org/abs/1701.07274)]\n- DyNet: The Dynamic Neural Network Toolkit [[arXiv](https://arxiv.org/abs/1701.03980)]\n- DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker [[arXiv](https://arxiv.org/abs/1701.01724)]\n- NIPS 2016 Tutorial: Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1701.00160)]\n\n#### 2016-12\n\n- [A recurrent neural network without Chaos](notes/rnn_no_chaos.md) [[arXiv](https://arxiv.org/abs/1612.06212)]\n- Language Modeling with Gated Convolutional Networks [[arXiv](https://arxiv.org/abs/1612.08083)]\n- EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis [[arXiv](https://arxiv.org/abs/1612.07919)] 
[[article](http://webdav.tuebingen.mpg.de/pixel/enhancenet/)]\n- Learning from Simulated and Unsupervised Images through Adversarial Training [[arXiv](https://arxiv.org/abs/1612.07828)]\n- How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs [[arXiv](https://arxiv.org/abs/1612.04629)]\n- Improving Neural Language Models with a Continuous Cache [[arXiv](https://arxiv.org/abs/1612.04426)]\n- DeepMind Lab [[arXiv](https://arxiv.org/abs/1612.03801)] [[code](https://github.com/deepmind/lab)]\n- Deep Learning of Robotic Tasks without a Simulator using Strong and Weak Human Supervision [[arXiv](https://arxiv.org/abs/1612.01086)]\n- Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning [[arXiv](https://arxiv.org/abs/1612.01887)]\n- Overcoming catastrophic forgetting in neural networks [[arXiv](https://arxiv.org/abs/1612.00796)]\n\n#### 2016-11 (ICLR Edition)\n\n- Image-to-Image Translation with Conditional Adversarial Networks [[arXiv](https://arxiv.org/abs/1611.07004)]\n- [Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer](notes/mixture-experts.md) [[OpenReview](https://openreview.net/forum?id=B1ckMDqlg)]\n- Learning to reinforcement learn [[arXiv](https://arxiv.org/abs/1611.05763)]\n- A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs [[arXiv](https://arxiv.org/abs/1611.05104)]\n- [Adversarial Training Methods for Semi-Supervised Text Classification](notes/adversarial-text-classification.md) [[arXiv](https://arxiv.org/abs/1605.07725)]\n- Importance Sampling with Unequal Support [[arXiv](https://arxiv.org/abs/1611.03451)]\n- Quasi-Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1611.01576)]\n- Capacity and Learnability in Recurrent Neural Networks [[OpenReview](http://openreview.net/forum?id=BydARw9ex)]\n- Unrolled Generative Adversarial Networks [[OpenReview](http://openreview.net/forum?id=BydrOIcle)]\n- Deep 
Information Propagation [[OpenReview](http://openreview.net/forum?id=H1W1UN9gg)]\n- Structured Attention Networks [[OpenReview](http://openreview.net/forum?id=HkE0Nvqlg)]\n- Incremental Sequence Learning [[arXiv](https://arxiv.org/abs/1611.03068)]\n- Delving into Transferable Adversarial Examples and Black-box Attacks [[arXiv](https://arxiv.org/abs/1611.02770)] [[code](https://github.com/ReDeiPirati/transferability-advdnn-pub)]\n- b-GAN: Unified Framework of Generative Adversarial Networks [[OpenReview](http://openreview.net/forum?id=S1JG13oee)]\n- A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks [[OpenReview](http://openreview.net/forum?id=SJZAb5cel)]\n- Categorical Reparameterization with Gumbel-Softmax [[arXiv](https://arxiv.org/abs/1611.01144)]\n- Lip Reading Sentences in the Wild [[arXiv](https://arxiv.org/abs/1611.05358)]\n\nReinforcement Learning:\n\n- Learning to reinforcement learn [[arXiv](https://arxiv.org/abs/1611.05763)]\n- A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [[arXiv](https://arxiv.org/abs/1611.03852)]\n- The Predictron: End-To-End Learning and Planning [[OpenReview](http://openreview.net/forum?id=BkJsCIcgl)]\n- [Third-Person Imitation Learning](notes/third-person-imitation-learning.md) [[OpenReview](http://openreview.net/forum?id=B16dGcqlx)]\n- Generalizing Skills with Semi-Supervised Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=ryHlUtqge)]\n- Sample Efficient Actor-Critic with Experience Replay [[OpenReview](http://openreview.net/forum?id=HyM25Mqel)]\n- [Reinforcement Learning with Unsupervised Auxiliary Tasks](notes/rl-auxiliary-tasks.md) [[arXiv](https://arxiv.org/abs/1611.05397)]\n- Neural Architecture Search with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=r1Ue8Hcxg)]\n- Towards Information-Seeking Agents [[OpenReview](http://openreview.net/forum?id=SyW2QSige)]\n- Multi-Agent Cooperation and the Emergence 
of (Natural) Language [[OpenReview](http://openreview.net/forum?id=Hk8N3Sclg)]\n- Improving Policy Gradient by Exploring Under-appreciated Rewards [[OpenReview](http://openreview.net/forum?id=ryT4pvqll)]\n- Stochastic Neural Networks for Hierarchical Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=B1oK8aoxe)]\n- Tuning Recurrent Neural Networks with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1611.02796)]\n- RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning [[arXiv](https://arxiv.org/abs/1611.02779)]\n- Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=Hyq4yhile)]\n- Learning to Perform Physics Experiments via Deep Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=r1nTpv9eg)]\n- Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU [[OpenReview](http://openreview.net/forum?id=r1VGvBcxl)]\n- Learning to Compose Words into Sentences with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=Skvgqgqxe)]\n- Deep Reinforcement Learning for Accelerating the Convergence Rate [[OpenReview](http://openreview.net/forum?id=Syg_lYixe)]\n- [#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning](notes/count-based-exploration.md) [[arXiv](https://arxiv.org/abs/1611.04717)]\n- Learning to Navigate in Complex Environments [[arXiv](https://arxiv.org/abs/1611.03673)]\n- Unsupervised Perceptual Rewards for Imitation Learning [[OpenReview](http://openreview.net/forum?id=Bkul3t9ee)]\n- Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic [[OpenReview](http://openreview.net/forum?id=SJ3rcZcxl)]\n\n\nMachine Translation \u0026 Dialog\n\n- [Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot 
Translation](notes/gnmt-multilingual.md) [[arXiv](https://arxiv.org/abs/1611.04558)]\n- [Neural Machine Translation with Reconstruction](notes/nmt-with-reconstruction.md) [[arXiv](https://arxiv.org/abs/1611.01874v1)]\n- Iterative Refinement for Machine Translation [[OpenReview](http://openreview.net/forum?id=r1y1aawlg)]\n- A Convolutional Encoder Model for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1611.02344)]\n- Improving Neural Language Models with a Continuous Cache [[OpenReview](http://openreview.net/forum?id=B184E5qee)]\n- Vocabulary Selection Strategies for Neural Machine Translation [[OpenReview](http://openreview.net/forum?id=Bk8N0RLxx)]\n- Towards an automatic Turing test: Learning to evaluate dialogue responses [[OpenReview](http://openreview.net/forum?id=HJ5PIaseg)]\n- Dialogue Learning With Human-in-the-Loop [[OpenReview](http://openreview.net/forum?id=HJgXCV9xx)]\n- Batch Policy Gradient Methods for Improving Neural Conversation Models [[OpenReview](http://openreview.net/forum?id=rJfMusFll)]\n- Learning through Dialogue Interactions [[OpenReview](http://openreview.net/forum?id=rkE8pVcle)]\n- [Dual Learning for Machine Translation](notes/dual-learning-mt.md) [[arXiv](https://arxiv.org/abs/1611.00179)]\n- Unsupervised Pretraining for Sequence to Sequence Learning [[arXiv](https://arxiv.org/abs/1611.02683)]\n\n\n\n#### 2016-10\n\n- Hybrid computing using a neural network with dynamic external memory [[nature](https://www.nature.com/articles/nature20101.epdf?author_access_token=ImTXBI8aWbYxYQ51Plys8NRgN0jAjWel9jnR3ZoTv0MggmpDmwljGswxVdeocYSurJ3hxupzWuRNeGvvXnoO8o4jTJcnAyhGuZzXJ1GEaD-Z7E6X_a9R-xqJ9TfJWBqz)] [[code](https://github.com/deepmind/dnc)]\n- Quantum Machine Learning [[arXiv](https://arxiv.org/abs/1611.09347)]\n- Understanding deep learning requires rethinking generalization [[arXiv](https://arxiv.org/abs/1611.03530)]\n- Universal adversarial perturbations [[arXiv](https://arxiv.org/abs/1610.08401)] 
[[code](https://github.com/LTS4/universal)]\n- [Neural Machine Translation in Linear Time](notes/nmt-linear-time.md) [[arXiv](https://arxiv.org/abs/1610.10099)] [[code](https://github.com/tensorflow/tensor2tensor)]\n- [Professor Forcing: A New Algorithm for Training Recurrent Networks](notes/professor-forcing.md) [[arXiv](https://arxiv.org/abs/1610.09038)]\n- Learning to Protect Communications with Adversarial Neural Cryptography [[arXiv](https://arxiv.org/abs/1610.06918v1)]\n- Can Active Memory Replace Attention? [[arXiv](https://arxiv.org/abs/1610.08613)]\n- [Using Fast Weights to Attend to the Recent Past](notes/fast-weight-to-attend.md) [[arXiv](https://arxiv.org/abs/1610.06258)]\n- [Fully Character-Level Neural Machine Translation without Explicit Segmentation](notes/conv-char-level-nmt.md) [[arXiv](https://arxiv.org/abs/1610.03017)]\n- [Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models](notes/diverse-beam-search.md) [[arXiv](https://arxiv.org/abs/1610.02424)]\n- Video Pixel Networks [[arXiv](https://arxiv.org/abs/1610.00527)]\n- Connecting Generative Adversarial Networks and Actor-Critic Methods [[arXiv](https://arxiv.org/abs/1610.01945)]\n- [Learning to Translate in Real-time with Neural Machine Translation](notes/learning-to-translate-real-time.md) [[arXiv](https://arxiv.org/abs/1610.00388)]\n- Xception: Deep Learning with Depthwise Separable Convolutions [[arXiv](https://arxiv.org/abs/1610.02357)]\n- Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search [[arXiv](https://arxiv.org/abs/1610.00673)]\n- [Pointer Sentinel Mixture Models](notes/pointer-sentinel-mixture.md) [[arXiv](https://arxiv.org/abs/1609.07843)]\n\n#### 2016-09\n\n- Towards Deep Symbolic Reinforcement Learning [[arXiv](https://arxiv.org/abs/1609.05518)]\n- HyperNetworks [[arXiv](https://arxiv.org/abs/1609.09106)]\n- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation 
[[arXiv](http://arxiv.org/abs/1609.08144)]\n- Safe and Efficient Off-Policy Reinforcement Learning [[arXiv](http://arxiv.org/abs/1606.02647)]\n- Playing FPS Games with Deep Reinforcement Learning [[arXiv](http://arxiv.org/abs/1609.05521)]\n- [SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient](notes/seq-gan.md) [[arXiv](https://arxiv.org/abs/1609.05473)]\n- Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [[arXiv](http://arxiv.org/abs/1609.02993)]\n- Energy-based Generative Adversarial Network [[arXiv](https://arxiv.org/abs/1609.03126)]\n- Stealing Machine Learning Models via Prediction APIs [[arXiv](http://arxiv.org/abs/1609.02943)]\n- Semi-Supervised Classification with Graph Convolutional Networks [[arXiv](http://arxiv.org/abs/1609.02907)]\n- WaveNet: A Generative Model For Raw Audio [[arXiv](https://arxiv.org/abs/1609.03499)]\n- [Hierarchical Multiscale Recurrent Neural Networks](notes/hm-rnn.md) [[arXiv](https://arxiv.org/abs/1609.01704)]\n- End-to-End Reinforcement Learning of Dialogue Agents for Information Access [[arXiv](https://arxiv.org/abs/1609.00777)]\n- Deep Neural Networks for YouTube Recommendations [[paper](https://research.google.com/pubs/pub45530.html)]\n\n#### 2016-08\n\n- Semantics derived automatically from language corpora contain human-like biases [[arXiv](https://arxiv.org/abs/1608.07187)]\n- Why does deep and cheap learning work so well? 
[[arXiv](https://arxiv.org/abs/1608.08225)]\n- Machine Comprehension Using Match-LSTM and Answer Pointer [[arXiv](https://arxiv.org/abs/1608.07905)]\n- Stacked Approximated Regression Machine: A Simple Deep Learning Approach [[arXiv](http://arxiv.org/abs/1608.04062)]\n- Decoupled Neural Interfaces using Synthetic Gradients [[arXiv](http://arxiv.org/abs/1608.05343)]\n- WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia [[arXiv](https://arxiv.org/abs/1608.03542)]\n- Temporal Attention Model for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1608.02927)]\n- Residual Networks of Residual Networks: Multilevel Residual Networks [[arXiv](http://arxiv.org/abs/1608.02908)]\n- [Learning Online Alignments with Continuous Rewards Policy Gradient](notes/online-alignments-pg.md) [[arXiv](https://arxiv.org/abs/1608.01281)]\n\n#### 2016-07\n\n- [An Actor-Critic Algorithm for Sequence Prediction](notes/actor-critic-sequence.md) [[arXiv](http://arxiv.org/abs/1607.07086)]\n- Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner [[arXiv](http://arxiv.org/abs/1607.08723v1)]\n- [Recurrent Neural Machine Translation](notes/recurrent-nmt.md) [[arXiv](http://arxiv.org/abs/1607.08725)]\n- MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition [[arXiv](http://arxiv.org/abs/1607.08221)]\n- [Layer Normalization](notes/layer-norm.md) [[arXiv](https://arxiv.org/abs/1607.06450)]\n- [Neural Machine Translation with Recurrent Attention Modeling](notes/nmt-rec-attention.md)  [[arXiv](https://arxiv.org/abs/1607.05108)]\n- Neural Semantic Encoders [[arXiv](https://arxiv.org/abs/1607.04315)]\n- [Attention-over-Attention Neural Networks for Reading Comprehension](notes/att-over-att.md) [[arXiv](https://arxiv.org/abs/1607.04423)]\n- sk_p: a neural program corrector for MOOCs [[arXiv](http://arxiv.org/abs/1607.02902)]\n- Recurrent Highway Networks 
[[arXiv](https://arxiv.org/abs/1607.03474)]\n- Bag of Tricks for Efficient Text Classification [[arXiv](http://arxiv.org/abs/1607.01759)]\n- Context-Dependent Word Representation for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1607.00578)]\n- Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [[arXiv](http://arxiv.org/abs/1607.00036)]\n\n#### 2016-06\n\n- Sequence-to-Sequence Learning as Beam-Search Optimization [[arXiv](https://arxiv.org/abs/1606.02960)]\n- [Sequence-Level Knowledge Distillation](notes/seq-knowledge-distillation.md) [[arXiv](https://arxiv.org/abs/1606.07947)]\n- Policy Networks with Two-Stage Training for Dialogue Systems [[arXiv](http://arxiv.org/abs/1606.03152)]\n- Towards an integration of deep learning and neuroscience [[arXiv](https://arxiv.org/abs/1606.03813)]\n- On Multiplicative Integration with Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1606.06630)]\n- [Wide \u0026 Deep Learning for Recommender Systems](wide-and-deep.md) [[arXiv](https://arxiv.org/abs/1606.07792)]\n- Online and Offline Handwritten Chinese Character Recognition [[arXiv](https://arxiv.org/abs/1606.05763)]\n- Tutorial on Variational Autoencoders [[arXiv](http://arxiv.org/abs/1606.05908)]\n- Concrete Problems in AI Safety [[arXiv](https://arxiv.org/abs/1606.06565)]\n- Deep Reinforcement Learning Discovers Internal Models [[arXiv](http://arxiv.org/abs/1606.05174v1)]\n- [SQuAD: 100,000+ Questions for Machine Comprehension of Text](notes/squad.md) [[arXiv](http://arxiv.org/abs/1606.05250)]\n- Conditional Image Generation with PixelCNN Decoders [[arXiv](http://arxiv.org/abs/1606.05328)]\n- Model-Free Episodic Control [[arXiv](http://arxiv.org/abs/1606.04460)]\n- [Progressive Neural Networks](notes/progressive-nn.md) [[arXiv](http://arxiv.org/abs/1606.04671)]\n- Improved Techniques for Training GANs [[arXiv](http://arxiv.org/abs/1606.03498)] [[code](https://github.com/openai/improved-gan)]\n- Memory-Efficient Backpropagation Through 
Time [[arXiv](http://arxiv.org/abs/1606.03401)]\n- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [[arXiv](http://arxiv.org/abs/1606.03657)]\n- Zero-Resource Translation with Multi-Lingual Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.04164)]\n- Key-Value Memory Networks for Directly Reading Documents [[arXiv](http://arxiv.org/abs/1606.03126)]\n- Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.04199)]\n- Learning to learn by gradient descent by gradient descent [[arXiv](http://arxiv.org/abs/1606.04474)]\n- Learning Language Games through Interaction [[arXiv](http://arxiv.org/abs/1606.02447)]\n- Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [[arXiv](https://arxiv.org/abs/1606.01305)]\n- Smart Reply: Automated Response Suggestion for Email [[arXiv](http://arxiv.org/abs/1606.04870)]\n- Virtual Adversarial Training for Semi-Supervised Text Classification [[arXiv](https://arxiv.org/abs/1605.07725)]\n- Deep Reinforcement Learning for Dialogue Generation [[arXiv](http://arxiv.org/abs/1606.01541)]\n- Very Deep Convolutional Networks for Natural Language Processing [[arXiv](https://arxiv.org/abs/1606.01781)]\n- Neural Net Models for Open-Domain Discourse Coherence [[arXiv](https://arxiv.org/abs/1606.01545)]\n- Neural Architectures for Fine-grained Entity Type Classification [[arXiv](https://arxiv.org/abs/1606.01341)]\n- Matching Networks for One Shot Learning [[arXiv](https://arxiv.org/abs/1606.04080)]\n- Cooperative Inverse Reinforcement Learning [[arXiv](https://arxiv.org/abs/1606.03137)] [[article](http://bair.berkeley.edu/blog/2017/08/17/cooperatively-learning-human-values/)]\n- Gated-Attention Readers for Text Comprehension [[arXiv](http://arxiv.org/abs/1606.01549)]\n- [End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning](notes/e2e-dialog-control-sl-rl.md) 
[[arXiv](https://arxiv.org/abs/1606.01269)]\n- Iterative Alternating Neural Attention for Machine Reading [[arXiv](https://arxiv.org/abs/1606.02245)]\n- Memory-enhanced Decoder for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.02003)]\n- Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [[arXiv](https://arxiv.org/abs/1606.00776)]\n- Learning to Optimize [[arXiv](https://arxiv.org/abs/1606.01885)] [[article](http://bair.berkeley.edu/blog/2017/09/12/learning-to-optimize-with-rl/)]\n- [Natural Language Comprehension with the EpiReader](notes/epireader.md) [[arXiv](https://arxiv.org/abs/1606.02270)]\n- Conversational Contextual Cues: The Case of Personalization and History for Response Ranking [[arXiv](https://arxiv.org/abs/1606.00372)]\n- Adversarially Learned Inference [[arXiv](https://arxiv.org/abs/1606.00704)]\n- OpenAI Gym [[arXiv](https://arxiv.org/abs/1606.01540)] [[code](https://github.com/openai/gym)]\n- Neural Network Translation Models for Grammatical Error Correction [[arXiv](https://arxiv.org/abs/1606.00189)]\n\n#### 2016-05\n\n- Hierarchical Memory Networks [[arXiv](https://arxiv.org/abs/1605.07427)]\n- Deep API Learning [[arXiv](http://arxiv.org/abs/1605.08535)]\n- Wide Residual Networks [[arXiv](http://arxiv.org/abs/1605.07146)]\n- TensorFlow: A system for large-scale machine learning [[arXiv](http://arxiv.org/abs/1605.08695)]\n- Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention [[arXiv](http://arxiv.org/abs/1605.09090)]\n- Aspect Level Sentiment Classification with Deep Memory Network [[arXiv](http://arxiv.org/abs/1605.08900)]\n- FractalNet: Ultra-Deep Neural Networks without Residuals [[arXiv](https://arxiv.org/abs/1605.07648)]\n- Learning End-to-End Goal-Oriented Dialog [[arXiv](http://arxiv.org/abs/1605.07683)]\n- One-shot Learning with Memory-Augmented Neural Networks [[arXiv](http://arxiv.org/abs/1605.06065)]\n- Deep Learning without Poor Local Minima 
[[arXiv](http://arxiv.org/abs/1605.07110)]\n- AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge [[arXiv](https://arxiv.org/abs/1605.01600)]\n- Data Programming: Creating Large Training Sets, Quickly [[arXiv](http://arxiv.org/abs/1605.07723)]\n- Deeply-Fused Nets [[arXiv](http://arxiv.org/abs/1605.07716)]\n- Deep Portfolio Theory [[arXiv](http://arxiv.org/abs/1605.07230)]\n- Unsupervised Learning for Physical Interaction through Video Prediction [[arXiv](http://arxiv.org/abs/1605.07157)]\n- Movie Description [[arXiv](http://arxiv.org/abs/1605.03705)]\n\n\n#### 2016-04\n\n- Higher Order Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1605.00064)]\n- Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition [[arXiv](https://arxiv.org/abs/1604.08352)]\n- Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [[arXiv](https://arxiv.org/abs/1604.06057)]\n- The IBM 2016 English Conversational Telephone Speech Recognition System [[arXiv](https://arxiv.org/abs/1604.08242)]\n- Dialog-based Language Learning [[arXiv](https://arxiv.org/abs/1604.06045)]\n- Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss [[arXiv](https://arxiv.org/abs/1604.05529)]\n- Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction [[arXiv](https://arxiv.org/abs/1604.04677)]\n- A Network-based End-to-End Trainable Task-oriented Dialogue System [[arXiv](http://arxiv.org/abs/1604.04562)]\n- Visual Storytelling [[arXiv](https://arxiv.org/abs/1604.03968)]\n- Improving the Robustness of Deep Neural Networks via Stability Training [[arXiv](http://arxiv.org/abs/1604.04326)]\n- [Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex](notes/bridging-gap-resnet-rnn.md) [[arXiv](https://arxiv.org/abs/1604.03640)]\n- Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with 
MDLSTM Attention [[arXiv](https://arxiv.org/abs/1604.03286)]\n- [Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves](notes/slrtm.md) [[arXiv](https://arxiv.org/abs/1604.02038)]\n- [Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models](notes/open-vocab-nmt-hybrid-word-character.md) [[arXiv](http://arxiv.org/abs/1604.00788)]\n- [Building Machines That Learn and Think Like People](notes/building-machines-that-learn-and-think-like-people.md) [[arXiv](http://arxiv.org/abs/1604.00289)]\n- A Semisupervised Approach for Language Identification based on Ladder Networks [[arXiv](http://arxiv.org/abs/1604.00317)]\n- [Deep Networks with Stochastic Depth](notes/stochastic-depth.md) [[arXiv](http://arxiv.org/abs/1603.09382)]\n- PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents [[arXiv](http://arxiv.org/abs/1604.00187)]\n\n\n#### 2016-03\n\n- Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1603.07954)]\n- A Fast Unified Model for Parsing and Sentence Understanding [[arXiv](http://arxiv.org/abs/1603.06021)]\n- [Latent Predictor Networks for Code Generation](notes/latent-predictor-networks.md) [[arXiv](http://arxiv.org/abs/1603.06744)]\n- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [[arXiv](http://arxiv.org/abs/1603.08575)]\n- Recurrent Batch Normalization [[arXiv](http://arxiv.org/abs/1603.09025)]\n- Neural Language Correction with Character-Based Attention [[arXiv](http://arxiv.org/abs/1603.09727)]\n- [Incorporating Copying Mechanism in Sequence-to-Sequence Learning](notes/copynet.md) [[arXiv](http://arxiv.org/abs/1603.06393)]\n- How NOT To Evaluate Your Dialogue System [[arXiv](http://arxiv.org/abs/1603.08023)]\n- [Adaptive Computation Time for Recurrent Neural Networks](notes/act-rnn.md) [[arXiv](http://arxiv.org/abs/1603.08983)]\n- A guide to convolution arithmetic for deep learning 
[[arXiv](http://arxiv.org/abs/1603.07285)]\n- Colorful Image Colorization [[arXiv](http://arxiv.org/abs/1603.08511)]\n- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles [[arXiv](http://arxiv.org/abs/1603.09246)]\n- Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus [[arXiv](http://arxiv.org/abs/1603.06807)]\n- A Persona-Based Neural Conversation Model [[arXiv](http://arxiv.org/abs/1603.06155)]\n- [A Character-level Decoder without Explicit Segmentation for Neural Machine Translation](notes/char-level-decoder.md) [[arXiv](http://arxiv.org/abs/1603.06147)]\n- Multi-Task Cross-Lingual Sequence Tagging from Scratch [[arXiv](http://arxiv.org/abs/1603.06270)]\n- Neural Variational Inference for Text Processing [[arXiv](http://arxiv.org/abs/1511.06038)]\n- Recurrent Dropout without Memory Loss [[arXiv](http://arxiv.org/abs/1603.05118)]\n- One-Shot Generalization in Deep Generative Models [[arXiv](http://arxiv.org/abs/1603.05106)]\n- Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [[arXiv](Recursive Recurrent Nets with Attention Modeling for OCR in the Wild)]\n- A New Method to Visualize Deep Neural Networks [[arXiv](A New Method to Visualize Deep Neural Networks)]\n- Neural Architectures for Named Entity Recognition [[arXiv](http://arxiv.org/abs/1603.01360)]\n- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF [[arXiv](http://arxiv.org/abs/1603.01354)]\n- Character-based Neural Machine Translation [[arXiv](http://arxiv.org/abs/1603.00810)]\n- Learning Word Segmentation Representations to Improve Named Entity Recognition for Chinese Social Media [[arXiv](http://arxiv.org/abs/1603.00786)]\n\n#### 2016-02\n\n- Architectural Complexity Measures of Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1602.08210)]\n- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [[arXiv](http://arxiv.org/abs/1602.07868)]\n- Recurrent 
Neural Network Grammars [[arXiv](http://arxiv.org/abs/1602.07776)]\n- Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations [[arXiv](http://arxiv.org/abs/1602.07332)]\n- [Contextual LSTM (CLSTM) models for Large scale NLP tasks](notes/clstm-large-scale.md) [[arXiv](http://arxiv.org/abs/1602.06291)]\n- Sequence-to-Sequence RNNs for Text Summarization [[arXiv](http://arxiv.org/abs/1602.06023)]\n- Extraction of Salient Sentences from Labelled Documents [[arXiv](http://arxiv.org/abs/1412.6815)]\n- Learning Distributed Representations of Sentences from Unlabelled Data [[arXiv](http://arxiv.org/abs/1602.03483)]\n- Benefits of depth in neural networks [[arXiv](http://arxiv.org/abs/1602.04485)]\n- [Associative Long Short-Term Memory](notes/associative-lstm.md) [[arXiv](http://arxiv.org/abs/1602.03032)]\n- \"Why Should I Trust You?\": Explaining the Predictions of Any Classifier [[arXiv](https://arxiv.org/abs/1602.04938)] [[code](https://github.com/marcotcr/lime)]\n- Generating images with recurrent adversarial networks [[arXiv](http://arxiv.org/abs/1602.05110)]\n- [Exploring the Limits of Language Modeling](notes/exploring-the-limits-of-lm.md) [[arXiv](http://arxiv.org/abs/1602.02410)]\n- Swivel: Improving Embeddings by Noticing What’s Missing [[arXiv](http://arxiv.org/abs/1602.02215)]\n- [WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making](notes/webnav.md) [[arXiv](http://arxiv.org/abs/1602.02261)]\n- [Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers](notes/efficient-char-level-document-classification-cnn-rnn.md) [[arXiv](http://arxiv.org/abs/1602.00367)]\n- Gradient Descent Converges to Minimizers [[arXiv](https://arxiv.org/abs/1602.04915)] [[article](http://www.offconvex.org/2016/03/24/saddles-again/)]\n- BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [[arXiv](http://arxiv.org/abs/1602.02830)]\n- Learning 
Discriminative Features via Label Consistent Neural Network [[arXiv](http://arxiv.org/abs/1602.01168)]\n\n#### 2016-01\n\n- What’s your ML test score? A rubric for ML production systems [[Research at Google](https://research.google.com/pubs/pub45742.html)]\n- Pixel Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1601.06759)]\n- Bitwise Neural Networks [[arXiv](http://arxiv.org/abs/1601.06071)]\n- Long Short-Term Memory-Networks for Machine Reading [[arXiv](http://arxiv.org/abs/1601.06733)]\n- Coverage-based Neural Machine Translation [[arXiv](http://arxiv.org/abs/1601.04811)]\n- Understanding Deep Convolutional Networks [[arXiv](http://arxiv.org/abs/1601.04920)]\n- Training Recurrent Neural Networks by Diffusion [[arXiv](http://arxiv.org/abs/1601.04114)]\n- Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures [[arXiv](http://arxiv.org/abs/1601.03896)]\n- [Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism](notes/multi-way-nmt-shared-attention.md) [[arXiv](http://arxiv.org/abs/1601.01073)]\n- [Recurrent Memory Network for Language Modeling](notes/rmn-language-modeling.md) [[arXiv](http://arxiv.org/abs/1601.01272)]\n- Language to Logical Form with Neural Attention [[arXiv](http://arxiv.org/abs/1601.01280)]\n- Learning to Compose Neural Networks for Question Answering [[arXiv](http://arxiv.org/abs/1601.01705)]\n- The Inevitability of Probability: Probabilistic Inference in Generic Neural Networks Trained with Non-Probabilistic Feedback [[arXiv](http://arxiv.org/abs/1601.03060)]\n- COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images [[arXiv](http://arxiv.org/abs/1601.07140)]\n- Survey on the attention based RNN model and its applications in computer vision [[arXiv](http://arxiv.org/abs/1601.06823)]\n\n#### 2015-12\n\nNLP\n\n- [Strategies for Training Large Vocabulary Neural Language Models](notes/strategies-for-training-large-vocab-lm.md) 
[[arXiv](http://arxiv.org/abs/1512.04906)]\n- [Multilingual Language Processing From Bytes](notes/multilingual-language-processing-from-bytes.md) [[arXiv](http://arxiv.org/abs/1512.00103)]\n- [Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews](notes/learning-document-embeddings-ngrams.md) [[arXiv](http://arxiv.org/abs/1512.08183)]\n- [Target-Dependent Sentiment Classification with Long Short Term Memory](notes/target-dependent-sentiment-lstm.md) [[arXiv](http://arxiv.org/abs/1512.01100)]\n- Reading Text in the Wild with Convolutional Neural Networks [[arXiv](http://arxiv.org/abs/1412.1842)]\n\nVision\n\n- [Deep Residual Learning for Image Recognition](notes/deep-residual-learning.md) [[arXiv](http://arxiv.org/abs/1512.03385)]\n- Rethinking the Inception Architecture for Computer Vision [[arXiv](http://arxiv.org/abs/1512.00567)]\n- Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1512.04143)]\n- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin [[arXiv](http://arxiv.org/abs/1512.02595)]\n\n\n#### 2015-11\n\nNLP\n\n- [Deep Reinforcement Learning with a Natural Language Action Space](notes/drl-nlp-action.md) [[arXiv](https://arxiv.org/abs/1511.04636)]\n- Sequence Level Training with Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1511.06732)]\n- [Teaching Machines to Read and Comprehend](notes/teaching-machines-to-read-and-comprehend.md) [[arxiv](http://arxiv.org/abs/1506.03340)]\n- [Semi-supervised Sequence Learning](notes/semi-supervised-sequence-learning.md) [[arXiv](http://arxiv.org/abs/1511.01432)]\n- [Multi-task Sequence to Sequence Learning](notes/multitask-seq2seq.md) [[arXiv](http://arxiv.org/abs/1511.06114)]\n- [Alternative structures for character-level RNNs](notes/alternative-structure-char-rnn.md) [[arXiv](http://arxiv.org/abs/1511.06303)]\n- [Larger-Context Language 
Modeling](notes/larger-context-lm.md) [[arXiv](http://arxiv.org/abs/1511.03729)]\n- [A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding](notes/unified-tagging-blstm.md) [[arXiv](http://arxiv.org/abs/1511.00215)]\n- Towards Universal Paraphrastic Sentence Embeddings [[arXiv](http://arxiv.org/abs/1511.08198)]\n- BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies [[arXiv](http://arxiv.org/abs/1511.06909)]\n- Natural Language Understanding with Distributed Representation [[arXiv](http://arxiv.org/abs/1511.07916)]\n- sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings [[arXiv](http://arxiv.org/abs/1511.06388)]\n- LSTM-based Deep Learning Models for non-factoid answer selection [[arXiv](http://arxiv.org/abs/1511.04108)]\n\nPrograms\n\n- Neural Random-Access Machines [[arxiv](http://arxiv.org/abs/1511.06392)]\n- Neural Programmer: Inducing Latent Programs with Gradient Descent [[arXiv](http://arxiv.org/abs/1511.04834)]\n- Neural Programmer-Interpreters [[arXiv](http://arxiv.org/abs/1511.06279)]\n- Learning Simple Algorithms from Examples [[arXiv](http://arxiv.org/abs/1511.07275)]\n- Neural GPUs Learn Algorithms [[arXiv](http://arxiv.org/abs/1511.08228)] [[code](https://github.com/tensorflow/tensor2tensor)]\n- On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models [[arXiv](http://arxiv.org/abs/1511.09249)]\n\nVision\n\n- ReSeg: A Recurrent Neural Network for Object Segmentation [[arXiv](http://arxiv.org/abs/1511.07053)]\n- Deconstructing the Ladder Network Architecture [[arXiv](http://arxiv.org/abs/1511.06430)]\n- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [[arXiv](http://arxiv.org/abs/1511.06434)]\n- 
Multi-Scale Context Aggregation by Dilated Convolutions [[arXiv](https://arxiv.org/abs/1511.07122)] [[code](https://github.com/fyu/drn)]\n\nGeneral\n\n- Towards Principled Unsupervised Learning [[arXiv](http://arxiv.org/abs/1511.06440)]\n- Dynamic Capacity Networks [[arXiv](http://arxiv.org/abs/1511.07838)]\n- [Generating Sentences from a Continuous Space](notes/generating-sentences-cont-space.md) [[arXiv](http://arxiv.org/abs/1511.06349)]\n- Net2Net: Accelerating Learning via Knowledge Transfer [[arXiv](http://arxiv.org/abs/1511.05641)]\n- A Roadmap towards Machine Intelligence [[arXiv](http://arxiv.org/abs/1511.08130)]\n- Session-based Recommendations with Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1511.06939)]\n- Regularizing RNNs by Stabilizing Activations [[arXiv](http://arxiv.org/abs/1511.08400)]\n\n\n#### 2015-10\n\n- [A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification](notes/sensitivity-analysis-cnn-sentence-classification.md) [[arXiv](http://arxiv.org/abs/1510.03820)]\n- [Attention with Intention for a Neural Network Conversation Model](notes/attention-with-intention.md) [[arXiv](http://arxiv.org/abs/1510.08565)]\n- Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network [[arXiv](http://arxiv.org/abs/1510.06168)]\n- A Survey: Time Travel in Deep Learning Space: An Introduction to Deep Learning Models and How Deep Learning Models Evolved from the Initial Ideas [[arXiv](http://arxiv.org/abs/1510.04781)]\n- A Primer on Neural Network Models for Natural Language Processing [[arXiv](http://arxiv.org/abs/1510.00726)]\n- [A Diversity-Promoting Objective Function for Neural Conversation Models](notes/diversity-promoting-objective-ncm.md) [[arXiv](http://arxiv.org/abs/1510.03055)]\n\n\n#### 2015-09\n\n- [Character-level Convolutional Networks for Text Classification](notes/character-level-cnn-for-text-classification.md) 
[[arXiv](http://arxiv.org/abs/1509.01626)]\n- [A Neural Attention Model for Abstractive Sentence Summarization](notes/neural-attention-model-for-abstractive-sentence-summarization.md) [[arXiv](http://arxiv.org/abs/1509.00685)]\n- Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games [[arXiv](http://arxiv.org/abs/1509.06731)]\n\n#### 2015-08\n\n- [Neural Machine Translation of Rare Words with Subword Units](notes/nmt-subword.md) [[arXiv](https://arxiv.org/abs/1508.07909)] [[code](https://github.com/rsennrich/subword-nmt)]\n- Listen, Attend and Spell [[arxiv](http://arxiv.org/abs/1508.01211)]\n- [Character-Aware Neural Language Models](notes/character-aware-nlm.md) [[arXiv](http://arxiv.org/abs/1508.06615)]\n- Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs [[arXiv](http://arxiv.org/abs/1508.00657)]\n- Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation [[arXiv](http://arxiv.org/abs/1508.02096)]\n- [Effective Approaches to Attention-based Neural Machine Translation](notes/effective-approaches-nmt-attention.md) [[arXiv](https://arxiv.org/abs/1508.04025)]\n\n#### 2015-07\n\n- [Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models](e2e-dialog-ghnnm.md) [[arXiv](http://arxiv.org/abs/1507.04808)]\n- Semi-Supervised Learning with Ladder Networks [[arXiv](http://arxiv.org/abs/1507.02672)]\n- [Document Embedding with Paragraph Vectors](notes/document-embedding-with-pv.md) [[arXiv](http://arxiv.org/abs/1507.07998)]\n- [Training Very Deep Networks](notes/training-very-deep-networks.md) [[arXiv](http://arxiv.org/abs/1507.06228)]\n\n#### 2015-06\n\n- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning [[arXiv](https://arxiv.org/abs/1506.02142)]\n- [A Neural Network Approach to Context-Sensitive Generation of Conversational Responses](notes/nn-context-sentitive-responses.md) 
[[arXiv](http://arxiv.org/abs/1506.06714)]\n- [A Neural Conversational Model](notes/neural-conversational-model.md) [[arXiv](http://arxiv.org/abs/1506.05869)]\n- [Skip-Thought Vectors](notes/skip-thought-vectors.md) [[arXiv](http://arxiv.org/abs/1506.06726)]\n- [Pointer Networks](notes/pointer-networks.md) [[arXiv](http://arxiv.org/abs/1506.03134)]\n- [Spatial Transformer Networks](notes/spatial-transformer-networks.md) [[arXiv](http://arxiv.org/abs/1506.02025)]\n- Tree-structured composition in neural networks without tree-structured architectures [[arXiv](http://arxiv.org/abs/1506.04834)]\n- Visualizing and Understanding Neural Models in NLP [[arXiv](http://arxiv.org/abs/1506.01066)]\n- Learning to Transduce with Unbounded Memory [[arXiv](http://arxiv.org/abs/1506.02516)]\n- Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [[arXiv](http://arxiv.org/abs/1506.07285)]\n- [Deep Knowledge Tracing](notes/deep-knowledge-tracing.md) [[arXiv](http://arxiv.org/abs/1506.05908)]\n\n#### 2015-05\n\n- [ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks](notes/renet-rnn-alternative-to-convnet.md) [[arXiv](http://arxiv.org/abs/1505.00393)]\n- Reinforcement Learning Neural Turing Machines [[arXiv](http://arxiv.org/abs/1505.00521)]\n\n#### 2015-04\n\n- Correlational Neural Networks [[arXiv](http://arxiv.org/abs/1504.07225)]\n\n#### 2015-03\n\n\n- [Distilling the Knowledge in a Neural Network](notes/distilling-the-knowledge-in-a-nn.md) [[arXiv](http://arxiv.org/abs/1503.02531)]\n- [End-To-End Memory Networks](notes/end-to-end-memory-networks.md) [[arXiv](http://arxiv.org/abs/1503.08895)]\n- [Neural Responding Machine for Short-Text Conversation](notes/neural-responding-machine.md) [[arXiv](http://arxiv.org/abs/1503.02364)]\n- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal 
Covariate Shift](notes/batch-normalization.md) [[arXiv](http://arxiv.org/abs/1502.03167)]\n- Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition [[arXiv](https://arxiv.org/abs/1503.02101)] [[article](Escaping from Saddle Points)]\n\n\n#### 2015-02\n\n- Human-level control through deep reinforcement learning [[Nature](https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf)] [[code](https://github.com/deepmind/dqn)]\n- [Text Understanding from Scratch](notes/text-understanding-from-scratch.md) [[arXiv](http://arxiv.org/abs/1502.01710)]\n- [Show, Attend and Tell: Neural Image Caption Generation with Visual Attention](notes/show-attend-tell.md) [[arXiv](http://arxiv.org/abs/1502.03044)]\n\n#### 2015-01\n\n- Hidden Technical Debt in Machine Learning Systems [[NIPS](https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf)]\n\n#### 2014-12\n\n- Learning Longer Memory in Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1412.7753)]\n- [Neural Turing Machines](notes/neural-turing-machines.md) [[arxiv](http://arxiv.org/abs/1410.5401)]\n- [Grammar as a Foreign Language](notes/grammar-as-a-foreign-language.md) [[arXiv](http://arxiv.org/abs/1412.7449)]\n- [On Using Very Large Target Vocabulary for Neural Machine Translation](notes/on-using-very-large-target-vocabulary-for-nmt.md) [[arXiv](http://arxiv.org/abs/1412.2007)]\n- Effective Use of Word Order for Text Categorization with Convolutional Neural Networks [[arXiv](http://arxiv.org/abs/1412.1058v1)]\n- Multiple Object Recognition with Visual Attention [[arXiv](http://arxiv.org/abs/1412.7755)]\n\n#### 2014-11\n\n- The Loss Surfaces of Multilayer Networks [[arXiv](https://arxiv.org/abs/1412.0233)]\n\n#### 2014-10\n\n- [Learning to Execute](notes/learning-to-execute.md) [[arXiv](http://arxiv.org/abs/1410.4615)]\n\n#### 2014-09\n\n- [Sequence to Sequence Learning with Neural Networks](notes/seq2seq-with-neural-networks.md) 
[[arXiv](http://arxiv.org/abs/1409.3215)]\n- [Neural Machine Translation by Jointly Learning to Align and Translate](notes/nmt-jointly-learning-to-align-and-translate.md) [[arxiv](http://arxiv.org/abs/1409.0473)]\n- [On the Properties of Neural Machine Translation: Encoder-Decoder Approaches](notes/properties-of-neural-mt.md) [[arXiv](http://arxiv.org/abs/1409.1259)]\n- [Recurrent Neural Network Regularization](notes/rnn-regularization.md) [[arXiv](http://arxiv.org/abs/1409.2329)]\n- Very Deep Convolutional Networks for Large-Scale Image Recognition [[arXiv](http://arxiv.org/abs/1409.1556)]\n- Going Deeper with Convolutions [[arXiv](http://arxiv.org/abs/1409.4842)]\n\n#### 2014-08\n\n- Convolutional Neural Networks for Sentence Classification [[arxiv](http://arxiv.org/abs/1408.5882)]\n\n#### 2014-07\n\n#### 2014-06\n\n- [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](notes/learning-phrase-representations.md) [[arXiv](http://arxiv.org/abs/1406.1078)]\n- [Recurrent Models of Visual Attention](notes/recurrent-models-of-visual-attention.md) [[arXiv](http://arxiv.org/abs/1406.6247)]\n- Generative Adversarial Networks [[arXiv](http://arxiv.org/abs/1406.2661)]\n\n#### 2014-05\n\n- [Distributed Representations of Sentences and Documents](notes/distributed-representations-of-sentences-and-documents.md) [[arXiv](http://arxiv.org/abs/1405.4053)]\n\n#### 2014-04\n\n- A Convolutional Neural Network for Modelling Sentences [[arXiv](http://arxiv.org/abs/1404.2188)]\n\n#### 2014-03\n\n#### 2014-02\n\n#### 2014-01\n\n- Machine Learning: The High Interest Credit Card of Technical Debt [[Research at Google](https://research.google.com/pubs/pub43146.html)]\n\n#### 2013\n\n- Visualizing and Understanding Convolutional Networks [[arXiv](http://arxiv.org/abs/1311.2901)]\n- DeViSE: A Deep Visual-Semantic Embedding Model [[pub](http://research.google.com/pubs/pub41473.html)]\n- Maxout Networks [[arXiv](http://arxiv.org/abs/1302.4389)]\n- 
Exploiting Similarities among Languages for Machine Translation [[arXiv](http://arxiv.org/abs/1309.4168)]\n- Efficient Estimation of Word Representations in Vector Space [[arXiv](http://arxiv.org/abs/1301.3781)]\n\n\n#### 2011\n\n- Natural Language Processing (almost) from Scratch [[arXiv](http://arxiv.org/abs/1103.0398)]\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdennybritz%2Fdeeplearning-papernotes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdennybritz%2Fdeeplearning-papernotes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdennybritz%2Fdeeplearning-papernotes/lists"}