https://github.com/dennybritz/deeplearning-papernotes

Summaries and notes on Deep Learning research papers
Host: GitHub
URL: https://github.com/dennybritz/deeplearning-papernotes
Owner: dennybritz
Created: 2015-12-19T18:34:21.000Z (about 10 years ago)
Default Branch: master
Last Pushed: 2018-02-13T01:04:02.000Z (about 8 years ago)
Last Synced: 2025-01-30T09:43:24.706Z (about 1 year ago)
Homepage:
Size: 401 KB
Stars: 4,417
Watchers: 758
Forks: 905
Open Issues: 6
Metadata Files:
- Readme: README.md
README

#### 2018-02

- The Matrix Calculus You Need For Deep Learning [[arXiv](https://arxiv.org/abs/1802.01528v2)]

- Regularized Evolution for Image Classifier Architecture Search [[arXiv](https://arxiv.org/abs/1802.01548)]

- Online Learning: A Comprehensive Survey [[arXiv](https://arxiv.org/abs/1802.02871)]

- Visual Interpretability for Deep Learning: a Survey [[arXiv](https://arxiv.org/abs/1802.00614)]

- Behavior is Everything – Towards Representing Concepts with Sensorimotor Contingencies [[paper](https://www.vicarious.com/wp-content/uploads/2018/01/AAAI18-pixelworld.pdf)] [[article](https://www.vicarious.com/2018/02/07/learning-concepts-through-sensorimotor-interactions/)] [[code](https://github.com/vicariousinc/pixelworld)]

- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures [[arXiv](https://arxiv.org/abs/1802.01561)] [[article](https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/)] [[code](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/dmlab30)]

- DeepType: Multilingual Entity Linking by Neural Type System Evolution [[arXiv](https://arxiv.org/abs/1802.01021)] [[article](https://blog.openai.com/discovering-types-for-entity-disambiguation/)] [[code](https://github.com/openai/deeptype)]

- DensePose: Dense Human Pose Estimation In The Wild [[arXiv](https://arxiv.org/abs/1802.00434)] [[article](http://densepose.org/)]

#### 2018-01

- Nested LSTMs [[arXiv](https://arxiv.org/abs/1801.10308)]

- Generating Wikipedia by Summarizing Long Sequences [[arXiv](https://arxiv.org/abs/1801.10198)]

- Scalable and accurate deep learning for electronic health records [[arXiv](https://arxiv.org/abs/1801.07860)]

- Kernel Feature Selection via Conditional Covariance Minimization [[NIPS paper](https://papers.nips.cc/paper/7270-kernel-feature-selection-via-conditional-covariance-minimization.pdf)] [[article](http://bair.berkeley.edu/blog/2018/01/23/kernels/)] [[code](https://github.com/Jianbo-Lab/CCM)]

- Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents [[arXiv](https://arxiv.org/abs/1801.08116)] [[article](https://deepmind.com/blog/open-sourcing-psychlab/)] [[code](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/psychlab)]

- Fine-tuned Language Models for Text Classification [[arXiv](https://arxiv.org/abs/1801.06146)] [[code]()] (soon)

- Deep Learning: An Introduction for Applied Mathematicians [[arXiv](https://arxiv.org/abs/1801.05894v1)]

- Innateness, AlphaZero, and Artificial Intelligence [[arXiv](https://arxiv.org/abs/1801.05667)]

- Can Computers Create Art? [[arXiv](https://arxiv.org/abs/1801.04486)]

- eCommerceGAN : A Generative Adversarial Network for E-commerce [[arXiv](https://arxiv.org/abs/1801.03244)]

- Expected Policy Gradients for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1801.03326)]

- DroNet: Learning to Fly by Driving [[UZH docs](http://rpg.ifi.uzh.ch/docs/RAL18_Loquercio.pdf)] [[article](http://rpg.ifi.uzh.ch/dronet.html)] [[code](https://github.com/uzh-rpg/rpg_public_dronet)]

- Symmetric Decomposition of Asymmetric Games [[Scientific Reports](https://www.nature.com/articles/s41598-018-19194-4)] [[article](https://deepmind.com/blog/game-theory-insights-asymmetric-multi-agent-games/)]

- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor [[arXiv](https://arxiv.org/abs/1801.01290)] [[code](https://github.com/haarnoja/sac)]

- SBNet: Sparse Blocks Network for Fast Inference [[arXiv](https://arxiv.org/pdf/1801.02108.pdf)] [[article](https://eng.uber.com/sbnet/)] [[code](https://github.com/uber/sbnet)]

- DeepMind Control Suite [[arXiv](https://arxiv.org/abs/1801.00690)] [[code](https://github.com/deepmind/dm_control)]

- Deep Learning: A Critical Appraisal [[arXiv](https://arxiv.org/abs/1801.00631)]

#### 2017-12

- Adversarial Patch [[arXiv](https://arxiv.org/abs/1712.09665)]

- CNN Is All You Need [[arXiv](https://arxiv.org/abs/1712.09662)]

- Learning Robot Objectives from Physical Human Interaction [[paper](http://proceedings.mlr.press/v78/bajcsy17a/bajcsy17a.pdf)] [[article](http://bair.berkeley.edu/blog/2018/02/06/phri/)]

- The NarrativeQA Reading Comprehension Challenge [[arXiv](https://arxiv.org/abs/1712.07040v1)] [[dataset](https://github.com/deepmind/narrativeqa)]

- Objects that Sound [[arXiv](https://arxiv.org/abs/1712.06651)]

- Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions [[arXiv](https://arxiv.org/abs/1712.05884)] [[article](https://research.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html)] [[article2](https://google.github.io/tacotron/publications/tacotron2/index.html)]

- Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1712.06567)] [[article](https://eng.uber.com/deep-neuroevolution/)] [[code](https://github.com/uber-common/deep-neuroevolution)]

- Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents [[arXiv](https://arxiv.org/abs/1712.06560)] [[article](https://eng.uber.com/deep-neuroevolution/)] [[code](https://github.com/uber-common/deep-neuroevolution)]

- Superhuman AI for heads-up no-limit poker: Libratus beats top professionals [[Science](http://science.sciencemag.org/content/early/2017/12/15/science.aao1733)]

- Mathematics of Deep Learning [[arXiv](https://arxiv.org/abs/1712.04741)]

- State-of-the-art Speech Recognition With Sequence-to-Sequence Models [[arXiv](https://arxiv.org/abs/1712.01769)] [[article](https://research.googleblog.com/2017/12/improving-end-to-end-models-for-speech.html)]

- Peephole: Predicting Network Performance Before Training [[arXiv](https://arxiv.org/abs/1712.03351)]

- Deliberation Network: Pushing the frontiers of neural machine translation [[Research at Microsoft](https://www.microsoft.com/en-us/research/publication/deliberation-networks-sequence-generation-beyond-one-pass-decoding/)] [[article](https://www.microsoft.com/en-us/research/blog/deliberation-networks/)]

- GPU Kernels for Block-Sparse Weights [[Research at OpenAI](https://s3-us-west-2.amazonaws.com/openai-assets/blocksparse/blocksparsepaper.pdf)] [[article](https://blog.openai.com/block-sparse-gpu-kernels/)] [[code](https://github.com/openai/blocksparse)]

- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm [[arXiv](https://arxiv.org/abs/1712.01815)]

- Deep Learning Scaling is Predictable, Empirically [[arXiv](https://arxiv.org/abs/1712.00409)] [[article](http://research.baidu.com/deep-learning-scaling-predictable-empirically/)]

#### 2017-11

- High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [[arXiv](https://arxiv.org/abs/1711.11585)] [[article](https://tcwang0509.github.io/pix2pixHD/)] [[code](https://github.com/NVIDIA/pix2pixHD)]

- StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [[arXiv](https://arxiv.org/abs/1711.09020)] [[code](https://github.com/yunjey/StarGAN/)]

- Population Based Training of Neural Networks [[arXiv](https://arxiv.org/abs/1711.09846)] [[article](https://deepmind.com/blog/population-based-training-neural-networks/)]

- Distilling a Neural Network Into a Soft Decision Tree [[arXiv](https://arxiv.org/abs/1711.09784)]

- Neural Text Generation: A Practical Guide [[arXiv](https://arxiv.org/abs/1711.09534)]

- Parallel WaveNet: Fast High-Fidelity Speech Synthesis [[DeepMind documents](https://deepmind.com/documents/131/Distilling_WaveNet.pdf)] [[article](https://deepmind.com/blog/high-fidelity-speech-synthesis-wavenet/)]

- CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning [[arXiv](https://arxiv.org/abs/1711.05225)] [[article](https://stanfordmlgroup.github.io/projects/chexnet/)]

- Non-local Neural Networks [[arXiv](https://arxiv.org/abs/1711.07971)]

- Deep Image Prior [[paper](https://sites.skoltech.ru/app/data/uploads/sites/25/2017/11/deep_image_prior.pdf)] [[article](https://dmitryulyanov.github.io/deep_image_prior)] [[code](https://github.com/DmitryUlyanov/deep-image-prior)]

- Online Deep Learning: Learning Deep Neural Networks on the Fly [[arXiv](https://arxiv.org/abs/1711.03705)]

- Learning Explanatory Rules from Noisy Data [[arXiv](https://arxiv.org/abs/1711.04574)]

- Improving Palliative Care with Deep Learning [[arXiv](https://arxiv.org/abs/1711.06402)] [[article](https://stanfordmlgroup.github.io/projects/improving-palliative-care/)]

- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [[arXiv](https://arxiv.org/abs/1711.06396)]

- Weighted Transformer Network for Machine Translation [[arXiv](https://arxiv.org/abs/1711.02132)] [[article](https://einstein.ai/research/weighted-transformer)]

- Non-Autoregressive Neural Machine Translation [[arXiv](https://arxiv.org/abs/1711.02281)] [[article](https://einstein.ai/research/non-autoregressive-neural-machine-translation)]

- Block-Sparse Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1711.02782)]

- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning [[arXiv](https://arxiv.org/abs/1711.00832)]

- Neural Discrete Representation Learning [[arXiv](https://arxiv.org/abs/1711.00937)] [[article](https://avdnoord.github.io/homepage/vqvae/)]

- Don't Decay the Learning Rate, Increase the Batch Size [[arXiv](https://arxiv.org/abs/1711.00489)]

- Hierarchical Representations for Efficient Architecture Search [[arXiv](https://arxiv.org/abs/1711.00436)]

#### 2017-10

- Unsupervised Machine Translation Using Monolingual Corpora Only [[arXiv](https://arxiv.org/abs/1711.00043)]

- Dynamic Routing Between Capsules [[arXiv](https://arxiv.org/abs/1710.09829)]

- A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [[Science](http://science.sciencemag.org/content/early/2017/10/26/science.aag2612.full)] [[article](https://www.vicarious.com/2017/10/26/common-sense-cortex-and-captcha/)] [[code](https://github.com/vicariousinc/science_rcn)]

- Understanding Grounded Language Learning Agents [[arXiv](https://arxiv.org/abs/1710.09867)]

- Planning, Fast and Slow: A Framework for Adaptive Real-Time Safe Trajectory Planning [[arXiv](https://arxiv.org/abs/1710.04731)] [[article](http://bair.berkeley.edu/blog/2017/12/05/fastrack/)] [[code](https://github.com/HJReachability)] (soon)

- Malware Detection by Eating a Whole EXE [[arXiv](https://arxiv.org/abs/1710.09435)] [[article](https://devblogs.nvidia.com/malware-detection-neural-networks/)]

- Progressive Growing of GANs for Improved Quality, Stability, and Variation [[Research at Nvidia](http://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of//karras2017gan-paper.pdf)] [[article](http://research.nvidia.com/publication/2017-10_Progressive-Growing-of)] [[code](https://github.com/tkarras/progressive_growing_of_gans)]

- Meta Learning Shared Hierarchies [[arXiv](https://arxiv.org/abs/1710.09767)] [[article](https://blog.openai.com/learning-a-hierarchy/)] [[code](https://github.com/openai/mlsh)]

- Deep Voice 3: 2000-Speaker Neural Text-to-Speech [[arXiv](https://arxiv.org/abs/1710.07654)] [[article](http://research.baidu.com/deep-voice-3-2000-speaker-neural-text-speech/)]

- AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions [[arXiv](https://arxiv.org/abs/1705.08421)] [[article](https://research.googleblog.com/2017/10/announcing-ava-finely-labeled-video.html)] [[dataset](https://research.google.com/ava/)]

- Mastering the game of Go without Human Knowledge [[Nature](https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ)] [[article](https://deepmind.com/blog/alphago-zero-learning-scratch/)]

- Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [[arXiv](https://arxiv.org/abs/1710.06537)] [[article](https://blog.openai.com/generalizing-from-simulation/)]

- Asymmetric Actor Critic for Image-Based Robot Learning [[arXiv](https://arxiv.org/abs/1710.06542)] [[article](https://blog.openai.com/generalizing-from-simulation/)]

- A systematic study of the class imbalance problem in convolutional neural networks [[arXiv](https://arxiv.org/abs/1710.05381)]

- Generalization in Deep Learning [[arXiv](https://arxiv.org/abs/1710.05468)]

- Swish: a Self-Gated Activation Function [[arXiv](https://arxiv.org/abs/1710.05941)]

- Emergent Translation in Multi-Agent Communication [[arXiv](https://arxiv.org/abs/1710.06922)]

- SLING: A framework for frame semantic parsing [[arXiv](https://arxiv.org/abs/1710.07032)] [[article](https://research.googleblog.com/2017/11/sling-natural-language-frame-semantic.html)] [[code](https://github.com/google/sling)]

- Meta-Learning for Wrestling [[arXiv](https://arxiv.org/abs/1710.03641)] [[article](https://blog.openai.com/meta-learning-for-wrestling/)] [[code](https://github.com/openai/robosumo)]

- Mixed Precision Training [[arXiv](https://arxiv.org/abs/1710.03740)] [[article](http://research.baidu.com/mixed-precision-training/)] [[article2](https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/)] [[code/docs](http://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html)]

- Generative Adversarial Networks: An Overview [[arXiv](https://arxiv.org/abs/1710.07035)]

- Emergent Complexity via Multi-Agent Competition [[arXiv](https://arxiv.org/abs/1710.03748)] [[article](https://blog.openai.com/competitive-self-play/)] [[code](https://github.com/openai/multiagent-competition)]

- Deep Lattice Networks and Partial Monotonic Functions [[Research at Google](https://research.google.com/pubs/pub46327.html)] [[article](https://research.googleblog.com/2017/10/tensorflow-lattice-flexibility.html)] [[code](https://github.com/tensorflow/lattice)]

- The IIT Bombay English-Hindi Parallel Corpus [[arXiv](https://arxiv.org/abs/1710.02855)] [[article](http://www.cfilt.iitb.ac.in/iitb_parallel/)]

- Rainbow: Combining Improvements in Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1710.02298)]

- Lifelong Learning With Dynamically Expandable Networks [[arXiv](https://arxiv.org/abs/1708.01547)]

- Variational Inference & Deep Learning: A New Synthesis (Thesis) [[dropbox](https://www.dropbox.com/s/v6ua3d9yt44vgb3/cover_and_thesis.pdf)]

- Neural Task Programming: Learning to Generalize Across Hierarchical Tasks [[arXiv](https://arxiv.org/abs/1710.01813)]

- Neural Color Transfer between Images [[arXiv](https://arxiv.org/abs/1710.00756)]

- The hippocampus as a predictive map [[bioRxiv](https://www.biorxiv.org/content/biorxiv/early/2017/07/25/097170.full.pdf)] [[article](https://deepmind.com/blog/hippocampus-predictive-map/)]

- Scalable and accurate deep learning for electronic health records [[arXiv](https://arxiv.org/abs/1801.07860)]

#### 2017-09

- Variational Memory Addressing in Generative Models [[arXiv](https://arxiv.org/abs/1709.07116)]

- Overcoming Exploration in Reinforcement Learning with Demonstrations [[arXiv](https://arxiv.org/abs/1709.10089)]

- A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement [[arXiv](https://arxiv.org/abs/1709.08243)] [[article](https://people.xiph.org/~jm/demo/rnnoise/)] [[code](https://github.com/xiph/rnnoise/)]

- ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases [[CVF](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf)] [[article](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community)] [[dataset](https://nihcc.app.box.com/v/ChestXray-NIHCC)]

- NIMA: Neural Image Assessment [[arXiv](https://arxiv.org/abs/1709.05424)] [[article](https://research.googleblog.com/2017/12/introducing-nima-neural-image-assessment.html)]

- Generating Sentences by Editing Prototypes [[arXiv](https://arxiv.org/abs/1709.08878)] [[code](https://github.com/kelvinguu/neural-editor)]

- The Consciousness Prior [[arXiv](https://arxiv.org/abs/1709.08568)]

- StarSpace: Embed All The Things! [[arXiv](https://arxiv.org/abs/1709.03856)] [[code](https://github.com/facebookresearch/Starspace)]

- Neural Optimizer Search with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1709.07417)]

- Dynamic Evaluation of Neural Sequence Models [[arXiv](https://arxiv.org/abs/1709.07432)]

- Neural Machine Translation [[arXiv](https://arxiv.org/abs/1709.07809)]

- Matterport3D: Learning from RGB-D Data in Indoor Environments [[arXiv](https://arxiv.org/abs/1709.06158)] [[article](https://niessner.github.io/Matterport/)] [[article2](https://hackernoon.com/announcing-the-matterport3d-research-dataset-815cae932939)] [[code](https://github.com/niessner/Matterport)]

- Deep Reinforcement Learning that Matters [[arXiv](https://arxiv.org/abs/1709.06560)] [[code](https://github.com/Breakend/DeepReinforcementLearningThatMatters)]

- The Uncertainty Bellman Equation and Exploration [[arXiv](https://arxiv.org/abs/1709.05380)]

- WESPE: Weakly Supervised Photo Enhancer for Digital Cameras [[arXiv](https://arxiv.org/abs/1709.01118)] [[article](http://people.ee.ethz.ch/~ihnatova/wespe.html)]

- Globally Normalized Reader [[arXiv](https://arxiv.org/abs/1709.02828)] [[article](http://research.baidu.com/gnr/)] [[code](https://github.com/baidu-research/GloballyNormalizedReader)]

- A Brief Introduction to Machine Learning for Engineers [[arXiv](https://arxiv.org/abs/1709.02840)]

- Learning with Opponent-Learning Awareness [[arXiv](https://arxiv.org/abs/1709.04326)] [[article](https://blog.openai.com/learning-to-model-other-minds/)]

- A Deep Reinforcement Learning Chatbot [[arXiv](https://arxiv.org/abs/1709.02349)]

- Squeeze-and-Excitation Networks [[arXiv](https://arxiv.org/abs/1709.01507)]

- Efficient Methods and Hardware for Deep Learning (Thesis) [[Stanford Digital Repository](https://purl.stanford.edu/qf934gh3708)]

#### 2017-08

- Design and Analysis of the NIPS 2016 Review Process [[arXiv](https://arxiv.org/abs/1708.09794)]

- Fast Automated Analysis of Strong Gravitational Lenses with Convolutional Neural Networks [[arXiv](https://arxiv.org/abs/1708.08842)] [[article](http://www.symmetrymagazine.org/article/neural-networks-meet-space)]

- TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow [[white paper](https://drive.google.com/file/d/0B20Yn-GSaVHGMVlPanRTRlNIRlk/view)] [[code](https://github.com/tensorflow/agents)]

- Automated Crowdturfing Attacks and Defenses in Online Review Systems [[arXiv](https://arxiv.org/abs/1708.08151)]

- Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning [[arXiv](https://arxiv.org/abs/1708.02596)] [[article](http://bair.berkeley.edu/blog/2017/11/30/model-based-rl/)] [[code](https://github.com/nagaban2/nn_dynamics)]

- Deep Learning for Video Game Playing [[arXiv](https://arxiv.org/abs/1708.07902)]

- Deep & Cross Network for Ad Click Predictions [[arXiv](https://arxiv.org/abs/1708.05123)]

- Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms [[arXiv](https://arxiv.org/abs/1708.07747)] [[code](https://github.com/zalandoresearch/fashion-mnist)]

- Multi-task Self-Supervised Visual Learning [[arXiv](https://arxiv.org/abs/1708.07860)]

- Learning a Multi-View Stereo Machine [[arXiv](https://arxiv.org/abs/1708.05375)] [[article](http://bair.berkeley.edu/blog/2017/09/05/unified-3d/)] [[code]()] (soon)

- Twin Networks: Using the Future as a Regularizer [[arXiv](https://arxiv.org/abs/1708.06742)]

- A Brief Survey of Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1708.05866)]

- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [[arXiv](https://arxiv.org/abs/1708.05144)] [[code](https://github.com/openai/baselines)]

- On the Effectiveness of Visible Watermarks [[CVPR](http://openaccess.thecvf.com/content_cvpr_2017/papers/Dekel_On_the_Effectiveness_CVPR_2017_paper.pdf)] [[article](https://research.googleblog.com/2017/08/making-visible-watermarks-more-effective.html)]

- Practical Network Blocks Design with Q-Learning [[arXiv](https://arxiv.org/abs/1708.05552)]

- On Ensuring that Intelligent Machines Are Well-Behaved [[arXiv](https://arxiv.org/abs/1708.05448)]

- Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control [[arXiv](https://arxiv.org/abs/1708.04133)] [[code](https://github.com/Breakend/ReproducibilityInContinuousPolicyGradientMethods)]

- Training Deep AutoEncoders for Collaborative Filtering [[arXiv](https://arxiv.org/abs/1708.01715)] [[code](https://github.com/NVIDIA/DeepRecommender)]

- Learning to Perform a Perched Landing on the Ground Using Deep Reinforcement Learning [[Springer](https://link.springer.com/epdf/10.1007/s10846-017-0696-1?author_access_token=BEvJgzY3QauUddBuQAus2ve4RwlQNchNByi7wbcMAY5xhRRqI6HVNnXt8Pgp850SnuV5ue6mUo3Jc7FIP5FgLmqk34Wob3oqyuGtkg7E_1T0dg02IYhfY-3dvb8R9zEmaGzTogYCIXm4O4vZ_tSGnA%3D%3D)]

- Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification [[arXiv](https://arxiv.org/abs/1708.03805)] [[article](http://research.baidu.com/spatial-temporal-modeling-framework-large-scale-video-understanding/)]

- Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning [[arXiv](https://arxiv.org/abs/1708.02190)]

- Neural Expectation Maximization [[arXiv](https://arxiv.org/abs/1708.03498)] [[code](https://github.com/sjoerdvansteenkiste/)]

- Google Vizier: A Service for Black-Box Optimization [[Research at Google](https://research.google.com/pubs/pub46180.html)]

- STARDATA: A StarCraft AI Research Dataset [[arXiv](https://arxiv.org/abs/1708.02139)] [[code](https://github.com/TorchCraft/StarData)]

- Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm [[arXiv](https://arxiv.org/abs/1708.00524)] [[code](https://github.com/bfelbo/deepmoji)] [[article](https://www.media.mit.edu/posts/what-can-we-learn-from-emojis/)]

- Natural Language Processing with Small Feed-Forward Networks [[arXiv](https://arxiv.org/abs/1708.00214)]

#### 2017-07

- Photographic Image Synthesis with Cascaded Refinement Networks [[arXiv](https://arxiv.org/abs/1707.09405)] [[code](https://github.com/CQFIO/PhotographicImageSynthesis)]

- StarCraft II: A New Challenge for Reinforcement Learning [[DeepMind Documents](https://deepmind.com/documents/110/sc2le.pdf)] [[code](https://github.com/deepmind/pysc2)] [[article](https://deepmind.com/blog/deepmind-and-blizzard-open-starcraft-ii-ai-research-environment/)]

- Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards [[arXiv](https://arxiv.org/abs/1707.08817)]

- Reinforcement Learning with Deep Energy-Based Policies [[arXiv](https://arxiv.org/abs/1702.08165)] [[article](http://bair.berkeley.edu/blog/2017/10/06/soft-q-learning/)] [[code](https://github.com/haarnoja/softqlearning)]

- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.08475)]

- Synthesizing Robust Adversarial Examples [[arXiv](https://arxiv.org/abs/1707.07397)] [[article](http://www.labsix.org/physical-objects-that-fool-neural-nets/)] [[code]()] (soon)

- Voice Synthesis for in-the-Wild Speakers via a Phonological Loop [[arXiv](https://arxiv.org/abs/1707.06588)] [[code](https://github.com/facebookresearch/loop)] [[article](https://ytaigman.github.io/loop/)]

- Eyemotion: Classifying facial expressions in VR using eye-tracking cameras [[arXiv](https://arxiv.org/abs/1707.07204)] [[article](https://research.googleblog.com/2017/07/expressions-in-virtual-reality.html)]

- A Distributional Perspective on Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.06887)] [[article](https://deepmind.com/blog/going-beyond-average-reinforcement-learning/)] [[video](https://vimeo.com/235922311)]

- On the State of the Art of Evaluation in Neural Language Models [[arXiv](https://arxiv.org/abs/1707.05589)]

- Optimizing the Latent Space of Generative Networks [[arXiv](https://arxiv.org/abs/1707.05776)]

- Neuroscience-Inspired Artificial Intelligence [[Neuron](http://www.cell.com/neuron/fulltext/S0896-6273(17)30509-3?_returnURL=http%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0896627317305093%3Fshowall%3Dtrue)] [[article](https://deepmind.com/blog/ai-and-neuroscience-virtuous-circle/)]

- Learning Transferable Architectures for Scalable Image Recognition [[arXiv](https://arxiv.org/abs/1707.07012)]

- Reverse Curriculum Generation for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.05300)]

- Imagination-Augmented Agents for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.06203)] [[article](https://deepmind.com/blog/agents-imagine-and-plan/)]

- Learning model-based planning from scratch [[arXiv](https://arxiv.org/abs/1707.06170)] [[article](https://deepmind.com/blog/agents-imagine-and-plan/)]

- Proximal Policy Optimization Algorithms [[AWSS3](https://openai-public.s3-us-west-2.amazonaws.com/blog/2017-07/ppo/ppo-arxiv.pdf)] [[code](https://github.com/openai/baselines)]

- Automatic Recognition of Deceptive Facial Expressions of Emotion [[arXiv](https://arxiv.org/abs/1707.04061)]

- Distral: Robust Multitask Reinforcement Learning [[arXiv](https://arxiv.org/abs/1707.04175)]

- Creatism: A deep-learning photographer capable of creating professional work [[arXiv](https://arxiv.org/abs/1707.03491)] [[article](https://research.googleblog.com/2017/07/using-deep-learning-to-create.html)]

- SCAN: Learning Abstract Hierarchical Compositional Visual Concepts [[arXiv](https://arxiv.org/abs/1707.03389)] [[article](https://deepmind.com/blog/imagine-creating-new-visual-concepts-recombining-familiar-ones/)]

- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era [[arXiv](https://arxiv.org/abs/1707.02968)] [[article](https://research.googleblog.com/2017/07/revisiting-unreasonable-effectiveness.html)]

- The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously [[arXiv](https://arxiv.org/abs/1707.03300)]

- Deep Bilateral Learning for Real-Time Image Enhancement [[arXiv](https://arxiv.org/abs/1707.02880)] [[code](https://github.com/mgharbi/hdrnet)] [[article](https://groups.csail.mit.edu/graphics/hdrnet/)]

- Emergence of Locomotion Behaviours in Rich Environments [[arXiv](https://arxiv.org/abs/1707.02286)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]

- Learning human behaviors from motion capture by adversarial imitation [[arXiv](https://arxiv.org/abs/1707.02201)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]

- Robust Imitation of Diverse Behaviors [[arXiv](https://deepmind.com/documents/95/diverse_arxiv.pdf)] [[article](https://deepmind.com/blog/producing-flexible-behaviours-simulated-environments/)]

- [Hindsight Experience Replay](notes/hindsight-ep.md) [[arXiv](https://arxiv.org/abs/1707.01495)]

- Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks [[arXiv](https://arxiv.org/abs/1707.01836)] [[article](https://stanfordmlgroup.github.io/projects/ecg/)]

- End-to-End Learning of Semantic Grasping [[arXiv](https://arxiv.org/abs/1707.01932)]

- ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games [[arXiv](https://arxiv.org/abs/1707.01067)] [[code](https://github.com/facebookresearch/ELF)] [[article](https://code.facebook.com/posts/132985767285406/introducing-elf-an-extensive-lightweight-and-flexible-platform-for-game-research/)]

#### 2017-06

- [Noisy Networks for Exploration](notes/noisy-networks-4-exploration.md) [[arXiv](https://arxiv.org/abs/1706.10295)]

- Do GANs actually learn the distribution? An empirical study [[arXiv](https://arxiv.org/abs/1706.08224)]

- Gradient Episodic Memory for Continuum Learning [[arXiv](https://arxiv.org/abs/1706.08840)]

- Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog [[arXiv](https://arxiv.org/abs/1706.08502)] [[code](https://github.com/batra-mlp-lab/lang-emerge)]

- Deep Interest Network for Click-Through Rate Prediction [[arXiv](https://arxiv.org/abs/1706.06978)]

- Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study [[arXiv](https://arxiv.org/abs/1706.08606)] [[article](https://deepmind.com/blog/cognitive-psychology/)]

- Structure Learning in Motor Control: A Deep Reinforcement Learning Model [[arXiv](https://arxiv.org/abs/1706.06827)]

- Programmable Agents [[arXiv](https://arxiv.org/abs/1706.06383)]

- Grounded Language Learning in a Simulated 3D World [[arXiv](https://arxiv.org/abs/1706.06551)]

- Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics [[arXiv](https://arxiv.org/abs/1706.04317)]

- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability [[arXiv](https://arxiv.org/abs/1706.05806)] [[article](https://research.googleblog.com/2017/11/interpreting-deep-neural-networks-with.html)] [[code](https://github.com/google/svcca)]

- One Model To Learn Them All [[arXiv](https://arxiv.org/abs/1706.05137)] [[code](https://github.com/tensorflow/tensor2tensor)] [[article](https://research.googleblog.com/2017/06/multimodel-multi-task-machine-learning.html)]

- Hybrid Reward Architecture for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1706.04208)]

- Expected Policy Gradients [[arXiv](https://arxiv.org/abs/1706.05374)]

- Variational Approaches for Auto-Encoding Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1706.04987)]

- Deal or No Deal? End-to-End Learning for Negotiation Dialogues [[S3AWS](https://s3.amazonaws.com/end-to-end-negotiator/end-to-end-negotiator.pdf)] [[code](https://github.com/facebookresearch/end-to-end-negotiator)] [[article](https://code.facebook.com/posts/1686672014972296/deal-or-no-deal-training-ai-bots-to-negotiate/)]

- Attention Is All You Need [[arXiv](https://arxiv.org/abs/1706.03762)] [[code](https://github.com/tensorflow/tensor2tensor)] [[article](https://research.googleblog.com/2017/08/transformer-novel-neural-network.html)]

- Sobolev Training for Neural Networks [[arXiv](https://arxiv.org/abs/1706.04859)]

- YellowFin and the Art of Momentum Tuning [[arXiv](https://arxiv.org/abs/1706.03471)] [[code](https://github.com/JianGoForIt/YellowFin)] [[article](http://dawn.cs.stanford.edu/2017/07/05/yellowfin/)]

- Forward Thinking: Building and Training Neural Networks One Layer at a Time [[arXiv](https://arxiv.org/abs/1706.02480)]

- Depthwise Separable Convolutions for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1706.03059)] [[code](https://github.com/tensorflow/tensor2tensor)]

- Parameter Space Noise for Exploration [[arXiv](https://arxiv.org/abs/1706.01905)] [[code](https://github.com/openai/baselines)] [[article](https://blog.openai.com/better-exploration-with-parameter-noise/)]

- Deep Reinforcement Learning from human preferences [[arXiv](https://arxiv.org/abs/1706.03741)] [[article](https://blog.openai.com/deep-reinforcement-learning-from-human-preferences/)]

- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [[arXiv](https://arxiv.org/abs/1706.02275)] [[code](https://github.com/openai/multiagent-particle-envs)]

- Self-Normalizing Neural Networks [[arXiv](https://arxiv.org/abs/1706.02515)] [[code](https://github.com/bioinf-jku/SNNs)]

- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour [[arXiv](https://arxiv.org/abs/1706.02677)]

- A simple neural network module for relational reasoning [[arXiv](https://arxiv.org/abs/1706.01427)] [[article](https://deepmind.com/blog/neural-approach-relational-reasoning/)]

- Visual Interaction Networks [[arXiv](https://arxiv.org/abs/1706.01433)] [[article](https://deepmind.com/blog/neural-approach-relational-reasoning/)]

#### 2017-05

- Supervised Learning of Universal Sentence Representations from Natural Language Inference Data [[arXiv](https://arxiv.org/abs/1705.02364)] [[code](https://github.com/facebookresearch/InferSent)]

- pix2code: Generating Code from a Graphical User Interface Screenshot [[arXiv](https://arxiv.org/abs/1705.07962)] [[article](https://uizard.io/research#pix2code)] [[code](https://github.com/tonybeltramelli/pix2code)]

- The Cramer Distance as a Solution to Biased Wasserstein Gradients [[arXiv](https://arxiv.org/abs/1705.10743)]

- Reinforcement Learning with a Corrupted Reward Channel [[arXiv](https://arxiv.org/abs/1705.08417)]

- Dilated Residual Networks [[arXiv](https://arxiv.org/abs/1705.09914)] [[code](https://github.com/fyu/drn)]

- Bayesian GAN [[arXiv](https://arxiv.org/abs/1705.09558)] [[code](https://github.com/andrewgordonwilson/bayesgan/)]

- Gradient Descent Can Take Exponential Time to Escape Saddle Points [[arXiv](https://arxiv.org/abs/1705.10412)] [[article](http://bair.berkeley.edu/blog/2017/08/31/saddle-efficiency/)]

- Thinking Fast and Slow with Deep Learning and Tree Search [[arXiv](https://arxiv.org/abs/1705.08439)]

- ParlAI: A Dialog Research Software Platform [[arXiv](https://arxiv.org/abs/1705.06476)] [[code](https://github.com/facebookresearch/ParlAI)] [[article](https://code.facebook.com/posts/266433647155520/parlai-a-new-software-platform-for-dialog-research/)]

- Semantically Decomposing the Latent Spaces of Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1705.07904)] [[article](https://aws.amazon.com/blogs/ai/combining-deep-learning-networks-gan-and-siamese-to-generate-high-quality-life-like-images/)]

- Look, Listen and Learn [[arXiv](https://arxiv.org/abs/1705.08168)]

- Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [[arXiv](https://arxiv.org/abs/1705.07750)] [[code](https://github.com/deepmind/kinetics-i3d)]

- Convolutional Sequence to Sequence Learning [[arXiv](https://arxiv.org/abs/1705.03122)] [[code](https://github.com/facebookresearch/fairseq)] [[code2](https://github.com/facebookresearch/fairseq-py)] [[article](https://code.facebook.com/posts/1978007565818999/a-novel-approach-to-neural-machine-translation/)]

- The Kinetics Human Action Video Dataset [[arXiv](https://arxiv.org/abs/1705.06950)] [[article](https://deepmind.com/research/open-source/open-source-datasets/kinetics/)]

- Safe and Nested Subgame Solving for Imperfect-Information Games [[arXiv](https://arxiv.org/abs/1705.02955)]

- Discrete Sequential Prediction of Continuous Actions for Deep RL [[arXiv](https://arxiv.org/abs/1705.05035)]

- Metacontrol for Adaptive Imagination-Based Optimization [[arXiv](https://arxiv.org/abs/1705.02670)]

- Efficient Parallel Methods for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1705.04862)]

- Real-Time Adaptive Image Compression [[arXiv](https://arxiv.org/abs/1705.05823)]

#### 2017-04

- General Video Game AI: Learning from Screen Capture [[arXiv](https://arxiv.org/abs/1704.06945)]

- Learning to Skim Text [[arXiv](https://arxiv.org/abs/1704.06877)]

- Get To The Point: Summarization with Pointer-Generator Networks [[arXiv](https://arxiv.org/abs/1704.04368)] [[code](https://github.com/abisee/pointer-generator)] [[article](http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html)]

- Adversarial Neural Machine Translation [[arXiv](https://arxiv.org/abs/1704.06933)]

- [Deep Q-learning from Demonstrations](notes/dqn-demonstrations.md) [[arXiv](https://arxiv.org/abs/1704.03732)]

- Learning from Demonstrations for Real World Reinforcement Learning [[arXiv](https://arxiv.org/abs/1704.03732)]

- DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks [[arXiv](https://arxiv.org/abs/1704.02470)] [[article](http://people.ee.ethz.ch/~ihnatova/)] [[code](https://github.com/aiff22/DPED)]

- A Neural Representation of Sketch Drawings [[arXiv](https://arxiv.org/abs/1704.03477)] [[code](https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn)] [[article](https://research.googleblog.com/2017/04/teaching-machines-to-draw.html)]

- Automated Curriculum Learning for Neural Networks [[arXiv](https://arxiv.org/abs/1704.03003)]

- Hierarchical Surface Prediction for 3D Object Reconstruction [[arXiv](https://arxiv.org/abs/1704.00710)] [[article](http://bair.berkeley.edu/blog/2017/08/23/high-quality-3d-obj-reconstruction/)]

- Neural Message Passing for Quantum Chemistry [[arXiv](https://arxiv.org/abs/1704.01212)]

- Learning to Generate Reviews and Discovering Sentiment [[arXiv](https://arxiv.org/abs/1704.01444)] [[code](https://github.com/openai/generating-reviews-discovering-sentiment)]

- Best Practices for Applying Deep Learning to Novel Applications [[arXiv](https://arxiv.org/abs/1704.01568)]

#### 2017-03

- Improved Training of Wasserstein GANs [[arXiv](https://arxiv.org/abs/1704.00028)]

- Evolution Strategies as a Scalable Alternative to Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.03864)]

- Controllable Text Generation [[arXiv](https://arxiv.org/abs/1703.00955)]

- Neural Episodic Control [[arXiv](https://arxiv.org/abs/1703.01988)]

- [A Structured Self-attentive Sentence Embedding](notes/self_attention_embedding.md) [[arXiv](https://arxiv.org/abs/1703.03130)]

- Multi-step Reinforcement Learning: A Unifying Algorithm [[arXiv](https://arxiv.org/abs/1703.01327)]

- Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG [[arXiv](https://arxiv.org/abs/1703.05051)]

- FaSTrack: a Modular Framework for Fast and Guaranteed Safe Motion Planning [[arXiv](https://arxiv.org/abs/1703.07373)] [[article](http://bair.berkeley.edu/blog/2017/12/05/fastrack/)] [[article2](http://sylviaherbert.com/fastrack/)]

- Massive Exploration of Neural Machine Translation Architectures [[arXiv](https://arxiv.org/abs/1703.03906)] [[code](https://github.com/google/seq2seq)]

- Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression [[arXiv](https://arxiv.org/abs/1703.07834)] [[article](http://aaronsplace.co.uk/papers/jackson2017recon/)] [[code](https://github.com/AaronJackson/vrn)]

- Minimax Regret Bounds for Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.05449)]

- Sharp Minima Can Generalize For Deep Nets [[arXiv](https://arxiv.org/abs/1703.04933)]

- Parallel Multiscale Autoregressive Density Estimation [[arXiv](https://arxiv.org/abs/1703.03664)]

- Neural Machine Translation and Sequence-to-sequence Models: A Tutorial [[arXiv](https://arxiv.org/abs/1703.01619)]

- Large-Scale Evolution of Image Classifiers [[arXiv](https://arxiv.org/abs/1703.01041)]

- FeUdal Networks for Hierarchical Reinforcement Learning [[arXiv](https://arxiv.org/abs/1703.01161)]

- Evolving Deep Neural Networks [[arXiv](https://arxiv.org/abs/1703.00548)]

- How to Escape Saddle Points Efficiently [[arXiv](https://arxiv.org/abs/1703.00887)] [[article](http://bair.berkeley.edu/blog/2017/08/31/saddle-efficiency/)]

- Opening the Black Box of Deep Neural Networks via Information [[arXiv](https://arxiv.org/abs/1703.00810)] [[video](https://youtu.be/bLqJHjXihK8)]

- Understanding Synthetic Gradients and Decoupled Neural Interfaces [[arXiv](https://arxiv.org/abs/1703.00522)]

- Learning to Optimize Neural Nets [[arXiv](https://arxiv.org/abs/1703.00441)] [[article](http://bair.berkeley.edu/blog/2017/09/12/learning-to-optimize-with-rl/)]

#### 2017-02

- The Shattered Gradients Problem: If resnets are the answer, then what is the question? [[arXiv](https://arxiv.org/abs/1702.08591)]

- Neural Map: Structured Memory for Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.08360)]

- Bridging the Gap Between Value and Policy Based Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.08892)]

- Deep Voice: Real-time Neural Text-to-Speech [[arXiv](https://arxiv.org/abs/1702.07825)]

- Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning [[arXiv](https://arxiv.org/abs/1702.06230)]

- The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI [[arXiv](https://arxiv.org/abs/1702.05663)]

- Learning to Parse and Translate Improves Neural Machine Translation [[arXiv](https://arxiv.org/abs/1702.03525)]

- All-but-the-Top: Simple and Effective Postprocessing for Word Representations [[arXiv](https://arxiv.org/abs/1702.01417)]

- Deep Learning with Dynamic Computation Graphs [[arXiv](https://arxiv.org/abs/1702.02181)]

- Skip Connections as Effective Symmetry-Breaking [[arXiv](https://arxiv.org/abs/1701.09175)]

- Semi-Supervised QA with Generative Domain-Adaptive Nets [[arXiv](https://arxiv.org/abs/1702.02206)]

#### 2017-01

- Wasserstein GAN [[arXiv](https://arxiv.org/abs/1701.07875)]

- Deep Reinforcement Learning: An Overview [[arXiv](https://arxiv.org/abs/1701.07274)]

- DyNet: The Dynamic Neural Network Toolkit [[arXiv](https://arxiv.org/abs/1701.03980)]

- DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker [[arXiv](https://arxiv.org/abs/1701.01724)]

- NIPS 2016 Tutorial: Generative Adversarial Networks [[arXiv](https://arxiv.org/abs/1701.00160)]

#### 2016-12

- [A recurrent neural network without Chaos](notes/rnn_no_chaos.md) [[arXiv](https://arxiv.org/abs/1612.06212)]

- Language Modeling with Gated Convolutional Networks [[arXiv](https://arxiv.org/abs/1612.08083)]

- EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis [[arXiv](https://arxiv.org/abs/1612.07919)] [[article](http://webdav.tuebingen.mpg.de/pixel/enhancenet/)]

- Learning from Simulated and Unsupervised Images through Adversarial Training [[arXiv](https://arxiv.org/abs/1612.07828)]

- How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs [[arXiv](https://arxiv.org/abs/1612.04629)]

- Improving Neural Language Models with a Continuous Cache [[arXiv](https://arxiv.org/abs/1612.04426)]

- DeepMind Lab [[arXiv](https://arxiv.org/abs/1612.03801)] [[code](https://github.com/deepmind/lab)]

- Deep Learning of Robotic Tasks without a Simulator using Strong and Weak Human Supervision [[arXiv](https://arxiv.org/abs/1612.01086)]

- Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning [[arXiv](https://arxiv.org/abs/1612.01887)]

- Overcoming catastrophic forgetting in neural networks [[arXiv](https://arxiv.org/abs/1612.00796)]

#### 2016-11 (ICLR Edition)

- Image-to-Image Translation with Conditional Adversarial Networks [[arXiv](https://arxiv.org/abs/1611.07004)]

- [Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer](notes/mixture-experts.md) [[OpenReview](https://openreview.net/forum?id=B1ckMDqlg)]

- Learning to reinforcement learn [[arXiv](https://arxiv.org/abs/1611.05763)]

- A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs [[arXiv](https://arxiv.org/abs/1611.05104)]

- [Adversarial Training Methods for Semi-Supervised Text Classification](notes/adversarial-text-classification.md) [[arXiv](https://arxiv.org/abs/1605.07725)]

- Importance Sampling with Unequal Support [[arXiv](https://arxiv.org/abs/1611.03451)]

- Quasi-Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1611.01576)]

- Capacity and Learnability in Recurrent Neural Networks [[OpenReview](http://openreview.net/forum?id=BydARw9ex)]

- Unrolled Generative Adversarial Networks [[OpenReview](http://openreview.net/forum?id=BydrOIcle)]

- Deep Information Propagation [[OpenReview](http://openreview.net/forum?id=H1W1UN9gg)]

- Structured Attention Networks [[OpenReview](http://openreview.net/forum?id=HkE0Nvqlg)]

- Incremental Sequence Learning [[arXiv](https://arxiv.org/abs/1611.03068)]

- Delving into Transferable Adversarial Examples and Black-box Attacks [[arXiv](https://arxiv.org/abs/1611.02770)] [[code](https://github.com/ReDeiPirati/transferability-advdnn-pub)]

- b-GAN: Unified Framework of Generative Adversarial Networks [[OpenReview](http://openreview.net/forum?id=S1JG13oee)]

- A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks [[OpenReview](http://openreview.net/forum?id=SJZAb5cel)]

- Categorical Reparameterization with Gumbel-Softmax [[arXiv](https://arxiv.org/abs/1611.01144)]

- Lip Reading Sentences in the Wild [[arXiv](https://arxiv.org/abs/1611.05358)]

Reinforcement Learning:

- Learning to reinforcement learn [[arXiv](https://arxiv.org/abs/1611.05763)]

- A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [[arXiv](https://arxiv.org/abs/1611.03852)]

- The Predictron: End-To-End Learning and Planning [[OpenReview](http://openreview.net/forum?id=BkJsCIcgl)]

- [Third-Person Imitation Learning](notes/third-person-imitation-learning.md) [[OpenReview](http://openreview.net/forum?id=B16dGcqlx)]

- Generalizing Skills with Semi-Supervised Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=ryHlUtqge)]

- Sample Efficient Actor-Critic with Experience Replay [[OpenReview](http://openreview.net/forum?id=HyM25Mqel)]

- [Reinforcement Learning with Unsupervised Auxiliary Tasks](notes/rl-auxiliary-tasks.md) [[arXiv](https://arxiv.org/abs/1611.05397)]

- Neural Architecture Search with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=r1Ue8Hcxg)]

- Towards Information-Seeking Agents [[OpenReview](http://openreview.net/forum?id=SyW2QSige)]

- Multi-Agent Cooperation and the Emergence of (Natural) Language [[OpenReview](http://openreview.net/forum?id=Hk8N3Sclg)]

- Improving Policy Gradient by Exploring Under-appreciated Rewards [[OpenReview](http://openreview.net/forum?id=ryT4pvqll)]

- Stochastic Neural Networks for Hierarchical Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=B1oK8aoxe)]

- Tuning Recurrent Neural Networks with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1611.02796)]

- RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning [[arXiv](https://arxiv.org/abs/1611.02779)]

- Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=Hyq4yhile)]

- Learning to Perform Physics Experiments via Deep Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=r1nTpv9eg)]

- Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU [[OpenReview](http://openreview.net/forum?id=r1VGvBcxl)]

- Learning to Compose Words into Sentences with Reinforcement Learning [[OpenReview](http://openreview.net/forum?id=Skvgqgqxe)]

- Deep Reinforcement Learning for Accelerating the Convergence Rate [[OpenReview](http://openreview.net/forum?id=Syg_lYixe)]

- [#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning](notes/count-based-exploration.md) [[arXiv](https://arxiv.org/abs/1611.04717)]

- Learning to Navigate in Complex Environments [[arXiv](https://arxiv.org/abs/1611.03673)]

- Unsupervised Perceptual Rewards for Imitation Learning [[OpenReview](http://openreview.net/forum?id=Bkul3t9ee)]

- Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic [[OpenReview](http://openreview.net/forum?id=SJ3rcZcxl)]

Machine Translation & Dialog:

- [Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation](notes/gnmt-multilingual.md) [[arXiv](https://arxiv.org/abs/1611.04558)]

- [Neural Machine Translation with Reconstruction](notes/nmt-with-reconstruction.md) [[arXiv](https://arxiv.org/abs/1611.01874v1)]

- Iterative Refinement for Machine Translation [[OpenReview](http://openreview.net/forum?id=r1y1aawlg)]

- A Convolutional Encoder Model for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1611.02344)]

- Improving Neural Language Models with a Continuous Cache [[OpenReview](http://openreview.net/forum?id=B184E5qee)]

- Vocabulary Selection Strategies for Neural Machine Translation [[OpenReview](http://openreview.net/forum?id=Bk8N0RLxx)]

- Towards an automatic Turing test: Learning to evaluate dialogue responses [[OpenReview](http://openreview.net/forum?id=HJ5PIaseg)]

- Dialogue Learning With Human-in-the-Loop [[OpenReview](http://openreview.net/forum?id=HJgXCV9xx)]

- Batch Policy Gradient Methods for Improving Neural Conversation Models [[OpenReview](http://openreview.net/forum?id=rJfMusFll)]

- Learning through Dialogue Interactions [[OpenReview](http://openreview.net/forum?id=rkE8pVcle)]

- [Dual Learning for Machine Translation](notes/dual-learning-mt.md) [[arXiv](https://arxiv.org/abs/1611.00179)]

- Unsupervised Pretraining for Sequence to Sequence Learning [[arXiv](https://arxiv.org/abs/1611.02683)]

#### 2016-10

- Hybrid computing using a neural network with dynamic external memory [[nature](https://www.nature.com/articles/nature20101.epdf?author_access_token=ImTXBI8aWbYxYQ51Plys8NRgN0jAjWel9jnR3ZoTv0MggmpDmwljGswxVdeocYSurJ3hxupzWuRNeGvvXnoO8o4jTJcnAyhGuZzXJ1GEaD-Z7E6X_a9R-xqJ9TfJWBqz)] [[code](https://github.com/deepmind/dnc)]

- Quantum Machine Learning [[arXiv](https://arxiv.org/abs/1611.09347)]

- Understanding deep learning requires rethinking generalization [[arXiv](https://arxiv.org/abs/1611.03530)]

- Universal adversarial perturbations [[arXiv](https://arxiv.org/abs/1610.08401)] [[code](https://github.com/LTS4/universal)]

- [Neural Machine Translation in Linear Time](notes/nmt-linear-time.md) [[arXiv](https://arxiv.org/abs/1610.10099)] [[code](https://github.com/tensorflow/tensor2tensor)]

- [Professor Forcing: A New Algorithm for Training Recurrent Networks](notes/professor-forcing.md) [[arXiv](https://arxiv.org/abs/1610.09038)]

- Learning to Protect Communications with Adversarial Neural Cryptography [[arXiv](https://arxiv.org/abs/1610.06918v1)]

- Can Active Memory Replace Attention? [[arXiv](https://arxiv.org/abs/1610.08613)]

- [Using Fast Weights to Attend to the Recent Past](notes/fast-weight-to-attend.md) [[arXiv](https://arxiv.org/abs/1610.06258)]

- [Fully Character-Level Neural Machine Translation without Explicit Segmentation](notes/conv-char-level-nmt.md) [[arXiv](https://arxiv.org/abs/1610.03017)]

- [Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models](notes/diverse-beam-search.md) [[arXiv](https://arxiv.org/abs/1610.02424)]

- Video Pixel Networks [[arXiv](https://arxiv.org/abs/1610.00527)]

- Connecting Generative Adversarial Networks and Actor-Critic Methods [[arXiv](https://arxiv.org/abs/1610.01945)]

- [Learning to Translate in Real-time with Neural Machine Translation](notes/learning-to-translate-real-time.md) [[arXiv](https://arxiv.org/abs/1610.00388)]

- Xception: Deep Learning with Depthwise Separable Convolutions [[arXiv](https://arxiv.org/abs/1610.02357)]

- Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search [[arXiv](https://arxiv.org/abs/1610.00673)]

- [Pointer Sentinel Mixture Models](notes/pointer-sentinel-mixture.md) [[arXiv](https://arxiv.org/abs/1609.07843)]

#### 2016-09

- Towards Deep Symbolic Reinforcement Learning [[arXiv](https://arxiv.org/abs/1609.05518)]

- HyperNetworks [[arXiv](https://arxiv.org/abs/1609.09106)]

- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation [[arXiv](http://arxiv.org/abs/1609.08144)]

- Safe and Efficient Off-Policy Reinforcement Learning [[arXiv](http://arxiv.org/abs/1606.02647)]

- Playing FPS Games with Deep Reinforcement Learning [[arXiv](http://arxiv.org/abs/1609.05521)]

- [SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient](notes/seq-gan.md) [[arXiv](https://arxiv.org/abs/1609.05473)]

- Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [[arXiv](http://arxiv.org/abs/1609.02993)]

- Energy-based Generative Adversarial Network [[arXiv](https://arxiv.org/abs/1609.03126)]

- Stealing Machine Learning Models via Prediction APIs [[arXiv](http://arxiv.org/abs/1609.02943)]

- Semi-Supervised Classification with Graph Convolutional Networks [[arXiv](http://arxiv.org/abs/1609.02907)]

- WaveNet: A Generative Model For Raw Audio [[arXiv](https://arxiv.org/abs/1609.03499)]

- [Hierarchical Multiscale Recurrent Neural Networks](notes/hm-rnn.md) [[arXiv](https://arxiv.org/abs/1609.01704)]

- End-to-End Reinforcement Learning of Dialogue Agents for Information Access [[arXiv](https://arxiv.org/abs/1609.00777)]

- Deep Neural Networks for YouTube Recommendations [[paper](https://research.google.com/pubs/pub45530.html)]

#### 2016-08

- Semantics derived automatically from language corpora contain human-like biases [[arXiv](https://arxiv.org/abs/1608.07187)]

- Why does deep and cheap learning work so well? [[arXiv](https://arxiv.org/abs/1608.08225)]

- Machine Comprehension Using Match-LSTM and Answer Pointer [[arXiv](https://arxiv.org/abs/1608.07905)]

- Stacked Approximated Regression Machine: A Simple Deep Learning Approach [[arXiv](http://arxiv.org/abs/1608.04062)]

- Decoupled Neural Interfaces using Synthetic Gradients [[arXiv](http://arxiv.org/abs/1608.05343)]

- WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia [[arXiv](https://arxiv.org/abs/1608.03542)]

- Temporal Attention Model for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1608.02927)]

- Residual Networks of Residual Networks: Multilevel Residual Networks [[arXiv](http://arxiv.org/abs/1608.02908)]

- [Learning Online Alignments with Continuous Rewards Policy Gradient](notes/online-alignments-pg.md) [[arXiv](https://arxiv.org/abs/1608.01281)]

#### 2016-07

- [An Actor-Critic Algorithm for Sequence Prediction](notes/actor-critic-sequence.md) [[arXiv](http://arxiv.org/abs/1607.07086)]

- Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner [[arXiv](http://arxiv.org/abs/1607.08723v1)]

- [Recurrent Neural Machine Translation](notes/recurrent-nmt.md) [[arXiv](http://arxiv.org/abs/1607.08725)]

- MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition [[arXiv](http://arxiv.org/abs/1607.08221)]

- [Layer Normalization](notes/layer-norm.md) [[arXiv](https://arxiv.org/abs/1607.06450)]

- [Neural Machine Translation with Recurrent Attention Modeling](notes/nmt-rec-attention.md) [[arXiv](https://arxiv.org/abs/1607.05108)]

- Neural Semantic Encoders [[arXiv](https://arxiv.org/abs/1607.04315)]

- [Attention-over-Attention Neural Networks for Reading Comprehension](notes/att-over-att.md) [[arXiv](https://arxiv.org/abs/1607.04423)]

- sk_p: a neural program corrector for MOOCs [[arXiv](http://arxiv.org/abs/1607.02902)]

- Recurrent Highway Networks [[arXiv](https://arxiv.org/abs/1607.03474)]

- Bag of Tricks for Efficient Text Classification [[arXiv](http://arxiv.org/abs/1607.01759)]

- Context-Dependent Word Representation for Neural Machine Translation [[arXiv](https://arxiv.org/abs/1607.00578)]

- Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [[arXiv](http://arxiv.org/abs/1607.00036)]

#### 2016-06

- Sequence-to-Sequence Learning as Beam-Search Optimization [[arXiv](https://arxiv.org/abs/1606.02960)]

- [Sequence-Level Knowledge Distillation](notes/seq-knowledge-distillation.md) [[arXiv](https://arxiv.org/abs/1606.07947)]

- Policy Networks with Two-Stage Training for Dialogue Systems [[arXiv](http://arxiv.org/abs/1606.03152)]

- Towards an integration of deep learning and neuroscience [[arXiv](https://arxiv.org/abs/1606.03813)]

- On Multiplicative Integration with Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1606.06630)]

- [Wide & Deep Learning for Recommender Systems](wide-and-deep.md) [[arXiv](https://arxiv.org/abs/1606.07792)]

- Online and Offline Handwritten Chinese Character Recognition [[arXiv](https://arxiv.org/abs/1606.05763)]

- Tutorial on Variational Autoencoders [[arXiv](http://arxiv.org/abs/1606.05908)]

- Concrete Problems in AI Safety [[arXiv](https://arxiv.org/abs/1606.06565)]

- Deep Reinforcement Learning Discovers Internal Models [[arXiv](http://arxiv.org/abs/1606.05174v1)]

- [SQuAD: 100,000+ Questions for Machine Comprehension of Text](notes/squad.md) [[arXiv](http://arxiv.org/abs/1606.05250)]

- Conditional Image Generation with PixelCNN Decoders [[arXiv](http://arxiv.org/abs/1606.05328)]

- Model-Free Episodic Control [[arXiv](http://arxiv.org/abs/1606.04460)]

- [Progressive Neural Networks](notes/progressive-nn.md) [[arXiv](http://arxiv.org/abs/1606.04671)]

- Improved Techniques for Training GANs [[arXiv](http://arxiv.org/abs/1606.03498)] [[code](https://github.com/openai/improved-gan)]

- Memory-Efficient Backpropagation Through Time [[arXiv](http://arxiv.org/abs/1606.03401)]

- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [[arXiv](http://arxiv.org/abs/1606.03657)]

- Zero-Resource Translation with Multi-Lingual Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.04164)]

- Key-Value Memory Networks for Directly Reading Documents [[arXiv](http://arxiv.org/abs/1606.03126)]

- Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.04199)]

- Learning to learn by gradient descent by gradient descent [[arXiv](http://arxiv.org/abs/1606.04474)]

- Learning Language Games through Interaction [[arXiv](http://arxiv.org/abs/1606.02447)]

- Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [[arXiv](https://arxiv.org/abs/1606.01305)]

- Smart Reply: Automated Response Suggestion for Email [[arXiv](http://arxiv.org/abs/1606.04870)]

- Virtual Adversarial Training for Semi-Supervised Text Classification [[arXiv](https://arxiv.org/abs/1605.07725)]

- Deep Reinforcement Learning for Dialogue Generation [[arXiv](http://arxiv.org/abs/1606.01541)]

- Very Deep Convolutional Networks for Natural Language Processing [[arXiv](https://arxiv.org/abs/1606.01781)]

- Neural Net Models for Open-Domain Discourse Coherence [[arXiv](https://arxiv.org/abs/1606.01545)]

- Neural Architectures for Fine-grained Entity Type Classification [[arXiv](https://arxiv.org/abs/1606.01341)]

- Matching Networks for One Shot Learning [[arXiv](https://arxiv.org/abs/1606.04080)]

- Cooperative Inverse Reinforcement Learning [[arXiv](https://arxiv.org/abs/1606.03137)] [[article](http://bair.berkeley.edu/blog/2017/08/17/cooperatively-learning-human-values/)]

- Gated-Attention Readers for Text Comprehension [[arXiv](http://arxiv.org/abs/1606.01549)]

- [End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning](notes/e2e-dialog-control-sl-rl.md) [[arXiv](https://arxiv.org/abs/1606.01269)]

- Iterative Alternating Neural Attention for Machine Reading [[arXiv](https://arxiv.org/abs/1606.02245)]

- Memory-enhanced Decoder for Neural Machine Translation [[arXiv](http://arxiv.org/abs/1606.02003)]

- Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [[arXiv](https://arxiv.org/abs/1606.00776)]

- Learning to Optimize [[arXiv](https://arxiv.org/abs/1606.01885)] [[article](http://bair.berkeley.edu/blog/2017/09/12/learning-to-optimize-with-rl/)]

- [Natural Language Comprehension with the EpiReader](notes/epireader.md) [[arXiv](https://arxiv.org/abs/1606.02270)]

- Conversational Contextual Cues: The Case of Personalization and History for Response Ranking [[arXiv](https://arxiv.org/abs/1606.00372)]

- Adversarially Learned Inference [[arXiv](https://arxiv.org/abs/1606.00704)]

- OpenAI Gym [[arXiv](https://arxiv.org/abs/1606.01540)] [[code](https://github.com/openai/gym)]

- Neural Network Translation Models for Grammatical Error Correction [[arXiv](https://arxiv.org/abs/1606.00189)]

#### 2016-05

- Hierarchical Memory Networks [[arXiv](https://arxiv.org/abs/1605.07427)]

- Deep API Learning [[arXiv](http://arxiv.org/abs/1605.08535)]

- Wide Residual Networks [[arXiv](http://arxiv.org/abs/1605.07146)]

- TensorFlow: A system for large-scale machine learning [[arXiv](http://arxiv.org/abs/1605.08695)]

- Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention [[arXiv](http://arxiv.org/abs/1605.09090)]

- Aspect Level Sentiment Classification with Deep Memory Network [[arXiv](http://arxiv.org/abs/1605.08900)]

- FractalNet: Ultra-Deep Neural Networks without Residuals [[arXiv](https://arxiv.org/abs/1605.07648)]

- Learning End-to-End Goal-Oriented Dialog [[arXiv](http://arxiv.org/abs/1605.07683)]

- One-shot Learning with Memory-Augmented Neural Networks [[arXiv](http://arxiv.org/abs/1605.06065)]

- Deep Learning without Poor Local Minima [[arXiv](http://arxiv.org/abs/1605.07110)]

- AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge [[arXiv](https://arxiv.org/abs/1605.01600)]

- Data Programming: Creating Large Training Sets, Quickly [[arXiv](http://arxiv.org/abs/1605.07723)]

- Deeply-Fused Nets [[arXiv](http://arxiv.org/abs/1605.07716)]

- Deep Portfolio Theory [[arXiv](http://arxiv.org/abs/1605.07230)]

- Unsupervised Learning for Physical Interaction through Video Prediction [[arXiv](http://arxiv.org/abs/1605.07157)]

- Movie Description [[arXiv](http://arxiv.org/abs/1605.03705)]

#### 2016-04

- Higher Order Recurrent Neural Networks [[arXiv](https://arxiv.org/abs/1605.00064)]

- Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition [[arXiv](https://arxiv.org/abs/1604.08352)]

- Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [[arXiv](https://arxiv.org/abs/1604.06057)]

- The IBM 2016 English Conversational Telephone Speech Recognition System [[arXiv](https://arxiv.org/abs/1604.08242)]

- Dialog-based Language Learning [[arXiv](https://arxiv.org/abs/1604.06045)]

- Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss [[arXiv](https://arxiv.org/abs/1604.05529)]

- Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction [[arXiv](https://arxiv.org/abs/1604.04677)]

- A Network-based End-to-End Trainable Task-oriented Dialogue System [[arXiv](http://arxiv.org/abs/1604.04562)]

- Visual Storytelling [[arXiv](https://arxiv.org/abs/1604.03968)]

- Improving the Robustness of Deep Neural Networks via Stability Training [[arXiv](http://arxiv.org/abs/1604.04326)]

- [Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex](notes/bridging-gap-resnet-rnn.md) [[arXiv](https://arxiv.org/abs/1604.03640)]

- Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention [[arXiv](https://arxiv.org/abs/1604.03286)]

- [Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves](notes/slrtm.md) [[arXiv](https://arxiv.org/abs/1604.02038)]

- [Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models](notes/open-vocab-nmt-hybrid-word-character.md) [[arXiv](http://arxiv.org/abs/1604.00788)]

- [Building Machines That Learn and Think Like People](notes/building-machines-that-learn-and-think-like-people.md) [[arXiv](http://arxiv.org/abs/1604.00289)]

- A Semisupervised Approach for Language Identification based on Ladder Networks [[arXiv](http://arxiv.org/abs/1604.00317)]

- [Deep Networks with Stochastic Depth](notes/stochastic-depth.md) [[arXiv](http://arxiv.org/abs/1603.09382)]

- PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents [[arXiv](http://arxiv.org/abs/1604.00187)]

#### 2016-03

- Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning [[arXiv](https://arxiv.org/abs/1603.07954)]

- A Fast Unified Model for Parsing and Sentence Understanding [[arXiv](http://arxiv.org/abs/1603.06021)]

- [Latent Predictor Networks for Code Generation](notes/latent-predictor-networks.md) [[arXiv](http://arxiv.org/abs/1603.06744)]

- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [[arXiv](http://arxiv.org/abs/1603.08575)]

- Recurrent Batch Normalization [[arXiv](http://arxiv.org/abs/1603.09025)]

- Neural Language Correction with Character-Based Attention [[arXiv](http://arxiv.org/abs/1603.09727)]

- [Incorporating Copying Mechanism in Sequence-to-Sequence Learning](notes/copynet.md) [[arXiv](http://arxiv.org/abs/1603.06393)]

- How NOT To Evaluate Your Dialogue System [[arXiv](http://arxiv.org/abs/1603.08023)]

- [Adaptive Computation Time for Recurrent Neural Networks](notes/act-rnn.md) [[arXiv](http://arxiv.org/abs/1603.08983)]

- A guide to convolution arithmetic for deep learning [[arXiv](http://arxiv.org/abs/1603.07285)]

- Colorful Image Colorization [[arXiv](http://arxiv.org/abs/1603.08511)]

- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles [[arXiv](http://arxiv.org/abs/1603.09246)]

- Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus [[arXiv](http://arxiv.org/abs/1603.06807)]

- A Persona-Based Neural Conversation Model [[arXiv](http://arxiv.org/abs/1603.06155)]

- [A Character-level Decoder without Explicit Segmentation for Neural Machine Translation](notes/char-level-decoder.md) [[arXiv](http://arxiv.org/abs/1603.06147)]

- Multi-Task Cross-Lingual Sequence Tagging from Scratch [[arXiv](http://arxiv.org/abs/1603.06270)]

- Neural Variational Inference for Text Processing [[arXiv](http://arxiv.org/abs/1511.06038)]

- Recurrent Dropout without Memory Loss [[arXiv](http://arxiv.org/abs/1603.05118)]

- One-Shot Generalization in Deep Generative Models [[arXiv](http://arxiv.org/abs/1603.05106)]

- Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [[arXiv](http://arxiv.org/abs/1603.03101)]

- A New Method to Visualize Deep Neural Networks [[arXiv](http://arxiv.org/abs/1603.02518)]

- Neural Architectures for Named Entity Recognition [[arXiv](http://arxiv.org/abs/1603.01360)]

- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF [[arXiv](http://arxiv.org/abs/1603.01354)]

- Character-based Neural Machine Translation [[arXiv](http://arxiv.org/abs/1603.00810)]

- Learning Word Segmentation Representations to Improve Named Entity Recognition for Chinese Social Media [[arXiv](http://arxiv.org/abs/1603.00786)]

#### 2016-02

- Architectural Complexity Measures of Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1602.08210)]

- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [[arXiv](http://arxiv.org/abs/1602.07868)]

- Recurrent Neural Network Grammars [[arXiv](http://arxiv.org/abs/1602.07776)]

- Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations [[arXiv](http://arxiv.org/abs/1602.07332)]

- [Contextual LSTM (CLSTM) models for Large scale NLP tasks](notes/clstm-large-scale.md) [[arXiv](http://arxiv.org/abs/1602.06291)]

- Sequence-to-Sequence RNNs for Text Summarization [[arXiv](http://arxiv.org/abs/1602.06023)]

- Extraction of Salient Sentences from Labelled Documents [[arXiv](http://arxiv.org/abs/1412.6815)]

- Learning Distributed Representations of Sentences from Unlabelled Data [[arXiv](http://arxiv.org/abs/1602.03483)]

- Benefits of depth in neural networks [[arXiv](http://arxiv.org/abs/1602.04485)]

- [Associative Long Short-Term Memory](notes/associative-lstm.md) [[arXiv](http://arxiv.org/abs/1602.03032)]

- "Why Should I Trust You?": Explaining the Predictions of Any Classifier [[arXiv](https://arxiv.org/abs/1602.04938)] [[code](https://github.com/marcotcr/lime)]

- Generating images with recurrent adversarial networks [[arXiv](http://arxiv.org/abs/1602.05110)]

- [Exploring the Limits of Language Modeling](notes/exploring-the-limits-of-lm.md) [[arXiv](http://arxiv.org/abs/1602.02410)]

- Swivel: Improving Embeddings by Noticing What’s Missing [[arXiv](http://arxiv.org/abs/1602.02215)]

- [WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making](notes/webnav.md) [[arXiv](http://arxiv.org/abs/1602.02261)]

- [Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers](notes/efficient-char-level-document-classification-cnn-rnn.md) [[arXiv](http://arxiv.org/abs/1602.00367)]

- Gradient Descent Converges to Minimizers [[arXiv](https://arxiv.org/abs/1602.04915)] [[article](http://www.offconvex.org/2016/03/24/saddles-again/)]

- BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [[arXiv](http://arxiv.org/abs/1602.02830)]

- Learning Discriminative Features via Label Consistent Neural Network [[arXiv](http://arxiv.org/abs/1602.01168)]

#### 2016-01

- What’s your ML test score? A rubric for ML production systems [[Research at Google](https://research.google.com/pubs/pub45742.html)]

- Pixel Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1601.06759)]

- Bitwise Neural Networks [[arXiv](http://arxiv.org/abs/1601.06071)]

- Long Short-Term Memory-Networks for Machine Reading [[arXiv](http://arxiv.org/abs/1601.06733)]

- Coverage-based Neural Machine Translation [[arXiv](http://arxiv.org/abs/1601.04811)]

- Understanding Deep Convolutional Networks [[arXiv](http://arxiv.org/abs/1601.04920)]

- Training Recurrent Neural Networks by Diffusion [[arXiv](http://arxiv.org/abs/1601.04114)]

- Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures [[arXiv](http://arxiv.org/abs/1601.03896)]

- [Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism](notes/multi-way-nmt-shared-attention.md) [[arXiv](http://arxiv.org/abs/1601.01073)]

- [Recurrent Memory Network for Language Modeling](notes/rmn-language-modeling.md) [[arXiv](http://arxiv.org/abs/1601.01272)]

- Language to Logical Form with Neural Attention [[arXiv](http://arxiv.org/abs/1601.01280)]

- Learning to Compose Neural Networks for Question Answering [[arXiv](http://arxiv.org/abs/1601.01705)]

- The Inevitability of Probability: Probabilistic Inference in Generic Neural Networks Trained with Non-Probabilistic Feedback [[arXiv](http://arxiv.org/abs/1601.03060)]

- COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images [[arXiv](http://arxiv.org/abs/1601.07140)]

- Survey on the attention based RNN model and its applications in computer vision [[arXiv](http://arxiv.org/abs/1601.06823)]

#### 2015-12

NLP

- [Strategies for Training Large Vocabulary Neural Language Models](notes/strategies-for-training-large-vocab-lm.md) [[arXiv](http://arxiv.org/abs/1512.04906)]

- [Multilingual Language Processing From Bytes](notes/multilingual-language-processing-from-bytes.md) [[arXiv](http://arxiv.org/abs/1512.00103)]

- [Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews](notes/learning-document-embeddings-ngrams.md) [[arXiv](http://arxiv.org/abs/1512.08183)]

- [Target-Dependent Sentiment Classification with Long Short Term Memory](notes/target-dependent-sentiment-lstm.md) [[arXiv](http://arxiv.org/abs/1512.01100)]

- Reading Text in the Wild with Convolutional Neural Networks [[arXiv](http://arxiv.org/abs/1412.1842)]

Vision

- [Deep Residual Learning for Image Recognition](notes/deep-residual-learning.md) [[arXiv](http://arxiv.org/abs/1512.03385)]

- Rethinking the Inception Architecture for Computer Vision [[arXiv](http://arxiv.org/abs/1512.00567)]

- Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1512.04143)]

- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin [[arXiv](http://arxiv.org/abs/1512.02595)]

#### 2015-11

NLP

- [Deep Reinforcement Learning with a Natural Language Action Space](notes/drl-nlp-action.md) [[arXiv](https://arxiv.org/abs/1511.04636)]

- Sequence Level Training with Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1511.06732)]

- [Teaching Machines to Read and Comprehend](notes/teaching-machines-to-read-and-comprehend.md) [[arXiv](http://arxiv.org/abs/1506.03340)]

- [Semi-supervised Sequence Learning](notes/semi-supervised-sequence-learning.md) [[arXiv](http://arxiv.org/abs/1511.01432)]

- [Multi-task Sequence to Sequence Learning](notes/multitask-seq2seq.md) [[arXiv](http://arxiv.org/abs/1511.06114)]

- [Alternative structures for character-level RNNs](notes/alternative-structure-char-rnn.md) [[arXiv](http://arxiv.org/abs/1511.06303)]

- [Larger-Context Language Modeling](notes/larger-context-lm.md) [[arXiv](http://arxiv.org/abs/1511.03729)]

- [A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding](notes/unified-tagging-blstm.md) [[arXiv](http://arxiv.org/abs/1511.00215)]

- Towards Universal Paraphrastic Sentence Embeddings [[arXiv](http://arxiv.org/abs/1511.08198)]

- BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies [[arXiv](http://arxiv.org/abs/1511.06909)]

- Natural Language Understanding with Distributed Representation [[arXiv](http://arxiv.org/abs/1511.07916)]

- sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings [[arXiv](http://arxiv.org/abs/1511.06388)]

- LSTM-based Deep Learning Models for non-factoid answer selection [[arXiv](http://arxiv.org/abs/1511.04108)]

Programs

- Neural Random-Access Machines [[arXiv](http://arxiv.org/abs/1511.06392)]

- Neural Programmer: Inducing Latent Programs with Gradient Descent [[arXiv](http://arxiv.org/abs/1511.04834)]

- Neural Programmer-Interpreters [[arXiv](http://arxiv.org/abs/1511.06279)]

- Learning Simple Algorithms from Examples [[arXiv](http://arxiv.org/abs/1511.07275)]

- Neural GPUs Learn Algorithms [[arXiv](http://arxiv.org/abs/1511.08228)] [[code](https://github.com/tensorflow/tensor2tensor)]

- On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models [[arXiv](http://arxiv.org/abs/1511.09249)]

Vision

- ReSeg: A Recurrent Neural Network for Object Segmentation [[arXiv](http://arxiv.org/abs/1511.07053)]

- Deconstructing the Ladder Network Architecture [[arXiv](http://arxiv.org/abs/1511.06430)]

- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [[arXiv](http://arxiv.org/abs/1511.06434)]

- Multi-Scale Context Aggregation by Dilated Convolutions [[arXiv](https://arxiv.org/abs/1511.07122)] [[code](https://github.com/fyu/drn)]

General

- Towards Principled Unsupervised Learning [[arXiv](http://arxiv.org/abs/1511.06440)]

- Dynamic Capacity Networks [[arXiv](http://arxiv.org/abs/1511.07838)]

- [Generating Sentences from a Continuous Space](notes/generating-sentences-cont-space.md) [[arXiv](http://arxiv.org/abs/1511.06349)]

- Net2Net: Accelerating Learning via Knowledge Transfer [[arXiv](http://arxiv.org/abs/1511.05641)]

- A Roadmap towards Machine Intelligence [[arXiv](http://arxiv.org/abs/1511.08130)]

- Session-based Recommendations with Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1511.06939)]

- Regularizing RNNs by Stabilizing Activations [[arXiv](http://arxiv.org/abs/1511.08400)]

#### 2015-10

- [A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification](notes/sensitivity-analysis-cnn-sentence-classification.md) [[arXiv](http://arxiv.org/abs/1510.03820)]

- [Attention with Intention for a Neural Network Conversation Model](notes/attention-with-intention.md) [[arXiv](http://arxiv.org/abs/1510.08565)]

- Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network [[arXiv](http://arxiv.org/abs/1510.06168)]

- A Survey: Time Travel in Deep Learning Space: An Introduction to Deep Learning Models and How Deep Learning Models Evolved from the Initial Ideas [[arXiv](http://arxiv.org/abs/1510.04781)]

- A Primer on Neural Network Models for Natural Language Processing [[arXiv](http://arxiv.org/abs/1510.00726)]

- [A Diversity-Promoting Objective Function for Neural Conversation Models](notes/diversity-promoting-objective-ncm.md) [[arXiv](http://arxiv.org/abs/1510.03055)]

#### 2015-09

- [Character-level Convolutional Networks for Text Classification](notes/character-level-cnn-for-text-classification.md) [[arXiv](http://arxiv.org/abs/1509.01626)]

- [A Neural Attention Model for Abstractive Sentence Summarization](notes/neural-attention-model-for-abstractive-sentence-summarization.md) [[arXiv](http://arxiv.org/abs/1509.00685)]

- Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games [[arXiv](http://arxiv.org/abs/1509.06731)]

#### 2015-08

- [Neural Machine Translation of Rare Words with Subword Units](notes/nmt-subword.md) [[arXiv](https://arxiv.org/abs/1508.07909)] [[code](https://github.com/rsennrich/subword-nmt)]

- Listen, Attend and Spell [[arXiv](http://arxiv.org/abs/1508.01211)]

- [Character-Aware Neural Language Models](notes/character-aware-nlm.md) [[arXiv](http://arxiv.org/abs/1508.06615)]

- Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs [[arXiv](http://arxiv.org/abs/1508.00657)]

- Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation [[arXiv](http://arxiv.org/abs/1508.02096)]

- [Effective Approaches to Attention-based Neural Machine Translation](notes/effective-approaches-nmt-attention.md) [[arXiv](https://arxiv.org/abs/1508.04025)]

#### 2015-07

- [Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models](e2e-dialog-ghnnm.md) [[arXiv](http://arxiv.org/abs/1507.04808)]

- Semi-Supervised Learning with Ladder Networks [[arXiv](http://arxiv.org/abs/1507.02672)]

- [Document Embedding with Paragraph Vectors](notes/document-embedding-with-pv.md) [[arXiv](http://arxiv.org/abs/1507.07998)]

- [Training Very Deep Networks](notes/training-very-deep-networks.md) [[arXiv](http://arxiv.org/abs/1507.06228)]

#### 2015-06

- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning [[arXiv](https://arxiv.org/abs/1506.02142)]

- [A Neural Network Approach to Context-Sensitive Generation of Conversational Responses](notes/nn-context-sentitive-responses.md) [[arXiv](http://arxiv.org/abs/1506.06714)]

- [A Neural Conversational Model](notes/neural-conversational-model.md) [[arXiv](http://arxiv.org/abs/1506.05869)]

- [Skip-Thought Vectors](notes/skip-thought-vectors.md) [[arXiv](http://arxiv.org/abs/1506.06726)]

- [Pointer Networks](notes/pointer-networks.md) [[arXiv](http://arxiv.org/abs/1506.03134)]

- [Spatial Transformer Networks](notes/spatial-transformer-networks.md) [[arXiv](http://arxiv.org/abs/1506.02025)]

- Tree-structured composition in neural networks without tree-structured architectures [[arXiv](http://arxiv.org/abs/1506.04834)]

- Visualizing and Understanding Neural Models in NLP [[arXiv](http://arxiv.org/abs/1506.01066)]

- Learning to Transduce with Unbounded Memory [[arXiv](http://arxiv.org/abs/1506.02516)]

- Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [[arXiv](http://arxiv.org/abs/1506.07285)]

- [Deep Knowledge Tracing](notes/deep-knowledge-tracing.md) [[arXiv](http://arxiv.org/abs/1506.05908)]

#### 2015-05

- [ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks](notes/renet-rnn-alternative-to-convnet.md) [[arXiv](http://arxiv.org/abs/1505.00393)]

- Reinforcement Learning Neural Turing Machines [[arXiv](http://arxiv.org/abs/1505.00521)]

#### 2015-04

- Correlational Neural Networks [[arXiv](http://arxiv.org/abs/1504.07225)]

#### 2015-03

- [Distilling the Knowledge in a Neural Network](notes/distilling-the-knowledge-in-a-nn.md) [[arXiv](http://arxiv.org/abs/1503.02531)]

- [End-To-End Memory Networks](notes/end-to-end-memory-networks.md) [[arXiv](http://arxiv.org/abs/1503.08895)]

- [Neural Responding Machine for Short-Text Conversation](notes/neural-responding-machine.md) [[arXiv](http://arxiv.org/abs/1503.02364)]

- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](notes/batch-normalization.md) [[arXiv](http://arxiv.org/abs/1502.03167)]

- Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition [[arXiv](https://arxiv.org/abs/1503.02101)] [[article](http://www.offconvex.org/2016/03/22/saddlepoints/)]

#### 2015-02

- Human-level control through deep reinforcement learning [[Nature](https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf)] [[code](https://github.com/deepmind/dqn)]

- [Text Understanding from Scratch](notes/text-understanding-from-scratch.md) [[arXiv](http://arxiv.org/abs/1502.01710)]

- [Show, Attend and Tell: Neural Image Caption Generation with Visual Attention](notes/show-attend-tell.md) [[arXiv](http://arxiv.org/abs/1502.03044)]

#### 2015-01

- Hidden Technical Debt in Machine Learning Systems [[NIPS](https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf)]

#### 2014-12

- Learning Longer Memory in Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1412.7753)]

- [Neural Turing Machines](notes/neural-turing-machines.md) [[arXiv](http://arxiv.org/abs/1410.5401)]

- [Grammar as a Foreign Language](notes/grammar-as-a-foreign-language.md) [[arXiv](http://arxiv.org/abs/1412.7449)]

- [On Using Very Large Target Vocabulary for Neural Machine Translation](notes/on-using-very-large-target-vocabulary-for-nmt.md) [[arXiv](http://arxiv.org/abs/1412.2007)]

- Effective Use of Word Order for Text Categorization with Convolutional Neural Networks [[arXiv](http://arxiv.org/abs/1412.1058v1)]

- Multiple Object Recognition with Visual Attention [[arXiv](http://arxiv.org/abs/1412.7755)]

#### 2014-11

- The Loss Surfaces of Multilayer Networks [[arXiv](https://arxiv.org/abs/1412.0233)]

#### 2014-10

- [Learning to Execute](notes/learning-to-execute.md) [[arXiv](http://arxiv.org/abs/1410.4615)]

#### 2014-09

- [Sequence to Sequence Learning with Neural Networks](notes/seq2seq-with-neural-networks.md) [[arXiv](http://arxiv.org/abs/1409.3215)]

- [Neural Machine Translation by Jointly Learning to Align and Translate](notes/nmt-jointly-learning-to-align-and-translate.md) [[arXiv](http://arxiv.org/abs/1409.0473)]

- [On the Properties of Neural Machine Translation: Encoder-Decoder Approaches](notes/properties-of-neural-mt.md) [[arXiv](http://arxiv.org/abs/1409.1259)]

- [Recurrent Neural Network Regularization](notes/rnn-regularization.md) [[arXiv](http://arxiv.org/abs/1409.2329)]

- Very Deep Convolutional Networks for Large-Scale Image Recognition [[arXiv](http://arxiv.org/abs/1409.1556)]

- Going Deeper with Convolutions [[arXiv](http://arxiv.org/abs/1409.4842)]

#### 2014-08

- Convolutional Neural Networks for Sentence Classification [[arXiv](http://arxiv.org/abs/1408.5882)]

#### 2014-07

#### 2014-06

- [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](notes/learning-phrase-representations.md) [[arXiv](http://arxiv.org/abs/1406.1078)]

- [Recurrent Models of Visual Attention](notes/recurrent-models-of-visual-attention.md) [[arXiv](http://arxiv.org/abs/1406.6247)]

- Generative Adversarial Networks [[arXiv](http://arxiv.org/abs/1406.2661)]

#### 2014-05

- [Distributed Representations of Sentences and Documents](notes/distributed-representations-of-sentences-and-documents.md) [[arXiv](http://arxiv.org/abs/1405.4053)]

#### 2014-04

- A Convolutional Neural Network for Modelling Sentences [[arXiv](http://arxiv.org/abs/1404.2188)]

#### 2014-03

#### 2014-02

#### 2014-01

- Machine Learning: The High Interest Credit Card of Technical Debt [[Research at Google](https://research.google.com/pubs/pub43146.html)]

#### 2013

- Visualizing and Understanding Convolutional Networks [[arXiv](http://arxiv.org/abs/1311.2901)]

- DeViSE: A Deep Visual-Semantic Embedding Model [[pub](http://research.google.com/pubs/pub41473.html)]

- Maxout Networks [[arXiv](http://arxiv.org/abs/1302.4389)]

- Exploiting Similarities among Languages for Machine Translation [[arXiv](http://arxiv.org/abs/1309.4168)]

- Efficient Estimation of Word Representations in Vector Space [[arXiv](http://arxiv.org/abs/1301.3781)]

#### 2011

- Natural Language Processing (almost) from Scratch [[arXiv](http://arxiv.org/abs/1103.0398)]