https://github.com/sbrugman/deep-learning-papers

Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled.
arxiv deep-learning deep-learning-papers machine-learning neural-networks papers science
Last synced: 5 months ago
Host: GitHub
URL: https://github.com/sbrugman/deep-learning-papers
Owner: sbrugman
Archived: true
Created: 2016-10-29T15:59:13.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2019-12-21T20:49:39.000Z (over 5 years ago)
Last Synced: 2024-09-27T03:41:20.380Z (9 months ago)
Topics: arxiv, deep-learning, deep-learning-papers, machine-learning, neural-networks, papers, science
Homepage:
Size: 61.8 MB
Stars: 3,196
Watchers: 249
Forks: 415
Open Issues: 5
Metadata Files:
- Readme: README.md
Awesome Lists containing this project

awesome-papers - sbrugman/deep-learning-papers
awesome-list - sbrugman/deep-learning-papers - Papers about deep learning ordered by task, date. (Machine Learning / JavaScript)
README

# Deep Learning Papers by task

Papers about deep learning, ordered by task and then by date. Each paper has a permanent link, either to arXiv.org or to a copy of the original paper in this repository.

# Table of Contents

1. [Code](#code)

	1.1. [Code Generation](#code-generation)

	1.2. [Malware Detection and Security](#malware-detection-and-security)

2. [Text](#text)

	2.1. [Summarization](#summarization)

	2.2. [Taskbots](#taskbots)

	2.3. [Classification](#classification)

	2.4. [Question Answering](#question-answering)

	2.5. [Sentiment Analysis](#sentiment-analysis)

	2.6. [Translation](#translation)

	2.7. [Chatbots](#chatbots)

	2.8. [Reasoning](#reasoning)

	2.9. [Language Representation](#language-representation)

3. [Visual](#visual)

	3.1. [Gaming](#gaming)

	3.2. [Style Transfer](#style-transfer)

	3.3. [Object Tracking](#object-tracking)

	3.4. [Visual Question Answering](#visual-question-answering)

	3.5. [Image Segmentation](#image-segmentation)

	3.6. [Text (in the Wild) Recognition](#text-in-the-wild-recognition)

	3.7. [Brain Computer Interfacing](#brain-computer-interfacing)

	3.8. [Self-Driving Cars](#self-driving-cars)

	3.9. [Object Recognition](#object-recognition)

	3.10. [Logo Recognition](#logo-recognition)

	3.11. [Super Resolution](#super-resolution)

	3.12. [Pose Estimation](#pose-estimation)

	3.13. [Image Captioning](#image-captioning)

	3.14. [Image Compression](#image-compression)

	3.15. [Image Synthesis](#image-synthesis)

	3.16. [Face Recognition](#face-recognition)

	3.17. [Image Composition](#image-composition)

	

	3.18. [Scene Graph Parsing](#scene-graph-parsing)

	

	3.19. [Video Deblurring](#video-deblurring)

	

	3.20. [Depth Perception](#depth-perception)

	3.21. [3D Reconstruction](#3d-reconstruction)

	

	3.22. [Vision Representation](#vision-representation)

4. [Audio](#audio)

	4.1. [Audio Synthesis](#audio-synthesis)

5. [Other](#other)

	5.1. [Unclassified](#unclassified)

	5.2. [Regularization](#regularization)

	5.3. [Neural Network Compression](#neural-network-compression)

	5.4. [Optimizers](#optimizers)

## Code

### Code Generation

|Title|Date|Paper|Code|

|---|---|---|---|

| DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07734) |  | 

| A Syntactic Neural Model for General-Purpose Code Generation | _6 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01696) |  | 

| RobustFill: Neural Program Learning under Noisy I/O | _21 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07469) |  | 

| DeepFix: Fixing Common C Language Errors by Deep Learning | _12 feb 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepfix-fixing-common-c-language-errors-by-deep-learning.pdf) |  | 

| DeepCoder: Learning to Write Programs | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01989) |  | 

| Neuro-Symbolic Program Synthesis | _6 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01855) |  | 

| Deep API Learning | _27 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.08535) |  | 

### Malware Detection and Security

|Title|Date|Paper|Code|

|---|---|---|---|

| PassGAN: A Deep Learning Approach for Password Guessing | _1 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.00440) |  | 

| Deep Android Malware Detection | _22 mar 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-android-malware-detection.pdf) | [github](https://github.com/niallmcl/Deep-Android-Malware-Detection) | 

| Droid-Sec: Deep Learning in Android Malware Detection | _17 aug 2014_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/droid-sec-deep-learning-in-android-malware-detection.pdf) | [github](https://github.com/pjlantz/droidbox) | 

## Text

### Summarization

|Title|Date|Paper|Code|

|---|---|---|---|

| A Deep Reinforced Model for Abstractive Summarization | _11 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.04304) |  | 

| Get To The Point: Summarization with Pointer-Generator Networks | _14 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.04368) |  | 

| SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04230) |  | 

### Taskbots

|Title|Date|Paper|Code|

|---|---|---|---|

| Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning | _10 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03084) | [github](https://github.com/MiuLab/TC-Bot) | 

| End-to-End Task-Completion Neural Dialogue Systems | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01008) | [github](https://github.com/MiuLab/TC-Bot) | 

### Classification

|Title|Date|Paper|Code|

|---|---|---|---|

| A Large Self-Annotated Corpus for Sarcasm | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05579) |  | 

| ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03560) |  | 

| Bilateral Multi-Perspective Matching for Natural Language Sentences | _13 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.03814) |  | 

| FastText.zip: Compressing text classification models | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03651) |  | 

| ConceptNet 5.5: An Open Multilingual Graph of General Knowledge | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03975) |  | 

| A Simple but Tough-to-Beat Baseline for Sentence Embeddings | _4 nov 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/a-simple-but-tough-to-beat-baseline-for-sentence-embeddings.pdf) | [github](https://github.com/YingyuLiang/SIF) | 

| Enriching Word Vectors with Subword Information | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04606) |  | 

| From Word Embeddings To Document Distances | _6 jul 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/from-word-embeddings-to-document-distances.pdf) | [github](https://github.com/mkusner/wmd) | 

| Bag of Tricks for Efficient Text Classification | _6 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.01759) |  | 

| Character-level Convolutional Networks for Text Classification | _4 sep 2015_ | [arxiv](https://arxiv.org/pdf/1509.01626) |  | 

| GloVe: Global Vectors for Word Representation | _25 may 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/glove-global-vectors-for-word-representation.pdf) | [github](https://github.com/stanfordnlp/GloVe) | 

| Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks | _28 feb 2015_ | [arxiv](https://arxiv.org/pdf/1503.00075) |  | 

| Distributed Representations of Sentences and Documents | _16 may 2014_ | [arxiv](https://arxiv.org/pdf/1405.4053) |  | 

| Efficient Estimation of Word Representations in Vector Space | _16 jan 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781) |  | 

| SimHash: Hash-based Similarity Detection | _13 dec 2007_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/simhash-hash-based-similarity-detection.pdf) |  | 

### Question Answering

|Title|Date|Paper|Code|

|---|---|---|---|

| IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models | _30 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.10513) | [github](https://github.com/geek-ai/irgan) | 

### Sentiment Analysis

|Title|Date|Paper|Code|

|---|---|---|---|

| Rationalizing Neural Predictions | _13 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04155) |  | 

| Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank | _18 oct 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/recursive-deep-models-for-semantic-compositionality-over-a-sentiment-treebank.pdf) |  | 

### Translation

|Title|Date|Paper|Code|

|---|---|---|---|

| Attention Is All You Need | _12 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.03762) |  | 

| Convolutional Sequence to Sequence Learning | _8 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.03122) | [github](https://github.com/facebookresearch/fairseq) | 

| Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04558) |  | 

| A Convolutional Encoder Model for Neural Machine Translation | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.02344) |  | 

| Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | _26 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.08144) |  | 

| Neural Machine Translation by Jointly Learning to Align and Translate | _1 sep 2014_ | [arxiv](https://arxiv.org/pdf/1409.0473) |  | 

### Chatbots

|Title|Date|Paper|Code|

|---|---|---|---|

| A Deep Reinforcement Learning Chatbot | _7 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.02349) |  | 

| A Neural Conversational Model | _19 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.05869) | [github](https://github.com/inikdom/neural-chatbot) | 

### Reasoning

|Title|Date|Paper|Code|

|---|---|---|---|

| NeuroSAT: Learning a SAT Solver from Single-Bit Supervision | _5 jan 2019_ | [arxiv](https://arxiv.org/pdf/1802.03685.pdf) | [github](https://github.com/dselsam/neurosat) |

| Tracking the World State with Recurrent Entity Networks | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03969) |  |

### Language Representation

|Title|Date|Paper|Code|

|---|---|---|---|

| Efficient Estimation of Word Representations in Vector Space | _7 sep 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781.pdf) |  |

| Distributed Representations of Words and Phrases and their Compositionality | _16 oct 2013_ | [arxiv](https://arxiv.org/pdf/1310.4546.pdf) |  |

| ELMo: Deep contextualized word representations | _22 Mar 2018_ | [arxiv](https://arxiv.org/pdf/1802.05365.pdf) |  |

| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | _24 may 2019_ | [arxiv](https://arxiv.org/pdf/1810.04805.pdf) | [github](https://github.com/google-research/bert) |

| XLNet: Generalized Autoregressive Pretraining for Language Understanding | _19 jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.08237.pdf) | [github](https://github.com/zihangdai/xlnet) |

| RoBERTa: A Robustly Optimized BERT Pretraining Approach | _26 jul 2019_ | [arxiv](https://arxiv.org/pdf/1907.11692.pdf) | [github](https://github.com/pytorch/fairseq/tree/master/examples/roberta) |

## Visual

### Gaming

|Title|Date|Paper|Code|

|---|---|---|---|

| Phase-Functioned Neural Networks for Character Control | _1 may 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/phase-functioned-neural-networks-for-character-control.pdf) |  | 

| Equivalence Between Policy Gradients and Soft Q-Learning | _21 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06440) |  | 

| Beating Atari with Natural Language Guided Reinforcement Learning | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05539) |  | 

| Learning from Demonstrations for Real World Reinforcement Learning | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03732) |  | 

| FeUdal Networks for Hierarchical Reinforcement Learning | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01161) |  | 

| Overcoming catastrophic forgetting in neural networks | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00796) |  | 

| Playing Doom with SLAM-Augmented Deep Reinforcement Learning | _1 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00380) |  | 

| Playing FPS Games with Deep Reinforcement Learning | _18 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05521) |  | 

| DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess | _16 aug 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepchess-end-to-end-deep-neural-network-for-automatic-learning-in-chess.pdf) |  | 

| Generative Adversarial Imitation Learning | _10 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.03476) |  | 

| Dueling Network Architectures for Deep Reinforcement Learning | _20 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.06581) |  | 

| Prioritized Experience Replay | _18 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.05952) |  | 

| Human-level control through deep reinforcement learning | _26 feb 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/human-level-control-through-deep-reinforcement-learning.pdf) |  | 

| Playing Atari with Deep Reinforcement Learning | _19 dec 2013_ | [arxiv](https://arxiv.org/pdf/1312.5602) |  | 

### Style Transfer

|Title|Date|Paper|Code|

|---|---|---|---|

| The Contextual Loss for Image Transformation with Non-Aligned Data | _18 jul 2018_ | [arxiv](https://arxiv.org/pdf/1803.02077.pdf) | [github](https://github.com/roimehrez/contextualLoss) |

| Deep Photo Style Transfer | _22 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07511) |  |

| Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06868) | [github](https://github.com/xunhuang1995/AdaIN-style) | 

| A Learned Representation For Artistic Style | _24 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.07629) |  | 

| Instance Normalization: The Missing Ingredient for Fast Stylization | _27 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.08022) |  | 

| Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | 

| A Neural Algorithm of Artistic Style | _26 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.06576) | [github](https://github.com/lengstrom/fast-style-transfer/) |  

### Object Tracking

|Title|Date|Paper|Code|

|---|---|---|---|

| End-to-end representation learning for Correlation Filter based tracking | _20 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06036) | [github](https://github.com/bertinetto/cfnet) | 

### Visual Question Answering

|Title|Date|Paper|Code|

|---|---|---|---|

| VQA: Visual Question Answering | _3 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.00468) |  | 

### Image Segmentation

|Title|Date|Paper|Code|

|---|---|---|---|

| PointRend: Image Segmentation as Rendering | _17 dec 2019_ | [arxiv](https://arxiv.org/pdf/1912.08193.pdf) | |

| Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | _22 aug 2018_ | [arxiv](https://arxiv.org/pdf/1802.02611.pdf), [related: Auto-DeepLab, CVPR 2019](http://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Auto-DeepLab_Hierarchical_Neural_Architecture_Search_for_Semantic_Image_Segmentation_CVPR_2019_paper.pdf) | [github](https://github.com/tensorflow/models/tree/master/research/deeplab) |

| Dilated Residual Networks | _22 jul 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/dilated-residual-networks.pdf) |  | 

| SfM-Net: Learning of Structure and Motion from Video | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07804) |  | 

| Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network | _28 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.09695) |  | 

| Mask R-CNN | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06870) |  | 

| Learning Features by Watching Objects Move | _19 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06370) |  | 

| RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | _20 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.06612) | [github](https://github.com/guosheng/refinenet) | 

| UberNet: Training a 'Universal' Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory | _7 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.02132) |  | 

| DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs | _2 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.00915) |  | 

| Fully Convolutional Networks for Semantic Segmentation | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06211) | [github](https://github.com/shelhamer/fcn.berkeleyvision.org) | 

| Instance-aware Semantic Segmentation via Multi-task Network Cascades | _14 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.04412) |  | 

| Multi-Scale Context Aggregation by Dilated Convolutions | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07122) |  | 

| SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation | _2 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.00561) |  | 

| U-Net: Convolutional Networks for Biomedical Image Segmentation | _18 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.04597) |  | 

| Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.7062) |  | 

| Learning Rich Features from RGB-D Images for Object Detection and Segmentation | _22 jul 2014_ | [arxiv](https://arxiv.org/pdf/1407.5736) |  | 

### Text (in the Wild) Recognition

|Title|Date|Paper|Code|

|---|---|---|---|

| OCR Error Correction Using Character Correction and Feature-Based Word Classification | _21 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06225) |  | 

| Recursive Recurrent Nets with Attention Modeling for OCR in the Wild | _9 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.03101) |  | 

| COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | _26 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.07140) |  | 

| Efficient Scene Text Localization and Recognition with Local Character Refinement | _14 apr 2015_ | [arxiv](https://arxiv.org/pdf/1504.03522) |  | 

| Reading Text in the Wild with Convolutional Neural Networks | _4 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.1842) |  | 

| Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | _9 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2227) |  | 

### Brain Computer Interfacing

|Title|Date|Paper|Code|

|---|---|---|---|

| Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG | _15 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.05051) |  | 

| Encoding Voxels with Deep Learning | _2 dec 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/encoding-voxels-with-deep-learning.pdf) |  | 

| Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream | _8 jul 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-neural-networks-reveal-a-gradient-in-the-complexity-of-neural-representations-across-the-ventral-stream.pdf) |  | 

### Self-Driving Cars

|Title|Date|Paper|Code|

|---|---|---|---|

| Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05519) |  | 

| End to End Learning for Self-Driving Cars | _25 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.07316) |  | 

### Object Recognition

|Title|Date|Paper|Code|

|---|---|---|---|

| Cascade R-CNN: High Quality Object Detection and Instance Segmentation | _24 Jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.09756.pdf) | [github](https://github.com/zhaoweicai/cascade-rcnn) |

| YOLOv3: An Incremental Improvement | _8 Apr 2018_ | [arxiv](https://arxiv.org/pdf/1804.02767.pdf) | [github](https://github.com/pjreddie/darknet), [github reimplementation](https://github.com/ultralytics/yolov3) | 

| Focal Loss for Dense Object Detection | _7 aug 2017_ | [arxiv](https://arxiv.org/pdf/1708.02002) |  | 

| Introspective Classifier Learning: Empower Generatively | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07816) |  | 

| Learning Chained Deep Features and Classifiers for Cascade in Object Detection | _23 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07054) |  | 

| DSSD : Deconvolutional Single Shot Detector | _23 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.06659) |  |  

| YOLO9000: Better, Faster, Stronger | _25 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.08242) | [github](https://github.com/pjreddie/darknet) |  

| Feature Pyramid Networks for Object Detection | _9 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03144) |  | 

| Speed/accuracy trade-offs for modern convolutional object detectors | _30 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.10012) |  | 

| Aggregated Residual Transformations for Deep Neural Networks | _16 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.05431) |  | 

| Hierarchical Object Detection with Deep Reinforcement Learning | _11 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.03718) |  | 

| Xception: Deep Learning with Depthwise Separable Convolutions | _7 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.02357) |  | 

| Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition | _1 oct 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/learning-to-make-better-mistakes-semantics-aware-visual-food-recognition.pdf) |  | 

| Densely Connected Convolutional Networks | _25 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.06993) |  | 

| Residual Networks of Residual Networks: Multilevel Residual Networks | _9 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.02908) |  | 

| Context Matters: Refining Object Detection in Video with Recurrent Neural Networks | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04648) |  | 

| R-FCN: Object Detection via Region-based Fully Convolutional Networks | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06409) |  | 

| Training Region-based Object Detectors with Online Hard Example Mining | _12 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.03540) |  | 

| T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos | _9 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.02532) |  | 

| Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning | _23 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07261) |  | 

| Deep Residual Learning for Image Recognition | _10 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.03385) |  | 

| SSD: Single Shot MultiBox Detector | _8 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.02325) |  |  

| Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07289) |  | 

| ParseNet: Looking Wider to See Better | _15 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.04579) |  | 

| You Only Look Once: Unified, Real-Time Object Detection | _8 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02640) |  |  

| Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | _4 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.01497) |  |  

| Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | _6 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.01852) |  | 

| Deep Image: Scaling up Image Recognition | _13 jan 2015_ | [arxiv](https://arxiv.org/pdf/1501.02876) |  | 

| Rich feature hierarchies for accurate object detection and semantic segmentation | _11 nov 2013_ | [arxiv](https://arxiv.org/pdf/1311.2524) |  | 

| Selective Search for Object Recognition | _11 mar 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/selective-search-for-object-recognition.pdf) |  | 

| ImageNet Classification with Deep Convolutional Neural Networks | _3 dec 2012_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/imagenet-classification-with-deep-convolutional-neural-networks.pdf) |  |  

### Logo Recognition

|Title|Date|Paper|Code|

|---|---|---|---|

| Deep Learning Logo Detection with Data Expansion by Synthesising Context | _29 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.09322) |  | 

| Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks | _20 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06083) |  | 

| LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks | _8 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02462) |  | 

| DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer | _7 oct 2015_ | [arxiv](https://arxiv.org/pdf/1510.02131) |  | 

### Super Resolution

|Title|Date|Paper|Code|

|---|---|---|---|

| Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network | _16 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05158) |  | 

| Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network | _15 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.04802) |  | 

| RAISR: Rapid and Accurate Image Super Resolution | _3 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.01299) |  | 

| Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | 

| Image Super-Resolution Using Deep Convolutional Networks | _31 dec 2014_ | [arxiv](https://arxiv.org/pdf/1501.00092) |  | 

### Pose Estimation

|Title|Date|Paper|Code|

|---|---|---|---|

| Forecasting Human Dynamics from Static Images | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03432) |  | 

| Fast Single Shot Detection and Pose Estimation | _19 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05590) |  | 

| Convolutional Pose Machines | _30 jan 2016_ | [arxiv](https://arxiv.org/pdf/1602.00134) |  | 

| Flowing ConvNets for Human Pose Estimation in Videos | _9 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02897) |  | 

### Image Captioning

|Title|Date|Paper|Code|

|---|---|---|---|

| Actor-Critic Sequence Training for Image Captioning | _29 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.09601) |  | 

| Detecting and Recognizing Human-Object Interactions | _24 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07333) |  | 

| Deep Reinforcement Learning-based Image Captioning with Embedding Reward | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03899) |  | 

| Towards Diverse and Natural Image Descriptions via a Conditional GAN | _17 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06029) |  | 

| Temporal Tessellation: A Unified Approach for Video Analysis | _21 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06950) | [github](https://github.com/dot27/temporal-tessellation) | 

| Self-critical Sequence Training for Image Captioning | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00563) |  | 

| Generation and Comprehension of Unambiguous Object Descriptions | _7 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02283) |  | 

| Show, Attend and Tell: Neural Image Caption Generation with Visual Attention | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03044) |  | 

| Long-term Recurrent Convolutional Networks for Visual Recognition and Description | _17 nov 2014_ | [arxiv](https://arxiv.org/pdf/1411.4389) |  | 

### Image Compression

|Title|Date|Paper|Code|

|---|---|---|---|

| Full Resolution Image Compression with Recurrent Neural Networks | _18 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.05148) |  | 

### Image Synthesis

|Title|Date|Paper|Code|

|---|---|---|---|

| Scene Text Synthesis for Efficient and Effective Deep Network Training | _26 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.09193.pdf) | |

| A Neural Representation of Sketch Drawings | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03477) |  | 

| BEGAN: Boundary Equilibrium Generative Adversarial Networks | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10717) | [github](https://github.com/carpedm20/BEGAN-tensorflow) | 

| Improved Training of Wasserstein GANs | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1704.00028) |  | 

| Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | _30 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10593) | [github](https://github.com/junyanz/CycleGAN) | 

| Wasserstein GAN | _26 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.07875) |  | 

| RenderGAN: Generating Realistic Labeled Data | _4 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01331) |  | 

| Conditional Image Generation with PixelCNN Decoders | _16 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.05328) |  | 

| Pixel Recurrent Neural Networks | _25 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.06759) |  | 

| Generative Adversarial Networks | _10 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2661) |  | 

### Face Recognition

|Title|Date|Paper|Code|

|---|---|---|---|

| Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition | _24 oct 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/accessorize-to-a-crime-real-and-stealthy-attacks-on-state-of-the-art-face-recognition.pdf) |  | 

| OpenFace: A general-purpose face recognition library with mobile applications | _1 jun 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/openface-a-general-purpose-face-recognition-library-with-mobile-applications.pdf) |  | 

| Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns | _9 nov 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/emotion-recognition-in-the-wild-via-convolutional-neural-networks-and-mapped-binary-patterns.pdf) |  | 

| Deep Face Recognition | _7 sep 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-face-recognition.pdf) |  | 

| Compact Convolutional Neural Network Cascade for Face Detection | _6 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.01292) |  | 

| Learning Robust Deep Face Representation | _17 jul 2015_ | [arxiv](https://arxiv.org/pdf/1507.04844) |  | 

| Facenet: A unified embedding for face recognition and clustering | _12 jun 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/facenet-a-unified-embedding-for-face-recognition-and-clustering.pdf) |  | 

| Multi-view Face Detection Using Deep Convolutional Neural Networks | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.02766) |  | 

### Image Composition

|Title|Date|Paper|Code|

|---|---|---|---|

| Auto-Retoucher(ART) — A Framework for Background Replacement and Foreground Adjustment | _13 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.03954.pdf) (brave new task) | [github](https://github.com/woshiyyya/Auto-Retoucher-pytorch) (not able to reproduce results based on code) | 

| Spatial Fusion GAN for Image Synthesis | _14 Dec 2018_ | [arxiv](https://arxiv.org/pdf/1812.05840.pdf) (needs revision, interesting approach however) | [github](https://github.com/fnzhan/SF-GAN) (currently, no code available) | 

| Compositional GAN: Learning Conditional Image Composition | _23 Aug 2018_ | [arxiv](https://arxiv.org/pdf/1807.07560.pdf) (with respect to spatial orientation) | [github](https://github.com/azadis/CompositionalGAN) (currently, no code available) | 

| ST-GAN | _5 mar 2018_ | [arxiv](https://arxiv.org/pdf/1803.01837) (with respect to spatial orientation) | [github](https://github.com/chenhsuanlin/spatial-transformer-GAN)  | 

| Deep Painterly Harmonization | _26 Jun 2018_ | [arxiv](https://arxiv.org/pdf/1804.03189.pdf) | [github](https://github.com/luanfujun/deep-painterly-harmonization) | 

| Deep Image Harmonization | _28 feb 2017_ | [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Tsai_Deep_Image_Harmonization_CVPR_2017_paper.pdf) | [github](https://github.com/wasidennis/DeepHarmonization) (only code for inference) | 

| Understanding and Improving the Realism of Image Composites | _1 Jul 2012_ | [paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.682.3987&rep=rep1&type=pdf) | |

### Scene Graph Parsing

|Title|Date|Paper|Code|

|---|---|---|---|

| Neural Motifs: Scene Graph Parsing with Global Context | _29 Mar 2018_ | [arxiv](https://arxiv.org/pdf/1711.06640.pdf) | [github](https://github.com/rowanz/neural-motifs) | 

### Video Deblurring

|Title|Date|Paper|Code|

|---|---|---|---|

| Spatio-Temporal Filter Adaptive Network for Video Deblurring | _28 Apr 2019_ | [arxiv](https://arxiv.org/pdf/1904.12257.pdf) | [project page](https://shangchenzhou.com/projects/stfan/) (code to appear) | 

### Depth Perception

|Title|Date|Paper|Code|

|---|---|---|---|

| Learning Depth with Convolutional Spatial Propagation Network | _13 Oct 2018_ | [arxiv](https://arxiv.org/pdf/1810.02695.pdf) | [github](https://github.com/XinJCheng/CSPN) |

| Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches | _18 May 2016_ | [arxiv](https://arxiv.org/pdf/1510.05970.pdf) | [github](https://github.com/jzbontar/mc-cnn) | 

### 3D Reconstruction

|Title|Date|Paper|Code|

|---|---|---|---|

| Cerberus: A Multi-headed Derenderer | _28 May 2019_ | [arxiv](https://arxiv.org/pdf/1905.11940.pdf) |  |

### Vision Representation

|Title|Date|Paper|Code|

|---|---|---|---|

| VisualBERT: A Simple and Performant Baseline for Vision and Language | _9 aug 2019_ | [arxiv](https://arxiv.org/pdf/1908.03557.pdf) | |

| Expected to appear: some paper learning an unsupervised vision representation that beats SOTA on a large number of tasks |  | | | 

## Audio

### Audio Synthesis

|Title|Date|Paper|Code|

|---|---|---|---|

| Deep Cross-Modal Audio-Visual Generation | _26 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.08292) |  | 

| A Neural Parametric Singing Synthesizer | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03809) |  | 

| Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders | _5 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01279) | [github](https://github.com/tensorflow/magenta/tree/master/magenta/models/nsynth) | 

| Tacotron: Towards End-to-End Speech Synthesis | _29 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10135) | [github](https://github.com/Kyubyong/tacotron) | 

| Deep Voice: Real-time Neural Text-to-Speech | _25 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07825) |  | 

| WaveNet: A Generative Model for Raw Audio | _12 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.03499) | [github](https://github.com/ibab/tensorflow-wavenet) |  

## Other

### Unclassified

|Title|Date|Paper|Code|

|---|---|---|---|

| A simple neural network module for relational reasoning | _5 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.01427) |  | 

| Deep Complex Networks | _27 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.09792) | [github](https://github.com/ChihebTrabelsi/deep_complex_networks) | 

| Learning to Fly by Crashing | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05588) |  | 

| Who Said What: Modeling Individual Labelers Improves Classification | _26 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.08774) |  | 

| Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data | _18 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.05755) |  | 

| DeepMath - Deep Sequence Models for Premise Selection | _14 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04442) |  | 

| Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue | _16 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.04992) |  | 

| Long Short-Term Memory | _15 nov 1997_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/long-short-term-memory.pdf) |  | 

### Regularization

|Title|Date|Paper|Code|

|---|---|---|---|

| Self-Normalizing Neural Networks | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02515) |  | 

| Concrete Dropout | _22 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.07832) | [github](https://github.com/yaringal/ConcreteDropout) | 

| Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning | _6 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02142) |  | 

| Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | _11 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03167) |  | 

### Neural Network Compression

|Title|Date|Paper|Code|

|---|---|---|---|

| Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure | _15 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.04337) |  | 

| SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size | _24 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07360) |  | 

| Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding | _1 oct 2015_ | [arxiv](https://arxiv.org/pdf/1510.00149) |  | 

### Optimizers

|Title|Date|Paper|Code|

|---|---|---|---|

| Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02677) |  | 

| Equilibrated adaptive learning rates for non-convex optimization | _15 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.04390) |  | 

| Adam: A Method for Stochastic Optimization | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6980) |  | 

| Deep learning with Elastic Averaging SGD | _20 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6651) |  | 

| ADADELTA: An Adaptive Learning Rate Method | _22 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.5701) |  | 

| Advances in Optimizing Recurrent Networks | _4 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.0901) |  | 

| Efficient Backprop | _1 jul 1998_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/efficient-backprop.pdf) |  |  

## A note on arXiv

arXiv provides the world with access to the newest scientific developments.

Open access has many benefits; in particular, it allows science to be more efficient.

Remember to think about the quality of the papers referenced, and in particular about the importance of the [peer-review process](https://undsci.berkeley.edu/article/howscienceworks_16) for science.

If you find an article on arXiv you should check if it has been peer-reviewed and published elsewhere. 

The authoritative version of a paper is not the one on arXiv but the published, peer-reviewed version.

The two versions may differ significantly. 

For example, this is the case with one of the papers that I once discussed in the Text and Multimedia Mining class at Radboud:

- [peer-reviewed version](http://opus.bath.ac.uk/55288/4/CaliskanEtAl_authors_full.pdf)

- [arXiv version](https://arxiv.org/abs/1608.07187)

Compare for yourself.

For the selection of the papers above, I chose open access over completeness.

If you find another (open) version of a paper, you are invited to make a pull request.