https://github.com/dpressel/lit

Host: GitHub
URL: https://github.com/dpressel/lit
Owner: dpressel
Created: 2018-03-23T16:00:48.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-06-19T10:50:25.000Z (over 7 years ago)
Last Synced: 2025-02-22T22:28:46.190Z (9 months ago)
Size: 5.86 KB
Stars: 5
Watchers: 6
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# README #

Papers and books that I think are important to read if you are doing NLP and Deep Learning.

### Books ###

#### I strongly recommend reading at least the first book

- Jurafsky/Martin (https://web.stanford.edu/~jurafsky/slp3/)
- Third edition is online (https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf)

- Manning/Schütze (https://nlp.stanford.edu/fsnlp/)

### Papers for NLP ###

- _Accurate Methods for the Statistics of Surprise and Coincidence (Dunning)_
- http://www.aclweb.org/anthology/J93-1003

- _A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition (Rabiner)_
- http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/tutorial%20on%20hmm%20and%20applications.pdf

- _TnT - A Statistical Part of Speech Tagger (Brants)_
- http://www.coli.uni-saarland.de/~thorsten/publications/Brants-ANLP00.pdf

- _A Maximum Entropy Approach to Natural Language Processing (Berger, Della Pietra, Della Pietra)_
- http://www.cs.columbia.edu/~jebara/6772/papers/maxent.pdf

- _A Maximum Entropy Model for Part-Of-Speech Tagging (Ratnaparkhi)_
- http://www.aclweb.org/anthology/W/W96/W96-0213.pdf

- _Baselines and Bigrams: Simple, Good Sentiment and Topic Classification (Wang, Manning) (NBSVM)_
- http://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf
- https://rawgit.com/dpressel/Meetups/master/nlp-meetup-2016-02-25/presentation.html#34

- _Unsupervised Word Sense Disambiguation Rivaling Supervised Methods (Yarowsky)_
- http://www.aclweb.org/anthology/P/P95/P95-1026.pdf

- _Training Deterministic Parsers with Non-Deterministic Oracles (Goldberg, Nivre)_
- http://aclweb.org/anthology/Q/Q13/Q13-1033.pdf

- _A Dynamic Oracle for Arc-Eager Dependency Parsing (Goldberg, Nivre)_
- http://www.aclweb.org/anthology/C/C12/C12-1059.pdf

- _TextRank: Bringing Order into Texts (Mihalcea, Tarau)_
- https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf

- _Improving Machine Learning Approaches to Coreference Resolution (Ng, Cardie)_
- http://www.hlt.utdallas.edu/~vince/papers/acl02.pdf

- _Local and Global Algorithms for Disambiguation to Wikipedia (Ratinov, Roth, Downey, Anderson)_
- http://www.aclweb.org/anthology/P11-1138.pdf

- _Latent Dirichlet Allocation (Blei, Ng, Jordan)_
- http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf

### Papers for Deep Learning (mostly for NLP) ###

#### Representations, Cross-task

- _Distributed Representations of Words and Phrases and their Compositionality (Mikolov, Sutskever, Chen, Corrado, Dean)_
- https://arxiv.org/abs/1310.4546
- _Exploiting Similarities among Languages for Machine Translation (Mikolov, Le, Sutskever)_
- https://arxiv.org/abs/1309.4168
- _Efficient Estimation of Word Representations in Vector Space (Mikolov, Chen, Corrado, Dean)_
- https://arxiv.org/abs/1301.3781
- _Deep contextualized word representations (Peters et al)_
- https://export.arxiv.org/pdf/1802.05365
- _Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (Ling et al)_
- https://arxiv.org/pdf/1508.02096.pdf
- _Natural Language Processing (Almost) from Scratch (Collobert et al)_
- http://jmlr.org/papers/volume12/collobert11a/collobert11a.pdf
- _Enriching Word Vectors with Subword Information (Bojanowski, Grave, Joulin, Mikolov)_
- https://arxiv.org/abs/1607.04606
- _Convolutional Neural Networks for Text Categorization: Shallow Word-level vs Deep Character-level (Johnson, Zhang)_
- https://arxiv.org/pdf/1609.00718.pdf

#### Language Modeling

- _Recurrent Neural Network Regularization (Zaremba, Sutskever, Vinyals)_
- https://arxiv.org/abs/1409.2329
- _Character-Aware Neural Language Models (Kim, Jernite, Sontag, Rush)_
- https://arxiv.org/abs/1508.06615
- _Exploring the Limits of Language Modeling (Jozefowicz, Vinyals, Schuster, Shazeer, Wu)_
- https://arxiv.org/pdf/1602.02410v2.pdf
- _Regularizing and Optimizing LSTM Language Models (Merity, Keskar, Socher)_
- https://arxiv.org/pdf/1708.02182.pdf

#### Sequence Tagging

- _Learning Character-level Representations for Part-of-Speech Tagging (dos Santos, Zadrozny)_
- http://proceedings.mlr.press/v32/santos14.pdf
- https://rawgit.com/dpressel/Meetups/master/nlp-reading-group-2016-03-14/presentation.html#1
- _Boosting Named Entity Recognition with Neural Character Embeddings (dos Santos, Guimarães)_
- http://www.aclweb.org/anthology/W15-3904
- https://rawgit.com/dpressel/Meetups/master/nlp-reading-group-2016-03-14/presentation.html#1
- _Neural Architectures for Named Entity Recognition (Lample et al)_
- https://arxiv.org/abs/1603.01360
- _End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF (Ma, Hovy)_
- https://arxiv.org/abs/1603.01354
- _Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging (Reimers, Gurevych)_
- http://aclweb.org/anthology/D17-1035
- _Design Challenges and Misconceptions in Neural Sequence Labeling (Yang, Liang, Zhang)_
- https://arxiv.org/pdf/1806.04470.pdf

#### Encoder-Decoders, NMT

- _Sequence to Sequence Learning with Neural Networks (Sutskever, Vinyals, Le)_
- https://arxiv.org/abs/1409.3215
- _Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (Cho et al)_
- https://arxiv.org/abs/1406.1078
- _Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau, Cho, Bengio)_
- https://arxiv.org/abs/1409.0473
- _Attention Is All You Need (Vaswani et al)_
- https://arxiv.org/pdf/1706.03762.pdf
- _Show and Tell: A Neural Image Caption Generator (Vinyals, Toshev, Bengio, Erhan)_
- https://arxiv.org/pdf/1411.4555v2.pdf

- _Effective Approaches to Attention-based Neural Machine Translation (Luong, Pham, Manning)_
- https://nlp.stanford.edu/pubs/emnlp15_attn.pdf

#### Dialogue

- _Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (Serban, Sordoni, Bengio, Courville, Pineau)_
- https://arxiv.org/pdf/1507.04808.pdf
- _End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning (Williams, Zweig)_
- https://arxiv.org/pdf/1606.01269.pdf
- _Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning (Williams, Asadi, Zweig)_
- https://arxiv.org/pdf/1702.03274.pdf
- _Learning End-to-End Goal-oriented Dialog (Bordes, Boureau, Weston)_
- https://arxiv.org/pdf/1605.07683.pdf
- https://rawgit.com/dpressel/Meetups/master/nlp-reading-group-2017-02-08/presentation.html#(1)
- _A Neural Conversation Model (Vinyals, Le)_
- https://arxiv.org/pdf/1506.05869v3.pdf

#### Classification, Architecture, ML

- _Convolutional Neural Networks for Sentence Classification (Kim)_
- https://arxiv.org/abs/1408.5882
- _Rethinking the Inception Architecture for Computer Vision (Szegedy et al)_
- https://arxiv.org/abs/1512.00567
- _Going Deeper with Convolutions (Szegedy et al)_
- https://arxiv.org/abs/1409.4842
- _Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Ioffe, Szegedy)_
- https://arxiv.org/abs/1502.03167
- _Hierarchical Attention Networks for Document Classification (Yang et al)_
- https://www.microsoft.com/en-us/research/publication/hierarchical-attention-networks-document-classification/
- _Deep Residual Learning for Image Recognition (He, Zhang, Ren, Sun)_
- https://arxiv.org/pdf/1512.03385v1.pdf

### Videos/Courses/Learning ###

- Hugo Larochelle's Neural Networks course
- https://www.youtube.com/watch?v=SGZ6BttHMPw&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
- Coursera
  - Hinton course
  - Ng course
- Andrew Gibiansky's Blog
- http://andrew.gibiansky.com/blog/machine-learning/fully-connected-neural-networks/
- _Understanding LSTM Networks_
- https://colah.github.io/posts/2015-08-Understanding-LSTMs/

- _The Unreasonable Effectiveness of Recurrent Neural Networks (Karpathy)_
- https://karpathy.github.io/2015/05/21/rnn-effectiveness/

- _Ronan Collobert's Thesis_
- http://ronan.collobert.org/pub/matos/2004_phdthesis_lip6.pdf

- _SGD Tricks (Bottou)_
- http://research.microsoft.com/pubs/192769/tricks-2012.pdf

- _Large Scale Online Learning (Bottou, LeCun)_
- http://leon.bottou.org/publications/pdf/nips-2003.pdf

- _Feature Hashing (Langford)_
- http://cilvr.cs.nyu.edu/diglib/lsml/lecture08-hashing.pdf
