Deep NLP Course
- Host: GitHub
- URL: https://github.com/DanAnastasyev/DeepNLP-Course
- Owner: DanAnastasyev
- Created: 2018-02-21T19:15:38.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-07-20T20:24:49.000Z (over 5 years ago)
- Last Synced: 2024-06-27T15:38:06.527Z (5 months ago)
- Topics: colab-notebook, deep-learning, keras, nlp, pytorch
- Language: Jupyter Notebook
- Homepage:
- Size: 1.88 MB
- Stars: 621
- Watchers: 25
- Forks: 164
- Open Issues: 4
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-google-colab - Deep NLP Course - A deep NLP Course (Course and Tutorial)
README
# Deep NLP Course at ABBYY
Deep learning for NLP crash course at ABBYY.
Suggested textbook: [Neural Network Methods in Natural Language Processing by Yoav Goldberg](https://www.amazon.com/Language-Processing-Synthesis-Lectures-Technologies/dp/1627052984)
*I'm gradually updating and translating the notebooks right now. Stay in touch.*
## Materials
### Week 1: *Introduction*
Sentiment analysis on the IMDB movie review dataset: a short overview of classical machine learning for NLP + an indecently brief intro to Keras.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12nrEX3JXTxsHWC-HpuwkTWyJybjmkZu-#forceEdit=true&offline=true&sandboxMode=true)
Updated English version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eW-mN3gEdLluYe1W1unQ-7-71by8eauz#scrollTo=OlqOAQmQGXOL&forceEdit=true&offline=true&sandboxMode=true)
### Week 2: *Word Embeddings: Part 1*
Meet the Word Embeddings: an unsupervised method to capture some fun relationships between words.
Phrase similarity with a word embedding model + word-based machine translation without parallel data (with MUSE word embeddings).
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1o65wrq6RYgWyyMvNP8r9ZknXBniDoXrn#forceEdit=true&offline=true&sandboxMode=true)
Updated English version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ey9NARKvk-c4vfQGdvOkPjp5wNGmxd5o#forceEdit=true&offline=true&sandboxMode=true)
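As a rough illustration of phrase similarity with word embeddings, here is a minimal sketch of one common baseline: average the word vectors of each phrase and compare the averages by cosine similarity. This is not necessarily what the notebook does. The function names and the tiny 2-dimensional vectors are made up for illustration and do not come from any trained model.

```python
import math
from typing import Dict, List


def phrase_vector(phrase: str, embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the words in `phrase` that have an embedding."""
    vectors = [embeddings[w] for w in phrase.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("no word in the phrase has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy embeddings, for illustration only.
toy = {"good": [1.0, 0.0], "great": [0.9, 0.1], "movie": [0.0, 1.0]}
similarity = cosine(phrase_vector("good movie", toy), phrase_vector("great movie", toy))
print(similarity)
```

Averaging ignores word order, which is why it is usually treated as a baseline rather than a final model.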
### Week 3: *Word Embeddings: Part 2*
Introduction to PyTorch. Implementation of a toy linear regression in pure NumPy and in PyTorch. Implementations of CBoW, skip-gram, negative sampling and structured Word2vec models.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1YruNhE5aEJfLpaCZSKGIaZ1hOQR5qoIG#forceEdit=true&offline=true&sandboxMode=true)
Updated English version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Cgdg_jUIbhMBZiL3DtUQMN6xqFUYjKlg#forceEdit=true&offline=true&sandboxMode=true)
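For orientation, skip-gram trains on (center word, context word) pairs taken from a sliding window over the text. The sketch below only shows how such pairs can be generated. The function name `skipgram_pairs` and the `window` parameter are illustrative assumptions, not names taken from the notebooks.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for all tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


for center, context in skipgram_pairs("the quick brown fox".split(), window=1):
    print(center, "->", context)
```

CBoW uses the same windows the other way round: the context words jointly predict the center word.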
### Week 4: *Convolutional Neural Networks*
Introduction to convolutional networks. Relations between convolutions and n-grams. Simple surname detector on character-level convolutions + fun visualizations.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Vo_yuiA7xLjavUA_7ayLeosGJyMsyDAt#forceEdit=true&offline=true&sandboxMode=true)
Updated English version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1iwhfaHp_L2loxjvbqW9DhO9BaWlIWzpB#forceEdit=true&offline=true&sandboxMode=true)
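One way to see the relation between convolutions and n-grams: a 1D filter of width n is applied to every window of n consecutive characters, that is, to every character n-gram. The sketch below assumes one-hot character inputs and a hand-built filter whose weights are the one-hot encoding of a target n-gram, so the filter response at each position is simply the number of matching characters. The function names are hypothetical, and a trained model would learn its filters rather than have them set by hand.

```python
from typing import List


def char_ngrams(word: str, n: int) -> List[str]:
    """All windows of n consecutive characters: what a width-n filter sees."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_filter_scores(word: str, pattern: str) -> List[int]:
    """Response of a filter that encodes `pattern`, at each window position.

    With one-hot inputs, the dot product between a window and the one-hot
    encoding of `pattern` equals the number of positions where they agree.
    """
    return [
        sum(1 for a, b in zip(window, pattern) if a == b)
        for window in char_ngrams(word, len(pattern))
    ]


scores = ngram_filter_scores("ivanov", "nov")
print(scores)       # [0, 0, 0, 3]
print(max(scores))  # max-over-time pooling keeps the strongest match: 3
```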
### Week 5: *RNNs: Part 1*
RNNs for text classification. Simple RNN implementation + memorization test. Surname detector in multilingual setup: character-level LSTM classifier.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1-FoMnf7s-BYNM7jT9UF3u9m63h7dSq3_#forceEdit=true&offline=true&sandboxMode=true)
Updated English version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WA9YA30m7xFYfLyptuW7lHROvVGbZWZo#forceEdit=true&offline=true&sandboxMode=true)
### Week 6: *RNNs: Part 2*
RNNs for sequence labelling. Part-of-speech tagger implementations based on word embeddings and character-level word embeddings.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1A7dbNANHg8srCemnwFI8WB1wLhvmuJp0#forceEdit=true&offline=true&sandboxMode=true)
### Week 7: *Language Models: Part 1*
Character-level language model for generating Russian troll tweets: a fixed-window model via convolutions and an RNN model.
Simple conditional language model: surname generation given source language.
Plus the Toxic Comment Classification Challenge, to apply your skills to a real-world problem.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1W5uaNpKFoaq1gV9N9FpIAEDyrsGGRBBi#forceEdit=true&offline=true&sandboxMode=true)
### Week 8: *Language Models: Part 2*
Word-level language model for poetry generation. Toy examples of transfer learning and multi-task learning applied to language models.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lUlBsdvAYJc5rLHwkOICyFhvns5Ssp1X#forceEdit=true&offline=true&sandboxMode=true)
### Week 9: *Seq2seq*
Seq2seq for machine translation and image captioning. Byte-pair encoding, beam search and other useful stuff for machine translation.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jSYWuEGwik2lnnvGSU_PyXTFtbRKSyz_#forceEdit=true&offline=true&sandboxMode=true)
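To give a feel for byte-pair encoding, the sketch below learns merge rules from word counts: start from single characters plus an end-of-word marker, then repeatedly merge the most frequent adjacent symbol pair. It is a minimal illustration under stated assumptions and not the notebook's code. The names `learn_bpe`, `merge_pair` and `count_pairs`, the `</w>` marker and the tie-breaking rule are all choices made here.

```python
from collections import Counter
from typing import Dict, List, Tuple

Symbols = Tuple[str, ...]


def count_pairs(vocab: Dict[Symbols, int]) -> Counter:
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(symbols: Symbols, pair: Tuple[str, str]) -> Symbols:
    """Replace each non-overlapping occurrence of `pair` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_counts: Dict[str, int], num_merges: int) -> List[Tuple[str, str]]:
    """Return the list of learned merges, most frequent pair first."""
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Ties are broken by comparing the pairs themselves (an arbitrary choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab: Dict[Symbols, int] = {}
        for symbols, freq in vocab.items():
            new_symbols = merge_pair(symbols, best)
            new_vocab[new_symbols] = new_vocab.get(new_symbols, 0) + freq
        vocab = new_vocab
    return merges


print(learn_bpe({"hug": 10, "pug": 5, "hugs": 5}, num_merges=3))
```

Applying the learned merges in order to a new word splits it into subword units, so rare words decompose into frequent pieces instead of becoming unknown tokens.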
### Week 10: *Seq2seq with Attention*
Seq2seq with attention for machine translation and image captioning.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1xZed_YAQf20fYacr9anE7T4EsdC_R0Oy#forceEdit=true&offline=true&sandboxMode=true)
### Week 11: *Transformers & Text Summarization*
Implementation of the Transformer model for text summarization. Discussion of Pointer-Generator Networks for text summarization.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wy5BDHZVEm-vSeH8U4Xh0Sm3bArwVWGU#forceEdit=true&offline=true&sandboxMode=true)
### Week 12: *Dialogue Systems: Part 1*
Goal-oriented dialogue systems. Implementation of a multi-task model: intent classifier and token tagger for the dialogue manager.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lNhbHboRYVb-caV7Ktj9cWJnHW7DPD9-#forceEdit=true&offline=true&sandboxMode=true)
### Week 13: *Dialogue Systems: Part 2*
General conversation dialogue systems and DSSMs. Implementation of a question answering model on the SQuAD dataset and a chit-chat model on the OpenSubtitles dataset.
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/19kQoxDWhv9VOfXxZCHcB39qIM30gziCR#forceEdit=true&offline=true&sandboxMode=true)
### Week 14: *Pretrained Models*
Pretrained models for various tasks: Universal Sentence Encoder for sentence similarity, ELMo for sequence tagging (with a bit of CRF), BERT for SWAG (reasoning about possible continuations).
Russian version: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WMspBJe-m0mJHb7SbDTj64W-5-3AxW7v#forceEdit=true&offline=true&sandboxMode=true)
### Final Presentation
[NLP Summary](https://drive.google.com/open?id=16GV-jSGtMAQPJgO_B6q1gLYXL10vM8Ev) - a summary of cool stuff that did and didn't appear in the course.