https://github.com/dipanjans/nlp_workshop_odsc_europe20

# ODSC Europe 2020 Workshop
![](https://i.imgur.com/EgiPQsO.png)

# Advanced NLP: From Essentials to Deep Transfer Learning
![](https://i.imgur.com/zcn60eh.png)

## Abstract

Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity expected of any data scientist in today’s fast-paced world! With a hands-on and interactive approach, we will cover essential concepts in NLP along with extensive examples to master state-of-the-art tools, techniques and methodologies for applying NLP to solve real-world problems. We will leverage machine learning, deep learning and deep transfer learning to solve popular NLP tasks including NER, classification, recommendation / information retrieval, summarization, language translation, Q&A and topic models.


## Session Outline

### Module 1: NLP Essentials
Here we start with the basics of how to process and work with text data and strings, look at the essential components of an NLP pipeline, and get hands-on with some of its key components, including text pre-processing, POS tagging and Named Entity Recognition. We will look at traditional approaches as well as newer deep transfer learning based approaches for a few of these components.

__Key Focus Areas: Text Pre-processing, NER, POS Tagging__
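
To give a flavor of the pre-processing step, here is a minimal pure-Python sketch of a pipeline (lowercasing, punctuation stripping, tokenization, stopword removal). The workshop notebooks use NLTK and spaCy for these steps; the stopword list below is just an illustrative subset, not the one used in the sessions.

```python
import re

# Illustrative subset of English stopwords; NLTK/spaCy ship much larger lists.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "in", "on", "and", "of", "to"}

def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation with spaces
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("The cat sat on the mat!"))  # → ['cat', 'sat', 'mat']
```

In practice you would also add steps like lemmatization or stemming here, which is where libraries like spaCy and NLTK come in.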


### Module 2: Text Representation
Text can't be consumed directly by downstream machine learning and deep learning models, since they are at heart math-based models that operate on numbers. The key focus of this module is to cover both traditional statistical methodologies and newer representation-learning methodologies that use deep learning to represent text data, including bag of words, n-grams, word embeddings, universal embeddings and contextual embeddings.

__Key Focus Areas: Count-based Representations (Bag of Words, N-grams, TF-IDF), Similarity, Topics, Word Embeddings (Word2Vec, GloVe, FastText), Universal Embeddings, Contextual Embeddings (Transformers)__
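
As a from-scratch sketch of one of the count-based representations above (TF-IDF), the following shows how raw term counts get reweighted by document frequency. In the workshop these representations are built with scikit-learn and gensim rather than by hand; this toy version is only meant to expose the arithmetic.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF vectors (dicts of term -> weight) for a corpus.

    Toy implementation: tf = count / doc length, idf = log(n_docs / doc_freq).
    Real workflows would use scikit-learn's TfidfVectorizer.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter()                       # document frequency per term
    for doc in tokenized:
        for term in set(doc):
            df[term] += 1
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        vectors.append(vec)
    return vectors
```

Note that a term appearing in every document (e.g. "the") gets weight `log(1) = 0`, which is exactly the "discount common words" behavior that distinguishes TF-IDF from a plain bag of words.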


### Module 3: NLP Applications (Machine Learning / Deep Learning)
We will look at several popular applications of NLP in this module and go through hands-on examples, including movie recommendation systems using similarity, topic modeling analysis of research papers, text document summarization, language translation, text classification and sentiment analysis.

__Key Focus Areas: Topic Models, Similarity / Information Retrieval, Summarization (TextRank / Transformers), Language Translation (seq2seq / attention), Classification (machine learning & deep learning models)__
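
The similarity-based recommendation idea can be sketched as cosine similarity over sparse term-weight vectors (such as TF-IDF vectors of movie descriptions). The function names here are illustrative, not taken from the workshop notebooks.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query_vec, doc_vecs, top_n=3):
    """Return indices of the top_n documents most similar to the query."""
    scored = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return scored[:top_n]
```

The same similarity machinery underlies information retrieval (rank documents against a query) and simple content-based recommenders (rank items against an item the user liked).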


### Module 4: NLP Applications with Deep Transfer Learning
We finally dive into some of the best advancements of the last few years in the world of NLP, thanks to deep transfer learning. We will build a deep conceptual understanding of the transformer architecture and look at hands-on examples of text classification and multi-task NLP using transformers, solving NER, Q&A, sentiment analysis, summarization and translation with effective constructs like the transformers pipeline.

__Key Focus Areas: Text Classification (with pre-trained embeddings, universal sentence encoders and transformers), Multi-task NLP with transformer pipelines (sentiment analysis, NER, text generation, summarization, question-answering, translation), Fine-tuning / training transformers (tips / guidelines) with examples, e.g. NER__
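
The transformers `pipeline` construct mentioned above can be used roughly like this (a minimal sketch using the Hugging Face `transformers` library; when no model name is given, a default pre-trained model is downloaded on first use):

```python
from transformers import pipeline

# One-line multi-task NLP: swap the task string for "ner",
# "summarization", "translation_en_to_fr", "question-answering", etc.
sentiment = pipeline("sentiment-analysis")

result = sentiment("This workshop was incredibly useful!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

Each task string maps to a sensible default architecture (e.g. a DistilBERT model fine-tuned on SST-2 for sentiment), and any compatible model from the Hugging Face hub can be substituted via the `model=` argument.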


## Background Knowledge
__Skills:__ Basic understanding of Machine Learning and Deep Learning (though we will cover some essentials)

__Tools / Languages:__ Python, TensorFlow / Keras / PyTorch, Scikit-Learn (basics)