Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/morvanzhou/nlp-tutorials

Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
https://github.com/morvanzhou/nlp-tutorials

attention bert elmo gpt nlp seq2seq transformer tutorial w2v

Last synced: 2 days ago
JSON representation

Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com

Awesome Lists containing this project

README

        

# Natural Language Processing Tutorial

Tutorial in Chinese can be found in [mofanpy.com](https://mofanpy.com/tutorials/machine-learning/nlp/).

This repo includes many simple implementations of models in Neural Language Processing (NLP).

All code implementations in this tutorial are organized as following:

1. Search Engine
- [TF-IDF numpy / TF-IDF skearn](#TF-IDF)
2. Understand Word (W2V)
- [Continuous Bag of Words (CBOW)](#Word2Vec)
- [Skip-Gram](#Word2Vec)
3. Understand Sentence (Seq2Seq)
- [seq2seq](#Seq2Seq)
- [CNN language model](#CNNLanguageModel)
4. All about Attention
- [seq2seq with attention](#Seq2SeqAttention)
- [Transformer](#Transformer)
5. Pretrained Models
- [ELMo](#ELMO)
- [GPT](#GPT)
- [BERT](#BERT)

Thanks for the contribution made by [@W1Fl](https://github.com/W1Fl) with a simplified keras codes in [simple_realize](simple_realize).
And the a [pytorch version of this NLP](/pytorch) tutorial made by [@ruifanxu](https://github.com/ruifan831).

## Installation

```shell script
$ git clone https://github.com/MorvanZhou/NLP-Tutorials
$ cd NLP-Tutorials/
$ sudo pip3 install -r requirements.txt
```

## TF-IDF

TF-IDF numpy [code](tf_idf.py)

TF-IDF short sklearn [code](tf_idf_sklearn.py)


image

## Word2Vec
[Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)

Skip-Gram [code](skip-gram.py)

CBOW [code](CBOW.py)


image


image


image

## Seq2Seq
[Sequence to Sequence Learning with Neural Networks](https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)

Seq2Seq [code](seq2seq.py)


image

## CNNLanguageModel
[Convolutional Neural Networks for Sentence Classification](https://arxiv.org/pdf/1408.5882.pdf)

CNN language model [code](cnn-lm.py)


image

## Seq2SeqAttention
[Effective Approaches to Attention-based Neural Machine Translation](https://arxiv.org/pdf/1508.04025.pdf)

Seq2Seq Attention [code](seq2seq_attention.py)


image


image

## Transformer
[Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)

Transformer [code](transformer.py)


image


image


image

## ELMO
[Deep contextualized word representations](https://arxiv.org/pdf/1802.05365.pdf)

ELMO [code](ELMo.py)


image


image

## GPT
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)

GPT [code](GPT.py)


image


image

## BERT
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)

BERT [code](BERT.py)

My new attempt [Bert with window mask](BERT_window_mask.py)


image


image