Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/morvanzhou/nlp-tutorials
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
- Host: GitHub
- URL: https://github.com/morvanzhou/nlp-tutorials
- Owner: MorvanZhou
- License: MIT
- Created: 2018-11-30T09:11:39.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-05-22T23:31:37.000Z (over 1 year ago)
- Last Synced: 2024-05-02T21:27:40.746Z (6 months ago)
- Topics: attention, bert, elmo, gpt, nlp, seq2seq, transformer, tutorial, w2v
- Language: Python
- Homepage: https://mofanpy.com
- Size: 888 KB
- Stars: 874
- Watchers: 17
- Forks: 309
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Natural Language Processing Tutorial
A tutorial in Chinese can be found at [mofanpy.com](https://mofanpy.com/tutorials/machine-learning/nlp/).
This repo includes many simple implementations of models in Natural Language Processing (NLP).
All code implementations in this tutorial are organized as follows:
1. Search Engine
- [TF-IDF numpy / TF-IDF sklearn](#TF-IDF)
2. Understand Word (W2V)
- [Continuous Bag of Words (CBOW)](#Word2Vec)
- [Skip-Gram](#Word2Vec)
3. Understand Sentence (Seq2Seq)
- [seq2seq](#Seq2Seq)
- [CNN language model](#CNNLanguageModel)
4. All about Attention
- [seq2seq with attention](#Seq2SeqAttention)
- [Transformer](#Transformer)
5. Pretrained Models
- [ELMo](#ELMO)
- [GPT](#GPT)
- [BERT](#BERT)

Thanks to [@W1Fl](https://github.com/W1Fl) for contributing simplified Keras code in [simple_realize](simple_realize),
and to [@ruifanxu](https://github.com/ruifan831) for a [PyTorch version of this NLP tutorial](/pytorch).

## Installation
```shell
$ git clone https://github.com/MorvanZhou/NLP-Tutorials
$ cd NLP-Tutorials/
$ sudo pip3 install -r requirements.txt
```

## TF-IDF
TF-IDF numpy [code](tf_idf.py)
TF-IDF sklearn (shorter) [code](tf_idf_sklearn.py)
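
For intuition, here is a minimal TF-IDF computation in plain numpy on a toy corpus (a sketch, independent of the repo's `tf_idf.py`; the documents are made up for illustration):

```python
import numpy as np

docs = [["it", "is", "a", "good", "day"],
        ["it", "is", "a", "bad", "day"],
        ["how", "are", "you"]]
vocab = sorted({w for d in docs for w in d})
idx = {w: i for i, w in enumerate(vocab)}

# term frequency: normalized counts of each word per document
tf = np.zeros((len(docs), len(vocab)))
for i, d in enumerate(docs):
    for w in d:
        tf[i, idx[w]] += 1
tf /= tf.sum(axis=1, keepdims=True)

# inverse document frequency: words in fewer documents get larger weight
df = (tf > 0).sum(axis=0)
idf = np.log(len(docs) / df)

tfidf = tf * idf          # shape: (n_docs, n_vocab)
print(tfidf.round(3))
```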
## Word2Vec
[Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)

Skip-Gram [code](skip-gram.py)
CBOW [code](CBOW.py)
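
Skip-Gram trains a model to predict context words from a center word. The sketch below (a toy illustration, not the repo's pipeline) shows how (center, context) training pairs are generated with a window of 2:

```python
def skip_gram_pairs(tokens, window=2):
    """Yield (center, context) pairs for skip-gram training."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:                       # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

print(skip_gram_pairs(["i", "like", "natural", "language", "processing"]))
```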
## Seq2Seq
[Sequence to Sequence Learning with Neural Networks](https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)

Seq2Seq [code](seq2seq.py)
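
A minimal encoder-decoder forward pass in Keras, assuming TensorFlow is installed; vocabulary size, dimensions, and random inputs are toy values, not the repo's `seq2seq.py`:

```python
import numpy as np
from tensorflow.keras import layers

VOCAB, EMB, UNITS = 100, 16, 32

# encoder: embed source tokens, keep only the final GRU state
enc_emb = layers.Embedding(VOCAB, EMB)
enc_gru = layers.GRU(UNITS, return_state=True)

# decoder: start from the encoder state, predict target tokens step by step
dec_emb = layers.Embedding(VOCAB, EMB)
dec_gru = layers.GRU(UNITS, return_sequences=True)
out = layers.Dense(VOCAB)

src = np.random.randint(0, VOCAB, (2, 7))   # (batch, src_len)
tgt = np.random.randint(0, VOCAB, (2, 5))   # (batch, tgt_len)

_, state = enc_gru(enc_emb(src))            # compress source into one vector
h = dec_gru(dec_emb(tgt), initial_state=state)
logits = out(h)                             # (batch, tgt_len, VOCAB)
print(logits.shape)
```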
## CNNLanguageModel
[Convolutional Neural Networks for Sentence Classification](https://arxiv.org/pdf/1408.5882.pdf)

CNN language model [code](cnn-lm.py)
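
The paper's idea is to convolve over word embeddings with several kernel widths and max-pool into a fixed-size sentence vector; a Keras sketch with toy shapes (an illustration, not the repo's `cnn-lm.py`):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, EMB, SEQ = 100, 16, 10
x = np.random.randint(0, VOCAB, (2, SEQ))

emb = layers.Embedding(VOCAB, EMB)(x)          # (batch, seq, emb)
# convolve over time with kernel widths 2, 3, 4, then max-pool each feature map
feats = [layers.GlobalMaxPooling1D()(layers.Conv1D(8, k, activation="relu")(emb))
         for k in (2, 3, 4)]
sent = tf.concat(feats, axis=-1)               # fixed-size sentence vector
print(sent.shape)                              # (2, 24)
```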
## Seq2SeqAttention
[Effective Approaches to Attention-based Neural Machine Translation](https://arxiv.org/pdf/1508.04025.pdf)

Seq2Seq Attention [code](seq2seq_attention.py)
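
Attention boils down to scoring each encoder state against the current decoder state and taking a weighted sum; a dot-score (Luong-style) sketch in numpy with toy shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

enc_states = np.random.randn(6, 8)   # one vector per source position
dec_state = np.random.randn(8)       # current decoder hidden state

scores = enc_states @ dec_state      # dot score for each source position
weights = softmax(scores)            # attention distribution over the source
context = weights @ enc_states       # weighted sum fed back to the decoder
print(weights.round(2), context.shape)
```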
## Transformer
[Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)

Transformer [code](transformer.py)
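
The paper's core operation is scaled dot-product attention, softmax(QKᵀ/√d_k)·V; a numpy sketch (toy shapes, the optional mask argument is for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, the core op of the Transformer."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)   # blocked positions get ~0 weight
    return softmax(scores) @ v

q = k = v = np.random.randn(2, 5, 8)            # (batch, seq, d_k): self-attention
print(scaled_dot_attention(q, k, v).shape)      # (2, 5, 8)
```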
## ELMO
[Deep contextualized word representations](https://arxiv.org/pdf/1802.05365.pdf)

ELMo [code](ELMo.py)
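
ELMo mixes the biLM's layer outputs with learned softmax-normalized weights, ELMo_k = γ Σ_j s_j h_{k,j}; a numpy sketch of that mixing step with random toy states:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# toy biLM output: 3 layers of hidden states for a 5-token sentence, dim 8
layers_h = np.random.randn(3, 5, 8)
s = softmax(np.random.randn(3))   # learned, softmax-normalized layer weights
gamma = 1.0                       # learned global scale

elmo = gamma * np.einsum("l,lsd->sd", s, layers_h)
print(elmo.shape)                 # (5, 8): one contextual vector per token
```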
## GPT
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)

GPT [code](GPT.py)
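
GPT is a left-to-right Transformer decoder; the defining detail is the causal mask that blocks attention to future positions. A short sketch of that mask:

```python
import numpy as np

seq_len = 5
# True where attention is allowed: position i may only see positions <= i
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
print(causal_mask.astype(int))
# [[1 0 0 0 0]
#  [1 1 0 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]
```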
## BERT
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)

BERT [code](BERT.py)
My new attempt: [BERT with window mask](BERT_window_mask.py)
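
BERT's masked-LM pre-training hides a random ~15% of tokens and trains the model to recover them. A toy masking step (simplified: the real recipe also swaps in random tokens 10% of the time and keeps the original 10% of the time, while this sketch only uses `[MASK]`):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]"):
    """Randomly mask tokens for BERT-style masked language modeling."""
    masked, targets = [], []
    for t in tokens:
        if random.random() < mask_rate:
            masked.append(mask_token)   # model must predict the original token
            targets.append(t)
        else:
            masked.append(t)
            targets.append(None)        # no loss at unmasked positions
    return masked, targets

random.seed(0)
print(mask_tokens("the cat sat on the mat".split()))
```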