{"id":13416176,"url":"https://github.com/makcedward/nlp","last_synced_at":"2025-04-12T15:32:54.500Z","repository":{"id":45775475,"uuid":"134008334","full_name":"makcedward/nlp","owner":"makcedward","description":":memo: This repository recorded my NLP journey.","archived":false,"fork":false,"pushed_at":"2020-08-29T04:04:59.000Z","size":2297,"stargazers_count":1077,"open_issues_count":10,"forks_count":326,"subscribers_count":49,"default_branch":"master","last_synced_at":"2025-04-03T14:12:50.960Z","etag":null,"topics":["ai","data-science","deep-learning","machine-learning","nlp"],"latest_commit_sha":null,"homepage":"https://makcedward.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/makcedward.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-05-18T22:05:56.000Z","updated_at":"2025-02-14T13:43:40.000Z","dependencies_parsed_at":"2022-08-28T11:51:06.763Z","dependency_job_id":null,"html_url":"https://github.com/makcedward/nlp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/makcedward%2Fnlp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/makcedward%2Fnlp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/makcedward%2Fnlp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/makcedward%2Fnlp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/makcedward","download_url":"https://codeload.github.com/makcedward/nlp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":
"https://github.com","kind":"github","repositories_count":248589865,"owners_count":21129692,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data-science","deep-learning","machine-learning","nlp"],"created_at":"2024-07-30T21:00:55.032Z","updated_at":"2025-04-12T15:32:54.469Z","avatar_url":"https://github.com/makcedward.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# NLP - Tutorial\r\nRepository to show how NLP can tacke real problem. Including the source code, dataset, state-of-the art in NLP\r\n\r\n## Data Augmentation\r\n*   [Data Augmentation in NLP](https://towardsdatascience.com/data-augmentation-in-nlp-2801a34dfc28)\r\n*   [Data Augmentation library for Text](https://towardsdatascience.com/data-augmentation-library-for-text-9661736b13ff)\r\n*   [Does your NLP model able to prevent adversarial attack?](https://medium.com/hackernoon/does-your-nlp-model-able-to-prevent-adversarial-attack-45b5ab75129c)\r\n*   [How does Data Noising Help to Improve your NLP Model?](https://medium.com/towards-artificial-intelligence/how-does-data-noising-help-to-improve-your-nlp-model-480619f9fb10)\r\n*   [Data Augmentation library for Speech Recognition](https://towardsdatascience.com/data-augmentation-for-speech-recognition-e7c607482e78)\r\n*   [Data Augmentation library for Audio](https://towardsdatascience.com/data-augmentation-for-audio-76912b01fdf6)\r\n*   [Unsupervied Data Augmentation](https://medium.com/towards-artificial-intelligence/unsupervised-data-augmentation-6760456db143)\r\n*   [Adversarial Attacks in Textual Deep 
Neural Networks](https://medium.com/towards-artificial-intelligence/adversarial-attacks-in-textual-deep-neural-networks-245dc90029df)\r\n*\t[Back Translation in Text Augmentation by nlpaug](https://medium.com/towards-artificial-intelligence/back-translation-in-text-augmentation-by-nlpaug-d65518dd092f)\r\n\r\n## General\r\n*\t[Tricks of Building an ML or DNN Model](https://medium.com/towards-artificial-intelligence/tricks-of-building-an-ml-or-dnn-model-b2de54cf440a)\r\n\r\n## Text Preprocessing\r\n| Section | Sub-Section | Description | Story |\r\n| --- | --- | --- | --- |\r\n| Tokenization | Subword Tokenization |  | [Medium](https://towardsdatascience.com/how-subword-helps-on-your-nlp-model-83dd1b836f46) |\r\n| Tokenization | Word Tokenization |  | [Medium](https://medium.com/@makcedward/nlp-pipeline-word-tokenization-part-1-4b2b547e6a3) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-word_tokenization.ipynb) |\r\n| Tokenization | Sentence Tokenization |  | [Medium](https://medium.com/@makcedward/nlp-pipeline-sentence-tokenization-part-6-86ed55b185e6) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-sentence_tokenization.ipynb) |\r\n| Part of Speech | | | [Medium](https://medium.com/@makcedward/nlp-pipeline-part-of-speech-part-2-b683c90e327d) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-part_of_speech.ipynb) |\r\n| Lemmatization | | | [Medium](https://medium.com/@makcedward/nlp-pipeline-lemmatization-part-3-4bfd7304957) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp_lemmatization.ipynb) |\r\n| Stemming | | | [Medium](https://medium.com/@makcedward/nlp-pipeline-stemming-part-4-b60a319fd52) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-stemming.ipynb) |\r\n| Stop Words | | | [Medium](https://medium.com/@makcedward/nlp-pipeline-stop-words-part-5-d6770df8a936) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-stop_words.ipynb) |\r\n| Phrase Word Recognition | 
|  |  |\r\n| Spell Checking | Lexicon-based | Peter Norvig algorithm | [Medium](https://towardsdatascience.com/correcting-your-spelling-error-with-4-operations-50bcfd519bb8) [Github](https://github.com/makcedward/nlp/blob/master/sample/util/nlp-util-spell_corrector.ipynb) |\r\n| | Lexicon-based | Symspell | [Medium](https://towardsdatascience.com/essential-text-correction-process-for-nlp-tasks-f731a025fcc3) [Github](https://github.com/makcedward/nlp/blob/master/sample/util/nlp-util-symspell.ipynb) |\r\n| | Machine Translation | Statistical Machine Translation | [Medium](https://towardsdatascience.com/correcting-text-input-by-machine-translation-and-classification-fa9d82087de1) |\r\n| | Machine Translation | Attention | [Medium](https://towardsdatascience.com/fix-your-text-thought-attention-before-nlp-tasks-7dc074b9744f) |\r\n| String Matching | Fuzzywuzzy | | [Medium](https://towardsdatascience.com/how-fuzzy-matching-improve-your-nlp-model-bc617385ad6b) [Github](https://github.com/makcedward/nlp/blob/master/sample/preprocessing/nlp-preprocessing-string_matching-fuzzywuzzy.ipynb) |\r\n\r\n## Text Representation\r\n| Section | Sub-Section | Research Lab | Story | Source |\r\n| --- | --- | --- | --- | --- |\r\n| Traditional Method | Bag-of-words (BoW) |  | [Medium](https://towardsdatascience.com/3-basic-approaches-in-bag-of-words-which-are-better-than-word-embeddings-c2cbc7398016) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-bag_of_words.ipynb) |  |\r\n|  | Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) |  | [Medium](https://towardsdatascience.com/2-latent-methods-for-dimension-reduction-and-topic-modeling-20ff6d7d547) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-lsa_lda.ipynb) |  |\r\n| Character Level | Character Embedding | NYU | [Medium](https://medium.com/@makcedward/besides-word-embedding-why-you-need-to-know-character-embedding-6096a34a3b10) 
[Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-character_embedding.ipynb) | [Paper](https://arxiv.org/pdf/1502.01710v5.pdf) |\r\n| Word Level | Negative Sampling and Hierarchical Softmax |  | [Medium](https://towardsdatascience.com/how-negative-sampling-work-on-word2vec-7bf8d545b116) |  |\r\n|  | Word2Vec, GloVe, fastText |  | [Medium](https://towardsdatascience.com/3-silver-bullets-of-word-embedding-in-nlp-10fa8f50cc5a) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-word_embedding.ipynb) |  |\r\n|  | Contextualized Word Vectors (CoVe) | Salesforce | [Medium](https://towardsdatascience.com/replacing-your-word-embeddings-by-contextualized-word-vectors-9508877ad65d) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-embeddings-word-cove.ipynb) | [Paper](http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf) [Code](https://github.com/salesforce/cove) |\r\n|  | Misspelling Oblivious (word) Embeddings | Facebook | [Medium](https://medium.com/towards-artificial-intelligence/new-model-for-word-embeddings-which-are-resilient-to-misspellings-moe-9ecfd3ab473e) | [Paper](https://arxiv.org/pdf/1905.09755.pdf) |\r\n|  | Embeddings from Language Models (ELMo) | AI2 | [Medium](https://towardsdatascience.com/elmo-helps-to-further-improve-your-word-embeddings-c6ed2c9df95f) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-embeddings-sentence-elmo.ipynb) | [Paper](https://arxiv.org/pdf/1802.05365.pdf) [Code](https://github.com/allenai/allennlp/) |\r\n|  | Contextual String Embeddings | Zalando Research | [Medium](https://towardsdatascience.com/contextual-embeddings-for-nlp-sequence-labeling-9a92ba5a6cf0) | [Paper](http://aclweb.org/anthology/C18-1139) [Code](https://github.com/zalandoresearch/flair)| \r\n| Sentence Level | Skip-thoughts |  | [Medium](https://towardsdatascience.com/transforming-text-to-sentence-embeddings-layer-via-some-thoughts-b77bed60822c) 
[Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-skip_thoughts.ipynb) | [Paper](https://arxiv.org/pdf/1506.06726) [Code](https://github.com/ryankiros/skip-thoughts) |\r\n|  | InferSent |  | [Medium](https://towardsdatascience.com/learning-sentence-embeddings-by-natural-language-inference-a50b4661a0b8) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-embeddings-sentence-infersent.ipynb) | [Paper](https://arxiv.org/pdf/1705.02364) [Code](https://github.com/facebookresearch/InferSent) |\r\n|  | Quick-Thoughts | Google | [Medium](https://towardsdatascience.com/building-sentence-embeddings-via-quick-thoughts-945484cae273) | [Paper](https://arxiv.org/pdf/1803.02893.pdf) [Code](https://github.com/lajanugen/S2V) |\r\n|  | General Purpose Sentence (GenSen) |  | [Medium](https://towardsdatascience.com/learning-generic-sentence-representation-by-various-nlp-tasks-df39ce4e81d7) | [Paper](https://arxiv.org/pdf/1804.00079.pdf) [Code](https://github.com/Maluuba/gensen) |\r\n|  | Bidirectional Encoder Representations from Transformers (BERT) | Google | [Medium](https://towardsdatascience.com/how-bert-leverage-attention-mechanism-and-transformer-to-learn-word-contextual-relations-5bbee1b6dbdb) | [Paper(2019)](https://arxiv.org/pdf/1810.04805) [Code](https://github.com/google-research/bert)| \r\n|  | Generative Pre-Training (GPT) | OpenAI | [Medium](https://towardsdatascience.com/combining-supervised-learning-and-unsupervised-learning-to-improve-word-vectors-d4dea84ec36b) | [Paper(2019)](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf) [Code](https://github.com/openai/finetune-transformer-lm)| \r\n|  | Self-Governing Neural Networks (SGNN) | Google | [Medium](https://towardsdatascience.com/embeddings-free-deep-learning-nlp-model-ce067c7a7c93) | [Paper](https://aclweb.org/anthology/D18-1105) | \r\n|  | Multi-Task Deep Neural Networks (MT-DNN) | Microsoft | 
[Medium](https://towardsdatascience.com/when-multi-task-learning-meet-with-bert-d1c49cc40a0c) | [Paper(2019)](https://arxiv.org/pdf/1901.11504.pdf) | \r\n|  | Generative Pre-Training-2 (GPT-2) | OpenAI | [Medium](https://towardsdatascience.com/too-powerful-nlp-model-generative-pre-training-2-4cc6afb6655) | [Paper(2019)](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) [Code](https://github.com/openai/gpt-2)| \r\n|  | Universal Language Model Fine-tuning (ULMFiT) | fast.ai | [Medium](https://towardsdatascience.com/multi-task-learning-in-language-model-for-text-classification-c3acc1fedd89) | [Paper](https://arxiv.org/pdf/1801.06146.pdf) [Code](https://github.com/fastai/fastai)| \r\n|  | BERT in Science Domain |  | [Medium](https://towardsdatascience.com/how-to-apply-bert-in-scientific-domain-2d9db0480bd9) | [Paper(2019)](https://arxiv.org/pdf/1903.10676.pdf) [Paper(2019)](https://arxiv.org/pdf/1901.08746.pdf)| \r\n|  | BERT in Clinical Domain | NYU/PU | [Medium](https://towardsdatascience.com/how-do-they-apply-bert-in-the-clinical-domain-49113a51be50) | [Paper(2019)](https://arxiv.org/pdf/1904.03323.pdf) [Paper(2019)](https://arxiv.org/pdf/1904.05342.pdf)| \r\n|  | RoBERTa | UW/Facebook | [Medium](https://medium.com/towards-artificial-intelligence/a-robustly-optimized-bert-pretraining-approach-f6b6e537e6a6) | [Paper(2019)](https://arxiv.org/pdf/1907.11692.pdf)| \r\n|  | Unified Language Model for NLU and NLG (UNILM) | Microsoft | [Medium](https://medium.com/towards-artificial-intelligence/unified-language-model-pre-training-for-natural-language-understanding-and-generation-f87dc226aa2) | [Paper(2019)](https://arxiv.org/pdf/1905.03197.pdf)| \r\n|  | Cross-lingual Language Model (XLMs) | Facebook | [Medium](https://medium.com/towards-artificial-intelligence/cross-lingual-language-model-56a65dba9358) | [Paper(2019)](https://arxiv.org/pdf/1901.07291.pdf)| \r\n| 
 | Transformer-XL | CMU/Google | [Medium](https://medium.com/towards-artificial-intelligence/address-limitation-of-rnn-in-nlp-problems-by-using-transformer-xl-866d7ce1c8f4) | [Paper(2019)](https://arxiv.org/pdf/1901.02860.pdf)| \r\n|  | XLNet | CMU/Google | [Medium](https://medium.com/dataseries/why-does-xlnet-outperform-bert-da98a8503d5b) | [Paper(2019)](https://arxiv.org/pdf/1906.08237.pdf)| \r\n|  | CTRL | Salesforce | [Medium](https://medium.com/dataseries/a-controllable-framework-for-text-generation-8be9e1f2c5db) | [Paper(2019)](https://arxiv.org/pdf/1909.05858.pdf)|\r\n|  | ALBERT | Google/Toyota | [Medium](https://medium.com/towards-artificial-intelligence/a-lite-bert-for-reducing-inference-time-bed8d990daac) | [Paper(2019)](https://arxiv.org/pdf/1909.11942.pdf)|\r\n|  | T5 | Google | [Medium](https://medium.com/dataseries/text-to-text-transfer-transformer-e35dc28bae14) | [Paper(2019)](https://arxiv.org/pdf/1910.10683.pdf)|\r\n|  | MultiFiT |   | [Medium](https://medium.com/towards-artificial-intelligence/multi-lingual-language-model-fine-tuning-81922a80438f) | [Paper(2019)](https://arxiv.org/pdf/1909.04761.pdf) |\r\n|  | XTREME |   | [Medium](https://medium.com/towards-artificial-intelligence/new-multilingual-model-xtreme-276bbaa26d79) | [Paper(2020)](https://arxiv.org/pdf/2003.11080.pdf) |\r\n|  | REALM |   | [Medium](https://medium.com/towards-artificial-intelligence/realm-retrieval-augmented-language-model-pre-training-534feae7ab98) | [Paper(2020)](https://arxiv.org/pdf/2002.08909.pdf) |\r\n| Document Level | lda2vec |  | [Medium](https://towardsdatascience.com/combing-lda-and-word-embeddings-for-topic-modeling-fe4a1315a5b4) | [Paper](https://arxiv.org/pdf/1605.02019.pdf) |\r\n|  | doc2vec | Google | [Medium](https://towardsdatascience.com/understand-how-to-transfer-your-paragraph-to-vector-by-doc2vec-1e225ccf102) [Github](https://github.com/makcedward/nlp/blob/master/sample/embeddings/nlp-embeddings-document-doc2vec.ipynb) | 
[Paper](https://arxiv.org/pdf/1405.4053.pdf) |\r\n\r\n## NLP Problem \r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| Named Entity Recognition (NER) | Pattern-based Recognition | | | [Medium](https://towardsdatascience.com/pattern-based-recognition-did-help-in-nlp-5c54b4e7a962)  |  |\r\n| | Lexicon-based Recognition | | | [Medium](https://towardsdatascience.com/step-out-from-regular-expression-for-feature-engineering-134e594f542c) |  |\r\n| | spaCy Pre-trained NER | | | [Medium](https://medium.com/@makcedward/named-entity-recognition-3fad3f53c91e) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-named_entity_recognition.ipynb) |  |\r\n| Optical Character Recognition (OCR) | Printed Text | Google Cloud Vision API | Google | [Medium](https://towardsdatascience.com/secret-of-google-web-based-ocr-service-fe30eecedd01) | [Paper](https://das2018.cvl.tuwien.ac.at/media/filer_public/85/fd/85fd4698-040f-45f4-8fcc-56d66533b82d/das2018_short_papers.pdf) |\r\n| | Handwriting | LSTM | Google | [Medium](https://towardsdatascience.com/lstm-based-handwriting-recognition-by-google-eb99663ca6de) | [Paper](https://arxiv.org/pdf/1902.10525.pdf) | \r\n| Text Summarization | Extractive Approach | | | [Medium](https://medium.com/@makcedward/text-summarization-extractive-approach-567fe4b85c23) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-text_summarization_extractive.ipynb) | |\r\n| | Abstractive Approach |  |  | [Medium](https://medium.com/dataseries/summarize-document-by-combing-extractive-and-abstractive-steps-40295310526) | \r\n| Emotion Recognition | Audio, Text, Visual | 3 Multimodals for Emotion Recognition |  | [Medium](https://becominghuman.ai/multimodal-for-emotion-recognition-21df267fddc4) |\r\n\r\n## Acoustic Problem\r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| Feature 
Representation | Unsupervised Learning| Introduction to Audio Feature Learning | |  [Medium](https://medium.com/hackernoon/how-can-you-apply-unsupervised-learning-on-audio-data-be95153c5860) | [Paper 1](https://ai.stanford.edu/~ang/papers/nips09-AudioConvolutionalDBN.pdf) [Paper 2](https://arxiv.org/pdf/1607.03681.pdf) [Paper 3](https://arxiv.org/pdf/1712.03835.pdf)\r\n| Feature Representation | Unsupervised Learning| Speech2Vec and Sentence Level Embeddings | |  [Medium](https://medium.com/towards-artificial-intelligence/two-ways-to-learn-audio-embeddings-9dfcaab10ba6) | [Paper 1](https://arxiv.org/pdf/1803.08976.pdf) [Paper 2](https://arxiv.org/pdf/1902.07817.pdf)\r\n| Feature Representation | Unsupervised Learning| Wav2vec | |  [Medium](https://becominghuman.ai/unsupervised-pre-training-for-speech-recognition-wav2vec-aba643824324) | [Paper](https://arxiv.org/pdf/1904.05862.pdf)\r\n| Speech-to-text | | Introduction to Speech-to-text | |  [Medium](https://becominghuman.ai/how-does-your-assistant-device-work-based-on-text-to-speech-technology-5f31e56eae7e) |\r\n\r\n## Text Distance Measurement\r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| Euclidean Distance, Cosine Similarity and Jaccard Similarity |  |  |  | [Medium](https://towardsdatascience.com/3-basic-distance-measurement-in-text-mining-5852becff1d7) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-3_basic_distance_measurement_in_text_mining.ipynb) |  |\r\n| Edit Distance | Levenshtein Distance |  |  | [Medium](https://towardsdatascience.com/measure-distance-between-2-words-by-simple-calculation-a97cf4993305) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-distance-edit_distance.ipynb) |  |\r\n| Word Mover's Distance (WMD) |  |  |  | [Medium](https://towardsdatascience.com/word-distance-between-word-embeddings-cc3e9cf1d632) 
[Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-word_mover_distance.ipynb) |\r\n| Supervised Word Mover's Distance (S-WMD) |  |  |  | [Medium](https://towardsdatascience.com/word-distance-between-word-embeddings-with-weight-bf02869c50e1)|\r\n| Manhattan LSTM |  |  |  | [Medium](https://towardsdatascience.com/text-matching-with-deep-learning-e6aa05333399) | [Paper](http://www.mit.edu/~jonasm/info/MuellerThyagarajan_AAAI16.pdf) |\r\n\r\n## Model Interpretation\r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| ELI5, LIME and Skater |  |  |  | [Medium](https://towardsdatascience.com/3-ways-to-interpretate-your-nlp-model-to-management-and-customer-5428bc07ce15) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-model_interpretation.ipynb) |\r\n| SHapley Additive exPlanations (SHAP) |  |  |  | [Medium](https://towardsdatascience.com/interpreting-your-deep-learning-model-by-shap-e69be2b47893) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-model_interpretation_shap.ipynb) |\r\n| Anchors |  |  |  | [Medium](https://towardsdatascience.com/anchor-your-model-interpretation-by-anchors-aa4ed7104032) [Github](https://github.com/makcedward/nlp/blob/master/sample/nlp-model_interpretation_anchor.ipynb) |\r\n\r\n## Graph\r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| Embeddings | | TransE, RESCAL, DistMult, ComplEx, PyTorch BigGraph | |  [Medium](https://medium.com/towards-artificial-intelligence/a-gentle-introduction-to-graph-embeddings-c7b3d1db0fa8) | [RESCAL(2011)](https://pdfs.semanticscholar.org/68a3/3a3afac65eb6e0fb3726c1f9c8b727f32a42.pdf?_ga=2.21151099.1397092755.1575835510-317581445.1533093975) [TransE(2013)](https://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf) [DistMult(2015)](https://arxiv.org/pdf/1412.6575v4.pdf) 
[ComplEx(2016)](https://arxiv.org/pdf/1606.06357.pdf) [PyTorch BigGraph(2019)](https://arxiv.org/pdf/1903.12287.pdf)\r\n| Embeddings | | DeepWalk, node2vec, LINE, GraphSAGE | |  [Medium](https://medium.com/towards-artificial-intelligence/random-walk-in-node-embeddings-deepwalk-node2vec-line-and-graphsage-ca23df60e493) | [DeepWalk(2014)](https://arxiv.org/pdf/1403.6652.pdf) [node2vec(2015)](https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf) [LINE(2015)](https://arxiv.org/pdf/1503.03578.pdf) [GraphSAGE(2018)](https://arxiv.org/pdf/1706.02216.pdf)\r\n| Embeddings | | WLG, GCN, GAT, GIN | |  [Medium](https://medium.com/towards-artificial-intelligence/4-graph-neural-networks-you-need-to-know-wlg-gcn-gat-gin-1bf10d29d836) | [WLG(2011)](http://www.jmlr.org/papers/volume12/shervashidze11a/shervashidze11a.pdf) [GCN(2017)](https://arxiv.org/pdf/1609.02907.pdf) [GAT(2017)](https://arxiv.org/pdf/1710.10903.pdf) [GIN(2018)](https://arxiv.org/pdf/1810.00826.pdf)\r\n| Embeddings | | [PinSAGE(2018)](https://arxiv.org/pdf/1806.01973.pdf) | Pinterest | [Medium](https://medium.com/towards-artificial-intelligence/when-graphsage-meets-pinterest-5e82c9a88120)\r\n| Embeddings | | [HolE(2015)](https://arxiv.org/pdf/1510.04935.pdf), [SimplE(2018)](https://arxiv.org/pdf/1802.04868.pdf) | | [Medium](https://medium.com/towards-artificial-intelligence/knowledge-graph-embeddings-dc9251bffa80)\r\n| Embeddings | | [ContE(2017)](http://repository.ittelkom-pwt.ac.id/4358/1/Learning%20Contextual%20Embeddings%20for%20Knowledge%20Graph%20Completion.pdf), [ETE(2017)](https://persagen.com/files/misc/Moon2017Learning.pdf) | | [Medium](https://medium.com/towards-artificial-intelligence/from-conte-to-entity-type-embeddings-in-natural-language-processing-19e53db90dd5)\r\n\r\n## Meta-Learning\r\n| Section | Sub-Section | Description | Story |\r\n| --- | --- | --- | --- |\r\n| Introduction |  | [Matching Nets(2016)](https://arxiv.org/pdf/1606.04080.pdf) 
[MANN(2016)](https://arxiv.org/pdf/1605.06065.pdf) [LSTM-based meta-learner(2017)](https://openreview.net/pdf?id=rJY0-Kcll) [Prototypical Networks(2017)](https://arxiv.org/pdf/1703.05175.pdf) [ARC(2017)](https://arxiv.org/pdf/1703.00767.pdf) [MAML(2017)](https://arxiv.org/pdf/1703.03400.pdf) [MetaNet(2017)](https://arxiv.org/pdf/1703.00837.pdf) | [Medium](https://medium.com/towards-artificial-intelligence/a-gentle-introduction-to-meta-learning-8e36f3d93f61)  |\r\n| NLP | Dialog Generation | [DAML(2019)](https://arxiv.org/pdf/1906.03520.pdf), [PAML(2019)](https://arxiv.org/pdf/1905.10033.pdf), [NTMS(2019)](https://arxiv.org/pdf/1910.10487.pdf) | [Medium](https://medium.com/towards-artificial-intelligence/meta-learning-in-dialog-generation-41367e397086)\r\n| | Classification | [Intent Embeddings(2016)](https://www.csie.ntu.edu.tw/~yvchen/doc/ICASSP16_ZeroShot.pdf) [LEOPARD(2019)](https://arxiv.org/pdf/1911.03863.pdf) | [Medium](https://medium.com/towards-artificial-intelligence/meta-learning-in-nlp-classification-db78fbcdf15c)\r\n| CV | Unsupervised Learning | [CACTUs(2018)](https://arxiv.org/pdf/1810.02334.pdf) | [Medium](https://medium.com/dataseries/unsupervised-learning-in-meta-learning-f71c549e2ae2)\r\n| General | | [Siamese Network(1994)](http://papers.nips.cc/paper/769-signature-verification-using-a-siamese-time-delay-neural-network.pdf), [Triplet Network(2015)](https://arxiv.org/pdf/1412.6622.pdf) | [Medium](https://medium.com/towards-artificial-intelligence/how-do-twins-and-triplet-neural-network-work-cfed66d9b829)\r\n| | | [MAML++(2018)](https://arxiv.org/pdf/1810.09502.pdf) | [Medium](https://medium.com/towards-artificial-intelligence/from-maml-to-maml-20de07203d59)\r\n\r\n## Image\r\n| Section | Sub-Section | Description | Research Lab | Story | Paper \u0026 Code |\r\n| --- | --- | --- | --- | --- | --- |\r\n| Object Detection |  | R-CNN | |  [Medium](https://medium.com/towards-artificial-intelligence/how-r-cnn-works-on-object-detection-443679b0187c) | 
[Paper(2013)](https://arxiv.org/pdf/1311.2524.pdf)\r\n| Object Detection |  | Fast R-CNN | |  [Medium](https://medium.com/dataseries/how-fast-r-cnn-works-on-object-detection-546e4812eaa1) | [Paper(2015)](https://arxiv.org/pdf/1504.08083.pdf)\r\n| Object Detection |  | Faster R-CNN | |  [Medium](https://becominghuman.ai/how-faster-r-cnn-works-on-object-detection-3d92432ce321) | [Paper(2015)](https://arxiv.org/pdf/1506.01497.pdf)\r\n| Object Detection |  | VGGNet | |  [Medium](https://becominghuman.ai/what-is-the-vgg-neural-network-a590caa72643) | [Paper(2014)](https://arxiv.org/pdf/1409.1556.pdf)\r\n| Instance Segmentation | | Mask R-CNN | FAIR | [Medium](https://medium.com/dataseries/mask-r-cnn-for-instance-segmentation-7f0708e3e25b) | [Paper(2017)](https://arxiv.org/pdf/1703.06870.pdf) | \r\n| Image Classification |  | [ResNet(2015)](https://arxiv.org/pdf/1512.03385.pdf) |Microsoft |  [Medium](https://medium.com/dataseries/how-does-resnet-improve-performance-caaa436f885b)  |\r\n| Image Classification |  | [ResNeXt(2016)](https://arxiv.org/pdf/1611.05431.pdf) | |  [Medium](https://medium.com/dataseries/enhancing-resnet-to-resnext-for-image-classification-3449f62a774c)  |\r\n\r\n## Evaluation\r\n| Section | Sub-Section | Description | Story |\r\n| --- | --- | --- | --- |\r\n| Introduction | | | [Medium](https://medium.com/towards-artificial-intelligence/evaluation-metrics-are-what-you-need-to-define-in-the-earlier-stage-99dbfae51472)\r\n| Classification | | Confusion Matrix, ROC, AUC | [Medium](https://medium.com/towards-artificial-intelligence/evaluation-metrics-for-classification-problems-e7442092bc5)\r\n| Regression | | MAE, MSE, RMSE, MAPE, WMAPE | [Medium](https://medium.com/towards-artificial-intelligence/evaluation-metrics-for-regression-problems-fff2ac8e3f43)\r\n| Textual | | Perplexity, BLEU, GER, WER, GLUE | [Medium](https://medium.com/towards-artificial-intelligence/evaluation-metrics-for-textual-problems-6e881feef5ad)\r\n\r\n## Source Code \r\n| Section 
| Sub-Section | Description | Link |\r\n| --- | --- | --- | --- |\r\n| Spellcheck |  |  | [Github](https://github.com/norvig/pytudes) |\r\n| InferSent |  |  | [Github](https://github.com/facebookresearch/InferSent) |\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmakcedward%2Fnlp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmakcedward%2Fnlp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmakcedward%2Fnlp/lists"}