Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https-github.com-keon-awesome-nlp
awesome NLP
https://github.com/manhcuogntin4/https-github.com-keon-awesome-nlp
- Seq2Seq
- Lecture Note
- Michael Collins - one of the best NLP teachers. Check out the material on the courses he is teaching.
- tutorials by Radim Řehůřek
- Natural Language Processing in Action - A guide to creating machines that understand human language.
- Intro to Natural Language Processing
- Intro to Artificial Intelligence
- Deep Learning for Natural Language Processing (2015 classes)
- Deep Learning for Natural Language Processing (2016 classes)
- Natural Language Processing - Coursera course that was offered only in 2013. The videos are not currently available. Mike Collins is a great professor, and his notes and lectures are very good.
- Statistical Machine Translation - a Machine Translation course with great assignments and slides.
- Natural Language Processing SFU (link is broken) - course by [Prof Anoop Sarkar](https://www.cs.sfu.ca/~anoop/) on Natural Language Processing. Good notes and some good lectures on YouTube about HMM.
- Udacity Deep Learning
- NLTK with Python 3 for Natural Language Processing
- Computational Linguistics I - Graber. Lectures from the University of Maryland.
- Natural Language Processing - Stanford
- Deep Natural Language Processing
- Stanford CS 224D: Deep Learning for NLP class
- Richard Socher
- A Primer on Neural Network Models for Natural Language Processing
- Pre-trained word embeddings for WSJ corpus - Lab
- Word2vec
- HLBL language model
- Real-valued vector "embeddings"
- Improving Word Representations Via Global Context And Multiple Word Prototypes
- Dependency based word embeddings
- Global Vectors for Word Representations
- TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
- Twitter-text - A JavaScript implementation of Twitter's text processing library
- Knwl.js - A Natural Language Processor in JS
- Retext - Extensible system for analyzing and manipulating natural language
- NLP Compromise - Natural Language processing in the browser
- Natural - general natural language facilities for node
- fastText by Facebook - for efficient learning of word representations and sentence classification
- Scikit-learn: Machine learning in Python
- Natural Language Toolkit (NLTK)
- Pattern - A web mining module for the Python programming language. It includes tools for natural language processing and machine learning, among others.
- TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
- YAlign - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora.
- jieba - Chinese Words Segmentation Utilities.
- SnowNLP - A library for processing Chinese text.
- KoNLPy - A Python package for Korean natural language processing.
- Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
- BLLIP Parser - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
- PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for [FoLiA](http://proycon.github.io/folia/), but also ARPA language models, Moses phrasetables, GIZA++ alignments.
- python-ucto - Python binding to ucto (a unicode-aware rule-based tokenizer for various languages)
- Parserator - A toolkit for making domain-specific probabilistic parsers
- python-frog - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
- python-zpar - Python bindings for [ZPar](https://github.com/frcchang/zpar), a statistical part-of-speech tagger, constituency parser, and dependency parser for English.
- colibri-core - C++ library, command line tools, and Python binding for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
- spaCy - Industrial strength NLP with Python and Cython.
- textacy - Higher level NLP built on spaCy
- PyStanfordDependencies - Python interface for converting Penn Treebank trees to Stanford Dependencies.
- gensim - Python library to conduct unsupervised semantic modelling from plain text
- scattertext - Python library to produce d3 visualizations of how language differs between corpora.
- CogComp-NlPy - Light-weight Python NLP annotators.
- PyThaiNLP - Thai NLP in Python Package.
- jPTDP - A toolkit for joint part-of-speech (POS) tagging and dependency parsing. jPTDP provides pre-trained models for 40+ languages.
- CLTK
- pymorphy2 - a good POS tagger for Russian
- BigARTM - a fast library for topic modelling
- AllenNLP - An NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.
- MIT Information Extraction Toolkit - C, C++, and Python tools for named entity recognition and relation extraction
- CRF++ - Open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data & other Natural Language Processing tasks.
- CRFsuite - CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.
- ucto - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
- libfolia - C++ library for the [FoLiA format](http://proycon.github.io/folia/)
- frog - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
- MeTA - [MeTA : ModErn Text Analysis](https://meta-toolkit.org/) is a C++ Data Sciences Toolkit that facilitates mining big text data.
- Mecab (Japanese)
- Mecab (Korean)
- Moses
- StarSpace - a library from Facebook for creating word-level, paragraph-level, and document-level embeddings and for text classification
- Stanford NLP
- OpenNLP
- ClearNLP
- Word2vec in Java
- ReVerb - Web-Scale Open Information Extraction
- OpenRegex - An efficient and flexible token-based regular expression language and engine.
- CogcompNLP - Core libraries developed in the U of Illinois' Cognitive Computation Group.
- MALLET - MAchine Learning for LanguagE Toolkit - package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
- RDRPOSTagger - A robust POS tagging toolkit available (in both Java & Python) together with pre-trained models for 40+ languages.
- Saul - Library for developing NLP systems, including built in modules like SRL, POS, etc.
- ATR4S - Toolkit with state-of-the-art [automatic term recognition](https://en.wikipedia.org/wiki/Terminology_extraction) methods.
- tm - Implementation of topic modeling based on regularized multilingual [PLSA](https://en.wikipedia.org/wiki/Probabilistic_latent_semantic_analysis).
- word2vec-scala - Scala interface to word2vec model; includes operations on vectors like word-distance and word-analogy.
- Epic - Epic is a high performance statistical parser written in Scala, along with a framework for building complex structured prediction models.
- text2vec - Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
- wordVectors - An R package for creating and exploring word2vec and other word embedding models
- RMallet - R package to interface with the Java machine learning tool MALLET
- dfr-browser - Creates d3 visualizations for browsing topic models of text in a web browser.
- dfrtopics - R package for exploring topic models of text.
- sentiment_classifier - Sentiment Classification using Word Sense Disambiguation and WordNet Reader
- jProcessing - Japanese Natural Language Processing Libraries, with Japanese sentiment classification
- Clojure-openNLP - Natural Language Processing in Clojure (opennlp)
- Inflections-clj - Rails-like inflection library for Clojure and ClojureScript
- postagga - A library to parse natural language in Clojure and ClojureScript
- A collection of Natural Language Processing (NLP) Ruby libraries, tools and software
- Practical Natural Language Processing done in Ruby
- whatlang
- Wit-ai - Natural Language Interface for apps and devices
- Natural Language Understanding - Natural Language Classifier and [Machine Translation](https://github.com/watson-developer-cloud/language-translator-nodejs) API demos
- Deep Learning for Web Search and Natural Language Processing
- Probabilistic topic models
- Natural language processing: an introduction
- A unified architecture for natural language processing: Deep neural networks with multitask learning
- A Critical Review of Recurrent Neural Networks for Sequence Learning
- Deep parsing in Watson
- Online named entity recognition method for microtexts in social networking services: A case study of twitter
- Efficient Estimation of Word Representations in Vector Space
- Mikolov
- Word2Vec source code
- Word2Vec tutorial
- Deep Learning, NLP, and Representations
- GloVe: Global vectors for word representation
- Evaluation section led to controversy
- GloVe source code and training data
- word2vec - on creating vectors to represent language, useful for RNN inputs
- sense2vec - on word sense disambiguation
- Infinite Dimensional Word Embeddings - new
- Skip Thought Vectors - word representation method
- Adaptive skip-gram - similar approach, with adaptive properties
- Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
- Distributed Representations of Sentences and Documents
- Le
- gensim - doc2vec tutorial
- Deep Recursive Neural Networks for Compositionality in Language
- Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
- Semi-supervised Sequence Learning
- Neural Machine Translation by jointly learning to align and translate
- English to French Demo
- Sequence to Sequence Learning with Neural Networks
- NIPS presentation
- seq2seq tutorial
- Cross-lingual Pseudo-Projected Expectation Regularization for Weakly Supervised Learning
- Generating Chinese Named Entity Data from a Parallel Corpus
- IXA pipeline: Efficient and Ready to Use Multilingual NLP tools
- A Neural Network Approach to Context-Sensitive Generation of Conversational Responses
- Neural Responding Machine for Short-Text Conversation
- A Neural Conversation Model
- Le
- Reasoning, Attention and Memory (RAM) workshop at NIPS 2015, slides included
- Memory Networks
- End-To-End Memory Networks
- MemNN
- Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
- Evaluating prerequisite qualities for learning end to end dialog systems
- Jason Weston lecture on MemNN
- Neural Turing Machines
- Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
- Stack RNN source code
- Neural autoencoder for paragraphs and documents - LSTM representation
- LSTM over tree structures
- Sequence to Sequence Learning - word vectors for machine translation
- Teaching Machines to Read and Comprehend - DeepMind paper
- Improving distributional similarity with lessons learned from word embeddings
- Low-Dimensional Embeddings of Logic
- based on this paper
- Markov Logic Networks for Natural Language Question Answering
- Distant Supervision for Cancer Pathway Extraction From Text
- Privee: An Architecture for Automatically Analyzing Web Privacy Policies
- A Neural Probabilistic Language Model
- Template-Based Information Extraction without the Templates
- Retrofitting word vectors to semantic lexicons
- Unsupervised Learning of the Morphology of a Natural Language
- Natural Language Processing (Almost) from Scratch
- Computational Grounded Cognition: a new alliance between grounded cognition and computational modelling
- Learning the Structure of Biomedical Relation Extractions
- Relation extraction with matrix factorization and universal schemas
- A survey of named entity recognition and classification
- Benchmarking the extraction and disambiguation of named entities on the semantic web
- Knowledge base population: Successful approaches and challenges
- SpeedRead: A fast named entity recognition Pipeline
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Statistical Language Models based on Neural Networks
- Slides from Google Talk
- DrQA: Open Domain Question Answering
- Word2Vec
- Relation Extraction with Matrix Factorization and Universal Schemas
- Towards a Formal Distributional Semantics: Simulating Logical Calculi with Tensors
- Presentation slides for MLN tutorial
- Presentation slides for QA applications of MLNs
- Presentation slides
- Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations
- NLP Tutorial
- Natural Language Processing Blog
- Machine Learning Blog
- Understand & Implement Natural Language Processing
- AI Playbook
- ai-reading-list
- nlp-reading-group
- awesome-spanish-nlp
- jjangsangy's awesome-nlp
- awesome-machine-learning
- DL4NLP
Keywords
- nlp: 19
- natural-language-processing: 15
- python: 10
- machine-learning: 8
- computational-linguistics: 7
- c-plus-plus: 5
- text-processing: 5
- folia: 5
- pos-tagger: 4
- nlp-library: 4
- text-mining: 4
- ai: 3
- word-embeddings: 3
- pos-tagging: 3
- library: 3
- topic-modeling: 3
- java: 3
- nlp-parsing: 2
- parser: 2
- syntax: 2
- spacy: 2
- dependency-parsing: 2
- natural-language-understanding: 2
- text-classification: 2
- named-entity-recognition: 2
- text-analysis: 2
- dependency-parser: 2
- data-mining: 2
- word2vec: 2
- pos-tag: 2
- sentiment-analysis: 2
- word-sense-disambiguation: 2
- wsd: 2
- deep-learning: 2
- part-of-speech-tagger: 2
- tokenizer: 2
- linguistics: 2
- language: 2
- ruby: 2
- text-visualization: 1
- bigartm: 1
- stanza: 1
- nltk: 1
- visualization: 1
- ling: 1
- latin: 1
- historical-linguistics: 1
- greek: 1
- lstm: 1
- oxford: 1