Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources,
- Last updated: 2024-11-15 00:20:20 UTC
https://github.com/fractalego/pynsett
A programmable relation extraction tool
extract-relationships nlp relation-extraction spacy wikidata-knowledge
Last synced: 12 Oct 2024
https://github.com/ucrel/pymusas
Python Multilingual Ucrel Semantic Analysis System
natural-language-processing nlp python spacy spacy-pipeline
Last synced: 12 Oct 2024
https://github.com/dbklim/stressrnn
Modified version of RusStress (https://github.com/MashaPo/russtress), a Python package for placing stress in Russian text using an RNN (BiLSTM) and the "Grammatical Dictionary" by A. A. Zaliznyak (from http://odict.ru/).
accent bilstm emphasis linguistic linguistics lstm nlp rnn russian russian-accent russian-stress russtress rustress stress
Last synced: 11 Nov 2024
https://github.com/ayaka14732/bart-base-jax
JAX implementation of the bart-base model
bart jax natural-language-processing nlp nlp-model
Last synced: 28 Oct 2024
https://github.com/hscspring/pnlp
NLP pre-/post-processing tools.
chinese-nlp concurrency nlp nlp-enhancer nlp-preprocess normalization preprocessing text-cleaning text-extraction text-length text-processing
Last synced: 26 Oct 2024
https://github.com/proycon/python-ucto
This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
computational-linguistics folia nlp nlp-library python text-processing tokenizer
Last synced: 14 Nov 2024
https://github.com/97k/spam-ham-web-app
A web app that classifies text as spam or ham. It uses my own ML algorithm in the backend; the code for it can be found under machine_learning_section.
bag-of-words data-visualization django heroku-deployment jupyter-notebook machine-learning machine-learning-projects multinomial-naive-bayes nlp nltk spam-classification text-classification tfidf
Last synced: 11 Nov 2024
https://github.com/anthonysigogne/web-search-engine-ui
UI - a simple web search engine
elasticsearch google-search indexing nlp python search-engine
Last synced: 12 Nov 2024
https://github.com/ademakdogan/gpterm
Creating Intelligent Terminal Apps with ChatGPT and LLM Models
chatgpt chatgpt-api iterm2 langchain langchain-python natural-language-processing nlp python query-generator terminal
Last synced: 07 Nov 2024
https://github.com/benjaminvdb/DBRD
110k Dutch Book Reviews Dataset for Sentiment Analysis
dataset dataset-creation dutch nlp nlp-machine-learning python python3 scraped-data scraper
Last synced: 03 Aug 2024
https://github.com/sedthh/lara-hungarian-nlp
NLP class for rapid ChatBot development in Hungarian language
chatbot hungarian hungarian-language lemmatizer nlp python3 stemmer
Last synced: 03 Aug 2024
https://github.com/dsdanielpark/gpt2-bert-medical-qa-chat
Medical domain-focused GPT-2 fine-tuning, optimization, and lightweighting research repository (compared to GPT-4).
bert chatgpt gpt2 gpt4 medical-chatbot natural-language-processing nlp nlp-keywords-extraction
Last synced: 14 Nov 2024
https://github.com/akosbalasko/obsidian-autotagger-plugin
This plugin offers smart tags for notes by performing Named Entity Recognition (NER) on the content
natural-language-processing nlp obsidian-md obsidian-plugin
Last synced: 22 Oct 2024
https://github.com/Qznan/QizNLP
A TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP, with distributed support)
beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow
Last synced: 03 Aug 2024
https://github.com/maxent-ai/lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
chainer deep-learning embeddings lda nlp python3 sklearn text text-mining topic-modeling word-embeddings word2vec
Last synced: 30 Sep 2024
https://github.com/nschneid/amr-hackathon
Abstract Meaning Representation (AMR) Hackathon
abstract-meaning-representation computational-linguistics natural-language-processing nlp python semantics
Last synced: 08 Nov 2024
https://github.com/shashwath94/hierarchical-seq2seq
A PyTorch implementation of the hierarchical encoder-decoder architecture (HRED) introduced in Sordoni et al. (2015) for modeling conversation triples. This version of the model is built for the MovieTriples dataset.
deep-learning hred nlp pytorch seq2seq-pytorch
Last synced: 27 Oct 2024
https://github.com/bloomberg/entsum
Open Source / ENTSUM: A Data Set for Entity-Centric Extractive Summarization
Last synced: 09 Nov 2024
https://github.com/ariya/tinker-chat
chatbot generative-ai gpt llama llama2 llm mistral nlp openai
Last synced: 01 Nov 2024
https://github.com/trainingbypackt/deep-learning-for-natural-language-processing
Solve your natural language processing problems with smart deep neural networks
deeplearning glove gru keras language lstm namedentityrecognizer natural nlp nlp-library nlp-machine-learning partsofspeechtagger textpreprocessing word2vec
Last synced: 14 Nov 2024
https://github.com/griptape-ai/griptape-tools
Tools for the Griptape Framework.
ai cohere gpt huggingface llm nlp openai python
Last synced: 27 Sep 2024
https://github.com/rosette-api/rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch
categorization elasticsearch elasticsearch-plugin entity-extraction fuzzy-name-matching fuzzy-search identity-resolution machine-learning named-entity-recognition natural-language-processing nlp rosette-plugin sentiment-analysis text-analytics text-mining
Last synced: 27 Oct 2024
https://github.com/vaibhavs10/10_days_of_deep_learning
10 days, 10 different practical applications of deep learning (primarily NLP) using TensorFlow and Keras
classification gensim keras nlp python tensorflow tfidf-matrix
Last synced: 02 Nov 2024
https://github.com/wannaphong/laonlp
Lao language NLP
hacktoberfest lao lao-language natural-language-processing nlp nlp-library python
Last synced: 14 Nov 2024
https://github.com/saidziani/feedny
The Internet plays an increasingly important part in our daily lives as a source of written content for news and leisure. Yet it is tedious and difficult to sort through this staggering flow of information and stay updated with changes in our world, even using automated tools. Reading magazines and newspapers is too time-consuming, and there is a huge amount of online content that is updated or generated each minute. Our solution considers each user’s interests and leverages Artificial Intelligence, Machine Learning and Natural Language Processing in order to suggest relevant articles from the internet.
automatic-summarization javascript machine-learning machine-translation natural-language-processing nlp profiling react-native recommendation-system text-classification
Last synced: 28 Oct 2024
https://github.com/adapter-hub/efficient-task-transfer
Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021
adapters bert nlp roberta transfer-learning transformers
Last synced: 06 Nov 2024
https://github.com/tianduowang/diffaug
EMNLP 2022: Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536
data-augmentation nlp sentence-embeddings
Last synced: 14 Oct 2024
https://github.com/ramtinms/tokenquery
TokenQuery (regular expressions over tokens)
machine-learning natural-language-processing nlp regex regular-expressions
Last synced: 11 Nov 2024
https://github.com/yuyuzha0/word2vec
A word2vec implementation for the Chinese language, based on deeplearning4j and ansj
chinese java nlp word2vec word2vec-zh
Last synced: 12 Nov 2024
https://github.com/zimmerrol/attention-is-all-you-need-keras
Implementation of the Transformer architecture described by Vaswani et al. in "Attention Is All You Need"
attention-is-all-you-need keras neural-network nlp seq2seq transformer
Last synced: 22 Oct 2024
https://github.com/andreaferretti/charade
A server for multilanguage, composable NLP API in Python
Last synced: 14 Oct 2024
https://github.com/anakin87/fact-checking-rocks
Fact checking baseline combining dense retrieval and textual entailment
fact-checking haystack huggingface-spaces information-retrieval natural-language-inference natural-language-processing neural-search nlp python semantic-search streamlit streamlit-webapp text-entailment transformers
Last synced: 22 Oct 2024
https://github.com/kennethenevoldsen/scandinavian-embedding-benchmark
A Scandinavian Benchmark for sentence embeddings
benchmark low-resource-nlp natural-language-processing nlp scandinavian
Last synced: 31 Oct 2024
https://github.com/praful932/llmsearch
Find better generation parameters for your LLM
llm llm-evaluation llm-inference nlp
Last synced: 27 Oct 2024
https://github.com/veler/notepad-based-calculator
A smart calculator using natural language processing
calculator csharp dotnet mef natural-language-processing nlp
Last synced: 29 Oct 2024
https://github.com/swanhtet1992/ReSegment
Burmese (Myanmar) syllable level segmentation with regex.
burmese-nlp myanmar-nlp myanmar-text nlp segmentation
Last synced: 25 Oct 2024
https://github.com/Praful932/llmsearch
Find better generation parameters for your LLM
llm llm-evaluation llm-inference nlp
Last synced: 08 Nov 2024
https://github.com/loomchild/maligna
Bilingual sentence aligner
nlp text-alignment translation
Last synced: 08 Nov 2024
https://github.com/siphulangeni/tortus
A PyPI package for easy text annotation in a Jupyter Notebook.
annotation-tool ipywidgets jupyter-notebook labeling-tool nlp
Last synced: 08 Nov 2024
https://github.com/trashhalo/logseq-summarizer
Logseq plugin to summarize text
Last synced: 02 Nov 2024
https://github.com/houbb/word-cloud
A word cloud tool for Java (an easy-to-use Java tool for generating word cloud images).
cloud image nlp word word-cloud wordcloud
Last synced: 07 Nov 2024
https://github.com/aqibsaeed/research-paper-categorization
Research paper classification using machine learning and NLP
machine-learning nlp text-classification
Last synced: 09 Nov 2024
https://github.com/fredriko/bert-tensorflow-pytorch-spacy-conversion
Instructions for how to convert a BERT Tensorflow model to work with HuggingFace's pytorch-transformers, and spaCy. This walk-through uses DeepPavlov's RuBERT as example.
bert bert-model how-to keras nlp pytorch-transformers spacy spacy-models spacy-nlp spacy-package spacy-pytorch-transformers tensorflow
Last synced: 07 Aug 2024
https://github.com/laugustyniak/textlytics
Text processing library for sentiment analysis and related tasks
classification natural-language-processing nlp opinion-mining scikit-learn sentiment-analysis supervised-learning word-embeddings
Last synced: 03 Aug 2024
https://github.com/sap-samples/acl2020-commonsense
Source code for a paper on commonsense reasoning at the 2020 Annual Conference of the Association for Computational Linguistics (ACL 2020).
commonsense-reasoning contrastive deep-learning machine-learning nlp sample sample-code self-supervised
Last synced: 15 Nov 2024
https://github.com/suicao/vn-accent-restorer
This project applies multiple deep learning models to the problem of restoring diacritical marks to sentences in Vietnamese.
deep-learning nlp tensorflow tensorflow-experiments
Last synced: 10 Oct 2024
https://github.com/centre-for-humanities-computing/tweetopic
Blazing fast topic modelling for short texts.
dirichlet-process-mixtures dmm gibbs-sampling gsdmm machine-learning mcmc nlp python scikit-learn topic-modeling tweet tweet-analysis visualization
Last synced: 09 Nov 2024
https://github.com/seanlee97/llano
Let ChatGPT (Large Language Models) Serve As Data Annotator and Zero-shot/few-shot Information Extractor.
annotataion annotator chatgpt chatie classification data-augmentation few-shot gpt gpt-3 gpt-4 information-extraction large-language-models llm ner nlp openai prompt prompt-engineering relation-extraction zero-shot
Last synced: 27 Oct 2024
https://github.com/luoyuanlab/text_gcn_tutorial
A tutorial & minimal example (8min on CPU) for Graph Convolutional Networks for Text Classification. AAAI 2019
deep-learning graph-convolutional-networks nlp text-classification
Last synced: 02 Nov 2024
https://github.com/gatenlp/gateplugin-learningframework
A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, NaiveBayes, CRF and others), LibSVM, Scikit-Learn, Weka, and DNNs through Pytorch and Keras.
classification crf machine-learning nlp sequence-tagging
Last synced: 13 Nov 2024
https://github.com/loristns/wisty.js
🧚♀️ Chatbot library turning conversations into actions, locally, in the browser.
assistant bot bot-framework chatbot chatbots conversational-agents conversational-ai dialogue-systems hybrid-code-networks javascript machine-learning named-entity-recognition natural-language-processing nlp nlu tensorflow tensorflowjs
Last synced: 10 Oct 2024
https://github.com/yasinkuyu/turkish.cs
Turkish Suffix Library for C# & .NET (Turkish inflectional and derivational suffixes)
Last synced: 06 Nov 2024
https://github.com/generall/entitycategoryprediction
Model for predicting the categories of entities from their mentions
allennlp classification mentions nlp
Last synced: 14 Oct 2024
https://github.com/agatan/yoin
A Japanese Morphological Analyzer written in pure Rust
Last synced: 05 Nov 2024
https://github.com/amazon-science/bold
Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper
bert bert-model bias fairness-ml gpt-2 language-model nlg nlg-dataset nlp text-generation
Last synced: 12 Nov 2024
https://github.com/philipmay/stsb-multi-mt
Machine translated multilingual STS benchmark dataset.
Last synced: 28 Oct 2024
https://github.com/jayyip/cws-tensorflow
Chinese word segmentation model based on TensorFlow
nlp tensorflow word-segmentation
Last synced: 11 Nov 2024
https://github.com/shibing624/pinyin-tokenizer
pinyintokenizer, a pinyin tokenizer that splits a continuous pinyin string into a list of single-character pinyin syllables.
nlp pinyin pinyin-analysis pinyin4j tokenizer trie-tree
Last synced: 22 Oct 2024
https://github.com/loristns/Wisty.js
🧚♀️ Chatbot library turning conversations into actions, locally, in the browser.
assistant bot bot-framework chatbot chatbots conversational-agents conversational-ai dialogue-systems hybrid-code-networks javascript machine-learning named-entity-recognition natural-language-processing nlp nlu tensorflow tensorflowjs
Last synced: 09 Nov 2024
https://github.com/yasinkuyu/Turkish.cs
Turkish Suffix Library for C# & .NET (Turkish inflectional and derivational suffixes)
Last synced: 12 Nov 2024
https://github.com/zimmerrol/show-attend-and-tell-keras
Keras implementation of the "Show, Attend and Tell" paper
attention-mechanism image-captioning keras lstm mscoco-image-dataset nlp rnn show-attend-and-tell tensorflow
Last synced: 22 Oct 2024
https://github.com/gaussalgo/adaptor
ACL 2022: Adaptor: a library to easily adapt a language model to your own task, domain, or custom objective(s).
domain-adaptation multi-objective-optimization ner nlp pytorch robustness text-classification text-generation transformers
Last synced: 08 Nov 2024
https://github.com/elizalo/question-answering-based-on-squad
Question Answering System using BiDAF Model on SQuAD v2.0
bidaf machine-learning natural-language-processing natural-language-understanding neural-network nlp nlp-datasets nlp-machine-learning python python-3-6 question-answering squad
Last synced: 28 Sep 2024
https://github.com/rileynwong/spotify-analysis
Data analysis on my monthly playlists
audio-features data-analysis data-scraping lyrics machine-learning natural-language-processing nlp nlp-machine-learning sentiment-analysis spotify-analysis supervised-learning supervised-machine-learning text text-analysis
Last synced: 14 Oct 2024
https://github.com/hankcs/sub-character-cws
Sub-Character Representation Learning
chinese-word-segmentation cws natural-language-processing nlp representation-learning simplified-chinese traditional-chinese
Last synced: 13 Oct 2024
https://github.com/kargaranamir/parstdex
A package that extracts Persian time and date markers by applying regexes -- AACL 2022
datetime event-extract event-extraction hengam hengamtagger information-extraction nlp parstdex persian persian-calendar persian-datetime persian-time regex-pattern time-date
Last synced: 04 Aug 2024
https://github.com/ElizaLo/Question-Answering-based-on-SQuAD
Question Answering System using BiDAF Model on SQuAD v2.0
bidaf machine-learning natural-language-processing natural-language-understanding neural-network nlp nlp-datasets nlp-machine-learning python python-3-6 question-answering squad
Last synced: 13 Nov 2024
https://github.com/loomchild/segment
Program used to split text into segments
Last synced: 14 Nov 2024
https://github.com/tlack/hairytext
A data labeling and NLP tool for Elixir (uses Spacy)
elixir entity-recognition nlp nlp-machine-learning phoenix-live-view spacy text-classification
Last synced: 28 Oct 2024
https://github.com/centre-for-humanities-computing/embedding-explorer
Tools for interactive visual exploration of semantic embeddings.
clustering embedding embeddings interactive knowledge-graph machine-learning networks nlp projection semantic
Last synced: 09 Nov 2024
https://github.com/anoopkunchukuttan/geomm
Geometry-aware Multilingual Embeddings
bilingual-word-embedding multilingual nlp translation word-embedding
Last synced: 03 Aug 2024
https://github.com/karma9874/seq2seq-chatbot
Seq2Seq-based chatbot model with a bidirectional RNN and attention mechanism, built with TensorFlow, trained on the Cornell Movie-Dialogs Corpus, and deployed on a Flask server
attention-mechanism bidirectional-lstm chatbot deep-learning flask nlp question-answering seq2seq tensorflow
Last synced: 06 Nov 2024
https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks
Deep Learning for NLP Python Notebooks in PyTorch and TensorFlow
deeplearning emotion nlp pytorch rnn sentiment-analysis tensorflow
Last synced: 13 Oct 2024
https://github.com/yuewang-cuhk/hashtaggeneration
The official implementation of the NAACL-HLT 2019 paper "Microblog Hashtag Generation via Encoding Conversation Contexts"
hashtag-generator nlp social-media
Last synced: 09 Nov 2024
https://github.com/princeton-nlp/rationale-robustness
NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790
interpretability nlp robustness
Last synced: 11 Nov 2024
https://github.com/shibing624/title-generator
Automatic Text Summarization and Title Generation.
deep-learning nlp text-summarization title-generation
Last synced: 22 Oct 2024
https://github.com/timbmg/structured-self-attentive-sentence-embedding
Re-Implementation of "A Structured Self-Attentive Sentence Embedding" by Lin et al., 2017
attention deep-learning machine-learning neural-networks nlp pytorch recurrent-neural-networks self-attention self-attentive-rnn sentiment-analysis text-classification vizualisation yelp-dataset
Last synced: 27 Oct 2024
https://github.com/chengchingwen/bytepairencoding.jl
Julia implementation of Byte Pair Encoding for NLP
nlp nlp-library nlp-machine-learning word-segmentation
Last synced: 15 Oct 2024
https://github.com/adhaamehab/arabicnlp
Python package for Arabic natural language processing
arabic arabic-nlp keras ml nlp part-of-speech-tagger postagging sequence-modeling
Last synced: 11 Oct 2024
https://github.com/chengchingwen/BytePairEncoding.jl
Julia implementation of Byte Pair Encoding for NLP
nlp nlp-library nlp-machine-learning word-segmentation
Last synced: 28 Oct 2024
https://github.com/erikgartner/sentimental
Sentiment analysis made easy; built on top of solid libraries.
natural-language-processing nlp sentiment-analysis
Last synced: 02 Nov 2024
https://github.com/decalogue/ai
AI: an artificial intelligence toolkit covering machine learning, deep learning, and natural language processing
ai deep-learning dl machine-learning ml natural-language-processing nlp python
Last synced: 15 Nov 2024
https://github.com/janekb04/py2gpt
Convert Python code into JSON consumable by OpenAI's function API.
ai api chatgpt converter function gpt gpt-4 json nlp openai openai-api python schema transcoding
Last synced: 05 Nov 2024
https://github.com/thunlp/hiddenkiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
Last synced: 10 Nov 2024
https://github.com/dair-ai/odsc_2020_nlp
Repository for ODSC talk related to Deep Learning NLP
elasticsearch nlp search transformer
Last synced: 10 Nov 2024
https://github.com/dayyass/latent-semantic-analysis
Pipeline for training LSA models using Scikit-Learn.
data-science hacktoberfest latent-semantic-analysis lsa machine-learning natural-language-processing nlp pipeline python topic-modeling
Last synced: 14 Oct 2024
https://github.com/andythefactory/romanian-nlp-datasets
A list of Romanian NLP Datasets
nlp nlp-data nlp-dataset nlp-datasets nlp-resources romanian romanian-language
Last synced: 07 Nov 2024
https://github.com/anakin87/neural-search-pills
Knowledge pills on Neural Search
deep-learning information-retrieval machine-learning machine-reading multimodal-search natural-language-processing neural-search nlp question-answering retrieval-systems search-engines semantic-search transformers vector-search
Last synced: 23 Oct 2024
https://github.com/mlabouardy/dialogflow-watchnow-messenger
WatchNow FB Messenger bot with DialogFlow & Golang 💬
api-ai bot dialogflow golang messenger nlp
Last synced: 15 Nov 2024
https://gair-nlp.github.io/BeHonest/
BeHonest: Benchmarking Honesty in Large Language Models
alignment benchmark evaluation honesty llm nlp
Last synced: 11 Oct 2024
https://github.com/jasonwbw/recordpapers4nlp
A record of papers in some NLP-related areas
deep-learning dialogue-generation nlp reading-comprehension
Last synced: 28 Oct 2024
https://github.com/fractalego/subjectivity_classifier
Detects if a sentence is in a subjective or objective form
nlp rnn-tensorflow subjectivity
Last synced: 28 Oct 2024
https://github.com/shrebox/Personified-Chatbot
A personified chatbot responding to a query based on the answering pattern of Dr. APJ Abdul Kalam using Information Retrieval, Natural Language Processing, and Deep Learning techniques.
apj-abdul-kalam chatbot deep-learning information-retrieval lstm natural-language-processing nlp ranking-algorithm seq2seq-chatbot seq2seq-model summarization word2vec
Last synced: 11 Nov 2024
https://github.com/liebeck/spacy-iwnlp
German lemmatization with IWNLP as extension for spaCy
nlp spacy spacy-extension spacy-pipeline
Last synced: 14 Oct 2024
https://github.com/kampersanda/tongrams-rs
Rust library providing fast language model queries in compressed space
compression elias-fano language-model ngrams nlp trie
Last synced: 11 Nov 2024