Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-11 00:19:59 UTC
https://github.com/sorgerlab/indra
INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.
bioinformatics biology computational-biology indra modeling nlp pysb sbml systems-biology
Last synced: 14 Oct 2024
https://github.com/simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
data-transformation domain-specific-language dsl nlp parsing
Last synced: 22 Oct 2024
https://github.com/xatkit-bot-platform/xatkit
The simplest way to build all types of smart chatbots and digital assistants
bot chatbot-framework chatbots conversational-ai digital-assistant dsl low-code nlp no-code
Last synced: 07 Nov 2024
https://github.com/ooguz/turkce-kufur-karaliste
A blacklist for Turkish
blacklist blacklisting bots nlp nlp-keywords-extraction turkce turkce-kufur-karaliste turkish turkish-language turkish-translation
Last synced: 04 Nov 2024
https://github.com/hscspring/all4nlp
All For NLP, especially Chinese.
ai deeplearning machinelearning nlp
Last synced: 27 Oct 2024
https://github.com/dengbocong/text-similarity
Text similarity (matching) computation, providing baselines, training, inference, metric analysis... The code includes both TensorFlow and PyTorch versions
bert deep-learning mechine-learing model nlp pytorch similarity text-classification transformer
Last synced: 08 Nov 2024
https://github.com/princeton-nlp/cofipruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
bert model-compression nlp pruning
Last synced: 11 Nov 2024
https://github.com/Uzay-G/espial
Espial is an engine for automated organization and discovery of personal knowledge
knowledge knowledge-graph nlp python
Last synced: 01 Nov 2024
https://github.com/cohere-ai/sandbox-conversant-lib
Conversational AI tooling & personas built on Cohere's LLMs
chatbot chatbot-framework chatbots cohere conversational-agent conversational-ai conversational-bots dialogue-generation dialogue-systems large-language-models llm nlp
Last synced: 07 Oct 2024
https://github.com/HumanSignal/label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
bert data-labeling label-studio natural-language-processing natural-language-understanding nlp pytorch-transformers text-labeling transformers
Last synced: 07 Aug 2024
https://github.com/daspartho/prompt-extend
Extending Stable Diffusion prompts with suitable style cues using text generation
deep-learning gpt-2 huggingface-spaces huggingface-transformers machine-learning nlp prompt stable-diffusion text-generation
Last synced: 03 Aug 2024
https://github.com/cyberzhg/keras-xlnet
Implementation of XLNet that can load pretrained checkpoints
glue keras language-model nlp xlnet
Last synced: 27 Sep 2024
https://github.com/yohasebe/wp2txt
A command-line toolkit to extract text content and category data from Wikipedia dump files
corpus machine-learning nlp ruby wikipedia wikipedia-dump
Last synced: 08 Nov 2024
https://github.com/HKUSTDial/NL2SQL_Handbook
A continuously updated handbook that helps readers track the latest NL2SQL techniques in the literature and provides practical guidance for researchers and practitioners.
awesome finetuning llms nl-to-code nl-to-sql nl2sql nlp nlp-resources survey text-to-sql text2sql tutorial
Last synced: 02 Nov 2024
https://github.com/avidale/compress-fasttext
Tools for shrinking fastText models (in gensim format)
fasttext fasttext-embeddings nlp python word-embeddings
Last synced: 13 Nov 2024
https://github.com/j2kao/fcc_nn_research
(somewhat) cleaned-up notebooks used in researching public comments for FCC Proceeding 17-108 (Net Neutrality Repeal)
Last synced: 09 Aug 2024
https://github.com/dccuchile/wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
bias-detection bias-reduction fairness-ai fairness-ml library nlp nlp-library python3 word-embedding-evaluation word-embedding-fairness word-embeddings
Last synced: 05 Aug 2024
https://github.com/ownthink/semantic
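Semantic understanding / spoken language understanding. The project includes lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition) and spoken language understanding (domain classification, slot filling, intent recognition).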
Last synced: 07 Nov 2024
https://github.com/fiddler-labs/fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
ai-observability evaluation generative-ai langchain llms nlp robustness
Last synced: 13 Nov 2024
https://github.com/IlyaGusev/summarus
Models for automatic abstractive summarization
deep-learning machine-learning nlp pytorch summarization
Last synced: 04 Nov 2024
https://github.com/balavenkatesh3322/nlp-pretrained-model
A collection of Natural language processing pre-trained models.
deep-learning deep-neural-networks keras machine-learning model mxnet natural-language-processing neural-networks nlp nlp-machine-learning python python3 pytorch tensorflow text text-classification text-to-number text-to-speech
Last synced: 10 Nov 2024
https://github.com/rylans/getlang
Natural language detection package in pure Go
language-model natural-language nlp
Last synced: 26 Oct 2024
https://github.com/ymcui/lert
LERT: A Linguistically-motivated Pre-trained Language Model (a pre-trained model enhanced with linguistic information)
bert lert nlp plm pre-train pytorch tensorflow transformer
Last synced: 28 Oct 2024
https://github.com/princeton-nlp/optiprompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
Last synced: 11 Nov 2024
https://github.com/iterative/example-get-started
Get started DVC project (NLP, random forest)
dvc example machine-learning nlp python random-forest reproducibility reproducible reproducible-research
Last synced: 03 Aug 2024
https://github.com/prrao87/fine-grained-sentiment
A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.
fasttext flair nlp python pytorch sentiment-analysis text-classification transformers
Last synced: 13 Nov 2024
https://github.com/doc-analysis/XFUND
XFUND: A Multilingual Form Understanding Benchmark
dataset natural-language-processing nlp
Last synced: 06 Nov 2024
https://github.com/natasha/navec
Compact high quality word embeddings for Russian language
embeddings glove nlp python quantization russian word2vec
Last synced: 10 Nov 2024
https://github.com/microsoft/presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers
Last synced: 07 Oct 2024
https://github.com/google-research/turkish-morphology
A two-level morphological analyzer for Turkish.
google morphological-analyser morphology natural-language-processing natural-language-understanding nlp turkish
Last synced: 10 Nov 2024
https://github.com/indix/whatthelang
Lightning Fast Language Prediction 🚀
fasttext language-detection languages nlp python
Last synced: 07 Nov 2024
https://github.com/akshaynagpal/w2n
Convert number words (e.g. twenty one) to numeric digits (21)
nlp numeric-digits python word-to-number
Last synced: 05 Aug 2024
https://github.com/crazyofapple/Reading_groups
A paper & resource list for large language models, including courses, papers, demos, and figures
chatgpt gpt-3 gpt-4 large-language-models llm llms natural-language-processing nlp
Last synced: 10 Nov 2024
https://github.com/geekjr/quickai
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
ai artificial-intelligence bert deep-learning dl easy-to-use fast gpt gpt-neo huggingface-transformers ml neural-network nlp object-detection python pytorch quickai research tensorflow2 yolo
Last synced: 09 Nov 2024
https://github.com/NPCai/Open-IE-Papers
Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.
information-extraction literature-review nlp openie papers relation-extraction tuples
Last synced: 10 Nov 2024
https://github.com/platisd/duplicate-code-detection-tool
A simple Python3 tool to detect similarities between files within a repository
Last synced: 01 Nov 2024
https://github.com/cequence-io/openai-scala-client
Scala client for OpenAI API
anthropic chatgpt dall-e gpt-3 gpt-4 machine-learning ml nlp openai openai-api scala
Last synced: 09 Nov 2024
https://github.com/apple/ml-mkqa
We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Please refer to our paper for details: "MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering".
dataset multilingual-evaluation nlp
Last synced: 07 Oct 2024
https://github.com/kuutsav/information-retrieval
Neural information retrieval / semantic-search / Bi-Encoders
information-retrieval machine-learning nlp semantic-search
Last synced: 03 Aug 2024
https://github.com/lyeoni/prenlp
Preprocessing Library for Natural Language Processing
natural-language-processing nlp preprocessing-library text-preprocessing text-processing
Last synced: 06 Nov 2024
https://github.com/lancern/asm2vec
An unofficial implementation of asm2vec as a standalone python package
asm2vec binary-analysis machine-learning nlp numpy python python3 unofficial word2vec
Last synced: 01 Nov 2024
https://github.com/abitdodgy/words_counted
A Ruby natural language processor.
natural-language-processing nlp ruby rubynlp word-counter wordcount wordscounter
Last synced: 26 Oct 2024
https://github.com/takelab/spacy-udpipe
spaCy + UDPipe
natural-language-processing nlp nlp-library python spacy udpipe universal-dependencies wrapper-library
Last synced: 31 Oct 2024
https://github.com/the-javapocalypse/Twitter-Sentiment-Analysis
This script can tell you how people feel about any event happening in the world by analyzing tweets related to that event
nlp python python3 sentiment sentiment-analysis textblob tweepy tweets twitter twitter-sentiment-analysis
Last synced: 25 Oct 2024
https://github.com/umarbutler/semchunk
A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.
chunking nlp python semantic-chunking splitting text text-chunking text-splitting
Last synced: 07 Nov 2024
https://github.com/husseinmozannar/SOQAL
Arabic Open Domain Question Answering System using Neural Reading Comprehension
arabic arabic-language arabic-nlp deep-learning nlp question-answering reading-comprehension tf-idf
Last synced: 03 Aug 2024
https://github.com/Yachay-AI/byt5-geotagging
Confidence and ByT5-based geotagging model that predicts coordinates from text alone.
coordinates deep-learning geo-location geotagging machine-learning neural-network nlp nlp-machine-learning python pytorch transformers
Last synced: 05 Nov 2024
https://github.com/salvatorera/tutorial
Tutorials on machine learning, artificial intelligence, and data science, with mathematical explanations and reusable code (in Python and R)
artificial-intelligence bioinformatics biology computer-vision convolutional-neural-networks data-science deep-learning graph image machine-learning natural-language-processing nlp python r streamlit streamlit-webapp tutorial tutorials vision-transformer
Last synced: 09 Oct 2024
https://github.com/jonathanbratt/RBERT
Implementation of BERT in R
bert natural-language-processing nlp reticulate rstats rstudio tensorflow
Last synced: 05 Aug 2024
https://github.com/microsoft/ASTRA
Self-training with Weak Supervision (NAACL 2021)
machine-learning nlp weak-supervision weakly-supervised-learning
Last synced: 05 Nov 2024
https://github.com/lxuechen/private-transformers
A codebase that makes differentially private training of transformers easy.
deep-learning differential-privacy huggingface-transformers nlp pytorch transformers
Last synced: 27 Oct 2024
https://github.com/danielegrattarola/twitter-sentiment-cnn
An implementation in TensorFlow of a convolutional neural network (CNN) to perform sentiment classification on tweets.
deep-learning nlp python sentiment-classification tensorflow twitter
Last synced: 01 Nov 2024
https://github.com/anujvyas/Natural-Language-Processing-Projects
This repository consists of all my NLP Projects
lemmatization natural-language-processing nlp nltk python sentiment-analysis stemming text-classification wordcloud
Last synced: 01 Nov 2024
https://github.com/huspacy/huspacy
HuSpaCy: industrial-strength Hungarian natural language processing
dependency-parsing hungarian hunlp huspacy information-extraction lemmatization machine-learning morphological-analysis named-entity-recognition natural-language-processing ner nlp pos-tagger python spacy spacy-models spacy-pipeline text-mining universal-dependencies
Last synced: 13 Nov 2024
https://github.com/oscar-project/ungoliant
🕷️ The pipeline for the OSCAR corpus
common-crawl commoncrawl corpus-linguistics crawler fasttext language-classification nlp oscar
Last synced: 04 Nov 2024
https://github.com/lonepatient/torchblocks
A PyTorch-based toolkit for natural language processing
advertising bert multilabel-classification named-entity-recognition nlp pytorch relation-classification siamese-network text-classification text-similarity transformers triplet-loss
Last synced: 06 Nov 2024
https://github.com/zhpmatrix/BERTem
Paper implementation (ACL 2019): "Matching the Blanks: Distributional Similarity for Relation Learning"
acl2019 bert-pytorch fewrel matching-the-blanks nlp relation-extraction
Last synced: 02 Nov 2024
https://github.com/minerva-ml/open-solution-toxic-comments
Open solution to the Toxic Comment Classification Challenge
challenge competition data-science deep-learning ensemble-model kaggle kaggle-competition machine-learning neptune nlp pipeline prediction python python3
Last synced: 07 Aug 2024
https://github.com/anasaito/skillner
A (smart) rule based NLP module to extract job skills from text
ner nlp python rule-based skillner skills spacy
Last synced: 30 Oct 2024
https://github.com/brolin59/trnlp
Natural language processing tools for Turkish
dogal-dil-isleme morfoloji morfolojik-analiz nlp turkish-nlp turkish-sentence-tokenizer
Last synced: 12 Nov 2024
https://github.com/smyja/blackmaria
Python package for web scraping using natural language
gpt-3 nlp openai python webscraping
Last synced: 09 Aug 2024
https://github.com/lucaterre/spacyfishing
A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
entity-disambiguation entity-linking natural-language-processing nlp python3 spacy spacy-extension spacy-extensions wikidata
Last synced: 31 Oct 2024
https://github.com/dair-ai/ml-nlp-paper-discussions
📄 A repo containing notes and discussions for our weekly NLP/ML paper discussions.
Last synced: 10 Nov 2024
https://github.com/ClipsAI/clipsai
Clips AI is an open-source Python library that automatically converts long videos into clips.
computer-vision nlp video-processing
Last synced: 06 Nov 2024
https://github.com/kennethenevoldsen/augmenty
Augmenty is an augmentation library based on spaCy for augmenting texts.
augmentation natural-language-processing nlp nlproc python spacy spacy-extension spacy-nlp text-augmentation text-classification training-data
Last synced: 14 Oct 2024
https://github.com/umitkaanusta/TIA
Your Advanced Twitter stalking tool
machine-learning nlp osint sentiment-analysis sentiment-classification social-media social-media-analysis social-media-mining text-classification text-classifier twint twitter twitter-api
Last synced: 27 Oct 2024
https://github.com/chewxy/lingo
package lingo provides the data structures and algorithms required for natural language processing
conll-u go golang inflection language-model natural-language-processing nlp nlp-dependency-parsing nlp-library nlp-machine-learning nlp-parsing part-of-speech part-of-speech-tagger
Last synced: 31 Oct 2024
https://github.com/vatshayan/live-chatbot-for-final-year-project
Chatbot system for a final-year project. The chatbot is built in Python using the Natural Language Toolkit and machine learning. Easy to understand and implement.
btech-project capstone-project chat chat-application chatbot chatbots college-project computer-science cse-project final final-project final-year-project final-year-projects machine-learning nlp nltk project-ideas projects python python-project
Last synced: 28 Oct 2024
https://github.com/kevincobain2000/jProcessing
Japanese Natural Language Processing Libraries
japanese nlp word-sense-disambiguation wsd
Last synced: 30 Oct 2024
https://github.com/microsoft/browsecloud
A web app to create and browse text visualizations for automated customer listening.
bayesian-networks counting-grids nlp text-classification text-processing visualization
Last synced: 05 Aug 2024
https://github.com/calpt/awesome-adapter-resources
Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning
adapters awesome deep-learning natural-language-processing nlp parameter-efficient-learning parameter-efficient-tuning peft transformers
Last synced: 15 Oct 2024
https://github.com/rth/vtext
Simple NLP in Rust with Python bindings
bag-of-words information-retrieval nlp tf-idf tokenization
Last synced: 30 Oct 2024
https://github.com/emilhvitfeldt/r-text-data
List of textual data sources to be used for text mining in R
data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext
Last synced: 30 Oct 2024
https://github.com/emres/turkish-deasciifier
Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs
deasciifier diacritics diacritics-reconstruction diacritics-restoration nlp nlp-library python turkish turkish-nlp
Last synced: 12 Nov 2024
https://github.com/xalanq/chinese-sentiment-classification
Simple Chinese text sentiment classification (MLP, CNN, RNN in PyTorch) - 2019 THU Introduction to Artificial Intelligence coursework
Last synced: 07 Nov 2024
https://github.com/thunlp/openbackdoor
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
Last synced: 10 Nov 2024
https://github.com/yuewang-cuhk/takg
The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"
keyphrase-generation nlp social-media topic-modeling
Last synced: 09 Nov 2024
https://github.com/erfanzar/EasyDeL
EasyDeL is an open-source library that makes training faster and more optimized, with options for training and serving in both Python and Mojo 🔥
easydel flax gpt jax machine-learning mojo nlp optax pytorch transformers
Last synced: 03 Aug 2024
https://github.com/CLUEbenchmark/DataCLUE
DataCLUE: A data-centric NLP benchmark and toolkit
ai chinese classification-algorithm data-centric human-in-the-loop nlp
Last synced: 03 Aug 2024
https://github.com/rocketchat/hubot-natural
Natural Language Processing Chatbot for RocketChat
chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot
Last synced: 29 Oct 2024
https://github.com/Planeshifter/text-miner
text mining utilities for Node.js
Last synced: 10 Nov 2024
https://github.com/alisonmitchell/stock-prediction
Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.
beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance
Last synced: 07 Nov 2024
https://github.com/KudoAI/duckduckgpt
🐤 DuckDuckGo add-on that brings the magic of ChatGPT to search results (powered by GPT-4!)
ai artificial-intelligence bot chatbot chatgpt chatgpt3 ddg duckduckgo gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai search userscripts web
Last synced: 30 Oct 2024
https://github.com/ofa-sys/ofasys
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
audio computer-vision deep-learning motion multimodal-learning multitask-learning nlp pretrained-models pytorch transformers vision-and-language
Last synced: 10 Oct 2024
https://github.com/ayoungprogrammer/Lango
Language Lego
nlp parse-trees stanford-corenlp stanford-parser
Last synced: 07 Aug 2024
https://github.com/ianycxu/GCN-with-BERT
Graph Convolutional Networks (GCN) with BERT for Coreference Resolution Task [Pytorch][DGL]
bert bert-model coreference-resolution gcn gnn graph-convolutional-networks graph-neural-networks nlp pytorch
Last synced: 02 Nov 2024