Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-15 00:20:20 UTC
https://github.com/cdpierse/script_buddy_v2
Script Buddy v2 is a film script text generation tool built using GPT-2 and scripts from the world's most popular films.
artificial-intelligence gpt-2 language-generation machine-learning nlp pytorch transformers
Last synced: 08 Nov 2024
https://github.com/lexiestleszek/namegen
Self-contained, minimalistic implementation of a language model that generates coherent, natural-sounding names. It uses an input dataset of names and a probability distribution to generate new names based on sequences of four characters.
language-model machine-learning markov-chain name-generation natural-language-processing nlp
Last synced: 14 Nov 2024
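The character-sequence approach described for namegen can be sketched as a small Markov chain. This is not the repository's code: the function names, the `^`/`$` padding symbols, and the reading of "sequences of four characters" as a three-character context predicting the fourth character are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical padding symbols, assumed absent from names


def build_model(names, order=3):
    """Count which character follows each `order`-character context."""
    model = defaultdict(Counter)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model[context][padded[i + order]] += 1
    return model


def generate(model, order=3, rng=None, max_len=12):
    """Sample a name one character at a time until END or max_len is reached."""
    rng = rng or random.Random()
    context, out = START * order, []
    while len(out) < max_len:
        chars, weights = zip(*model[context].items())
        nxt = rng.choices(chars, weights=weights)[0]
        if nxt == END:
            break
        out.append(nxt)
        context = context[1:] + nxt  # slide the window by one character
    return "".join(out).capitalize()


names = ["anna", "annabel", "hannah", "joanna", "johanna"]
model = build_model(names)
print(generate(model, rng=random.Random(0)))
```

Every four-character window of a generated name occurs somewhere in the training names, which is why the output tends to sound plausible even with a tiny dataset.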
https://github.com/LanguageMachines/PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow
Last synced: 03 Nov 2024
https://github.com/welding-torch/excel-anonymizer
A Python script that anonymizes an Excel file and synthesizes new data in its place.
data-science microsoft nlp pandas presidio privacy
Last synced: 07 Nov 2024
https://github.com/khakhulin/compressed-transformer
Compression of NMT transformer model with tensor methods
compression deep-learning mnist nlp nmt pytorch tensor-train transformer translation tucker
Last synced: 19 Nov 2024
https://github.com/kaleidophon/nlp-uncertainty-zoo
Model zoo for different kinds of uncertainty quantification methods used in Natural Language Processing, implemented in PyTorch.
deep-learning lstm nlp nlp-machine-learning package python pytorch rnn transformers uncertainty-estimation uncertainty-neural-networks uncertainty-quantification
Last synced: 10 Oct 2024
https://github.com/gunale0926/sorsa
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
deep-learning fine-tuning llama lora machine-learning nlp peft python pytorch rwkv sorsa svd transformer
Last synced: 17 Nov 2024
https://github.com/ChenghaoMou/pytorch-pQRNN
Implementation of pQRNN in PyTorch
nlp pqrnn pytorch text-classification
Last synced: 16 Nov 2024
https://github.com/universaldatatool/react-nlp-annotate
Interface for making NLP annotations.
classification entity entity-relation-labeling hacktoberfest nlp nlp-library nlp-machine-learning text text-classification text-entities text-entity-analysis text-mining
Last synced: 16 Nov 2024
https://github.com/yakivyusin/simplenetnlp
.NET NLP library
c-sharp corenlp csharp-library natural-language-processing nlp nlp-library nuget ukraine wrapper
Last synced: 12 Oct 2024
https://github.com/kootenpv/spacy_api
Server/client around spaCy so that spaCy is loaded only once
api machine-learning nlp spacy
Last synced: 14 Oct 2024
https://github.com/kennethenevoldsen/spacy-wrap
spaCy-wrap is a wrapper library that lets you include fine-tuned transformers from Hugging Face in your spaCy pipeline, so existing fine-tuned models can be used within your spaCy workflow.
deep-learning huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch spacy spacy-extension spacy-extensions spacy-models spacy-nlp spacy-pipeline spacy-transformers text-classification transformers
Last synced: 12 Oct 2024
https://github.com/skoltech-nlp/rudetoxifier
Code and data of "Methods for Detoxification of Texts for the Russian Language" paper
nlp russian-language style-transfer
Last synced: 08 Nov 2024
https://github.com/hsm207/bert_attn_viz
Visualize BERT's self-attention layers on text classification tasks
attention bert explainable-ai nlp tensorflow
Last synced: 28 Oct 2024
https://github.com/onesuper/HuggingFace-Datasets-Text-Quality-Analysis
Retrieves Parquet files from Hugging Face, then identifies and quantifies junk data, duplication, contamination, and biased content in a dataset using pandas
dataset huggingface-datasets llm machine-learning nlp streamlit text-processing
Last synced: 09 Aug 2024
https://github.com/ai-forever/model-zoo
NLP model zoo for Russian
bert nlp pytorch roberta roberta-model russian russian-language t5 t5-model transformers
Last synced: 16 Nov 2024
https://github.com/nlpcloud/nlpcloud-js
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...
ad-generator chatbot code-generation conversational-ai embeddings intent-classification keywords-extraction language-detection machine-translation ner nlp paraphrasing question-answering semantic-similarity sentiment-analysis text-classification text-generation text-summarization tokenization
Last synced: 07 Nov 2024
https://github.com/ailln/nlp-roadmap
🗺️ A learning roadmap for natural language processing
natural-language-processing nlp roadmap sequence-labeling word-embedding word-segmentation
Last synced: 18 Nov 2024
https://github.com/ljvmiranda921/calamancy
NLP pipelines for Tagalog using spaCy
computational-linguistics low-resource-languages low-resource-nlp machine-learning natural-language-processing ner nlp spacy
Last synced: 14 Nov 2024
https://github.com/obss/trapper
State-of-the-art NLP through transformer models in a modular design and consistent APIs.
allennlp deep-learning natural-language-processing nlp python pytorch pytorch-transformers transformer transformers
Last synced: 27 Oct 2024
https://github.com/dermatologist/nlp-qrmine
Qualitative Research support tools in Python
hacktoberfest interview-data machine-learning nlp nlp-machine-learning python3 qualitative-data-analysis qualitative-research research-tool
Last synced: 01 Nov 2024
https://github.com/christabor/namebot
A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agencies for sophisticated word generation and ideation.
language name-generation naming naming-agencies nlp nltk
Last synced: 07 Nov 2024
https://github.com/s-nlp/rudetoxifier
Code and data of "Methods for Detoxification of Texts for the Russian Language" paper
nlp russian-language style-transfer
Last synced: 07 Aug 2024
https://github.com/johnsnowlabs/johnsnowlabs
Gateway into the John Snow Labs Ecosystem
bert databricks gpt machine-learning natural-language-processing nlp python seq2seq spark t5
Last synced: 18 Nov 2024
https://github.com/qibowen2008/supertexttoolbox
A free text-processing toolbox
addin ai csharp free language netframework nlp ocr office text text-to-speech windows-desktop windows-forms winforms wordcloud wordcloud-generator
Last synced: 07 Nov 2024
https://github.com/kensuke-mitsuzawa/documentfeatureselection
A set of metrics for feature selection from text data
bns docker feature-extraction feature-selection flask-application nlp pmi python-3 soa tf-idf web-app web-application webapp
Last synced: 08 Nov 2024
https://github.com/deshwalmahesh/phudge
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, using absolute or relative grading, and much more. It contains a list of the available tools, methods, repos, code, etc. for hallucination detection, LLM evaluation, grading, and more.
ai custom-dataset evaluation feedback-collection finetuning hallucination hallucination-detection judge llm llm-evaluation ml nlp phi-3 pytorch sota
Last synced: 18 Nov 2024
https://github.com/teticio/llama-squad
Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundation) model.
decoder fine-tuning llama2 llama3 nlp question-answering squad
Last synced: 10 Oct 2024
https://github.com/kommunicate-io/kommunicate-ios-sdk
Kommunicate iOS SDK for customer support
ai-agents chat chat-application chat-sdk chatapp chatbots chatserver cocoapods customer-support in-app-communication ios kommunicate-ios-sdk livechat messaging nlp sdk-ios swift
Last synced: 14 Nov 2024
https://github.com/natasha/naeval
Comparing quality and performance of NLP systems for Russian language
evaluation nlp performance-analysis python russian
Last synced: 10 Nov 2024
https://github.com/tuanacelik/should-i-follow
🦄 An NLP application just for the lols: built with Haystack to get an overview of what a user is posting about on Twitter
Last synced: 22 Oct 2024
https://github.com/adamlui/autoclear-chatgpt-history
🕶️ Adds chat auto-clear functionality to ChatGPT for more privacy
artificial-intelligence chat chatbot chatgpt chatgpt3 gpt gpt-3 gpt-4 greasemonkey javascript machine-learning ml nlp openai privacy userscripts
Last synced: 14 Oct 2024
https://github.com/kavgan/word_cloud
Python word cloud library for use within Jupyter notebook and Python apps.
cloud-library jupyter-notebook nlp python visualization word-cloud wordcloud
Last synced: 30 Oct 2024
https://github.com/omarsar/pytorch_neural_machine_translation_attention
Neural Machine Translation with Attention (PyTorch)
attention-mechanism deep-learning encoder-decoder neural-machine-translation nlp pytorch seq2seq
Last synced: 28 Oct 2024
https://github.com/yoshoku/suika
Suika 🍉 is a Japanese morphological analyzer written in pure Ruby
morphological-analysis nlp postagger ruby tokenizer
Last synced: 10 Nov 2024
https://github.com/jonathanbratt/RBERTviz
Visualization tools to use with RBERT
bert htmlwidgets natural-language-processing nlp rstats rstudio tensorflow
Last synced: 05 Aug 2024
https://github.com/stas00/porting
Helper scripts and notes that were used while porting various nlp models
Last synced: 22 Oct 2024
https://github.com/ruu3f/freegpt-discord
Discord chatbot and image generator powered by freeGPT. Now with image detection.
ai artificial-intelligence bot chatgpt deep-learning discord freegpt gpt gpt4all gpt4free image image-detection image-processing llama llm machine-learning nlp python
Last synced: 27 Oct 2024
https://github.com/xavidop/dialogflow-cx-cli
The missing Dialogflow CX CLI to interact with your projects
cli cxcli dialogflow dialogflow-cx dialogflowcx golang nlp nlu test-automation testing-tools
Last synced: 15 Nov 2024
https://github.com/explosion/assets
💥 Explosion Assets
machine-learning nlp spacy spacy-nlp
Last synced: 07 Oct 2024
https://github.com/aphp/eds-pseudo
EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports
Last synced: 03 Sep 2024
https://github.com/edwardcooper/piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
nlp pii pii-detection word2vec
Last synced: 11 Nov 2024
https://github.com/Lipairui/textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
bert nlp text-classification text-preprocessing text-representation text-search text-similarity
Last synced: 07 Aug 2024
https://github.com/coosto/dutch-word-embeddings
Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts.
coosto dutch nlp word2vec word2vec-model wordembeddings
Last synced: 17 Nov 2024
https://github.com/jaykef/avachat
AvaChat is a real-time AI chat demo with animated talking heads. It uses large language models (GPT, API2D GPT4, Claude) as text inputs to D-ID's image-to-video talking head model (via the D-ID stream API)
Last synced: 29 Oct 2024
https://github.com/sciss/ws4j
WordNet Similarity for Java provides an API for several Semantic Relatedness/Similarity algorithms. Mirror of https://codeberg.org/sciss/ws4j
Last synced: 09 Nov 2024
https://github.com/keyvan-m-sadeghi/assister
Private Open General Assistant Platform
artificial-intelligence assistant assistant-chat-bots chatbot nlp voice voice-recognition
Last synced: 16 Oct 2024
https://github.com/trainingbypackt/natural-language-processing-fundamentals
Use Python and NLTK to build out your own text classifiers and solve common NLP problems
api binary-classifier latent-dirichlet-allocation lda linear-regression markov-chain natural-language-processing nlp pandas python scikit-learn supervised tokenization unsupervised
Last synced: 14 Nov 2024
https://github.com/dongjunlee/dmn-tensorflow
TensorFlow implementation of 'Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (2015)'
babi-tasks dynamic-memory-network hb-experiment natural-language-processing nlp question-answering tensorflow
Last synced: 08 Nov 2024
https://github.com/kinosal/cowriter
Write 10x faster using OpenAI's GPT-3 based Davinci model to autocomplete your text
Last synced: 27 Oct 2024
https://github.com/osu-nlp-group/amplegcg
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM
adversarial-attacks gcg nlp safety
Last synced: 11 Nov 2024
https://github.com/prihoda/golem
Open-source chatbot framework for python developers. Batteries included 🔋🔋
bot chatbot dialog-management messenger nlp python telegram witai
Last synced: 16 Nov 2024
https://github.com/OpenSextant/Xponents
Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
document-conversion geocoding geonames geoparsing geotagging information-extraction nlp solr tika
Last synced: 05 Nov 2024
https://github.com/explosion/spacy-huggingface-hub
🤗 Push your spaCy pipelines to the Hugging Face Hub
huggingface machine-learning ml-models models natural-language-processing nlp spacy
Last synced: 07 Oct 2024
https://github.com/apache/opennlp-sandbox
Apache OpenNLP Sandbox
apache compling languagetechnology nlp opennlp textprocessing
Last synced: 07 Oct 2024
https://github.com/salesforce/query-focused-sum
Official code repository for "Exploring Neural Models for Query-Focused Summarization".
deep-learning machine-learning neural-network nlp question-answering summarization
Last synced: 08 Nov 2024
https://github.com/skroutz/turkish_stemmer
A simple Turkish stemming library
Last synced: 11 Nov 2024
https://github.com/kenlimmj/rouge
A JavaScript implementation of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) evaluation metric for summaries.
bootstrapping-statistics evaluation-metric jackknifing nlp rouge summarization
Last synced: 10 Nov 2024
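As a rough illustration of what a ROUGE metric measures, the sketch below computes ROUGE-N recall: the fraction of the reference summary's n-grams that also appear in the candidate, with counts clipped. It is a simplified Python sketch, not the API of the JavaScript library above; the function names and the whitespace tokenization are assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # Counter & takes the minimum count
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # 5 of 6 unigrams match
print(rouge_n_recall(candidate, reference, n=2))  # 3 of 5 bigrams match
```

Full ROUGE implementations typically add stemming, precision and F-scores, and variants such as ROUGE-L; none of that is attempted here.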
https://github.com/argosopentech/metaltranslate
Customizable machine translation in C++
machine-learning nlp nlp-machine-learning translation
Last synced: 08 Nov 2024
https://github.com/lizadaly/blackout
NaNoGenMo 2016 entry #2
blackout grammar nlp ocr tesseract-ocr tracery tracery-grammar
Last synced: 10 Nov 2024
https://github.com/yasinkuyu/turkish.js
Turkish suffix library for JavaScript: Turkish inflectional and derivational suffixes
Last synced: 06 Nov 2024
https://github.com/shineware/PyKOMORAN
(Beta) PyKOMORAN is a Python wrapper for KOMORAN, built using Py4J.
komoran korean korean-analysis korean-nlp korean-text-processing korean-tokenizer morphological-analyser nlp py4j pypi-packages
Last synced: 12 Nov 2024
https://github.com/saareliad/FTPipe
FTPipe and related pipeline model parallelism research.
deep-neural-networks distributed-training fine-tuning nlp pipeline-parallelism t5
Last synced: 07 Nov 2024
https://github.com/yuvalpinter/m3gm
Max-Margin Markov Graph Models for WordNet (EMNLP 2018)
markov-model nlp relation-extraction semantics wordnet
Last synced: 27 Oct 2024
https://github.com/zamgi/lingvo--ner-ru
Named entity recognition (NER) in Russian texts
linguistics lingvo named-entity-recognition natural-language-processing ner nlp nlp-machine-learning
Last synced: 05 Nov 2024
https://github.com/ecohealthalliance/epitator
EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and EIDR Connect.
disease-surveillance epidemiology geonames nlp spacy toponym-resolution
Last synced: 14 Oct 2024
https://github.com/tattle-made/uli
Software and Resources for Mitigating Online Gender Based Violence in India
browser-extension content-moderation extension-chrome gender-based-violence india indian-languages indic indic-languages machine-learning ml nlp ogbv sdg sdg-10 sdg-5 social-impact trust-and-safety
Last synced: 14 Nov 2024
https://github.com/dengbocong/prompt-tuning
A pipeline for Prompt-tuning
classification deep-learning few-shot-learning fine-tuning nlp pretrained-models prompt prompt-tuning
Last synced: 08 Nov 2024
https://github.com/lunarwhite/covid-social-analysis
Apply ML to Weibo sentiment: sentiment analysis and visualization of Weibo text during the COVID-19 epidemic
crawling data-analysis machine-learning nlp python vizualization
Last synced: 06 Nov 2024
https://github.com/tokestermw/spacy_grammar
✒️ LanguageTool-style grammar handling with spaCy 2.0
Last synced: 07 Nov 2024
https://github.com/tugstugi/mongolian-bert
Pre-trained Mongolian BERT models
bert machine-learning mongolian natural-language-processing natural-language-understanding nlp pytorch tensorflow
Last synced: 15 Nov 2024
https://github.com/IndexFziQ/KMRC-Papers
A list of recent papers regarding knowledge-based machine reading comprehension.
knowledge knowledge-base machine-reading-comprehension nlp paper reading-comprehension
Last synced: 13 Nov 2024
https://github.com/greenelab/pubtator
Retrieve and process PubTator annotations
data nlp pubmed pubtator snorkel text-mining tool
Last synced: 13 Nov 2024
https://github.com/rangilyu/llama.mmengine
Training LLaMA language model with MMEngine! It supports LoRA fine-tuning!
alpaca fine-tuning language-model llama lora nlp
Last synced: 22 Oct 2024
https://github.com/dpressel/textrank-js
TextRank algorithm implementation in JavaScript
Last synced: 28 Oct 2024
https://github.com/danieldeutsch/repro
Repro is a library for easily running code from published papers via Docker.
docker machine-learning nlp reproducibility reproducible-research
Last synced: 06 Nov 2024
https://github.com/jina-ai/example-multimodal-fashion-search
Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP
computer-vision deep-learning neural-search nlp python
Last synced: 16 Nov 2024
https://github.com/eellak/gsoc2018-3gm
💫 Automated codification of Greek Legislation with NLP
automation codification government-documents government-gazette gsoc-2018 legal-texts natural-language-processing natural-language-understanding nlp python3 text-mining
Last synced: 08 Nov 2024
https://github.com/GeekDream-x/SemEval2022-Task8-TonyX
Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity
computational-linguistics cross-lingual crosslingual deep-learning machine-learning multi-lingual multilingual natural-language-processing nlp paper semantic-similarity semeval-2022 xlm-roberta
Last synced: 16 Nov 2024
https://github.com/sanghviharshit/pocket-tagger
📖👓🏷 Tag your getpocket.com articles automatically using natural language processing
articles getpocket google-cloud natural-language-processing nlp pocket scraper tag
Last synced: 30 Oct 2024
https://github.com/applenob/simple_crf
Simple Conditional Random Field implementation in Python
Last synced: 07 Nov 2024
https://github.com/thehamkercat/python-arq
Asynchronous Python Wrapper For A.R.Q API.
api api-wrapper arq chatbot-api deezer deezer-api fastapi natural-language-processing nlp pornhub-api python-arq saavn spam-classification spam-detection spellcheck torrent-api wallpaper-api youtube-api
Last synced: 20 Oct 2024
https://github.com/perone/feste
Feste is a free and open-source framework allowing scalable composition of NLP tasks using a graph execution model that is optimized and executed by specialized schedulers.
deep-learning language-model machine-learning nlp
Last synced: 28 Oct 2024
https://github.com/machinelearningzh/simply-simplify-language
Use machine learning to make your institutional communication more understandable and inclusive.
anthropic einfachesprache leichtesprache llm llms mistral mistralai natural-language-processing nlp openai plainlanguage python spacy streamlit
Last synced: 14 Oct 2024
https://github.com/nschneid/arabic-tagger
AQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training
arabic arabic-language arabic-nlp arabic-wikipedia java named-entities nlp nlp-machine-learning sequence-tagger tagger
Last synced: 08 Nov 2024
https://github.com/tmalsburg/txl.el
Emacs extension providing direct access to DeepL's machine translation API.
emacs language language-technology machine-translation nlp
Last synced: 27 Oct 2024
https://github.com/bentoml/transformers-nlp-service
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more
llm llmops mlops model-deployment model-inference-service model-serving nlp nlp-machine-learning online-inference transformer
Last synced: 13 Nov 2024
https://github.com/megagonlabs/t5-japanese
Code to pre-train Japanese T5 models
natural-language-processing nlp t5 transformer
Last synced: 10 Nov 2024
https://github.com/rosette-api/python
Babel Street Analytics Client Library for Python
categorization entity-extraction fuzzy-matching language-detection language-identification lemmatization machine-learning morphology name-generation name-similarity name-translation natural-language-processing nlp python relation-extraction sentiment-analysis text text-analysis text-mining tokenization
Last synced: 12 Nov 2024
https://github.com/shibing624/text-feature
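Text feature extraction for novels, papers, argumentative essays, and similar texts; extracts features such as words, sentences, and dependency relations. Developed in Python.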
Last synced: 22 Oct 2024
https://github.com/TheHamkerCat/python-arq
Asynchronous Python Wrapper For A.R.Q API.
api api-wrapper arq chatbot-api deezer deezer-api fastapi natural-language-processing nlp pornhub-api python-arq saavn spam-classification spam-detection spellcheck torrent-api wallpaper-api youtube-api
Last synced: 09 Aug 2024
https://github.com/winkjs/wink-naive-bayes-text-classifier
Naive Bayes Text Classifier
chatbot classifier machine-learning naive-bayes natural-language-processing nlp sentiment-analysis text-classification winkjs winknlp
Last synced: 16 Nov 2024
https://github.com/palewire/storysniffer
Inspect a URL and estimate if it contains a news story
data-journalism journalism jupyter-notebook machine-learning news nlp python scikit-learn
Last synced: 11 Oct 2024
https://github.com/sfischer13/python-arpa
🐍 Python library for n-gram models in ARPA format
arpa computational-linguistics language-model library lm nlp python python-3
Last synced: 01 Nov 2024
https://github.com/nlpodyssey/gotokenizers
Go implementation of today's most used tokenizers
bert language-model natural-language-processing natural-language-understanding nlp transformers
Last synced: 15 Nov 2024
https://github.com/microsoft/vistalk
A JavaScript toolkit for Natural Language-based Visualization Authoring
nlp nx reactjs tensorflowjs transformer vega vega-lite visualization
Last synced: 07 Oct 2024