Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2025-02-20 00:20:37 UTC
https://github.com/avacaondata/nlpboost
Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks.
deep-learning hyperparameter-optimization hyperparameter-tuning natural-language-generation natural-language-processing natural-language-understanding nlp pytorch text-classification text-generation
Last synced: 11 Feb 2025
https://github.com/code-kern-ai/embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
classification machine-learning named-entity-recognition natural-language-processing ner nlp python representation-learning similarity-search
Last synced: 10 Nov 2024
https://github.com/daac-tools/python-vaporetto
🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.
analyzer japanese morphological-analysis nlp python rust segmentation tokenization tokenizer
Last synced: 20 Dec 2024
https://github.com/deepset-ai/haystack-search-pipeline-streamlit
🚀 Template Haystack Search Application with Streamlit
Last synced: 06 Nov 2024
https://github.com/artitw/text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
artificial-intelligence bert categorization classification classifier machine-learning natural-language-processing natural-language-understanding nlp tensorflow text text-classification transformers
Last synced: 28 Oct 2024
https://github.com/yuvalpinter/nytwit
New York Times Word Innovation Types dataset
computational-linguistics corpus dataset news nlp
Last synced: 07 Feb 2025
https://github.com/ekinakyurek/knetlayers.jl
Useful Layers for Knet
computer-vision deep-learning machine-learning nlp
Last synced: 30 Jan 2025
https://github.com/appcoda/naturallanguageprocessing
A Quick Demo for NLP in Swift 4
demo nlp playgrounds swift swift4
Last synced: 15 Nov 2024
https://github.com/languagemachines/luiginlp
A workflow system for Natural Language Processing.
natural-language-processing nlp workflow-management-system
Last synced: 04 Dec 2024
https://github.com/anthonymrios/relation-extraction-rnn
Bi-directional LSTM model for relation extraction
machine-learning neural-network nlp relation-extraction
Last synced: 20 Nov 2024
https://github.com/abhisheksoni27/whatsapp-chat-analysis
Mine WhatsApp chat data and draw awesome inferences.
machine-learning nlp text-mining
Last synced: 10 Jan 2025
https://github.com/rbhatia46/twitter-sentiment-analysis-web
Twitter Sentiment Analysis using Textblob and Tweepy, wrapped with Flask as a web app.
deep-learning nlp twitter-sentiment-analysis
Last synced: 24 Nov 2024
https://github.com/rchtgpt/gitg0
a magnificent tool to auto-suggest everything you need before pushing a git commit • built @mlh-fellowship
cli-tool dev-tool javascript nlp npm
Last synced: 15 Feb 2025
https://github.com/priyamakeshwari/teachgpt
An AI-powered teacher that can help you learn your topics faster before an exam
ai hacktoberfest hacktoberfest2023 llm machine-learning nlp python
Last synced: 19 Jan 2025
https://github.com/jpmanson/llm_templates
Instruction/chat prompts creation library for text generation LLMs. It supports local and Hugging Face models.
chatbot cohere gemma huggingface jinja2 library llama2 llama3 llm mistral nlp nlp-library phi3 template
Last synced: 24 Jan 2025
https://github.com/shekswess/synthgenai
SynthGenAI - Package for Generating Synthetic Datasets using LLMs.
ai bedrock gemini generative-ai huggingface langfuse litellm llm llms machine-learning nlp ollama openai python synthetic-data synthetic-dataset-generation
Last synced: 24 Jan 2025
https://github.com/salesforce/overture
Library for soft prompt tuning
deep-learning nlp prompt-tuning python pytorch soft-prompt-tuning
Last synced: 08 Nov 2024
https://github.com/eric-haibin-lin/nlp-notebooks
A collection of natural language processing notebooks.
deep-learning deep-learning-tutorial natural-language-generation natural-language-inference natural-language-processing natural-language-understanding nlp nlp-resources
Last synced: 28 Oct 2024
https://github.com/tirendazacademy/hugging-face-tutorials
Getting started with Hugging Face
deep-learning hugging-face huggingface huggingface-datasets huggingface-library huggingface-pipeline huggingface-transformer huggingface-transformers image-classification machine-learning natural-language-processing nlp pretrained-models pytorch sentiment-analysis tensorflow text-classification transfer-learning
Last synced: 08 Nov 2024
https://github.com/fer-aguirre/pmdm
Political Misogynistic Discourse Monitor team from the 2021 JournalismAI Collab Challenges
nlp social-network-analysis text-classification
Last synced: 05 Nov 2024
https://github.com/bretttolbert/verbecc-svc
Dockerized Python microservice with REST API for verb conjugation in French, Spanish and Portuguese
conjugation conjugator french french-language french-nlp linguistics machine-learning natural-language natural-language-processing nlp portuguese-language portuguese-verbs romanian romanian-language scikit-learn spanish-language spanish-verbs verb-conjugation
Last synced: 13 Jan 2025
https://github.com/joshdevins/demo-es-lang-ident
Demo: Elasticsearch Language Identification
demo elasticsearch language-identification nlp search
Last synced: 23 Oct 2024
https://github.com/explosion/spacy-benchmarks
💫 Runtime performance comparison of spaCy against other NLP libraries
benchmarking benchmarks natural-language-processing nlp spacy
Last synced: 17 Jan 2025
https://github.com/li-plus/rouge-metric
A Python wrapper of the official ROUGE-1.5.5.pl script and a re-implementation of full ROUGE metrics.
machine-learning nlp pypi python rouge rouge-metric summarization
Last synced: 06 Nov 2024
https://github.com/yhy1117/x-mixup
Implementation of ICLR 2022 paper "Enhancing Cross-lingual Transfer by Manifold Mixup".
cross-lingual-transfer manifold-mixup nlp
Last synced: 21 Dec 2024
https://github.com/rileynwong/rpi-poetry-generator
Poetry theremin: use Raspberry Pi with hardware sensors to generate poetry using NLP techniques, based on physical light and distance conditions
distance-sensor generative-art generative-poetry generative-text hardware interactive interactive-art interactive-text-generation light-sensor natural-language-processing nlp nltk poetry python raspberry-pi rpi sensors sentiment-analysis
Last synced: 14 Oct 2024
https://github.com/banyh/PyStanfordNLP
A Python Wrapper of Stanford Chinese Segmenter
nlp postagging python-wrapper stanford stanford-chinese-segmenter
Last synced: 14 Nov 2024
https://github.com/derintelligence/en-az-parallel-corpus
English-Azerbaijani parallel language corpus
azerbaijan azerbaijani-translation corpus language linguistics nlp parallel translation
Last synced: 13 Nov 2024
https://github.com/dpressel/arcs-py
Arc-Eager and Arc-Hybrid Greedy Dependency Parser with Dynamic Oracle in Python (with no Dependencies!)
Last synced: 28 Oct 2024
https://github.com/winkjs/wink-porter2-stemmer
Javascript Implementation of Porter Stemmer Algorithm V2 by Dr Martin F Porter
natural-language-processing nlp porter-stemmer-algorithm porter-stemmer-v2 stemmer
Last synced: 09 Nov 2024
https://github.com/anthonysigogne/web-search-engine
API - a simple web search engine
api elasticsearch google-search indexing nlp python search-engine
Last synced: 12 Nov 2024
https://github.com/alexcg1/easy_text_generator
Generate text from machine-learning models right in your browser
machine-learning nlp python streamlit
Last synced: 08 Jan 2025
https://github.com/sagorbrur/bntransformer
Bengali transformer using transformers
bangla bengali bengali-language-modeling bengali-ner bengali-nlp bengali-question-answering bengali-transformer nlp transformers
Last synced: 30 Dec 2024
https://github.com/ahammadmejbah/ahammadmejbah
Data Science || Machine Learning || Deep Learning || Computer Vision || NLP Enthusiast Talks about #datascience, #deeplearning, #dataanalytics, #machinelearning, and #machinelearningalgorithms
artificial-intelligence computer-vision data-science deep-learning machine-learning nlp python
Last synced: 11 Nov 2024
https://github.com/oroszgy/hungarian-text-mining-workshop
Materials for the Text Mining workshop held in the HuNLP meetup, June 2017
classification hungarian information-extraction keyword-extraction machine-learning meetup natural-language-processing nlp python scikit-learn sentiment-analysis spacy spacy-models text-mining text-mining-workshop textacy tutorial workshop
Last synced: 08 Nov 2024
https://github.com/percevalw/metanno
Annotator building tool for Jupyter
annotator customizable jupyter modular nlp
Last synced: 08 Nov 2024
https://github.com/tomhosking/hercules
Hercules: Attributable and Scalable Opinion Summarization (ACL 2023)
nlp opinion-summarization summarization vq-vae
Last synced: 30 Dec 2024
https://github.com/gmontamat/poor-mans-transformers
Implement Transformers (and Deep Learning) from scratch in NumPy
deep-learning from-scratch machine-learning ml-framework neural-network nlp transformers
Last synced: 30 Oct 2024
https://github.com/revdotcom/words2num
Convert words to numbers
inverse-text-normalization nlp
Last synced: 11 Nov 2024
https://github.com/mush42/libtashkeel
Add Arabic diacritics (tashkeel/harakat) using Rust/Python/C++/WASM and NLP models
arabic diacritics nlp tashkeel
Last synced: 04 Jan 2025
https://github.com/juliastrings/tinysegmenter.jl
Julia version of TinySegmenter, compact Japanese tokenizer
Last synced: 23 Jan 2025
https://github.com/richardlitt/thesis
My thesis on "Open Source Code and Low Resource Languages" for an MSc in Language Science and Technology at Saarland University
dissertation endangered-languages low-resource-languages lrl nlp nlproc saarland saarland-university thesis
Last synced: 04 Feb 2025
https://github.com/xv44586/knowledge-distillation-nlp
Some demos of knowledge distillation in NLP
bert keras knowledge-distillation nlp
Last synced: 17 Nov 2024
https://github.com/primaprashant/ai-customer-support
📚 Curated collection of blogs and papers on how different companies are using machine learning in production for better customer support.
ai applied-data-science applied-machine-learning applied-ml artificial-intelligence customer-service customer-support data-science deep-learning machine-learning natural-language-processing nlp paper production tech-blog
Last synced: 19 Feb 2025
https://github.com/canclid/sentences
Cantonese conversation corpus
natural-language-processing nlp
Last synced: 12 Feb 2025
https://github.com/pnnl/cactus
LLM Agent that leverages cheminformatics tools to provide informed responses.
cheminformatics chemistry foundation-models llm llm-agent nlp science
Last synced: 25 Nov 2024
https://github.com/AnthonyMRios/adversarial-relation-classification
Unsupervised domain adaptation method for relation extraction
bioinformatics biomedical-data-science machine-learning natural-language-processing nlp nlp-machine-learning relation-extraction
Last synced: 15 Nov 2024
https://github.com/tencent-ailab/season
[EMNLP 2022] Salience Allocation as Guidance for Abstractive Summarization
nlp summarization summarization-model
Last synced: 18 Nov 2024
https://github.com/princeton-nlp/lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effectively control these agents through verbal communication.
Last synced: 09 Jan 2025
https://github.com/arbox/wlapi
Ruby based API for the project Wortschatz Leipzig.
computational-linguistics natural-language-processing nlp ruby rubynlp
Last synced: 15 Nov 2024
https://github.com/r13i/twitter-sentiment-analysis
What if we could see the emotions and moods of people through the breadcrumbs they leave on Twitter?
docker emotions influxdb kafka natural-language-processing nlp python sentiment-analysis tweets twitter
Last synced: 07 Dec 2024
https://github.com/ayoolaolafenwa/trainnlp
Sample tutorials for training Natural Language Processing Models with Transformers
huggingface-transformers masked-language-models natural-language-processing nlp transformers
Last synced: 01 Jan 2025
https://github.com/pb2204/spam-detection-model
SPAM-Detection-Model is an NLP model to detect spam messages...
Last synced: 17 Jan 2025
https://github.com/abhaskumarsinha/minimalgpt
MinimalGPT is a concise, adaptable, and streamlined code framework that encompasses the essential components necessary for the construction, training, inference, and fine-tuning of the GPT model. This framework is implemented exclusively using Keras and TensorFlow, ensuring compatibility and coherence within the broader deep learning ecosystem.
ai artificial-intelligence fine-tuning generative-model gpt gpt-2 gpt-models keras keras-tensorflow language-model llm machine-learning neural-network nlp nlp-machine-learning tensorflow tensorflow2 training transformer transformer-architecture
Last synced: 21 Nov 2024
https://github.com/fursovia/geometric_embedding
"Zero-Training Sentence Embedding via Orthogonal Basis" paper implementation
Last synced: 17 Nov 2024
https://github.com/smartdataanalytics/ma-inf-4222-nlp-lab
MA-INF 4222: NLP Lab (University of Bonn)
labs natural-language-processing nlp sda university-of-bonn university-project
Last synced: 21 Nov 2024
https://github.com/koenvervloesem/rasa-docker-arm
Rasa Docker image for ARMv7. Runs on a Raspberry Pi.
arm armhf armv7 armv7l bot bot-framework bots chatbot chatbots chatbots-framework docker docker-image machine-learning natural-language-processing nlp nlu rasa raspberry-pi raspberry-pi-4 raspberry-pi-4b
Last synced: 20 Jan 2025
https://github.com/kampersanda/sif-embedding
Rust implementation of SIF and uSIF: Simple and fast sentence embedding
nlp sentence-embeddings vector-search
Last synced: 29 Nov 2024
https://github.com/deepraj1729/tchatbot
A ChatBot framework to create customizable all purpose Chatbots using NLP, Tensorflow, Speech Recognition
artificial-intelligence chatbot-framework conda deep-learning framework git github machine-learning neural-networks nlp nltk numpy pip pypi python3 sklearn speech-recognition tensorflow virtual-environment
Last synced: 14 Oct 2024
https://github.com/bramvanroy/astred
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. This is useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
alignment linguistics nlp parallel-corpus parsing spacy stanza translation
Last synced: 14 Oct 2024
https://github.com/wetneb/pynif
A small Python library for NLP Interchange Format (NIF) for NER(D) systems
entity-linking gerbil named-entity-recognition nif nlp python
Last synced: 28 Oct 2024
https://github.com/wassname/phoneme2grapheme
Teaching machines to spell with deep learning (acc=>80%) e.g. a model hears "pɹˈaʊd˺ɚ" and writes "prowder" (but it should be "prouder")
cmudict deep-learning deeplearning machine-learning nlp pronunciation spelling
Last synced: 15 Oct 2024
https://github.com/azu/nlp-pattern-match
Natural Language pattern matching library for JavaScript.
english japanese javascript morphological-analysis nlcst nlp pos
Last synced: 01 Nov 2024
https://github.com/study-assist/browser-extension
A tool to help you organise your bookmarks intelligently
bookmarks bookmarks-manager browser-extension data-analysis machine-learning natural-language-processing nlp
Last synced: 06 Nov 2024
https://github.com/artitw/bert_qa
Accelerating the development of question-answering systems based on BERT and TF 2.0
artificial-intelligence bert machine-learning natural-language-processing natural-language-understanding nlp
Last synced: 28 Oct 2024
https://github.com/spacyturk/spacyturk
spaCyTurk - trained models & pipelines for Turkish
floret nlp nlp-library spacy turkish-nlp
Last synced: 15 Feb 2025
https://github.com/cmccomb/rust-stop-words
Common stop words in a variety of languages
languages natural-language-procressing nlp nltk rust-crate stopwords
Last synced: 16 Feb 2025
https://github.com/talmago/spacy_crfsuite
sequence tagging with spaCy and crfsuite
crf crf-model crfsuite entity-extraction entity-extraction-extension entity-tagging nlp sklearn-crfsuite spacy spacy-extension spacy-ner
Last synced: 15 Feb 2025
https://github.com/hpprc/defsent
DefSent: Sentence Embeddings using Definition Sentences
bert natural-language-processing nlp transformers
Last synced: 27 Oct 2024
https://github.com/proycon/deepfrog
An NLP-suite powered by deep learning
deep-learning deep-neural-networks dutch folia frog nlp transformers
Last synced: 08 Nov 2024
https://github.com/thunlp/babelnet-sememe-prediction
Code and data of the AAAI-20 paper "Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets"
Last synced: 10 Nov 2024
https://github.com/o19s/skipchunk
Extracts a latent knowledge graph from text and indexes/queries it in Elasticsearch or Solr
elasticsearch knowledge-graph nlp solr
Last synced: 20 Nov 2024
https://github.com/contextlab/abstract2paper
Auto-generate an entire paper from a prompt or abstract using NLP
auto-text gpt-neo nlp notebook-jupyter text-generation
Last synced: 06 Nov 2024
https://github.com/kklemon/flashperceiver
Fast and memory efficient PyTorch implementation of the Perceiver with FlashAttention.
attention-mechanism deep-learning flash-attention nlp perceiver transformer
Last synced: 19 Nov 2024
https://github.com/monk1337/graph-neural-networks-for-nlp
Graph Neural networks for NLP
attention-mechanism gcn gnn graph graph-attention-networks graph-natural-language-processing graph-networks graph-neural-networks graph-nlp machine-learning multi-label-classification multi-label-learning natural-language-processing neural-network nlp pytorch pytorch-geometric
Last synced: 24 Nov 2024
https://github.com/tlkh/t2t-tuner
Convenient Text-to-Text Training for Transformers
gpt huggingface language-model nlp pytorch t5 transformers
Last synced: 07 Nov 2024
https://github.com/liyucheng09/llm-compressive
Longitudinal Evaluation of LLMs via Data Compression
benchmark evaluation llm llms nlp
Last synced: 30 Oct 2024
https://github.com/vishnunkumar/doc_transformers
Document processing using transformers
Last synced: 16 Nov 2024
https://github.com/bloomberg/mixce-acl2023
Implementation of MixCE method described in ACL 2023 paper by Zhang et al.
language-model machine-learning nlp python pytorch transformer
Last synced: 09 Nov 2024
https://github.com/gdamdam/sumo
Tool to extract the text from web article URLs and get word frequencies, entity recognition, automatic summaries, and more
automatic-summarization content-extraction entity-recognition nlp nltk semantic-analysis sentence-extraction
Last synced: 14 Nov 2024
https://github.com/anthonysigogne/keyword-mining
API - extract a list of keywords from a text.
docker keyword keyword-extraction nlp python-2 seo
Last synced: 20 Jan 2025
https://github.com/bububa/jiagu
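Jiagu deep learning natural language processing toolkit: knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering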
chinese-nlp chinese-word-segmentation classification clustering cws ner nlp pos segmentation
Last synced: 08 Nov 2024
https://github.com/hc200ok/manual-data-masking
A lightweight javascript library for manual data masking
data-masking dataset dataset-generation ecmascript2020 javascript library manual-data-masking nlp
Last synced: 11 Nov 2024
https://github.com/andrewtavis/wikirec
Recommendation engine framework based on Wikipedia data
bert bert-embeddings books doc2vec lda machine-learning multilingual natural-language-processing neural-network nlp open-source python python3 recommendation-engine recommender-system text-mining tfidf unsupervised-learning wikipedia wikipedia-data
Last synced: 22 Jan 2025
https://github.com/code-kern-ai/refinery-python-sdk
Official Python SDK for Kern AI refinery.
active-learning data-centric-ai deep-learning labeling labeling-tool machine-learning natural-language-processing neural-search nlp python sdk spacy supervised-learning text-annotation text-classification transformer
Last synced: 01 Nov 2024
https://github.com/yashdew/assessor
An open-source Resume Analyzer and Ranking tool for recruiters and candidates.
flask hacktoberfest hacktoberfest2021 nextjs nlp python spacy
Last synced: 27 Oct 2024
https://github.com/fedenunez/tulp
Tulp is a command-line tool that can help you create and process piped content using the power of ChatGPT directly from the terminal.
chatgpt chatgpt-api console llm nlp shell unix-shell
Last synced: 13 Nov 2024
https://github.com/centrefordigitalhumanities/tscan
T-scan: a tool for assessing the complexity of Dutch texts, based on original work by Rogier Kraf
dutch-language feature-extraction nlp text-difficulty
Last synced: 06 Nov 2024
https://github.com/google-research/pangea
Panoramic Graph Environment Annotation toolkit, for collecting audio and text annotations in panoramic graph environments such as Matterport3D and StreetLearn.
annotation-tool computer-vision crowdsourcing nlp
Last synced: 10 Nov 2024
https://github.com/varunon9/sentence-type-classifier
Classify English sentences into assertive, negative, interrogative, imperative and exclamatory based on grammar.
english-grammar nlp nlp-machine-learning sentence-classification
Last synced: 27 Oct 2024
https://github.com/greenelab/preprint-similarity-search
A web app that uses machine learning to recommend the most suitable journals based on the text content of your preprint
journals nlp nlp-machine-learning web-app
Last synced: 13 Nov 2024
https://github.com/neokd/datastorehouse
DataStoreHouse is an open-source project that aims to create a collaborative platform for gathering and sharing a wide variety of datasets. It provides a centralised repository where individuals and organisations can contribute, discover, and collaborate on diverse datasets for various domains.
api csv datasets good-first-issue hacktoberfest hacktoberfest2023 json machinelearning nextjs13 nlp open-source opensource opensource-projects python reactjs
Last synced: 11 Jan 2025
https://github.com/mindspore-courses/deepnlp-models-mindspore
MindSpore implementations of various deep NLP models from CS224n (Stanford University)
deep-learning mindspore nlp tutorial
Last synced: 09 Nov 2024
https://github.com/megagonlabs/ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
dataset japanese-language nlp text-classification text-generation
Last synced: 07 Jan 2025
https://github.com/michellebonat/fed_funds_ml
Use machine learning (NLP) to demonstrate whether Federal Funds rate changes can be accurately predicted using only the meeting minutes of the FOMC, the US Federal Reserve's policy committee.
ai federal-reserve-bank finance financial-services machine-learning nlp python3
Last synced: 22 Jan 2025
https://github.com/KxSystems/nlp
Natural-language processing library
clustering dataset embedpy kdb natural-language-processing nlp parsing python q vector
Last synced: 12 Nov 2024