Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have advanced language modeling, parsing, and many other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-15 00:20:20 UTC
https://github.com/IBM/MAX-Text-Sentiment-Classifier
Detect the sentiment captured in short pieces of text
docker-image ibm machine-learning machine-learning-models natural-language-processing natural-language-understanding nlp sentiment tensorflow
Last synced: 04 Aug 2024
https://github.com/josefalbers/roy
Roy: A lightweight, model-agnostic framework for crafting advanced multi-agent systems using large language models.
agent agentgpt autogen autogpt baby-agi chat chatbot code-generation code-generator gpt langchain llm llm-agent multi-agent nlp prompt-engineering quantization retrieval-augmented-generation vector-index wizardcoder
Last synced: 08 Nov 2024
https://github.com/wjbmattingly/spacyex
SpaCyEx allows the creation of spaCy Matcher patterns with regex-like syntax.
Last synced: 31 Oct 2024
https://github.com/doragd/text-classification-pytorch
Implementation of papers for text classification task on SST-1/SST-2
bilstm-attention nlp sentiment-classification text-classification textcnn
Last synced: 29 Oct 2024
https://github.com/searchableai/kitanaqa
KitanaQA: Adversarial training and data augmentation for neural question-answering models
adversarial-attacks adversarial-training bert data-augmentation ml-automation natural-language-processing nlp pytorch question-answering transformer
Last synced: 13 Oct 2024
https://github.com/saturncloud/dask-pytorch-ddp
dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.
computer-vision dask deep-learning distributed-computing machine-learning nlp pytorch
Last synced: 15 Nov 2024
https://github.com/OpenSUM/CPSUM
Code and Data Repo for COLING'22 paper "Noise-injected Consistency Training and Entropy-constrained Pseudo Labeling for Semi-supervised Extractive Summarization"
extractive-summarization nlp semi-supervised-learning
Last synced: 16 Nov 2024
https://github.com/Legilibre/legi.py
Tools for manipulating LEGI archives (French laws)
france laws legi legislation natural-language-processing nlp opendata python
Last synced: 03 Sep 2024
https://github.com/argilla-io/adept-augmentations
A Python library aimed at dissecting and augmenting NER training data.
dataset datasets few-shot-learning machine-learning natural-language-processing nlp spacy
Last synced: 18 Oct 2024
https://github.com/nicolay-r/AREkit
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML
bert datasets frames language-models neural-networks nlp pandas pandas-dataframe prompt prompting relation-extraction sentiment-analysis tensorflow
Last synced: 01 Nov 2024
https://github.com/knadh/indic.page
A directory of Indic (Indian) language computing resources.
datasets indian-language indic-languages language linguistics nlp
Last synced: 28 Oct 2024
https://github.com/NISH1001/tag-generator
A simple tool to generate tags for the given text (document) using TF-IDF.
Last synced: 05 Nov 2024
https://github.com/kubeflow/code-intelligence
ML-Powered Developer Tools, using Kubeflow
deep-learning fastai flask kubeflow kubernetes machine-learning natural-language-processing nlp pytorch rest-api
Last synced: 29 Sep 2024
https://github.com/nlp-uoregon/mlmm-evaluation
Multilingual Large Language Models Evaluation Benchmark
datasets evaluation evaluation-datasets evaluation-framework language-model large-language-models multilingual natural-language-processing nlp
Last synced: 11 Nov 2024
https://github.com/cluebenchmark/mobileqa
Offline reading-comprehension application: QA for mobile, Android & iPhone
albert android bert chinese iphone machine-reading-comprehension nlp qa tensorflow tflite
Last synced: 09 Nov 2024
https://github.com/csvance/armchair-expert
Machine Learning Chatbot
ai bot discord machine-learning markov nlp python twitter
Last synced: 15 Nov 2024
https://github.com/ayaka14732/travis
TrAVis: Visualise BERT attention in your browser
attention-mechanism bart bert heatmap jax natural-language-processing nlp numpy transformers visualization
Last synced: 28 Oct 2024
https://github.com/xyntopia/pydoxtools
Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.
chatgpt document-analysis document-extraction extraction information-retrieval llm nlp pdf python
Last synced: 17 Nov 2024
https://github.com/kyubyong/koparadigm
KoParadigm: Korean Inflectional Paradigm Generator
inflection korean linguistics morphology nlp paradigm
Last synced: 10 Nov 2024
https://github.com/doccano/doccano-mini
Annotation meets Large Language Models (ChatGPT, GPT-3 and alike).
annotation langchain nlp openai
Last synced: 01 Nov 2024
https://github.com/princeton-nlp/calm-textgame
[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games
calm gpt n-gram nlp rl text-based-game
Last synced: 11 Nov 2024
https://github.com/openagi/tefla
Tensorflow based deep-learning library.
ai deep-learning gan generative-adversarial-network lstm ml nlp tensorflow-framework
Last synced: 14 Oct 2024
https://github.com/cluebenchmark/lightlm
High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task
bert chinese chinese-language-model languagemodel nlp nlpcc nlpcc2020
Last synced: 09 Nov 2024
https://github.com/jackaduma/nlp4cybersecurity
NLP model and tech for cyber security tasks
code-injection command-injection cross-site-scripting cross-site-scripting-proof cyber-security cybersecurity deep-learning machine-learning malicious-url-detection network-security nlp nlp-deep-learning nlp-machine-learning password-strength phishing-attacks phishing-detection sql-injection text-classification xss-injection
Last synced: 11 Nov 2024
https://github.com/lunarwhite/tan-division
Chinese corpus sentiment analysis: sentiment analysis of Chinese text on the Tan Songbo hotel review corpus.
deep-learning keras lstm nlp python rnn tensorflow
Last synced: 06 Nov 2024
https://github.com/likejazz/jupyter-notebooks
This repo contains Jupyter Notebooks, miscellaneous stuff.
data-science decision-tree deep-learning jupyter-notebook keras machine-learning nlp pytorch random-forest statistics tensorflow
Last synced: 29 Oct 2024
https://github.com/SamEdwardes/spacytextblob
A TextBlob sentiment analysis pipeline component for spaCy.
natural-language-processing nlp python spacy
Last synced: 04 Aug 2024
https://github.com/mlabouardy/dialogflow-angular5
💬 Bot in Angular 5 & DialogFlow
ai angular angular5 api-ai bot chatbot dialogflow nlp
Last synced: 15 Nov 2024
https://github.com/cohere-ai/sandbox-accelerating-chatbot-training
Leveraging Cohere's models to enable zero-shot routing
chatbot large-language-models llm nlp routing
Last synced: 07 Oct 2024
https://github.com/pclubiitk/model-zoo
Implementations of various Deep Learning models in PyTorch and TensorFlow.
3d-vision classification cnn cnn-model deep-learning gans machine-learning model-zoo nlp object-detection pytorch super-resolution tensorflow vae-gan video
Last synced: 15 Nov 2024
https://github.com/samedwardes/spacytextblob
A TextBlob sentiment analysis pipeline component for spaCy.
natural-language-processing nlp python spacy
Last synced: 14 Oct 2024
https://github.com/dayyass/pytorch-ner
Pipeline for training NER models using PyTorch.
deep-learning hacktoberfest lstm machine-learning named-entity-recognition natural-language-processing ner nlp onnx pipeline python pytorch rnn text
Last synced: 07 Nov 2024
https://github.com/chiphuyen/metrotwitter
What Twitter reveals about the differences between cities and the monoculture of the Bay Area
data-analysis data-visualization emojis nlp nlp-datasets python twitter twitter-dataset
Last synced: 08 Nov 2024
https://github.com/chanind/frame-semantic-transformer
Frame Semantic Parser based on T5 and FrameNet
framenet huggingface nlp semantic-parsing t5 transformers
Last synced: 08 Nov 2024
https://github.com/LaVi-Lab/CLEVA
[EMNLP 2023 Demo] CLEVA: Chinese Language Models EVAluation Platform
Last synced: 16 Nov 2024
https://github.com/thepushkarp/nalcos
Search Git commits in natural language
commit-search commits hacktoberfest huggingface information-retrieval natural-language nlp python sentence-transformers
Last synced: 17 Nov 2024
https://github.com/swader/diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
ai artificial-intelligence bot crawl crawling diffbot machine-learning nlp php scrape scraped-data scraper scraping
Last synced: 15 Nov 2024
https://github.com/jackaduma/threatreportextractor
Extracting Attack Behavior from Threat Reports
advanced-persistent-threat cyber-threat-intelligence cybersecurity deep-learning deeplearning graph graph-algorithms machine-learning machine-learning-algorithms natural-language-processing nlp nlp-machine-learning nlp-parsing security threat-analysis threat-intelligence
Last synced: 11 Nov 2024
https://github.com/wroberts/pygermanet
GermaNet API for Python
german nlp nltk python python-2 python-3 semantic-similarity wordnet
Last synced: 12 Oct 2024
https://github.com/hironsan/neraug
A text augmentation tool for named entity recognition.
deep-learning machine-learning natural-language-processing nlp
Last synced: 27 Oct 2024
https://github.com/developersdigest/function-chain
The FunctionChain is a tool that simplifies and organizes the process of invoking OpenAI functions in your Node.js applications. With this toolkit, you can easily scaffold out and isolate all the OpenAI function calls you need, making your code more modular, maintainable, and scalable.
alpha-vantage artificial-intelligence automation function-calling functionchain langchain machine-learning natural-language natural-language-processing nlp openai pinecone
Last synced: 08 Nov 2024
https://github.com/paulrinckens/timexy
A spaCy custom component that extracts and normalizes temporal expressions
date-parser datetime natural-language-processing nlp python spacy spacy-extension timeml timex3
Last synced: 14 Oct 2024
https://github.com/clipperhouse/uax29
A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes.
go golang nlp tokenization tokenizer uax29 unicode
Last synced: 14 Nov 2024
https://github.com/rguthrie3/morphologicalpriorsforwordembeddings
Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings
blocks emnlp neural-network nlp theano word-embeddings
Last synced: 08 Nov 2024
https://github.com/shibing624/judger
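Automated essay scoring tool. Supports intelligent scoring of Chinese and English essays, self-training of scoring models, processing model data with WEKA, and custom scoring algorithms. Developed in Java.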
aes automated-essay-scoring essayscoring judger nlp
Last synced: 22 Oct 2024
https://github.com/dayyass/muse-as-service
REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.
api bert deep-learning docker embeddings flask gunicorn hacktoberfest machine-learning nlp python rest-api sentence-embeddings service tensorflow text universal-sentence-encoder web-server
Last synced: 07 Nov 2024
https://github.com/minasmz/Persian-Summarization
Statistical and Semantical Text Summarizer in Persian Language
doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm
Last synced: 04 Aug 2024
https://github.com/sovoid/friend.ly
A social media platform with a friend recommendation engine based on personality trait extraction
api deep-learning ejs friend-recommendation javascript mean-stack mongoose nlp nodejs nodemailer oauth2 passportjs social social-login social-network speechtotext text-mining text-to-speech tron webrtc
Last synced: 12 Oct 2024
https://github.com/jenojp/extractacy
spaCy pipeline object for extracting values that correspond to a named entity (e.g., birth dates, account numbers, laboratory results)
entity-extraction entity-linking ner nlp pattern-matching spacy spacy-extension spacy-pipeline
Last synced: 14 Oct 2024
https://github.com/vamshiiitbhu14/nlpswift
NSLinguisticTagger provides a uniform interface to a variety of natural language processing functionality, with support for many different languages and scripts. One can use this class to segment natural language text into paragraphs, sentences, or words, and to tag information about those segments such as part of speech, lexical class, and lemma.
coreml ios nlp nlp-apis swift4
Last synced: 10 Nov 2024
https://github.com/rizerphe/openai-functions
Generate ChatGPT function call schemas based on function docstrings.
chatgpt chatgpt-api chatgpt-functions nlp openai openai-api openai-chatgpt openai-function-call openai-functions python python-library
Last synced: 15 Nov 2024
https://github.com/tokenmill/beagle
Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.
clojure java lucene luwak nlp real-time-search stemming stored-query-engine stream-search
Last synced: 10 Nov 2024
https://github.com/vmenger/deduce
Deduce: de-identification method for Dutch medical text
deidentification dutch dutch-clinical-nlp information-extraction nlp python python-library text-mining text-processing
Last synced: 11 Nov 2024
https://github.com/tokuhirom/jawiki-kana-kanji-dict
Generate SKK/MeCab dictionary from Wikipedia (Japanese edition)
japanese-language nlp wikipedia
Last synced: 27 Oct 2024
https://github.com/MrinmoiHossain/Udacity-Deep-Learning-Nanodegree
The course covers knowledge useful for working on deep learning as an engineer. It includes simple neural networks and training, CNNs, autoencoders and feature extraction, transfer learning, RNNs, LSTMs, NLP, data augmentation, GANs, hyperparameter tuning, and model deployment and serving.
convolutional-networks convolutional-neural-networks deep-learning gans generative-adversarial-network long-short-term-memory-models lstms machine-learning nanodegree neural-network nlp pytorch recurrent-neural-networks rnn sentiment-analysis style-transfer transfer-learning udacity udacity-nanodegree
Last synced: 07 Aug 2024
https://github.com/IBM/MAX-Toxic-Comment-Classifier
Detect 6 types of toxicity in user comments.
comments docker-image ibm machine-learning machine-learning-models natural-language-processing natural-language-understanding nlp pytorch
Last synced: 04 Aug 2024
https://github.com/ferdinandzhong/punctuator
A small seq2seq punctuator tool based on DistilBERT
bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq
Last synced: 16 Nov 2024
https://github.com/gunthercox/mathparse
A Python library for evaluating natural language mathematical equations
Last synced: 27 Oct 2024
https://github.com/htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
information-retrieval nlp persian persian-language persian-nlp persian-stemmer stemmer
Last synced: 04 Aug 2024
https://github.com/FerdinandZhong/punctuator
A small seq2seq punctuator tool based on DistilBERT
bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq
Last synced: 14 Nov 2024
https://github.com/ysenarath/sinling
A collection of NLP tools for Sinhalese (සිංහල).
joiner language-processing morphological-analyser natural-language-processing nlp part-of-speech pos-tagging sinhala sinhala-nlp sinhala-stemmer sinhala-tokenizer splitter tokenizer tool toolkit
Last synced: 06 Nov 2024
https://github.com/nzw0301/lightlda
fast sampling algorithm based on CGS
lda machine-learning nlp python topic-modeling
Last synced: 22 Oct 2024
https://github.com/nzw0301/lightLDA
fast sampling algorithm based on CGS
lda machine-learning nlp python topic-modeling
Last synced: 04 Nov 2024
https://github.com/vdutts7/ai-rapper
Talking Head of your favorite rapper using Transformers, PyTorch, Tortoise TTS, and OpenCV 🎵
huggingface-transformers nlp opencv pytorch tortoise-tts voice-clone
Last synced: 11 Nov 2024
https://github.com/yell/photo-organizer
Lviv Data Science Summer School 2016 project
caffe cnn-model deep-learning googlenet image-classification machine-learning network-in-network neural-network nlp photo-organizer photos resnet transfer-learning
Last synced: 22 Oct 2024
https://github.com/siddk/deep-nlp
Tensorflow Tutorial files and Implementations of various Deep NLP and CV Models.
deep mnist-nn neural-network nlp tensorflow tensorflow-tutorials
Last synced: 22 Oct 2024
https://github.com/abhinav-26/ai-chatbot
This is my Artificial Intelligence project, in which we build an AI contextual chatbot
ai ai-assignments ai-chatbot ai-project chatbot contextual-chatbot deep-learning hacktoberfest hacktoberfest2020 machine-learning natural-language-processing neural-networks nlp stemming tensorflow tflearn
Last synced: 28 Oct 2024
https://github.com/machine-learning-tokyo/poetry-gan
creativity generative-adversarial-network machine-learning nlp research
Last synced: 09 Nov 2024
https://github.com/arne-cl/discoursegraphs
linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX).
conversion converter natural-language-processing networkx nlp python
Last synced: 10 Nov 2024
https://github.com/wayfair-incubator/extra-model
Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.
aspect-based-sentiment-analysis aspect-extraction machine-learning-algorithms nlp nlp-keywords-extraction nlp-library python python-library python3
Last synced: 07 Nov 2024
https://github.com/gentaiscool/few-shot-lm
The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)
few-shot few-shot-learning gpt intent language-model multilingual nlp t5
Last synced: 08 Nov 2024
https://github.com/lonepatient/bert-sentence-similarity-pytorch
This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.
bert nlp pytorch sentence-similarity text-classification
Last synced: 06 Nov 2024
https://github.com/lonePatient/bert-sentence-similarity-pytorch
This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.
bert nlp pytorch sentence-similarity text-classification
Last synced: 02 Nov 2024
https://github.com/kaleidophon/token2index
A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and Tensorflow.
deep-learning deeplearning i2t i2w indexing itos nlp numpy python pytorch rnn rnns seq2seq stoi t2i tensorflow token transformer transformers w2i
Last synced: 09 Nov 2024
https://github.com/blacksamorez/ebanko
NLP based telegram bot
docker-compose grafana nlp telegram-bot
Last synced: 05 Nov 2024
https://github.com/asyml/stave
An extensible framework for building visualization and annotation tools to enable better interaction with NLP and Artificial Intelligence. This is part of the CASL project: http://casl-project.ai/
annotation casl-project nlp petuum visualization
Last synced: 12 Nov 2024
https://github.com/wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
natural-language-processing nlp sentence sentence-segmentation
Last synced: 07 Oct 2024
https://github.com/vzhong/e3
Dockerized code for E3: Entailment-driven Extracting and Editing for Conversational Machine Reading.
deep-learning machine-learning nlp
Last synced: 07 Nov 2024
https://github.com/strangetom/ingredient-parser
A tool to parse recipe ingredients into structured data
ingredients natural-language-processing nlp python recipes
Last synced: 08 Nov 2024
https://github.com/apache/ctakes
Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.
Last synced: 07 Oct 2024
https://github.com/chambliss/foodbert
FoodBERT: Food Extraction with DistilBERT
bert bert-model distilbert food food-extraction information-extraction natural-language-processing nlp nlp-machine-learning python transformers
Last synced: 28 Oct 2024
https://github.com/thomasahle/codenames
Codenames AI using Word Vectors
ai cli codenames game nlp word-embeddings word2vec
Last synced: 29 Oct 2024
https://github.com/jeroenouw/angularai
💬 Angular 6 AI (localhost version is working correctly)
ai angular angular6 artificial-intelligence chatbotai dialogflow google machine-learning material-design ngx nlp rxjs
Last synced: 13 Nov 2024
https://github.com/koichiyasuoka/esupar
Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models for Japanese and other languages
ainu chinese classical-chinese coptic english german japanese natural-language-processing nlp serbian thai transformers
Last synced: 16 Nov 2024
https://github.com/cdpierse/script_buddy_v2
Script Buddy v2 is a film script text generation tool built using the world's most popular film scripts and GPT-2.
artificial-intelligence gpt-2 language-generation machine-learning nlp pytorch transformers
Last synced: 08 Nov 2024
https://github.com/prajjwal1/language-modelling
LM, ULMFit et al.
deep-learning language-modeling nlp pytorch
Last synced: 05 Nov 2024
https://github.com/carrychang/real_time_datamining_software
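Real-time review mining software for Ctrip/Zhenguo homestays: a comprehensive demo of techniques including real-time data collection, data cleaning, structured storage, UGC topic extraction, sentiment analysis, and post-structuring visualization. An opinion-mining project based on online homestay UGC data, covering data mining and NLP processing, and handling tasks such as data collection, topic extraction, and sentiment analysis. It mainly addresses inconsistencies between user ratings and review text. It is a comprehensive tool that evaluates satisfaction with Ctrip and Meituan online homestays in real time and visualizes additional data, mining and visualizing online UGC along multiple dimensions. See the link for a demo video.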
data-mining-software data-spider demo nlp real-time-anslysis sentiment-analysis ugc-analysis
Last synced: 13 Nov 2024
https://github.com/andi611/conditional-seqgan-tensorflow
Conditional Sequence Generative Adversarial Network trained with policy gradient, Implementation in Tensorflow
chatbot conditional-gan gan machine-learning nlp nlp-machine-learning seqgan tensorflow
Last synced: 07 Nov 2024
https://github.com/abhishek-ch/vectorverse
Explore Multiple Vector Databases and chat with documents on Multiple LLM models, private LLM models
chatbot chatgpt chromadb elasticsearch embeddings generative generativeai milvus nlp openai python qdrant redis streamlit vectorstore
Last synced: 28 Oct 2024
https://github.com/lexiestleszek/namegen
Self-contained, minimalistic implementation of a language model that generates coherent, normal-sounding names. It uses an input dataset of names and a probability distribution to generate new names based on sequences of four characters.
language-model machine-learning markov-chain name-generation natural-language-processing nlp
Last synced: 14 Nov 2024
https://github.com/datawhalechina/unlock-hf
Unlock the many uses of the HuggingFace ecosystem
datawhale nlp transformers tutorial
Last synced: 09 Nov 2024
https://github.com/LanguageMachines/PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow
Last synced: 03 Nov 2024
https://github.com/gunale0926/sorsa
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
deep-learning fine-tuning llama lora machine-learning nlp peft python pytorch rwkv sorsa svd transformer
Last synced: 17 Nov 2024
https://github.com/kennethenevoldsen/spacy-wrap
spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Hugging Face in your spaCy pipeline.
deep-learning huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch spacy spacy-extension spacy-extensions spacy-models spacy-nlp spacy-pipeline spacy-transformers text-classification transformers
Last synced: 12 Oct 2024