Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and human language interact. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-20 00:15:35 UTC
https://github.com/StefanHeng/Symbolic-Music-Generation
Symbolic music generation taking inspiration from NLP and human composition process
autoregressive-models melody-extraction melody-generation midi music-generation music-xml nlp reformer representation-learning transformer transformer-decoder transformer-xl transformers-models
Last synced: 05 Aug 2024
https://github.com/naturale0/nlp-do-it-yourself
Implement well-known NLP models from scratch with high-level APIs.
machine-learning natural-language-processing nlp pytorch-examples tensorflow-examples
Last synced: 09 Oct 2024
https://github.com/hsgodhia/squad_rasor_nn
PyTorch implementation of the RaSoR paper "Learning Recurrent Span Representations for Extractive Question Answering" (Lee et al. 2016) and experiments with various neural components
deep-learning machine-comprehension nlp pytorch
Last synced: 07 Aug 2024
https://github.com/anakin87/llama2-haystack
Using Llama2 with Haystack, the NLP/LLM framework.
haystack large-language-models llama llm nlp
Last synced: 23 Oct 2024
https://github.com/omarsar/emotion_analysis_elastic_pytorch
Deep Emotion Analysis with Elastic and PyTorch
deep-learning elasticsearch kibana nlp pytorch visualization
Last synced: 13 Oct 2024
https://github.com/megagonlabs/ginza-transformers
Use custom tokenizers in spacy-transformers
ginza natural-language-processing nlp spacy spacy-transformers sudachitra tokenizers transformers
Last synced: 14 Oct 2024
https://github.com/kensuke-mitsuzawa/word2vec-wikification-py
Disambiguation of Wikipedia article names
disambiguation entity-linking nlp python-3 wikipedia wikipedia-database
Last synced: 12 Oct 2024
https://github.com/JnRMnT/ZemberekDotNet
ZemberekDotNet is the .NET Port of Zemberek-NLP (Natural Language Processing tools for Turkish).
csharp language machine-learning morphology natural-language-processing nlp nuget turkish zemberek zemberek-nlp
Last synced: 12 Nov 2024
https://github.com/diego-vicente/dandelion
Rhyme detection in Python
natural-language nlp rhyme rhyme-analysis
Last synced: 19 Nov 2024
https://github.com/wazzabeee/pyspark-etl-twitter
Implementation of an ETL process for real-time sentiment analysis of tweets with Docker, Apache Kafka, Spark Streaming, MongoDB and Delta Lake
delta-lake docker etl etl-pipeline etl-process kafka kafka-consumer kafka-producer kafka-streams mongodb nlp pyspark python sentiment-analysis spark spark-streaming tweet-analysis tweet-classification twitter twitter-sentiment-analysis
Last synced: 13 Nov 2024
https://github.com/yfletberliac/mango
Question-Answering NLP model with character-level RNN (TensorFlow).
character-level-rnn deep-learning machine-learning nlp rnn tensorflow
Last synced: 05 Nov 2024
https://github.com/jaron/sciencegraph
A comprehensive knowledge graph of scientific concepts
knowledge-graph neo4j nlp question-answering
Last synced: 04 Nov 2024
https://github.com/aayushpatel007/topicrankpy
A Python package to get useful information from documents using TopicRank Algorithm.
data-preprocessing email-parsing graph-algorithms hierarchical-clustering keyphrase-extraction keywords-extraction named-entity-recognition network-x nlp pagerank-python phone-parse spacy text-cleaning textrank topicrank
Last synced: 12 Oct 2024
https://github.com/SapienzaNLP/xl-amr
XL-AMR is a sequence-to-graph cross-lingual AMR parser that exploits transfer learning (EMNLP2020).
abstract-meaning-representation amr amr-graphs amr-parsing natural-language-processing nlp semantic-parsing translations
Last synced: 30 Oct 2024
https://github.com/karoly-hars/gpt2_episode_summary_generator
Utilizing webscraping and state-of-the-art NLP to generate TV show episode summaries.
artifical-intelligense deep-learning gpt2 imdb natural-language-generation natural-language-processing neural-networks nlp pytorch-transformers scrapy torch webscraping wikipedia
Last synced: 15 Nov 2024
https://github.com/snipsco/snips-nlu-parsers
Rust crate for entity parsing
entity-recognition entity-resolution nlp nlu rust
Last synced: 08 Nov 2024
https://github.com/tokestermw/spacy_kenlm
:game_die: KenLM extension for spaCy 2.0.
kenlm language-model nlp spacy spacy-extension spacy-nlp
Last synced: 14 Oct 2024
https://github.com/yuanxiaosc/Deep_dynamic_word_representation
TensorFlow code and pre-trained models for A Dynamic Word Representation Model Based on Deep Context. It combines ideas from the BERT model with ELMo's deep contextual word representations.
Last synced: 22 Aug 2024
https://github.com/grahamwaters/lorebook_generator_for_novelai
Generates a lorebook for novelai
author history nlp novelai research writing-tool
Last synced: 01 Nov 2024
https://github.com/maxbot-ai/maxbot
Maxbot is an open source library and framework for creating conversational apps
bot botkit chatbot chatbot-framework conversational-ai conversational-apps maxbot nlp rasa spacy text-bot voice-bot
Last synced: 18 Oct 2024
https://github.com/tuanacelik/unstructuredio-haystack
💙 Unstructured Data Connectors for Haystack 2.0
haystack llm nlp python unstructured-data
Last synced: 23 Oct 2024
https://github.com/ccoreilly/spacy-catala
Spacy NLP Model for the Catalan language
catalan catalan-language nlp nlp-model nlu nlu-model spacy
Last synced: 23 Oct 2024
https://github.com/liamdugan/human-detection
Code for the AAAI 2023 Paper "Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text"
human-annotation nlp text-generation
Last synced: 27 Oct 2024
https://github.com/carrychang/litnlp
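litNlp is a lightweight deep sentiment-polarity analysis library built on TensorFlow 2.0. It supports fine-grained, multi-level sentiment-polarity training and prediction, and offers a quick way to build sentiment analysis and text classification models. Example applications include mining guest reviews for homestays; see the link on the right.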
Last synced: 13 Nov 2024
https://github.com/cloneofsimo/zeroshot-storytelling
Github repository for Zero Shot Visual Storytelling
beam-search ml nlp vision-and-language
Last synced: 20 Nov 2024
https://github.com/mramshaw/speech-recognition
Speech recognition with Python
microprocessor monotonic nlp pocketsphinx pyaudio python raspberry-pi raspberry-pi-3 speech-recognition
Last synced: 14 Nov 2024
https://github.com/koichiyasuoka/supar-kanbun-1.3.4
Tokenizer, POS-tagger, and dependency-parser for Classical Chinese
ancient-chinese classical-chinese literary-chinese nlp
Last synced: 16 Nov 2024
https://github.com/ppke-nlpg/purepos
PurePos is an open source hybrid morphological tagger.
hungarian morphological-analysis nlp parser pos-tagger tagger
Last synced: 17 Nov 2024
https://github.com/ailln/text-classification
🦑 Chinese text classification (supports API deployment)
nlp text-classification textcnn
Last synced: 18 Nov 2024
https://github.com/reinfer/blingfire-rs
Rust wrapper for the BlingFire tokenization library
machine-learning nlp rust rust-wrapper tokenizer
Last synced: 19 Nov 2024
https://github.com/LanguageMachines/libfolia
FoLiA library for C++
folia library natural-language-processing nlp
Last synced: 30 Oct 2024
https://github.com/xusenlinzy/lightningblocks
pytorch-lightning code blocks for nlp
bert infomation-extraction named-entity-recognition nlp pytorch pytorch-lightning relation-extraction transformers uie
Last synced: 10 Nov 2024
https://github.com/0101011/analitika
Testing Automatic Text Summarization
hdf5 machine-learning natural-language-processing nlp nlp-machine-learning pickle python tokenizer transformers
Last synced: 09 Nov 2024
https://github.com/sudodoki/nlp-how-to-annotate
Set of guides and references for annotating NLP data
annotation-tool annotations dataset guide nlp nlp-machine-learning
Last synced: 14 Oct 2024
https://github.com/worldbank/iqual
iQual is a package that leverages natural language processing to scale up interpretative qualitative analysis. It also provides methods to assess the bias, interpretability and efficiency of the machine-enhanced codes. iQual has been applied to analyse interviews on parents' aspirations for their children in Cox's Bazaar, Bangladesh.
human-coding natural-language-processing nlp python qualitative-analysis qualitative-research
Last synced: 10 Nov 2024
https://github.com/bluelovers/node-segment
Chinese word segmentation: a Node.js module for segmenting Simplified and Traditional Chinese text, using web novels as sample data
chinese javascript nlp nodejs segment typescript
Last synced: 12 Nov 2024
https://github.com/google-research/fool-me-twice
Game code and data for Fool Me Twice: Entailment from Wikipedia Gamification https://arxiv.org/abs/2104.04725
entailment fever firebase game nlp verification wikipedia
Last synced: 10 Nov 2024
https://github.com/winkjs/wink-lexicon
English lexicon useful in NLP/NLU
english-lexicon nlp nlu wink wink-lexicon wordnet
Last synced: 09 Nov 2024
https://github.com/dair-ai/nlp_highlights
✨ A report of the most important NLP highlights (A Yearly Report - 2018, 2019)
deep-learning machine-learning nlp
Last synced: 04 Nov 2024
https://github.com/hengluchang/visualizing_contextual_vectors
Visualizing ELMo Contextual Vectors for Word Sense Disambiguation
elmo machine-learning nlp python visualization
Last synced: 12 Nov 2024
https://github.com/piesposito/transformers-low-code-experiments
Low-code pre-built pipelines for experiments with huggingface/transformers for Data Scientists in a rush.
deep-learning machine-learning nlp pytorch transformer
Last synced: 08 Nov 2024
https://github.com/erfanzar/ost-opensourcetransformers
OST Collection: An AI-powered suite of text-generative models that predict the next word with remarkable accuracy. OST Collection is based on a novel approach intended to work as a full and intelligent NLP model.
deep-learning nlp pytorch transformer-architecture transformers
Last synced: 07 Nov 2024
https://github.com/thunlp/bkdatk-lws
Code and data of the ACL 2021 paper "Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution"
Last synced: 10 Nov 2024
https://github.com/gorango/lexrank.js
Unsupervised text summarization using the lexrank algorithm
javascript lexrank ml nlp pagerank sentence-relevance text-summarization
Last synced: 11 Oct 2024
https://github.com/Etwas-Builders/Twitter-Source-Bot
Ever wanted to know the source of a tweet? Just @whosaidthis_bot and I'll tell you where it came from
bot mozilla-builders nlp source-verify twitter-bot twitter-source-bot web-scraping
Last synced: 06 Nov 2024
https://github.com/wadaboa/ner-annotator
GUI useful to manually annotate text for Named Entity Recognition purposes
named-entity-recognition ner nlp pyqt5 spacy
Last synced: 12 Oct 2024
https://github.com/bvolpato/mdmlang
🔄 Natural Transformation Language for Java
cleaner integration java nlp rules
Last synced: 13 Oct 2024
https://github.com/brunoarine/findlike
Command-line tool that finds lexically similar documents in relation to a reference text file or ad-hoc query
bm25 nlp similarity-search tfidf
Last synced: 07 Aug 2024
https://github.com/quimpm/youtube_discussion_tree
This is a Python API that lets you obtain the discussion in the comments of a YouTube video as a tree structure. It also tracks the YouTube Data API quota that your implementation consumes through this library, and lets you represent and serialize the discussion tree.
comment-tree conversational nlp social-media social-media-analysis tree-structure youtube youtube-analysis youtube-api youtube-comment youtube-comments youtube-comments-downloader youtube-video
Last synced: 11 Oct 2024
https://github.com/yusufusta/es-anlamlilar
Turkish synonyms (JSON, CSV, XML)
csv json nlp turkce turkish turkish-nlp
Last synced: 23 Oct 2024
https://github.com/JackHCC/Arxiv-NLP-Reporter
Automatically fetches the latest NLP-related papers from arXiv every day (Arxiv Natural Language Processing Paper Automatic Crawl Daily)
Last synced: 09 Nov 2024
https://github.com/totalhack/zillion-web
Zillion Web: A Demo UI and Web API for Zillion
analytics data-warehousing demo-ui docker-swarm-mode dockerswarm fastapi nlp text-to-sql typescript vue warehouse zillion
Last synced: 28 Oct 2024
https://github.com/derhuerst/nbayes
A Naive Bayes classifier written in JavaScript.
bayes natural-language-processing nlp
Last synced: 08 Nov 2024
https://github.com/ztjhz/minilm
Small Model Is All You Need - NTU SC4001 Neural Network & Deep Learning Project
bert deep-learning deepspeed gpt2 llama llm neural-network nlp ntu roberta sc4001 wandb
Last synced: 28 Oct 2024
https://github.com/erfaniaa/trump-vs-harris-on-reddit
Analyze Reddit comments using NLP to predict the potential winner of the US 2024 election
natural-language-processing nlp politics reddit
Last synced: 28 Oct 2024
https://github.com/ndabap/assocentity
Package assocentity returns the mean distance from tokens to an entity and its synonyms
go golang natural-language-processing nlp social-sciences tokenizer
Last synced: 27 Oct 2024
https://github.com/dbklim/uk_stemmer
A small modification of the stemmer for the Ukrainian language (https://github.com/Amice13/ukr_stemmer)
natural-language-processing nlp stemmer stemmers stemming stemming-algorithm uk ukr ukrainian ukrainian-morphology
Last synced: 11 Nov 2024
https://github.com/jpmanson/llm_templates
Instruction/chat prompts creation library for text generation LLMs. It supports local and Hugging Face models.
chatbot cohere gemma huggingface jinja2 library llama2 llama3 llm mistral nlp nlp-library phi3 template
Last synced: 10 Oct 2024
https://github.com/shnewto/ttaw
a piecemeal natural language processing library
alliteration cmu cmudict crates double-metaphone language metaphone natural natural-language natural-language-processing nlp phonemes phones processing pronounce pronounciation rhyme rust syllables
Last synced: 02 Nov 2024
https://github.com/hscspring/bytepiece-rs
The Bytepiece Tokenizer Implemented in Rust.
bytepiece language-model nlp tokenizer
Last synced: 28 Oct 2024
https://github.com/priyamakeshwari/teachgpt
An AI-powered teacher that can help you learn your topics faster before an exam
ai hacktoberfest hacktoberfest2023 llm machine-learning nlp python
Last synced: 27 Oct 2024
https://github.com/jzonthemtn/hashitalks2021-terraform-nlp
From Training to Serving: Machine Learning Models with Terraform
Last synced: 07 Nov 2024
https://github.com/azu/morpheme-match
A match function that matches tokens (from morphological analysis) against a sentence.
japanese javascript kuromoji morpheme-match nlp
Last synced: 02 Nov 2024
https://github.com/ikergarcia1996/questionclustering
Question classifier written in Python 3, implemented in the following video: https://youtu.be/qnlW1m6lPoY
clustering crawler deep-learning inteligencia-artificial machine-learning natural-language-processing nlp pln sentiment-analysis techonology unsupervised-machine-learning word-embeddings
Last synced: 27 Oct 2024
https://github.com/zafarali/sentiment.datalogue
Sentiment analysis challenge for Datalogue recruiting
deep-neural-networks machine-learning nlp sentiment-analysis
Last synced: 13 Oct 2024
https://github.com/rokoroku/node-twitter-korean-text
(Deprecated) use open-korean-text
Last synced: 28 Oct 2024
https://github.com/seujung/gluonnlp_tutorial
GluonNLP tutorial for Pycon2019
Last synced: 11 Nov 2024
https://github.com/talschuster/tokenmasker
Masking tokens to modify the predictions of a pretrained sentence classifier
fact-checking masker nlp rational
Last synced: 08 Nov 2024
https://github.com/chachamatcha/DL_Text_Classification
Collection of Deep Learning Text Classification Models in Keras; Includes a GPU tutorial.
benchmark benchmarks deep-learning deep-learning-tutorial deep-neural-networks gpu kaggle kaggle-competition keras keras-tutorials natural-language-processing nlp python tensorflow-gpu text text-classification text-processing toxic-comment-classification tutorial
Last synced: 07 Aug 2024
https://github.com/generall/extwikilinks
Extended Wikilinks dataset description
corenlp dataset nlp wiki-links
Last synced: 14 Oct 2024
https://github.com/mmbazel/predicting-kickstarter-campaign-outcomes-using-nlp-feature-engineering
Turning raw Kickstarter text data => Campaign predictions using spaCy, scikit-learn, SQLAlchemy, SQLite3 & XGBoost Classifier (feat eng = Bag-of-Words, TfidfVectorizer)
classification data-science feature-engineering kickstarter-campaigns nlp nlp-feature-engineering nlp-machine-learning springboard springboard-career-track springboard-data-science springboard-projects sqlite-database sqlite3
Last synced: 18 Nov 2024
https://github.com/mullerpeter/authorstyle
Python package to deal with PAN corpora and extract stylometric features from text documents.
author-attribution intrinsic-plagiarism-detection nlp pan python stylometric-features stylometry
Last synced: 15 Nov 2024
https://github.com/EmreTaha/Unsupervised-Domain-Adaptation-with-BERT
Unsupervised domain adaptation with BERT for Amazon food product reviews sentiment analysis.
adversarial-learning amazon-food-reviews bert bert-model colab domain-adaptation nlp sentiment-analysis tensorflow unsupervised-learning
Last synced: 15 Nov 2024
https://github.com/hkproj/mistral-llm-notes
Notes on the Mistral AI model
llm mistral mistral-7b mixtral mixtral-8x7b nlp pytorch sliding-window-attention xformers
Last synced: 17 Nov 2024
https://github.com/kyegomez/logicguide
Plug-and-play implementation of "Certified Reasoning with Language Models" that elevates model reasoning by 40%
artificial-intelligence attention-mechanism deep-learning gpt4 large-language-models nlp prompt-engineering transformer
Last synced: 09 Nov 2024
https://github.com/vzhong/vocab
Vocabulary objects for natural language processing
Last synced: 14 Oct 2024
https://github.com/danieljdufour/location-extractor
Extracts locations from text
Last synced: 06 Nov 2024
https://github.com/tlkh/fake-news-chrome-extension
Chrome Extension to help fight Online Misinformation
chrome deep-learning fake-news nlp
Last synced: 07 Nov 2024
https://github.com/shujian2015/neural-net-linguistics
Papers about NN and linguistics
Last synced: 09 Nov 2024
https://github.com/bloomberg/emnlp20_depsrl
Research code and scripts used in the paper Semantic Role Labeling as Syntactic Dependency Parsing.
dependency-parsing emnlp emnlp2020 nlp semantic-role-labeling
Last synced: 09 Nov 2024
https://github.com/charlywargnier/s4_wiki_topic_grapher
Leverage the power of the Google Natural Language API to retrieve entity relationships from Wikipedia URLs or topics! Get interactive networkx graphs of connected entities!
data-scraping datascience nlp nlproc seo wikipedia
Last synced: 19 Nov 2024
https://github.com/shujian2015/graphnet_nlp_paper
List of papers that applied graph network to NLP
Last synced: 09 Nov 2024
https://github.com/harshcasper/blind-app-reviews
Scraped reviews of over 25 companies from the Blind App ⚡️
blind-app company-reviews dataset nlp scrape scraped-data text-mining webscraping
Last synced: 08 Nov 2024
https://github.com/deepset-ai/haystack-home
Website for Haystack, the open source LLM framework
blog llm machine-learning nlp open-source
Last synced: 06 Nov 2024
https://github.com/csinva/cookiecutter-ml-research
A logical, reasonably standardized, but flexible project structure for conducting ml research 🍪
ai artificial-intelligence classification data-science machine-learning ml ml-tooling modeling natural-language-processing nlp python regression research statistics tabular-data template
Last synced: 09 Nov 2024
https://github.com/xtrendence/cryptoshare
Grade: 82%. My COMP3000 final year university project that allows you to manage nearly every facet of your finances with an open-source web, mobile, and desktop application, along with a self-hosted GraphQL API.
budgeting chatbot cryptocurrency cryptocurrency-portfolio docker encryption finance graphql income neutralinojs nlp node-nlp nodejs php react react-native stock-portfolio stocks webapp website
Last synced: 10 Nov 2024
https://github.com/cdpierse/breame
Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American English
nlp nlp-library python search-engine spelling utility-library
Last synced: 15 Oct 2024
https://github.com/varunon9/chat-reply-suggestions
Auto-reply suggestions for chat messages/emails (like Gmail and LinkedIn) built using the rasa_nlu framework.
chat-reply chatbot nlp rasa rasa-nlu
Last synced: 27 Oct 2024
https://github.com/ndabAP/assocentity
Package assocentity returns the mean distance from tokens to an entity and its synonyms
go golang natural-language-processing nlp social-sciences tokenizer
Last synced: 26 Oct 2024
https://github.com/primaprashant/ai-customer-support
📚 Curated collection of blogs and papers on how different companies are using machine learning in production for better customer support.
ai applied-data-science applied-machine-learning applied-ml artificial-intelligence customer-service customer-support data-science deep-learning machine-learning natural-language-processing nlp paper production tech-blog
Last synced: 07 Nov 2024
https://github.com/izhx/ner-unlabeled-data-retrieval
[COLING 22] Domain-Specific NER via Retrieving Correlated Samples.
bert named-entity-recognition natural-language-processing ner nlp pytorch
Last synced: 12 Nov 2024
https://github.com/eklem/stopword-trainer
A module for creating stopword lists for any language, based on a set of documents.
document-processing information-retrieval nlp stopwords stopwords-removal
Last synced: 08 Nov 2024
https://github.com/scofield7419/strumatchdl
Code for the ICML 2022 paper: Matching Structure for Dual Learning
Last synced: 11 Nov 2024
https://github.com/linuxscout/mishtar
Mishtar: Named and temporal entities chunker
arabic-language arabic-nlp chunking named-entity-recognition nlp temporal-entities-chunker
Last synced: 25 Oct 2024
https://github.com/arthurdelamare/job-matcher
A resume parser, position parser and job matcher using Python.
ai artificial-intelligence job-matcher natural-language-processing nlp position position-parser python resume resume-parser
Last synced: 09 Nov 2024
https://github.com/winkjs/wink-jaro-distance
An implementation of the Jaro distance algorithm by Matthew A. Jaro
jaro jaro-distance jaro-similarity natural-language-processing nlp string-matching
Last synced: 09 Nov 2024
https://github.com/wannaphong/isannlp
Isan NLP
natural-language-processing nlp thai-language thai-nlp
Last synced: 08 Nov 2024
https://github.com/elliotxx/eventextractbynovel
[NLP] SVM-based event type recognition for web novels
event-detection event-extraction nlp svm
Last synced: 06 Nov 2024