Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact. In the 1950s, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in the fields of language modeling, parsing, and natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-20 00:15:35 UTC
https://github.com/elizalo/question-answering-based-on-squad
Question Answering System using BiDAF Model on SQuAD v2.0
bidaf machine-learning natural-language-processing natural-language-understanding neural-network nlp nlp-datasets nlp-machine-learning python python-3-6 question-answering squad
Last synced: 28 Sep 2024
https://github.com/kargaranamir/parstdex
A package that extracts Persian time and date markers by applying regexes -- AACL 2022
datetime event-extract event-extraction hengam hengamtagger information-extraction nlp parstdex persian persian-calendar persian-datetime persian-time regex-pattern time-date
Last synced: 04 Aug 2024
https://github.com/loomchild/segment
Program used to split text into segments
Last synced: 14 Nov 2024
https://github.com/yuewang-cuhk/hashtaggeneration
The official implementation of the NAACL-HLT 2019 paper "Microblog Hashtag Generation via Encoding Conversation Contexts"
hashtag-generator nlp social-media
Last synced: 09 Nov 2024
https://github.com/tlack/hairytext
A data labeling and NLP tool for Elixir (uses Spacy)
elixir entity-recognition nlp nlp-machine-learning phoenix-live-view spacy text-classification
Last synced: 28 Oct 2024
https://github.com/karma9874/seq2seq-chatbot
Chatbot based on a Seq2Seq model with a bidirectional RNN and attention mechanism in TensorFlow, trained on the Cornell Movie-Dialogs Corpus and deployed on a Flask server
attention-mechanism bidirectional-lstm chatbot deep-learning flask nlp question-answering seq2seq tensorflow
Last synced: 06 Nov 2024
https://github.com/anoopkunchukuttan/geomm
Geometry-aware Multilingual Embeddings
bilingual-word-embedding multilingual nlp translation word-embedding
Last synced: 18 Nov 2024
https://github.com/rileynwong/spotify-analysis
Data analysis on my monthly playlists
audio-features data-analysis data-scraping lyrics machine-learning natural-language-processing nlp nlp-machine-learning sentiment-analysis spotify-analysis supervised-learning supervised-machine-learning text text-analysis
Last synced: 14 Oct 2024
https://github.com/shibing624/title-generator
Automatic Text Summarization and Title Generation.
deep-learning nlp text-summarization title-generation
Last synced: 22 Oct 2024
https://github.com/hankcs/sub-character-cws
Sub-Character Representation Learning
chinese-word-segmentation cws natural-language-processing nlp representation-learning simplified-chinese traditional-chinese
Last synced: 13 Oct 2024
https://github.com/midas-research/dlkp
A deep learning library for identifying keyphrases from text
dataset deep-learning information-extraction information-retrieval keyphrase-extraction keyphrase-generation machine-learning nlp
Last synced: 17 Nov 2024
https://github.com/mlabouardy/dialogflow-watchnow-messenger
WatchNow FB Messenger bot with DialogFlow & Golang 💬
api-ai bot dialogflow golang messenger nlp
Last synced: 15 Nov 2024
https://gair-nlp.github.io/BeHonest/
BeHonest: Benchmarking Honesty in Large Language Models
alignment benchmark evaluation honesty llm nlp
Last synced: 11 Oct 2024
https://github.com/adhaamehab/arabicnlp
Python package for Arabic natural language processing
arabic arabic-nlp keras ml nlp part-of-speech-tagger postagging sequence-modeling
Last synced: 11 Oct 2024
https://github.com/chengchingwen/BytePairEncoding.jl
Julia implementation of Byte Pair Encoding for NLP
nlp nlp-library nlp-machine-learning word-segmentation
Last synced: 28 Oct 2024
https://github.com/dayyass/latent-semantic-analysis
Pipeline for training LSA models using Scikit-Learn.
data-science hacktoberfest latent-semantic-analysis lsa machine-learning natural-language-processing nlp pipeline python topic-modeling
Last synced: 14 Oct 2024
https://github.com/decalogue/ai
AI: an artificial intelligence toolkit covering machine learning, deep learning, and natural language processing
ai deep-learning dl machine-learning ml natural-language-processing nlp python
Last synced: 15 Nov 2024
https://github.com/andythefactory/romanian-nlp-datasets
A list of Romanian NLP Datasets
nlp nlp-data nlp-dataset nlp-datasets nlp-resources romanian romanian-language
Last synced: 07 Nov 2024
https://github.com/janekb04/py2gpt
Convert Python code into JSON consumable by OpenAI's function API.
ai api chatgpt converter function gpt gpt-4 json nlp openai openai-api python schema transcoding
Last synced: 05 Nov 2024
https://github.com/timbmg/structured-self-attentive-sentence-embedding
Re-Implementation of "A Structured Self-Attentive Sentence Embedding" by Lin et al., 2017
attention deep-learning machine-learning neural-networks nlp pytorch recurrent-neural-networks self-attention self-attentive-rnn sentiment-analysis text-classification vizualisation yelp-dataset
Last synced: 27 Oct 2024
https://github.com/dair-ai/odsc_2020_nlp
Repository for ODSC talk related to Deep Learning NLP
elasticsearch nlp search transformer
Last synced: 10 Nov 2024
https://github.com/anakin87/neural-search-pills
Knowledge pills on Neural Search
deep-learning information-retrieval machine-learning machine-reading multimodal-search natural-language-processing neural-search nlp question-answering retrieval-systems search-engines semantic-search transformers vector-search
Last synced: 23 Oct 2024
https://github.com/thunlp/hiddenkiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
Last synced: 10 Nov 2024
https://github.com/erikgartner/sentimental
Sentiment analysis made easy; built on top of solid libraries.
natural-language-processing nlp sentiment-analysis
Last synced: 02 Nov 2024
https://github.com/jasonwbw/recordpapers4nlp
A record of papers in some NLP-related areas
deep-learning dialogue-generation nlp reading-comprehension
Last synced: 28 Oct 2024
https://github.com/fractalego/subjectivity_classifier
Detects if a sentence is in a subjective or objective form
nlp rnn-tensorflow subjectivity
Last synced: 28 Oct 2024
https://github.com/undertheseanlp/slp3-vietnamese
Speech and Language Processing 3rd edition Vietnamese Translation
book-translation nlp vietnamese-nlp
Last synced: 11 Nov 2024
https://github.com/shrebox/Personified-Chatbot
A personified chatbot responding to a query based on the answering pattern of Dr. APJ Abdul Kalam using Information Retrieval, Natural Language Processing, and Deep Learning techniques.
apj-abdul-kalam chatbot deep-learning information-retrieval lstm natural-language-processing nlp ranking-algorithm seq2seq-chatbot seq2seq-model summarization word2vec
Last synced: 11 Nov 2024
https://github.com/kampersanda/tongrams-rs
Rust library providing fast language model queries in compressed space
compression elias-fano language-model ngrams nlp trie
Last synced: 11 Nov 2024
https://github.com/liebeck/spacy-iwnlp
German lemmatization with IWNLP as extension for spaCy
nlp spacy spacy-extension spacy-pipeline
Last synced: 14 Oct 2024
https://github.com/quickgrid/AI-Resources
Research Paper Summaries, Setup & Performance Notes, Resource Links on AI, Deep Learning, NLP, Computer Vision for my learning.
ai ai-notes ai-research blender computer-vision deep-learning nlp paper-summaries papers research-paper research-paper-summaries
Last synced: 02 Nov 2024
https://github.com/korpling/pepper
A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used stand-alone as a command line interface, or be integrated as an API into other software products.
annotations converter format java linguistic-formats linguistics nlp pepper
Last synced: 15 Nov 2024
https://github.com/botpress/nlu
This repo contains all ML/NLU-related code written by Botpress in the NodeJS environment. This includes the Botpress Standalone NLU Server.
Last synced: 05 Nov 2024
https://github.com/mgechev/gently-js
Module which returns the offensive words in a string. A soft reminder to be nicer to each other ❤️.
Last synced: 22 Oct 2024
https://github.com/davidsvy/Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
dataset deduplication fine-tuning fraud gpt2 huggingface lsh minhash nlp pytorch readability scam transformer web-scraping
Last synced: 05 Aug 2024
https://github.com/minerva-ml/steppy-toolkit
Curated set of transformers that make your work with steppy faster and more effective 🔭
data-science deep-learning keras keras-models machine-learning nlp open-source pipeline pipeline-framework python python3 pytorch pytorch-models reproducibility reproducible-research steppy steppy-toolkit steps tensorflow tensorflow-models
Last synced: 26 Sep 2024
https://github.com/pfnet-research/vat_nmt
Implementation of "Effective Adversarial Regularization for Neural Machine Translation", ACL 2019
acl2019 adversarial neural-machine-translation nlp nmt vat
Last synced: 15 Nov 2024
https://github.com/ikegami-yukino/rakutenma-python
Rakuten MA (Python version)
chinese japanese-language nlp part-of-speech-tagger pos-tagging python word-segmentation
Last synced: 12 Oct 2024
https://github.com/i008/nyyelp
Predicting Yelp review ratings using recurrent neural networks
deep-learning nlp python recurrent-neural-networks yelp-dataset
Last synced: 13 Nov 2024
https://github.com/hscspring/ALL4AI
AI Related Tools/Projects
ai jupyter linux machine-learning nlp python ssh toolbox
Last synced: 07 Nov 2024
https://github.com/dbklim/russian_subtitles_dataset
Preprocessing of a dataset of 347 subtitles for TV series (thanks to Taiga Corpus) for building a word2vec model or JamSpell model, training neural networks or chat bots, or any other NLP task.
bot cnn corpus dataset lstm machine-learning ml natural-language-processing nlp nlu rnn russian subtitles text text-analysis text-processing word2vec
Last synced: 11 Nov 2024
https://github.com/code-kern-ai/sequence-learn
With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier.
machine-learning named-entity-recognition natural-language-processing ner nlp python
Last synced: 10 Nov 2024
https://github.com/amsqr/NaiveSumm
NaiveSumm is a naive summarization approach based on Luhn's 1958 work "The Automatic Creation of Literature Abstracts". It uses the frequencies of words in the document to score sentences and extract those that include the most frequent words.
natural-language-processing nlp python summarization
Last synced: 31 Oct 2024
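The frequency-based idea in the NaiveSumm description above can be sketched in a few lines of Python. This is a minimal illustration, not NaiveSumm's own code: the function name `summarize`, the `num_sentences` parameter, the tiny stop-word list, and the regex-based sentence splitting are all assumptions made for the example. It also simplifies Luhn's method to a plain sum of word frequencies per sentence.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not NaiveSumm's own).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "it", "on", "are"}


def summarize(text: str, num_sentences: int = 1) -> list[str]:
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break at whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count every non-stop word across the whole document.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)

    def score(sentence: str) -> int:
        # Sum the document-wide frequencies of the sentence's words;
        # stop words are absent from the counter and contribute 0.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return [sentences[i] for i in keep]
```

Calling `summarize(document_text, num_sentences=3)` would return the three sentences whose words are most frequent in the document. Because scores are raw sums, longer sentences tend to score higher; normalizing by sentence length is one possible refinement.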
https://github.com/hpprc/bert-classification-tutorial-2024
[2024 edition] Text classification with BERT
Last synced: 27 Oct 2024
https://github.com/generall/oneshotnlp
PyTorch text matching models implementation for One-Shot Named Entity Linking
Last synced: 14 Oct 2024
https://github.com/jawahar273/practNLPTools-lite
Practical Natural Language Processing Tools for Humans is built on top of Senna Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG) with skip-gram, all in Python, with more features to be added. The website given is for downloading the Senna tool.
nlp practnlptools3 senna senna-nlp
Last synced: 07 Aug 2024
https://github.com/KGCP/MEL-TNNT
Metadata Extractor & Loader (MEL) ■ The NLP-NER Toolkit (TNNT)
metadata-extraction named-entity-recognition natural-language-processing nlp nlp-ner pipeline
Last synced: 17 Nov 2024
https://github.com/AmrHendy/programming-language-translator
An easy way to use TransCoder, released by Facebook AI Research, to convert code from one programming language to another. It uses an unsupervised neural machine translation (NMT) system, the deep-learning approach used to translate text between natural languages, trained only on monolingual source data.
machine-translation nlp programming-language transcoder transformer unsupervised-deep-learning unsupervised-translation
Last synced: 06 Aug 2024
https://github.com/astrazeneca/vecner
A library of tools for dictionary-based Named Entity Recognition (NER), based on word vector representations to expand dictionary terms.
dictionary-based-ner entity-extraction natural-language-processing ner nlp
Last synced: 18 Nov 2024
https://github.com/senthilchandrasegaran/textplorer
Visual analytics application for qualitative text analysis
nlp text-visualization visual-analytics
Last synced: 27 Oct 2024
https://github.com/sap-samples/acl2022-self-contrastive-decorrelation
Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".
ai nlp research research-paper sample self-contrastive self-contrastive-decorrelation self-supervised-learning sentence-embeddings
Last synced: 15 Nov 2024
https://github.com/uetchy/homebrew-nlp
🍺 A Homebrew keg that specializes in Natural Language Processing.
homebrew natural-language-processing nlp
Last synced: 18 Oct 2024
https://github.com/proycon/gecco
Generic Environment for Context-Aware Correction of Orthography
nlp python spelling-correction
Last synced: 08 Nov 2024
https://github.com/adriangonz/statistical-nlp-17
Repository for group 17 on the Statistical Natural Language Processing module at UCL
Last synced: 22 Oct 2024
https://github.com/rileynwong/poetry-generator
Generate poetry based on text corpus input
generative-art generative-poetry generative-text natural-language-processing nlp poetry poetry-generator text
Last synced: 14 Oct 2024
https://github.com/4ai/agn
Official Code for Merging Statistical Feature via Adaptive Gate for Improved Text Classification (AAAI2021)
bert deep-learning nlp text-classification
Last synced: 10 Nov 2024
https://github.com/bhattbhavesh91/text-summarizer-using-bert
Text summarization with BERT using bert-extractive-summarizer
bert google-bert language-model nlp nlp-machine-learning text-sumarization text-summarization
Last synced: 16 Nov 2024
https://github.com/doccano/spacy-partial-tagger
A simple library for training named entity recognition models from partially annotated data
named-entity-recognition natural-language-processing nlp spacy weak weak-supervision weakly-supervised-learning
Last synced: 10 Oct 2024
https://github.com/humansignal/brand-sentiment-analysis
Scripts utilizing Heartex platform to build brand sentiment analysis from the news
lstm-sentiment-analysis natural-language-processing nlp nlp-machine-learning nlp-sentiment-classifier nlp-tutorial sentiment sentiment-analyser sentiment-analysis sentiment-classification tensorflow-text-classifiers transfer-learning
Last synced: 14 Nov 2024
https://github.com/stanford-oval/ovalchat
OVALChat is a customizable Web app aimed at conducting user studies with chatbots
chatbots crowdsourcing nextjs nlp react tailwindcss
Last synced: 06 Nov 2024
https://github.com/deepset-ai/haystack-search-pipeline-streamlit
🚀 Template Haystack Search Application with Streamlit
Last synced: 06 Nov 2024
https://github.com/kalebu/desktop-chatbot-app
A python knowledge-based chatbot application built with Tkinter
chatbot chatbot-application data-science nlp nlp-projects python-tanzania python3 tanzania
Last synced: 09 Nov 2024
https://github.com/breezedeus/loveshare
Various things shared by breezedeus
cnocr cnstd cv deep-learning llm nlp ocr pix2text
Last synced: 15 Nov 2024
https://github.com/alan-turing-institute/prompto
An open source library for asynchronous querying of LLM endpoints
deep-learning hut23 large-language-models llm-eval llm-evaluation llms machine-learning natural-language-processing nlp python transformer transformers
Last synced: 13 Nov 2024
https://github.com/TianyuZhuuu/CHIP2018
Rank 6 solution for the CHIP2018 question matching competition
nlp pytorch sentence-similarity
Last synced: 06 Nov 2024
https://github.com/pyurbans/urbans
A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.
artificial-intelligence data-science machine-translation nlp python
Last synced: 10 Nov 2024
https://github.com/chinnichaitanya/spellwise
🚀 Extremely fast fuzzy matcher & spelling checker in Python!
caverphone editex levenshtein natural-language-processing nlp spellcheck spelling-correction trie typox
Last synced: 27 Oct 2024
https://github.com/yuvalpinter/nytwit
New York Times Word Innovation Types dataset
computational-linguistics corpus dataset news nlp
Last synced: 27 Oct 2024
https://github.com/adamspannbauer/lexrankr
Extractive Text Summarization with lexRankr (an R package implementing the LexRank algorithm)
lexrank lexrank-algorithm nlp r r-package rstat
Last synced: 27 Oct 2024
https://github.com/code-kern-ai/embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
classification machine-learning named-entity-recognition natural-language-processing ner nlp python representation-learning similarity-search
Last synced: 10 Nov 2024
https://github.com/artitw/text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
artificial-intelligence bert categorization classification classifier machine-learning natural-language-processing natural-language-understanding nlp tensorflow text text-classification transformers
Last synced: 28 Oct 2024
https://github.com/appcoda/naturallanguageprocessing
A Quick Demo for NLP in Swift 4
demo nlp playgrounds swift swift4
Last synced: 15 Nov 2024
https://github.com/tomaarsen/ttstextnormalization
Convert English text from written expressions into spoken forms
competition nlp normalization spoken-forms text-normalization tts
Last synced: 08 Nov 2024
https://github.com/ekinakyurek/knetlayers.jl
Useful Layers for Knet
computer-vision deep-learning machine-learning nlp
Last synced: 15 Oct 2024
https://github.com/twardoch/split-markdown4gpt
A Python tool for splitting large Markdown files into smaller sections based on a specified token limit. This is particularly useful for processing large Markdown files with GPT models, as it allows the models to handle the data in manageable chunks.
data-preprocessing gpt gpt-3 gpt-35-turbo gpt-35-turbo-16k gpt-4 markdown markdown-processing mistletoe natural-language-processing nlp openai openai-gpt python split-text summarization text-analysis text-processing text-summarization text-tokenization
Last synced: 27 Oct 2024
https://github.com/xv44586/knowledge-distillation-nlp
some demos of Knowledge Distillation in NLP
bert keras knowledge-distillation nlp
Last synced: 17 Nov 2024
https://github.com/li-plus/rouge-metric
A Python wrapper of the official ROUGE-1.5.5.pl script and a re-implementation of full ROUGE metrics.
machine-learning nlp pypi python rouge rouge-metric summarization
Last synced: 06 Nov 2024
https://github.com/banyh/PyStanfordNLP
A Python Wrapper of Stanford Chinese Segmenter
nlp postagging python-wrapper stanford stanford-chinese-segmenter
Last synced: 14 Nov 2024
https://github.com/derintelligence/en-az-parallel-corpus
English-Azerbaijani parallel language corpus
azerbaijan azerbaijani-translation corpus language linguistics nlp parallel translation
Last synced: 13 Nov 2024
https://github.com/explosion/spacy-benchmarks
💫 Runtime performance comparison of spaCy against other NLP libraries
benchmarking benchmarks natural-language-processing nlp spacy
Last synced: 25 Sep 2024
https://github.com/avacaondata/nlpboost
Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks.
deep-learning hyperparameter-optimization hyperparameter-tuning natural-language-generation natural-language-processing natural-language-understanding nlp pytorch text-classification text-generation
Last synced: 11 Oct 2024
https://github.com/dpressel/arcs-py
Arc-Eager and Arc-Hybrid Greedy Dependency Parser with Dynamic Oracle in Python (with no Dependencies!)
Last synced: 28 Oct 2024
https://github.com/anthonysigogne/web-search-engine
API - a simple web search engine
api elasticsearch google-search indexing nlp python search-engine
Last synced: 12 Nov 2024
https://github.com/tirendazacademy/hugging-face-tutorials
Getting started with Hugging Face
deep-learning hugging-face huggingface huggingface-datasets huggingface-library huggingface-pipeline huggingface-transformer huggingface-transformers image-classification machine-learning natural-language-processing nlp pretrained-models pytorch sentiment-analysis tensorflow text-classification transfer-learning
Last synced: 08 Nov 2024
https://github.com/bretttolbert/verbecc-svc
Dockerized Python microservice with REST API for verb conjugation in French, Spanish and Portuguese
conjugation conjugator french french-language french-nlp linguistics machine-learning natural-language natural-language-processing nlp portuguese-language portuguese-verbs romanian romanian-language scikit-learn spanish-language spanish-verbs verb-conjugation
Last synced: 18 Oct 2024
https://github.com/ahammadmejbah/ahammadmejbah
Data Science || Machine Learning || Deep Learning || Computer Vision || NLP Enthusiast Talks about #datascience, #deeplearning, #dataanalytics, #machinelearning, and #machinelearningalgorithms
artificial-intelligence computer-vision data-science deep-learning machine-learning nlp python
Last synced: 11 Nov 2024
https://github.com/eric-haibin-lin/nlp-notebooks
A collection of natural language processing notebooks.
deep-learning deep-learning-tutorial natural-language-generation natural-language-inference natural-language-processing natural-language-understanding nlp nlp-resources
Last synced: 28 Oct 2024
https://github.com/yhy1117/x-mixup
Implementation of ICLR 2022 paper "Enhancing Cross-lingual Transfer by Manifold Mixup".
cross-lingual-transfer manifold-mixup nlp
Last synced: 28 Aug 2024
https://github.com/fer-aguirre/pmdm
Political Misogynistic Discourse Monitor team from the 2021 JournalismAI Collab Challenges
nlp social-network-analysis text-classification
Last synced: 05 Nov 2024
https://github.com/salesforce/overture
Library for soft prompt tuning
deep-learning nlp prompt-tuning python pytorch soft-prompt-tuning
Last synced: 08 Nov 2024
https://github.com/richardlitt/thesis
My thesis on "Open Source Code and Low Resource Languages" for an MSc in Language Science and Technology at Saarland University
dissertation endangered-languages low-resource-languages lrl nlp nlproc saarland saarland-university thesis
Last synced: 21 Oct 2024
https://github.com/winkjs/wink-porter2-stemmer
JavaScript Implementation of Porter Stemmer Algorithm V2 by Dr Martin F Porter
natural-language-processing nlp porter-stemmer-algorithm porter-stemmer-v2 stemmer
Last synced: 09 Nov 2024