Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-15 00:20:20 UTC
https://github.com/lonepatient/electra_pytorch
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
bert deeplearning electra glue language-model nlp pytorch
Last synced: 06 Nov 2024
https://github.com/fdalvi/neurox
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
explainable-ai natural-language-processing neurons nlp nlp-machine-learning
Last synced: 14 Nov 2024
https://github.com/ikegami-yukino/oseti
Dictionary based Sentiment Analysis for Japanese
japanese-language nlp sentiment-analysis sentiment-polarity
Last synced: 17 Nov 2024
https://github.com/fdalvi/NeuroX
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
explainable-ai natural-language-processing neurons nlp nlp-machine-learning
Last synced: 17 Nov 2024
https://github.com/nlp-uoregon/okapi
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
bloom chatbot dataset instruction-tuning language-model large-language-models llama multilingual natural-language-processing nlp question-answering reinforcement-learning reinforcement-learning-from-human-feedback rlhf
Last synced: 11 Nov 2024
https://github.com/epwalsh/nlp-models
NLP research experiments, built on PyTorch within the AllenNLP framework.
allennlp nlp pytorch pytorch-nlp
Last synced: 01 Nov 2024
https://github.com/bjascob/pyinflect
A python module for word inflections designed for use with spaCy.
inflection nlp python spacy spacy-extension
Last synced: 07 Nov 2024
https://github.com/kyubyong/name2nat
name2nat: a Python package for nationality prediction from a name
Last synced: 10 Nov 2024
https://github.com/nlp-uoregon/Okapi
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
bloom chatbot dataset instruction-tuning language-model large-language-models llama multilingual natural-language-processing nlp question-answering reinforcement-learning reinforcement-learning-from-human-feedback rlhf
Last synced: 05 Oct 2024
https://github.com/saidziani/arabic-news-article-classification
Automatic document categorization consists of assigning a category to a text based on the information it contains. We follow several supervised machine learning approaches.
arabic-language arabic-nlp corpora machine-learning nlp nltk python3 text-categorization
Last synced: 28 Oct 2024
https://github.com/saidziani/Arabic-News-Article-Classification
Automatic document categorization consists of assigning a category to a text based on the information it contains. We follow several supervised machine learning approaches.
arabic-language arabic-nlp corpora machine-learning nlp nltk python3 text-categorization
Last synced: 14 Nov 2024
https://github.com/lsys/lexicalrichness
:smile_cat: :speech_balloon: A module to compute textual lexical richness (aka lexical diversity).
data-mining data-science information-retrieval lexical-analysis lexical-analyzer linguistic-analysis natural-language natural-language-processing nlp python
Last synced: 16 Nov 2024
https://github.com/qdata/lamp
ECML 2019: Graph Neural Networks for Multi-Label Classification
computer-vision graph-attention-networks graph-neural-networks multi-label-classification nlp transformers
Last synced: 12 Nov 2024
https://github.com/shibing624/pysenti
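Chinese Sentiment Classification Tool. Sentiment polarity classification based on the HowNet, Tsinghua, and BosonNLP sentiment lexicons; easy to extend, a baseline method, and ready to use out of the box.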
Last synced: 15 Nov 2024
https://github.com/undertheseanlp/chatbot
Vietnamese Chatbot
chatbot nlp vietnamese vietnamese-nlp
Last synced: 11 Nov 2024
https://github.com/kakaobrain/helo-word
Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task
deep-learning fairseq grammatical-error-correction nlp pre-training transfer-learning transformer
Last synced: 10 Nov 2024
https://github.com/PKU-YuanGroup/Hallucination-Attack
Attack to induce hallucinations in LLMs
adversarial-attacks ai-safety deep-learning hallucinations llm llm-safety machine-learning nlp
Last synced: 05 Sep 2024
https://github.com/mohabmes/Arabycia
Arabic NLP tool used to perform text search, POS tagging, translation, auto-diacritization, etc.
arabic-language arabic-nlp nlp
Last synced: 14 Nov 2024
https://github.com/maxoodf/russian_news_corpus
Russian mass media stemmed texts corpus: a corpus of lemmatized (morphologically normalized) texts from Russian mass media
articles corpus machine-learning ml nlp nlp-machine-learning russian text word2vec
Last synced: 15 Aug 2024
https://github.com/ines/spacy-graphql
🤹♀️ Query spaCy's linguistic annotations using GraphQL
flask flask-api graphql natural-language-processing nlp python spacy
Last synced: 08 Nov 2024
https://github.com/mohabmes/arabycia
Arabic NLP tool used to perform text search, POS tagging, translation, auto-diacritization, etc.
arabic-language arabic-nlp nlp
Last synced: 27 Oct 2024
https://github.com/thunlp/prompt-transferability
On Transferability of Prompt Tuning for Natural Language Processing
nlp parameter-efficient-learning parameter-efficient-tuning pretrained-language-model pretrained-language-models pretrained-models prompt prompt-tuning pytorch transfer-learning
Last synced: 10 Nov 2024
https://github.com/louisbrulenaudet/apple-ocr
Easy-to-Use Apple Vision wrapper for text extraction, scalar representation and clustering using K-means.
apple clustering kmeans nlp ocr ocr-recognition pyobjc python scatter-plot sklearn
Last synced: 10 Oct 2024
https://github.com/forzagreen/n2words
Convert numbers to their written form, in 25+ languages.
convert-numbers language natural-language nlp
Last synced: 15 Nov 2024
https://github.com/arazd/ProgressivePrompts
Progressive Prompts: Continual Learning for Language Models
continual-learning llms nlp prompt-tuning
Last synced: 09 Aug 2024
https://github.com/quanta-quest/quanta-quest
AI-powered universal search for all your personal data, tailored just for you. Goal: to be the world's first product with "edge-side LLMs + consumer data localization" as its core development direction.
agent ai anthropic bert chatgpt claude edge-computing gpt huggingface knowledgebase language-models llm nextjs nlp personal-ass rag semantic-vector-search transformers universal-search workflow
Last synced: 30 Oct 2024
https://github.com/hyperparticle/graph-nlu
Graph NLU is a natural language understanding tool that leverages the power of graph databases
ipython jupyter machine-learning natural-language neo4j nlp nlu research
Last synced: 12 Oct 2024
https://github.com/thunlp/sememepso-attack
Code and data of the ACL 2020 paper "Word-level Textual Adversarial Attacking as Combinatorial Optimization"
adversarial-attacks adversarial-examples nlp pso sememe
Last synced: 10 Nov 2024
https://github.com/philgooch/abbreviation-extraction
Python3 implementation of the Schwartz-Hearst algorithm for extracting abbreviation-definition pairs
abbreviations information-extraction keyword-extraction nlp python3
Last synced: 04 Nov 2024
https://github.com/legacyai/tf-transformers
State-of-the-art, faster Transformers with TensorFlow 2.0 (NLP, Computer Vision, Audio).
bert gpt2 keras language-model natural-language-processing nlp nlp-library tensorflow tensorflow2 text-classification text-generation transformer
Last synced: 07 Nov 2024
https://github.com/mynameisvinn/emailparser
remove signature blocks from emails
email-parser email-parsing natural-language-processing nlp python signature-blocks
Last synced: 07 Nov 2024
https://github.com/adhaamehab/textblob-ar
Arabic support for textblob
arabic-language arabic-nlp machine-learning natural-language-processing nlp part-of-speech-tagger sentiment-analysis spelling-correction text-classification text-similarity textblob word-embeddings
Last synced: 14 Nov 2024
https://github.com/ARBML/tkseem
Arabic Tokenization Library. It provides many tokenization algorithms.
arabic-nlp nlp tkseem tokenization
Last synced: 27 Oct 2024
https://github.com/kevinschaich/billboard
🎤 Lyrics/associated NLP data for Billboard's Top 100, 1950-2015.
billboard billboard-charts billboard100 billboards-hot-100 d3 d3-visualization d3js data-visualization data-visualization-project html javascript lyrics nlp nlp-parsing nltk nltk-python sentiment sentiment-analysis sentiment-classification visualization
Last synced: 31 Oct 2024
https://github.com/mynameisvinn/EmailParser
remove signature blocks from emails
email-parser email-parsing natural-language-processing nlp python signature-blocks
Last synced: 07 Aug 2024
https://github.com/bernhard2202/rankqa
This is the PyTorch implementation of the ACL 2019 paper RankQA: Neural Question Answering with Answer Re-Ranking.
nlp question-answering questionanswering
Last synced: 09 Nov 2024
https://github.com/stanford-oval/genienlp
GenieNLP: A versatile codebase for any NLP task
deep-learning dialogue natural-language-processing nlp paraphrasing question-answering semantic-parsing seq2seq translation
Last synced: 06 Nov 2024
https://github.com/dusty-nv/jetson-voice
ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT
deep-learning jetson jetson-nano nlp pytorch speech-recognition tensorrt text-to-speech
Last synced: 07 Nov 2024
https://github.com/thushv89/manning_tf2_in_action
The official code repository for "TensorFlow in Action" by Manning.
computer-vision deep-learning machine-learning nlp notebook python tensorflow tensorflow2 tf tf2
Last synced: 27 Oct 2024
https://github.com/rguthrie3/bilstm-crf
BiLSTM-CRF for sequence labeling in Dynet
conditional-random-fields dynet neural-network nlp part-of-speech-tagger
Last synced: 08 Nov 2024
https://github.com/writer/fitbert
Use BERT to Fill in the Blanks
bert fill-in-the-blank nlp python pytorch
Last synced: 29 Sep 2024
https://github.com/aliosm/shakkelha
Neural Arabic text diacritization
arabic-language comparison dataset diacritization ffnn nlp rnn sequence-labeling
Last synced: 27 Oct 2024
https://github.com/nschneid/unix-text-commands
Unix Text Processing Command Reference
command-line nlp reference text-processing unix
Last synced: 08 Nov 2024
https://github.com/Dumbris/trunklucator
Python module that lets data scientists quickly create annotation projects.
active-learning annotation annotation-tool data-science machine-learning nlp
Last synced: 04 Nov 2024
https://github.com/codelucas/cracking-the-da-vinci-code-with-google-interview-problems-and-nlp-in-python
A guide on how to crack combinatorics puzzles shown in The Da Vinci Code movie using CS fundamentals and NLP
combinatorics interview-questions nlp nlp-machine-learning python
Last synced: 15 Nov 2024
https://github.com/scofield7419/hesyfu
Code for the ACL2021 paper: Better Combine Them Together! Integrating Syntactic Constituency and Dependency Representations for Semantic Role Labeling
nlp semantic-role-labeling syntactic-parsing syntax
Last synced: 11 Nov 2024
https://github.com/SentometricsResearch/sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
nlp prediction sentiment-analysis text-mining time-series
Last synced: 11 Nov 2024
https://github.com/lyeoni/pretraining-for-language-understanding
Pre-training of Language Models for Language Understanding
language-model language-modeling language-understanding nlp pytorch pytorch-tutorial
Last synced: 06 Nov 2024
https://github.com/umesh-01/python-assistant
Python Assistant (PA) is a voice-command-based assistant service written in Python 3.9+. It can recognize human speech, talk to the user, and execute basic commands.
ai-assistants google-recognition nlp openweathermap-api pycharm-ide python python-assistant python-automation python39 pyttsx3 speech-recognition text-to-speech virtual-assistant voice-assistant voice-commands voice-recognition web-scraping wikipedia-search wolfram-alpha
Last synced: 09 Oct 2024
https://github.com/bdbc-kg-nlp/covid-19-tracker
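A research team at Beihang University's advanced innovation center for big data collected and organized the data sources, then used natural language processing and other techniques to extract structured information from the publicly released trajectories of 4,626 confirmed patients nationwide: basic information (sex, age, place of residence, occupation, Wuhan/Hubei contact history, etc.), trajectories (time, place, means of transport, events), and relationships between patients.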
covid-19 extraction nlp tracking visualization
Last synced: 12 Nov 2024
https://github.com/jackaduma/secbert
Pretrained BERT model for cybersecurity text, trained to capture cybersecurity knowledge
apt attention bert bert-embeddings cyber-security cyber-threat-intelligence cybersecurity deep-learning-security deeplearning machine-learning-security nlp nlp-machine-learning security security-automation threat-analysis threat-detection threat-hunting threat-intelligence transformer-encoder transformers
Last synced: 11 Nov 2024
https://github.com/zencephalon/tactful_tokenizer
Accurate Bayesian sentence tokenizer in Ruby.
Last synced: 16 Nov 2024
https://github.com/tomasonjo/trinity-ie
Information extraction pipeline containing coreference resolution, named entity linking, and relationship extraction
information-extraction named-entity-recognition nlp relationship-extraction
Last synced: 22 Oct 2024
https://github.com/nalbion/whisper-server
streaming speech to text server using Whisper
Last synced: 28 Oct 2024
https://github.com/google-research-datasets/query-wellformedness
25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
deep-learning deep-neural-networks information-retrieval nlp nlp-machine-learning search-engine
Last synced: 08 Nov 2024
https://github.com/princeton-nlp/nlproofs
EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
machine-learning nlp reasoning
Last synced: 11 Nov 2024
https://github.com/ekramasif/basic-machine-learning
This is a repo of the basic machine learning I have learned. More to come...
ann artficial-neural-network artificial-intelligence bert-embeddings bert-model blstm collaborate data-science deep-learning embeddings keras lstm machine-learning natural-language-processing neural-network nlp pandas python seaborn tensorflow
Last synced: 26 Oct 2024
https://github.com/datquocnguyen/jLDADMM
A Java package for the LDA and DMM topic models
gibbs-sampling lda nlp short-text topic-modeling topic-models
Last synced: 13 Nov 2024
https://github.com/julesbelveze/bert-squeeze
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
bert deebert distillation fastbert lstm nlp pruning pytorch-lightning quantization theseus transformers
Last synced: 15 Nov 2024
https://github.com/pythainlp/attacut
A Fast and Accurate Neural Thai Word Segmenter
cnn hacktoberfest hactoberfest2022 nlp tokenization
Last synced: 15 Nov 2024
https://github.com/apache/incubator-nlpcraft
Apache NLPCraft - API to convert natural language into actions.
Last synced: 07 Oct 2024
https://github.com/indiejoseph/chinese-char-rnn
Character-Level language models
chinese deep-learning language-modeling nlp rnn tensorflow
Last synced: 14 Nov 2024
https://github.com/hyunwoongko/summarizers
Package for controllable summarization
Last synced: 27 Oct 2024
https://github.com/dmitryryumin/emnlp-2023-papers
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. :star: support NLP!
bert computational-linguistics emnlp emnlp2023 gpt language-models llms machine-learning machine-translation multilingual-nlp named-entity-recognition natural-language-processing ner nlp nlp-applications sentiment-analysis syntax-and-semantics text-mining transformers word-embeddings
Last synced: 15 Nov 2024
https://github.com/salesforce/taichi
Open source library for few shot NLP
conversational-ai few-shot-learning intent-classification multilingual nli nlp
Last synced: 08 Nov 2024
https://github.com/krystalan/Multi-hopRC
:notebook_with_decorative_cover: notes for Multi-hop Reading Comprehension and open-domain question answering
machine-reading-comprehension natural-language-processing nlp open-domain-qa paper-list question-answering
Last synced: 13 Nov 2024
https://github.com/bukosabino/justicio
Building an assistant for Boletin Oficial del Estado (BOE) using Retrieval Augmented Generation (RAG)
Last synced: 15 Oct 2024
https://github.com/unilight/R-NET-in-Tensorflow
R-NET implementation in TensorFlow.
machine-comprehension nlp squad tensorflow
Last synced: 07 Aug 2024
https://github.com/unilight/r-net-in-tensorflow
R-NET implementation in TensorFlow.
machine-comprehension nlp squad tensorflow
Last synced: 28 Oct 2024
https://github.com/bnosac/textrank
Summarise text by finding relevant sentences and keywords using the Textrank algorithm
natural-language-processing nlp r textrank textrank-algorithm
Last synced: 10 Oct 2024
https://github.com/yuchenlin/lstm_sentence_classifier
LSTM-based Models for Sentence Classification in PyTorch
lstm-model nlp pytorch sentence-classification
Last synced: 28 Oct 2024
https://github.com/felladrin/MiniSearch
Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
ai artificial-intelligence generative-ai gpu-accelerated information-retrieval llm llm-inference machine-learning nlp question-answering ratchet-ml retrieval-augmented-generation search search-engine searxng typescript web-llm webapp wllama
Last synced: 29 Oct 2024
https://github.com/Neuraxio/New-Empty-Python-Project-Base
The Perfect Python Project Template. Tired of coding the same thing anew for each new Python project? Here is what you need. Click the green "Use this template" button below to start using it instantly. Rename the "project" folder and all references to it to customize your project name.
base computer-vision library nlp project-template project-templates pypi-package python-library time-series
Last synced: 14 Nov 2024
https://github.com/uripeled2/llm-client-sdk
SDK for using LLM
ai ai21labs api async bard bard-api chatgpt-api generative-ai gpt huggingface huggingface-transformers large-language-models llm llms nlp openai openai-api palm-api python sdk
Last synced: 08 Nov 2024
https://github.com/scofield7419/uabsa-symux
Code for the IJCAI2022 paper: Inheriting the Wisdom of Predecessors: A Multiplex Cascade Framework for Unified Aspect-based Sentiment Analysis
Last synced: 11 Nov 2024
https://github.com/ishantchauhan710/writerai
WriterAI is an AI-based content writing tool that helps users easily write high-quality emails, blogs, letters, theses, and other content. Users can also share their projects with others and work as a team.
ai artificial-intelligence firebase ktor machine-learning nlp reactjs sql
Last synced: 28 Oct 2024
https://github.com/howardyclo/pytorch-seq2seq-example
Fully batched seq2seq example based on practical-pytorch, with extra features.
attention-seq2seq checkpoint fixed-embedding glove-embeddings jupyter-notebook nlp nlp-machine-learning pretrained-embedding pytorch pytorch-nlp-tutorial pytorch-tutorial seq2seq shared-embedding tensorboard tensorboard-visualization tie-embedding
Last synced: 09 Nov 2024
https://github.com/cbilgili/zemberek-nlp-server
REST Docker server built on the Zemberek Turkish NLP Java library
docker javascript nlp part-of-speech-tagger rest sentence-tokenizer spark turkish turkish-language zemberek
Last synced: 12 Nov 2024
https://github.com/sorenlind/lemmy
🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪
danish lemma lemmatizer nlp spacy swedish
Last synced: 12 Oct 2024
https://github.com/Loodos/turkish-language-models
Transformer based Turkish language models
language-models natural-language-processing nlp turkish
Last synced: 12 Nov 2024
https://github.com/seeed-projects/recomputer-jetson-for-beginners
Beginner's Guide to reComputer Jetson
ai application beginner-friendly cuda cv deepstream examples generative-ai guide jetson llm ml nlp nvidia recomputer robotics tao tensorrt
Last synced: 15 Nov 2024
https://github.com/Flight-School/ner
A command-line utility for extracting names of people, places, and organizations from text on macOS.
cli macos named-entity-recognition nlp swift
Last synced: 05 Aug 2024
https://github.com/philipperemy/stanford-ner-python
Stanford Named Entity Recognizer (NER) - Python Wrapper
extraction named-entity-recognition nlp python-wrapper stanford stanford-ner
Last synced: 22 Oct 2024
https://github.com/bramvanroy/spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
conll conll-u data-science machine-learning natural-language-processing nlp pandas parser python spacy spacy-extension spacy-pipeline stanford-machine-learning stanford-nlp stanza udpipe
Last synced: 13 Nov 2024
https://github.com/leehanchung/cs182
Berkeley CS182/282A Designing, Visualizing and Understanding Deep Neural Networks
berkeley cnn cs182 cs231n cs231n-assignment natural-language-processing nlp pytorch reinforcement-learning tensorflow transformer
Last synced: 10 Nov 2024
https://github.com/cpjku/wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
bert language-model natural-language-processing nlp transfer-learning transformers
Last synced: 13 Nov 2024
https://github.com/yuanjie-ai/chinesesensitivevocabulary
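Violence, terrorism, and prohibited content; pornographic text; politically sensitive terms; malicious promotion; vulgar insults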
Last synced: 16 Oct 2024
https://github.com/telecombcn-dl/2017-persontyle
Applied Deep Learning Workshop London 2017
computer-vision deep-learning nlp
Last synced: 07 Aug 2024
https://github.com/PhantomInsights/comments-generator
A Reddit bot that generates new context-aware comments using Markov chains trained on the comment history of a given set of users or subreddits.
markov-chain nlp praw python3 reddit-bot requests
Last synced: 04 Nov 2024
https://github.com/LanguageMachines/frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
computational-linguistics dependency-parser dutch folia lemmatiser morphological-analyser morphology named-entity-recognition natural-language-processing nlp pos-tagger syntax text-processing
Last synced: 30 Oct 2024
https://github.com/phantominsights/comments-generator
A Reddit bot that generates new context-aware comments using Markov chains trained on the comment history of a given set of users or subreddits.
markov-chain nlp praw python3 reddit-bot requests
Last synced: 11 Nov 2024
https://github.com/benbusby/namebuster
A tool for enumerating usernames from text, files, or websites
brute-force enumeration hackthebox named-entity-recognition nlp offensive-security penetration-testing pentesting user-enumeration username username-generator
Last synced: 27 Oct 2024
https://github.com/bretttolbert/verbecc
Complete conjugation of any verb in Catalan, French, Italian, Portuguese, Romanian, or Spanish, including conjugation of unknown verbs using machine learning
catalan catalan-language conjugation conjugator french french-language french-nlp linguistics machine-learning natural-language-processing nlp portuguese-language portuguese-verbs romanian romanian-language scikit-learn spanish-language spanish-verbs verb-conjugation verbs
Last synced: 07 Nov 2024