Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have advanced language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-15 00:20:20 UTC
https://github.com/hiyouga/pban-pytorch
A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis, PyTorch implementation.
aspect-based-sentiment-analysis attention-model deep-learning natural-language-processing nlp pytorch sentiment-analysis
Last synced: 27 Oct 2024
https://github.com/kudoai/chatgpt.js-greasemonkey-starter
🙈 A starting point for developing your own Greasemonkey userscript using chatgpt.js
ai artificial-intelligence chatgpt gpt gpt-3 gpt-4 greasemonkey greasemonkey-script greasemonkey-userscript javascript javascript-library kudoai nlp nlp-machine-learning openai template userscript userscripts ux ux-design
Last synced: 14 Oct 2024
https://github.com/cocoa-ai/NamesCoreMLDemo
🏷 iOS11 demo application for predicting gender from first names.
classification coreml coreml-models gender-classification ios machine-learning nlp swift swift4
Last synced: 09 Aug 2024
https://github.com/anjum48/commonlitreadabilityprize
4th Place solution for the Kaggle CommonLit Readability Prize
huggingface kaggle nlp pytorch transformers
Last synced: 14 Oct 2024
https://github.com/psolbach/metadoc
Aviation grade news article metadata extraction
extraction metadata news nlp perceptron
Last synced: 08 Nov 2024
https://github.com/cocoa-ai/namescoremldemo
🏷 iOS11 demo application for predicting gender from first names.
classification coreml coreml-models gender-classification ios machine-learning nlp swift swift4
Last synced: 07 Nov 2024
https://github.com/xxjwxc/gohanlp
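Golang RESTful client for HanLP. Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing.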
ai dependency-parser hanlp named-entity-recognition natural-language-processing nlp pos-tagging semantic-parsing text-classification
Last synced: 28 Oct 2024
https://github.com/bastienbot/nlp-js-tools-french
POS tagger, lemmatizer, and stemmer for the French language in JavaScript
lemmatization lemmatizer nlp postagging postgresql stemmer stemming tokenization tokenizer
Last synced: 28 Aug 2024
https://github.com/macournoyer/utterance_parser
Extract intent and entities from natural language utterances
extracts-intent nlp slot-filling
Last synced: 09 Nov 2024
https://github.com/bangla-rag/porag
Fully Configurable RAG Pipeline for Bengali Language RAG Applications. Supports both Local and Huggingface Models, Built with Langchain.
ai bengali bengali-nlp chromadb langchain llama3 llm nlp rag transformers
Last synced: 10 Oct 2024
https://github.com/thisisiron/nmt-attention-tf2
👫 Effective Approaches to Attention-based Neural Machine Translation, implemented in TensorFlow 2.0
attention lstm natural-language-processing neural-machine-translation nlp nmt tensorflow tensorflow2 tf2 translation
Last synced: 08 Nov 2024
https://github.com/liebeck/spacy-sentiws
German sentiment scores with SentiWS as extension for spaCy
nlp spacy spacy-extension spacy-pipeline
Last synced: 14 Oct 2024
https://github.com/syzer/sentiment-analyser
ML that can extract German and English sentiment
english german nlp nlp-library node-js nodejs sentiment-analyser sentiment-analysis
Last synced: 28 Oct 2024
https://github.com/MiuLab/FlowDelta
FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension
machine-comprehension nlp pytorch question-answering
Last synced: 07 Aug 2024
https://github.com/kasnerz/reffix
A tool for fixing a BibTeX reference list using DBLP API
arxiv arxiv-org arxiv-papers bibtex bibtex-entry bibtex-references bibtexparser dblp dblp-api dblp-bibliography natural-language-processing nlp nlproc research-paper
Last synced: 28 Oct 2024
https://github.com/chrismattmann/lucene-geo-gazetteer
Uses Apache Lucene, OpenNLP, and GeoNames to extract locations from text and geocode them.
allcountries apache gazetteer geoindex geonames irds lucene nlp nlp-machine-learning opennlp
Last synced: 30 Oct 2024
https://github.com/adirthaborgohain/ner-re
A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3. Given a text, the pipeline extracts the entities it was trained on, disambiguates them to their normalized form through an Entity Linker connected to a Knowledge Base, and assigns a relation between the entities, if any.
named-entity-recognition nlp relation-extraction spacy transformers
Last synced: 09 Nov 2024
https://github.com/GermanT5/wikipedia2corpus
Wikipedia text corpus for self-supervised NLP model training
corpus german-nlp machine-learning nlp somajo wikipedia wikipedia-corpus
Last synced: 31 Oct 2024
https://github.com/seanlee97/clfzoo
A library of deep text classifiers.
nlp tensorflow text-classification
Last synced: 27 Oct 2024
https://github.com/datawhalechina/whale-paper
Datawhale paper sharing: reading cutting-edge papers and sharing technical innovations
cv nlp papers recommendation-system
Last synced: 09 Nov 2024
https://github.com/kbogas/medknow
Medical Relations and Entities Extraction
biomedical metamap neo4j nlp relation-extraction semrep umls
Last synced: 28 Oct 2024
https://github.com/neomatrix369/chatbot-conversations
Chatbot conversations: a demo application showing how two (or more) chatbots can talk to each other. The logic used to build Eliza (along with an NLP model) powers the chatbots.
ai chat-application chatbot eliza eliza-chatbot graalvm helidon helidon-example java ml nlp python quarkus text
Last synced: 14 Oct 2024
https://github.com/rainmaker712/nlp_ryan
Study for Natural Language Processing & Deep Learning Framework
chatbot deep-learning machine-comprehension machine-learning nlp python pytorch scala spark tensorflow
Last synced: 13 Nov 2024
https://github.com/bnosac/rdrpostagger
R package for Ripple Down Rules-based Part-Of-Speech Tagging (RDRPOS). Available for more than 45 languages.
java multi-language natural-language-processing nlp pos pos-tagging r r-package tagging
Last synced: 11 Nov 2024
https://github.com/johncmunson/react-taggy
A simple zero-dependency React component for tagging user-defined entities within a block of text.
component entities named-entity-recognition natural-language ner nlp react react-component
Last synced: 28 Aug 2024
https://github.com/nikhilbarhate99/char-rnn-pytorch
Minimal implementation of Multi-layer Recurrent Neural Networks (LSTM) for character-level language modelling in PyTorch
char-rnn deep-learning lstm natural-language-generation natural-language-processing nlp pytorch pytorch-implementation pytorch-nlp pytorch-tutorial rnn
Last synced: 13 Nov 2024
https://github.com/michaelaquilina/hashedindex
Python package providing an Inverted Index implementation using dictionaries
indexing nlp nlp-machine-learning numpy pandas python2 python3 text-processing
Last synced: 28 Oct 2024
https://github.com/mirusu400/clova-x
Unofficial API for CLOVA X
api clova clovaai hacktoberfest llm naver naver-api nlp
Last synced: 06 Nov 2024
https://github.com/alexeyev/keras-generating-sentences-from-a-continuous-space
Text Variational Autoencoder inspired by the paper 'Generating Sentences from a Continuous Space' by Bowman et al. (https://arxiv.org/abs/1511.06349)
deep-learning deeplearning keras keras-implementations nlp text-generation vae variational-autoencoder
Last synced: 11 Nov 2024
https://github.com/vinitra-zz/neural-text-style-transfer
Style Transfer for non-parallel text
autoencoder deep-neural-networks nlp style-transfer
Last synced: 22 Oct 2024
https://github.com/Lingkai-Kong/Calibrated-BERT-Fine-Tuning
Code for Paper: Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
bert calibration deep-learning language-model nlp nlp-machine-learning ood-detection open-world-classification robustness text-classification uncertainty-estimation uncertainty-quantification
Last synced: 13 Nov 2024
https://github.com/selimfirat/bilkent-turkish-writings-dataset
Turkish writings dataset that promotes creativity, content, composition, grammar, spelling and punctuation.
bilkent-university creative-writing dataset nlp nlp-datasets pdf-conversion turkish turkish-language
Last synced: 10 Oct 2024
https://github.com/sea-snell/calm-dialogue
Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems"
deep-learning language-model nlp python pytorch reinforcement-learning
Last synced: 27 Oct 2024
https://github.com/nlppln/nlppln
NLP pipeline software using the Common Workflow Language
cwl nlp pipeline text-mining workflow
Last synced: 23 Oct 2024
https://github.com/cyclecycle/spacy-pattern-builder
Reverse engineer patterns for use with spaCy's DependencyMatcher
Last synced: 10 Oct 2024
https://github.com/livingbio/syntaxnet_wrapper
A Python Wrapper for Google SyntaxNet
google-syntaxnet nlp python python-wrapper syntaxnet
Last synced: 09 Nov 2024
https://github.com/wit-ai/android-voice-demo
Example of how to build a voice-enabled Android app with Wit.ai
android machine-learning nlp nlu voice wit witai
Last synced: 15 Nov 2024
https://github.com/hyperparticle/lemmatag
A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, Arabic, etc.)
deep-learning lemmatization machine-learning natural-language-processing neural-network nlp pos-tagging tensorflow
Last synced: 14 Nov 2024
https://github.com/omarsar/clinical_nlp_elastic
Clinical NLP Analysis with Elasticsearch and Kibana
elastic elasticsearch kibana linguistics machine-learning mental-health nlp
Last synced: 28 Oct 2024
https://github.com/x-lance/mobile-env
A Universal Platform for Training and Evaluation of Mobile Interaction
decision-making information-ui infoui interaction-platform nlp rl-environments rl-platform
Last synced: 12 Nov 2024
https://github.com/ivan-bilan/nlp-and-data-science-spotlights
Regular spotlights of underrated NLP and Data Science GitHub repositories
data-science deep-learning natural-language-processing nlp spotlight
Last synced: 08 Nov 2024
https://github.com/pyunits/pyunit-ner
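Named entity recognition (NER) model: fast, efficient, and simple, with one-click Docker deployment for serving the model. Recognizes address, person-name, and organization-name entities.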
Last synced: 12 Nov 2024
https://github.com/miroozyx/BERT_with_keras
A Keras version of Google's BERT model
bert deep-learning nlp tensorflow
Last synced: 02 Nov 2024
https://github.com/georgezouq/awosome-ai-in-social-media
💻 A collection of AI and bots used on social media: WeChat, Facebook, Twitter, Instagram, Weibo, TikTok, etc.
facebook ins nlp social-media social-network social-network-analysis twitter wechat
Last synced: 10 Nov 2024
https://github.com/wri-dssg-omdena/policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
active-learning bert data-science document-classification environmental huggingface incentives landscape-restoration lda machine-learning nlp policy sbert scraping scrapy sentence-transformers spyder text-classification topic transformers
Last synced: 30 Oct 2024
https://github.com/Ermlab/PoLitBert
Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good model.
nlp polish roberta text-corpus
Last synced: 11 Nov 2024
https://github.com/riccorl/transformers-embedder
A Word Level Transformer layer based on PyTorch and 🤗 Transformers.
allennlp bert deep-learning embeddings hidden-states huggingface huggingface-transformers language-model natural-language-processing nlp preprocess pretrained-models python pytorch sentences tokenizer transformer transformer-embedder transformers transformers-embedder
Last synced: 08 Nov 2024
https://github.com/nitotm/efficient-language-detector-js
Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.
javascript language language-detection language-detector language-identification natural-language natural-language-processing nlp nodejs
Last synced: 12 Oct 2024
https://github.com/writer/replacy
spaCy match and replace, maintaining conjugation
Last synced: 01 Nov 2024
https://github.com/sayakpaul/bert-for-mobile
Compares the DistilBERT and MobileBERT architectures for mobile deployments.
bert distilbert mobile mobile-bert nlp tensorflow-lite
Last synced: 23 Oct 2024
https://github.com/omarsar/nlp_pycon
Material for PyCon 2019 NLP Tutorial
deep machine-learning nlp pytorch
Last synced: 28 Oct 2024
https://github.com/pszemraj/ai-msgbot
Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.
ai aitextgen chat-application chatbot deep-learning deepspeed deployment gpt-2 gpt-j gpt-j-6b gradio huggingface huggingface-transformers natural-language-processing nlp nlp-parsing telegram telegram-bot text-generation transformers
Last synced: 03 Oct 2024
https://github.com/alan-turing-institute/robots-in-disguise
Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.
deep-learning diffusion-models foundation-model hut23 language-models large-language-models machine-learning nlp transformers
Last synced: 13 Nov 2024
https://github.com/nlpir-team/nlpir-python
NLPIR-python A python wrapper and toolkit for NLPIR
Last synced: 14 Nov 2024
https://github.com/dzieciou/pystempel
Python port of Stempel, an algorithmic stemmer for the Polish language.
Last synced: 26 Oct 2024
https://github.com/google-research-datasets/swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
cross-lingual datasets deep-learning information-retrieval machine-learning multilingual natural-language-processing neural-information-retrieval nlp training-data
Last synced: 08 Nov 2024
https://github.com/zlsh80826/msmarco
Machine comprehension model trained on MS MARCO, with a modified S-Net extraction component
cntk extraction-net machine-comprehension msmarco nlp question-answering s-net
Last synced: 28 Oct 2024
https://github.com/aliosm/simplerepresentations
Easy-to-use text representations extraction library based on the Transformers library.
Last synced: 27 Oct 2024
https://github.com/JackHCC/Chinese-Tokenization
Chinese word segmentation implemented with traditional methods (n-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.), and pre-training methods (BERT, etc.)
bert-crf bilstm-crf hmm-viterbi-algorithm ngram nlp tokenization
Last synced: 03 Aug 2024
https://github.com/hellohaptik/HINT3
This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild", accepted at EMNLP 2020's Insights Workshop (https://insights-workshop.github.io/). A preprint of the paper is available at https://arxiv.org/abs/2009.13833
conversational-ai datasets dialogue-systems nlp
Last synced: 16 Nov 2024
https://github.com/uminosachi/open-llm-webui
This repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).
chatbot ggml gradio huggingface language-model llama llama2 llama3 llava llava-llama3 llm nlp transformers
Last synced: 10 Oct 2024
https://github.com/staticdev/human-readable
Library that makes data intended for machines readable to humans.
formatting humanizable humanization natural-language-processing nlp readable
Last synced: 16 Nov 2024
https://github.com/vasilescur/parse_context
Use GPT-3 to process human conversations and extract context, identify information that would be useful, and suggest data sources to get that information. Intended for a voice assistant.
ai assistants gpt-3 natural-language nlp semantic-analysis
Last synced: 16 Nov 2024
https://github.com/oneoffcoder/docker-containers
A collection of pedantic docker containers.
deep-learning docker-containers docker-images jupyter nlp object-detection python raspberry-pi yolo
Last synced: 05 Nov 2024
https://github.com/hyunwoongko/bert2bert-summarization
Abstractive summarization using Bert2Bert framework.
Last synced: 28 Oct 2024
https://github.com/eimg/myanmar-text-breaker
Syllable and word breaker (boundary segmentation) for Myanmar text in JavaScript
Last synced: 25 Oct 2024
https://github.com/proycon/analiticcl
An approximate string matching (fuzzy-matching) system for spelling correction, normalisation, or post-OCR correction
approximate-string-matching fuzzy-matching nlp normalization spelling-correction
Last synced: 14 Nov 2024
https://github.com/thisiscetin/textoken
Simple and customizable text tokenization gem.
Last synced: 07 Nov 2024
https://github.com/codewithzichao/deepclassifier
DeepClassifier aims to be a general text classification model library. It makes building any text classification task easy and user-friendly.
deep-learning deepclassifier nlp pytorch text-classification torch
Last synced: 07 Nov 2024
https://github.com/peaceiris/actions-suggest-related-links
A GitHub Action to suggest related or similar issues, documents, and links. Based on the power of NLP and fastText.
actions fasttext github-actions issue-management nlp
Last synced: 31 Oct 2024
https://github.com/princeton-vl/attach-juxtapose-parser
Code for the paper "Strongly Incremental Constituency Parsing with Graph Neural Networks"
machine-learning neurips-2020 nlp parsing
Last synced: 09 Nov 2024
https://github.com/Smat26/Roman-Urdu-Dataset
Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources
data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp
Last synced: 04 Aug 2024
https://github.com/dadosabertosdefeira/tomba
Identify addresses, neighborhoods, and other Brazilian locations in a text 🏘
brasil hacktoberfest nlp spacy
Last synced: 12 Oct 2024
https://github.com/mananshah99/sentR
Simple sentiment analysis framework for R
Last synced: 05 Aug 2024
https://github.com/Furyton/awesome-language-model-analysis
This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.
ai analysis analytics awesome chatgpt deep-learning generative-ai large-language-models llm nlp theory transformers
Last synced: 19 Sep 2024
https://github.com/solygambas/python-openai-projects
13 projects using ChatGPT API, Whisper, Embeddings, and DALL-E with Python.
auto-gpt chatbot chatgpt dall-e embeddings gpt-4 langchain langchain-python machine-learning nlp nlp-machine-learning open-ai-api openai python reddit reddit-api spotify spotify-api stable-diffusion whisper
Last synced: 27 Oct 2024
https://github.com/X-LANCE/Mobile-Env
A Universal Platform for Training and Evaluation of Mobile Interaction
decision-making information-ui infoui interaction-platform nlp rl-environments rl-platform
Last synced: 09 Nov 2024
https://github.com/brianspiering/nlp-course
An introductory course on natural language processing (NLP)
machine-learning natural-language-processing nlp python
Last synced: 07 Nov 2024
https://github.com/qznan/qiznlp
A TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP, with distributed training support)
beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow
Last synced: 13 Oct 2024
https://github.com/tangbinh/question-answering
bidaf drqa nlp pytorch question-answering squad
Last synced: 13 Nov 2024
https://github.com/navalnica/be_nlp_speech_resources
Links to Belarusian NLP and Speech resources
asr belarus belarusian belarusian-language natural-language-processing nlp speech speech-processing speech-recognition speech-synthesis speech-to-text stt text-to-speech tts
Last synced: 13 Nov 2024
https://github.com/hienduyph/oxford-deepnlp-2017
🚀 🎉 ✨ Oxford Deep NLP 2017 Course Materials and Practicals, Solutions
Last synced: 09 Nov 2024
https://github.com/sarthakjshetty/pyresearchinsights
End-to-end NLP tool to analyze research publications. Published in Ecology & Evolution 2021.
gensim natural-language-processing nlp python scientific-analysis spacy text-mining
Last synced: 12 Oct 2024
https://github.com/pooya-mohammadi/persian-spell-checker-kenlm
Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.
bash kenlm language-model nlp persian python spellcheck spellchecker symspell
Last synced: 04 Aug 2024
https://github.com/dalmia/quora-question-pairs
The code for our submission in Kaggle's competition Quora Question Pairs which ranked in the top 25%.
deep-learning machine-learning nlp quora-question-pairs tensorflow
Last synced: 30 Oct 2024
https://github.com/sudharsan13296/word2vec-from-scratch
A simple Word2vec implementation from scratch in TensorFlow, written for learning purposes
deep-learning natural-language-processing nlp scratch word2vec word2vec-algorithm word2vec-model
Last synced: 15 Nov 2024
https://github.com/ownthink/chatbot
A chatbot based on semantic understanding and a knowledge graph
chatbot knowledgegraph nlp nlu qa
Last synced: 07 Nov 2024
https://github.com/explosion/vscode-prodigy
🧬 A VS Code extension for annotating data with Prodigy
annotation-tool data-annotation data-labeling data-labeling-tools data-science labeling-tool nlp prodigy spacy vscode vscode-extension
Last synced: 07 Oct 2024
https://github.com/thunlp/cokebert
CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models
bert knowledge-graph nlp pretrained-language-model pytorch
Last synced: 10 Nov 2024
https://github.com/songyouwei/fiction_generator
Fiction generator with TensorFlow. A novel generator that imitates the style of Wang Xiaobo.
deep-learning keras lstm nlp seq2seq tensorflow text-generation
Last synced: 11 Nov 2024
https://github.com/anthonysigogne/web-search-engine-ui
UI - a simple web search engine
elasticsearch google-search indexing nlp python search-engine
Last synced: 12 Nov 2024
https://github.com/dsdanielpark/gpt2-bert-medical-qa-chat
Medical domain-focused GPT-2 fine-tuning, optimization, and lightweighting research repository (compared to GPT-4).
bert chatgpt gpt2 gpt4 medical-chatbot natural-language-processing nlp nlp-keywords-extraction
Last synced: 14 Nov 2024
https://github.com/arjunpatel7/perfect-prompt
An approach to creating the perfect prompt for any image generation task.
cohere nlp prompt stable-diffusion streamlit text-generation
Last synced: 11 Oct 2024