Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-07-29 13:51:14 UTC
https://github.com/pyurbans/urbans
A tool for translating text from a source grammar to a target grammar (context-free) with a corresponding dictionary.
artificial-intelligence data-science machine-translation nlp python
Last synced: 02 Aug 2024
https://github.com/banyh/PyStanfordNLP
A Python Wrapper of Stanford Chinese Segmenter
nlp postagging python-wrapper stanford stanford-chinese-segmenter
Last synced: 02 Aug 2024
https://github.com/AnthonyMRios/adversarial-relation-classification
Unsupervised domain adaptation method for relation extraction
bioinformatics biomedical-data-science machine-learning natural-language-processing nlp nlp-machine-learning relation-extraction
Last synced: 03 Aug 2024
https://github.com/study-assist/browser-extension
A tool to help you organise your bookmarks intelligently
bookmarks bookmarks-manager browser-extension data-analysis machine-learning natural-language-processing nlp
Last synced: 01 Aug 2024
https://github.com/fursovia/geometric_embedding
"Zero-Training Sentence Embedding via Orthogonal Basis" paper implementation
Last synced: 03 Aug 2024
https://github.com/IITGuwahati-AI/Fake-News-Detection
Detecting Fake News using AI
bert clickbait fake-news-articles fake-news-detection huggingface huggingface-transformer natural-language-processing nlp python3 pytorch tensorflowjs transformer
Last synced: 01 Aug 2024
https://github.com/proycon/deepfrog
An NLP-suite powered by deep learning
deep-learning deep-neural-networks dutch folia frog nlp transformers
Last synced: 03 Aug 2024
https://github.com/cmccomb/rust-stop-words
Common stop words in a variety of languages
languages natural-language-procressing nlp nltk rust-crate stopwords
Last synced: 04 Aug 2024
https://github.com/systats/textlearnR
A simple collection of well working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
classification hyperparameter-optimization keras nlp r text-mining
Last synced: 05 Aug 2024
https://github.com/MilaNLProc/bertlang
A web interface to understand language-specific BERT-models
artificial-intelligence bert-model machine-learning nlp nlp-machine-learning
Last synced: 28 Aug 2024
https://github.com/sno2/bertml
Use common pre-trained ML models in Deno!
bert deno machine-learning nlp rust
Last synced: 17 Aug 2024
https://github.com/Hoiy/berserker
Berserker - BERt chineSE woRd toKenizER
bert bert-chinese chinese-nlp chinese-word-segmentation nlp sequence-to-sequence state-of-the-art tensorflow tokenizer tpu
Last synced: 01 Aug 2024
https://github.com/MeiFagundes/PolarisAI
Personal Assistant Engine built with ML.NET.
asp-net-core dotnet machine-learning ml-net mlnet natural-language-processing nlp personal-assistant personal-assistant-engine rest-api
Last synced: 02 Aug 2024
https://github.com/jaron/sciencegraph
A comprehensive knowledge graph of scientific concepts
knowledge-graph neo4j nlp question-answering
Last synced: 01 Aug 2024
https://github.com/yuanxiaosc/Deep_dynamic_contextualized_word_representation
TensorFlow code and pre-trained models for A Dynamic Word Representation Model Based on Deep Context. It combines the ideas of the BERT model and ELMo's deep contextual word representations.
Last synced: 01 Aug 2024
https://github.com/innodatalabs/tbert
PyTorch port of BERT ML model
bert-model natural-language-processing neural-network nlp pytorch
Last synced: 01 Aug 2024
https://github.com/hsgodhia/squad_rasor_nn
PyTorch implementation of the RaSoR paper "Learning Recurrent Span Representations for Extractive Question Answering" (Lee et al. 2016) and experiments with various neural components
deep-learning machine-comprehension nlp pytorch
Last synced: 07 Aug 2024
https://github.com/StefanHeng/Symbolic-Music-Generation
Symbolic music generation taking inspiration from NLP and human composition process
autoregressive-models melody-extraction melody-generation midi music-generation music-xml nlp reformer representation-learning transformer transformer-decoder transformer-xl transformers-models
Last synced: 05 Aug 2024
https://github.com/grafit-io/grafit
grafit.io - shared knowledge
django-rest-framework knowledge-graph knowledge-management nlp react
Last synced: 01 Aug 2024
https://github.com/yuanxiaosc/Deep_dynamic_word_representation
TensorFlow code and pre-trained models for A Dynamic Word Representation Model Based on Deep Context. It combines the ideas of the BERT model and ELMo's deep contextual word representations.
Last synced: 22 Aug 2024
https://github.com/grahamwaters/lorebook_generator_for_novelai
Generates a lorebook for novelai
author history nlp novelai research writing-tool
Last synced: 01 Aug 2024
https://github.com/karoly-hars/gpt2_episode_summary_generator
Utilizing webscraping and state-of-the-art NLP to generate TV show episode summaries.
artifical-intelligense deep-learning gpt2 imdb natural-language-generation natural-language-processing neural-networks nlp pytorch-transformers scrapy torch webscraping wikipedia
Last synced: 03 Aug 2024
https://github.com/SapienzaNLP/xl-amr
XL-AMR is a sequence-to-graph cross-lingual AMR parser that exploits transfer learning (EMNLP2020).
abstract-meaning-representation amr amr-graphs amr-parsing natural-language-processing nlp semantic-parsing translations
Last synced: 31 Jul 2024
https://github.com/snipsco/snips-nlu-parsers
Rust crate for entity parsing
entity-recognition entity-resolution nlp nlu rust
Last synced: 01 Aug 2024
https://github.com/reinfer/blingfire-rs
Rust wrapper for the BlingFire tokenization library
machine-learning nlp rust rust-wrapper tokenizer
Last synced: 04 Aug 2024
https://github.com/Etwas-Builders/Twitter-Source-Bot
Ever wanted to know the source of a tweet? Just @whosaidthis_bot and I'll tell you where it came from
bot mozilla-builders nlp source-verify twitter-bot twitter-source-bot web-scraping
Last synced: 01 Aug 2024
https://github.com/ppke-nlpg/purepos
PurePos is an open source hybrid morphological tagger.
hungarian morphological-analysis nlp parser pos-tagger tagger
Last synced: 03 Aug 2024
https://github.com/brunoarine/findlike
Command-line tool that finds lexically similar documents in relation to a reference text file or ad-hoc query
bm25 nlp similarity-search tfidf
Last synced: 07 Aug 2024
https://github.com/JackHCC/Arxiv-NLP-Reporter
Automatically fetches the latest NLP-related papers from arXiv every day (Arxiv Natural Language Processing Paper Automatic Crawl Daily)
Last synced: 02 Aug 2024
https://github.com/totalhack/zillion-web
Zillion Web: A Demo UI and Web API for Zillion
analytics data-warehousing demo-ui docker-swarm-mode dockerswarm fastapi nlp text-to-sql typescript vue warehouse zillion
Last synced: 31 Jul 2024
https://github.com/bluelovers/node-segment
Chinese word segmentation module for Simplified and Traditional Chinese, built on Node.js and using web novels as sample data
chinese javascript nlp nodejs segment typescript
Last synced: 02 Aug 2024
https://github.com/KxSystems/nlp
Natural-language processing library
clustering dataset embedpy kdb natural-language-processing nlp parsing python q vector
Last synced: 02 Aug 2024
https://github.com/dair-ai/nlp_highlights
✨ A report of the most important NLP highlights (A Yearly Report - 2018, 2019)
deep-learning machine-learning nlp
Last synced: 01 Aug 2024
https://github.com/EmreTaha/Unsupervised-Domain-Adaptation-with-BERT
Unsupervised domain adaptation with BERT for Amazon food product reviews sentiment analysis.
adversarial-learning amazon-food-reviews bert bert-model colab domain-adaptation nlp sentiment-analysis tensorflow unsupervised-learning
Last synced: 03 Aug 2024
https://github.com/JnRMnT/ZemberekDotNet
ZemberekDotNet is the .NET Port of Zemberek-NLP (Natural Language Processing tools for Turkish).
csharp language machine-learning morphology natural-language-processing nlp nuget turkish zemberek zemberek-nlp
Last synced: 02 Aug 2024
https://github.com/seujung/gluonnlp_tutorial
GluonNLP tutorial for Pycon2019
Last synced: 02 Aug 2024
https://github.com/LanguageMachines/libfolia
FoLiA library for C++
folia library natural-language-processing nlp
Last synced: 31 Jul 2024
https://github.com/ndabAP/assocentity
Package assocentity returns the mean distance from tokens to an entity and its synonyms
go golang natural-language-processing nlp social-sciences tokenizer
Last synced: 30 Jul 2024
https://github.com/chachamatcha/DL_Text_Classification
Collection of Deep Learning Text Classification Models in Keras; Includes a GPU tutorial.
benchmark benchmarks deep-learning deep-learning-tutorial deep-neural-networks gpu kaggle kaggle-competition keras keras-tutorials natural-language-processing nlp python tensorflow-gpu text text-classification text-processing toxic-comment-classification tutorial
Last synced: 07 Aug 2024
https://github.com/adriacabeza/DeepCatalan
🤖 Deep Catalan: Bringing the Catalan language closer to deep learning using ULMFiT.
catalan catalan-language classificador fastai fine-tuning nlp pytorch ulmfit
Last synced: 30 Jul 2024
https://github.com/varunon9/chat-reply-suggestions
Auto-reply suggestions for chat messages/emails (like Gmail and LinkedIn), built using the rasa_nlu framework.
chat-reply chatbot nlp rasa rasa-nlu
Last synced: 01 Aug 2024
https://github.com/delph-in/pydmrs
A library for manipulating DMRS structures
computational-linguistics delph-in dependency-graph dmrs formal-semantics hpsg linguistics minimal-recursion-semantics mrs natural-language natural-language-processing nlp python semantics
Last synced: 07 Aug 2024
https://github.com/helboukkouri/embedding-visualization
This is a project for visualizing word embeddings based on the work of Andrei Kashcha (@anvaka).
fasttext glove graphs nlp visualization word-embeddings word2vec
Last synced: 03 Sep 2024
https://github.com/alexandrevl/supersummarizeai
Unleash the power of AI with SuperSummarizeAI! Effortlessly extract, condense, and clip content from webpages and YouTube videos using ChatGPT. Turning endless streams of content into digestible summaries.
beautifulsoup chatgpt content-analysis multilingual nlp openai papperclip text text-processing text-summarization web-scraping youtube
Last synced: 02 Aug 2024
https://github.com/tech4germany/bam-inclusify
INCLUSIFY is a tool to support the practical use of diversity-sensitive language in German.
diversity equality german govtech language nlp react t4g tech4germany
Last synced: 03 Aug 2024
https://github.com/amir9ume/urdu_ghazals_rekhta
Dataset for Urdu Ghazals
data dataset language-model machine-learning nlp parser rekhta urdu
Last synced: 04 Aug 2024
https://github.com/Rajan-sust/WikiTextCorpusDownloader
A Language Independent Wikipedia Text Corpus Downloader
gensim nlp python3 tensorflow wikipedia
Last synced: 31 Jul 2024
https://github.com/KompleteAI/xllm
🦖 X—LLM: Simple & Cutting Edge LLM Finetuning
alpaca cerebras chatgpt deep-learning deep-neural-networks deeplearning falcon gpt language-model large-language-models llama2 llama2-7b llm mistal mistral mistralai natural-language-processing nlp openai vicuna
Last synced: 05 Aug 2024
https://github.com/AnatoliiPotapov/squad
fastqa keras machine-comprehension ml nlp qa qrqa question-answering squad tensorflow website wikipedia
Last synced: 02 Aug 2024
https://github.com/trinker/sentimentpy
A Python port of the #rstats sentimentr package
emotion nlp polarity sentiment text-mining
Last synced: 05 Aug 2024
https://github.com/rosette-api/java
Rosette API Client Library for Java
entity-extraction entity-linking fuzzy-matching java machine-learning name-translation natural-language-processing nlp rosette text-analytics text-mining tokenization
Last synced: 03 Aug 2024
https://github.com/Cartus/AMR-Parser
Better Transition-Based AMR Parsing with a Refined Search Space (authors' DyNet implementation for the EMNLP18 paper)
aligner amr-parser nlp semantic-parsing transition-based-parser
Last synced: 02 Aug 2024
https://github.com/moj-analytical-services/pq-tool
Tool to analyse past parliamentary questions with visualisation in RShiny
clustering latent-semantic-analysis nlp shiny
Last synced: 13 Aug 2024
https://github.com/MaartenGr/Reviewer
Tool for extracting and analyzing IMDB reviews
bert disney imdb ner nlp sentiment-analysis
Last synced: 04 Aug 2024
https://github.com/fostroll/junky
Layers, datasets and utilities for PyTorch
artificial-intelligence deep-learning machine-learning natural-language-processing nlp python pytorch
Last synced: 03 Aug 2024
https://github.com/Pushkar1853/Cover-Generator
Application of OpenAI tools such as Whisper, DALL-E, and ChatGPT to generate album covers from audio
chat-gpt computer-vision dall-e music nlp openai stable-diffusion whisper-ai
Last synced: 31 Jul 2024
https://github.com/Captmoonshot/data-bore
A Django REST API with an embedded ML model for sentiment analysis of movie reviews.
dash dashboard django django-rest-framework machine-learning nlp pandas plotly powerbi python
Last synced: 08 Aug 2024
https://github.com/contefranz/OpTop
Optimal topic identification from a pool of Latent Dirichlet Allocation models
latent-dirichlet-allocation lda model-selection natural-language-processing nlp text-mining topic-modeling
Last synced: 05 Aug 2024
https://github.com/simonepri/varname-seq2seq
📄Source code variable naming using a seq2seq architecture
Last synced: 30 Jul 2024
https://github.com/doubledaibo/compcaption_neurips2018
A Neural Compositional Paradigm for Image Captioning
compositionality image-captioning neurips-2018 nlp
Last synced: 31 Jul 2024
https://github.com/minhd-vu/toxicity-filter
Natural language processing API to detect toxic chat.
Last synced: 01 Aug 2024
https://github.com/imraviagrawal/ReadingComprehension
Bi-Directional Attention Flow for Machine Comprehension
artificial-intelligence attention-mechanism bilstm deep-learning deep-recurrent-q-network glove-embeddings glove-vectors gru lstm machine-comprehension natural-language-processing nlp nlp-machine-learning question-answering readingcomprehension squad standard umassamherst word2vec wordembedding
Last synced: 07 Aug 2024
https://github.com/UCSB-NLP-Chang/ULD
Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference'
large-language-models nlp pytorch transformers unlearning
Last synced: 31 Jul 2024
https://github.com/ariya/tebakmasa
Infer the date and time from the general description in Bahasa Indonesia
bahasa bahasa-indonesia date indonesian nlp time timestamp
Last synced: 01 Aug 2024
https://github.com/mhezarei/ai-bot
2020 AI bot challenge (ai-bot.ir) repository. This program answers a given question with a specific format and subject.
Last synced: 04 Aug 2024
https://github.com/praekelt/feersum-nlu-api-wrappers
Swagger spec and generated Python language wrappers for the FeersumNLU HTTP Rest API for building intelligent chatbots.
chatbot-framework nlp nlp-machine-learning
Last synced: 13 Aug 2024
https://github.com/AlbertSuarez/casescan
🔍 Clinical cases search by similarity specialized in Covid-19
nlp python react similarity-search
Last synced: 30 Jul 2024
https://github.com/derintelligence/az-summarization
Abstractive summarization for Azerbaijani language
azerbaijan dataset language linguistics nlp summarization
Last synced: 02 Aug 2024
https://github.com/alvations/myth
Myanmar and Thai Language Resources
machine-translation myanmar nlp thai
Last synced: 30 Jul 2024
https://github.com/uds-lsv/Multi-tasking_Learning_With_Unreliable_Labels
Extends the NLNN algorithm proposed by Bekker & Goldberger to a multi-task learning setup to handle noisy labels. Artificial annotators are often used to extend low-resource data; this setup aims to generate clean labeled training data from such annotators.
machine-learning nlp noise-reduction
Last synced: 01 Aug 2024
https://github.com/rosette-api/nodejs
Rosette API Client Library for Node.js
categorization entity-extraction language-detection lemmatization machine-learning morphology name-translation natural-language-processing nlp nodejs npm relationship-extraction text-analysis tokenization transliteration
Last synced: 03 Aug 2024
https://github.com/strayMat/tag_serve
Deployable Neural Tagger implementation for Named Entity Recognition
bilstm-crf deep-learning docker-image flask flask-application machine-learning ner neural-network nlp pytorch tagger
Last synced: 03 Sep 2024
https://github.com/BrianWeinstein/googlenlp
An Interface to Google's Cloud Natural Language API
api cran google-cloud-platform nlp r
Last synced: 13 Aug 2024
https://github.com/clnnn/chat-summarizer
💬 Real-time chat application prototype that can summarise the entire chat log
angular flask huggingface-transformer java ngrx nlp python websocket
Last synced: 31 Jul 2024
https://github.com/patrick-miller/textbook-concept-map
Build a concept map from textbooks using DBpedia Spotlight
concept-map dbpedia-spotlight educational-technology nlp
Last synced: 01 Aug 2024
https://github.com/denismosolov/alice-entities-library
A set of named entities for the Yandex.Dialogs platform. Use it when building Alice skills.
alice-sdk alice-skills nlp yandex-dialogs
Last synced: 03 Aug 2024
https://github.com/pharo-ai/Polyglot
A library for Natural Language Processing
natural-language-processing nlp pharo
Last synced: 03 Aug 2024
https://github.com/oneapi-src/disease-prediction
AI Starter Kit for the implementation of AI-based NLP Disease Prediction system using Intel® Extension for PyTorch* and Intel® Neural Compressor
Last synced: 01 Aug 2024
https://github.com/mochi-co/ngrams
A Go n-gram indexer for natural language processing with modular tokenizers and data stores
bigrams go golang language-model natural-language-processing ngram ngrams nlp tokenization trigrams
Last synced: 02 Aug 2024
https://github.com/rosette-api/csharp
Rosette API Client Library for C#
capi csharp entity-extraction language-identification machine-learning morphology name-translation natural-language-processing nlp nuget rosette text-analysis text-analytics text-embedding visual-studio
Last synced: 03 Aug 2024
https://github.com/webpolis/musai
Machine learning-powered music generation. Full-featured tokenizer, customization options, and high-quality output files. Integration with music production tools.
deep-learning generative-art large-language-models llm machine-learning midi music music-generation nlp recurrent-neural-networks rnn text-generation tokenizer vae variational-autoencoder
Last synced: 05 Aug 2024
https://github.com/rosette-api-community/document-summarization
Summarize documents based on content extracted via Rosette API
document-summarization entity-extraction machine-learning morphological-analysis morphology named-entities natural-language-processing nlp python
Last synced: 03 Aug 2024
https://github.com/codeastra2/ChatGPTDevFriendly
A wrapper over the ChatGPT Python APIs, for a better developer experience.
api chatgpt chatgpt-api chatgpt-api-wrapper chatgpt-bot chatgpt-python chatgpt-sdk gpt3 gpt4-api nlp openai-api python
Last synced: 01 Aug 2024
https://github.com/kaushalpowar/talk_to_pdf
Talk to your pdf using OpenAI
ai ghdesktop github gpt-3 learn llm nlp nlp-machine-learning opeanai
Last synced: 01 Aug 2024
https://github.com/thomas-chauvet/names_transliteration
Neural Machine Translation (NMT) applied to transliterating names from Arabic characters to Latin characters (romanization).
arabic characters cli data dataset deep-learning latin neural-network nlp nmt romanization seq2seq translation transliteration typer-cli
Last synced: 03 Aug 2024
https://github.com/seanghay/khmernormalizer
A missing toolkit for Khmer Natural Language Processing.
khmer nlp normalization normalizer verbalization
Last synced: 01 Aug 2024
https://github.com/minnesotanlp/Quantifying-Annotation-Disagreement
Official implementation of Wan et al's paper "Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information" (AAAI 2023)
aaai ai annotation natural-language-processing nlp roberta
Last synced: 01 Aug 2024
https://github.com/rosette-api/R-Binding
R client binding for the Rosette API
categorization entity-extraction fuzzy-matching machine-learning morphology name-translation natural-language-processing nlp r relationship-extraction sentiment-analysis text-analysis tokenization transliteration
Last synced: 03 Aug 2024
https://github.com/Anbani/word-embeddings
anbani georgian natural-language-processing nlp word-embeddings
Last synced: 31 Jul 2024
https://github.com/rosette-api/php
Rosette API Client Library for PHP
entity-extraction language-identification lemma morphology named-entity-recognition natural-language-processing nlp php text-analytics text-embedding tokenization
Last synced: 03 Aug 2024
https://github.com/Davisy/Texthero-Python-Toolkit
Texthero is a simple Python toolkit for working with text-based datasets. It provides quick and effortless functionality to preprocess, represent, map into vectors, and visualize text data in just a couple of lines of code.
machine-learning natural-language-processing nlp preprocessing python
Last synced: 01 Aug 2024
https://github.com/VoXera/VoXera
An Open-Source Persian Language Techs Toolkit with Python
deep-learning deep-neural-networks keyword-extraction machine-learning natural-language-processing nlp openai persian persian-language speech-recognition speech-to-text text-processing vosk vosk-api whisper
Last synced: 04 Aug 2024
https://github.com/Joppewouts/belabBERT
🤧belabBERT: Repository for a new Dutch language model based on the RoBERTa architecture
bert language-model nlp roberta
Last synced: 03 Aug 2024
https://github.com/aman5319/Classification-Report
This repo helps to track model Weights, Biases and Gradients during training with loss tracking and gives detailed insight for Classification-Model Evaluation
classification image-classification loss-plotting metrics-visualization model-visualization nlp pytorch sklearn tensorboard tensorboard-pytorch tensorboard-visualization text-classification
Last synced: 03 Aug 2024
https://github.com/sinaahmadi/KurdishTokenization
Tokenization resources for Kurdish (Sorani & Kurmanji dialects)
kurdish kurdish-language-processing kurmanji natural-language-processing nlp sorani tokenization
Last synced: 03 Aug 2024
https://github.com/DidierRLopes/similarstocks
This repository will hold similar stocks based on their description through NLP models
finance nlp similarity-search stocks
Last synced: 01 Aug 2024
https://github.com/giocoal/reddit-tldr-summarizer-and-topic-modeling
Extreme Extractive Text Summarization and Topic Modeling (using LSA and LDA techniques) over Reddit Posts from TLDRHQ dataset.
extreme-summarization latent-dirichlet-allocation latent-semantic-analysis lda lda-model lsa lsa-model nlp part-of-speech-tagging reddit reddit-bot reddit-dataset social-media summarization text-analysis text-preprocessing text-summarization tldr tldr9 topic-modeling
Last synced: 29 Jul 2024
https://github.com/sinaahmadi/KurdishMT
Towards Machine Translation for the Kurdish Language
kurdish kurdish-language-processing less-resource-languages machine-translation nlp
Last synced: 03 Aug 2024
https://github.com/maxoodf/tgnews
Telegram Data Clustering Contest (Bossy Gnu's submission)
cpp document-clustering document-embedding document-similarity nlp nlp-machine-learning telegram word2vec
Last synced: 01 Aug 2024