Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced results in language modeling, parsing, and many other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-07-29 13:51:14 UTC
https://github.com/fdalvi/NeuroX
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
explainable-ai natural-language-processing neurons nlp nlp-machine-learning
Last synced: 03 Aug 2024
https://github.com/princeton-nlp/LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
Last synced: 02 Aug 2024
https://github.com/saidziani/Arabic-News-Article-Classification
Automatic categorization of documents consists of assigning a category to a text based on the information it contains. We'll follow different supervised machine learning approaches.
arabic-language arabic-nlp corpora machine-learning nlp nltk python3 text-categorization
Last synced: 03 Aug 2024
https://github.com/cambridgeltl/visual-spatial-reasoning
[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.
computer-vision multimodal-deep-learning nlp vision-and-language
Last synced: 01 Aug 2024
https://github.com/foxminchan/LawKnowledge
A legal knowledge search and Q&A application based on Vietnam's Legal Code and legal document database ⚖️
generative-ai microservice natural-language-processing nlp nx searching semantic-search
Last synced: 03 Aug 2024
https://github.com/PKU-YuanGroup/Hallucination-Attack
An attack that induces hallucinations in LLMs
adversarial-attacks ai-safety deep-learning hallucinations llm llm-safety machine-learning nlp
Last synced: 05 Sep 2024
https://github.com/mohabmes/Arabycia
Arabic NLP tool used to perform text search, POS tagging, translation, auto-diacritization, etc.
arabic-language arabic-nlp nlp
Last synced: 03 Aug 2024
https://github.com/maxoodf/russian_news_corpus
Russian mass media stemmed texts corpus / Corpus of lemmatized (morphologically normalized) texts from Russian mass media
articles corpus machine-learning ml nlp nlp-machine-learning russian text word2vec
Last synced: 15 Aug 2024
https://github.com/arazd/ProgressivePrompts
Progressive Prompts: Continual Learning for Language Models
continual-learning llms nlp prompt-tuning
Last synced: 09 Aug 2024
https://github.com/feldberlin/timething
Timething is a library for aligning text transcripts with their audio recordings.
alignment audio cli forced-alignment huggingface nlp python speech speech-recognition tts
Last synced: 31 Jul 2024
https://github.com/philgooch/abbreviation-extraction
Python3 implementation of the Schwartz-Hearst algorithm for extracting abbreviation-definition pairs
abbreviations information-extraction keyword-extraction nlp python3
Last synced: 01 Aug 2024
https://github.com/legacyai/tf-transformers
State of the art faster Transformer with TensorFlow 2.0 (NLP, Computer Vision, Audio).
bert gpt2 keras language-model natural-language-processing nlp nlp-library tensorflow tensorflow2 text-classification text-generation transformer
Last synced: 01 Aug 2024
https://github.com/mynameisvinn/EmailParser
remove signature blocks from emails
email-parser email-parsing natural-language-processing nlp python signature-blocks
Last synced: 07 Aug 2024
https://github.com/thushv89/manning_tf2_in_action
The official code repository for "TensorFlow in Action" by Manning.
computer-vision deep-learning machine-learning nlp notebook python tensorflow tensorflow2 tf tf2
Last synced: 31 Jul 2024
https://github.com/forzagreen/n2words
Convert numerical numbers to written numbers, in 25+ languages.
convert-numbers language natural-language nlp
Last synced: 03 Aug 2024
https://github.com/Dumbris/trunklucator
Python module for data scientists for quickly creating annotation projects.
active-learning annotation annotation-tool data-science machine-learning nlp
Last synced: 01 Aug 2024
https://github.com/adhaamehab/textblob-ar
Arabic support for textblob
arabic-language arabic-nlp machine-learning natural-language-processing nlp part-of-speech-tagger sentiment-analysis spelling-correction text-classification text-similarity textblob word-embeddings
Last synced: 03 Aug 2024
https://github.com/ARBML/tkseem
Arabic Tokenization Library. It provides many tokenization algorithms.
arabic-nlp nlp tkseem tokenization
Last synced: 31 Jul 2024
https://github.com/SentometricsResearch/sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
nlp prediction sentiment-analysis text-mining time-series
Last synced: 02 Aug 2024
https://github.com/datquocnguyen/jLDADMM
A Java package for the LDA and DMM topic models
gibbs-sampling lda nlp short-text topic-modeling topic-models
Last synced: 02 Aug 2024
https://github.com/hyunwoongko/summarizers
Package for controllable summarization
Last synced: 02 Aug 2024
https://github.com/indiejoseph/chinese-char-rnn
Character-Level language models
chinese deep-learning language-modeling nlp rnn tensorflow
Last synced: 02 Aug 2024
https://github.com/krystalan/Multi-hopRC
📔 Notes for multi-hop reading comprehension and open-domain question answering
machine-reading-comprehension natural-language-processing nlp open-domain-qa paper-list question-answering
Last synced: 02 Aug 2024
https://github.com/unilight/R-NET-in-Tensorflow
R-NET implementation in TensorFlow.
machine-comprehension nlp squad tensorflow
Last synced: 07 Aug 2024
https://github.com/Neuraxio/New-Empty-Python-Project-Base
The Perfect Python Project Template. Bored of coding anew the same thing for your new Python projects? Here is what you need. Click below on the "use this template" green button to start using it instantly. Rename the "project" folder and all references to this folder to customize your project name.
base computer-vision library nlp project-template project-templates pypi-package python-library time-series
Last synced: 03 Aug 2024
https://github.com/felladrin/MiniSearch
Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
ai artificial-intelligence generative-ai gpu-accelerated information-retrieval llm llm-inference machine-learning nlp question-answering ratchet-ml retrieval-augmented-generation search search-engine searxng typescript web-llm webapp wllama
Last synced: 31 Jul 2024
https://github.com/bnosac/textrank
Summarise text by finding relevant sentences and keywords using the Textrank algorithm
natural-language-processing nlp r textrank textrank-algorithm
Last synced: 05 Aug 2024
https://github.com/Flight-School/ner
A command-line utility for extracting names of people, places, and organizations from text on macOS.
cli macos named-entity-recognition nlp swift
Last synced: 05 Aug 2024
https://github.com/cbilgili/zemberek-nlp-server
REST Docker server built on the Zemberek Turkish NLP Java library
docker javascript nlp part-of-speech-tagger rest sentence-tokenizer spark turkish turkish-language zemberek
Last synced: 02 Aug 2024
https://github.com/bukosabino/justicio
Building an assistant for Boletin Oficial del Estado (BOE) using Retrieval Augmented Generation (RAG)
Last synced: 31 Jul 2024
https://github.com/uripeled2/llm-client-sdk
SDK for using LLM
ai ai21labs api async bard bard-api chatgpt-api generative-ai gpt huggingface huggingface-transformers large-language-models llm llms nlp openai openai-api palm-api python sdk
Last synced: 01 Aug 2024
https://github.com/telecombcn-dl/2017-persontyle
Applied Deep Learning Workshop London 2017
computer-vision deep-learning nlp
Last synced: 07 Aug 2024
https://github.com/LanguageMachines/frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
computational-linguistics dependency-parser dutch folia lemmatiser morphological-analyser morphology named-entity-recognition natural-language-processing nlp pos-tagger syntax text-processing
Last synced: 31 Jul 2024
https://github.com/adrien2p/nestjs-dialogflow
Dialogflow module that simplifies webhook handling for your NLP application using NestJS 📡
addons dialogflow google-dialogflow nestjs nlp typescript webhook
Last synced: 01 Aug 2024
https://github.com/sorenlind/lemmy
🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪
danish lemma lemmatizer nlp spacy swedish
Last synced: 31 Jul 2024
https://github.com/nalbion/whisper-server
streaming speech to text server using Whisper
Last synced: 02 Aug 2024
https://github.com/PhantomInsights/comments-generator
A Reddit bot that generates new context-aware comments using Markov chains trained on the comment history of a given set of users or subreddits.
markov-chain nlp praw python3 reddit-bot requests
Last synced: 01 Aug 2024
https://github.com/akashp1712/nlp-akash
Natural Language Processing notes and implementations.
natural-language-processing nlp nlp-akash nltk summarization text-summarization
Last synced: 03 Aug 2024
https://github.com/Loodos/turkish-language-models
Transformer based Turkish language models
language-models natural-language-processing nlp turkish
Last synced: 02 Aug 2024
https://github.com/KudoAI/googlegpt
🤖 Bring the magic of ChatGPT to Google Search (powered by GPT-4!)
ai bot bots chatbot chatbots chatgpt experimental generative-ai google gpt gpt-4 greasemonkey llm machine-learning ml nlp openai search search-engine userscripts
Last synced: 31 Jul 2024
https://github.com/jantrienes/nereval
Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.
evaluation-metrics machine-learning named-entity-recognition nlp
Last synced: 07 Aug 2024
https://github.com/prajjwal1/fluence
A deep learning library based on Pytorch focussed on low resource language research and robustness
attention deep-learning nlp pytorch transformers
Last synced: 07 Aug 2024
https://github.com/AlirezaTheH/perke
A keyphrase extractor for Persian
data-mining data-processing information-retrieval keyphrase keyphrase-extraction keyphrase-extractor keyword keyword-extraction keyword-extractor machine-learning ml natural-language-processing nlp persian persian-language python text-mining text-processing unsupervised-learning
Last synced: 04 Aug 2024
https://github.com/Flight-School/sentiment
A command-line utility that evaluates the emotional sentiment of natural language text.
macos nlp polarity sentiment-analysis swift
Last synced: 09 Aug 2024
https://github.com/leehanchung/cs182
Berkeley CS182/282A Designing, Visualizing and Understanding Deep Neural Networks
berkeley cnn cs182 cs231n cs231n-assignment natural-language-processing nlp pytorch reinforcement-learning tensorflow transformer
Last synced: 31 Jul 2024
https://github.com/RUCAIBox/MVP
This repository is the official implementation of our paper MVP: Multi-task Supervised Pre-training for Natural Language Generation.
data-to-text dialog multi-task-learning natural-language-generation natural-language-processing nlg nlp plm pre-trained-model question-answering question-generation seq2seq sequence-to-sequence story-generation summarization text-generation
Last synced: 03 Aug 2024
https://github.com/bnosac/pattern.nlp
R package to perform sentiment analysis and Parts of Speech tagging for Dutch/French/English/German/Spanish/Italian
nlp pattern pos-tagging r sentiment-analysis text-mining
Last synced: 05 Aug 2024
https://github.com/VietHoang1512/khmer-nltk
Khmer language processing toolkit
crf khmer-language nlp nlp-library part-of-speech-tagging segmentation sentence-segmenter word-segmenter
Last synced: 01 Aug 2024
https://github.com/Zlasejd/HuangDi
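Huang-Di model repository: a large language model for question answering over ancient Traditional Chinese Medicine texts, based on Ziya-LLaMA-13B-V1.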
ancient-books llm nlp qusetion-answer tcm
Last synced: 03 Aug 2024
https://github.com/abitdodgy/gibran
Gibran is an Elixir natural language processor, and a port of WordsCounted.
elixir-lang natural-language-processing nlp
Last synced: 01 Aug 2024
https://github.com/LanguageMachines/ucto
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can be easily extended to suit other languages. It has been incorporated for tokenizing Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto
computational-linguistics folia language natural-language-processing nlp punctuation tokeniser
Last synced: 31 Jul 2024
https://github.com/ParikhKadam/bidaf-keras
Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2
bidaf deep-learning deep-neural-networks deeplearning keras keras-models keras-neural-networks keras-tensorflow machine-comprehension machine-intelligence natural-language-processing natural-language-understanding neural-nets neural-network neural-networks neuralnetwork nlp python3 question-answering tensorflow
Last synced: 07 Aug 2024
https://github.com/chnsh/deep-semantic-code-search
Deep Semantic Code Search aims to explore a joint embedding space for code and description vectors and then use it for a code search application
deep-learning deep-neural-networks nlp nlp-machine-learning
Last synced: 31 Jul 2024
https://github.com/techcentaur/PyLex
Perform lexical analysis on words, one word at a time.
cli lexical-analysis nlp poets python3 scraping words
Last synced: 30 Jul 2024
https://github.com/aiwithqasim/Free-Artificial-Intelligence-Resources
Welcome to this open source repository of FREE ARTIFICIAL INTELLIGENCE RESOURCES. Benefit from the free resources mentioned, and kindly give it a STAR and FORK it so that it reaches as many people as possible and everyone can take advantage.
ai article artificial-intelligence artificial-neural-networks blog data-science datascientist deep-learning freeresources hacktoberfest hecktoberfest2021 jobs machine-learning machine-learning-algorithms natural-language-processing nlp project python3 youtube
Last synced: 01 Aug 2024
https://github.com/mirfan899/Urdu
Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks.
machine-learning ner nlp sentiment-analysis spacy-models summarization urdu-language urdu-model
Last synced: 03 Aug 2024
https://github.com/sammous/spacy-lefff
Custom French POS and lemmatizer based on Lefff for spacy
dataesr eig-2018 entrepreneur-interet-general french french-pos lemmatizer nlp pos-tagging python spacy spacy-extensions
Last synced: 01 Aug 2024
https://github.com/bnosac/crfsuite
Labelling Sequential Data in Natural Language Processing with R - using CRFsuite
chunking conditional-random-fields crf crfsuite data-science intent-classification natural-language-processing ner nlp r r-package
Last synced: 31 Jul 2024
https://github.com/winkjs/wink-sentiment
Accurate and fast sentiment scoring of phrases with #hashtags, emoticons :) & emojis 🎉
emoji emoticons hashtag nlp sentiment sentiment-analysis sentiment-classification sentiment-scores wink
Last synced: 31 Jul 2024
https://github.com/datasciencecampus/pyGrams
Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
dsc-projects emergence-calculations natural-language-processing nlp nltk patents python scikit-learn tf-idf
Last synced: 31 Jul 2024
https://github.com/batzner/tensorlm
Wrapper library for text generation / language models at char and word level with RNN in TensorFlow
char-lm char-rnn language-model nlp tensorflow tensorflow-library
Last synced: 31 Jul 2024
https://github.com/nixon-voxell/UnityNLP
Natural Language Processing in Unity.
natural-language-processing natural-language-understanding nlp nlp-machine-learning nlu unity unity3d
Last synced: 02 Aug 2024
https://github.com/proycon/folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
computational-linguistics corpus file-format folia language library linguistic-annotation-framework linguistics nlp python xml
Last synced: 03 Aug 2024
https://github.com/avilum/llama-saas
A client/server for LLaMA (Large Language Model Meta AI) that can run ANYWHERE.
ai client-server facebook llama llm nlp
Last synced: 01 Aug 2024
https://github.com/ohnlp/MedTagger
MedTagger is a lightweight clinical NLP system built upon Apache UIMA.
Last synced: 04 Aug 2024
https://github.com/asigalov61/tegridy-tools
Symbolic Music NLP Artificial Intelligence Toolkit
architectures artificial-intelligence artificial-intelligence-systems computer-music deep-learning markovify midi midi-classification midi-processing midi-processor midi-search midi-toolkit music music-generation music-origami nanogpt nlp plagiarism-detection raspberry-pi symbolic-music
Last synced: 31 Jul 2024
https://github.com/IBM/MAX-Text-Sentiment-Classifier
Detect the sentiment captured in short pieces of text
docker-image ibm machine-learning machine-learning-models natural-language-processing natural-language-understanding nlp sentiment tensorflow
Last synced: 04 Aug 2024
https://github.com/OpenSUM/CPSUM
Code and Data Repo for COLING'22 paper "Noise-injected Consistency Training and Entropy-constrained Pseudo Labeling for Semi-supervised Extractive Summarization"
extractive-summarization nlp semi-supervised-learning
Last synced: 03 Aug 2024
https://github.com/salesforce/factualNLG
Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"
factual-consistency factuality large-language-models llm nlp summarization
Last synced: 31 Jul 2024
https://github.com/saturncloud/dask-pytorch-ddp
dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.
computer-vision dask deep-learning distributed-computing machine-learning nlp pytorch
Last synced: 03 Aug 2024
https://github.com/Legilibre/legi.py
Tools for manipulating the LEGI archives (French laws)
france laws legi legislation natural-language-processing nlp opendata python
Last synced: 03 Sep 2024
https://github.com/csvance/armchair-expert
Machine Learning Chatbot
ai bot discord machine-learning markov nlp python twitter
Last synced: 03 Aug 2024
https://github.com/xyntopia/pydoxtools
Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.
chatgpt document-analysis document-extraction extraction information-retrieval llm nlp pdf python
Last synced: 03 Aug 2024
https://github.com/NISH1001/tag-generator
A simple tool to generate tags for the given text (document) using TF-IDF.
Last synced: 01 Aug 2024
https://github.com/SamEdwardes/spacytextblob
A TextBlob sentiment analysis pipeline component for spaCy.
natural-language-processing nlp python spacy
Last synced: 04 Aug 2024
https://github.com/LaVi-Lab/CLEVA
[EMNLP 2023 Demo] CLEVA: Chinese Language Models EVAluation Platform
Last synced: 03 Aug 2024
https://github.com/nicolay-r/AREkit
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML
bert datasets frames language-models neural-networks nlp pandas pandas-dataframe prompt prompting relation-extraction sentiment-analysis tensorflow
Last synced: 01 Aug 2024
https://github.com/minasmz/Persian-Summarization
Statistical and Semantical Text Summarizer in Persian Language
doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm
Last synced: 04 Aug 2024
https://github.com/IBM/MAX-Toxic-Comment-Classifier
Detect 6 types of toxicity in user comments.
comments docker-image ibm machine-learning machine-learning-models natural-language-processing natural-language-understanding nlp pytorch
Last synced: 04 Aug 2024
https://github.com/vmenger/deduce
Deduce: de-identification method for Dutch medical text
deidentification dutch dutch-clinical-nlp information-extraction nlp python python-library text-mining text-processing
Last synced: 02 Aug 2024
https://github.com/MrinmoiHossain/Udacity-Deep-Learning-Nanodegree
The course covers knowledge that is useful for working on deep learning as an engineer. It includes simple neural networks and training, CNNs, autoencoders and feature extraction, transfer learning, RNNs, LSTMs, NLP, data augmentation, GANs, hyperparameter tuning, and model deployment and serving.
convolutional-networks convolutional-neural-networks deep-learning gans generative-adversarial-network long-short-term-memory-models lstms machine-learning nanodegree neural-network nlp pytorch recurrent-neural-networks rnn sentiment-analysis style-transfer transfer-learning udacity udacity-nanodegree
Last synced: 07 Aug 2024
https://github.com/Zlasejd/HuangDI
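Huang-Di model repository: a large language model for question answering over ancient Traditional Chinese Medicine texts, based on Ziya-LLaMA-13B-V1.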
ancient-books llm nlp qusetion-answer tcm
Last synced: 01 Aug 2024
https://github.com/htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
information-retrieval nlp persian persian-language persian-nlp persian-stemmer stemmer
Last synced: 04 Aug 2024
https://github.com/nzw0301/lightLDA
fast sampling algorithm based on CGS
lda machine-learning nlp python topic-modeling
Last synced: 01 Aug 2024
https://github.com/arne-cl/discoursegraphs
linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX).
conversion converter natural-language-processing networkx nlp python
Last synced: 01 Aug 2024
https://github.com/masci/banks
LLM prompt language based on Jinja
chatgpt llm nlp openai prompt-engineering prompt-toolkit
Last synced: 31 Jul 2024
https://github.com/lonePatient/bert-sentence-similarity-pytorch
This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.
bert nlp pytorch sentence-similarity text-classification
Last synced: 01 Aug 2024
https://github.com/FerdinandZhong/punctuator
A small seq2seq punctuator tool based on DistilBERT
bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq
Last synced: 03 Aug 2024
https://github.com/LanguageMachines/PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow
Last synced: 01 Aug 2024
https://github.com/prajjwal1/language-modelling
LM, ULMFit et al.
deep-learning language-modeling nlp pytorch
Last synced: 01 Aug 2024
https://github.com/onesuper/HuggingFace-Datasets-Text-Quality-Analysis
Retrieves parquet files from Hugging Face, then identifies and quantifies junky data, duplication, contamination, and biased content in a dataset using pandas
dataset huggingface-datasets llm machine-learning nlp streamlit text-processing
Last synced: 09 Aug 2024
https://github.com/kootenpv/spacy_api
Server/Client around Spacy to load spacy only once
api machine-learning nlp spacy
Last synced: 07 Aug 2024
https://github.com/khakhulin/compressed-transformer
Compression of NMT transformer model with tensor methods
compression deep-learning mnist nlp nmt pytorch tensor-train transformer translation tucker
Last synced: 04 Aug 2024
https://github.com/obss/trapper
State-of-the-art NLP through transformer models in a modular design and consistent APIs.
allennlp deep-learning natural-language-processing nlp python pytorch pytorch-transformers transformer transformers
Last synced: 02 Aug 2024
https://github.com/ChenghaoMou/pytorch-pQRNN
Implementation of pQRNN in PyTorch
nlp pqrnn pytorch text-classification
Last synced: 03 Aug 2024