Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies the interactions between computers and human (natural) languages. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/UCSB-NLP-Chang/ULD

Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference'

large-language-models nlp pytorch transformers unlearning

Last synced: 29 Oct 2024

https://github.com/catqaq/nlp-notes

Word2vec source code with detailed bilingual annotations (well-annotated word2vec)

dl-pytorch lstm nlp speech-tagger word2vec

Last synced: 19 Nov 2024

https://github.com/mrseanryan/gpt-workflow

Generate workflows (for flowcharts or low code) via LLM. Also describes a workflow given in DOT.

flow-generator flowchart-generator flowchart-nlp gpt nlp worflow-nlp workflow-generation workflow-generator workflows

Last synced: 07 Nov 2024

https://github.com/worldbank/wb-nlp-tools

Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.

gensim langdetect nlp nltk pdf2text python spacy text-mining

Last synced: 10 Nov 2024

https://github.com/diyclassics/la_core_web_lg

spaCy-compatible sm/md/lg/trf core models for Latin, i.e., a pipeline with POS tagger, morphologizer, lemmatizer, dependency parser, and NER

latin nlp spacy-models

Last synced: 08 Nov 2024

https://github.com/asirihewage/facebook-messenger-auto-reply-using-php-nlp

This conversational bot learns and answers, and it will also search Wikipedia for you!

bot chat chatbot conversational facebook messenger mysql nlp php

Last synced: 15 Oct 2024

https://github.com/deeppavlov/autointent

An open source tool for automatic configuration of a text classification pipeline for intent prediction

auto-ml dialog-systems intent-detection nlp transformers

Last synced: 09 Nov 2024

https://github.com/mtimkovich/rip_quick_add

Quickly create Google Calendar events from natural language.

google-calendar nlp

Last synced: 28 Oct 2024

https://github.com/contextlab/data-wrangler

Wrangle messy numerical, image, and text data into consistent, well-organized formats

data data-analysis data-science data-wrangling hugging-face image-data machine-learning nlp numpy pandas python scikit-learn

Last synced: 12 Oct 2024

https://github.com/minhd-vu/toxicity-filter

Natural language processing API to detect toxic chat.

flask nlp python

Last synced: 05 Nov 2024

https://github.com/praekelt/feersum-nlu-api-wrappers

Swagger spec and generated Python language wrappers for the FeersumNLU HTTP Rest API for building intelligent chatbots.

chatbot-framework nlp nlp-machine-learning

Last synced: 13 Aug 2024

https://github.com/azure99/blossomdata

A simple way to synthesize LLM training data. (under construction⚠)

data-engineering data-science fine-tuning gpt llama llm nlp supervised-learning

Last synced: 12 Oct 2024

https://github.com/AlbertSuarez/casescan

🔍 Similarity-based search of clinical cases, specialized in COVID-19

nlp python react similarity-search

Last synced: 26 Oct 2024

https://github.com/omarsar/nlp_research

🔥 Summary of interesting NLP Papers and Research (Fast and easy reads!) 🔥

artificial-intelligence data-science deep-learning machine-learning nlp

Last synced: 13 Oct 2024

https://github.com/doubledaibo/compcaption_neurips2018

A Neural Compositional Paradigm for Image Captioning

compositionality image-captioning neurips-2018 nlp

Last synced: 28 Oct 2024

https://github.com/derintelligence/az-summarization

Abstractive summarization for Azerbaijani language

azerbaijan dataset language linguistics nlp summarization

Last synced: 13 Nov 2024

https://github.com/borisshapa/bert-crf

Solutions to the NER and RE problems in the domain of business documents, using a BERT+CRF model.

bert crf ner nlp relation-extraction

Last synced: 14 Nov 2024

https://github.com/thinkwee/abstract_summarization_rnn

RNN Seq2Seq-based abstractive summarization (ABS) on TensorFlow

lstm nlp python rnn seq2seq summarization tensorflow-seq2seq tensroflow

Last synced: 27 Oct 2024

https://github.com/ivanbongiorni/maximal

A TensorFlow-compatible Python library that provides models and layers to implement custom Transformer neural networks. Built on TensorFlow 2.

attention-is-all-you-need attention-mechanism deep-learning keras machine-learning natural-language-generation natural-language-processing natural-language-understanding neural-network nlp tensorflow tensorflow2 transformer transformers

Last synced: 23 Oct 2024

https://github.com/hsankesara/the-tweets-of-wisdom

A dataset containing 30k+ so-called "self-help" tweets from 100+ authors.

nlp text-data text-datasets tweepy tweets

Last synced: 27 Oct 2024

https://github.com/franfj/summarizer

Text summarization Python library (in progress)

machine-learning nlp nltk python python2 text-mining text-summarization

Last synced: 12 Oct 2024

https://github.com/paradite/tf-idf-keyword

🔎 Get keywords from a piece of text using tf-idf

keyword nlp tf-idf

Last synced: 09 Nov 2024

https://github.com/andi611/conditional-specgan-tensorflow

Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network

audio-synthesis conditional-gan digital-signal-processing gan librosa machine-learning nlp nlp-machine-learning tensorflow tts

Last synced: 14 Oct 2024

https://github.com/softmarshmallow/inked-engine

🤖 natural language processing out of the box

django nlp nltk python

Last synced: 11 Oct 2024

https://github.com/mohammadrezaamani/hamspam

Persian ham/spam detector built with Python, hazm, NLTK, and NLP.

hazm machine-learning nlp nltk numpy pandas python sickit-learn

Last synced: 13 Nov 2024

https://github.com/thammegowda/006-many-to-eng

Machine translation from many languages to English

machine-translation machine-translation-models nlp

Last synced: 15 Oct 2024

https://github.com/jacksoncakes/chinese_keybert

Minimal Chinese keyword extraction with BERT

bert chinese keyword-extraction nlp python pytorch

Last synced: 01 Oct 2024

https://github.com/omarsar/appworks_meetup_2018

Contains all the material used for the "Applied Deep Learning for NLP Using PyTorch" meetup at AppWorks

cnn deep-learning neural-network nlp pytorch rnn

Last synced: 13 Oct 2024

https://github.com/bond005/runne_contrastive_ner

This project covers my participation in the RuNNE competition: https://github.com/dialogue-evaluation/RuNNE

bert-ner contrastive-learning deep-learning ner nlp siamese-neural-network tensorflow

Last synced: 11 Oct 2024

https://github.com/barqawiz/aind2-nlp-translation

Machine translation pipeline. Ready-made models to translate between languages.

keras nlp tensorflow translation

Last synced: 10 Oct 2024

https://github.com/wjbmattingly/bagpipes-spacy

Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.

nlp spacy

Last synced: 10 Oct 2024

https://github.com/ljos/egennavn

Named-entity chunker for Norwegian

named-entities nlp norwegian

Last synced: 26 Oct 2024

https://github.com/wjbmattingly/tap-2024-spacy-llms

This is the repository for my 2024 Tap Institute Course on spaCy with LLMs

nlp spacy

Last synced: 14 Oct 2024

https://github.com/strayMat/tag_serve

Deployable Neural Tagger implementation for Named Entity Recognition

bilstm-crf deep-learning docker-image flask flask-application machine-learning ner neural-network nlp pytorch tagger

Last synced: 03 Sep 2024

https://github.com/shibing624/fake-news-detector

Fake News Detection Competition

nlp text-classification

Last synced: 24 Oct 2024

https://github.com/dan-oak/pos

Simple English part-of-speech tagger: 93% accuracy

dynamic-programming hmm hmm-model hmm-viterbi-algorithm nlp nltk numpy pos python scipy

Last synced: 27 Oct 2024

https://github.com/lm-kit/lm-kit-net-samples

.NET samples for LM-Kit.NET

ai dotnet genai gpt llama llm lm-kit nlp

Last synced: 10 Oct 2024

https://github.com/accraze/python-ia-markov

Train Markov models on Internet Archive text files.

markov-chain markov-model markov-text nlp text-generation text-generator

Last synced: 28 Oct 2024

https://github.com/bobazooba/wgpt

This repository features an example of how to use the xllm library. Included is a solution for a common type of assessment given to LLM engineers, who typically earn between $120,000 and $140,000 annually

alpaca cerebras chatgpt deep-learning deep-neural-networks deeplearning falcon gpt language-model large-language-models llama2 llama2-7b llm mistral mistralai natural-language-processing nlp openai vicuna zephyr

Last synced: 27 Oct 2024

https://github.com/patrick-miller/textbook-concept-map

Build a concept map from textbooks using DBpedia Spotlight

concept-map dbpedia-spotlight educational-technology nlp

Last synced: 04 Nov 2024

https://github.com/mukhopadhyay/youtubers-saying-things

Dataset containing video subtitles from popular YouTubers' channels

classification dataset kaggle kaggle-dataset nlp subtitles youtube

Last synced: 28 Oct 2024

https://github.com/davebulaval/spacy-language-detection

Fully customizable language detection for spaCy pipeline

language-detection nlp spacy spacy-extension

Last synced: 30 Sep 2024

https://github.com/lll-lll-lll-lll/sent-pattern

The sent-pattern package categorizes English sentences into one of five basic sentence patterns.

japanese nlp portfolio python spacy

Last synced: 14 Oct 2024

https://github.com/andrewrosss/rake-spacy

Python implementation of the Rapid Automatic Keyword Extraction algorithm using spaCy

algorithm keyword-extraction ml nlp python rake rake-nltk spacy

Last synced: 14 Oct 2024

https://github.com/BrianWeinstein/googlenlp

An Interface to Google's Cloud Natural Language API

api cran google-cloud-platform nlp r

Last synced: 13 Aug 2024

https://github.com/applenob/nlp_projects

My NLP projects notebook

gensim ipynb nlp notebook rnn rnn-namer

Last synced: 11 Oct 2024

https://github.com/nikhiljsk/preprocess_nlp

A fast framework for pre-processing (text cleaning, vocabulary reduction, feature extraction, and vectorization). Implemented with parallel processing using a custom number of processes.

cleaning-data feature-extraction glove natural-language-processing nlp parallel-processing preprocess python3 reduction spacy stages tfidf vectorization word2vec

Last synced: 14 Oct 2024

https://github.com/clnnn/chat-summarizer

💬 Real-time chat application prototype that can summarise the entire chat log

angular flask huggingface-transformer java ngrx nlp python websocket

Last synced: 30 Oct 2024

https://github.com/iamncj/duma-pytorch-lightning

Unofficial Implementation of the DUMA Paper

bert deep-learning duma nlp pytorch transformer

Last synced: 07 Nov 2024

https://github.com/zamgi/lingvo--textsegmenter

Text segmentation into separate words using a simple unigram model and the Viterbi algorithm

linguistics lingvo natural-language-processing nlp text-segmentation viterbi-algorithm

Last synced: 05 Nov 2024

https://github.com/google-research-datasets/nlp-fairness-for-india

Contains data resources to replicate results from the paper “Re-contextualizing Fairness in NLP: The Case of India”.

fairness nlp

Last synced: 08 Nov 2024

https://github.com/ikegami-yukino/pytypo

English spelling correction

english-word nlp spelling-correction typo

Last synced: 12 Oct 2024

https://github.com/achuttarsing/inflecteur

Python inflector 🌀 for the French language: control gender, tense, and number

data-augmentation french-nlp inflection inflector nlp python

Last synced: 14 Nov 2024

https://github.com/alan-turing-institute/netts

Toolbox for creating networks capturing semantic content of speech transcripts.

graph-theory hut23 networks nlp python semantic-content transcripts

Last synced: 13 Nov 2024

https://github.com/tchin25/japanese-dependency-visualizer

A dependency visualizer for Japanese to help beginners deconstruct complex sentences. Also my first Vue 3 project c:

bulma cabocha japanese-language nlp vue3

Last synced: 15 Nov 2024

https://github.com/justingosses/geovec-playground

Playground for exploring the geoVec pre-trained GloVe model of geoscience embeddings

geology geoscience nlp notebook playground

Last synced: 17 Nov 2024

https://github.com/lorenzominto/ex-gpt-summarizer

An Extractive-Abstractive Summarization Framework with a Sentence Embeddings Twist. Based on a GPT-2 transformer fine-tuned on the CNN/DailyMail dataset

gpt nlp nlp-machine-learning sentence-embeddings summarization-framework

Last synced: 18 Nov 2024

https://github.com/jaykef/min-patchnizer

Minimal, clean code for video/image "patchnization" - a process commonly used in tokenizing visual data for use in a Transformer encoder.

computer-vision nlp patchnization tokenization transformer

Last synced: 29 Oct 2024

https://github.com/wjbmattingly/keyword-spacy

Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity.

keyword-extraction nlp spacy

Last synced: 14 Oct 2024

https://github.com/egorsmkv/ukrainian-accentor

Add accents to words in the Ukrainian language

accentor nlp ukrainian word-stress

Last synced: 18 Oct 2024

https://github.com/chaoscodes/untl

EMNLP'2022: Unsupervised Non-transferable Text Classification

emnlp2022 nlp text-classification transfer-learning

Last synced: 14 Oct 2024

https://github.com/educationaltestingservice/ies-writing-achievement-study-data

Data from an IES research study that explores the relationship between writing achievement and success at 4-year postsecondary institutions.

data features ies nlp writing

Last synced: 06 Nov 2024

https://github.com/blurred-machine/amazon-fine-food-review-analysis-using-nlp-techniques

This repository contains an analysis of customer reviews of Amazon fine food purchases. The data was collected by the Stanford Network Analysis Project (SNAP) and consists of reviews of fine foods from Amazon. It spans a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain-text review. The dataset also includes reviews from all other Amazon categories.

amazon classification deep-learning kaggle-competition modeling nlp python sentiment-analysis text-classification

Last synced: 11 Nov 2024

https://github.com/keon/language-model

RNN Language Modeling with PyTorch

language-model nlp pytorch rnn

Last synced: 06 Nov 2024

https://github.com/phenax/f-inator-3000

Convert a sentence into an f-ing great sentence.

fuck json-api nlp nodejs

Last synced: 16 Nov 2024

https://github.com/qibinc/lyrics

Deep learning for lyrics.

artificial-intelligence deep-learning literacy music nlp

Last synced: 14 Nov 2024

https://github.com/recap-utr/nlp-service

NLP microservice for computing embeddings

nlp

Last synced: 11 Nov 2024

https://github.com/pyladiesams/azure-functions-beginner-mar2020

An introduction to Azure Functions in Python.

azure-functions nlp python refactoring serverless workshop

Last synced: 09 Nov 2024

https://github.com/uds-lsv/Multi-tasking_Learning_With_Unreliable_Labels

Extends the NLNN algorithm proposed by Bekker & Goldberger to a multi-task learning setup in order to handle noisy labels. Artificial annotators are often used to extend low-resource data. In this setup, we aim to generate clean labeled training data from artificial annotators.

machine-learning nlp noise-reduction

Last synced: 02 Nov 2024

https://github.com/dengbocong/competition

Solutions for NLP-related competitions

deep-learning kaggle meachine-learning nlp tian-chi

Last synced: 08 Nov 2024

https://github.com/aajanki/fi-sentence-embeddings-eval

Comparison of sentence embedding models for Finnish

finnish nlp sentence-embeddings

Last synced: 10 Nov 2024

https://github.com/labrijisaad/twitter-sentiment-analysis-with-python

This project analyzes the sentiment of tweets from the Sentiment140 dataset by developing a machine learning sentiment analysis model built on several classifiers. The performance of these classifiers is then evaluated using accuracy and F1 scores.

accuracy-score bernoulli-naive-bayes confusion-matrix f1-score lemmatization logistic-regression machine-learning nlp roc-auc-curve sentiment-analysis sentiment140-dataset stemming support-vector-machine tokenization twitter-sentiment-analysis

Last synced: 06 Nov 2024

https://github.com/jinensetpal/boilerbot

Official Open-Source Implementation of BoilerBot: A Reliable Task-Oriented Chatbot Enhanced with Large Language Models.

conversational-agent llms nlp

Last synced: 29 Oct 2024

https://github.com/denismosolov/alice-entities-library

A set of named entities for the Yandex.Dialogs platform. Use it when creating Alice skills.

alice-sdk alice-skills nlp yandex-dialogs

Last synced: 16 Nov 2024

https://github.com/jfilter/german-lemmatizer

✂️ Python package (using a Docker image under the hood) to lemmatize German texts.

german lemmatization lemmatizer natural-language-processing nlp python

Last synced: 11 Nov 2024

https://github.com/gesiscss/ptm

Introduction to Natural Language Processing with a special emphasis on the analysis of Job Advertisements

binder data-science information-retrieval labour-market nlp r text-mining topic-modeling

Last synced: 09 Nov 2024

https://github.com/cyclecycle/visualise-spacy-tree

Create dependency tree plots from spaCy Doc objects

nlp python spacy

Last synced: 14 Oct 2024