Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.

https://github.com/clam004/minichatgpt

Annotated tutorial of the Hugging Face TRL repo for reinforcement learning from human feedback, connecting the equations from PPO and GAE to the lines of code in the PyTorch implementation

deep-learning deep-reinforcement-learning fine-tuning language-model large-language-models nlp pytorch reinforcement-learning reinforcement-learning-from-human-feedback transformers

Last synced: 15 Nov 2024

https://github.com/MaartenGr/Reviewer

Tool for extracting and analyzing IMDB reviews

bert disney imdb ner nlp sentiment-analysis

Last synced: 20 Nov 2024

https://github.com/adulau/napkin-text-analysis

Napkin is a simple tool to produce statistical analysis of a text

nlp text-analysis text-mining

Last synced: 18 Nov 2024

https://github.com/paritoshtripathi935/product-matching

The topic is about product matching via Machine Learning. This involves using various machine learning techniques such as natural language processing, image recognition, and collaborative filtering algorithms to match similar products together.

amazon-scraper collaborative-filtering data-science django flipkart-scraper-python langchain machine-learning nlp opencv product-matching python

Last synced: 20 Nov 2024

https://github.com/jianlins/fastcontext

FastContext is an optimized Java implementation of the ConText algorithm (https://www.ncbi.nlm.nih.gov/pubmed/23920642).

context-detection java nlp

Last synced: 11 Nov 2024

https://github.com/huu4ontocord/rio

Text pre-processing for NLP datasets

language natural-language-processing nlp

Last synced: 11 Nov 2024

https://github.com/yohasebe/monadic-chat-cli

Highly configurable CLI app for OpenAI's chat/text completion API

ai chat cli completion conversation monad natural-language nlp openai

Last synced: 08 Nov 2024

https://github.com/elysian01/codify

Codify lets data scientists perform the tedious and time-consuming tasks of the data-science life cycle, such as EDA (exploratory data analysis), data cleaning, data pre-processing, data visualization, modeling, and evaluation, by conveying only the logic of the task in natural language (English); the system then automatically produces the relevant Python code snippets.

ai ai-assistant autocomplete automation data-science data-science-tools final-year-project intent-classification ml named-entity-recognition nlp reactjs research-paper research-project

Last synced: 07 Nov 2024

https://github.com/poteminr/agrocode2021

MultiLabel classification of cow diseases by text and symptoms recognition (NER)

multilabel-classification ner nlp rubert text-classification

Last synced: 07 Nov 2024

https://github.com/miserman/lingmatch

An all-in-one R package for the assessment of linguistic similarity

nlp r rcpp text-analysis

Last synced: 19 Nov 2024

https://github.com/analyticsinmotion/werpy

🐍📦 Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.

asr asr-evaluation automatic-speech-recognition levenshtein-distance metrics nlp pandas python python-package speech-to-text stt stt-benchmark wer word-error-rate

Last synced: 06 Nov 2024
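
For readers unfamiliar with the metric named in the entry above, here is a minimal sketch of Word Error Rate as it is commonly defined: the word-level Levenshtein distance between a reference and a hypothesis transcript, divided by the number of reference words. The function name `word_error_rate` is hypothetical and is not the werpy API; whitespace tokenization without any text normalization is also an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; not part of werpy.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.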

https://github.com/eklem/words-n-numbers

Tokenizing strings of text. Regex extracting arrays of words and optionally numbers, emojis, tags, usernames and email addresses from strings. For Node.js and the browser. When you need more than just [a-z] regular expressions.

nlp offline-first regex tokenization tokenizer

Last synced: 08 Nov 2024

https://github.com/bloomberg/fast-noise-aware-topic-clustering

Research code and scripts used in the Silburt et al. (2021) EMNLP 2021 paper 'FANATIC: FAst Noise-Aware TopIc Clustering'

clustering emnlp emnlp2021 machine-learning nlp python topic-noise

Last synced: 09 Nov 2024

https://github.com/deepset-ai/haystack-sagemaker

🚀 This repo is a showcase of how you can use models deployed on AWS SageMaker in your Haystack Retrieval Augmented Generative AI pipelines

aws haystack llm nlp opensearch sagemaker

Last synced: 06 Nov 2024

https://github.com/dcavar/flair-json-nlp

Flair wrapper for JSON-NLP.

flair json natural-language-processing nlp

Last synced: 07 Nov 2024

https://github.com/contefranz/optop

Optimal topic identification from a pool of Latent Dirichlet Allocation models

latent-dirichlet-allocation lda model-selection natural-language-processing nlp text-mining topic-modeling

Last synced: 05 Nov 2024

https://github.com/akb89/witokit

A Python toolkit to generate a tokenized dump of Wikipedia for NLP

dump multilingual nlp tokenize wikipedia wikipedia-dump

Last synced: 09 Nov 2024

https://github.com/messense/bosonnlp-rs

BosonNLP SDK for Rust

bosonnlp nlp sdk

Last synced: 08 Nov 2024

https://github.com/maartengr/vlac

Vectors of Locally Aggregated Concepts

fasttext kmeans machine-learning nlp word-embeddings word2vec

Last synced: 27 Oct 2024

https://github.com/prrao87/fine-grained-sentiment-app

A Flask LIME explainer app for fine-grained sentiment classification.

flask interpretability lime lime-explainer nlp visualization web-app

Last synced: 09 Nov 2024

https://github.com/bpred754/augeo

Web application written with the MEAN stack that uses Natural Language Processing to classify a user's internet activity into different skills. In a nutshell, Augeo is the gamification of life.

angularjs fitbit-api gamification github-api mean-stack nlp nodejs open-source twitter-api website

Last synced: 08 Nov 2024

https://github.com/princeton-nlp/metric-wsd

NAACL'2021: Non-Parametric Few-Shot Learning for Word Sense Disambiguation

few-shot-learning nlp word-sense-disambiguation

Last synced: 11 Nov 2024

https://github.com/blackhc/player_of_jeopardy

ChatGPT can solve Jeopardy! clues really well!

chatgpt langchain machine-learning nlp openai-api

Last synced: 09 Nov 2024

https://github.com/jfilter/german-preprocessing

🇩🇪 Preprocess German texts to do some serious natural-language processing.

german nlp package python

Last synced: 11 Nov 2024

https://github.com/amazon-science/wqa-multi-sentence-inference

This repository contains code used for our Multi Sentence Inference NAACL'22 paper.

answer-sentence-selection nlp pretraining question-answering transformer

Last synced: 12 Nov 2024

https://github.com/smappnyu/smaberta

Wrapper for stable version of RoBERTa language models

huggingface nlp roberta transfer-learning

Last synced: 14 Nov 2024

https://github.com/dark-art108/megatronbot

MegatronBot is a Full-Fledged ChatBot with some awesome Powers 🔥🔥

bert-model chat-application chatbot elmo nlp nlu

Last synced: 12 Nov 2024

https://github.com/Pushkar1853/Cover-Generator

Application of OpenAI tools such as Whisper, DALL-E, and ChatGPT to generate album covers from audio

chat-gpt computer-vision dall-e music nlp openai stable-diffusion whisper-ai

Last synced: 27 Oct 2024

https://github.com/hctilg/finglish

A Finglish to Persian converter.

finglish finglish-dataset nlp

Last synced: 07 Nov 2024

https://github.com/ariya/tebakmasa

Infer the date and time from the general description in Bahasa Indonesia

bahasa bahasa-indonesia date indonesian nlp time timestamp

Last synced: 22 Oct 2024

https://github.com/princeton-nlp/blindfold-textgame

[NAACL 2021] Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents

naacl naacl2021 natural-language-processing nlp reinforcement-learning rl text-based-game text-game

Last synced: 11 Nov 2024

https://github.com/neelguha/legal-segmenter

A simple library for segmenting legal texts

law legal legaltech nlp segmentation

Last synced: 15 Oct 2024

https://github.com/bububa/timenlp

Golang version of Time-NLP: converts Chinese time expressions

nlp parser time

Last synced: 08 Nov 2024

https://github.com/dobbersc/fundus-evaluation

[ACL 2024] Evaluation of the Fundus News Scraper

acl2024 crawling evaluation fundus nlp python scraping

Last synced: 09 Nov 2024

https://github.com/plandes/deepnlp

Deep learning utility library for natural language processing

deep-learning deep-neural-networks framework natural-language-processing nlp

Last synced: 08 Nov 2024

https://github.com/bloomberg/emnlp21_fewrel

Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

emnlp emnlp2021 few-shot-learning nlp relation-extraction

Last synced: 09 Nov 2024

https://github.com/shanepeckham/customspeech-processing-pipeline

Pre Processing Pipeline for Azure Custom Speech service

azure cognitive-services nlp speech-to-text

Last synced: 21 Oct 2024

https://github.com/sebpuetz/lumberjack

Read and modify constituency trees in Rust.

constituency constituency-tree negra nlp ptb rust rust-crate tree

Last synced: 08 Nov 2024

https://github.com/sematext/activate

Examples for the Activate conference

activate entity nlp opennlp recognition sematext solr spacy tagger

Last synced: 11 Nov 2024

https://github.com/simonepri/varname-seq2seq

📄Source code variable naming using a seq2seq architecture

nlp pytorch rnn seq2seq

Last synced: 22 Oct 2024

https://github.com/griptape-ai/griptape-flow

Python framework for building LLM workflows and pipelines with memory, rules, and chain of thought reasoning.

ai cohere gpt huggingface llm nlp openai python

Last synced: 27 Sep 2024

https://github.com/Cartus/AMR-Parser

Better Transition-Based AMR Parsing with a Refined Search Space (authors' DyNet implementation for the EMNLP18 paper)

aligner amr-parser nlp semantic-parsing transition-based-parser

Last synced: 11 Nov 2024

https://github.com/moj-analytical-services/pq-tool

Tool to analyse past parliamentary questions with visualisation in RShiny

clustering latent-semantic-analysis nlp shiny

Last synced: 13 Aug 2024

https://github.com/zongxr/tensorflow-in-practice

TensorFlow practice examples: excellent notes for learning TensorFlow, covering CV usage, NLP usage, common tricks, trained models, and more; continuously being improved.

cv jupyter-notebook nlp python tensorflow time-series

Last synced: 15 Nov 2024

https://github.com/cb1cyf/cbionamer

Nested Named Entity Recognition for Chinese Biomedical Text

named-entity-recognition natural-language-processing ner nlp pytorch

Last synced: 14 Oct 2024

https://github.com/Captmoonshot/data-bore

A Django REST API with an embedded ML model for sentiment analysis of movie reviews.

dash dashboard django django-rest-framework machine-learning nlp pandas plotly powerbi python

Last synced: 08 Aug 2024

https://github.com/kigawas/coheoka

Python coherence evaluation tool using Stanford's CoreNLP.

coherence-analysis nlp python

Last synced: 11 Oct 2024

https://github.com/trinker/sentimentpy

A Python port of the #rstats sentimentr package

emotion nlp polarity sentiment text-mining

Last synced: 27 Oct 2024

https://github.com/zyxue/stanford-cs224n

Exercise answers to the problem sets from CS224n: Natural Language Processing with Deep Learning Winter quarter (January - March, 2017)

cs224n deep-learning deep-neural-networks nlp stanford-machine-learning stanford-nlp tensorflow

Last synced: 23 Oct 2024

https://github.com/felixmohr/nlp-with-python

Using Conditional Random Fields for segmenting Latin words written in scriptio continua

conditional-random-fields latin machine-learning nlp

Last synced: 23 Oct 2024

https://github.com/wj-mcat/allennlp-tutorials

A detailed tutorial on AllenNLP, based on my own view.

allennlp nlp pytorch transformer

Last synced: 23 Oct 2024

https://github.com/pharo-contributions/singularizepluralize

Transforming singular nouns to their plural form and vice versa.

language natural-language-processing nlp pharo pharo-smalltalk smalltalk

Last synced: 09 Oct 2024

https://github.com/saidziani/assistantchef

"The aim of this project is to let you appreciate, in a practical way, the power of Natural Language Processing techniques. The problem you are given will let you develop an application that helps find cooking recipes according to the needs of the ... Chef and the available products!" Pr. Ahmed Guessoum

arabic-nlp french-nlp information-extraction information-retrieval nlp nltk pyqt5

Last synced: 28 Oct 2024

https://github.com/shibing624/text2vec-service

Service for BERT model to vector. An efficient text-to-vector service that supports multi-GPU, multi-worker, and multi-client calls, ready to use out of the box.

gpu nlp pytorch service

Last synced: 09 Nov 2024

https://github.com/code2k13/nlphose

Enables creation of complex NLP pipelines in seconds, for processing static files or streaming text, using a set of simple command line tools. Perform multiple operations on text like NER, Sentiment Analysis, Chunking, Language Identification, Q&A, 0-shot Classification and more by executing a single command in the terminal. Can be used as a low code or no code Natural Language Processing solution. Also works with Kubernetes and PySpark!

ai artifical-intelligense data-science language-detection low-code machine-learning named-entity-recognition natural-language-processing nlp no-code sentiment-analysis text-mining twitter-sentiment-analysis

Last synced: 13 Nov 2024

https://github.com/graykode/mlm-pipeline

mlm-pipeline is a cloud architecture that preprocesses the masked language model (mlm)

ansible aws bert cloud mlm natural-language-processing nlp terraform

Last synced: 23 Oct 2024

https://github.com/da03/residual-ebm

Code for Residual Energy-Based Models for Text Generation in PyTorch.

machine-learning nlp

Last synced: 28 Oct 2024

https://github.com/sergey-tihon/maltparser.net

MaltParser is a system for data-driven dependency parsing, which can be used to induce a parsing model from treebank data and to parse new data using an induced model.

dotnet fsharp maltparser nlp

Last synced: 17 Oct 2024

https://github.com/chrisrzhou/wordcloud-generator

Create and share wordcloud visualizations!

create-react-app d3 io nlp visualization wordcloud wordcloud-generator

Last synced: 17 Oct 2024

https://github.com/proycon/foliatools

A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.

clariah clarin computational-linguistics conllu converters folia nlp

Last synced: 01 Nov 2024

https://github.com/euskadi31/go-tokenizer

A Text Tokenizer library for Golang

go golang golang-library machine-learning nlp text tokenizer

Last synced: 28 Oct 2024

https://github.com/d-one/nlpeasy

Easy Peasy Language Squeezy

datascience elasticsearch kibana nlp spacy

Last synced: 14 Oct 2024

https://github.com/sovlookup/nlp-api

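Python NLP SDK for 数地工厂: keyword extraction, summary extraction, new word discovery, event triple extraction, data triple extraction, logic triple extraction, entity recognition, phrase chunk recognition, similarity computation, concept abstraction, semantic association, sentiment polarity detection, sentiment pair extraction, entity-attribute sentiment extraction, subjectivity computation, web page body parsing, web page table parsing, entity linking, question parsing, concept description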

api api-client nlp

Last synced: 15 Oct 2024

https://github.com/dsdanielpark/all-about-llm

dsdanielpark's curation and categorization of resources on large language models, along with documentation.

large-language-model llm nlp

Last synced: 14 Nov 2024

https://github.com/weaviate/weaviate-javascript-client

No longer maintained, please see the TypeScript client

generative-search javascript nlp semantic-search vector-database vector-search

Last synced: 14 Nov 2024

https://github.com/thumnlab/autoattend

Code Implementation for AutoAttend: Automated Attention Representation Search

nas neural-architecture-search nlp

Last synced: 15 Nov 2024

https://github.com/simoninithomas/time-series-prediction-and-text-generation

Built RNNs that can generate sequences based on input data, with a focus on two applications. The first uses real market data to predict future Apple stock prices with an RNN model. The second is trained on Sir Arthur Conan Doyle's classic novel Sherlock Holmes and generates wacky sentences based on it that may, or may not, become the next great Sherlock Holmes novel.

keras lstm nlp rnn-model text-generation

Last synced: 17 Nov 2024

https://github.com/aflah02/easy-data-augmentation-implementation

My Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks using Tensorflow

data deep-learning lstm nlp tensorflow2

Last synced: 20 Nov 2024

https://github.com/mhezarei/ai-bot

2020 AI bot challenge (ai-bot.ir) repository. This program answers a given question with a specific format and subject.

bert nlp persian-nlp

Last synced: 20 Nov 2024

https://github.com/zengfr/svm-neuro-matching

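SVM Neuro Matching: C# machine learning, LibSVM support vector machine, neural network, matching, Chinese text word segmentation, classification, and clustering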

accrod aforge csharp hotel java learning libsvm matching neuro nlp room svm svm-neuro-matching tfidf zengfr

Last synced: 14 Nov 2024

https://github.com/pythainlp/han-solo

🪿 Han-solo: Thai syllable segmenter

nlp syllable-segmentation thai-nlp thai-nlp-library

Last synced: 15 Nov 2024

https://github.com/catqaq/nlp-notes

Word2vec source code with detailed bilingual annotations (well-annotated word2vec)

dl-pytorch lstm nlp speech-tagger word2vec

Last synced: 19 Nov 2024

https://github.com/worldbank/wb-nlp-tools

Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.

gensim langdetect nlp nltk pdf2text python spacy text-mining

Last synced: 10 Nov 2024

https://github.com/UCSB-NLP-Chang/ULD

Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference'

large-language-models nlp pytorch transformers unlearning

Last synced: 29 Oct 2024

https://github.com/jfilter/german-abbreviations

📖 A list of 4262 German abbreviations from Wiktionary

abbreviations german nlp

Last synced: 11 Nov 2024

https://github.com/soskek/dynamic_neural_text_model

A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse, Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui, IJCNLP 2017

chainer deep-learning language-model neural-network nlp

Last synced: 16 Nov 2024

https://github.com/ianarawjo/chatgpt-extractive-shortener

Shortens a paragraph of text with ChatGPT, using successive rounds of word-level extractive summarization.

chatgpt editing-support extractive-summarization nlp shortener

Last synced: 15 Oct 2024

https://github.com/buckthorndev/sentiment_dart

Sentiment Dart is a Dart module that uses the AFINN-165 wordlist and Emoji Sentiment Ranking to perform sentiment analysis on arbitrary blocks of input text. Sentiment Dart is heavily inspired by the JavaScript package sentiment.

dart dart-library dart2 flutter natural-language-processing nlp package sentiment-analysis

Last synced: 12 Nov 2024

https://github.com/mrseanryan/gpt-workflow

Generate workflows (for flowcharts or low code) via LLM. Can also describe a workflow given in DOT.

flow-generator flowchart-generator flowchart-nlp gpt nlp worflow-nlp workflow-generation workflow-generator workflows

Last synced: 07 Nov 2024

https://github.com/jonaylor89/wineinamillion

Wine Recommender created with sentence-BERT and NearestNeighbor on AWS SageMaker

bert jupyter-notebook nlp python pytorch sagemaker sentence-bert sentence-embeddings sklearn

Last synced: 12 Nov 2024