Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/josefalbers/roy

Roy: A lightweight, model-agnostic framework for crafting advanced multi-agent systems using large language models.

agent agentgpt autogen autogpt baby-agi chat chatbot code-generation code-generator gpt langchain llm llm-agent multi-agent nlp prompt-engineering quantization retrieval-augmented-generation vector-index wizardcoder

Last synced: 08 Nov 2024

https://github.com/wjbmattingly/spacyex

SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax.

nlp spacy

Last synced: 31 Oct 2024

https://github.com/doragd/text-classification-pytorch

Implementation of papers for text classification task on SST-1/SST-2

bilstm-attention nlp sentiment-classification text-classification textcnn

Last synced: 29 Oct 2024

https://github.com/messense/fasttext-rs

fastText Rust binding

fasttext nlp

Last synced: 14 Nov 2024

https://github.com/searchableai/kitanaqa

KitanaQA: Adversarial training and data augmentation for neural question-answering models

adversarial-attacks adversarial-training bert data-augmentation ml-automation natural-language-processing nlp pytorch question-answering transformer

Last synced: 13 Oct 2024

https://github.com/saturncloud/dask-pytorch-ddp

dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.

computer-vision dask deep-learning distributed-computing machine-learning nlp pytorch

Last synced: 15 Nov 2024

https://github.com/OpenSUM/CPSUM

Code and Data Repo for COLING'22 paper "Noise-injected Consistency Training and Entropy-constrained Pseudo Labeling for Semi-supervised Extractive Summarization"

extractive-summarization nlp semi-supervised-learning

Last synced: 16 Nov 2024

https://github.com/Legilibre/legi.py

Tools for manipulating the LEGI archives (French laws)

france laws legi legislation natural-language-processing nlp opendata python

Last synced: 03 Sep 2024

https://github.com/argilla-io/adept-augmentations

A Python library aimed at dissecting and augmenting NER training data.

dataset datasets few-shot-learning machine-learning natural-language-processing nlp spacy

Last synced: 18 Oct 2024

https://github.com/nicolay-r/AREkit

Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML

bert datasets frames language-models neural-networks nlp pandas pandas-dataframe prompt prompting relation-extraction sentiment-analysis tensorflow

Last synced: 01 Nov 2024

https://github.com/thunlp/paragraph2vec

Paragraph Vector Implementation

nlp

Last synced: 10 Nov 2024

https://github.com/knadh/indic.page

A directory of Indic (Indian) language computing resources.

datasets indian-language indic-languages language linguistics nlp

Last synced: 28 Oct 2024

https://github.com/NISH1001/tag-generator

A simple tool to generate tags for the given text (document) using TF-IDF.

nlp tagging tf-idf tfidf

Last synced: 05 Nov 2024
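
As a rough illustration of the TF-IDF tagging idea in the entry above, here is a minimal sketch. It is not the tag-generator project's code: the function names, the regex tokenizer, and the smoothed IDF formula are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_tags(document, corpus, top_k=3):
    """Return up to top_k terms of `document`, ranked by TF-IDF against `corpus`."""
    doc_tokens = tokenize(document)
    term_counts = Counter(doc_tokens)
    corpus_vocab = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus_vocab)

    scores = {}
    for term, count in term_counts.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for vocab in corpus_vocab if term in vocab)
        # Smoothed IDF (an assumption; other variants are equally valid).
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = (count / len(doc_tokens)) * idf

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the bird sang in the tree",
]
tags = tfidf_tags("the cat chased the cat", corpus, top_k=2)
```

Terms that are frequent in the document but rare in the corpus rank highest, so a common word such as "the" falls below "cat" and "chased" here.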

https://github.com/cluebenchmark/mobileqa

Offline, on-device reading comprehension app: QA for mobile, Android & iPhone

albert android bert chinese iphone machine-reading-comprehension nlp qa tensorflow tflite

Last synced: 09 Nov 2024

https://github.com/xyntopia/pydoxtools

Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.

chatgpt document-analysis document-extraction extraction information-retrieval llm nlp pdf python

Last synced: 17 Nov 2024

https://github.com/kyubyong/koparadigm

KoParadigm: Korean Inflectional Paradigm Generator

inflection korean linguistics morphology nlp paradigm

Last synced: 10 Nov 2024

https://github.com/doccano/doccano-mini

Annotation meets Large Language Models (ChatGPT, GPT-3 and alike).

annotation langchain nlp openai

Last synced: 01 Nov 2024

https://github.com/anfederico/poesy

Poetry generation via natural language markov models

markov modeling ngrams nlp poetry

Last synced: 14 Oct 2024

https://github.com/princeton-nlp/calm-textgame

[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games

calm gpt n-gram nlp rl text-based-game

Last synced: 11 Nov 2024

https://github.com/openagi/tefla

Tensorflow based deep-learning library.

ai deep-learning gan generative-adversarial-network lstm ml nlp tensorflow-framework

Last synced: 14 Oct 2024

https://github.com/cluebenchmark/lightlm

High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task

bert chinese chinese-language-model languagemodel nlp nlpcc nlpcc2020

Last synced: 09 Nov 2024

https://github.com/lunarwhite/tan-division

Chinese corpus sentiment analysis. Sentiment analysis of Chinese text on Tan Songbo's hotel review corpus.

deep-learning keras lstm nlp python rnn tensorflow

Last synced: 06 Nov 2024

https://github.com/SamEdwardes/spacytextblob

A TextBlob sentiment analysis pipeline component for spaCy.

natural-language-processing nlp python spacy

Last synced: 04 Aug 2024

https://github.com/mlabouardy/dialogflow-angular5

💬 Bot in Angular 5 & DialogFlow

ai angular angular5 api-ai bot chatbot dialogflow nlp

Last synced: 15 Nov 2024

https://github.com/cohere-ai/sandbox-accelerating-chatbot-training

Leveraging Cohere's models to enable zero-shot routing

chatbot large-language-models llm nlp routing

Last synced: 07 Oct 2024

https://github.com/samedwardes/spacytextblob

A TextBlob sentiment analysis pipeline component for spaCy.

natural-language-processing nlp python spacy

Last synced: 14 Oct 2024

https://github.com/chiphuyen/metrotwitter

What Twitter reveals about the differences between cities and the monoculture of the Bay Area

data-analysis data-visualization emojis nlp nlp-datasets python twitter twitter-dataset

Last synced: 08 Nov 2024

https://github.com/chanind/frame-semantic-transformer

Frame Semantic Parser based on T5 and FrameNet

framenet huggingface nlp semantic-parsing t5 transformers

Last synced: 08 Nov 2024

https://github.com/LaVi-Lab/CLEVA

[EMNLP 2023 Demo] CLEVA: Chinese Language Models EVAluation Platform

chinese evaluation nlp

Last synced: 16 Nov 2024

https://github.com/swader/diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

ai artificial-intelligence bot crawl crawling diffbot machine-learning nlp php scrape scraped-data scraper scraping

Last synced: 15 Nov 2024

https://github.com/hironsan/neraug

A text augmentation tool for named entity recognition.

deep-learning machine-learning natural-language-processing nlp

Last synced: 27 Oct 2024

https://github.com/developersdigest/function-chain

The FunctionChain is a tool that simplifies and organizes the process of invoking OpenAI functions in your Node.js applications. With this toolkit, you can easily scaffold out and isolate all the OpenAI function calls you need, making your code more modular, maintainable, and scalable.

alpha-vantage artificial-intelligence automation function-calling functionchain langchain machine-learning natural-language natural-language-processing nlp openai pinecone

Last synced: 08 Nov 2024

https://github.com/paulrinckens/timexy

A spaCy custom component that extracts and normalizes temporal expressions

date-parser datetime natural-language-processing nlp python spacy spacy-extension timeml timex3

Last synced: 14 Oct 2024

https://github.com/clipperhouse/uax29

A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes.

go golang nlp tokenization tokenizer uax29 unicode

Last synced: 14 Nov 2024

https://github.com/rguthrie3/morphologicalpriorsforwordembeddings

Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings

blocks emnlp neural-network nlp theano word-embeddings

Last synced: 08 Nov 2024

https://github.com/shibing624/judger

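Automated essay scoring tool. Supports intelligent scoring of Chinese and English essays, self-training of scoring models, processing model data with WEKA, and custom scoring algorithms. Developed in Java.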

aes automated-essay-scoring essayscoring judger nlp

Last synced: 22 Oct 2024

https://github.com/minasmz/Persian-Summarization

Statistical and Semantical Text Summarizer in Persian Language

doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm

Last synced: 04 Aug 2024

https://github.com/jenojp/extractacy

spaCy pipeline object for extracting values that correspond to a named entity (e.g., birth dates, account numbers, laboratory results)

entity-extraction entity-linking ner nlp pattern-matching spacy spacy-extension spacy-pipeline

Last synced: 14 Oct 2024

https://github.com/vamshiiitbhu14/nlpswift

NSLinguisticTagger provides a uniform interface to a variety of natural language processing functionality with support for many different languages and scripts. One can use this class to segment natural language text into paragraphs, sentences, or words and tag information about those segments, such as part of speech, lexical class, and lemma.

coreml ios nlp nlp-apis swift4

Last synced: 10 Nov 2024

https://github.com/tokenmill/beagle

Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.

clojure java lucene luwak nlp real-time-search stemming stored-query-engine stream-search

Last synced: 10 Nov 2024

https://github.com/tokuhirom/jawiki-kana-kanji-dict

Generate SKK/MeCab dictionary from Wikipedia(Japanese edition)

japanese-language nlp wikipedia

Last synced: 27 Oct 2024

https://github.com/MrinmoiHossain/Udacity-Deep-Learning-Nanodegree

This course covers knowledge that is useful for working on deep learning as an engineer. It includes simple neural networks and training, CNNs, autoencoders and feature extraction, transfer learning, RNNs, LSTMs, NLP, data augmentation, GANs, hyperparameter tuning, and model deployment and serving.

convolutional-networks convolutional-neural-networks deep-learning gans generative-adversarial-network long-short-term-memory-models lstms machine-learning nanodegree neural-network nlp pytorch recurrent-neural-networks rnn sentiment-analysis style-transfer transfer-learning udacity udacity-nanodegree

Last synced: 07 Aug 2024

https://github.com/ferdinandzhong/punctuator

A small seq2seq punctuator tool based on DistilBERT

bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq

Last synced: 16 Nov 2024

https://github.com/gunthercox/mathparse

A Python library for evaluating natural language mathematical equations

mathematics nlp python

Last synced: 27 Oct 2024

https://github.com/FerdinandZhong/punctuator

A small seq2seq punctuator tool based on DistilBERT

bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq

Last synced: 14 Nov 2024

https://github.com/nzw0301/lightlda

fast sampling algorithm based on CGS

lda machine-learning nlp python topic-modeling

Last synced: 22 Oct 2024

https://github.com/nzw0301/lightLDA

fast sampling algorithm based on CGS

lda machine-learning nlp python topic-modeling

Last synced: 04 Nov 2024

https://github.com/vdutts7/ai-rapper

Talking Head of your favorite rapper using Transformers, PyTorch, Tortoise TTS, and OpenCV 🎵

huggingface-transformers nlp opencv pytorch tortoise-tts voice-clone

Last synced: 11 Nov 2024

https://github.com/siddk/deep-nlp

Tensorflow Tutorial files and Implementations of various Deep NLP and CV Models.

deep mnist-nn neural-network nlp tensorflow tensorflow-tutorials

Last synced: 22 Oct 2024

https://github.com/arne-cl/discoursegraphs

Linguistic converter / merging tool for multi-level annotated corpora. Graph-based (using Python and NetworkX).

conversion converter natural-language-processing networkx nlp python

Last synced: 10 Nov 2024

https://github.com/wayfair-incubator/extra-model

Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.

aspect-based-sentiment-analysis aspect-extraction machine-learning-algorithms nlp nlp-keywords-extraction nlp-library python python-library python3

Last synced: 07 Nov 2024

https://github.com/gentaiscool/few-shot-lm

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

few-shot few-shot-learning gpt intent language-model multilingual nlp t5

Last synced: 08 Nov 2024

https://github.com/lonepatient/bert-sentence-similarity-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.

bert nlp pytorch sentence-similarity text-classification

Last synced: 06 Nov 2024

https://github.com/lonePatient/bert-sentence-similarity-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.

bert nlp pytorch sentence-similarity text-classification

Last synced: 02 Nov 2024

https://github.com/kaleidophon/token2index

A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and Tensorflow.

deep-learning deeplearning i2t i2w indexing itos nlp numpy python pytorch rnn rnns seq2seq stoi t2i tensorflow token transformer transformers w2i

Last synced: 09 Nov 2024

https://github.com/blacksamorez/ebanko

NLP based telegram bot

docker-compose grafana nlp telegram-bot

Last synced: 05 Nov 2024

https://github.com/asyml/stave

An extensible framework for building visualization and annotation tools to enable better interaction with NLP and Artificial Intelligence. This is part of the CASL project: http://casl-project.ai/

annotation casl-project nlp petuum visualization

Last synced: 12 Nov 2024

https://github.com/wikimedia/sentencex

A sentence segmentation library with wide language support optimized for speed and utility.

natural-language-processing nlp sentence sentence-segmentation

Last synced: 07 Oct 2024

https://github.com/vzhong/e3

Dockerized code for E3: Entailment-driven Extracting and Editing for Conversational Machine Reading.

deep-learning machine-learning nlp

Last synced: 07 Nov 2024

https://github.com/strangetom/ingredient-parser

A tool to parse recipe ingredients into structured data

ingredients natural-language-processing nlp python recipes

Last synced: 08 Nov 2024

https://github.com/apache/ctakes

Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

bioinformatics clinical nlp

Last synced: 07 Oct 2024

https://github.com/donderom/llm4s

Scala 3 bindings for llama.cpp

llama llm nlp scala

Last synced: 15 Nov 2024

https://github.com/thomasahle/codenames

Codenames AI using Word Vectors

ai cli codenames game nlp word-embeddings word2vec

Last synced: 29 Oct 2024

https://github.com/jeroenouw/angularai

💬 Angular 6 AI (localhost version is working correctly)

ai angular angular6 artificial-intelligence chatbotai dialogflow google machine-learning material-design ngx nlp rxjs

Last synced: 13 Nov 2024

https://github.com/koichiyasuoka/esupar

Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models for Japanese and other languages

ainu chinese classical-chinese coptic english german japanese natural-language-processing nlp serbian thai transformers

Last synced: 16 Nov 2024

https://github.com/cdpierse/script_buddy_v2

Script Buddy v2 is a film script text generation tool built using GPT-2 and scripts from the world's most popular films.

artificial-intelligence gpt-2 language-generation machine-learning nlp pytorch transformers

Last synced: 08 Nov 2024

https://github.com/carrychang/real_time_datamining_software

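Real-time review mining software for Ctrip/Zhenguo homestays: a comprehensive demo combining real-time data collection, data cleaning, structured storage, UGC topic extraction, sentiment analysis, and post-structuring visualization. An opinion mining project based on online homestay UGC data, covering data mining and NLP processing, including data collection, topic extraction, and sentiment analysis. It mainly addresses inconsistencies between user ratings and review text, evaluates satisfaction with Ctrip and Meituan online homestays in real time, and visualizes additional data, mining and visualizing online UGC along multiple dimensions. See the link for a demo video.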

data-mining-software data-spider demo nlp real-time-anslysis sentiment-analysis ugc-analysis

Last synced: 13 Nov 2024

https://github.com/andi611/conditional-seqgan-tensorflow

Conditional Sequence Generative Adversarial Network trained with policy gradient, Implementation in Tensorflow

chatbot conditional-gan gan machine-learning nlp nlp-machine-learning seqgan tensorflow

Last synced: 07 Nov 2024

https://github.com/nltk/wordnet

Stand-alone WordNet API

nlp wordnet

Last synced: 10 Nov 2024

https://github.com/abhishek-ch/vectorverse

Explore multiple vector databases and chat with documents using multiple LLMs, including private LLMs

chatbot chatgpt chromadb elasticsearch embeddings generative generativeai milvus nlp openai python qdrant redis streamlit vectorstore

Last synced: 28 Oct 2024

https://github.com/lexiestleszek/namegen

Self-contained, minimalistic implementation of a language model that generates coherent and normal-sounding names. It uses an input dataset of names and a probability distribution to generate new names based on sequences of four characters.

language-model machine-learning markov-chain name-generation natural-language-processing nlp

Last synced: 14 Nov 2024
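
The character-sequence approach described in the entry above can be sketched as a small Markov chain. This is an illustrative sketch only, not the namegen project's code. It assumes that "sequences of four characters" means a three-character context predicting the fourth; the `START`/`END` markers and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical padding markers, assumed absent from names


def train(names, order=3):
    """Count which character follows each `order`-character context."""
    model = defaultdict(Counter)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model[context][padded[i + order]] += 1
    return model


def generate(model, order=3, max_len=12, rng=None):
    """Sample one name by walking the chain until END or max_len characters."""
    rng = rng or random.Random()
    context = START * order
    out = []
    while len(out) < max_len:
        counts = model.get(context)
        if not counts:
            break
        chars, weights = zip(*counts.items())
        # Draw the next character in proportion to its observed frequency.
        nxt = rng.choices(chars, weights=weights)[0]
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out).capitalize()


model = train(["anna"])
name = generate(model, rng=random.Random(0))
```

With a single training name every context has exactly one continuation, so the chain can only reproduce that name; a larger dataset lets contexts shared between names recombine into new ones.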

https://github.com/datawhalechina/unlock-hf

Unlock the many uses of the HuggingFace ecosystem

datawhale nlp transformers tutorial

Last synced: 09 Nov 2024

https://github.com/LanguageMachines/PICCL

A set of workflows for corpus building through OCR, post-correction and normalisation

computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow

Last synced: 03 Nov 2024

https://github.com/gunale0926/sorsa

SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

deep-learning fine-tuning llama lora machine-learning nlp peft python pytorch rwkv sorsa svd transformer

Last synced: 17 Nov 2024

https://github.com/kennethenevoldsen/spacy-wrap

spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Hugging Face in your spaCy pipeline.

deep-learning huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch spacy spacy-extension spacy-extensions spacy-models spacy-nlp spacy-pipeline spacy-transformers text-classification transformers

Last synced: 12 Oct 2024