Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers can process, analyze, and generate human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/dccuchile/spanish-word-embeddings

Spanish word embeddings computed with different methods and from different corpora

fasttext-embeddings glove-embeddings nlp spanish word-embeddings word2vec-embeddinngs

Last synced: 05 Aug 2024

https://github.com/MAIF/melusine

📧 Melusine: Use Python to automate your email processing workflow

courriels datascience emails natural-language-processing nlp nlp-machine-learning python python3

Last synced: 03 Nov 2024

https://github.com/momegas/megabots

🤖 State-of-the-art, production ready LLM apps made mega-easy, so you don't have to build them from scratch 🤯 Create a bot, now 🫵

chatbot faiss fastapi gpt-35-turbo gpt-4 information-retrieval langchain llama natural-language-processing nlp pinecone prompt-engineering python question-answering s3

Last synced: 11 Oct 2024

https://github.com/Koziev/NLP_Datasets

My NLP datasets for Russian language

datasets nlp nlp-resources

Last synced: 02 Aug 2024

https://github.com/domluna/memn2n

End-To-End Memory Network using Tensorflow

memory-networks nlp tensorflow

Last synced: 26 Oct 2024

https://github.com/deepset-ai/covid-qa

API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.

api corona covid-19 covid-data faq nlp question-answering search

Last synced: 06 Nov 2024

https://github.com/alibaba-edu/simple-effective-text-matching

Source code of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

deep-learning nlp quora-question-pairs snli tensorflow

Last synced: 06 Nov 2024

https://github.com/thisandagain/troll

Language sentiment analysis and neural networks... for trolls.

javascript moderation neural-network nlp sentiment sentiment-analysis

Last synced: 26 Oct 2024

https://github.com/kyzhouhzau/nlpgnn

1. Use BERT, ALBERT and GPT2 as tensorflow2.0's layer. 2. Implement GCN, GAN, GIN and GraphSAGE based on message passing.

albert albert-ner bert bert-cls bert-ner bilstm-attention gan gcn gin gnn gpt2 graph-classfication graph-convolutional-networks graphsage message-passing nlp tensorflow2 textcnn textgcn tf2

Last synced: 14 Oct 2024

https://github.com/oswaldoludwig/Seq2seq-Chatbot-for-Keras

This repository contains a new generative model of chatbot based on seq2seq modeling.

chatbot conversational-agents deep-learning dialogue dialogue-generation gan generative-adversarial-network glove keras nlp seq2seq

Last synced: 02 Nov 2024

https://github.com/xplip/pixel

Research code for pixel-based encoders of language (PIXEL)

deep-learning deep-neural-networks language-model machine-learning nlp pytorch

Last synced: 31 Oct 2024

https://github.com/davidmigloz/langchain_dart

Build LLM-powered Dart/Flutter applications.

ai dart flutter generative-ai llms nlp

Last synced: 03 Nov 2024

https://github.com/wuba/qa_match

A simple effective ToolKit for short text matching

58 ai deep-learning dssm lstm machine-learning nlp qabot qatools tensorflow

Last synced: 03 Aug 2024

https://github.com/shibing624/dialogbot

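dialogbot provides search-based, task-based, and generative dialogue models. A dialogue bot built on question-answering, task-oriented, and chit-chat dialogue models; it supports web-retrieval QA, domain-knowledge QA, task-guided QA, and chit-chat QA, and works out of the box.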

chatbot deep-learning dialog dialogbot nlp qa question-answering

Last synced: 30 Oct 2024

https://github.com/machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 25 Oct 2024

https://github.com/yunwei37/covid-19-nlp-vis

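An interactive website for visualizing and analyzing COVID-19 epidemic data, built with flask + pyecharts. It covers epidemic data collection, daily epidemic maps, curve charts, statistical analysis, situational awareness, algorithm design for predicting confirmed case counts, and NLP-based public opinion monitoring (deployed at http://covid.yunwei123.tech/).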

covid-19 flask maps nlp pyecharts visualization

Last synced: 26 Oct 2024

https://github.com/CogStack/OpenGPT

A framework for creating grounded instruction based datasets and training conversational domain expert Large Language Models (LLMs).

chatgpt gpt-4 health healthcare huggingface llm medicine nlp opengpt

Last synced: 03 Aug 2024

https://github.com/enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

ai llm nlp ocr openai python

Last synced: 05 Nov 2024

https://github.com/discopy/discopy

The Python toolkit for computing with string diagrams.

category-theory diagrams nlp quantum-computing

Last synced: 09 Aug 2024

https://github.com/drahnr/cargo-spellcheck

Checks all your documentation for spelling and grammar mistakes with hunspell and an nlprule-based grammar checker

cargo cargo-plugin cargo-spellcheck grammar grammar-mistakes grammarchecker hacktoberfest hunspell languagetool nlp spellchecker spelling

Last synced: 30 Oct 2024

https://github.com/HIT-SCIR/huozi

Huozi (活字): a general-purpose large language model

fine-tuning large-language-models llm nlp

Last synced: 08 Nov 2024

https://github.com/explosion/prodigy-openai-recipes

✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3

annotation-tool few-shot-learning gpt-3 nlp openai openai-api prodigy zero-shot-learning

Last synced: 25 Sep 2024

https://github.com/dpressel/dliss-tutorial

Tutorial for International Summer School on Deep Learning, 2019

deep-learning machine-learning nlp

Last synced: 26 Oct 2024

https://github.com/asahi417/lm-question-generation

Multilingual/multidomain question generation datasets, models, and python library for question generation.

bart nlp pytorch question-answering question-generation t5

Last synced: 04 Nov 2024

https://github.com/cli99/llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference

analysis deep-learning language-model language-models machine-learning nlp transformers

Last synced: 06 Aug 2024

https://github.com/swhl/ai-competition-collections

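A collection of posts on AI competition experience and on training and testing tricks (gathering and organizing experience write-ups from various AI competitions)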

competition cv data-discovery graph-neural-networks knowledge-graph nlp recommender-system speech

Last synced: 01 Nov 2024

https://github.com/xiangking/ark-nlp

A private nlp coding package, which quickly implements the SOTA solutions.

bert nlp transfomer

Last synced: 06 Nov 2024

https://github.com/JetRunner/BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).

bert glue model-compression nlp transformers

Last synced: 03 Nov 2024

https://github.com/SimGus/Chatette

A powerful dataset generator for Rasa NLU, inspired by Chatito

botkit chatbot chatbots chatito cli dataset-generation nlg nlp nlu parsing python rasa rasa-nlu sentence

Last synced: 31 Oct 2024

https://github.com/mcs07/chemdataextractor

Automatically extract chemical information from scientific documents

chemistry information-extraction natural-language-processing nlp python text-mining

Last synced: 07 Nov 2024

https://github.com/qiangsiwei/bert_distill

BERT distillation (distillation experiments based on BERT)

bert classification distillation nlp

Last synced: 02 Nov 2024

https://github.com/UKPLab/gpl

Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577

bert domain-adaptation information-retrieval nlp transformers vector-search

Last synced: 05 Aug 2024

https://github.com/natasha/yargy

Rule-based facts extraction for Russian language

earley-parser information-extraction morphology nlp python russian tomita tomita-parser

Last synced: 10 Nov 2024

https://github.com/graykode/ai-docstring

Visual Studio Code extension to quickly generate docstrings for python functions using AI(NLP) technology.

bert code-summarization docstrings nlp vs-code-extenstion

Last synced: 04 Nov 2024

https://github.com/xkzhangsan/xk-time

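xk-time is a toolkit for time conversion, time calculation, time formatting, time parsing, calendars, cron expressions, and time NLP. It uses Java 8 (JSR-310), is thread-safe and easy to use, ships more than 70 common date-formatting templates, supports both the Java 8 time classes and Date, and is lightweight with no third-party dependencies.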

calendar cron cron-java8 date datetimeformatter-formatter dateutil formatter java jsr-310 localdate localdatetime nlp time timeconvertion

Last synced: 04 Aug 2024

https://github.com/alibaba-edu/simple-effective-text-matching-pytorch

A pytorch implementation of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

deep-learning nlp pytorch quora-question-pairs snli

Last synced: 06 Nov 2024

https://github.com/GaoQ1/rasa_nlu_gq

Turn natural language into structured data (supports Chinese; includes N custom models for different scenarios and tasks)

bert bilstm-idcnn jieba natural-language nlp nlu rasa rasa-nlu rasa-nlu-gao tensorflow

Last synced: 02 Nov 2024

https://github.com/dair-ai/nlp_newsletter

📰Natural language processing (NLP) newsletter

deep-learning machine-learning nlp

Last synced: 03 Sep 2024

https://github.com/hankcs/multi-criteria-cws

Simple Solution for Multi-Criteria Chinese Word Segmentation

bi-lstm-crf cws dynet multi-criteria-cws nlp

Last synced: 09 Nov 2024

https://github.com/farukalamai/advanced-machine-learning-engineer-roadmap-2024

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.

aws computer-vision data-analysis data-science data-visualization deep-learning git-github machine-learning machine-learning-roadmap mlops natural-language-processing neural-network nlp opencv pandas python pytorch statistics tensorflow yolo

Last synced: 07 Nov 2024

https://github.com/phospho-app/phospho

Text analytics for LLM apps. PostHog for prompts. Extract evaluations, intents and events from text messages. phospho leverages LLM (OpenAI, MistralAI, Ollama, etc.)

ai analytics generative-ai llm nextjs nlp ollama python self-hosted typescript

Last synced: 13 Oct 2024

https://github.com/kevinlu1248/pyate

PYthon Automated Term Extraction

ai nlp symbolic-ai term-extraction

Last synced: 28 Sep 2024

https://github.com/gagolews/stringi

Fast and portable character string processing in R (with the Unicode ICU)

icu icu4c natural-language-processing nlp r regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode

Last synced: 26 Oct 2024

https://github.com/hankcs/hanlp-lucene-plugin

HanLP Chinese word segmentation plugin for Lucene, supporting Lucene-based systems including Solr

chinese-text-segmentation hanlp lucene nlp solr traditional-chinese

Last synced: 26 Oct 2024

https://github.com/daac-tools/vibrato

🎤 vibrato: Viterbi-based accelerated tokenizer

japanese morphological-analysis nlp rust segmentation tokenization tokenizer

Last synced: 07 Nov 2024

https://github.com/gentaiscool/code-switching-papers

A curated list of research papers and resources on code-switching

bilingual code-mixed code-mixing code-switch code-switching language nlp papers research speech

Last synced: 08 Nov 2024

https://github.com/jameshwade/gpttools

gpttools extends gptstudio for package development to help you document code, write tests, or even explain code

chatgpt nlp openai package-development rstats rstudio-addin

Last synced: 09 Nov 2024

https://github.com/jsksxs360/AHANLP

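AHANLP (the "Aha" natural language processing package) provides word segmentation, dependency parsing, semantic role labeling, automatic summarization, semantic similarity computation, LDA topic prediction, word clouds, and other services.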

chinese nlp

Last synced: 30 Oct 2024

https://github.com/sekwiatkowski/Komputation

Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.

artificial-intelligence convolutional-neural-networks cuda framework gpu jvm kotlin machine-learning neural-networks nlp nvidia recurrent-neural-networks seq2seq

Last synced: 02 Nov 2024

https://github.com/feedly/transfer-nlp

NLP library designed for reproducible experimentation management

framework language-model natural-language-understanding nlp playground pytorch transfer-learning

Last synced: 13 Oct 2024

https://github.com/hhstore/blog

My Tech Blog: about Mojo / Rust / Golang / Python / Kotlin / Flutter / VueJS / Blockchain etc.

ai android blockchain blog dart docker flutter golang gpt ios k8s kotlin mojo nlp python rust vuejs web3 zig

Last synced: 14 Oct 2024

https://github.com/thunlp/NSC

Neural Sentiment Classification

nlp

Last synced: 07 Aug 2024

https://github.com/caiyinqiong/semantic-retrieval-models

A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).

dense-retrieval information-retrieval nlp paper-list semantic-retrieval

Last synced: 10 Nov 2024

https://github.com/igorbrigadir/stopwords

Default English stopword lists from many different sources

en-stopwords english-stopwords natural-language-processing nlp stopwords

Last synced: 05 Nov 2024

https://github.com/zhongkaifu/RNNSharp

RNNSharp is a toolkit for deep recurrent neural networks that is widely used for many kinds of tasks, such as sequence labeling and sequence-to-sequence modeling. It is written in C# and targets .NET Framework 4.6 or above. RNNSharp supports many types of networks, such as forward, bi-directional, and sequence-to-sequence networks, and different types of layers, such as LSTM, Softmax, sampled Softmax and others.

c-sharp crf deep-learning dotnet lstm machine-learning nlp recurrent-neural-networks rnn rnn-model sequence-labeling

Last synced: 17 Aug 2024

https://github.com/liyucheng09/selective_context

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.

chatgpt llms nlp

Last synced: 30 Oct 2024

https://github.com/sunzeyeah/RLHF

Implementation of Chinese ChatGPT

chatgpt deep-learning deepspeed glm nlp pangu pytorch

Last synced: 31 Oct 2024

https://github.com/merantix-momentum/squirrel-core

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰

ai cloud-computing collaboration computer-vision cv data-ingestion data-mesh data-science dataops datasets deep-learning distributed internal machine-learning ml natural-language-processing nlp python pytorch tensorflow

Last synced: 02 Nov 2024

https://github.com/boat-group/fancy-nlp

NLP for human. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.

bert bert-chinese bert-classifier bert-embeddings bert-ner bilstm-crf bimpm chinese-nlp crf esim keras named-entity-recognition nlp python-library semantic-similarity tensorflow text-classification tf2

Last synced: 30 Oct 2024

https://github.com/google-research/retvec

RETVec is an efficient, multilingual, and adversarially-robust text vectorizer.

deep-learning natural-language-processing nlp python tensorflow text-classification

Last synced: 13 Oct 2024

https://github.com/natasha/corus

Links to Russian corpora + Python functions for loading and parsing

corpora datasets nlp python russian

Last synced: 10 Nov 2024

https://github.com/extreme-bert/extreme-bert

ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT”.

bert deep-learning language-model language-models machine-learning natural-language-processing nlp python pytorch transformer

Last synced: 03 Aug 2024

https://github.com/LudwigStumpp/llm-leaderboard

A joint community effort to create one central leaderboard for LLMs.

leaderboard llm machine-learning nlp

Last synced: 02 Nov 2024

https://github.com/amirshnll/persian-swear-words

Persian Swear Dataset: use it in production to filter unwanted content. A dataset of inappropriate and offensive Persian words for filtering text.

dataset datasets farsi farsiswear farsiswearword nlp nlp-dataset persian persiandataset persianswearword swear sweardataset swearword

Last synced: 30 Oct 2024

https://github.com/bonzanini/nlp-tutorial

Tutorial: Natural Language Processing in Python

natural-language-processing nlp python

Last synced: 07 Aug 2024

https://github.com/RTIInternational/gobbli

Deep learning with text doesn't have to be scary.

deep-learning docker nlp python

Last synced: 04 Nov 2024

https://github.com/shineware/KOMORAN

Korean Morphological Analyzer by shineware

komoran korean-nlp korean-text-processing morphological-analysis nlp shineware

Last synced: 02 Aug 2024

https://github.com/krishnap25/mauve

Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`.

deep-learning huggingface-transformers nlp pytorch text-generation

Last synced: 13 Oct 2024

https://github.com/phantominsights/summarizer

A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom built algorithm to rank words and sentences.

nlp praw python3 reddit-bot spacy web-scraper wordcloud

Last synced: 31 Oct 2024

https://github.com/stanford-oval/genie-server

The home server version of Almond

hacktoberfest nlp raspberrypi voice

Last synced: 11 Oct 2024

https://github.com/quadrismegistus/prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.

finnish-language-analysis linguistics metrical-parser nlp poetry rhythm

Last synced: 30 Oct 2024

https://github.com/neuml/txtchat

💭 Retrieval augmented generation (RAG) and language model powered search applications

large-language-models llm machine-learning nlp python rag retrieval-augmented-generation search txtai

Last synced: 28 Oct 2024