Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published "Computing Machinery and Intelligence," which proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/cluebenchmark/mobileqa

Offline on-device reading comprehension app. QA for mobile, Android & iPhone

albert android bert chinese iphone machine-reading-comprehension nlp qa tensorflow tflite

Last synced: 09 Nov 2024

https://github.com/knadh/indic.page

A directory of Indic (Indian) language computing resources.

datasets indian-language indic-languages language linguistics nlp

Last synced: 28 Oct 2024

https://github.com/doccano/doccano-mini

Annotation meets Large Language Models (ChatGPT, GPT-3 and alike).

annotation langchain nlp openai

Last synced: 01 Nov 2024

https://github.com/xyntopia/pydoxtools

Extract information from unstructured data with this library, which uses advanced AI techniques. Compose AI components into customizable pipelines that draw on diverse sources for your projects.

chatgpt document-analysis document-extraction extraction information-retrieval llm nlp pdf python

Last synced: 03 Aug 2024

https://github.com/openagi/tefla

Tensorflow based deep-learning library.

ai deep-learning gan generative-adversarial-network lstm ml nlp tensorflow-framework

Last synced: 14 Oct 2024

https://github.com/chiphuyen/metrotwitter

What Twitter reveals about the differences between cities and the monoculture of the Bay Area

data-analysis data-visualization emojis nlp nlp-datasets python twitter twitter-dataset

Last synced: 08 Nov 2024

https://github.com/mlabouardy/dialogflow-angular5

💬 Bot in Angular 5 & DialogFlow

ai angular angular5 api-ai bot chatbot dialogflow nlp

Last synced: 15 Nov 2024

https://github.com/SamEdwardes/spacytextblob

A TextBlob sentiment analysis pipeline component for spaCy.

natural-language-processing nlp python spacy

Last synced: 04 Aug 2024

https://github.com/cohere-ai/sandbox-accelerating-chatbot-training

Leveraging Cohere's models to enable zero-shot routing

chatbot large-language-models llm nlp routing

Last synced: 07 Oct 2024

https://github.com/samedwardes/spacytextblob

A TextBlob sentiment analysis pipeline component for spaCy.

natural-language-processing nlp python spacy

Last synced: 14 Oct 2024

https://github.com/cluebenchmark/lightlm

High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task

bert chinese chinese-language-model languagemodel nlp nlpcc nlpcc2020

Last synced: 09 Nov 2024

https://github.com/lunarwhite/tan-division

Chinese corpus sentiment analysis: sentiment analysis of Chinese text from Tan Songbo's hotel reviews.

deep-learning keras lstm nlp python rnn tensorflow

Last synced: 06 Nov 2024

https://github.com/chanind/frame-semantic-transformer

Frame Semantic Parser based on T5 and FrameNet

framenet huggingface nlp semantic-parsing t5 transformers

Last synced: 08 Nov 2024

https://github.com/LaVi-Lab/CLEVA

[EMNLP 2023 Demo] CLEVA: Chinese Language Models EVAluation Platform

chinese evaluation nlp

Last synced: 03 Aug 2024

https://github.com/paulrinckens/timexy

A spaCy custom component that extracts and normalizes temporal expressions

date-parser datetime natural-language-processing nlp python spacy spacy-extension timeml timex3

Last synced: 14 Oct 2024

https://github.com/swader/diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

ai artificial-intelligence bot crawl crawling diffbot machine-learning nlp php scrape scraped-data scraper scraping

Last synced: 08 Nov 2024

https://github.com/hironsan/neraug

A text augmentation tool for named entity recognition.

deep-learning machine-learning natural-language-processing nlp

Last synced: 27 Oct 2024

https://github.com/clipperhouse/uax29

A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes.

go golang nlp tokenization tokenizer uax29 unicode

Last synced: 14 Nov 2024

https://github.com/developersdigest/function-chain

The FunctionChain is a tool that simplifies and organizes the process of invoking OpenAI functions in your Node.js applications. With this toolkit, you can easily scaffold out and isolate all the OpenAI function calls you need, making your code more modular, maintainable, and scalable.

alpha-vantage artificial-intelligence automation function-calling functionchain langchain machine-learning natural-language natural-language-processing nlp openai pinecone

Last synced: 08 Nov 2024

https://github.com/shibing624/judger

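Automated essay scoring tool. Supports intelligent scoring of Chinese and English essays, self-training of scoring models, processing of model data with WEKA, and custom scoring algorithms. Developed in Java.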

aes automated-essay-scoring essayscoring judger nlp

Last synced: 22 Oct 2024

https://github.com/vamshiiitbhu14/nlpswift

NSLinguisticTagger provides a uniform interface to a variety of natural language processing functionality with support for many different languages and scripts. One can use this class to segment natural language text into paragraphs, sentences, or words and tag information about those segments, such as part of speech, lexical class, and lemma.

coreml ios nlp nlp-apis swift4

Last synced: 10 Nov 2024

https://github.com/rguthrie3/morphologicalpriorsforwordembeddings

Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings

blocks emnlp neural-network nlp theano word-embeddings

Last synced: 08 Nov 2024

https://github.com/jenojp/extractacy

Spacy pipeline object for extracting values that correspond to a named entity (e.g., birth dates, account numbers, laboratory results)

entity-extraction entity-linking ner nlp pattern-matching spacy spacy-extension spacy-pipeline

Last synced: 14 Oct 2024

https://github.com/minasmz/Persian-Summarization

Statistical and semantic text summarizer for the Persian language

doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm

Last synced: 04 Aug 2024

https://github.com/tokenmill/beagle

Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.

clojure java lucene luwak nlp real-time-search stemming stored-query-engine stream-search

Last synced: 10 Nov 2024

https://github.com/MrinmoiHossain/Udacity-Deep-Learning-Nanodegree

The course covers knowledge that is useful for working on deep learning as an engineer. It includes simple neural networks and training, CNNs, autoencoders and feature extraction, transfer learning, RNNs, LSTMs, NLP, data augmentation, GANs, hyperparameter tuning, and model deployment and serving.

convolutional-networks convolutional-neural-networks deep-learning gans generative-adversarial-network long-short-term-memory-models lstms machine-learning nanodegree neural-network nlp pytorch recurrent-neural-networks rnn sentiment-analysis style-transfer transfer-learning udacity udacity-nanodegree

Last synced: 07 Aug 2024

https://github.com/tokuhirom/jawiki-kana-kanji-dict

Generate SKK/MeCab dictionary from Wikipedia (Japanese edition)

japanese-language nlp wikipedia

Last synced: 27 Oct 2024

https://github.com/FerdinandZhong/punctuator

A small seq2seq punctuator tool based on DistilBERT

bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq

Last synced: 14 Nov 2024

https://github.com/ferdinandzhong/punctuator

A small seq2seq punctuator tool based on DistilBERT

bert bert-ner chinese-nlp deep-learning nlp punctuation pytorch seq2seq

Last synced: 26 Oct 2024

https://github.com/nzw0301/lightlda

fast sampling algorithm based on CGS

lda machine-learning nlp python topic-modeling

Last synced: 22 Oct 2024

https://github.com/nzw0301/lightLDA

fast sampling algorithm based on CGS

lda machine-learning nlp python topic-modeling

Last synced: 04 Nov 2024

https://github.com/gunthercox/mathparse

A Python library for evaluating natural language mathematical equations

mathematics nlp python

Last synced: 27 Oct 2024

https://github.com/siddk/deep-nlp

Tensorflow Tutorial files and Implementations of various Deep NLP and CV Models.

deep mnist-nn neural-network nlp tensorflow tensorflow-tutorials

Last synced: 22 Oct 2024

https://github.com/vdutts7/ai-rapper

Talking Head of your favorite rapper using Transformers, PyTorch, Tortoise TTS, and OpenCV 🎵

huggingface-transformers nlp opencv pytorch tortoise-tts voice-clone

Last synced: 11 Nov 2024

https://github.com/asyml/stave

An extensible framework for building visualization and annotation tools to enable better interaction with NLP and Artificial Intelligence. This is part of the CASL project: http://casl-project.ai/

annotation casl-project nlp petuum visualization

Last synced: 12 Nov 2024

https://github.com/arne-cl/discoursegraphs

linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX).

conversion converter natural-language-processing networkx nlp python

Last synced: 10 Nov 2024

https://github.com/wayfair-incubator/extra-model

Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.

aspect-based-sentiment-analysis aspect-extraction machine-learning-algorithms nlp nlp-keywords-extraction nlp-library python python-library python3

Last synced: 07 Nov 2024

https://github.com/lonePatient/bert-sentence-similarity-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for the sentence similarity task.

bert nlp pytorch sentence-similarity text-classification

Last synced: 02 Nov 2024

https://github.com/kaleidophon/token2index

A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and Tensorflow.

deep-learning deeplearning i2t i2w indexing itos nlp numpy python pytorch rnn rnns seq2seq stoi t2i tensorflow token transformer transformers w2i

Last synced: 09 Nov 2024

https://github.com/lonepatient/bert-sentence-similarity-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for the sentence similarity task.

bert nlp pytorch sentence-similarity text-classification

Last synced: 06 Nov 2024

https://github.com/gentaiscool/few-shot-lm

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

few-shot few-shot-learning gpt intent language-model multilingual nlp t5

Last synced: 08 Nov 2024

https://github.com/blacksamorez/ebanko

NLP based telegram bot

docker-compose grafana nlp telegram-bot

Last synced: 05 Nov 2024

https://github.com/strangetom/ingredient-parser

A tool to parse recipe ingredients into structured data

ingredients natural-language-processing nlp python recipes

Last synced: 08 Nov 2024

https://github.com/apache/ctakes

Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

bioinformatics clinical nlp

Last synced: 07 Oct 2024

https://github.com/jeroenouw/angularai

💬 Angular 6 AI (localhost version is working correctly)

ai angular angular6 artificial-intelligence chatbotai dialogflow google machine-learning material-design ngx nlp rxjs

Last synced: 13 Nov 2024

https://github.com/vzhong/e3

Dockerized code for E3: Entailment-driven Extracting and Editing for Conversational Machine Reading.

deep-learning machine-learning nlp

Last synced: 07 Nov 2024

https://github.com/thomasahle/codenames

Codenames AI using Word Vectors

ai cli codenames game nlp word-embeddings word2vec

Last synced: 29 Oct 2024

https://github.com/donderom/llm4s

Scala 3 bindings for llama.cpp

llama llm nlp scala

Last synced: 08 Nov 2024

https://github.com/wikimedia/sentencex

A sentence segmentation library with wide language support optimized for speed and utility.

natural-language-processing nlp sentence sentence-segmentation

Last synced: 07 Oct 2024

https://github.com/cdpierse/script_buddy_v2

Script Buddy v2 is a film script text generation tool built using the world's most popular film scripts and GPT-2.

artificial-intelligence gpt-2 language-generation machine-learning nlp pytorch transformers

Last synced: 08 Nov 2024

https://github.com/lexiestleszek/namegen

Self-contained, minimalistic implementation of a language model that generates coherent, normal-sounding names. It uses an input dataset of names and a probability distribution to generate new names based on sequences of four characters.

language-model machine-learning markov-chain name-generation natural-language-processing nlp

Last synced: 14 Nov 2024

https://github.com/andi611/conditional-seqgan-tensorflow

Conditional Sequence Generative Adversarial Network trained with policy gradient, implemented in TensorFlow

chatbot conditional-gan gan machine-learning nlp nlp-machine-learning seqgan tensorflow

Last synced: 07 Nov 2024

https://github.com/carrychang/real_time_datamining_software

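Real-time review mining software for Ctrip and Zhenguo homestays: a comprehensive demo covering real-time data collection, data cleaning, structured storage, UGC topic extraction, sentiment analysis, and post-structuring visualization. An opinion mining project based on online homestay UGC data, including data mining and NLP processing, and covering data collection, topic extraction, and sentiment analysis. It mainly addresses inconsistencies between user ratings and reviews, evaluates satisfaction with Ctrip and Meituan online homestays in real time, and visualizes additional data. A comprehensive tool that mines and visualizes online UGC along multiple dimensions; see the link for a demo video.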

data-mining-software data-spider demo nlp real-time-anslysis sentiment-analysis ugc-analysis

Last synced: 13 Nov 2024

https://github.com/datawhalechina/unlock-hf

Unlock the many ways of using the HuggingFace ecosystem

datawhale nlp transformers tutorial

Last synced: 09 Nov 2024

https://github.com/abhishek-ch/vectorverse

Explore Multiple Vector Databases and chat with documents on Multiple LLM models, private LLM models

chatbot chatgpt chromadb elasticsearch embeddings generative generativeai milvus nlp openai python qdrant redis streamlit vectorstore

Last synced: 28 Oct 2024

https://github.com/nltk/wordnet

Stand-alone WordNet API

nlp wordnet

Last synced: 10 Nov 2024

https://github.com/LanguageMachines/PICCL

A set of workflows for corpus building through OCR, post-correction and normalisation

computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow

Last synced: 03 Nov 2024

https://github.com/khakhulin/compressed-transformer

Compression of NMT transformer model with tensor methods

compression deep-learning mnist nlp nmt pytorch tensor-train transformer translation tucker

Last synced: 04 Aug 2024

https://github.com/kennethenevoldsen/spacy-wrap

spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Hugging Face in your spaCy pipeline.

deep-learning huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch spacy spacy-extension spacy-extensions spacy-models spacy-nlp spacy-pipeline spacy-transformers text-classification transformers

Last synced: 12 Oct 2024

https://github.com/kaleidophon/nlp-uncertainty-zoo

Model zoo for different kinds of uncertainty quantification methods used in Natural Language Processing, implemented in PyTorch.

deep-learning lstm nlp nlp-machine-learning package python pytorch rnn transformers uncertainty-estimation uncertainty-neural-networks uncertainty-quantification

Last synced: 10 Oct 2024

https://github.com/hsm207/bert_attn_viz

Visualize BERT's self-attention layers on text classification tasks

attention bert explainable-ai nlp tensorflow

Last synced: 28 Oct 2024

https://github.com/skoltech-nlp/rudetoxifier

Code and data of "Methods for Detoxification of Texts for the Russian Language" paper

nlp russian-language style-transfer

Last synced: 08 Nov 2024

https://github.com/onesuper/HuggingFace-Datasets-Text-Quality-Analysis

Retrieves Parquet files from Hugging Face, then identifies and quantifies junk data, duplication, contamination, and biased content in the dataset using pandas

dataset huggingface-datasets llm machine-learning nlp streamlit text-processing

Last synced: 09 Aug 2024

https://github.com/kootenpv/spacy_api

Server/Client around Spacy to load spacy only once

api machine-learning nlp spacy

Last synced: 14 Oct 2024

https://github.com/welding-torch/excel-anonymizer

A Python script that anonymizes an Excel file and synthesizes new data in its place.

data-science microsoft nlp pandas presidio privacy

Last synced: 07 Nov 2024

https://github.com/nlpcloud/nlpcloud-js

NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...

ad-generator chatbot code-generation conversational-ai embeddings intent-classification keywords-extraction language-detection machine-translation ner nlp paraphrasing question-answering semantic-similarity sentiment-analysis text-classification text-generation text-summarization tokenization

Last synced: 07 Nov 2024

https://github.com/christabor/namebot

A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agencies for sophisticated word generation and ideation.

language name-generation naming naming-agencies nlp nltk

Last synced: 07 Nov 2024

https://github.com/s-nlp/rudetoxifier

Code and data of "Methods for Detoxification of Texts for the Russian Language" paper

nlp russian-language style-transfer

Last synced: 07 Aug 2024

https://github.com/obss/trapper

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

allennlp deep-learning natural-language-processing nlp python pytorch pytorch-transformers transformer transformers

Last synced: 27 Oct 2024

https://github.com/ChenghaoMou/pytorch-pQRNN

Implementation of pQRNN in PyTorch

nlp pqrnn pytorch text-classification

Last synced: 03 Aug 2024

https://github.com/teticio/llama-squad

Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundation) model.

decoder fine-tuning llama2 llama3 nlp question-answering squad

Last synced: 10 Oct 2024