Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/lan-ce-lot/pythorch-text-classification

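Text classification and sentiment analysis of Douban movie reviews: a crawler scrapes reviews from Douban, the data is cleaned and word-segmented, models such as BERT, CNN, and LSTM are trained, and the training process is visualized with tensorboardX. An NLP project for text classification, based on torch 1.7.1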

bert cnn douban lstm natural-language-processing nlp qt qt5 qt6 rnn scrapy sentiment-analysis tensorboard tensorboardx text-classification ui

Last synced: 12 Oct 2024

https://github.com/ZhixiuYe/Intra-Bag-and-Inter-Bag-Attentions

Code for NAACL 2019 paper: Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions

deeplearning distant-supervision nlp pytorch relation-extraction

Last synced: 01 Nov 2024

https://github.com/clovaai/focusseq2seq

[EMNLP 2019] Mixture Content Selection for Diverse Sequence Generation (Question Generation / Abstractive Summarization)

emnlp2019 generation nlp pytorch question-generation summarization

Last synced: 12 Nov 2024

https://github.com/Nipun1212/Claude_api

Claude_api is a Python package that provides a convenient way to interact with Claude 2 from Anthropic.

anthropic anthropic-claude claude claude-ai claude-api nlp

Last synced: 11 Nov 2024

https://github.com/logpai/bughub

A collection of free-text bug reports for duplicate issue identification

bug-reports datasets duplicate-detection nlp

Last synced: 07 Nov 2024

https://github.com/aphp/edsnlp

Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.

clinical-data-warehouse deep-learning fast french medical multi-task nlp pytorch rule-based spacy text-mining

Last synced: 14 Oct 2024

https://github.com/McGill-NLP/weblinx

WebLINX is a benchmark for building web navigation agents with conversational capabilities

agent agents computer-vision llm multimodal navigation nlp web

Last synced: 20 Oct 2024

https://github.com/deepset-ai/haystack-core-integrations

Additional packages (components, document stores, and the like) to extend the capabilities of Haystack version 2.0 and onwards

ai haystack llm mlops nlp

Last synced: 13 Nov 2024

https://github.com/shibing624/nerpy

🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit that supports models such as BertSoftmax and BertSpan and works out of the box.

bert bert-softmax bert-span named-entity-recognition ner nlp pytorch transformers

Last synced: 31 Oct 2024

https://github.com/bayeru/chat-to-your-database

Chat to your database with AI. An experimental app to test the abilities of LLMs to query SQL databases using natural language.

chatgpt chatgpt-app database langchain langchain-typescript llm llms mysql natural-language-processing nlp openai postgres sql sqlite

Last synced: 10 Aug 2024

https://github.com/winkjs/wink-nlp-utils

NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.

bag-of-words natural-language-processing ngrams nlp phonetize sentence-boundary-detection stem stop-words tokenize

Last synced: 09 Nov 2024

https://github.com/johnbumgarner/wordhoard

This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.

antonyms bag-of-words definitions dictionary homophones hypernyms hyponyms lexicon nlp python python3 synonyms text-analysis textual-analysis wordlists wordnet wordnets wordsearch

Last synced: 04 Aug 2024

https://github.com/pooya-mohammadi/deep_utils

An open-source toolkit which is full of handy functions, including the most used models and utilities for deep-learning practitioners!

augmentation coco computer-vision cutmix deep-learning face-detection face-recognition machine-learning modelcheckpoint nlp object-detection python pytorch senet tensorflow utils vggface2 yolov5

Last synced: 09 Nov 2024

https://github.com/deep-diver/en-fr-mlt-tensorflow

English-French Machine Language Translation in Tensorflow

deep-learning english-to-french machine-translation nlp tensorflow

Last synced: 01 Nov 2024

https://github.com/yohasebe/lemmatizer

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

lemmatizer nlp ruby rubynlp wordnet

Last synced: 08 Nov 2024

https://github.com/DFKI-NLP/TRE

[AKBC 19] Improving Relation Extraction by Pre-trained Language Representations

information-extraction machine-learning multi-task-learning nlp relation-extraction transformer

Last synced: 01 Nov 2024

https://github.com/clipperhouse/jargon

Tokenizers and lemmatizers for Go

data-science go lemmatizer nlp tokenizer

Last synced: 14 Nov 2024

https://github.com/proycon/flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.

annotation-tool clariah clarin computational-linguistics folia javascript linguistic-annotation-framework linguistics nlp python web-application

Last synced: 31 Oct 2024

https://github.com/ahmedbesbes/media-agent

Scrape data from social media and chat with it using Langchain

langchain large-language-models llms nlp nlproc python tweepy

Last synced: 06 Nov 2024

https://github.com/ahmedbesbes/twitter-agent

Scrape data from social media and chat with it using Langchain

langchain large-language-models llms nlp nlproc python tweepy

Last synced: 22 Aug 2024

https://github.com/prrao87/tweet-stance-prediction

Applying NLP transfer learning techniques to predict Tweet stance toward a topic

natural-language-processing nlp openai-gpt python text-classification transfer-learning transformers ulmfit

Last synced: 02 Nov 2024

https://github.com/textlint-rule/sentence-splitter

Split {Japanese, English} text into sentences.

english japanese javascript nlp segement sentence

Last synced: 04 Aug 2024

https://github.com/kororo/excelcy

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.

entity excel nlp python python3 spacy spacy-extensions spacy-nlp spacy-pipeline training xlsx

Last synced: 14 Oct 2024

https://github.com/martinomensio/spacy-sentence-bert

Sentence transformers models for spaCy

bert models nlp sentence-bert sentence-transformers spacy

Last synced: 14 Oct 2024

https://github.com/orthagonal/langchainex

Language Chain Library for Elixir

ai langchain nlp

Last synced: 01 Nov 2024

https://github.com/SunLemuria/OpenGPTAndBeyond

Open efforts to implement ChatGPT-like models and beyond.

alpaca chatbot chatglm chatgpt large-language-models llm nlp openai opensource

Last synced: 06 Nov 2024

https://mcgill-nlp.github.io/weblinx/

WebLINX is a benchmark for building web navigation agents with conversational capabilities

agent agents computer-vision llm multimodal navigation nlp web

Last synced: 03 Aug 2024

https://github.com/hongzhaohua/jstarcraft-nlp

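Focuses on several core problems in natural language processing: lexical analysis, syntactic analysis, semantic analysis, language detection, information extraction, text clustering, and text classification. Provides developers in related fields with a complete general-purpose design and reference implementation. Covers a variety of NLP algorithms and adapts to multiple NLP frameworks. Compatible with Lucene/Solr/ElasticSearch plugins.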

ansj corenlp elasticsearch hanlp ik java jcseg jieba language-detection lucene mmseg mynlp nlp solr thulac word

Last synced: 08 Nov 2024

https://github.com/jmisilo/clip-gpt-captioning

CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.

computer-vision cv deep-learning image-caption image-caption-generator image-captioning machine-learning nlp python pytorch

Last synced: 04 Nov 2024

https://github.com/SergeyShk/ruTS

A library for extracting statistics from Russian-language texts.

computational-linguistics natural-language-processing nlp russian-specific text-analytics

Last synced: 07 Aug 2024

https://github.com/davidberenstein1957/crosslingual-coreference

A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.

coreference coreference-resolution hacktoberfest natural-language-processing nlp python spacy

Last synced: 01 Nov 2024

https://github.com/awslabs/speech-representations

Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)

deep-learning nlp speech-recognition

Last synced: 25 Oct 2024

https://github.com/deepset-ai/haystack-demos

Fully working applications that demonstrate how to use Haystack to implement common NLP use cases

nlp python question-answering semantic-search

Last synced: 06 Nov 2024

https://github.com/rerender2021/echo

A simple ASR translator powered by Avernakis React.

asr ave avernakis nlp offline translation

Last synced: 06 Nov 2024

https://github.com/ben-aaron188/rgpt3

Making requests from R to the GPT models

chatgpt gpt3 llm nlp openai r

Last synced: 13 Nov 2024

https://github.com/leehanchung/lora-instruct

Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA

agi falcon gpt llama llm lora mpt nlp redpajama

Last synced: 10 Nov 2024

https://github.com/bnosac/ruimtehol

R package to Embed All the Things! using StarSpace

classification embeddings natural-language-processing nlp r similarity starspace text-mining

Last synced: 11 Nov 2024

https://github.com/lonePatient/BERT-chinese-text-classification-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for text classification.

bert chinese chinese-text-classification nlp pytorch text-classification

Last synced: 02 Nov 2024

https://github.com/KudoAI/bravegpt

🦁 Brave Search add-on that brings the magic of ChatGPT to search results (powered by GPT-4!)

ai artificial-intelligence brave brave-search chat chatbot chatgpt chatgpt3 gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai search userscripts web websearch

Last synced: 08 Nov 2024

https://github.com/makaveli10/stockprediction_transformer

Intra day Stock Prediction 10 minutes into the future

intraday-stock-trading nlp stock-price-prediction transformer

Last synced: 27 Oct 2024

https://github.com/alvinwan/timefhuman

Convert natural language date-like strings--dates, date ranges, and lists of dates--to Python objects

date-parser datetime datetime-inputs nlp python3

Last synced: 26 Oct 2024

https://github.com/d99kris/spacy-cpp

C++ wrapper library for the NLP library spaCy

c-plus-plus linux nlp nlp-libraries spacy

Last synced: 14 Oct 2024

https://github.com/princeton-nlp/LLMBar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

evaluation llm nlp

Last synced: 10 Nov 2024

https://chats-lab.github.io/KokoMind/

KokoMind: Can LLMs Understand Social Interactions?

chatgpt deep-learning gpt-4 language-model neural-network nlp

Last synced: 03 Aug 2024

https://github.com/etherealengine/digital-beings

A platform for letting researchers connect an intelligent AI directly to real time communication networks and 3D worlds. Your AI, Anywhere.

ai artificial-intelligence bot computer-vision cv digital-beings digital-humans machine-learning ml nlp telegram

Last synced: 12 Nov 2024

https://github.com/harunzafer/nuve

Natural Language Processing Library for Turkish in C#

ngram-extraction nlp nuve turkish

Last synced: 12 Nov 2024

https://github.com/xlang-ai/icl-selective-annotation

[ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"

active-learning in-context-learning language-model natural-language-processing nlp sample-selection

Last synced: 13 Nov 2024

https://github.com/dengbocong/nlp-dialogue

A full-process dialogue system that can be deployed online

bot bots chatbot conversational-ai deep-learning machine-learning natural-language-processing nlp nlu

Last synced: 08 Nov 2024

https://github.com/clovaai/webvicob

Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023

document-ai icdar2023 nlp ocr

Last synced: 12 Nov 2024

https://github.com/xv44586/toolkit4nlp

Transformers implementation (architecture, task examples, serving, and more)

bert keras nlp

Last synced: 13 Oct 2024

https://github.com/thunlp/multird

Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model"

nlp reverse-dictionary

Last synced: 10 Nov 2024

https://github.com/salvatorera/ml-news-of-the-week

A collection of the best ML and AI news every week (research, news, resources)

agents ai artificial-intelligence computer-vision llms machine-learning nlp python rag retrieval-augmented-generation transformer

Last synced: 26 Oct 2024

https://github.com/google-research-datasets/wiki-atomic-edits

A dataset of atomic Wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.

deep-learning deep-neural-networks nlp nlp-machine-learning wikipedia

Last synced: 08 Nov 2024

https://github.com/JDongian/python-jamo

Hangul syllable decomposition and synthesis using jamo.

hangul korean nlp python

Last synced: 03 Aug 2024

https://github.com/mgechev/ngx-tfjs

🤖 TensorFlow.js bindings for Angular

angular machine-learning nlp tensorflowjs

Last synced: 01 Nov 2024

https://github.com/adhikary97/Sharetape-Open-Source

Script that takes any long form video or podcast and outputs clips for social media

instagram-reels nlp podcast tiktok video-clipper video-clips youtube

Last synced: 04 Aug 2024

https://github.com/chunml/nlp

This is where I put all my work in Natural Language Processing

natural-language-processing nlp python tensorflow tensorflow-experiments tensorflow-tutorials

Last synced: 12 Nov 2024

https://github.com/MoritzLaurer/GPT-google-sheets

Code and documentation for running generative LLMs like ChatGPT or GPT-4 in Google Sheets without any coding knowledge. Transform unstructured text to structured data.

chatgpt gpt3 gpt4 nlp nlp-machine-learning

Last synced: 03 Aug 2024

https://github.com/feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

alignment audio cli forced-alignment huggingface nlp python speech speech-recognition tts

Last synced: 27 Oct 2024

https://github.com/ropensci-archive/monkeylearn

⛔ ARCHIVED ⛔ Accesses the Monkeylearn API for Text Classifiers and Extractors

classifier extractor monkeylearn nlp nlp-machine-learning peer-reviewed r r-package rstats

Last synced: 25 Oct 2024

https://github.com/IlyaGusev/tgcontest

Telegram Data Clustering contest solution by Mindful Squirrel

classification clustering cpp data-science document-similarity fasttext machine-learning nlp

Last synced: 04 Nov 2024

https://github.com/cisnlp/Glot500?tab=readme-ov-file

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL 2023)

acl dataset glot glot500 multilingual multilingual-models multilingual-nlp natural-language-processing nlp xlm xlm-r

Last synced: 05 Oct 2024

https://github.com/explosion/spacy-lookups-data

📂 Additional lookup tables and data resources for spaCy

lemmatization machine-learning natural-language-processing nlp spacy

Last synced: 07 Oct 2024

https://github.com/cambridgeltl/visual-spatial-reasoning

[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.

computer-vision multimodal-deep-learning nlp vision-and-language

Last synced: 04 Nov 2024

https://github.com/hpcaitech/CachedEmbedding

A memory efficient DLRM training solution using ColossalAI

colossal-ai deep-learning dlrm embeddings nlp pytorch recommandation-system

Last synced: 07 Nov 2024

https://github.com/deeplearningturkiye/kelime_kok_ayirici

Deep learning-based (seq2seq) web application for finding word stems in Turkish - Turkish Stemmer (tr_stemmer)

flask keras nlp python stemmer

Last synced: 09 Nov 2024

https://github.com/fdalvi/neurox

A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.

explainable-ai natural-language-processing neurons nlp nlp-machine-learning

Last synced: 07 Nov 2024

https://github.com/bjascob/pyinflect

A python module for word inflections designed for use with spaCy.

inflection nlp python spacy spacy-extension

Last synced: 07 Nov 2024

https://github.com/kyubyong/name2nat

name2nat: a Python package for nationality prediction from a name

names nationality nlp

Last synced: 10 Nov 2024

https://github.com/lonepatient/electra_pytorch

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

bert deeplearning electra glue language-model nlp pytorch

Last synced: 06 Nov 2024

https://github.com/ikegami-yukino/oseti

Dictionary based Sentiment Analysis for Japanese

japanese-language nlp sentiment-analysis sentiment-polarity

Last synced: 26 Oct 2024

https://github.com/epwalsh/nlp-models

NLP research experiments, built on PyTorch within the AllenNLP framework.

allennlp nlp pytorch pytorch-nlp

Last synced: 01 Nov 2024

https://github.com/hpprc/llm-lora-classification

Text classification using LLMs and LoRA

deep-learning llm lora nlp

Last synced: 27 Oct 2024

https://github.com/saidziani/arabic-news-article-classification

Automatic categorization of documents consists of assigning a category to a text based on the information it contains. We follow different supervised machine learning approaches.

arabic-language arabic-nlp corpora machine-learning nlp nltk python3 text-categorization

Last synced: 28 Oct 2024

https://github.com/lsys/lexicalrichness

😸 💬 A module to compute textual lexical richness (aka lexical diversity).

data-mining data-science information-retrieval lexical-analysis lexical-analyzer linguistic-analysis natural-language natural-language-processing nlp python

Last synced: 02 Nov 2024

https://github.com/qdata/lamp

ECML 2019: Graph Neural Networks for Multi-Label Classification

computer-vision graph-attention-networks graph-neural-networks multi-label-classification nlp transformers

Last synced: 12 Nov 2024