Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/pooya-mohammadi/deep_utils

An open-source toolkit which is full of handy functions, including the most used models and utilities for deep-learning practitioners!

augmentation coco computer-vision cutmix deep-learning face-detection face-recognition machine-learning modelcheckpoint nlp object-detection python pytorch senet tensorflow utils vggface2 yolov5

Last synced: 09 Nov 2024

https://github.com/tunib-ai/tunib-electra

Korean-English Bilingual Electra Models

electra nlp tunib

Last synced: 24 Jan 2025

https://github.com/etienneab3d/whispertimesync

Synchronize Whisper's timestamps over an existing accurate transcription

aligner asr nlp subtitles text-to-speech whisper

Last synced: 19 Nov 2024

https://github.com/DFKI-NLP/TRE

[AKBC 19] Improving Relation Extraction by Pre-trained Language Representations

information-extraction machine-learning multi-task-learning nlp relation-extraction transformer

Last synced: 01 Nov 2024

https://github.com/Reason-Wang/ToolGen

The official implementation of the paper "ToolGen: Unified Tool Retrieval and Calling via Generation"

agent llm nlp retrieval tool tool-learning tool-retrieval toolgen

Last synced: 09 Feb 2025

https://github.com/clipperhouse/jargon

Tokenizers and lemmatizers for Go

data-science go lemmatizer nlp tokenizer

Last synced: 14 Nov 2024

https://github.com/yohasebe/lemmatizer

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

lemmatizer nlp ruby rubynlp wordnet

Last synced: 04 Feb 2025

https://github.com/prrao87/tweet-stance-prediction

Applying NLP transfer learning techniques to predict Tweet stance toward a topic

natural-language-processing nlp openai-gpt python text-classification transfer-learning transformers ulmfit

Last synced: 16 Nov 2024

https://github.com/martinomensio/spacy-sentence-bert

Sentence transformers models for SpaCy

bert models nlp sentence-bert sentence-transformers spacy

Last synced: 10 Feb 2025

https://github.com/ahmedbesbes/media-agent

Scrape data from social media and chat with it using Langchain

langchain large-language-models llms nlp nlproc python tweepy

Last synced: 23 Nov 2024

https://github.com/textlint-rule/sentence-splitter

Split {Japanese, English} text into sentences.

english japanese javascript nlp segement sentence

Last synced: 06 Feb 2025

https://github.com/orthagonal/langchainex

Language Chain Library for Elixir

ai langchain nlp

Last synced: 01 Nov 2024

https://github.com/kororo/excelcy

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.

entity excel nlp python python3 spacy spacy-extensions spacy-nlp spacy-pipeline training xlsx

Last synced: 19 Dec 2024

https://github.com/davidberenstein1957/crosslingual-coreference

A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.

coreference coreference-resolution hacktoberfest natural-language-processing nlp python spacy

Last synced: 01 Jan 2025

https://github.com/SunLemuria/OpenGPTAndBeyond

Open efforts to implement ChatGPT-like models and beyond.

alpaca chatbot chatglm chatgpt large-language-models llm nlp openai opensource

Last synced: 06 Nov 2024

https://github.com/vspiewak/twitter-sentiment-analysis

Streaming tweets with Spark, language detection and sentiment analysis, and a Kibana dashboard

dashboard kibana nlp scala sentiment-analysis spark tiwtter

Last synced: 25 Dec 2024

https://github.com/clovaai/webvicob

Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023

document-ai icdar2023 nlp ocr

Last synced: 25 Jan 2025

https://github.com/hongzhaohua/jstarcraft-nlp

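Focuses on several core problems in natural language processing: lexical analysis, syntactic analysis, semantic analysis, language detection, information extraction, text clustering, and text classification. Provides a complete general-purpose design and reference implementation for developers in related fields. Covers a variety of NLP algorithms and adapts multiple NLP frameworks. Compatible with Lucene/Solr/ElasticSearch plugins.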

ansj corenlp elasticsearch hanlp ik java jcseg jieba language-detection lucene mmseg mynlp nlp solr thulac word

Last synced: 16 Nov 2024

https://github.com/ailln/nlp-roadmap

🗺️ A learning roadmap for natural language processing

natural-language-processing nlp roadmap sequence-labeling word-embedding word-segmentation

Last synced: 18 Jan 2025

https://github.com/awslabs/speech-representations

Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)

deep-learning nlp speech-recognition

Last synced: 25 Oct 2024

https://github.com/cy69855522/ai-paper-drawer

This project aims to collect key points of AI papers.

ai-papers cv deep-learning gans gcn gnn graph nlp rl

Last synced: 29 Jan 2025

https://github.com/SergeyShk/ruTS

A library for extracting statistics from Russian-language texts.

computational-linguistics natural-language-processing nlp russian-specific text-analytics

Last synced: 27 Nov 2024

https://github.com/graniet/rllm

Use multiple LLM backends in a single crate, simple builder-based configuration, and built-in prompt chaining & templating.

ai anthropic builder-pattern chatbot llm nlp ollama openai prompt-engineering rust rust-crate rust-library

Last synced: 09 Feb 2025

https://github.com/googlecloudplatform/dataflow-opinion-analysis

Opinion Analysis of News, Threaded Conversations, and User Generated Content

cloud-dataflow nlp sentiment-analysis

Last synced: 06 Feb 2025

https://github.com/6thsolution/apexnlp

A natural language event parser for Java and Android.

android event-parser java natural-language-processing nlp

Last synced: 20 Nov 2024

https://github.com/ben-aaron188/rgpt3

Making requests from R to the GPT models

chatgpt gpt3 llm nlp openai r

Last synced: 13 Nov 2024

https://github.com/leehanchung/lora-instruct

Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA

agi falcon gpt llama llm lora mpt nlp redpajama

Last synced: 29 Jan 2025

https://github.com/rerender2021/echo

A simple ASR translator powered by Avernakis React.

asr ave avernakis nlp offline translation

Last synced: 06 Nov 2024

https://github.com/bnosac/ruimtehol

R package to Embed All the Things! using StarSpace

classification embeddings natural-language-processing nlp r similarity starspace text-mining

Last synced: 06 Feb 2025

https://github.com/alvinwan/timefhuman

Extract datetimes and durations from natural language text as Python objects. Supports ranges, lists, and more.

date-parser datetime datetime-inputs nlp python3

Last synced: 05 Feb 2025

https://github.com/makaveli10/stockprediction_transformer

Intra day Stock Prediction 10 minutes into the future

intraday-stock-trading nlp stock-price-prediction transformer

Last synced: 27 Oct 2024

https://github.com/lonePatient/BERT-chinese-text-classification-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for text classification.

bert chinese chinese-text-classification nlp pytorch text-classification

Last synced: 02 Nov 2024

https://github.com/KudoAI/bravegpt

🦁 Brave Search add-on that brings the magic of ChatGPT to search results (powered by GPT-4!)

ai artificial-intelligence brave brave-search chat chatbot chatgpt chatgpt3 gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai search userscripts web websearch

Last synced: 08 Nov 2024

https://github.com/shivanandroy/keyphrasetransformer

KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction | Keyword extraction

keyphrase keyphrase-extraction keyphrase-extraction-algorithm keyword-extraction nlp t5 transformers

Last synced: 06 Feb 2025

https://github.com/d99kris/spacy-cpp

C++ wrapper library for the NLP library spaCy

c-plus-plus linux nlp nlp-libraries spacy

Last synced: 14 Oct 2024

https://github.com/hrzafer/nuve

Natural Language Processing Library for Turkish in C#

ngram-extraction nlp nuve turkish

Last synced: 14 Dec 2024

https://github.com/lonepatient/bert-chinese-text-classification-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for text classification.

bert chinese chinese-text-classification nlp pytorch text-classification

Last synced: 06 Nov 2024

https://github.com/harunzafer/nuve

Natural Language Processing Library for Turkish in C#

ngram-extraction nlp nuve turkish

Last synced: 12 Nov 2024

https://github.com/etherealengine/digital-beings

A platform for letting researchers connect an intelligent AI directly to real time communication networks and 3D worlds. Your AI, Anywhere.

ai artificial-intelligence bot computer-vision cv digital-beings digital-humans machine-learning ml nlp telegram

Last synced: 12 Nov 2024

https://github.com/skblaz/rakun

Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation

keyword keyword-detection keyword-extraction machine-learning nlp unsupervised-learning

Last synced: 05 Feb 2025

https://github.com/louisbrulenaudet/apple-ocr

Easy-to-Use Apple Vision wrapper for text extraction, scalar representation and clustering using K-means.

apple clustering kmeans nlp ocr ocr-recognition pyobjc python scatter-plot sklearn

Last synced: 09 Feb 2025

https://github.com/princeton-nlp/LLMBar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

evaluation llm nlp

Last synced: 10 Nov 2024

https://chats-lab.github.io/KokoMind/

KokoMind: Can LLMs Understand Social Interactions?

chatgpt deep-learning gpt-4 language-model neural-network nlp

Last synced: 18 Nov 2024

https://github.com/explosion/spacy-lookups-data

📂 Additional lookup tables and data resources for spaCy

lemmatization machine-learning natural-language-processing nlp spacy

Last synced: 09 Feb 2025

https://github.com/thunlp/multird

Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model"

nlp reverse-dictionary

Last synced: 10 Nov 2024

https://github.com/xv44586/toolkit4nlp

Transformer implementations (architecture, task examples, serving, and more)

bert keras nlp

Last synced: 19 Dec 2024

https://github.com/dengbocong/nlp-dialogue

A full-process dialogue system that can be deployed online

bot bots chatbot conversational-ai deep-learning machine-learning natural-language-processing nlp nlu

Last synced: 08 Nov 2024

https://github.com/xlang-ai/icl-selective-annotation

[ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"

active-learning in-context-learning language-model natural-language-processing nlp sample-selection

Last synced: 13 Nov 2024

https://github.com/oxford-cs-deepnlp-2017/practical-3

Oxford Deep NLP 2017 course - Practical 3: Text Classification with RNNs

deep-learning machine-learning natural-language-processing nlp oxford

Last synced: 17 Jan 2025

https://github.com/MoritzLaurer/GPT-google-sheets

Code and documentation for running generative LLMs like ChatGPT or GPT4 in google sheets without any coding knowledge. Transform unstructured text to structured data.

chatgpt gpt3 gpt4 nlp nlp-machine-learning

Last synced: 18 Nov 2024

https://github.com/adhikary97/Sharetape-Open-Source

Script that takes any long form video or podcast and outputs clips for social media

instagram-reels nlp podcast tiktok video-clipper video-clips youtube

Last synced: 20 Nov 2024

https://github.com/google-research-datasets/wiki-atomic-edits

A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.

deep-learning deep-neural-networks nlp nlp-machine-learning wikipedia

Last synced: 08 Nov 2024

https://github.com/cisnlp/Glot500?tab=readme-ov-file

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023

acl dataset glot glot500 multilingual multilingual-models multilingual-nlp natural-language-processing nlp xlm xlm-r

Last synced: 02 Feb 2025

https://github.com/JDongian/python-jamo

Hangul syllable decomposition and synthesis using jamo.

hangul korean nlp python

Last synced: 17 Nov 2024

https://github.com/mlh-fellowship/social-berterfly

Finding your MBTI personality type based on your Twitter activity using BERT

bert mbti myers-briggs nlp personality-predicting personality-profiling personality-test

Last synced: 09 Feb 2025

https://github.com/mgechev/ngx-tfjs

🤖 TensorFlow.js bindings for Angular

angular machine-learning nlp tensorflowjs

Last synced: 01 Nov 2024

https://github.com/quanta-quest/quanta-quest

AI-powered universal search for all your personal data, tailored just for you. Goal: the world's first product with "edge-side LLMs + consumer data localization" as its core development direction.

agent ai anthropic bert chatgpt claude edge-computing gpt huggingface knowledgebase language-models llm nextjs nlp personal-ass rag semantic-vector-search transformers universal-search workflow

Last synced: 06 Feb 2025

https://github.com/chunml/nlp

This is where I put all my work in Natural Language Processing

natural-language-processing nlp python tensorflow tensorflow-experiments tensorflow-tutorials

Last synced: 12 Nov 2024

https://github.com/ropensci-archive/monkeylearn

⛔ ARCHIVED ⛔ Accesses the Monkeylearn API for Text Classifiers and Extractors

classifier extractor monkeylearn nlp nlp-machine-learning peer-reviewed r r-package rstats

Last synced: 25 Oct 2024

https://github.com/o19s/hello-nlp

A natural language search microservice

elasticsearch nlp solr spacy

Last synced: 20 Nov 2024

https://github.com/feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

alignment audio cli forced-alignment huggingface nlp python speech speech-recognition tts

Last synced: 27 Oct 2024

https://github.com/hyunwoongko/nlp-datasets

Curation note of NLP datasets

dataset nlp

Last synced: 26 Jan 2025

https://github.com/IlyaGusev/tgcontest

Telegram Data Clustering contest solution by Mindful Squirrel

classification clustering cpp data-science document-similarity fasttext machine-learning nlp

Last synced: 04 Nov 2024

https://github.com/1ytic/pytorch-edit-distance

Levenshtein edit-distance on PyTorch and CUDA

asr cuda edit-distance levenshtein nlp pytorch

Last synced: 07 Feb 2025

https://github.com/ikegami-yukino/oseti

Dictionary based Sentiment Analysis for Japanese

japanese-language nlp sentiment-analysis sentiment-polarity

Last synced: 24 Jan 2025

https://github.com/yuanjie-ai/chinesesensitivevocabulary

Violence, terrorism, and contraband; pornographic text; politically sensitive content; malicious promotion; vulgar abuse

anti anti-spam antispam nlp

Last synced: 01 Jan 2025

https://github.com/microsoft/safenlp

Safety Score for Pre-Trained Language Models

ai-safety fairness-ai nlp

Last synced: 02 Feb 2025

https://github.com/foxminchan/LawKnowledge

A legal knowledge search and Q&A application based on Vietnam's Legal Code and legal document database ⚖️

generative-ai microservice natural-language-processing nlp nx searching semantic-search

Last synced: 17 Nov 2024

https://github.com/deeplearningturkiye/kelime_kok_ayirici

Deep learning-based (seq2seq) web application for finding word stems in Turkish - Turkish Stemmer (tr_stemmer)

flask keras nlp python stemmer

Last synced: 09 Nov 2024

https://github.com/bjascob/pyinflect

A python module for word inflections designed for use with spaCy.

inflection nlp python spacy spacy-extension

Last synced: 19 Dec 2024

https://github.com/kristiyanvachev/leaf-question-generation

Easy to use and understand multiple-choice question generation algorithm using T5 Transformers.

ai distractors mcq ml multiple-choice neural-networks nlp question-generation quiz sense2vec t5 test transformers

Last synced: 23 Nov 2024

https://github.com/cambridgeltl/visual-spatial-reasoning

[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.

computer-vision multimodal-deep-learning nlp vision-and-language

Last synced: 04 Nov 2024

https://github.com/botpress/v12

Botpress OSS – v12

botpress chatbot chatbots nlp v12

Last synced: 09 Feb 2025

https://github.com/hpcaitech/CachedEmbedding

A memory efficient DLRM training solution using ColossalAI

colossal-ai deep-learning dlrm embeddings nlp pytorch recommandation-system

Last synced: 07 Nov 2024

https://github.com/doc-analysis/readingbank

ReadingBank: A Benchmark Dataset for Reading Order Detection

document-ai document-intelligence document-understanding natural-language-processing nlp ocr

Last synced: 29 Jan 2025

https://github.com/fdalvi/NeuroX

A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.

explainable-ai natural-language-processing neurons nlp nlp-machine-learning

Last synced: 17 Nov 2024

https://github.com/fdalvi/neurox

A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.

explainable-ai natural-language-processing neurons nlp nlp-machine-learning

Last synced: 10 Feb 2025

https://github.com/epwalsh/nlp-models

NLP research experiments, built on PyTorch within the AllenNLP framework.

allennlp nlp pytorch pytorch-nlp

Last synced: 01 Nov 2024

https://github.com/devsinghsachan/emdr2

Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 2021)

information-retrieval natural-language-processing natural-questions nlp open-domain-qa open-domain-question-answering pytorch question-answering triviaqa webq

Last synced: 20 Dec 2024

https://github.com/kyubyong/name2nat

name2nat: a Python package for nationality prediction from a name

names nationality nlp

Last synced: 10 Nov 2024

https://github.com/hpprc/llm-lora-classification

Text classification using LLMs and LoRA

deep-learning llm lora nlp

Last synced: 27 Oct 2024

https://github.com/lonepatient/electra_pytorch

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

bert deeplearning electra glue language-model nlp pytorch

Last synced: 06 Nov 2024