Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science concerned with how computers process and analyze human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/cohere-ai/sandbox-grounded-qa

A sandbox repo for grounded question answering with Cohere and Google Search

grounded-bot llm nlp question-answering search

Last synced: 24 Jan 2025

https://github.com/eisenjulian/nlp_estimator_tutorial

Educational material on using the TensorFlow Estimator framework for text classification

estimator nlp tensorflow text-classification

Last synced: 27 Dec 2024

https://github.com/hsinyuan-huang/FusionNet-NLI

An example for applying FusionNet to Natural Language Inference

deep-learning machine-comprehension nlp

Last synced: 27 Nov 2024

https://github.com/mk2112/nn-zero-to-hero-notes

Jupyter Notebook notes on Andrej Karpathy's tutorial series, "Neural Networks: Zero to Hero."

deep-learning gpt neural-networks nlp nn-zero-to-hero pytorch

Last synced: 21 Jan 2025

https://github.com/mallahyari/llm-hub

A curated collection of interesting applications, repos, and tutorials using large language models (LLMs) like GPT-3

chatgpt deep-learning gpt-3 gpt-4 language-model llms nlp openai

Last synced: 28 Oct 2024

https://github.com/Living-with-machines/DeezyMatch

A Flexible Deep Learning Approach to Fuzzy String Matching

deep-learning hacktoberfest hut23 hut23-96 machine-learning natural-language-processing nlp

Last synced: 27 Oct 2024

https://github.com/renovamen/text-classification

PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification

bilstm-attention cnn document-classification fasttext han hierarchical-attention-networks lstm nlp text-classification textcnn transformer

Last synced: 10 Nov 2024

https://github.com/A-baoYang/alpaca-7b-chinese

Finetune LLaMA-7B with Chinese instruction datasets

alpaca chatgpt deep-learning fine-tuning instruction-following llm lora nlp pytorch

Last synced: 25 Oct 2024

https://github.com/rickiepark/nlp-with-transformers

Repository for the example code from "Natural Language Processing with Transformers" (트랜스포머를 활용한 자연어 처리).

deep-learning huggingface machine-learning natural-language-processing nlp transformers

Last synced: 26 Jan 2025

https://github.com/yutkin/lenta.ru-news-dataset

Corpus of Russian news articles collected from Lenta.Ru

asynchronous asyncio corpus dataset lenta lenta-ru news nlp parser python russian

Last synced: 05 Nov 2024

https://github.com/yagays/ja-timex

A rule-based analyzer that extracts and normalizes temporal expressions written in natural language

datetime nlp python regular-expression temporal time-parsing

Last synced: 06 Nov 2024

https://github.com/grid-parity-exchange/Egret

Tools for building power systems optimization problems

energy-system milp minlp nlp optimization power powerflow python snl-applications snl-science-libs

Last synced: 14 Nov 2024

https://github.com/jieyuz2/ecoassistant

EcoAssistant: using LLM assistants more affordably and accurately

chatbot gpt large-language-models llm-inference nlp

Last synced: 29 Dec 2024

https://github.com/comtravo/ctparse

Parse natural language time expressions in Python

machine-learning nlp python python-library regular-expression time-parsing

Last synced: 11 Nov 2024

https://github.com/dmotz/emdash

📚🧙‍♂️ Wisdom indexer — use AI to organize text snippets so you can actually remember & learn from what you read

ai books ebook ebooks elm embeddings epub kindle kindle-clippings kindle-highlights literature ml nlp notes reading semantic-search

Last synced: 20 Jan 2025

https://github.com/farach/huggingfaceR

Hugging Face state-of-the-art models in R

huggingface nlp r rstats

Last synced: 11 Nov 2024

https://github.com/omarsar/pytorch_notebooks

A collection of PyTorch notebooks for learning and practicing deep learning

ai deeplearning machine-learning nlp notebook pytorch

Last synced: 27 Oct 2024

https://github.com/daoyuanli2816/kaggle-4th-place-solution-lmsys-chatbot-arena-human-preference-predictions

4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions

arena chatbot gemma2-9b gold-medal kaggle-competition kaggle-solution llm nlp

Last synced: 08 Nov 2024

https://github.com/proycon/clam

Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.

nlp python rest webservice wrapper

Last synced: 20 Jan 2025

https://github.com/adamlui/chatgpt-widescreen

🖥️ Adds Widescreen + Fullscreen modes to chatgpt.com + perplexity.ai + poe.com for enhanced viewing + reduced scrolling

ai artificial-intelligence chat chatbot chatgpt chatgpt3 chrome-extension gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai ui userscripts ux widescreen

Last synced: 26 Jan 2025

https://github.com/deepset-ai/haystack-core-integrations

Additional packages (components, document stores and the like) to extend the capabilities of Haystack version 2.0 and onwards

ai haystack llm mlops nlp

Last synced: 24 Jan 2025

https://github.com/cyberzhg/keras-gpt-2

Load GPT-2 checkpoint and generate texts

gpt-2 keras language-model nlp

Last synced: 21 Jan 2025

https://github.com/noahgift/pragmaticai

[Book-2019] Pragmatic AI: An Introduction to Cloud-based Machine Learning

ai aws azure azure-cli book chalice gcp ipython jupyter-notebook machine-learning ml nlp plotly python r seaborn serverless step-functions

Last synced: 12 Oct 2024

https://github.com/moritzlaurer/gpt-google-sheets

Code and documentation for running generative LLMs like ChatGPT or GPT-4 in Google Sheets without any coding knowledge. Transform unstructured text to structured data.

chatgpt gpt3 gpt4 nlp nlp-machine-learning

Last synced: 28 Nov 2024

https://github.com/microsoft/adamix

This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).

adapter bert dart glue gpt-2 nlg nlp nlu parameter-efficient pytorch roberta webnlg

Last synced: 23 Jan 2025

https://github.com/patil-suraj/onnx_transformers

Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.

inference nlp onnx onnxruntime transformers

Last synced: 18 Jan 2025

https://github.com/AlekseyKorshuk/optimum-transformers

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

benchmark huggingface infinity natural-language-processing nlp onnx onnxruntime optimum pipeline transformers

Last synced: 27 Nov 2024

https://github.com/cosmoquester/2021-dialogue-summary-competition

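[2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] Repository for sharing the dialogue summarization training and inference code of team 알라꿍달라꿍 in the dialogue summarization track.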

dialogue huggingface-transformers nlp pytorch-lightning summarization

Last synced: 09 Nov 2024

https://github.com/alisafaya/Arabic-BERT

Arabic edition of BERT pretrained language models

arabic arabic-nlp bert bert-language-models language-model nlp transformer

Last synced: 14 Nov 2024

https://github.com/RevanthRameshkumar/CRD3

The repo containing the Critical Role Dungeons and Dragons Dataset.

acl2020 dataset dialogue-systems machine-learning nlp storytelling summarization

Last synced: 03 Nov 2024

https://github.com/proycon/colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing

Last synced: 21 Jan 2025

https://github.com/hliyan/jarvis

J.A.R.V.I.S - Just Another Rudimentary Verbal Instruction Shell

chatbot cli nlp

Last synced: 31 Oct 2024

https://github.com/mihaiii/semantic-autocomplete

A blazing-fast semantic search React component. Match by meaning, not just by letters. Search as you type without waiting (no debounce needed). Rank by cosine similarity.

cosine-similarity material-ui nlp nlp-machine-learning react semantic semantic-search

Last synced: 20 Jan 2025

https://github.com/explosion/spacy-dev-resources

💫 Scripts, tools and resources for developing spaCy

natural-language-processing nlp python spacy

Last synced: 17 Jan 2025

https://github.com/kavgan/phrase-at-scale

Detect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in languages other than English

collocation-extraction multiword-expressions multiword-extraction natural-language-processing nlp nlp-machine-learning phrase-discovery phrase-extraction pyspark spark

Last synced: 30 Oct 2024

https://github.com/minhpqn/nlp_100_drill_exercises

100 natural language processing exercises

dependency-parsing exercises nlp nlp-tool

Last synced: 07 Nov 2024

https://github.com/norskregnesentral/weak-supervision-for-ner

Framework to learn Named Entity Recognition models without labelled data using weak supervision.

domain-adaptation hidden-markov-models named-entity-recognition natural-language-processing nlp python spacy weak-supervision

Last synced: 17 Jan 2025

https://github.com/suminb/hanja

Hangul and Hanja library

hangul hanja nlp python

Last synced: 20 Jan 2025

https://github.com/rdspring1/pytorch_gbw_lm

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

deep-learning gpu language-model lstm machine-learning nlp pytorch torch torch-gbw

Last synced: 29 Oct 2024

https://github.com/rajat2502/standnote

StandNote is a chrome extension that generates automated meeting minutes after every online meeting and helps save time for students and professionals.

canva chrome-extension collaborate django github gitlens icons8 inout learn meeting-summary microsoft nlp online-meetings productivity reactjs

Last synced: 06 Dec 2024

https://github.com/zhongkaifu/crfsharp

CRFSharp is a .NET (C#) implementation of Conditional Random Fields, a machine learning algorithm for learning from labeled sequences of examples.

c-sharp crf dotnet machine-learning nlp

Last synced: 20 Nov 2024

https://github.com/nullnull/simstring

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

nlp nlp-library python

Last synced: 22 Nov 2024

https://github.com/johnbumgarner/wordhoard

This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.

antonyms bag-of-words definitions dictionary homophones hypernyms hyponyms lexicon nlp python python3 synonyms text-analysis textual-analysis wordlists wordnet wordnets wordsearch

Last synced: 20 Nov 2024

https://github.com/cocoa-ai/SentimentCoreMLDemo

😃 iOS11 demo application for sentiment polarity analysis.

coreml coreml-models ios machine-learning nlp sentiment-analysis sentiment-polarity swift swift4

Last synced: 17 Nov 2024

https://github.com/tiesdekok/python_nlp_tutorial

This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)

computational-linguistics natural-language-processing nlp nltk python research spacy text-mining textblob textual-analysis

Last synced: 14 Oct 2024

https://github.com/lonepatient/bilstm-crf-ner-pytorch

This repo contains a PyTorch implementation of a BiLSTM-CRF model for named entity recognition task.

bilstm-crf crf lstm ner nlp pytorch

Last synced: 06 Nov 2024

https://github.com/graykode/toeicbert

Solving TOEIC (Test of English for International Communication) problems using the pytorch-pretrained-BERT model.

ai bert deep-learning lm mask nlp pytorch pytorch-pretrained toeic

Last synced: 01 Nov 2024

https://github.com/shjwudp/c4-dataset-script

Inspired by Google's C4, this is a series of Colossal Clean data cleaning scripts focused on CommonCrawl data processing, including Chinese data processing and the cleaning methods from MassiveText.

commoncrawl dataset massivetext nlp python spark

Last synced: 02 Dec 2024

https://github.com/aphp/edsnlp

Modular, fast NLP framework, compatible with PyTorch and spaCy, offering tailored support for French clinical notes.

clinical-data-warehouse deep-learning fast french medical multi-task nlp pytorch rule-based spacy text-mining

Last synced: 26 Jan 2025

https://github.com/princeton-nlp/llmbar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

evaluation llm nlp

Last synced: 06 Jan 2025

https://github.com/dlab-berkeley/R-Deep-Learning

Workshop (6 hours): Deep learning in R using Keras. Building & training deep nets, image classification, transfer learning, text analysis, visualization

biomedical cloudml deep-learning keras nlp tensorflow

Last synced: 11 Nov 2024

https://github.com/naver/gdc

Code accompanying our papers on the "Generative Distributional Control" framework

ai controlled-nlg exponential-family fairness-ml gpt-2 gpt3 information-geometry language-model machine-learning nlg nlp reinforcement-learning

Last synced: 08 Nov 2024

https://github.com/kennethenevoldsen/asent

Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.

interpretability natural-language-processing nlp python3 sentiment-analysis spacy spacy-extensions

Last synced: 20 Jan 2025

https://github.com/bayeru/chat-to-your-database

Chat to your database with AI. An experimental app to test the abilities of LLMs to query SQL databases using natural language.

chatgpt chatgpt-app database langchain langchain-typescript llm llms mysql natural-language-processing nlp openai postgres sql sqlite

Last synced: 30 Nov 2024

https://github.com/minibikini/paasaa

🔤 Natural language detection for Elixir

detect-language elixir language language-detection nlp

Last synced: 26 Jan 2025

https://github.com/nschneid/amr-tutorial

Abstract Meaning Representation (AMR) tutorial slides

abstract-meaning-representation computational-linguistics nlp reference semantics tutorial

Last synced: 30 Dec 2024

https://github.com/cohere-ai/sandbox-toy-semantic-search

A demonstration of how a toy (but usable!) semantic search engine can be quickly built using Cohere's platform.

llm nlp search semantic-se

Last synced: 24 Jan 2025

https://github.com/salvatorera/ml-news-of-the-week

A collection of the best ML and AI news every week (research, news, resources)

agents ai artificial-intelligence computer-vision llms machine-learning nlp python rag retrieval-augmented-generation transformer

Last synced: 21 Jan 2025

https://github.com/shibing624/nerpy

🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit that supports BertSoftmax, BertSpan, and other models, ready to use out of the box.

bert bert-softmax bert-span named-entity-recognition ner nlp pytorch transformers

Last synced: 24 Jan 2025

https://github.com/clovaai/focusseq2seq

[EMNLP 2019] Mixture Content Selection for Diverse Sequence Generation (Question Generation / Abstractive Summarization)

emnlp2019 generation nlp pytorch question-generation summarization

Last synced: 25 Jan 2025

https://github.com/logpai/bughub

A collection of free-text bug reports for duplicate issue identification

bug-reports datasets duplicate-detection nlp

Last synced: 29 Dec 2024

https://github.com/oxford-cs-deepnlp-2017/practical-2

Oxford Deep NLP 2017 course - Practical 2: Text Classification

deep-learning machine-learning natural-language-processing nlp oxford

Last synced: 17 Jan 2025

https://github.com/ZhixiuYe/Intra-Bag-and-Inter-Bag-Attentions

Code for NAACL 2019 paper: Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions

deeplearning distant-supervision nlp pytorch relation-extraction

Last synced: 01 Nov 2024

https://github.com/Nipun1212/Claude_api

Claude_api is a Python package that provides a convenient way to interact with Claude 2 from Anthropic.

anthropic anthropic-claude claude claude-ai claude-api nlp

Last synced: 11 Nov 2024

https://github.com/lan-ce-lot/pythorch-text-classification

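Text classification and sentiment analysis of Douban movie reviews: reviews are scraped from Douban with a crawler, then cleaned and tokenized; BERT, CNN, LSTM, and other models are trained, and the training process is visualized with tensorboardX. A natural language processing project for text classification, based on torch 1.7.1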

bert cnn douban lstm natural-language-processing nlp qt qt5 qt6 rnn scrapy sentiment-analysis tensorboard tensorboardx text-classification ui

Last synced: 12 Oct 2024

https://github.com/McGill-NLP/weblinx

WebLINX is a benchmark for building web navigation agents with conversational capabilities

agent agents computer-vision llm multimodal navigation nlp web

Last synced: 20 Oct 2024

https://mcgill-nlp.github.io/weblinx/

WebLINX is a benchmark for building web navigation agents with conversational capabilities

agent agents computer-vision llm multimodal navigation nlp web

Last synced: 17 Nov 2024

https://github.com/pkshatechnology-research/tdmelodic

A Japanese accent dictionary generator

accent japanese nlp speech-synthesis

Last synced: 27 Dec 2024

https://github.com/winkjs/wink-nlp-utils

NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.

bag-of-words natural-language-processing ngrams nlp phonetize sentence-boundary-detection stem stop-words tokenize

Last synced: 21 Jan 2025

https://github.com/proycon/flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.

annotation-tool clariah clarin computational-linguistics folia javascript linguistic-annotation-framework linguistics nlp python web-application

Last synced: 22 Jan 2025

https://github.com/bukosabino/justicio

Building an assistant for Boletin Oficial del Estado (BOE) using Retrieval Augmented Generation (RAG)

legal legaltech nlp spanish

Last synced: 25 Jan 2025

https://github.com/deep-diver/en-fr-mlt-tensorflow

English-French Machine Language Translation in TensorFlow

deep-learning english-to-french machine-translation nlp tensorflow

Last synced: 18 Jan 2025

https://github.com/jmisilo/clip-gpt-captioning

CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.

computer-vision cv deep-learning image-caption image-caption-generator image-captioning machine-learning nlp python pytorch

Last synced: 17 Dec 2024

https://github.com/deepset-ai/haystack-demos

Fully working applications that demonstrate how to use Haystack to implement common NLP use cases

nlp python question-answering semantic-search

Last synced: 20 Jan 2025

https://github.com/pooya-mohammadi/deep_utils

An open-source toolkit which is full of handy functions, including the most used models and utilities for deep-learning practitioners!

augmentation coco computer-vision cutmix deep-learning face-detection face-recognition machine-learning modelcheckpoint nlp object-detection python pytorch senet tensorflow utils vggface2 yolov5

Last synced: 09 Nov 2024

https://github.com/tunib-ai/tunib-electra

Korean-English Bilingual Electra Models

electra nlp tunib

Last synced: 24 Jan 2025

https://github.com/etienneab3d/whispertimesync

Synchronize Whisper's timestamps over an existing accurate transcription

aligner asr nlp subtitles text-to-speech whisper

Last synced: 19 Nov 2024

https://github.com/clipperhouse/jargon

Tokenizers and lemmatizers for Go

data-science go lemmatizer nlp tokenizer

Last synced: 14 Nov 2024

https://github.com/yohasebe/lemmatizer

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

lemmatizer nlp ruby rubynlp wordnet

Last synced: 20 Jan 2025