Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies the interaction between computers and human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/cluebenchmark/dataclue

DataCLUE: A data-centric NLP benchmark and toolkit

ai chinese classification-algorithm data-centric human-in-the-loop nlp

Last synced: 09 Nov 2024

https://github.com/erfanzar/EasyDeL

EasyDeL is an open-source library that makes training faster and more optimized, with options for training and serving in both Python and Mojo 🔥

easydel flax gpt jax machine-learning mojo nlp optax pytorch transformers

Last synced: 03 Aug 2024

https://github.com/EmilHvitfeldt/R-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 05 Aug 2024

https://github.com/rocketchat/hubot-natural

Natural Language Processing Chatbot for RocketChat

chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot

Last synced: 29 Oct 2024

https://github.com/alisonmitchell/stock-prediction

Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.

beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance

Last synced: 07 Nov 2024

https://github.com/Planeshifter/text-miner

text mining utilities for Node.js

nlp text-mining

Last synced: 10 Nov 2024

https://github.com/kudoai/duckduckgpt

🐤 DuckDuckGo add-on that brings the magic of ChatGPT to search results (powered by GPT-4!)

ai artificial-intelligence bot chatbot chatgpt chatgpt3 ddg duckduckgo gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai search userscripts web

Last synced: 12 Oct 2024

https://github.com/ofa-sys/ofasys

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

audio computer-vision deep-learning motion multimodal-learning multitask-learning nlp pretrained-models pytorch transformers vision-and-language

Last synced: 10 Oct 2024

https://github.com/thunlp/OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 03 Aug 2024

https://github.com/ianycxu/GCN-with-BERT

Graph Convolutional Networks (GCN) with BERT for Coreference Resolution Task [Pytorch][DGL]

bert bert-model coreference-resolution gcn gnn graph-convolutional-networks graph-neural-networks nlp pytorch

Last synced: 02 Nov 2024

https://github.com/stanfordnlp/stanza-old

Stanford NLP group's shared Python tools.

natural-language-processing nlp python text-analysis text-processing

Last synced: 08 Nov 2024

https://github.com/MxDkl/pls

CLI to convert natural language to terminal commands

chatgpt cli llm nlp openai terminal

Last synced: 06 Nov 2024

https://github.com/amaiya/causalnlp

CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.

causal-inference nlp

Last synced: 12 Nov 2024

https://github.com/km1994/recommendation_advertisement_search

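A collection of introductory notes, paper-reading notes, and interview materials for AI fields such as natural language processing, recommender systems, and search engines (Things You Don't Know About NLP, Things You Don't Know About Recommender Systems, NLP Interview Q&A, Recommender Systems Interview Q&A, Search Engine Interview Q&A)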

advertisement nlp recommendation-system search-engine

Last synced: 14 Nov 2024

https://github.com/datquocnguyen/RDRPOSTagger

A fast and accurate POS and morphological tagging toolkit (EACL 2014)

java nlp part-of-speech-tagger pos-tagger pos-tagging python3

Last synced: 30 Oct 2024

https://github.com/redis-developer/redis-arxiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

arxiv arxiv-papers cohere document-retrieval document-search huggingface machine-learning nlp openai react redis vector-database vector-search

Last synced: 13 Nov 2024

https://github.com/arian-askari/ChatGPT-RetrievalQA-CIKM2023

A dataset for training and evaluating question-answering retrieval models on ChatGPT responses, with the option of training and evaluating on real human responses.

ai chatgpt chatgpt-information-retrieval chatgpt-ir data-augmentation dataset deep-learning gpt-3 gpt2 gpt3 information-retrieval information-retrieval-chatgpt ir ir-chatgpt machine-learning nlp openai python sequence-to-sequence text-retrieval

Last synced: 30 Oct 2024

https://github.com/hankcs/id-cnn-cws

Source code and corpora for the paper "Iterated Dilated Convolutions for Chinese Word Segmentation"

bilstm cnn crf cws nlp tensorflow

Last synced: 27 Oct 2024

https://github.com/boudinfl/ake-datasets

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

benchmarking datasets information-retrieval keyphrase-extraction keyphrase-generation keyword-extraction natural-language-processing nlp nlp-machine-learning

Last synced: 14 Oct 2024

https://github.com/cohere-ai/sandbox-grounded-qa

A sandbox repo for grounded question answering with Cohere and Google Search

grounded-bot llm nlp question-answering search

Last synced: 07 Oct 2024

https://github.com/hsinyuan-huang/FusionNet-NLI

An example for applying FusionNet to Natural Language Inference

deep-learning machine-comprehension nlp

Last synced: 07 Aug 2024

https://github.com/HKUST-KnowComp/MnemonicReader

A PyTorch implementation of Mnemonic Reader for the Machine Comprehension task

document-reader machine-comprehension mnemonic-reader nlp pytorch r-net squad

Last synced: 07 Aug 2024

https://github.com/eisenjulian/nlp_estimator_tutorial

Educational material on using the TensorFlow Estimator framework for text classification

estimator nlp tensorflow text-classification

Last synced: 03 Sep 2024

https://github.com/amazon-science/refined

ReFinED is an efficient and accurate entity linking (EL) system.

entity-extraction entity-linking entity-resolution nlp pytorch

Last synced: 12 Nov 2024

https://github.com/mallahyari/llm-hub

A curated collection of interesting applications, repos, and tutorials using large language models (LLMs) like GPT-3

chatgpt deep-learning gpt-3 gpt-4 language-model llms nlp openai

Last synced: 28 Oct 2024

https://github.com/algolisted-org/algolisted

Algolisted is an AI-powered nonprofit analytics firm dedicated to assisting computer science students in preparing for placements and internships. Our services include tracking and analytics across various platforms and topics.

ai css firebase hacktoberfest-2023 javascript mern-stack ml nlp python3 react-js web-scraping

Last synced: 10 Oct 2024

https://github.com/Living-with-machines/DeezyMatch

A Flexible Deep Learning Approach to Fuzzy String Matching

deep-learning hacktoberfest hut23 hut23-96 machine-learning natural-language-processing nlp

Last synced: 27 Oct 2024

https://github.com/A-baoYang/alpaca-7b-chinese

Finetune LLaMA-7B with Chinese instruction datasets

alpaca chatgpt deep-learning fine-tuning instruction-following llm lora nlp pytorch

Last synced: 25 Oct 2024

https://github.com/renovamen/text-classification

PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification

bilstm-attention cnn document-classification fasttext han hierarchical-attention-networks lstm nlp text-classification textcnn transformer

Last synced: 10 Nov 2024

https://github.com/yutkin/lenta.ru-news-dataset

Corpus of Russian news articles collected from Lenta.Ru

asynchronous asyncio corpus dataset lenta lenta-ru news nlp parser python russian

Last synced: 05 Nov 2024

https://github.com/eugeneyan/recsys-nlp-graph

🛒 Simple recommender with matrix factorization, graph, and NLP. Beating the regular collaborative filtering baseline.

graph matrix-factorization nlp pytorch recommender-system

Last synced: 08 Nov 2024

https://github.com/yagays/ja-timex

A rule-based analyzer that extracts and normalizes temporal expressions written in natural language

datetime nlp python regular-expression temporal time-parsing

Last synced: 06 Nov 2024

https://github.com/farach/huggingfaceR

Hugging Face state-of-the-art models in R

huggingface nlp r rstats

Last synced: 11 Nov 2024

https://github.com/grid-parity-exchange/Egret

Tools for building power systems optimization problems

energy-system milp minlp nlp optimization power powerflow python snl-applications snl-science-libs

Last synced: 14 Nov 2024

https://github.com/comtravo/ctparse

Parse natural language time expressions in Python

machine-learning nlp python python-library regular-expression time-parsing

Last synced: 11 Nov 2024

https://github.com/omarsar/pytorch_notebooks

A collection of PyTorch notebooks for learning and practicing deep learning

ai deeplearning machine-learning nlp notebook pytorch

Last synced: 27 Oct 2024

https://github.com/dbklim/rnnoise_wrapper

A simple Python wrapper for the RNNoise audio noise reduction library. It simplifies working with RNNoise and adds new trained models and detailed instructions for training.

audio audio-processing denoise denoiser denoising dsp ml nlp noise noise-algorithms noise-reduction noise-suppression python-wrapper rnn rnnoise rnnoise-training rnnoise-wrapper rtc wav

Last synced: 11 Nov 2024

https://github.com/proycon/clam

Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.

nlp python rest webservice wrapper

Last synced: 13 Nov 2024

https://github.com/daoyuanli2816/kaggle-4th-place-solution-lmsys-chatbot-arena-human-preference-predictions

4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions

arena chatbot gemma2-9b gold-medal kaggle-competition kaggle-solution llm nlp

Last synced: 08 Nov 2024

https://github.com/jieyuz2/ecoassistant

EcoAssistant: using LLM assistant more affordably and accurately

chatbot gpt large-language-models llm-inference nlp

Last synced: 30 Oct 2024

https://github.com/cyberzhg/keras-gpt-2

Load GPT-2 checkpoint and generate texts

gpt-2 keras language-model nlp

Last synced: 27 Sep 2024

https://github.com/azure99/blossomlm

A Chinese-English bilingual conversational large language model

artificial-intelligence chatgpt large-language-models llm nlp

Last synced: 14 Nov 2024

https://github.com/dmotz/emdash

📚🧙‍♂️ Wisdom indexer — use AI to organize text snippets so you can actually remember & learn from what you read

ai books ebook ebooks elm embeddings epub kindle kindle-clippings kindle-highlights literature ml nlp notes reading semantic-search

Last synced: 13 Nov 2024

https://github.com/noahgift/pragmaticai

[Book-2019] Pragmatic AI: An Introduction to Cloud-based Machine Learning

ai aws azure azure-cli book chalice gcp ipython jupyter-notebook machine-learning ml nlp plotly python r seaborn serverless step-functions

Last synced: 12 Oct 2024

https://github.com/cosmoquester/2021-dialogue-summary-competition

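[2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] A repo for sharing the dialogue summarization training and inference code of team "알라꿍달라꿍" in the dialogue summarization track.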

dialogue huggingface-transformers nlp pytorch-lightning summarization

Last synced: 09 Nov 2024

https://github.com/alisafaya/Arabic-BERT

Arabic edition of BERT pretrained language models

arabic arabic-nlp bert bert-language-models language-model nlp transformer

Last synced: 14 Nov 2024

https://github.com/AlekseyKorshuk/optimum-transformers

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

benchmark huggingface infinity natural-language-processing nlp onnx onnxruntime optimum pipeline transformers

Last synced: 07 Aug 2024

https://github.com/microsoft/adamix

This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).

adapter bert dart glue gpt-2 nlg nlp nlu parameter-efficient pytorch roberta webnlg

Last synced: 07 Oct 2024

https://github.com/patil-suraj/onnx_transformers

Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.

inference nlp onnx onnxruntime transformers

Last synced: 01 Nov 2024

https://github.com/RevanthRameshkumar/CRD3

The repo containing the Critical Role Dungeons and Dragons Dataset.

acl2020 dataset dialogue-systems machine-learning nlp storytelling summarization

Last synced: 03 Nov 2024

https://github.com/hliyan/jarvis

J.A.R.V.I.S - Just Another Rudimentary Verbal Instruction Shell

chatbot cli nlp

Last synced: 31 Oct 2024

https://github.com/houbb/segment

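The jieba-analysis tool for Java. (A more flexible, elegant, easy-to-use, and high-performance Java word segmentation implementation based on the jieba dictionary. Supports part-of-speech tagging.)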

benchmark chinese dfa hmm java jieba jieba-analysis jieba-chinese nlp segment segmentation trie trie-tree

Last synced: 07 Nov 2024

https://github.com/kavgan/phrase-at-scale

Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary size. Can be used with languages other than English.

collocation-extraction multiword-expressions multiword-extraction natural-language-processing nlp nlp-machine-learning phrase-discovery phrase-extraction pyspark spark

Last synced: 30 Oct 2024

https://github.com/explosion/spacy-dev-resources

💫 Scripts, tools and resources for developing spaCy

natural-language-processing nlp python spacy

Last synced: 25 Sep 2024

https://github.com/norskregnesentral/weak-supervision-for-ner

Framework to learn Named Entity Recognition models without labelled data using weak supervision.

domain-adaptation hidden-markov-models named-entity-recognition natural-language-processing nlp python spacy weak-supervision

Last synced: 25 Sep 2024

https://github.com/minhpqn/nlp_100_drill_exercises

100 practice exercises in natural language processing

dependency-parsing exercises nlp nlp-tool

Last synced: 07 Nov 2024

https://github.com/proycon/colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing

Last synced: 12 Oct 2024

https://github.com/suminb/hanja

Hangul and Hanja library

hangul hanja nlp python

Last synced: 01 Nov 2024

https://github.com/rdspring1/pytorch_gbw_lm

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

deep-learning gpu language-model lstm machine-learning nlp pytorch torch torch-gbw

Last synced: 29 Oct 2024

https://github.com/tiesdekok/python_nlp_tutorial

This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)

computational-linguistics natural-language-processing nlp nltk python research spacy text-mining textblob textual-analysis

Last synced: 14 Oct 2024

https://github.com/cocoa-ai/sentimentcoremldemo

😃 iOS 11 demo application for sentiment polarity analysis.

coreml coreml-models ios machine-learning nlp sentiment-analysis sentiment-polarity swift swift4

Last synced: 07 Nov 2024

https://github.com/nullnull/simstring

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

nlp nlp-library python

Last synced: 05 Aug 2024

https://github.com/shjwudp/c4-dataset-script

Inspired by Google's C4 (Colossal Clean Crawled Corpus), this is a series of data cleaning scripts focused on CommonCrawl data processing, including Chinese data processing and the cleaning methods from MassiveText.

commoncrawl dataset massivetext nlp python spark

Last synced: 28 Oct 2024

https://github.com/graykode/toeicbert

Solving TOEIC (Test of English for International Communication) questions using the pytorch-pretrained-BERT model.

ai bert deep-learning lm mask nlp pytorch pytorch-pretrained toeic

Last synced: 01 Nov 2024

https://github.com/lonepatient/bilstm-crf-ner-pytorch

This repo contains a PyTorch implementation of a BiLSTM-CRF model for the named entity recognition task.

bilstm-crf crf lstm ner nlp pytorch

Last synced: 06 Nov 2024

https://github.com/taranis-ai/taranis-ai

Taranis AI is an advanced Open-Source Intelligence (OSINT) tool, leveraging Artificial Intelligence to revolutionize information gathering and situational analysis.

artificial-intelligence cybersecurity nlp osint secops

Last synced: 26 Sep 2024

https://github.com/dlab-berkeley/R-Deep-Learning

Workshop (6 hours): Deep learning in R using Keras. Building & training deep nets, image classification, transfer learning, text analysis, visualization

biomedical cloudml deep-learning keras nlp tensorflow

Last synced: 11 Nov 2024

https://github.com/rajat2502/StandNote

StandNote is a Chrome extension that generates automated meeting minutes after every online meeting and helps save time for students and professionals.

canva chrome-extension collaborate django github gitlens icons8 inout learn meeting-summary microsoft nlp online-meetings productivity reactjs

Last synced: 06 Nov 2024

https://github.com/naver/gdc

Code accompanying our papers on the "Generative Distributional Control" framework

ai controlled-nlg exponential-family fairness-ml gpt-2 gpt3 information-geometry language-model machine-learning nlg nlp reinforcement-learning

Last synced: 08 Nov 2024

https://github.com/minibikini/paasaa

🔤 Natural language detection for Elixir

detect-language elixir language language-detection nlp

Last synced: 12 Nov 2024

https://github.com/nschneid/amr-tutorial

Abstract Meaning Representation (AMR) tutorial slides

abstract-meaning-representation computational-linguistics nlp reference semantics tutorial

Last synced: 08 Nov 2024

https://github.com/kennethenevoldsen/asent

Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.

interpretability natural-language-processing nlp python3 sentiment-analysis spacy spacy-extensions

Last synced: 14 Nov 2024