Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have advanced language modeling, parsing, and other natural-language tasks.

https://github.com/hankcs/id-cnn-cws

Source code and corpora for the paper "Iterated Dilated Convolutions for Chinese Word Segmentation"

bilstm cnn crf cws nlp tensorflow

Last synced: 27 Oct 2024

https://github.com/redis-developer/redis-arxiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

arxiv arxiv-papers cohere document-retrieval document-search huggingface machine-learning nlp openai react redis vector-database vector-search

Last synced: 11 Oct 2024

https://github.com/boudinfl/ake-datasets

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

benchmarking datasets information-retrieval keyphrase-extraction keyphrase-generation keyword-extraction natural-language-processing nlp nlp-machine-learning

Last synced: 14 Oct 2024

https://github.com/hsinyuan-huang/FusionNet-NLI

An example for applying FusionNet to Natural Language Inference

deep-learning machine-comprehension nlp

Last synced: 07 Aug 2024

https://github.com/cohere-ai/sandbox-grounded-qa

A sandbox repo for grounded question answering with Cohere and Google Search

grounded-bot llm nlp question-answering search

Last synced: 07 Oct 2024

https://github.com/HKUST-KnowComp/MnemonicReader

A PyTorch implementation of Mnemonic Reader for the Machine Comprehension task

document-reader machine-comprehension mnemonic-reader nlp pytorch r-net squad

Last synced: 07 Aug 2024

https://github.com/eisenjulian/nlp_estimator_tutorial

Educational material on using the TensorFlow Estimator framework for text classification

estimator nlp tensorflow text-classification

Last synced: 03 Sep 2024

https://github.com/mallahyari/llm-hub

A curated collection of interesting applications, repos, and tutorials using large language models (LLMs) like GPT-3

chatgpt deep-learning gpt-3 gpt-4 language-model llms nlp openai

Last synced: 28 Oct 2024

https://github.com/algolisted-org/algolisted

Algolisted is an AI-powered nonprofit analytics firm dedicated to assisting computer science students in preparing for placements and internships. Our services include tracking and analytics across various platforms and topics.

ai css firebase hacktoberfest-2023 javascript mern-stack ml nlp python3 react-js web-scraping

Last synced: 10 Oct 2024

https://github.com/yutkin/lenta.ru-news-dataset

Corpus of Russian news articles collected from Lenta.Ru

asynchronous asyncio corpus dataset lenta lenta-ru news nlp parser python russian

Last synced: 05 Nov 2024

https://github.com/A-baoYang/alpaca-7b-chinese

Finetune LLaMA-7B with Chinese instruction datasets

alpaca chatgpt deep-learning fine-tuning instruction-following llm lora nlp pytorch

Last synced: 25 Oct 2024

https://github.com/Living-with-machines/DeezyMatch

A Flexible Deep Learning Approach to Fuzzy String Matching

deep-learning hacktoberfest hut23 hut23-96 machine-learning natural-language-processing nlp

Last synced: 27 Oct 2024

https://github.com/renovamen/text-classification

PyTorch implementation of some text classification models (HAN, fastText, BiLSTM-Attention, TextCNN, Transformer) | Text classification

bilstm-attention cnn document-classification fasttext han hierarchical-attention-networks lstm nlp text-classification textcnn transformer

Last synced: 10 Nov 2024

https://github.com/km1994/recommendation_advertisement_search

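A collection of introductory notes, paper study notes, and interview materials for AI fields such as natural language processing, recommender systems, and search engines (including "Things You Don't Know About NLP", "Things You Don't Know About Recommender Systems", and interview question collections for NLP, recommender systems, and search engines)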

advertisement nlp recommendation-system search-engine

Last synced: 09 Nov 2024

https://github.com/yagays/ja-timex

A rule-based analyzer that extracts and normalizes temporal expressions written in natural language

datetime nlp python regular-expression temporal time-parsing

Last synced: 06 Nov 2024

https://github.com/eugeneyan/recsys-nlp-graph

🛒 Simple recommender with matrix factorization, graph, and NLP. Beating the regular collaborative filtering baseline.

graph matrix-factorization nlp pytorch recommender-system

Last synced: 08 Nov 2024

https://github.com/farach/huggingfaceR

Hugging Face state-of-the-art models in R

huggingface nlp r rstats

Last synced: 11 Nov 2024

https://github.com/comtravo/ctparse

Parse natural language time expressions in python

machine-learning nlp python python-library regular-expression time-parsing

Last synced: 02 Aug 2024

https://github.com/omarsar/pytorch_notebooks

A collection of PyTorch notebooks for learning and practicing deep learning

ai deeplearning machine-learning nlp notebook pytorch

Last synced: 27 Oct 2024

https://github.com/daoyuanli2816/kaggle-4th-place-solution-lmsys-chatbot-arena-human-preference-predictions

4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions

arena chatbot gemma2-9b gold-medal kaggle-competition kaggle-solution llm nlp

Last synced: 08 Nov 2024

https://github.com/jieyuz2/ecoassistant

EcoAssistant: using LLM assistant more affordably and accurately

chatbot gpt large-language-models llm-inference nlp

Last synced: 30 Oct 2024

https://github.com/proycon/clam

Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.

nlp python rest webservice wrapper

Last synced: 30 Oct 2024

https://github.com/cyberzhg/keras-gpt-2

Load GPT-2 checkpoint and generate texts

gpt-2 keras language-model nlp

Last synced: 27 Sep 2024

https://github.com/dmotz/emdash

📚🧙‍♂️ Wisdom indexer — use AI to organize text snippets so you can actually remember & learn from what you read

ai books ebook ebooks elm embeddings epub kindle kindle-clippings kindle-highlights literature ml nlp notes reading semantic-search

Last synced: 30 Oct 2024

https://github.com/azure99/blossomlm

A Chinese-English bilingual conversational large language model

artificial-intelligence chatgpt large-language-models llm nlp

Last synced: 07 Nov 2024

https://github.com/noahgift/pragmaticai

[Book-2019] Pragmatic AI: An Introduction to Cloud-based Machine Learning

ai aws azure azure-cli book chalice gcp ipython jupyter-notebook machine-learning ml nlp plotly python r seaborn serverless step-functions

Last synced: 12 Oct 2024

https://github.com/microsoft/adamix

This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).

adapter bert dart glue gpt-2 nlg nlp nlu parameter-efficient pytorch roberta webnlg

Last synced: 07 Oct 2024

https://github.com/RevanthRameshkumar/CRD3

The repo containing the Critical Role Dungeons and Dragons Dataset.

acl2020 dataset dialogue-systems machine-learning nlp storytelling summarization

Last synced: 03 Nov 2024

https://github.com/AlekseyKorshuk/optimum-transformers

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

benchmark huggingface infinity natural-language-processing nlp onnx onnxruntime optimum pipeline transformers

Last synced: 07 Aug 2024

https://github.com/patil-suraj/onnx_transformers

Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.

inference nlp onnx onnxruntime transformers

Last synced: 01 Nov 2024

https://github.com/cosmoquester/2021-dialogue-summary-competition

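[2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] A repo for sharing the dialogue summarization training and inference code of team 알라꿍달라꿍, which competed in the dialogue summarization track.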

dialogue huggingface-transformers nlp pytorch-lightning summarization

Last synced: 09 Nov 2024

https://github.com/kavgan/phrase-at-scale

Detect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in languages other than English

collocation-extraction multiword-expressions multiword-extraction natural-language-processing nlp nlp-machine-learning phrase-discovery phrase-extraction pyspark spark

Last synced: 30 Oct 2024

https://github.com/hliyan/jarvis

J.A.R.V.I.S - Just Another Rudimentary Verbal Instruction Shell

chatbot cli nlp

Last synced: 31 Oct 2024

https://github.com/houbb/segment

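The jieba-analysis tool for Java. (A more flexible, elegant, easy-to-use, and high-performance Java word segmentation implementation built on the jieba dictionary. Supports part-of-speech tagging.)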

benchmark chinese dfa hmm java jieba jieba-analysis jieba-chinese nlp segment segmentation trie trie-tree

Last synced: 07 Nov 2024

https://github.com/explosion/spacy-dev-resources

💫 Scripts, tools and resources for developing spaCy

natural-language-processing nlp python spacy

Last synced: 25 Sep 2024

https://github.com/norskregnesentral/weak-supervision-for-ner

Framework to learn Named Entity Recognition models without labelled data using weak supervision.

domain-adaptation hidden-markov-models named-entity-recognition natural-language-processing nlp python spacy weak-supervision

Last synced: 25 Sep 2024

https://github.com/alisafaya/Arabic-BERT

Arabic edition of BERT pretrained language models

arabic arabic-nlp bert bert-language-models language-model nlp transformer

Last synced: 03 Aug 2024

https://github.com/minhpqn/nlp_100_drill_exercises

100 practice exercises for natural language processing

dependency-parsing exercises nlp nlp-tool

Last synced: 07 Nov 2024

https://github.com/proycon/colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing

Last synced: 12 Oct 2024

https://github.com/grid-parity-exchange/Egret

Tools for building power systems optimization problems

energy-system milp minlp nlp optimization power powerflow python snl-applications snl-science-libs

Last synced: 03 Aug 2024

https://github.com/suminb/hanja

Hangul and Hanja library

hangul hanja nlp python

Last synced: 01 Nov 2024

https://github.com/rdspring1/pytorch_gbw_lm

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

deep-learning gpu language-model lstm machine-learning nlp pytorch torch torch-gbw

Last synced: 29 Oct 2024

https://github.com/cocoa-ai/sentimentcoremldemo

😃 iOS11 demo application for sentiment polarity analysis.

coreml coreml-models ios machine-learning nlp sentiment-analysis sentiment-polarity swift swift4

Last synced: 07 Nov 2024

https://github.com/nullnull/simstring

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

nlp nlp-library python

Last synced: 05 Aug 2024

https://github.com/tiesdekok/python_nlp_tutorial

This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)

computational-linguistics natural-language-processing nlp nltk python research spacy text-mining textblob textual-analysis

Last synced: 14 Oct 2024

https://github.com/cocoa-ai/SentimentCoreMLDemo

😃 iOS11 demo application for sentiment polarity analysis.

coreml coreml-models ios machine-learning nlp sentiment-analysis sentiment-polarity swift swift4

Last synced: 03 Aug 2024

https://github.com/shjwudp/c4-dataset-script

Inspired by Google C4, this is a series of Colossal Clean data-cleaning scripts focused on CommonCrawl data processing, including Chinese data processing and the cleaning methods from MassiveText.

commoncrawl dataset massivetext nlp python spark

Last synced: 28 Oct 2024

https://github.com/graykode/toeicbert

Solving TOEIC (Test of English for International Communication) problems using the pytorch-pretrained-BERT model.

ai bert deep-learning lm mask nlp pytorch pytorch-pretrained toeic

Last synced: 01 Nov 2024

https://github.com/lonepatient/bilstm-crf-ner-pytorch

This repo contains a PyTorch implementation of a BiLSTM-CRF model for the named entity recognition task.

bilstm-crf crf lstm ner nlp pytorch

Last synced: 06 Nov 2024

https://github.com/rajat2502/StandNote

StandNote is a chrome extension that generates automated meeting minutes after every online meeting and helps save time for students and professionals.

canva chrome-extension collaborate django github gitlens icons8 inout learn meeting-summary microsoft nlp online-meetings productivity reactjs

Last synced: 06 Nov 2024

https://github.com/dlab-berkeley/R-Deep-Learning

Workshop (6 hours): Deep learning in R using Keras. Building & training deep nets, image classification, transfer learning, text analysis, visualization

biomedical cloudml deep-learning keras nlp tensorflow

Last synced: 11 Nov 2024

https://github.com/taranis-ai/taranis-ai

Taranis AI is an advanced Open-Source Intelligence (OSINT) tool, leveraging Artificial Intelligence to revolutionize information gathering and situational analysis.

artificial-intelligence cybersecurity nlp osint secops

Last synced: 26 Sep 2024

https://github.com/naver/gdc

Code accompanying our papers on the "Generative Distributional Control" framework

ai controlled-nlg exponential-family fairness-ml gpt-2 gpt3 information-geometry language-model machine-learning nlg nlp reinforcement-learning

Last synced: 08 Nov 2024

https://github.com/nschneid/amr-tutorial

Abstract Meaning Representation (AMR) tutorial slides

abstract-meaning-representation computational-linguistics nlp reference semantics tutorial

Last synced: 08 Nov 2024

https://github.com/princeton-nlp/llmbar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

evaluation llm nlp

Last synced: 11 Nov 2024

https://github.com/cohere-ai/sandbox-toy-semantic-search

A demonstration of how a toy (but usable!) semantic search engine can be quickly built using Cohere's platform.

llm nlp search semantic-se

Last synced: 07 Oct 2024

https://github.com/minibikini/paasaa

🔤 Natural language detection for Elixir

detect-language elixir language language-detection nlp

Last synced: 29 Oct 2024

https://github.com/kennethenevoldsen/asent

Asent is a python library for performing efficient and transparent sentiment analysis using spaCy.

interpretability natural-language-processing nlp python3 sentiment-analysis spacy spacy-extensions

Last synced: 31 Oct 2024

https://github.com/lan-ce-lot/pythorch-text-classification

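Text classification and sentiment analysis of Douban movie reviews: reviews are scraped from Douban with a crawler, then cleaned and tokenized; BERT, CNN, LSTM, and other models are trained, and the training process is visualized with tensorboardX. A natural language processing project for text classification, based on torch 1.7.1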

bert cnn douban lstm natural-language-processing nlp qt qt5 qt6 rnn scrapy sentiment-analysis tensorboard tensorboardx text-classification ui

Last synced: 12 Oct 2024

https://github.com/ZhixiuYe/Intra-Bag-and-Inter-Bag-Attentions

Code for NAACL 2019 paper: Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions

deeplearning distant-supervision nlp pytorch relation-extraction

Last synced: 01 Nov 2024

https://github.com/logpai/bughub

A collection of free-text bug reports for duplicate issue identification

bug-reports datasets duplicate-detection nlp

Last synced: 07 Nov 2024

https://github.com/aphp/edsnlp

Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.

clinical-data-warehouse deep-learning fast french medical multi-task nlp pytorch rule-based spacy text-mining

Last synced: 14 Oct 2024

https://github.com/McGill-NLP/weblinx

WebLINX is a benchmark for building web navigation agents with conversational capabilities

agent agents computer-vision llm multimodal navigation nlp web

Last synced: 20 Oct 2024

https://github.com/bayeru/chat-to-your-database

Chat to your database with AI. An experimental app to test the abilities of LLMs to query SQL databases using natural language.

chatgpt chatgpt-app database langchain langchain-typescript llm llms mysql natural-language-processing nlp openai postgres sql sqlite

Last synced: 10 Aug 2024

https://github.com/winkjs/wink-nlp-utils

NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.

bag-of-words natural-language-processing ngrams nlp phonetize sentence-boundary-detection stem stop-words tokenize

Last synced: 09 Nov 2024

https://github.com/shibing624/nerpy

🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition toolkit that supports models such as BertSoftmax and BertSpan and works out of the box.

bert bert-softmax bert-span named-entity-recognition ner nlp pytorch transformers

Last synced: 31 Oct 2024

https://github.com/deepset-ai/haystack-core-integrations

Additional packages (components, document stores and the like) to extend the capabilities of Haystack version 2.0 and onwards

ai haystack llm mlops nlp

Last synced: 06 Nov 2024

https://github.com/johnbumgarner/wordhoard

This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.

antonyms bag-of-words definitions dictionary homophones hypernyms hyponyms lexicon nlp python python3 synonyms text-analysis textual-analysis wordlists wordnet wordnets wordsearch

Last synced: 04 Aug 2024

https://github.com/pooya-mohammadi/deep_utils

An open-source toolkit which is full of handy functions, including the most used models and utilities for deep-learning practitioners!

augmentation coco computer-vision cutmix deep-learning face-detection face-recognition machine-learning modelcheckpoint nlp object-detection python pytorch senet tensorflow utils vggface2 yolov5

Last synced: 09 Nov 2024

https://github.com/deep-diver/en-fr-mlt-tensorflow

English-French Machine Language Translation in TensorFlow

deep-learning english-to-french machine-translation nlp tensorflow

Last synced: 01 Nov 2024

https://github.com/DFKI-NLP/TRE

[AKBC 19] Improving Relation Extraction by Pre-trained Language Representations

information-extraction machine-learning multi-task-learning nlp relation-extraction transformer

Last synced: 01 Nov 2024

https://github.com/yohasebe/lemmatizer

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

lemmatizer nlp ruby rubynlp wordnet

Last synced: 08 Nov 2024

https://github.com/Nipun1212/Claude_api

Claude_api is a Python package that provides a convenient way to interact with Claude 2 from Anthropic.

anthropic anthropic-claude claude claude-ai claude-api nlp

Last synced: 02 Aug 2024

https://github.com/proycon/flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations. A wide variety of linguistic annotation types is supported through the FoLiA paradigm.

annotation-tool clariah clarin computational-linguistics folia javascript linguistic-annotation-framework linguistics nlp python web-application

Last synced: 31 Oct 2024

https://github.com/ahmedbesbes/media-agent

Scrape data from social media and chat with it using Langchain

langchain large-language-models llms nlp nlproc python tweepy

Last synced: 06 Nov 2024

https://github.com/ahmedbesbes/twitter-agent

Scrape data from social media and chat with it using Langchain

langchain large-language-models llms nlp nlproc python tweepy

Last synced: 22 Aug 2024

https://github.com/prrao87/tweet-stance-prediction

Applying NLP transfer learning techniques to predict Tweet stance toward a topic

natural-language-processing nlp openai-gpt python text-classification transfer-learning transformers ulmfit

Last synced: 02 Nov 2024

https://github.com/clipperhouse/jargon

Tokenizers and lemmatizers for Go

data-science go lemmatizer nlp tokenizer

Last synced: 30 Oct 2024

https://github.com/textlint-rule/sentence-splitter

Split {Japanese, English} text into sentences.

english japanese javascript nlp segement sentence

Last synced: 04 Aug 2024

https://github.com/kororo/excelcy

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.

entity excel nlp python python3 spacy spacy-extensions spacy-nlp spacy-pipeline training xlsx

Last synced: 14 Oct 2024

https://github.com/orthagonal/langchainex

Language Chain Library for Elixir

ai langchain nlp

Last synced: 01 Nov 2024