Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/akoumjian/datefinder

Find dates inside text using Python and get back datetime objects

datetime nlp parser

Last synced: 02 Aug 2024

https://github.com/Liquid-Legal-Institute/Legal-Text-Analytics

A list of selected resources, methods, and tools dedicated to Legal Text Analytics.

legal legal-text-analytics nlp

Last synced: 07 Nov 2024

https://github.com/michaelthwan/searchgpt

Grounded search engine (i.e. with source reference) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search etc.

ai chatgpt grounded-api grounded-bot language-model llm machine-learning nlp nlp-machine-learning openai python retrieval retrieval-model

Last synced: 09 Nov 2024

https://github.com/smoothnlp/SmoothNLP

Focused on explainable NLP techniques: An NLP Toolset With A Focus on Explainable Inference

depedency-parsing nlp nlp-pipeline postagging python tokenizer

Last synced: 03 Aug 2024

https://github.com/huggingface/dataset-viewer

Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub

api-rest data datasets huggingface machine-learning nlp

Last synced: 09 Aug 2024

https://github.com/blacksamorez/tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training & inference

deep-learning machine-learning natural-language-processing nlp python pytorch pytorch-transformers

Last synced: 05 Nov 2024

https://github.com/ymcui/MacBERT

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

bert language-model macbert nlp pytorch tensorflow transformers

Last synced: 03 Aug 2024

https://github.com/zkywsg/daily-deeplearning

🔥 Machine learning / deep learning / Python / algorithm interviews / NLP tutorials / 剑指offer (Coding Interviews)

deep-learning leetcode leetcode-python leetcode-solutions machine-learning nlp python pytorch pytorch-nlp pytorch-tutorial pytorch-tutorials tensorflow tensorflow-examples tensorflow-tutorials

Last synced: 10 Oct 2024

https://github.com/thunlp/openhownet

Core Data of HowNet and OpenHowNet Python API

hownet knowledge-base nlp openhownet semantics sememe

Last synced: 10 Nov 2024

https://github.com/gutfeeling/word_forms

Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate", etc.

adjective adverb dictionary lemmatizer natural-language-processing nlp noun parts-of-speech stemmer verb-conjugations wordnet words

Last synced: 30 Oct 2024

https://github.com/princeton-nlp/densephrases

[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624

information-retrieval knowledge-base nlp open-domain-qa passage-retrieval slot-filling

Last synced: 11 Nov 2024

https://github.com/BlackSamorez/tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training & inference

deep-learning machine-learning natural-language-processing nlp python pytorch pytorch-transformers

Last synced: 09 Aug 2024

https://github.com/princeton-nlp/DensePhrases

[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624

information-retrieval knowledge-base nlp open-domain-qa passage-retrieval slot-filling

Last synced: 03 Nov 2024

https://github.com/samtecspg/articulate

A platform for building conversational interfaces with intelligent agents (chatbots)

chatbot nlp nlu react

Last synced: 29 Oct 2024

https://github.com/mit-han-lab/lite-transformer

[ICLR 2020] Lite Transformer with Long-Short Range Attention

nlp pytorch transformer

Last synced: 03 Aug 2024

https://github.com/bminixhofer/nlprule

A fast, low-resource Natural Language Processing and Text Correction library written in Rust.

grammar grammatical-error-correction machine-learning natural-language-processing nlp proofreading rust spellcheck style-checker

Last synced: 28 Oct 2024

https://github.com/bakwc/JamSpell

Modern spell checking library - accurate, fast, multi-language

cpp csharp java ngrams nlp python ruby spellcheck spellchecker spelling-correction

Last synced: 13 Nov 2024

https://github.com/ChenghaoMou/text-dedup

All-in-one text de-duplication

data-processing de-duplication nlp text-processing

Last synced: 04 Nov 2024

https://github.com/timbmg/sentence-vae

PyTorch Re-Implementation of "Generating Sentences from a Continuous Space" by Bowman et al 2015 https://arxiv.org/abs/1511.06349

deep-learning generative-model neural-network nlp ptb pytorch vae

Last synced: 30 Oct 2024

https://github.com/titipata/pubmed_parser

📋 A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset

article doi medline-xml nlp parse parser pmid pubmed-central pubmed-parser python xml

Last synced: 13 Nov 2024

https://github.com/jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese

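Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial large language models, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)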

chinese finance large-language-models llama nlp qa rlhf sft text-generation transformers

Last synced: 02 Nov 2024

https://github.com/ymcui/Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 29 Oct 2024

https://github.com/HKUST-KnowComp/R-Net

TensorFlow Implementation of R-Net

machine-comprehension nlp r-net squad tensorflow

Last synced: 07 Aug 2024

https://github.com/rinnakk/japanese-pretrained-models

Code for producing Japanese pretrained models provided by rinna Co., Ltd.

gpt2 japanese nlp roberta

Last synced: 06 Nov 2024

https://github.com/graykode/xlnet-Pytorch

Simple XLNet implementation with PyTorch Wrapper

bert natural-language-processing nlp pytorch xlnet xlnet-pytorch

Last synced: 04 Nov 2024

https://github.com/graykode/xlnet-pytorch

Simple XLNet implementation with PyTorch Wrapper

bert natural-language-processing nlp pytorch xlnet xlnet-pytorch

Last synced: 30 Oct 2024

https://github.com/mozilla/firefox-translations

Firefox Translations is a webextension that enables client side translations for web browsers.

deep-neural-networks firefox javascript nlp nmt translation webextension

Last synced: 27 Oct 2024

https://github.com/ymcui/chinese-mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 28 Oct 2024

https://ucinlp.github.io/autoprompt/

AutoPrompt: Automatic Prompt Construction for Masked Language Models.

evaluation language-model nlp

Last synced: 04 Aug 2024

https://github.com/ucinlp/autoprompt

AutoPrompt: Automatic Prompt Construction for Masked Language Models.

evaluation language-model nlp

Last synced: 04 Aug 2024

https://github.com/zhang17173/Event-Extraction

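Event extraction from legal judgment documents and its applications, including word segmentation, part-of-speech tagging, named entity recognition, event element extraction, and judgment outcome prediction.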

cnn-classification deep-learning event-extraction judgment nlp word2vec

Last synced: 06 Aug 2024

https://github.com/dmitrizzle/chat-bubble

Simple chatbot UI for the Web with JSON scripting 👋🤖🤙

bot bot-framework chat-bots chatbot chatbot-ui javascript natural-language-classifiers nlp

Last synced: 04 Aug 2024

https://github.com/google-research/bigbird

Transformers for Longer Sequences

bert deep-learning longer-sequences nlp transformer

Last synced: 10 Nov 2024

https://github.com/yassouali/ML-paper-notes

📓 Notes and summaries of various ML, Computer Vision & NLP papers.

computer-vision deep-learning machine-learning natural-language-processing nlp summary

Last synced: 04 Nov 2024

https://github.com/tasdikrahman/vocabulary

[Not Maintained anymore] Python Module to get Meanings, Synonyms and what not for a given word

antonym api dictionary glosbe nlp pronunciation python synonyms wordnik

Last synced: 31 Oct 2024

https://github.com/jerryji1993/DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

deep-learning dnabert-model genome gpu kmer kmer-format machine-learning natural-language-processing nlp sequence

Last synced: 08 Aug 2024

https://github.com/princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 08 Nov 2024

https://github.com/pysentimiento/pysentimiento

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

nlp sentiment-analysis transformers

Last synced: 27 Oct 2024

https://github.com/shuaihuaiyi/QA

A Chinese question-answering system implemented with deep learning algorithms

lstm nlp

Last synced: 04 Aug 2024

https://github.com/princeton-nlp/llm-shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 10 Oct 2024

https://github.com/Teamlinker/Teamlinker

Teamlinker is a team collaboration platform that integrates six functional modules: project, wiki, calendar, meeting, chat, and network disk. Users can work on tasks in parallel, and the modules are seamlessly integrated to improve team collaboration efficiency.

arco-design artificial-intelligence calendar chat confluence cooperation documentation javascript mediasoup meeting nlp nodejs project-management teamwork typescript video-conferencing vue webos wiki workflow

Last synced: 07 Nov 2024

https://github.com/web-arena-x/webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

agent nlp

Last synced: 03 Aug 2024

https://github.com/voidful/textrl

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

chatgpt controlled-nlg gpt-2 gpt-3 language-model nlg nlp pytorch reinforcement-learning rlhf

Last synced: 09 Nov 2024

https://github.com/voidful/TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

chatgpt controlled-nlg gpt-2 gpt-3 language-model nlg nlp pytorch reinforcement-learning rlhf

Last synced: 31 Oct 2024

https://github.com/ukairia777/tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.

bert bert-ner dpo huggingface keras-tutorial llama llm lora named-entity-recognition natural-language-processing nlp nlp-tutorial question-answering sft tensorflow trainer transformers

Last synced: 09 Nov 2024

https://github.com/neilgupta/Sherlock

Natural-language event parser for JavaScript

datetime event-parser javascript natural-language-processing nlp regex

Last synced: 03 Aug 2024

https://github.com/TinyLLaVA/TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

large-multimodal-models llama llava nlp tinyllama transformers vision-language

Last synced: 02 Aug 2024

https://github.com/udibr/headlines

Automatically generate headlines to short articles

generation keras nlp rnn summarization

Last synced: 07 Aug 2024

https://github.com/neuml/codequestion

🔎 Semantic search for developers

machine-learning nlp python search txtai

Last synced: 28 Oct 2024

https://github.com/medspacy/medspacy

Library for clinical NLP with spaCy.

clinical-nlp medspacy nlp nlp-library pipeline spacy

Last synced: 14 Oct 2024

https://github.com/OpenLemur/Lemur

[ICLR 2024] Lemur: Open Foundation Models for Language Agents

code-generation language-model machine-learning natural-language-processing nlp text-reasoning

Last synced: 03 Aug 2024

https://github.com/Shark-NLP/OpenICL

OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.

in-context-learning language-model nlp

Last synced: 03 Aug 2024

https://github.com/dair-ai/pytorch_notebooks

🔥 A collection of PyTorch notebooks for learning and practicing deep learning

deep-learning machine-learning nlp pytorch

Last synced: 03 Sep 2024

https://github.com/airalcorn2/Deep-Semantic-Similarity-Model

My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.

deep-learning information-retrieval keras natural-language-processing nlp

Last synced: 07 Aug 2024

https://github.com/allenai/allennlp-models

Officially supported AllenNLP models

allennlp nlp pytorch

Last synced: 26 Sep 2024

https://github.com/stanfordnlp/python-stanford-corenlp

Python interface to CoreNLP using a bidirectional server-client interface.

corenlp corenlp-server nlp

Last synced: 08 Nov 2024

https://github.com/CornellNLP/ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

computational-social-science conversational-ai conversational-analysis conversations dataset dialogs machine-learning nlp toolkit

Last synced: 26 Oct 2024

https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

computational-social-science conversational-ai conversational-analysis conversations dataset dialogs machine-learning nlp toolkit

Last synced: 05 Aug 2024

https://github.com/subho406/OmniNet

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

artificial-intelligence deep-learning image-captioning machine-learning multimodal-learning multitask-learning neural-network nlp transformer video-recognition

Last synced: 07 Aug 2024

https://github.com/allenai/tango

Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

ai machine-learning nlp python python3 pytorch

Last synced: 01 Oct 2024

https://github.com/fhamborg/Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

5w 5w1h answering event-detection event-extraction fivew fivewoneh news news-articles nlp nlp-library question question-answering text-analysis

Last synced: 28 Oct 2024

https://github.com/fhamborg/giveme5w1h

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

5w 5w1h answering event-detection event-extraction fivew fivewoneh news news-articles nlp nlp-library question question-answering text-analysis

Last synced: 11 Oct 2024

https://github.com/synyi/poplar

A web-based annotation tool for natural language processing (NLP)

annotation nlp svg

Last synced: 30 Oct 2024

https://github.com/intellabs/rag-fit

Framework for enhancing LLMs for RAG tasks using fine-tuning.

evaluation fine-tuning information-retrieval llm nlp question-answering rag semantic-search

Last synced: 13 Nov 2024

https://github.com/chunelfeng/caiss

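A simple, easy-to-use, cross-platform, multi-language, high-performance retrieval engine for similar vectors, similar words, and similar sentences. Stars and forks are welcome. Build together! Power another!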

ai ann chatbot deep-learning faiss hnsw mrpt nlp search-engine similarity-search

Last synced: 28 Oct 2024

https://github.com/Brokenwind/BertSimilarity

Computing the similarity of two sentences with Google's BERT algorithm. Semantic similarity computation. Text similarity computation.

bert nlp python semantic similarity tensorflow

Last synced: 02 Nov 2024

https://github.com/phantominsights/subreddit-analyzer

A comprehensive Data and Text Mining workflow for submissions and comments from any given public subreddit.

matplotlib nlp pandas python3 seaborn spacy wordcloud

Last synced: 30 Oct 2024

https://github.com/salesforce/matchbox

Write PyTorch code at the level of individual examples, then run it efficiently on minibatches.

deep-learning minibatch nlp pytorch

Last synced: 03 Aug 2024