Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process, analyze, and generate human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

https://github.com/abadojack/whatlangGo

Natural language detection library for Go

go language nlp text-processing

Last synced: 24 Oct 2024

https://github.com/HIT-SCIR/Chinese-Mixtral-8x7B

Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)

large-language-models llm mixtral-8x7b nlp

Last synced: 29 Oct 2024

https://github.com/ICLRandD/Blackstone

⚫ A spaCy pipeline and model for NLP on unstructured legal text.

caselaw law legaltech nlp spacy-models

Last synced: 09 Nov 2024

https://github.com/wyounas/homer

Homer, a text analyser in Python, can help make your text clearer, simpler, and more useful for your readers.

nlp nlp-library python python-library python-script python3 text-analysis

Last synced: 06 Nov 2024

https://github.com/BlackSamorez/tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training & inference

deep-learning machine-learning natural-language-processing nlp python pytorch pytorch-transformers

Last synced: 29 Nov 2024

https://github.com/zkywsg/daily-deeplearning

🔥 Machine learning / deep learning / Python / large models / multimodal / LLM / deep learning / Python / algorithm interview / NLP tutorial

cv deep-learning leetcode leetcode-python leetcode-solutions llm machine-learning nlp python pytorch pytorch-nlp pytorch-tutorial pytorch-tutorials tensorflow tensorflow-examples tensorflow-tutorials

Last synced: 18 Dec 2024

https://github.com/akoumjian/datefinder

Find dates inside text using Python and get back datetime objects

datetime nlp parser

Last synced: 13 Nov 2024

https://github.com/Liquid-Legal-Institute/Legal-Text-Analytics

A list of selected resources, methods, and tools dedicated to Legal Text Analytics.

legal legal-text-analytics nlp

Last synced: 07 Nov 2024

https://github.com/michaelthwan/searchgpt

Grounded search engine (i.e. with source reference) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search etc.

ai chatgpt grounded-api grounded-bot language-model llm machine-learning nlp nlp-machine-learning openai python retrieval retrieval-model

Last synced: 09 Nov 2024

https://github.com/smoothnlp/SmoothNLP

An NLP toolset with a focus on explainable inference

depedency-parsing nlp nlp-pipeline postagging python tokenizer

Last synced: 18 Nov 2024

https://github.com/TinyLLaVA/TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

large-multimodal-models llama llava nlp tinyllama transformers vision-language

Last synced: 13 Nov 2024

https://github.com/thunlp/openhownet

Core Data of HowNet and OpenHowNet Python API

hownet knowledge-base nlp openhownet semantics sememe

Last synced: 23 Dec 2024

https://github.com/gutfeeling/word_forms

Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate", etc.

adjective adverb dictionary lemmatizer natural-language-processing nlp noun parts-of-speech stemmer verb-conjugations wordnet words

Last synced: 30 Oct 2024

https://github.com/princeton-nlp/densephrases

[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624

information-retrieval knowledge-base nlp open-domain-qa passage-retrieval slot-filling

Last synced: 21 Dec 2024

https://github.com/bminixhofer/nlprule

A fast, low-resource Natural Language Processing and Text Correction library written in Rust.

grammar grammatical-error-correction machine-learning natural-language-processing nlp proofreading rust spellcheck style-checker

Last synced: 23 Dec 2024

https://github.com/princeton-nlp/DensePhrases

[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624

information-retrieval knowledge-base nlp open-domain-qa passage-retrieval slot-filling

Last synced: 03 Nov 2024

https://github.com/samtecspg/articulate

A platform for building conversational interfaces with intelligent agents (chatbots)

chatbot nlp nlu react

Last synced: 29 Oct 2024

https://github.com/mit-han-lab/lite-transformer

[ICLR 2020] Lite Transformer with Long-Short Range Attention

nlp pytorch transformer

Last synced: 16 Nov 2024

https://github.com/bakwc/JamSpell

Modern spell checking library - accurate, fast, multi-language

cpp csharp java ngrams nlp python ruby spellcheck spellchecker spelling-correction

Last synced: 13 Nov 2024

https://github.com/ChenghaoMou/text-dedup

All-in-one text de-duplication

data-processing de-duplication nlp text-processing

Last synced: 04 Nov 2024

https://github.com/titipata/pubmed_parser

📋 A Python parser for the PubMed Open-Access XML Subset and MEDLINE XML Dataset

article doi medline-xml nlp parse parser pmid pubmed-central pubmed-parser python xml

Last synced: 19 Dec 2024

https://github.com/timbmg/sentence-vae

PyTorch Re-Implementation of "Generating Sentences from a Continuous Space" by Bowman et al 2015 https://arxiv.org/abs/1511.06349

deep-learning generative-model neural-network nlp ptb pytorch vae

Last synced: 20 Dec 2024

https://github.com/ymcui/chinese-mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 21 Dec 2024

https://github.com/jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese

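Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial large language models, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)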

chinese finance large-language-models llama nlp qa rlhf sft text-generation transformers

Last synced: 02 Nov 2024

https://github.com/ymcui/Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 29 Oct 2024

https://github.com/HKUST-KnowComp/R-Net

TensorFlow implementation of R-Net

machine-comprehension nlp r-net squad tensorflow

Last synced: 27 Nov 2024

https://github.com/google-research/bigbird

Transformers for Longer Sequences

bert deep-learning longer-sequences nlp transformer

Last synced: 22 Dec 2024

https://github.com/graykode/xlnet-pytorch

Simple XLNet implementation with PyTorch wrapper

bert natural-language-processing nlp pytorch xlnet xlnet-pytorch

Last synced: 22 Dec 2024

https://ucinlp.github.io/autoprompt/

AutoPrompt: Automatic Prompt Construction for Masked Language Models.

evaluation language-model nlp

Last synced: 19 Nov 2024

https://github.com/rinnakk/japanese-pretrained-models

Code for producing Japanese pretrained models provided by rinna Co., Ltd.

gpt2 japanese nlp roberta

Last synced: 06 Nov 2024

https://github.com/graykode/xlnet-Pytorch

Simple XLNet implementation with PyTorch wrapper

bert natural-language-processing nlp pytorch xlnet xlnet-pytorch

Last synced: 04 Nov 2024

https://github.com/ucinlp/autoprompt

AutoPrompt: Automatic Prompt Construction for Masked Language Models.

evaluation language-model nlp

Last synced: 19 Nov 2024

https://github.com/mozilla/firefox-translations

Firefox Translations is a webextension that enables client side translations for web browsers.

deep-neural-networks firefox javascript nlp nmt translation webextension

Last synced: 27 Oct 2024

https://github.com/zhang17173/Event-Extraction

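Event extraction from legal judgment documents and its applications, including word segmentation, part-of-speech tagging, named entity recognition, event element extraction, and judgment outcome prediction.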

cnn-classification deep-learning event-extraction judgment nlp word2vec

Last synced: 25 Nov 2024

https://github.com/dmitrizzle/chat-bubble

Simple chatbot UI for the Web with JSON scripting 👋🤖🤙

bot bot-framework chat-bots chatbot chatbot-ui javascript natural-language-classifiers nlp

Last synced: 22 Dec 2024

https://github.com/jerryji1993/DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

deep-learning dnabert-model genome gpu kmer kmer-format machine-learning natural-language-processing nlp sequence

Last synced: 28 Nov 2024

https://github.com/yassouali/ml-paper-notes

📓 Notes and summaries of various ML, Computer Vision & NLP papers.

computer-vision deep-learning machine-learning natural-language-processing nlp summary

Last synced: 24 Nov 2024

https://github.com/yassouali/ML-paper-notes

📓 Notes and summaries of various ML, Computer Vision & NLP papers.

computer-vision deep-learning machine-learning natural-language-processing nlp summary

Last synced: 04 Nov 2024

https://github.com/tasdikrahman/vocabulary

[No longer maintained] Python module to get meanings, synonyms, and more for a given word

antonym api dictionary glosbe nlp pronunciation python synonyms wordnik

Last synced: 31 Oct 2024

https://github.com/princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 08 Nov 2024

https://github.com/princeton-nlp/llm-shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 20 Dec 2024

https://github.com/pysentimiento/pysentimiento

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

nlp sentiment-analysis transformers

Last synced: 27 Oct 2024

https://github.com/shuaihuaiyi/QA

A Chinese question-answering system implemented with deep learning algorithms

lstm nlp

Last synced: 19 Nov 2024

https://github.com/teamlinker/teamlinker

Teamlinker is a team collaboration platform that integrates six functional modules: project, wiki, calendar, meeting, chat, and network disk. Users can process tasks in parallel, and the modules integrate seamlessly to improve team collaboration efficiency.

arco-design artificial-intelligence calendar chat confluence cooperation documentation javascript mediasoup meeting nlp nodejs project-management teamwork typescript video-conferencing vue webos wiki workflow

Last synced: 20 Dec 2024

https://github.com/Teamlinker/Teamlinker

Teamlinker is a team collaboration platform that integrates six functional modules: project, wiki, calendar, meeting, chat, and network disk. Users can process tasks in parallel, and the modules integrate seamlessly to improve team collaboration efficiency.

arco-design artificial-intelligence calendar chat confluence cooperation documentation javascript mediasoup meeting nlp nodejs project-management teamwork typescript video-conferencing vue webos wiki workflow

Last synced: 07 Nov 2024

https://github.com/voidful/textrl

Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) for any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

chatgpt controlled-nlg gpt-2 gpt-3 language-model nlg nlp pytorch reinforcement-learning rlhf

Last synced: 09 Nov 2024

https://github.com/neilgupta/sherlock

Natural-language event parser for JavaScript

datetime event-parser javascript natural-language-processing nlp regex

Last synced: 20 Dec 2024

https://github.com/voidful/TextRL

Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) for any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

chatgpt controlled-nlg gpt-2 gpt-3 language-model nlg nlp pytorch reinforcement-learning rlhf

Last synced: 31 Oct 2024

https://github.com/medspacy/medspacy

Library for clinical NLP with spaCy.

clinical-nlp medspacy nlp nlp-library pipeline spacy

Last synced: 19 Dec 2024

https://github.com/neilgupta/Sherlock

Natural-language event parser for JavaScript

datetime event-parser javascript natural-language-processing nlp regex

Last synced: 15 Nov 2024

https://github.com/OpenLemur/Lemur

[ICLR 2024] Lemur: Open Foundation Models for Language Agents

code-generation language-model machine-learning natural-language-processing nlp text-reasoning

Last synced: 14 Nov 2024

https://github.com/chunelfeng/caiss

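A simple, easy-to-use, cross-platform, multi-language, high-performance retrieval engine for similar vectors, similar words, and similar sentences. Stars and forks welcome. Build together! Power another!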

ai ann chatbot deep-learning faiss hnsw mrpt nlp search-engine similarity-search

Last synced: 22 Dec 2024

https://github.com/ukairia777/tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.

bert bert-ner dpo huggingface keras-tutorial llama llm lora named-entity-recognition natural-language-processing nlp nlp-tutorial question-answering sft tensorflow trainer transformers

Last synced: 09 Nov 2024

https://github.com/udibr/headlines

Automatically generate headlines for short articles

generation keras nlp rnn summarization

Last synced: 27 Nov 2024

https://github.com/neuml/codequestion

🔎 Semantic search for developers

machine-learning nlp python search txtai

Last synced: 28 Oct 2024

https://github.com/magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

alignment dataset gemma llama2 llama3 llm nlp paper phi3 qwen2 supervised-finetuning synthetic-data synthetic-dataset-generation

Last synced: 21 Dec 2024

https://github.com/Shark-NLP/OpenICL

OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.

in-context-learning language-model nlp

Last synced: 18 Nov 2024

https://github.com/dair-ai/pytorch_notebooks

🔥 A collection of PyTorch notebooks for learning and practicing deep learning

deep-learning machine-learning nlp pytorch

Last synced: 03 Sep 2024

https://github.com/allenai/allennlp-models

Officially supported AllenNLP models

allennlp nlp pytorch

Last synced: 26 Sep 2024

https://github.com/airalcorn2/Deep-Semantic-Similarity-Model

My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.

deep-learning information-retrieval keras natural-language-processing nlp

Last synced: 27 Nov 2024

https://github.com/stanfordnlp/python-stanford-corenlp

Python interface to CoreNLP using a bidirectional server-client interface.

corenlp corenlp-server nlp

Last synced: 22 Dec 2024

https://github.com/intellabs/rag-fit

Framework for enhancing LLMs for RAG tasks using fine-tuning.

evaluation fine-tuning information-retrieval llm nlp question-answering rag semantic-search

Last synced: 21 Dec 2024

https://github.com/CornellNLP/ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

computational-social-science conversational-ai conversational-analysis conversations dataset dialogs machine-learning nlp toolkit

Last synced: 26 Oct 2024

https://github.com/subho406/OmniNet

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

artificial-intelligence deep-learning image-captioning machine-learning multimodal-learning multitask-learning neural-network nlp transformer video-recognition

Last synced: 27 Nov 2024

https://github.com/farukalamai/advanced-machine-learning-engineer-roadmap-2024

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.

aws computer-vision data-analysis data-science data-visualization deep-learning git-github machine-learning machine-learning-roadmap mlops natural-language-processing neural-network nlp opencv pandas python pytorch statistics tensorflow yolo

Last synced: 22 Dec 2024

https://github.com/fhamborg/Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

5w 5w1h answering event-detection event-extraction fivew fivewoneh news news-articles nlp nlp-library question question-answering text-analysis

Last synced: 28 Oct 2024

https://github.com/fhamborg/giveme5w1h

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

5w 5w1h answering event-detection event-extraction fivew fivewoneh news news-articles nlp nlp-library question question-answering text-analysis

Last synced: 15 Nov 2024

https://github.com/allenai/tango

Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

ai machine-learning nlp python python3 pytorch

Last synced: 01 Oct 2024