Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/jsksxs360/AHANLP

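AHANLP (Aha natural language processing package): provides services including word segmentation, dependency parsing, semantic role labeling, automatic summarization, semantic similarity computation, LDA topic prediction, and word clouds.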

chinese nlp

Last synced: 03 Jul 2024

https://github.com/chatopera/chatopera.feishu

Launch an intelligent conversational bot service through the Feishu Open Platform and the Chatopera bot platform. Chatbot, Feishu, Lark.

ai bot chatbot chatopera dialog feishu lark machine-learning nlp nlu python python3

Last synced: 03 Jul 2024

https://github.com/arian-askari/ChatGPT-RetrievalQA

A dataset for training/evaluating Question Answering Retrieval models on ChatGPT responses, with the option of training/evaluating on real human responses.

ai chatgpt chatgpt-information-retrieval chatgpt-ir data-augmentation dataset deep-learning gpt-3 gpt2 gpt3 information-retrieval information-retrieval-chatgpt ir ir-chatgpt machine-learning nlp openai python sequence-to-sequence text-retrieval

Last synced: 03 Jul 2024

https://github.com/microsoft/tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

mixture-of-experts moe nlp pytorch transformer

Last synced: 03 Jul 2024

https://github.com/bigscience-workshop/bigscience

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

machine-learning models nlp training

Last synced: 03 Jul 2024

https://github.com/explosion/spacy-models

💫 Models for the spaCy Natural Language Processing (NLP) library

machine-learning machine-learning-models models natural-language-processing nlp spacy spacy-models statistical-models

Last synced: 02 Jul 2024

https://github.com/tim5go/zhopenie

Chinese Open Information Extraction (Tree-based Triple Relation Extraction Module)

chinese chinese-nlp nlp relation-extraction semantic-web

Last synced: 02 Jul 2024

https://github.com/philipperemy/Stanford-OpenIE-Python

Stanford Open Information Extraction made simple!

extraction nlp python-wrapper stanford stanford-openie

Last synced: 02 Jul 2024

https://github.com/gkiril/MinSCIE

MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations.

information-extraction natural-language-processing natural-language-toolkit natural-language-understanding nlp nlp-apis nlp-resources open-information-extraction

Last synced: 02 Jul 2024

https://github.com/crownpku/awesome-chinese-nlp

A curated list of resources for Chinese NLP (materials related to Chinese natural language processing)

chinese-nlp nlp

Last synced: 02 Jul 2024

https://github.com/allenai/allennlp

An open-source NLP research library, built on PyTorch.

data-science deep-learning natural-language-processing nlp python pytorch

Last synced: 02 Jul 2024

https://github.com/vzhong/embeddings

Fast, DB Backed pretrained word embeddings for natural language processing.

deep-learning neural-network nlp

Last synced: 02 Jul 2024

https://github.com/Beomi/KcBERT

🤗 Pretrained BERT model & WordPiece tokenizer trained on Korean comments (a BERT model pretrained on Korean comments, plus the dataset)

bert bert-model korean-nlp nlp transformers

Last synced: 02 Jul 2024

https://github.com/SKTBrain/KoBERT

Korean BERT pre-trained cased (KoBERT)

bert korean-nlp language-model nlp pytorch transformers

Last synced: 02 Jul 2024

https://github.com/cosmoquester/2021-dialogue-summary-competition

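[2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] A repo for sharing the dialogue summarization training and inference code of team 알라꿍달라꿍 in the dialogue summarization track.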

dialogue huggingface-transformers nlp pytorch-lightning summarization

Last synced: 02 Jul 2024

https://github.com/km1994/recommendation_advertisement_search

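A collection of introductory notes, paper study notes, and interview materials for AI fields such as natural language processing, recommender systems, and search engines (Things You Don't Know About NLP, Things You Don't Know About Recommender Systems, and interview question collections for NLP, recommender systems, and search engines)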

advertisement nlp recommendation-system search-engine

Last synced: 02 Jul 2024

https://github.com/JackHCC/Arxiv-NLP-Reporter

Automatically fetches the latest NLP-related papers from arXiv every day (Arxiv Natural Language Processing Paper Automatic Crawl Daily)

arxiv automation nlp

Last synced: 02 Jul 2024

https://github.com/techcentaur/PyLex

Perform lexical analysis on words, one word at a time.

cli lexical-analysis nlp poets python3 scraping words

Last synced: 01 Jul 2024

https://github.com/ARBML/tkseem

Arabic Tokenization Library. It provides many tokenization algorithms.

arabic-nlp nlp tkseem tokenization

Last synced: 01 Jul 2024

https://github.com/ThuCCSLab/Awesome-LM-SSP

A reading list on large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).

adversarial-attacks awesome-list diffusion-models jailbreak language-model llm nlp privacy safety security vlm

Last synced: 01 Jul 2024

https://github.com/datasciencecampus/pyGrams

Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence

dsc-projects emergence-calculations natural-language-processing nlp nltk patents python scikit-learn tf-idf

Last synced: 01 Jul 2024

https://github.com/rguthrie3/DeepLearningForNLPInPytorch

An IPython Notebook tutorial on deep learning for natural language processing, including structure prediction.

deep-learning lstm neural-network nlp pytorch tutorial

Last synced: 01 Jul 2024

https://github.com/LanguageMachines/PICCL

A set of workflows for corpus building through OCR, post-correction and normalisation

computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow

Last synced: 30 Jun 2024

https://github.com/indix/whatthelang

Lightning Fast Language Prediction 🚀

fasttext language-detection languages nlp python

Last synced: 30 Jun 2024

https://github.com/jjangsangy/ExplainToMe

Automatic Web Article Summarizer

docker heroku nlp python textrank

Last synced: 30 Jun 2024

https://github.com/davikawasaki/utfpr-ce-undergrad-final-project

UTFPR Computer Engineering Undergrad Final Project - Computing Exam Questions Classification Using Natural-Language Processing

adaptive-teaching computing-classification machine-learning natural-language-processing nlp nltk python sklearn

Last synced: 30 Jun 2024

https://github.com/makcedward/nlp

📝 This repository records my NLP journey.

ai data-science deep-learning machine-learning nlp

Last synced: 30 Jun 2024

https://github.com/love-irish/spellchecker

A ruby spellchecker library that works well with Irish

irish nlp spellchecker

Last synced: 30 Jun 2024

https://github.com/thunlp/PromptPapers

Must-read papers on prompt-based tuning for pre-trained language models.

ai bert machine-learning nlp pre-trained-language-models prompt prompt-based prompt-learning prompt-toolkit

Last synced: 30 Jun 2024

https://github.com/vi3k6i5/flashtext

Extract keywords from sentences or replace keywords in sentences.

data-extraction keyword-extraction nlp search-in-text word2vec

Last synced: 29 Jun 2024

https://github.com/anupamchugh/iowncode

A curated collection of iOS, ML, AR resources sprinkled with some UI additions

alamofire arkit computer-vision coreml coremltools ios keras ml-kit natural-language-processing nlp realitykit swift swiftui vision vision-framework

Last synced: 29 Jun 2024

https://github.com/web-arena-x/webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

agent nlp

Last synced: 29 Jun 2024

https://github.com/CogComp/cogcomp-nlpy

CogComp's light-weight Python NLP annotators

data-mining natural-language-processing nlp text-mining text-processing

Last synced: 29 Jun 2024

https://github.com/rynst/awesome-llm-engineering

💻 An awesome & curated list of resources for large language model engineering (application layer: prompt engineering, fine tuning, etc.) [ Work In Progress, feel free to contribute! ]

gpt-3 machine-learning nlp prompt-engineering

Last synced: 29 Jun 2024

https://github.com/the-finai/pixiu

This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).

aifinance chatgpt fintech gpt-4 large-language-models llama machine-learning named-entity-recognition natural-language-processing nlp pixiu question-answering sentiment-analysis stock-price-prediction text-classification

Last synced: 29 Jun 2024

https://github.com/ukairia777/tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.

bert bert-ner dpo huggingface keras-tutorial llama llm lora named-entity-recognition natural-language-processing nlp nlp-tutorial question-answering sft tensorflow trainer transformers

Last synced: 29 Jun 2024

https://github.com/dsgiitr/d2l-pytorch

This project reproduces the book Dive Into Deep Learning (https://d2l.ai/), adapting the code from MXNet into PyTorch.

book computer-vision d2l data-science deep-learning dive-into-deep-learning mxnet nlp pytorch pytorch-implmention

Last synced: 29 Jun 2024

https://github.com/Curated-Awesome-Lists/awesome-llms-fine-tuning

Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs). Perfect for ML practitioners and researchers!

ai awesome-list deep-learning fine-tuning gpt large-language-models llms machine-learning nlp transformers

Last synced: 29 Jun 2024

https://github.com/thunlp/THUCTC

An Efficient Chinese Text Classifier

chinese-nlp nlp

Last synced: 28 Jun 2024

https://github.com/ART-Group-it/GASP

GASP! Dataset - Generating Abstracts of Scientific Papers from Abstracts of Cited Papers

corpus dataset machine-learning natural-language-processing nlp

Last synced: 28 Jun 2024

https://github.com/jerryji1993/DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

deep-learning dnabert-model genome gpu kmer kmer-format machine-learning natural-language-processing nlp sequence

Last synced: 28 Jun 2024

https://github.com/CogStack/OpenGPT

A framework for creating grounded, instruction-based datasets and training conversational domain-expert Large Language Models (LLMs).

chatgpt gpt-4 health healthcare huggingface llm medicine nlp opengpt

Last synced: 28 Jun 2024

https://github.com/CornellNLP/ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

computational-social-science conversational-ai conversational-analysis conversations dataset dialogs machine-learning nlp toolkit

Last synced: 28 Jun 2024

https://github.com/Guitaricet/relora

Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates

deep-learning distributed-training llama nlp peft transformer

Last synced: 28 Jun 2024

https://github.com/salesforce/factualNLG

Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"

factual-consistency factuality large-language-models llm nlp summarization

Last synced: 28 Jun 2024

https://github.com/SamEdwardes/spacytextblob

A TextBlob sentiment analysis pipeline component for spaCy.

natural-language-processing nlp python spacy

Last synced: 27 Jun 2024

https://github.com/mhezarei/ai-bot

2020 AI bot challenge (ai-bot.ir) repository. This program answers a given question with a specific format and subject.

bert nlp persian-nlp

Last synced: 27 Jun 2024

https://github.com/srstevenson/keyword-extractor

Extract keywords from plain text documents

nlp spacy tf-idf

Last synced: 27 Jun 2024

https://github.com/pooya-mohammadi/persian-spell-checker-kenlm

Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.

bash kenlm language-model nlp persian python spellcheck spellchecker symspell

Last synced: 27 Jun 2024

https://github.com/mohadese-yousefi/spell-correction

Simple autocorrection of misspelled words based on distance.

nlp spelling-correction

Last synced: 27 Jun 2024

https://github.com/minasmz/Persian-Summarization

Statistical and semantic text summarizer for the Persian language

doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm

Last synced: 27 Jun 2024

https://github.com/johnbumgarner/wordhoard

This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.

antonyms bag-of-words definitions dictionary homophones hypernyms hyponyms lexicon nlp python python3 synonyms text-analysis textual-analysis wordlists wordnet wordnets wordsearch

Last synced: 27 Jun 2024

https://github.com/neilgupta/Sherlock

Natural-language event parser for JavaScript

datetime event-parser javascript natural-language-processing nlp regex

Last synced: 27 Jun 2024

https://github.com/bilibili/Index-1.9B

A lightweight multilingual SOTA LLM

llm nlp

Last synced: 26 Jun 2024

https://github.com/hrwhisper/SpamMessage

Chinese spam SMS detection (hand-written classifier)

machine-learning nlp python

Last synced: 26 Jun 2024

https://github.com/PaddlePaddle/models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

computer-vision cv deep-learning models natural-language-processing neural-network nlp paddlepaddle recommendation speech

Last synced: 26 Jun 2024

https://github.com/ymcui/Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 26 Jun 2024

https://github.com/koayon/awesome-adaptive-computation

A curated reading list of research in Adaptive Computation, Dynamic Compute & Mixture of Experts (MoE).

adaptive-computation computer-vision machine-learning mixture-of-experts nlp pytorch tensorflow transformers

Last synced: 25 Jun 2024

https://github.com/maastrichtlawtech/awesome-legal-nlp

📖 A curated list of LegalNLP resources from all around the web.

artificial-intelligence law legal-ai nlp

Last synced: 25 Jun 2024

https://github.com/banglakit/awesome-bangla

A collection of tools, datasets and resources on Bangla computing

bangla bangla-computing bengali nlp

Last synced: 25 Jun 2024

https://github.com/Shark-NLP/OpenICL

OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.

in-context-learning language-model nlp

Last synced: 25 Jun 2024

https://github.com/AliAbdelaal/ATKSpy

This repository is a Python package that supports a SOAP interface for communicating with the Microsoft ATKS.

arabic arabic-nlp atks microsoft natural-language-processing nlp parser pos-tagger pos-tagging python3 soap-web-services

Last synced: 24 Jun 2024

https://github.com/forzagreen/n2words

Convert numerical numbers to written numbers, in 25+ languages.

convert-numbers language natural-language nlp

Last synced: 24 Jun 2024

https://github.com/ha-lins/MetaLearning4NLP-Papers

A list of recent papers about Meta / few-shot learning methods applied in NLP areas.

dialogue-systems few-shot-learning low-resource meta-learning nlp papers-collection semantic-parsing

Last synced: 24 Jun 2024

https://github.com/airaria/Visual-Chinese-LLaMA-Alpaca

Multimodal Chinese LLaMA & Alpaca large language models (VisualCLA)

alpaca chinese llama llm lora multimodal nlp vision-language

Last synced: 23 Jun 2024

https://github.com/thunlp/OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 23 Jun 2024

https://github.com/grammarly/gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

bert grammatical-error-correction natural-language-processing nlp roberta sequence-labeling text-simplification transformers xlnet

Last synced: 23 Jun 2024

https://github.com/Qznan/QizNLP

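Quickly run many NLP tasks: a TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP, with distributed support)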

beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow

Last synced: 23 Jun 2024

https://github.com/ChenghaoMou/pytorch-pQRNN

Implementation of pQRNN in PyTorch

nlp pqrnn pytorch text-classification

Last synced: 23 Jun 2024