Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/ModelTC/lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

deep-learning gpt llama llm model-serving nlp openai-triton

Last synced: 31 Jul 2024

https://github.com/songyouwei/ABSA-PyTorch

Aspect-based sentiment analysis, PyTorch implementations.

aspect-based-sentiment-analysis attention bert natural-language-processing nlp sentiment-analysis sentiment-classification

Last synced: 01 Aug 2024

https://github.com/alibaba/AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

bert deep-learning natural-language-processing nlp

Last synced: 31 Jul 2024

https://github.com/delip/PyTorchNLPBook

Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://amzn.to/3JUgR2L

deep-learning deep-neural-networks natural-language-processing neural-machine-translation neural-networks nlp pytorch pytorch-nlp pytorch-tutorial

Last synced: 03 Sep 2024

https://github.com/huggingface/setfit

Efficient few-shot learning with Sentence Transformers

few-shot-learning nlp sentence-transformers

Last synced: 03 Aug 2024

https://github.com/jalammar/ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).

explorables language-models natural-language-processing nlp pytorch visualization

Last synced: 01 Aug 2024

https://github.com/rguthrie3/DeepLearningForNLPInPytorch

An IPython Notebook tutorial on deep learning for natural language processing, including structure prediction.

deep-learning lstm neural-network nlp pytorch tutorial

Last synced: 01 Aug 2024

https://github.com/GauravBh1010tt/DeepLearn

Implementation of research papers on deep learning, NLP, and CV in Python using Keras, TensorFlow, and scikit-learn.

audio-processing computer-vision deep-learning nlp

Last synced: 30 Jul 2024

https://github.com/ChineseGLUE/ChineseGLUE

Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard

albert bert chinese-corpus datasets glue language-understanding nlp pre-trained-model

Last synced: 01 Aug 2024

https://github.com/huggingface/transfer-learning-conv-ai

🦄 State-of-the-Art Conversational AI with Transfer Learning

chatbots deep-learning dialog gpt gpt-2 neural-networks nlp pytorch transfer-learning

Last synced: 31 Jul 2024

https://github.com/deepset-ai/FARM

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

bert deep-learning germanbert language-models ner nlp nlp-framework nlp-library pretrained-models pytorch question-answering roberta transfer-learning xlnet-pytorch

Last synced: 01 Aug 2024

https://github.com/imcaspar/gpt2-ml

GPT2 for multiple languages, including pretrained models such as a 1.5-billion-parameter Chinese pretrained model.

bert chinese colab gpt-2 nlp pretrained-models tensorflow text-generation tpu

Last synced: 02 Aug 2024

https://github.com/PaddlePaddle/Research

Novel deep learning research work with PaddlePaddle

computer-vision data-mining deep-learning knowledge-graph nlp spatial-temporal

Last synced: 01 Aug 2024

https://github.com/Franck-Dernoncourt/NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

deep-learning machine-learning named-entity-recognition neural-networks nlp tensorflow

Last synced: 01 Aug 2024

https://github.com/Roshanson/TextInfoExp

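Natural language processing experiments (Sogou dataset): TF-IDF, text classification, clustering, word vectors, sentiment recognition, relation extraction, and more.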

nlp python

Last synced: 31 Jul 2024

https://github.com/425776024/nlpcda

One-click Chinese data augmentation package: NLP data augmentation, BERT data augmentation, EDA. Install with: pip install nlpcda

chinese-data-augmentation chinese-eda data-augmentation nlp nlpcda

Last synced: 03 Aug 2024

https://github.com/graph4ai/graph4nlp

Graph4nlp is a library for the easy use of graph neural networks for NLP. Visit the DLG4NLP website (https://dlg4nlp.github.io/index.html) for various learning resources.

deep-learning graph-neural-networks machine-learning natural-language-processing nlp pytorch

Last synced: 01 Aug 2024

https://github.com/milvus-io/bootcamp

Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.

audio-search benchmark-testing deep-learning hacktoberfest image-classification image-recognition image-search milvus nlp python question-answering unstructured-data

Last synced: 01 Aug 2024

https://github.com/BDBC-KG-NLP/QA-Survey-CN

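A summary of research and applications in intelligent question answering by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers knowledge-graph-based QA (KBQA), text-based QA (TextQA), table-based QA (TableQA), visual QA (VisualQA), and machine reading comprehension (MRC), with separate summaries of academic and industry work for each task.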

cqa kbqa nlp qa qa-survey question-answering survey tqa vqa

Last synced: 02 Aug 2024

https://github.com/ymcui/Chinese-XLNet

Pre-Trained Chinese XLNet

natural-language-processing nlp pytorch tensorflow xlnet

Last synced: 31 Jul 2024

https://github.com/timoschick/pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

machine-learning nlp python

Last synced: 01 Aug 2024

https://github.com/lihanghang/NLP-Knowledge-Graph

Research and applications of three major technologies: natural language processing, knowledge graphs, and dialogue systems.

bert deep-learning ernie event-driven kbqa knowledge-graph machine-learning ner nlp transformers

Last synced: 31 Jul 2024

https://github.com/allenai/scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

bioinformatics biomedical custom-pipes nlp scientific-documents spacy

Last synced: 01 Aug 2024

https://allenai.github.io/scispacy/

A full spaCy pipeline and models for scientific/biomedical documents.

bioinformatics biomedical custom-pipes nlp scientific-documents spacy

Last synced: 04 Aug 2024

https://github.com/airaria/TextBrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

bert distillation knowledge nlp pytorch

Last synced: 01 Aug 2024

https://github.com/explosion/spacy-models

💫 Models for the spaCy Natural Language Processing (NLP) library

machine-learning machine-learning-models models natural-language-processing nlp spacy spacy-models statistical-models

Last synced: 31 Jul 2024

https://github.com/allenai/bi-att-flow

Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.

bidaf nlp question-answering squad tensorflow

Last synced: 01 Aug 2024

https://github.com/datamade/usaddress

🇺🇸 A Python library for parsing unstructured United States address strings into address components

address address-parser conditional-random-fields crf machine-learning natural-language-processing nlp parserator python python-library

Last synced: 01 Aug 2024

https://github.com/bfelbo/DeepMoji

State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.

ai deep-learning keras machine-learning natural-language-processing neural-networks nlp python sentiment-analysis tensorflow text-classification

Last synced: 02 Aug 2024

https://github.com/dair-ai/Transformers-Recipe

🧠 A study guide to learn about Transformers

ai deep-learning machine-learning natural-language-processing nlp

Last synced: 01 Aug 2024

https://github.com/NLPchina/nlp-lang

This project is a basic package that wraps the tools commonly used in most NLP projects.

java nlp nlp-lang tire

Last synced: 31 Jul 2024

https://github.com/thunlp/TAADpapers

Must-read Papers on Textual Adversarial Attack and Defense

adversarial-attacks adversarial-defense adversarial-learning natural-language-processing nlp paper-list

Last synced: 31 Jul 2024

https://github.com/dair-ai/nlp_paper_summaries

✍️ A carefully curated list of NLP paper summaries

deep-learning machine-learning nlp

Last synced: 30 Jul 2024

https://github.com/HIT-SCIR/ELMoForManyLangs

Pre-trained ELMo Representations for Many Languages

elmo multilingual nlp

Last synced: 02 Aug 2024

https://github.com/Tencent/TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.

albert bert decoder gpt2 gpu huggingface-transformers inference machine-translation nlp pytorch roberta transformer

Last synced: 31 Jul 2024

https://github.com/eikek/docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources, with minimal effort.

dms docspell document document-management document-management-system edms elm nlp ocr pdf personal-document-system scala self-hosted spa stanford-corenlp webapp

Last synced: 31 Jul 2024

https://github.com/chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

buffer covid-19 detection extraction memex mime nlp nlp-library nlp-machine-learning parse parser-interface python recognition text-extraction text-recognition tika-python tika-server tika-server-jar translation-interface usc

Last synced: 01 Aug 2024

https://github.com/allenai/scibert

A BERT model for scientific text.

bert nlp scientific-papers

Last synced: 01 Aug 2024

https://github.com/demidovakatya/vvedenie-mashinnoe-obuchenie

📝 A collection of resources on machine learning

collections data-mining data-science deep-learning machine-learning mooc neural-networks nlp russian university

Last synced: 07 Aug 2024

https://github.com/ymcui/Chinese-ELECTRA

Pre-trained Chinese ELECTRA

bert chinese chinese-electra electra language-model nlp pre-trained-model pytorch tensorflow

Last synced: 01 Aug 2024

https://github.com/fergiemcdowall/search-index

A persistent, network resilient, full text search library for the browser and Node.js

nlp offline-first search

Last synced: 30 Jul 2024

https://github.com/konlpy/konlpy

Python package for Korean natural language processing.

hacktoberfest korean korean-nlp morphology nlp python text-mining

Last synced: 03 Aug 2024

https://github.com/UKPLab/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

ance benchmark bert colbert dataset deep-learning dpr elasticsearch information-retrieval nlp passage-retrieval pytorch question-generation retrieval retrieval-models sbert sentence-transformers use-qa zero-shot-retrieval

Last synced: 04 Aug 2024

https://github.com/beir-cellar/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

ance benchmark bert colbert dataset deep-learning dpr elasticsearch information-retrieval nlp passage-retrieval pytorch question-generation retrieval retrieval-models sbert sentence-transformers use-qa zero-shot-retrieval

Last synced: 01 Aug 2024

https://github.com/Canner/WrenAI

Wren AI makes your database RAG-ready. Implement Text-to-SQL more accurately and securely.

agent ai bigquery duckdb fastapi gpt llm nextjs nlp openai postgresql python rag sql text-to-sql typescript

Last synced: 06 Aug 2024

https://github.com/yao8839836/text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

deep-learning graph-convolutional-networks nlp text-classification

Last synced: 01 Aug 2024

https://github.com/yoshitomo-matsubara/torchdistill

A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆25 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs and configurations are available for ensuring reproducibility and benchmarking.

amazon-sagemaker-lab cifar10 cifar100 coco colab-notebook glue google-colab image-classification imagenet knowledge-distillation natural-language-processing nlp object-detection pascal-voc pytorch semantic-segmentation transformer

Last synced: 01 Aug 2024

https://github.com/modelscope/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch sora streamlit

Last synced: 01 Aug 2024

https://github.com/facebookarchive/duckling_old

Deprecated in favor of https://github.com/facebook/duckling

nlp nlu parser

Last synced: 31 Jul 2024

https://github.com/omarsar/nlp_overview

Overview of Modern Deep Learning Techniques Applied to Natural Language Processing

cnn deep-learning nlp reinforcement-learning rnn word-embeddings

Last synced: 31 Jul 2024

https://github.com/fukuball/jieba-php

"結巴"中文分詞:做最好的 PHP 中文分詞、中文斷詞組件。 / "Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.

chinese-text-segmentation machine-learning natural-language-processing nlp

Last synced: 01 Aug 2024

https://github.com/hyperonym/basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

generative gpt huggingface language-model llama llm model natural-language-processing nlp openai-api python text-generation transformers

Last synced: 30 Jul 2024

https://github.com/explosion/projects

🪐 End-to-end NLP workflows from prototype to production

annotations datasets natural-language-processing nlp prodigy spacy

Last synced: 07 Aug 2024

https://github.com/SKTBrain/KoBERT

Korean BERT pre-trained cased (KoBERT)

bert korean-nlp language-model nlp pytorch transformers

Last synced: 02 Aug 2024

https://github.com/neuml/paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

document-search machine-learning medical nlp python scientific-papers search txtai

Last synced: 01 Aug 2024

https://github.com/cdpierse/transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.

captum computer-vision deep-learning explainable-ai interpretability machine-learning model-explainability natural-language-processing neural-network nlp transformers transformers-model

Last synced: 31 Jul 2024

https://github.com/SeanLee97/xmnlp

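xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical extraction, sentence representation, and text similarity computation.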

lexical-analysis ner nlp pinyin postagging radical segmentation sentence-embeddings sentence-similarity sentiment-analysis spell-checker

Last synced: 31 Jul 2024

https://github.com/aurelio-labs/semantic-router

Superfast AI decision making and intelligent processing of multi-modal data.

ai artificial-intelligence chatbot computer-vision generative-ai machine-learning nlp

Last synced: 01 Aug 2024

https://github.com/amaiya/ktrain

ktrain is a Python library that makes deep learning and AI more accessible and easier to apply

computer-vision deep-learning graph-neural-networks keras machine-learning nlp python tabular-data tensorflow

Last synced: 01 Aug 2024

https://github.com/Hyperparticle/one-pixel-attack-keras

Keras implementation of "One pixel attack for fooling deep neural networks" using differential evolution on Cifar10 and ImageNet

cifar10 cnn deep-learning image-processing imagenet keras machine-learning neural-network nlp tensorflow

Last synced: 01 Aug 2024

https://github.com/huggingface/hmtl

🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP

multi-task-learning natural-language-processing nlp pytorch

Last synced: 01 Aug 2024

https://github.com/natasha/natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

embeddings morphology ner nlp python russian sentence-segmentation syntax tokenizer visualization

Last synced: 30 Jul 2024

https://github.com/bheinzerling/bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)

embeddings multilingual natural-language-processing nlp subword-embeddings

Last synced: 03 Sep 2024

https://github.com/MilaNLProc/contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

bert embeddings multilingual-models multilingual-topic-models neural-topic-models nlp nlp-library nlp-machine-learning text-as-data topic-coherence topic-modeling transformer

Last synced: 01 Aug 2024

https://github.com/DengBoCong/nlp-paper

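Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code available in both TensorFlow and PyTorch versions).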

bert dialogue nlp nlp-machine-learning paper pytorch speech tensorflow2

Last synced: 03 Aug 2024