Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/GauravBh1010tt/DeepLearn

Implementation of research papers on Deep Learning + NLP + CV in Python using Keras, TensorFlow, and scikit-learn.

audio-processing computer-vision deep-learning nlp

Last synced: 25 Oct 2024

https://github.com/gauravbh1010tt/deeplearn

Implementation of research papers on Deep Learning + NLP + CV in Python using Keras, TensorFlow, and scikit-learn.

audio-processing computer-vision deep-learning nlp

Last synced: 21 Dec 2024

https://github.com/yongzhuo/keras-textclassification

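Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character-, word-, and sentence-vector embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, capsule network (CapsuleNet), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN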

albert bert capsule charcnn crnn dcnn dpcnn embeddings fasttext han keras keras-textclassification leam nlp rcnn text-classification textcnn transformer vdcnn xlnet

Last synced: 18 Dec 2024

https://github.com/chineseglue/chineseglue

Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard

albert bert chinese-corpus datasets glue language-understanding nlp pre-trained-model

Last synced: 21 Dec 2024

https://github.com/ChineseGLUE/ChineseGLUE

Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard

albert bert chinese-corpus datasets glue language-understanding nlp pre-trained-model

Last synced: 06 Nov 2024

https://github.com/425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

chinese-data-augmentation chinese-eda data-augmentation nlp nlpcda

Last synced: 19 Dec 2024

https://github.com/ymcui/chinese-llama-alpaca-3

Chinese LLaMA-Alpaca large language model project, phase 3 (Chinese Llama-3 LLMs), developed from Meta Llama 3

alpaca large-language-models llama llama-2 llama-3 llama3 llm nlp

Last synced: 22 Dec 2024

https://github.com/huggingface/transfer-learning-conv-ai

🦄 State-of-the-Art Conversational AI with Transfer Learning

chatbots deep-learning dialog gpt gpt-2 neural-networks nlp pytorch transfer-learning

Last synced: 21 Dec 2024

https://github.com/deepset-ai/farm

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

bert deep-learning germanbert language-models ner nlp nlp-framework nlp-library pretrained-models pytorch question-answering roberta transfer-learning xlnet-pytorch

Last synced: 19 Dec 2024

https://github.com/bdbc-kg-nlp/qa-survey-cn

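A summary of research on and applications of intelligent question answering, compiled by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers knowledge-graph-based question answering (KBQA), text-based question answering (TextQA), table-based question answering (TableQA), visual question answering (VisualQA), and machine reading comprehension (MRC); for each type of task, work from both academia and industry is summarized.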

cqa kbqa nlp qa qa-survey question-answering survey tqa vqa

Last synced: 02 Dec 2024

https://github.com/deepset-ai/FARM

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

bert deep-learning germanbert language-models ner nlp nlp-framework nlp-library pretrained-models pytorch question-answering roberta transfer-learning xlnet-pytorch

Last synced: 04 Nov 2024

https://github.com/paddlepaddle/research

novel deep learning research works with PaddlePaddle

computer-vision data-mining deep-learning knowledge-graph nlp spatial-temporal

Last synced: 18 Dec 2024

https://github.com/PaddlePaddle/Research

novel deep learning research works with PaddlePaddle

computer-vision data-mining deep-learning knowledge-graph nlp spatial-temporal

Last synced: 01 Nov 2024

https://github.com/imcaspar/gpt2-ml

GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter Chinese pretrained model.

bert chinese colab gpt-2 nlp pretrained-models tensorflow text-generation tpu

Last synced: 21 Dec 2024

https://github.com/BDBC-KG-NLP/QA-Survey-CN

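A summary of research on and applications of intelligent question answering, compiled by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers knowledge-graph-based question answering (KBQA), text-based question answering (TextQA), table-based question answering (TableQA), visual question answering (VisualQA), and machine reading comprehension (MRC); for each type of task, work from both academia and industry is summarized.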

cqa kbqa nlp qa qa-survey question-answering survey tqa vqa

Last synced: 11 Nov 2024

https://github.com/allenai/scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

bioinformatics biomedical custom-pipes nlp scientific-documents spacy

Last synced: 17 Dec 2024

https://allenai.github.io/scispacy/

A full spaCy pipeline and models for scientific/biomedical documents.

bioinformatics biomedical custom-pipes nlp scientific-documents spacy

Last synced: 21 Nov 2024

https://github.com/franck-dernoncourt/neuroner

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

deep-learning machine-learning named-entity-recognition neural-networks nlp tensorflow

Last synced: 20 Dec 2024

https://github.com/Franck-Dernoncourt/NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

deep-learning machine-learning named-entity-recognition neural-networks nlp tensorflow

Last synced: 07 Nov 2024

https://github.com/Roshanson/TextInfoExp

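Natural language processing experiments (sougou dataset): TF-IDF, text classification, clustering, word vectors, sentiment recognition, relation extraction, and more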

nlp python

Last synced: 31 Oct 2024

https://github.com/microsoft/recognizers-text

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV). Packages available at: https://www.nuget.org/profiles/Recognizers.Text, https://www.npmjs.com/~recognizers.text

date datetime datetime-normalization-and-resolution entity-extraction hacktoberfest ner nlp number-expression numbers numex parser parser-library time time-expression time-expression-recognition timex

Last synced: 16 Dec 2024

https://github.com/graph4ai/graph4nlp

Graph4nlp is a library for the easy use of Graph Neural Networks for NLP. Visit our DLG4NLP website (https://dlg4nlp.github.io/index.html) for various learning resources!

deep-learning graph-neural-networks machine-learning natural-language-processing nlp pytorch

Last synced: 19 Dec 2024

https://github.com/eikek/docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources, with minimal effort.

dms docspell document document-management document-management-system edms elm nlp ocr pdf personal-document-system scala self-hosted spa stanford-corenlp webapp

Last synced: 19 Dec 2024

https://github.com/microsoft/Recognizers-Text

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV). Packages available at: https://www.nuget.org/profiles/Recognizers.Text, https://www.npmjs.com/~recognizers.text

date datetime datetime-normalization-and-resolution entity-extraction hacktoberfest ner nlp number-expression numbers numex parser parser-library time time-expression time-expression-recognition timex

Last synced: 09 Nov 2024

https://github.com/explosion/spacy-models

💫 Models for the spaCy Natural Language Processing (NLP) library

machine-learning machine-learning-models models natural-language-processing nlp spacy spacy-models statistical-models

Last synced: 17 Dec 2024

https://github.com/ymcui/chinese-xlnet

Pre-Trained Chinese XLNet

natural-language-processing nlp pytorch tensorflow xlnet

Last synced: 21 Dec 2024

https://github.com/ymcui/Chinese-XLNet

Pre-Trained Chinese XLNet

natural-language-processing nlp pytorch tensorflow xlnet

Last synced: 31 Oct 2024

https://github.com/ymcui/Chinese-LLaMA-Alpaca-3

Chinese LLaMA-Alpaca large language model project, phase 3 (Chinese Llama-3 LLMs), developed from Meta Llama 3

alpaca large-language-models llama llama-2 llama-3 llama3 llm nlp

Last synced: 14 Nov 2024

https://github.com/lihanghang/nlp-knowledge-graph

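Research on and applications of three major technologies: natural language processing, knowledge graphs, and dialogue systems.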

bert deep-learning ernie event-driven kbqa knowledge-graph machine-learning ner nlp transformers

Last synced: 03 Dec 2024

https://github.com/timoschick/pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

machine-learning nlp python

Last synced: 21 Dec 2024

https://github.com/lihanghang/NLP-Knowledge-Graph

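Research on and applications of three major technologies: natural language processing, knowledge graphs, and dialogue systems.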

bert deep-learning ernie event-driven kbqa knowledge-graph machine-learning ner nlp transformers

Last synced: 30 Oct 2024

https://github.com/airaria/textbrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

bert distillation knowledge nlp pytorch

Last synced: 21 Dec 2024

https://github.com/beir-cellar/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

ance benchmark bert colbert dataset deep-learning dpr elasticsearch information-retrieval nlp passage-retrieval pytorch question-generation retrieval retrieval-models sbert sentence-transformers use-qa zero-shot-retrieval

Last synced: 17 Dec 2024

https://github.com/airaria/TextBrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

bert distillation knowledge nlp pytorch

Last synced: 03 Nov 2024

https://github.com/tatsu-lab/alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

deep-learning evaluation foundation-models instruction-following large-language-models leaderboard nlp rlhf

Last synced: 17 Dec 2024

https://github.com/datamade/usaddress

🇺🇸 A Python library for parsing unstructured United States address strings into address components

address address-parser conditional-random-fields crf machine-learning natural-language-processing nlp parserator python python-library

Last synced: 17 Dec 2024

https://github.com/allenai/bi-att-flow

Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.

bidaf nlp question-answering squad tensorflow

Last synced: 21 Dec 2024

https://github.com/yongzhuo/nlp_xiaojiang

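Natural language processing (NLP): Xiaojiang robot (a retrieval-based chit-chat chatbot), BERT sentence vectors and similarity (Sentence Similarity), XLNet sentence vectors and similarity (text xlnet embedding), text classification, entity extraction (NER, bert+bilstm+crf), data augmentation (text augment, data enhance), generation of synonymous sentences and synonyms, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, and calls via keras-http-service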

bert chatbot chinese data-augmentation distance enhance feature nlp text-augment text-classification xlnet

Last synced: 21 Dec 2024

https://github.com/bfelbo/deepmoji

State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.

ai deep-learning keras machine-learning natural-language-processing neural-networks nlp python sentiment-analysis tensorflow text-classification

Last synced: 19 Dec 2024

https://github.com/bfelbo/DeepMoji

State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.

ai deep-learning keras machine-learning natural-language-processing neural-networks nlp python sentiment-analysis tensorflow text-classification

Last synced: 11 Nov 2024

https://github.com/thunlp/taadpapers

Must-read Papers on Textual Adversarial Attack and Defense

adversarial-attacks adversarial-defense adversarial-learning natural-language-processing nlp paper-list

Last synced: 19 Dec 2024

https://github.com/juand-r/entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.

annotations corpora datasets entity-extraction entity-recognition named-entity-recognition natural-language-processing ner nlp nlp-resources

Last synced: 19 Dec 2024

https://github.com/thunlp/TAADpapers

Must-read Papers on Textual Adversarial Attack and Defense

adversarial-attacks adversarial-defense adversarial-learning natural-language-processing nlp paper-list

Last synced: 30 Oct 2024

https://github.com/chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

buffer covid-19 detection extraction memex mime nlp nlp-library nlp-machine-learning parse parser-interface python recognition text-extraction text-recognition tika-python tika-server tika-server-jar translation-interface usc

Last synced: 17 Dec 2024

https://github.com/NLPchina/nlp-lang

This project is a base package that wraps the tools commonly used in most NLP projects.

java nlp nlp-lang tire

Last synced: 30 Oct 2024

https://github.com/nlpchina/nlp-lang

This project is a base package that wraps the tools commonly used in most NLP projects.

java nlp nlp-lang tire

Last synced: 19 Dec 2024

https://github.com/tencent/turbotransformers

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

albert bert decoder gpt2 gpu huggingface-transformers inference machine-translation nlp pytorch roberta transformer

Last synced: 22 Dec 2024

https://github.com/dair-ai/transformers-recipe

🧠 A study guide to learn about Transformers

ai deep-learning machine-learning natural-language-processing nlp

Last synced: 10 Nov 2024

https://github.com/dair-ai/Transformers-Recipe

🧠 A study guide to learn about Transformers

ai deep-learning machine-learning natural-language-processing nlp

Last synced: 05 Nov 2024

https://github.com/Tencent/TurboTransformers

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

albert bert decoder gpt2 gpu huggingface-transformers inference machine-translation nlp pytorch roberta transformer

Last synced: 27 Oct 2024

https://github.com/dair-ai/nlp_paper_summaries

✍️ A carefully curated list of NLP paper summaries

deep-learning machine-learning nlp

Last synced: 29 Nov 2024

https://github.com/HIT-SCIR/ELMoForManyLangs

Pre-trained ELMo Representations for Many Languages

elmo multilingual nlp

Last synced: 13 Nov 2024

https://github.com/shibing624/similarity

similarity: a text similarity calculation toolkit written in Java. It can be used for tasks such as text similarity calculation and sentiment analysis, and works out of the box.

java nlp semantic sentiment sim-scores similarity

Last synced: 21 Dec 2024

https://github.com/nx-ai/xlstm

Official repository of the xLSTM.

deep-learning deep-learning-architecture llm machine-learning nlp rnn

Last synced: 18 Dec 2024

https://github.com/allenai/scibert

A BERT model for scientific text.

bert nlp scientific-papers

Last synced: 02 Nov 2024

https://github.com/demidovakatya/vvedenie-mashinnoe-obuchenie

📝 A collection of resources on machine learning

collections data-mining data-science deep-learning machine-learning mooc neural-networks nlp russian university

Last synced: 30 Nov 2024

https://github.com/konlpy/konlpy

Python package for Korean natural language processing.

hacktoberfest korean korean-nlp morphology nlp python text-mining

Last synced: 21 Dec 2024

https://github.com/yoshitomo-matsubara/torchdistill

A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 25 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs, and configurations are available to ensure reproducibility and support benchmarking.

amazon-sagemaker-lab cifar10 cifar100 coco colab-notebook glue google-colab image-classification imagenet knowledge-distillation natural-language-processing nlp object-detection pascal-voc pytorch semantic-segmentation transformer

Last synced: 18 Dec 2024

https://github.com/ymcui/chinese-electra

Pre-trained Chinese ELECTRA

bert chinese chinese-electra electra language-model nlp pre-trained-model pytorch tensorflow

Last synced: 22 Dec 2024

https://github.com/ymcui/Chinese-ELECTRA

Pre-trained Chinese ELECTRA

bert chinese chinese-electra electra language-model nlp pre-trained-model pytorch tensorflow

Last synced: 07 Nov 2024

https://github.com/fergiemcdowall/search-index

A persistent, network resilient, full text search library for the browser and Node.js

nlp offline-first search

Last synced: 17 Dec 2024

https://github.com/yao8839836/text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

deep-learning graph-convolutional-networks nlp text-classification

Last synced: 22 Dec 2024

https://github.com/explosion/projects

🪐 End-to-end NLP workflows from prototype to production

annotations datasets natural-language-processing nlp prodigy spacy

Last synced: 19 Dec 2024

https://github.com/omarsar/nlp_overview

Overview of Modern Deep Learning Techniques Applied to Natural Language Processing

cnn deep-learning nlp reinforcement-learning rnn word-embeddings

Last synced: 22 Dec 2024

https://github.com/fukuball/jieba-php

"Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.

chinese-text-segmentation machine-learning natural-language-processing nlp

Last synced: 19 Dec 2024

https://github.com/facebookarchive/duckling_old

Deprecated in favor of https://github.com/facebook/duckling

nlp nlu parser

Last synced: 26 Sep 2024

https://github.com/neuml/paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

ai artificial-intelligence document-search machine-learning medical nlp python scientific-papers search txtai

Last synced: 21 Dec 2024

https://github.com/cdpierse/transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.

captum computer-vision deep-learning explainable-ai interpretability machine-learning model-explainability natural-language-processing neural-network nlp transformers transformers-model

Last synced: 20 Dec 2024

https://github.com/hyperonym/basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

generative gpt huggingface language-model llama llm model natural-language-processing nlp openai-api python text-generation transformers

Last synced: 27 Sep 2024

https://github.com/oracle/tribuo

Tribuo - A Java machine learning library

classification clustering deep-learning java machine-learning ml nlp regression

Last synced: 16 Dec 2024

https://github.com/SKTBrain/KoBERT

Korean BERT pre-trained cased (KoBERT)

bert korean-nlp language-model nlp pytorch transformers

Last synced: 09 Nov 2024

https://github.com/linkedin/detext

DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks

classification deep-neural-networks detext-framework nlp ranking text-embeddings

Last synced: 21 Dec 2024