Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/PaddlePaddle/PALM

A fast, flexible, extensible, and easy-to-use framework for large-scale NLP pretraining and multi-task learning.

baidu multi-task-learning nlp paddlepaddle pretrain-model transformers

Last synced: 07 Aug 2024

https://github.com/sorgerlab/indra

INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.

bioinformatics biology computational-biology indra modeling nlp pysb sbml systems-biology

Last synced: 14 Oct 2024

https://github.com/simongray/clojure-dsl-resources

A curated list of Clojure resources for dealing with domain-specific languages.

data-transformation domain-specific-language dsl nlp parsing

Last synced: 22 Oct 2024

https://github.com/coastalcph/lex-glue

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

benchmark lawtech legal legaltech nlp

Last synced: 02 Nov 2024

https://github.com/xatkit-bot-platform/xatkit

The simplest way to build all types of smart chatbots and digital assistants

bot chatbot-framework chatbots conversational-ai digital-assistant dsl low-code nlp no-code

Last synced: 07 Nov 2024

https://github.com/hscspring/all4nlp

All For NLP, especially Chinese.

ai deeplearning machinelearning nlp

Last synced: 27 Oct 2024

https://github.com/princeton-nlp/cofipruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

bert model-compression nlp pruning

Last synced: 11 Nov 2024

https://github.com/Uzay-G/espial

Espial is an engine for automated organization and discovery of personal knowledge

knowledge knowledge-graph nlp python

Last synced: 01 Nov 2024

https://github.com/uzay-g/espial

Espial is an engine for automated organization and discovery of personal knowledge

knowledge knowledge-graph nlp python

Last synced: 27 Oct 2024

https://github.com/dengbocong/text-similarity

Text similarity (matching) computation, providing baselines, training, inference, metric analysis... The code includes both TensorFlow and PyTorch versions.

bert deep-learning mechine-learing model nlp pytorch similarity text-classification transformer

Last synced: 08 Nov 2024

https://github.com/daspartho/prompt-extend

extending stable diffusion prompts with suitable style cues using text generation

deep-learning gpt-2 huggingface-spaces huggingface-transformers machine-learning nlp prompt stable-diffusion text-generation

Last synced: 03 Aug 2024

https://github.com/cyberzhg/keras-xlnet

Implementation of XLNet that can load pretrained checkpoints

glue keras language-model nlp xlnet

Last synced: 27 Sep 2024

https://github.com/HKUSTDial/NL2SQL_Handbook

This is a continuously updated handbook for readers to easily track the latest NL2SQL techniques in the literature and provide practical guidance for researchers and practitioners.

awesome finetuning llms nl-to-code nl-to-sql nl2sql nlp nlp-resources survey text-to-sql text2sql tutorial

Last synced: 02 Nov 2024

https://github.com/shjwudp/shu

A curated collection of Chinese books

books dataset nlp

Last synced: 27 Oct 2024

https://github.com/yohasebe/wp2txt

A command-line toolkit to extract text content and category data from Wikipedia dump files

corpus machine-learning nlp ruby wikipedia wikipedia-dump

Last synced: 08 Nov 2024

https://github.com/CyberZHG/keras-xlnet

Implementation of XLNet that can load pretrained checkpoints

glue keras language-model nlp xlnet

Last synced: 03 Aug 2024

https://github.com/avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

fasttext fasttext-embeddings nlp python word-embeddings

Last synced: 13 Nov 2024

https://github.com/j2kao/fcc_nn_research

(somewhat) cleaned-up notebooks used in researching public comments for FCC Proceeding 17-108 (Net Neutrality Repeal)

fcc net-neutrality nlp

Last synced: 09 Aug 2024

https://github.com/dccuchile/wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

bias-detection bias-reduction fairness-ai fairness-ml library nlp nlp-library python3 word-embedding-evaluation word-embedding-fairness word-embeddings

Last synced: 05 Aug 2024

https://github.com/ownthink/semantic

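Semantic understanding / spoken language understanding. The project includes lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition) and spoken language understanding (domain classification, slot filling, intent recognition).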

nlp nlu slu

Last synced: 07 Nov 2024

https://github.com/fiddler-labs/fiddler-auditor

Fiddler Auditor is a tool to evaluate language models.

ai-observability evaluation generative-ai langchain llms nlp robustness

Last synced: 13 Nov 2024

https://github.com/IlyaGusev/summarus

Models for automatic abstractive summarization

deep-learning machine-learning nlp pytorch summarization

Last synced: 04 Nov 2024

https://github.com/ymcui/lert

LERT: A Linguistically-motivated Pre-trained Language Model

bert lert nlp plm pre-train pytorch tensorflow transformer

Last synced: 28 Oct 2024

https://github.com/rylans/getlang

Natural language detection package in pure Go

language-model natural-language nlp

Last synced: 26 Oct 2024

https://github.com/doc-analysis/XFUND

XFUND: A Multilingual Form Understanding Benchmark

dataset natural-language-processing nlp

Last synced: 06 Nov 2024

https://github.com/princeton-nlp/optiprompt

[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240

nlp probing prompt

Last synced: 11 Nov 2024

https://github.com/prrao87/fine-grained-sentiment

A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.

fasttext flair nlp python pytorch sentiment-analysis text-classification transformers

Last synced: 13 Nov 2024

https://github.com/princeton-nlp/OptiPrompt

[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240

nlp probing prompt

Last synced: 04 Aug 2024

https://github.com/natasha/navec

Compact high quality word embeddings for Russian language

embeddings glove nlp python quantization russian word2vec

Last synced: 10 Nov 2024

https://github.com/microsoft/presidio-research

This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers

Last synced: 07 Oct 2024

https://github.com/indix/whatthelang

Lightning Fast Language Prediction 🚀

fasttext language-detection languages nlp python

Last synced: 07 Nov 2024

https://github.com/crazyofapple/Reading_groups

A paper and resource list for large language models, including courses, papers, demos, and figures

chatgpt gpt-3 gpt-4 large-language-models llm llms natural-language-processing nlp

Last synced: 10 Nov 2024

https://github.com/akshaynagpal/w2n

Convert number words (e.g., twenty one) to numeric digits (21)

nlp numeric-digits python word-to-number

Last synced: 05 Aug 2024

https://github.com/NPCai/Open-IE-Papers

Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.

information-extraction literature-review nlp openie papers relation-extraction tuples

Last synced: 10 Nov 2024

https://github.com/geekjr/quickai

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

ai artificial-intelligence bert deep-learning dl easy-to-use fast gpt gpt-neo huggingface-transformers ml neural-network nlp object-detection python pytorch quickai research tensorflow2 yolo

Last synced: 09 Nov 2024

https://github.com/platisd/duplicate-code-detection-tool

A simple Python3 tool to detect similarities between files within a repository

code-duplication gensim nlp

Last synced: 01 Nov 2024

https://github.com/irudnyts/openai

An R package-wrapper around OpenAI API

api ml nlp openai package r

Last synced: 13 Aug 2024

https://github.com/apple/ml-mkqa

We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Please refer to our paper for details, MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering

dataset multilingual-evaluation nlp

Last synced: 07 Oct 2024

https://github.com/kuutsav/information-retrieval

Neural information retrieval / semantic-search / Bi-Encoders

information-retrieval machine-learning nlp semantic-search

Last synced: 03 Aug 2024

https://github.com/lyeoni/prenlp

Preprocessing Library for Natural Language Processing

natural-language-processing nlp preprocessing-library text-preprocessing text-processing

Last synced: 06 Nov 2024

https://github.com/lancern/asm2vec

An unofficial implementation of asm2vec as a standalone python package

asm2vec binary-analysis machine-learning nlp numpy python python3 unofficial word2vec

Last synced: 01 Nov 2024

https://github.com/the-javapocalypse/Twitter-Sentiment-Analysis

This script reports how people feel about any event happening in the world by analyzing tweets related to that event

nlp python python3 sentiment sentiment-analysis textblob tweepy tweets twitter twitter-sentiment-analysis

Last synced: 25 Oct 2024

https://github.com/umarbutler/semchunk

A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.

chunking nlp python semantic-chunking splitting text text-chunking text-splitting

Last synced: 07 Nov 2024

https://github.com/Lancern/asm2vec

An unofficial implementation of asm2vec as a standalone python package

asm2vec binary-analysis machine-learning nlp numpy python python3 unofficial word2vec

Last synced: 03 Aug 2024

https://github.com/husseinmozannar/SOQAL

Arabic Open Domain Question Answering System using Neural Reading Comprehension

arabic arabic-language arabic-nlp deep-learning nlp question-answering reading-comprehension tf-idf

Last synced: 03 Aug 2024

https://github.com/Yachay-AI/byt5-geotagging

Confidence- and ByT5-based geotagging model that predicts coordinates from text alone.

coordinates deep-learning geo-location geotagging machine-learning neural-network nlp nlp-machine-learning python pytorch transformers

Last synced: 05 Nov 2024

https://github.com/microsoft/ASTRA

Self-training with Weak Supervision (NAACL 2021)

machine-learning nlp weak-supervision weakly-supervised-learning

Last synced: 05 Nov 2024

https://github.com/lxuechen/private-transformers

A codebase that makes differentially private training of transformers easy.

deep-learning differential-privacy huggingface-transformers nlp pytorch transformers

Last synced: 27 Oct 2024

https://github.com/danielegrattarola/twitter-sentiment-cnn

An implementation in TensorFlow of a convolutional neural network (CNN) to perform sentiment classification on tweets.

deep-learning nlp python sentiment-classification tensorflow twitter

Last synced: 01 Nov 2024

https://github.com/zhpmatrix/BERTem

Paper implementation (ACL 2019): "Matching the Blanks: Distributional Similarity for Relation Learning"

acl2019 bert-pytorch fewrel matching-the-blanks nlp relation-extraction

Last synced: 02 Nov 2024

https://github.com/anasaito/skillner

A (smart) rule based NLP module to extract job skills from text

ner nlp python rule-based skillner skills spacy

Last synced: 30 Oct 2024

https://github.com/brolin59/trnlp

Natural language processing tools for Turkish

dogal-dil-isleme morfoloji morfolojik-analiz nlp turkish-nlp turkish-sentence-tokenizer

Last synced: 12 Nov 2024

https://github.com/smyja/blackmaria

Python package for web scraping using natural language

gpt-3 nlp openai python webscraping

Last synced: 09 Aug 2024

https://github.com/lucaterre/spacyfishing

A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata

entity-disambiguation entity-linking natural-language-processing nlp python3 spacy spacy-extension spacy-extensions wikidata

Last synced: 31 Oct 2024

https://github.com/dair-ai/ml-nlp-paper-discussions

📄 A repo containing notes and discussions for our weekly NLP/ML paper discussions.

machine-learning ml nlp

Last synced: 10 Nov 2024

https://github.com/ClipsAI/clipsai

Clips AI is an open-source Python library that automatically converts long videos into clips.

computer-vision nlp video-processing

Last synced: 06 Nov 2024

https://github.com/chewxy/lingo

package lingo provides the data structures and algorithms required for natural language processing

conll-u go golang inflection language-model natural-language-processing nlp nlp-dependency-parsing nlp-library nlp-machine-learning nlp-parsing part-of-speech part-of-speech-tagger

Last synced: 31 Oct 2024

https://github.com/vatshayan/live-chatbot-for-final-year-project

Chatbot system for a final-year project, built in Python using the Natural Language Toolkit (NLTK) and machine learning. Easy to understand and implement.

btech-project capstone-project chat chat-application chatbot chatbots college-project computer-science cse-project final final-project final-year-project final-year-projects machine-learning nlp nltk project-ideas projects python python-project

Last synced: 28 Oct 2024

https://github.com/kevincobain2000/jProcessing

Japanese Natural Language Processing Libraries

japanese nlp word-sense-disambiguation wsd

Last synced: 30 Oct 2024

https://github.com/microsoft/browsecloud

A web app to create and browse text visualizations for automated customer listening.

bayesian-networks counting-grids nlp text-classification text-processing visualization

Last synced: 05 Aug 2024

https://github.com/calpt/awesome-adapter-resources

Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning

adapters awesome deep-learning natural-language-processing nlp parameter-efficient-learning parameter-efficient-tuning peft transformers

Last synced: 15 Oct 2024

https://github.com/rth/vtext

Simple NLP in Rust with Python bindings

bag-of-words information-retrieval nlp tf-idf tokenization

Last synced: 30 Oct 2024

https://github.com/emilhvitfeldt/r-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 30 Oct 2024

https://github.com/thunlp/openbackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 10 Nov 2024

https://github.com/xalanq/chinese-sentiment-classification

Simple Chinese text sentiment classification (MLP, CNN, RNN in PyTorch) - 2019 THU Introduction to Artificial Intelligence coursework

nlp pytorch

Last synced: 07 Nov 2024

https://github.com/emres/turkish-deasciifier

Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs

deasciifier diacritics diacritics-reconstruction diacritics-restoration nlp nlp-library python turkish turkish-nlp

Last synced: 12 Nov 2024

https://github.com/yuewang-cuhk/takg

The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"

keyphrase-generation nlp social-media topic-modeling

Last synced: 09 Nov 2024

https://github.com/cluebenchmark/dataclue

DataCLUE: A data-centric NLP benchmark and toolkit

ai chinese classification-algorithm data-centric human-in-the-loop nlp

Last synced: 09 Nov 2024

https://github.com/erfanzar/EasyDeL

EasyDeL is an open-source library that makes training faster and more optimized, with options for training and serving in both Python and Mojo 🔥

easydel flax gpt jax machine-learning mojo nlp optax pytorch transformers

Last synced: 03 Aug 2024

https://github.com/EmilHvitfeldt/R-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 05 Aug 2024

https://github.com/CLUEbenchmark/DataCLUE

DataCLUE: A data-centric NLP benchmark and toolkit

ai chinese classification-algorithm data-centric human-in-the-loop nlp

Last synced: 03 Aug 2024

https://github.com/rocketchat/hubot-natural

Natural Language Processing Chatbot for RocketChat

chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot

Last synced: 29 Oct 2024

https://github.com/RocketChat/hubot-natural

Natural Language Processing Chatbot for RocketChat

chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot

Last synced: 26 Oct 2024

https://github.com/alisonmitchell/stock-prediction

Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.

beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance

Last synced: 07 Nov 2024

https://github.com/Planeshifter/text-miner

text mining utilities for Node.js

nlp text-mining

Last synced: 10 Nov 2024

https://github.com/planeshifter/text-miner

text mining utilities for Node.js

nlp text-mining

Last synced: 26 Oct 2024

https://github.com/ofa-sys/ofasys

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

audio computer-vision deep-learning motion multimodal-learning multitask-learning nlp pretrained-models pytorch transformers vision-and-language

Last synced: 10 Oct 2024

https://github.com/thunlp/OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 03 Aug 2024