Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and other natural-language tasks.

https://github.com/anasaito/skillner

A (smart) rule based NLP module to extract job skills from text

ner nlp python rule-based skillner skills spacy

Last synced: 19 Jan 2025

https://github.com/kuutsav/information-retrieval

Neural information retrieval / Semantic search / Bi-encoders

information-retrieval machine-learning nlp semantic-search

Last synced: 15 Nov 2024

https://github.com/crazyofapple/Reading_groups

A paper and resource list for large language models, including courses, papers, demos, and figures

chatgpt gpt-3 gpt-4 large-language-models llm llms natural-language-processing nlp

Last synced: 10 Nov 2024

https://github.com/platisd/duplicate-code-detection-tool

A simple Python3 tool to detect similarities between files within a repository

code-duplication gensim nlp

Last synced: 21 Jan 2025

https://github.com/geekjr/quickai

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

ai artificial-intelligence bert deep-learning dl easy-to-use fast gpt gpt-neo huggingface-transformers ml neural-network nlp object-detection python pytorch quickai research tensorflow2 yolo

Last synced: 09 Nov 2024

https://github.com/Texera/texera

Collaborative Machine-Learning-Centric Data Analytics Using Workflows

data-analytics declarative-ui machine-learning nlp texera workflow

Last synced: 06 Jan 2025

https://github.com/NPCai/Open-IE-Papers

Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.

information-extraction literature-review nlp openie papers relation-extraction tuples

Last synced: 10 Nov 2024

https://github.com/apple/ml-mkqa

We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. For details, please refer to our paper, "MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering".

dataset multilingual-evaluation nlp

Last synced: 07 Oct 2024

https://github.com/lyeoni/prenlp

Preprocessing Library for Natural Language Processing

natural-language-processing nlp preprocessing-library text-preprocessing text-processing

Last synced: 23 Jan 2025

https://github.com/husseinmozannar/SOQAL

Arabic Open Domain Question Answering System using Neural Reading Comprehension

arabic arabic-language arabic-nlp deep-learning nlp question-answering reading-comprehension tf-idf

Last synced: 14 Nov 2024

https://github.com/lancern/asm2vec

An unofficial implementation of asm2vec as a standalone python package

asm2vec binary-analysis machine-learning nlp numpy python python3 unofficial word2vec

Last synced: 20 Dec 2024

https://github.com/Lancern/asm2vec

An unofficial implementation of asm2vec as a standalone python package

asm2vec binary-analysis machine-learning nlp numpy python python3 unofficial word2vec

Last synced: 17 Nov 2024

https://github.com/the-javapocalypse/Twitter-Sentiment-Analysis

This script can tell you how people feel about any event happening in the world by analyzing tweets related to that event

nlp python python3 sentiment sentiment-analysis textblob tweepy tweets twitter twitter-sentiment-analysis

Last synced: 25 Oct 2024

https://github.com/microsoft/astra

Self-training with Weak Supervision (NAACL 2021)

machine-learning nlp weak-supervision weakly-supervised-learning

Last synced: 19 Dec 2024

https://github.com/Yachay-AI/byt5-geotagging

Confidence- and ByT5-based geotagging model that predicts coordinates from text alone.

coordinates deep-learning geo-location geotagging machine-learning neural-network nlp nlp-machine-learning python pytorch transformers

Last synced: 05 Nov 2024

https://github.com/alisonmitchell/stock-prediction

Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.

beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance

Last synced: 19 Dec 2024

https://github.com/microsoft/ASTRA

Self-training with Weak Supervision (NAACL 2021)

machine-learning nlp weak-supervision weakly-supervised-learning

Last synced: 05 Nov 2024

https://github.com/lucaterre/spacyfishing

A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata

entity-disambiguation entity-linking natural-language-processing nlp python3 spacy spacy-extension spacy-extensions wikidata

Last synced: 22 Jan 2025

https://github.com/danielegrattarola/twitter-sentiment-cnn

An implementation in TensorFlow of a convolutional neural network (CNN) to perform sentiment classification on tweets.

deep-learning nlp python sentiment-classification tensorflow twitter

Last synced: 01 Nov 2024

https://github.com/lxuechen/private-transformers

A codebase that makes differentially private training of transformers easy.

deep-learning differential-privacy huggingface-transformers nlp pytorch transformers

Last synced: 27 Oct 2024

https://github.com/rokid/elmo-chinese

Deep contextualized word representations for Chinese

nlp tensorflow word-embedding wordvectors

Last synced: 05 Dec 2024

https://github.com/liucongg/unilmchatchitrobot

UniLM for Chinese Chitchat Robot. A compliment-style chitchat bot project based on the UniLM model.

chatbot chinese generation nlp unilm

Last synced: 20 Nov 2024

https://github.com/zhpmatrix/bertem

Paper implementation (ACL 2019): "Matching the Blanks: Distributional Similarity for Relation Learning"

acl2019 bert-pytorch fewrel matching-the-blanks nlp relation-extraction

Last synced: 15 Nov 2024

https://github.com/brolin59/trnlp

Natural language processing tools for Turkish

dogal-dil-isleme morfoloji morfolojik-analiz nlp turkish-nlp turkish-sentence-tokenizer

Last synced: 12 Nov 2024

https://github.com/zhpmatrix/BERTem

Paper implementation (ACL 2019): "Matching the Blanks: Distributional Similarity for Relation Learning"

acl2019 bert-pytorch fewrel matching-the-blanks nlp relation-extraction

Last synced: 02 Nov 2024

https://github.com/chewxy/lingo

package lingo provides the data structures and algorithms required for natural language processing

conll-u go golang inflection language-model natural-language-processing nlp nlp-dependency-parsing nlp-library nlp-machine-learning nlp-parsing part-of-speech part-of-speech-tagger

Last synced: 03 Jan 2025

https://github.com/squeezeailab/llm2llm

[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

data-augmentation llama llama2 llm llms natural-language-processing nlp synthetic-dataset-generation transformer

Last synced: 05 Dec 2024

https://github.com/smyja/blackmaria

Python package for web scraping using natural language

gpt-3 nlp openai python webscraping

Last synced: 29 Nov 2024

https://github.com/dair-ai/ml-nlp-paper-discussions

📄 A repo containing notes and discussions for our weekly NLP/ML paper discussions.

machine-learning ml nlp

Last synced: 08 Jan 2025

https://github.com/ClipsAI/clipsai

Clips AI is an open-source Python library that automatically converts long videos into clips.

computer-vision nlp video-processing

Last synced: 06 Nov 2024

https://github.com/rth/vtext

Simple NLP in Rust with Python bindings

bag-of-words information-retrieval nlp tf-idf tokenization

Last synced: 20 Jan 2025

https://github.com/thunlp/OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 16 Nov 2024

https://github.com/kevincobain2000/jProcessing

Japanese Natural Language Processing Libraries

japanese nlp word-sense-disambiguation wsd

Last synced: 30 Oct 2024

https://github.com/calpt/awesome-adapter-resources

Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning

adapters awesome deep-learning natural-language-processing nlp parameter-efficient-learning parameter-efficient-tuning peft transformers

Last synced: 15 Oct 2024

https://github.com/microsoft/browsecloud

A web app to create and browse text visualizations for automated customer listening.

bayesian-networks counting-grids nlp text-classification text-processing visualization

Last synced: 22 Nov 2024

https://github.com/vatshayan/live-chatbot-for-final-year-project

Chatbot system for a final-year project, built in Python with the Natural Language Toolkit (NLTK) and machine learning. Easy to understand and implement.

btech-project capstone-project chat chat-application chatbot chatbots college-project computer-science cse-project final final-project final-year-project final-year-projects machine-learning nlp nltk project-ideas projects python python-project

Last synced: 28 Oct 2024

https://github.com/kevincobain2000/jprocessing

Japanese Natural Language Processing Libraries

japanese nlp word-sense-disambiguation wsd

Last synced: 30 Nov 2024

https://github.com/emilhvitfeldt/r-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 18 Dec 2024

https://github.com/EmilHvitfeldt/R-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 22 Nov 2024

https://github.com/dbklim/rnnoise_wrapper

A simple Python wrapper for audio noise reduction RNNoise. Simplifies work with it, adds new trained models and detailed instructions for training.

audio audio-processing denoise denoiser denoising dsp ml nlp noise noise-algorithms noise-reduction noise-suppression python-wrapper rnn rnnoise rnnoise-training rnnoise-wrapper rtc wav

Last synced: 27 Dec 2024

https://github.com/xalanq/chinese-sentiment-classification

Simple Chinese text sentiment classification (MLP, CNN, RNN in PyTorch) - 2019 THU Introduction to Artificial Intelligence coursework

nlp pytorch

Last synced: 07 Nov 2024

https://github.com/thunlp/openbackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

backdoor-attacks nlp

Last synced: 10 Nov 2024

https://github.com/houbb/segment

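The jieba-analysis tool for Java. (A more flexible, elegant, easy-to-use, and high-performance Java word segmentation implementation based on the jieba dictionary. Supports part-of-speech tagging.)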

benchmark chinese dfa hmm java jieba jieba-analysis jieba-chinese nlp segment segmentation trie trie-tree

Last synced: 21 Jan 2025

https://github.com/emres/turkish-deasciifier

Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs

deasciifier diacritics diacritics-reconstruction diacritics-restoration nlp nlp-library python turkish turkish-nlp

Last synced: 12 Nov 2024

https://github.com/yuewang-cuhk/takg

The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"

keyphrase-generation nlp social-media topic-modeling

Last synced: 09 Nov 2024

https://github.com/rocketchat/hubot-natural

Natural Language Processing Chatbot for RocketChat

chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot

Last synced: 20 Jan 2025

https://github.com/RocketChat/hubot-natural

Natural Language Processing Chatbot for RocketChat

chatbot coffeescript hubot hubot-natural nlp nodejs rocketchat rocketchat-hubot

Last synced: 26 Oct 2024

https://github.com/CLUEbenchmark/DataCLUE

DataCLUE: a data-centric NLP benchmark and toolkit

ai chinese classification-algorithm data-centric human-in-the-loop nlp

Last synced: 16 Nov 2024

https://github.com/cluebenchmark/dataclue

DataCLUE: a data-centric NLP benchmark and toolkit

ai chinese classification-algorithm data-centric human-in-the-loop nlp

Last synced: 16 Nov 2024

https://github.com/amaiya/causalnlp

CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.

causal-inference nlp

Last synced: 22 Jan 2025

https://github.com/flight-school/guide-to-swift-strings-sample-code

Xcode Playground Sample Code for the Flight School Guide to Swift Strings

antlr4 binary-to-text nlp parser regex strings swift unicode

Last synced: 26 Nov 2024

https://github.com/azure99/blossomlm

A Chinese-English bilingual conversational large language model

artificial-intelligence chatgpt large-language-models llm nlp

Last synced: 19 Jan 2025

https://github.com/Planeshifter/text-miner

text mining utilities for Node.js

nlp text-mining

Last synced: 10 Nov 2024

https://github.com/KudoAI/duckduckgpt

🐤 DuckDuckGo add-on that brings the magic of ChatGPT to search results (powered by GPT-4!)

ai artificial-intelligence bot chatbot chatgpt chatgpt3 ddg duckduckgo gpt gpt-3 gpt-4 greasemonkey javascript machine-learning nlp openai search userscripts web

Last synced: 30 Oct 2024

https://github.com/ofa-sys/ofasys

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

audio computer-vision deep-learning motion multimodal-learning multitask-learning nlp pretrained-models pytorch transformers vision-and-language

Last synced: 10 Oct 2024

https://github.com/planeshifter/text-miner

text mining utilities for Node.js

nlp text-mining

Last synced: 24 Jan 2025

https://github.com/stanfordnlp/stanza-old

Stanford NLP group's shared Python tools.

natural-language-processing nlp python text-analysis text-processing

Last synced: 08 Nov 2024

https://github.com/redis-developer/redis-arxiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

arxiv arxiv-papers cohere document-retrieval document-search huggingface machine-learning nlp openai react redis vector-database vector-search

Last synced: 24 Jan 2025

https://github.com/ianycxu/GCN-with-BERT

Graph Convolutional Networks (GCN) with BERT for Coreference Resolution Task [Pytorch][DGL]

bert bert-model coreference-resolution gcn gnn graph-convolutional-networks graph-neural-networks nlp pytorch

Last synced: 02 Nov 2024

https://github.com/MxDkl/pls

CLI to convert natural language to terminal commands

chatgpt cli llm nlp openai terminal

Last synced: 06 Nov 2024

https://github.com/km1994/recommendation_advertisement_search

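A collection of introductory notes, paper-reading notes, and interview materials for AI fields such as natural language processing, recommender systems, and search engines (Things You Don't Know About NLP, Things You Don't Know About Recommender Systems, NLP Interview Q&A, Recommender System Interview Q&A, Search Engine Interview Q&A)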

advertisement nlp recommendation-system search-engine

Last synced: 14 Jan 2025

https://github.com/living-with-machines/deezymatch

A Flexible Deep Learning Approach to Fuzzy String Matching

deep-learning hacktoberfest hut23 hut23-96 machine-learning natural-language-processing nlp

Last synced: 20 Jan 2025

https://github.com/ai-forever/ru-clip

CLIP implementation for Russian language

clip computer-vision nlp

Last synced: 20 Dec 2024

https://github.com/eugeneyan/recsys-nlp-graph

🛒 Simple recommender with matrix factorization, graph, and NLP. Beating the regular collaborative filtering baseline.

graph matrix-factorization nlp pytorch recommender-system

Last synced: 05 Jan 2025

https://github.com/datquocnguyen/RDRPOSTagger

A fast and accurate POS and morphological tagging toolkit (EACL 2014)

java nlp part-of-speech-tagger pos-tagger pos-tagging python3

Last synced: 30 Oct 2024

https://github.com/arian-askari/ChatGPT-RetrievalQA-CIKM2023

A dataset for training and evaluating question answering retrieval models on ChatGPT responses, with the option of training and evaluating on real human responses.

ai chatgpt chatgpt-information-retrieval chatgpt-ir data-augmentation dataset deep-learning gpt-3 gpt2 gpt3 information-retrieval information-retrieval-chatgpt ir ir-chatgpt machine-learning nlp openai python sequence-to-sequence text-retrieval

Last synced: 30 Oct 2024

https://github.com/aerdem4/kaggle-quora-dup

Solution to Kaggle's Quora Duplicate Question Detection Competition

neural-network nlp regex siamese-lstm siamese-network

Last synced: 29 Dec 2024

https://github.com/boudinfl/ake-datasets

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

benchmarking datasets information-retrieval keyphrase-extraction keyphrase-generation keyword-extraction natural-language-processing nlp nlp-machine-learning

Last synced: 14 Oct 2024

https://github.com/HKUST-KnowComp/MnemonicReader

A PyTorch implementation of Mnemonic Reader for the Machine Comprehension task

document-reader machine-comprehension mnemonic-reader nlp pytorch r-net squad

Last synced: 27 Nov 2024

https://github.com/hankcs/id-cnn-cws

Source code and corpora for the paper "Iterated Dilated Convolutions for Chinese Word Segmentation"

bilstm cnn crf cws nlp tensorflow

Last synced: 27 Oct 2024