Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.

https://github.com/fractalego/pynsett

A programmable relation extraction tool

extract-relationships nlp relation-extraction spacy wikidata-knowledge

Last synced: 12 Oct 2024

https://github.com/ucrel/pymusas

Python Multilingual Ucrel Semantic Analysis System

natural-language-processing nlp python spacy spacy-pipeline

Last synced: 12 Oct 2024

https://github.com/dbklim/stressrnn

Modified version of RusStress (https://github.com/MashaPo/russtress), a Python package for placing stress in Russian text using an RNN (BiLSTM) and the "Grammatical Dictionary" by A. A. Zaliznyak (from http://odict.ru/).

accent bilstm emphasis linguistic linguistics lstm nlp rnn russian russian-accent russian-stress russtress rustress stress

Last synced: 11 Nov 2024

https://github.com/ayaka14732/bart-base-jax

JAX implementation of the bart-base model

bart jax natural-language-processing nlp nlp-model

Last synced: 28 Oct 2024

https://github.com/proycon/python-ucto

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

computational-linguistics folia nlp nlp-library python text-processing tokenizer

Last synced: 14 Nov 2024
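
To illustrate why tokenisation is less trivial than it looks, here is a minimal sketch of regular-expression-based tokenisation in plain Python. It does not use Ucto or the python-ucto binding; the pattern and the `tokenize` helper are hypothetical and far simpler than Ucto's actual rules.

```python
import re

# Hypothetical pattern for illustration only; not Ucto's rule set.
TOKEN_PATTERN = re.compile(
    r"""
    \d+(?:[.,]\d+)*        # numbers with internal separators
    | \w+(?:['’-]\w+)*     # words, keeping internal apostrophes and hyphens
    | [^\w\s]              # any other single non-space character
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return the tokens found in text, in order, skipping whitespace."""
    return TOKEN_PATTERN.findall(text)


sentence = "Dr. Smith's cat cost 1,000 euros, didn't it?"

# Whitespace splitting leaves punctuation glued to words ("euros,", "it?").
print(sentence.split())

# The regex version separates punctuation but keeps "1,000" and "didn't" whole.
print(tokenize(sentence))
```

Even this small pattern has to decide how to treat abbreviations, clitics, and number formats, which is the kind of language-specific detail a dedicated tokeniser is meant to handle.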

https://github.com/97k/spam-ham-web-app

A web app that classifies text as spam or ham. I am using my own ML algorithm in the backend; the code for it can be found under machine_learning_section.

bag-of-words data-visualization django heroku-deployment jupyter-notebook machine-learning machine-learning-projects multinomial-naive-bayes nlp nltk spam-classification text-classification tfidf

Last synced: 11 Nov 2024

https://github.com/ademakdogan/gpterm

Creating Intelligent Terminal Apps with ChatGPT and LLM Models

chatgpt chatgpt-api iterm2 langchain langchain-python natural-language-processing nlp python query-generator terminal

Last synced: 07 Nov 2024

https://github.com/benjaminvdb/DBRD

110k Dutch Book Reviews Dataset for Sentiment Analysis

dataset dataset-creation dutch nlp nlp-machine-learning python python3 scraped-data scraper

Last synced: 03 Aug 2024

https://github.com/sedthh/lara-hungarian-nlp

NLP class for rapid chatbot development in the Hungarian language

chatbot hungarian hungarian-language lemmatizer nlp python3 stemmer

Last synced: 03 Aug 2024

https://github.com/dsdanielpark/gpt2-bert-medical-qa-chat

Medical domain-focused GPT-2 fine-tuning, optimization, and lightweighting research repository (compared to GPT-4).

bert chatgpt gpt2 gpt4 medical-chatbot natural-language-processing nlp nlp-keywords-extraction

Last synced: 14 Nov 2024

https://github.com/akosbalasko/obsidian-autotagger-plugin

This plugin offers smart tags for notes by performing Named Entity Recognition (NER) on the content

natural-language-processing nlp obsidian-md obsidian-plugin

Last synced: 22 Oct 2024

https://github.com/Qznan/QizNLP

A TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP; supports distributed training)

beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow

Last synced: 03 Aug 2024

https://github.com/maxent-ai/lda2vec

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019

chainer deep-learning embeddings lda nlp python3 sklearn text text-mining topic-modeling word-embeddings word2vec

Last synced: 30 Sep 2024

https://github.com/shashwath94/hierarchical-seq2seq

A PyTorch implementation of the hierarchical encoder-decoder architecture (HRED) introduced in Sordoni et al. (2015) for modeling conversation triples. This version of the model is built for the MovieTriples dataset.

deep-learning hred nlp pytorch seq2seq-pytorch

Last synced: 27 Oct 2024

https://github.com/bloomberg/entsum

Open Source / ENTSUM: A Data Set for Entity-Centric Extractive Summarization

nlp

Last synced: 09 Nov 2024

https://github.com/adamspannbauer/app_rasa_chat_bot

a stateless chat bot to perform natural language queries against the App Store top charts

chatbot dash nlp nlu plotly rasa

Last synced: 11 Oct 2024

https://github.com/griptape-ai/griptape-tools

Tools for the Griptape Framework.

ai cohere gpt huggingface llm nlp openai python

Last synced: 27 Sep 2024

https://github.com/vaibhavs10/10_days_of_deep_learning

10 days, 10 different practical applications of Deep Learning (primarily NLP) using TensorFlow and Keras

classification gensim keras nlp python tensorflow tfidf-matrix

Last synced: 02 Nov 2024

https://github.com/saidziani/feedny

The Internet plays an increasingly important part in our daily lives as a source of written content for news and leisure. Yet it is tedious and difficult to sort through this staggering flow of information and stay updated with changes in our world, even using automated tools. Reading magazines and newspapers is too time-consuming, and there is a huge amount of online content that is updated or generated each minute. Our solution considers each user's interests and leverages Artificial Intelligence, Machine Learning and Natural Language Processing in order to suggest relevant articles from the internet.

automatic-summarization javascript machine-learning machine-translation natural-language-processing nlp profiling react-native recommendation-system text-classification

Last synced: 28 Oct 2024

https://github.com/adapter-hub/efficient-task-transfer

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

adapters bert nlp roberta transfer-learning transformers

Last synced: 06 Nov 2024

https://github.com/voidful/nlprep

🍳 NLPrep - dataset tool for many natural language processing tasks

dataset nlp prepare pytorch tfkit

Last synced: 01 Oct 2024

https://github.com/tianduowang/diffaug

EMNLP 2022: Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536

data-augmentation nlp sentence-embeddings

Last synced: 14 Oct 2024

https://github.com/ramtinms/tokenquery

TokenQuery (regular expressions over tokens)

machine-learning natural-language-processing nlp regex regular-expressions

Last synced: 11 Nov 2024

https://github.com/yuyuzha0/word2vec

A word2vec implementation for the Chinese language, based on deeplearning4j and ansj

chinese java nlp word2vec word2vec-zh

Last synced: 12 Nov 2024

https://github.com/zimmerrol/attention-is-all-you-need-keras

Implementation of the Transformer architecture described by Vaswani et al. in "Attention Is All You Need"

attention-is-all-you-need keras neural-network nlp seq2seq transformer

Last synced: 22 Oct 2024

https://github.com/andreaferretti/charade

A server for multilanguage, composable NLP APIs in Python

nlp nlp-apis python

Last synced: 14 Oct 2024

https://github.com/praful932/llmsearch

Find better generation parameters for your LLM

llm llm-evaluation llm-inference nlp

Last synced: 27 Oct 2024

https://github.com/veler/notepad-based-calculator

A smart calculator using natural language processing

calculator csharp dotnet mef natural-language-processing nlp

Last synced: 29 Oct 2024

https://github.com/swanhtet1992/ReSegment

Burmese (Myanmar) syllable level segmentation with regex.

burmese-nlp myanmar-nlp myanmar-text nlp segmentation

Last synced: 25 Oct 2024

https://github.com/Praful932/llmsearch

Find better generation parameters for your LLM

llm llm-evaluation llm-inference nlp

Last synced: 08 Nov 2024

https://github.com/loomchild/maligna

Bilingual sentence aligner

nlp text-alignment translation

Last synced: 08 Nov 2024

https://github.com/siphulangeni/tortus

A PyPI package for easy text annotation in a Jupyter Notebook.

annotation-tool ipywidgets jupyter-notebook labeling-tool nlp

Last synced: 08 Nov 2024

https://github.com/trashhalo/logseq-summarizer

Logseq plugin to summarize text

logseq nlp pin

Last synced: 02 Nov 2024

https://github.com/houbb/word-cloud

An easy-to-use word cloud (cloud image) tool for Java.

cloud image nlp word word-cloud wordcloud

Last synced: 07 Nov 2024

https://github.com/aqibsaeed/research-paper-categorization

Research paper classification using machine learning and NLP

machine-learning nlp text-classification

Last synced: 09 Nov 2024

https://github.com/fredriko/bert-tensorflow-pytorch-spacy-conversion

Instructions for how to convert a BERT TensorFlow model to work with HuggingFace's pytorch-transformers and spaCy. This walk-through uses DeepPavlov's RuBERT as an example.

bert bert-model how-to keras nlp pytorch-transformers spacy spacy-models spacy-nlp spacy-package spacy-pytorch-transformers tensorflow

Last synced: 07 Aug 2024

https://github.com/sap-samples/acl2020-commonsense

Source code for a paper on commonsense reasoning presented at the 2020 Annual Conference of the Association for Computational Linguistics (ACL 2020).

commonsense-reasoning contrastive deep-learning machine-learning nlp sample sample-code self-supervised

Last synced: 15 Nov 2024

https://github.com/suicao/vn-accent-restorer

This project applies multiple deep learning models to the problem of restoring diacritical marks to sentences in Vietnamese.

deep-learning nlp tensorflow tensorflow-experiments

Last synced: 10 Oct 2024

https://github.com/luoyuanlab/text_gcn_tutorial

A tutorial & minimal example (8min on CPU) for Graph Convolutional Networks for Text Classification. AAAI 2019

deep-learning graph-convolutional-networks nlp text-classification

Last synced: 02 Nov 2024

https://github.com/gatenlp/gateplugin-learningframework

A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, NaiveBayes, CRF and others), LibSVM, Scikit-Learn, Weka, and DNNs through Pytorch and Keras.

classification crf machine-learning nlp sequence-tagging

Last synced: 13 Nov 2024

https://github.com/yasinkuyu/turkish.cs

Turkish Suffix Library for C# & .NET (Turkish inflectional and derivational suffixes)

c-sharp nlp stem vowel

Last synced: 06 Nov 2024

https://github.com/generall/entitycategoryprediction

Model for predicting categories of entities from their mentions

allennlp classification mentions nlp

Last synced: 14 Oct 2024

https://github.com/agatan/yoin

A Japanese Morphological Analyzer written in pure Rust

japanese nlp rust

Last synced: 05 Nov 2024

https://github.com/amazon-science/bold

Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper

bert bert-model bias fairness-ml gpt-2 language-model nlg nlg-dataset nlp text-generation

Last synced: 12 Nov 2024

https://github.com/philipmay/stsb-multi-mt

Machine translated multilingual STS benchmark dataset.

dataset multilingual nlp

Last synced: 28 Oct 2024

https://github.com/jayyip/cws-tensorflow

A Chinese word segmentation model based on TensorFlow

nlp tensorflow word-segmentation

Last synced: 11 Nov 2024

https://github.com/shibing624/pinyin-tokenizer

pinyintokenizer, a pinyin tokenizer that splits continuous pinyin into a list of single-character pinyin syllables.

nlp pinyin pinyin-analysis pinyin4j tokenizer trie-tree

Last synced: 22 Oct 2024

https://github.com/yasinkuyu/Turkish.cs

Turkish Suffix Library for C# & .NET (Turkish inflectional and derivational suffixes)

c-sharp nlp stem vowel

Last synced: 12 Nov 2024

https://github.com/warpy-ai/tgs

Terminal Generative Shell

ai bash nlp shell t5-small terminal

Last synced: 13 Aug 2024

https://github.com/gaussalgo/adaptor

ACL 2022: Adaptor: a library to easily adapt a language model to your own task, domain, or custom objective(s).

domain-adaptation multi-objective-optimization ner nlp pytorch robustness text-classification text-generation transformers

Last synced: 08 Nov 2024

https://github.com/vzhong/corenlp-docker

Docker image for Stanford CoreNLP

corenlp docker nlp

Last synced: 14 Oct 2024

https://github.com/loomchild/segment

Program used to split text into segments

nlp segmentation srx

Last synced: 14 Nov 2024

https://github.com/tlack/hairytext

A data labeling and NLP tool for Elixir (uses Spacy)

elixir entity-recognition nlp nlp-machine-learning phoenix-live-view spacy text-classification

Last synced: 28 Oct 2024

https://github.com/karma9874/seq2seq-chatbot

A chatbot based on a Seq2Seq model with a bidirectional RNN and attention mechanism, built with TensorFlow, trained on the Cornell Movie-Dialogs Corpus, and deployed on a Flask server

attention-mechanism bidirectional-lstm chatbot deep-learning flask nlp question-answering seq2seq tensorflow

Last synced: 06 Nov 2024

https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks

Deep Learning for NLP Python Notebooks in PyTorch and TensorFlow

deeplearning emotion nlp pytorch rnn sentiment-analysis tensorflow

Last synced: 13 Oct 2024

https://github.com/yuewang-cuhk/hashtaggeneration

The official implementation of the NAACL-HLT 2019 paper "Microblog Hashtag Generation via Encoding Conversation Contexts"

hashtag-generator nlp social-media

Last synced: 09 Nov 2024

https://github.com/princeton-nlp/rationale-robustness

NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790

interpretability nlp robustness

Last synced: 11 Nov 2024

https://github.com/shibing624/title-generator

Automatic Text Summarization and Title Generation.

deep-learning nlp text-summarization title-generation

Last synced: 22 Oct 2024

https://github.com/chengchingwen/bytepairencoding.jl

Julia implementation of Byte Pair Encoding for NLP

nlp nlp-library nlp-machine-learning word-segmentation

Last synced: 15 Oct 2024

https://github.com/adhaamehab/arabicnlp

Python package for Arabic natural language processing

arabic arabic-nlp keras ml nlp part-of-speech-tagger postagging sequence-modeling

Last synced: 11 Oct 2024

https://github.com/chengchingwen/BytePairEncoding.jl

Julia implementation of Byte Pair Encoding for NLP

nlp nlp-library nlp-machine-learning word-segmentation

Last synced: 28 Oct 2024

https://github.com/erikgartner/sentimental

Sentiment analysis made easy; built on top of solid libraries.

natural-language-processing nlp sentiment-analysis

Last synced: 02 Nov 2024

https://github.com/decalogue/ai

AI: an artificial intelligence toolkit covering machine learning, deep learning, and natural language processing

ai deep-learning dl machine-learning ml natural-language-processing nlp python

Last synced: 15 Nov 2024

https://github.com/janekb04/py2gpt

Convert Python code into JSON consumable by OpenAI's function API.

ai api chatgpt converter function gpt gpt-4 json nlp openai openai-api python schema transcoding

Last synced: 05 Nov 2024

https://github.com/thunlp/hiddenkiller

Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"

backdoor-attacks nlp nlproc

Last synced: 10 Nov 2024

https://github.com/dair-ai/odsc_2020_nlp

Repository for ODSC talk related to Deep Learning NLP

elasticsearch nlp search transformer

Last synced: 10 Nov 2024

https://github.com/mlabouardy/dialogflow-watchnow-messenger

WatchNow FB Messenger bot with DialogFlow & Golang 💬

api-ai bot dialogflow golang messenger nlp

Last synced: 15 Nov 2024

https://gair-nlp.github.io/BeHonest/

BeHonest: Benchmarking Honesty in Large Language Models

alignment benchmark evaluation honesty llm nlp

Last synced: 11 Oct 2024

https://github.com/jasonwbw/recordpapers4nlp

A record of papers in some NLP-related areas

deep-learning dialogue-generation nlp reading-comprehension

Last synced: 28 Oct 2024

https://github.com/fractalego/subjectivity_classifier

Detects if a sentence is in a subjective or objective form

nlp rnn-tensorflow subjectivity

Last synced: 28 Oct 2024

https://github.com/shrebox/Personified-Chatbot

A personified chatbot responding to a query based on the answering pattern of Dr. APJ Abdul Kalam using Information Retrieval, Natural Language Processing, and Deep Learning techniques.

apj-abdul-kalam chatbot deep-learning information-retrieval lstm natural-language-processing nlp ranking-algorithm seq2seq-chatbot seq2seq-model summarization word2vec

Last synced: 11 Nov 2024

https://github.com/liebeck/spacy-iwnlp

German lemmatization with IWNLP as an extension for spaCy

nlp spacy spacy-extension spacy-pipeline

Last synced: 14 Oct 2024

https://github.com/kampersanda/tongrams-rs

Rust library providing fast language model queries in compressed space

compression elias-fano language-model ngrams nlp trie

Last synced: 11 Nov 2024