Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/Smat26/Roman-Urdu-Dataset

Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources

data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp

Last synced: 04 Aug 2024

https://github.com/Furyton/awesome-language-model-analysis

This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.

ai analysis analytics awesome chatgpt deep-learning generative-ai large-language-models llm nlp theory transformers

Last synced: 19 Sep 2024

https://github.com/mananshah99/sentR

Simple sentiment analysis framework for R

nlp r sentiment-analysis

Last synced: 05 Aug 2024

https://github.com/X-LANCE/Mobile-Env

A Universal Platform for Training and Evaluation of Mobile Interaction

decision-making information-ui infoui interaction-platform nlp rl-environments rl-platform

Last synced: 09 Nov 2024

https://github.com/hienduyph/oxford-deepnlp-2017

Oxford Deep NLP 2017 Course Materials and Practicals, Solutions

deepnlp nlp tensorflow

Last synced: 09 Nov 2024

https://github.com/brianspiering/nlp-course

An introduction to Natural Language Processing (NLP) course

machine-learning natural-language-processing nlp python

Last synced: 07 Nov 2024

https://github.com/ownthink/chatbot

A chatbot based on semantic understanding and knowledge graphs

chatbot knowledgegraph nlp nlu qa

Last synced: 07 Nov 2024

https://github.com/qznan/qiznlp

A TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP, with distributed training support)

beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow

Last synced: 13 Oct 2024

https://github.com/dalmia/quora-question-pairs

The code for our submission in Kaggle's competition Quora Question Pairs which ranked in the top 25%.

deep-learning machine-learning nlp quora-question-pairs tensorflow

Last synced: 30 Oct 2024

https://github.com/Qznan/QizNLP

A TensorFlow framework for quickly running NLP tasks such as classification, sequence labeling, matching, and generation (Chinese NLP, with distributed training support)

beam-search chinese classification horovod match nlp sequence-labeling sequence-to-sequence tensorflow

Last synced: 16 Nov 2024

https://github.com/pooya-mohammadi/persian-spell-checker-kenlm

Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.

bash kenlm language-model nlp persian python spellcheck spellchecker symspell

Last synced: 04 Aug 2024

https://github.com/sarthakjshetty/pyresearchinsights

End-to-end NLP tool to analyze research publications. Published in Ecology & Evolution 2021.

gensim natural-language-processing nlp python scientific-analysis spacy text-mining

Last synced: 12 Oct 2024

https://github.com/thunlp/cokebert

CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models

bert knowledge-graph nlp pretrained-language-model pytorch

Last synced: 10 Nov 2024

https://github.com/yuanzhoulvpi2017/questionanswersystem

A bot that converts text to vectors using sentence-transformers

database encoding fastapi nlp numpy pandas python question-answering robot sbert

Last synced: 08 Nov 2024

https://github.com/benjaminvdb/DBRD

110k Dutch Book Reviews Dataset for Sentiment Analysis

dataset dataset-creation dutch nlp nlp-machine-learning python python3 scraped-data scraper

Last synced: 17 Nov 2024

https://github.com/songyouwei/fiction_generator

Fiction generator with TensorFlow that imitates the writing style of Wang Xiaobo

deep-learning keras lstm nlp seq2seq tensorflow text-generation

Last synced: 11 Nov 2024

https://github.com/PhilipMay/stsb-multi-mt

Machine translated multilingual STS benchmark dataset.

dataset multilingual nlp

Last synced: 16 Nov 2024

https://github.com/arjunpatel7/perfect-prompt

An approach to creating the perfect prompt for any image generation task.

cohere nlp prompt stable-diffusion streamlit text-generation

Last synced: 11 Oct 2024

https://github.com/proycon/python-ucto

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

computational-linguistics folia nlp nlp-library python text-processing tokenizer

Last synced: 14 Nov 2024

https://github.com/ademakdogan/gpterm

Creating Intelligent Terminal Apps with ChatGPT and LLM Models

chatgpt chatgpt-api iterm2 langchain langchain-python natural-language-processing nlp python query-generator terminal

Last synced: 07 Nov 2024

https://github.com/97k/spam-ham-web-app

A web app that classifies text as spam or ham. It uses my own ML algorithm in the backend; the code for it can be found under machine_learning_section.

bag-of-words data-visualization django heroku-deployment jupyter-notebook machine-learning machine-learning-projects multinomial-naive-bayes nlp nltk spam-classification text-classification tfidf

Last synced: 11 Nov 2024

https://github.com/fractalego/pynsett

A programmable relation extraction tool

extract-relationships nlp relation-extraction spacy wikidata-knowledge

Last synced: 12 Oct 2024

https://github.com/ucrel/pymusas

Python Multilingual Ucrel Semantic Analysis System

natural-language-processing nlp python spacy spacy-pipeline

Last synced: 12 Oct 2024

https://github.com/dbklim/stressrnn

Modified version of RusStress (https://github.com/MashaPo/russtress) — python package for placing stress in Russian text using RNN (BiLSTM) and the "Grammatical Dictionary" by A. A. Zaliznyak (from http://odict.ru/).

accent bilstm emphasis linguistic linguistics lstm nlp rnn russian russian-accent russian-stress russtress rustress stress

Last synced: 11 Nov 2024

https://github.com/ayaka14732/bart-base-jax

JAX implementation of the bart-base model

bart jax natural-language-processing nlp nlp-model

Last synced: 28 Oct 2024

https://github.com/stevenay/myan-word-breaker

Myanmar Word Segmentation Tool

burmese nlp word-segmentation

Last synced: 25 Oct 2024

https://github.com/eimg/burmese-text-classifier

A neural network based text classification system for Burmese

deep-learning javascript nlp

Last synced: 25 Oct 2024

https://github.com/sedthh/lara-hungarian-nlp

NLP class for rapid chatbot development in the Hungarian language

chatbot hungarian hungarian-language lemmatizer nlp python3 stemmer

Last synced: 17 Nov 2024

https://github.com/dsdanielpark/gpt2-bert-medical-qa-chat

Medical domain-focused GPT-2 fine-tuning, optimization, and lightweighting research repository (compared to GPT-4).

bert chatgpt gpt2 gpt4 medical-chatbot natural-language-processing nlp nlp-keywords-extraction

Last synced: 14 Nov 2024

https://github.com/akosbalasko/obsidian-autotagger-plugin

This plugin offers smart tags for notes by performing Named Entity Recognition (NER) on the content

natural-language-processing nlp obsidian-md obsidian-plugin

Last synced: 22 Oct 2024

https://github.com/maxent-ai/lda2vec

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019

chainer deep-learning embeddings lda nlp python3 sklearn text text-mining topic-modeling word-embeddings word2vec

Last synced: 30 Sep 2024

https://github.com/andreaferretti/charade

A server for multilanguage, composable NLP API in Python

nlp nlp-apis python

Last synced: 14 Oct 2024

https://github.com/vaibhavs10/10_days_of_deep_learning

10 days 10 different practical applications of Deep Learning (primarily NLP) using Tensorflow and Keras

classification gensim keras nlp python tensorflow tfidf-matrix

Last synced: 02 Nov 2024

https://github.com/bloomberg/entsum

Open Source / ENTSUM: A Data Set for Entity-Centric Extractive Summarization

nlp

Last synced: 09 Nov 2024

https://github.com/yuyuzha0/word2vec

A word2vec implementation for the Chinese language, based on deeplearning4j and ansj

chinese java nlp word2vec word2vec-zh

Last synced: 12 Nov 2024

https://github.com/tianduowang/diffaug

EMNLP 2022: Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536

data-augmentation nlp sentence-embeddings

Last synced: 14 Oct 2024

https://github.com/shashwath94/hierarchical-seq2seq

A PyTorch implementation of the hierarchical encoder-decoder architecture (HRED) introduced in Sordoni et al. (2015) for modeling conversation triples. This version of the model is built for the MovieTriples dataset.

deep-learning hred nlp pytorch seq2seq-pytorch

Last synced: 27 Oct 2024

https://github.com/adamspannbauer/app_rasa_chat_bot

a stateless chat bot to perform natural language queries against the App Store top charts

chatbot dash nlp nlu plotly rasa

Last synced: 11 Oct 2024

https://github.com/griptape-ai/griptape-tools

Tools for the Griptape Framework.

ai cohere gpt huggingface llm nlp openai python

Last synced: 27 Sep 2024

https://github.com/ramtinms/tokenquery

TokenQuery (regular expressions over tokens)

machine-learning natural-language-processing nlp regex regular-expressions

Last synced: 11 Nov 2024

https://github.com/adapter-hub/efficient-task-transfer

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

adapters bert nlp roberta transfer-learning transformers

Last synced: 06 Nov 2024

https://github.com/saidziani/feedny

The Internet plays an increasingly important part in our daily lives as a source of written content for news and leisure. Yet it is tedious and difficult to sort through this staggering flow of information and stay updated with changes in our world, even using automated tools. Reading magazines and newspapers is too time-consuming, and there is a huge amount of online content that is updated or generated each minute. Our solution considers each user’s interests and leverages Artificial Intelligence, Machine Learning and Natural Language Processing in order to suggest relevant articles from the internet.

automatic-summarization javascript machine-learning machine-translation natural-language-processing nlp profiling react-native recommendation-system text-classification

Last synced: 28 Oct 2024

https://github.com/voidful/nlprep

🍳 NLPrep - dataset tool for many natural language processing tasks

dataset nlp prepare pytorch tfkit

Last synced: 01 Oct 2024

https://github.com/zimmerrol/attention-is-all-you-need-keras

Implementation of the Transformer architecture described by Vaswani et al. in "Attention Is All You Need"

attention-is-all-you-need keras neural-network nlp seq2seq transformer

Last synced: 22 Oct 2024

https://github.com/trashhalo/logseq-summarizer

Logseq plugin to summarize text

logseq nlp pin

Last synced: 02 Nov 2024

https://github.com/Praful932/llmsearch

Find better generation parameters for your LLM

llm llm-evaluation llm-inference nlp

Last synced: 08 Nov 2024

https://github.com/sap-samples/acl2020-commonsense

Source code for a paper on commonsense reasoning presented at the 2020 Annual Conference of the Association for Computational Linguistics (ACL 2020).

commonsense-reasoning contrastive deep-learning machine-learning nlp sample sample-code self-supervised

Last synced: 15 Nov 2024

https://github.com/houbb/word-cloud

A word cloud tool for Java (an easy-to-use Java word cloud generator)

cloud image nlp word word-cloud wordcloud

Last synced: 07 Nov 2024

https://github.com/siphulangeni/tortus

A PyPI package for easy text annotation in a Jupyter Notebook.

annotation-tool ipywidgets jupyter-notebook labeling-tool nlp

Last synced: 08 Nov 2024

https://github.com/swanhtet1992/ReSegment

Burmese (Myanmar) syllable level segmentation with regex.

burmese-nlp myanmar-nlp myanmar-text nlp segmentation

Last synced: 25 Oct 2024

https://github.com/loomchild/maligna

Bilingual sentence aligner

nlp text-alignment translation

Last synced: 08 Nov 2024

https://github.com/fredriko/bert-tensorflow-pytorch-spacy-conversion

Instructions for converting a BERT TensorFlow model to work with HuggingFace's pytorch-transformers and spaCy. This walk-through uses DeepPavlov's RuBERT as an example.

bert bert-model how-to keras nlp pytorch-transformers spacy spacy-models spacy-nlp spacy-package spacy-pytorch-transformers tensorflow

Last synced: 07 Aug 2024

https://github.com/praful932/llmsearch

Find better generation parameters for your LLM

llm llm-evaluation llm-inference nlp

Last synced: 27 Oct 2024

https://github.com/veler/notepad-based-calculator

A smart calculator using natural language processing

calculator csharp dotnet mef natural-language-processing nlp

Last synced: 29 Oct 2024

https://github.com/aqibsaeed/research-paper-categorization

Research paper classification using machine learning and NLP

machine-learning nlp text-classification

Last synced: 09 Nov 2024

https://github.com/suicao/vn-accent-restorer

This project applies multiple deep learning models to the problem of restoring diacritical marks to sentences in Vietnamese.

deep-learning nlp tensorflow tensorflow-experiments

Last synced: 10 Oct 2024

https://github.com/yasinkuyu/Turkish.cs

Turkish Suffix Library for C# & .NET: Turkish inflectional and derivational suffixes

c-sharp nlp stem vowel

Last synced: 12 Nov 2024

https://github.com/luoyuanlab/text_gcn_tutorial

A tutorial & minimal example (8min on CPU) for Graph Convolutional Networks for Text Classification. AAAI 2019

deep-learning graph-convolutional-networks nlp text-classification

Last synced: 02 Nov 2024

https://github.com/gatenlp/gateplugin-learningframework

A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, NaiveBayes, CRF and others), LibSVM, Scikit-Learn, Weka, and DNNs through Pytorch and Keras.

classification crf machine-learning nlp sequence-tagging

Last synced: 13 Nov 2024

https://github.com/amazon-science/bold

Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper

bert bert-model bias fairness-ml gpt-2 language-model nlg nlg-dataset nlp text-generation

Last synced: 12 Nov 2024

https://github.com/shibing624/pinyin-tokenizer

pinyintokenizer, a pinyin tokenizer that splits a continuous pinyin string into a list of single-character pinyin syllables.

nlp pinyin pinyin-analysis pinyin4j tokenizer trie-tree

Last synced: 22 Oct 2024

https://github.com/yasinkuyu/turkish.cs

Turkish Suffix Library for C# & .NET: Turkish inflectional and derivational suffixes

c-sharp nlp stem vowel

Last synced: 06 Nov 2024

https://github.com/agatan/yoin

A Japanese Morphological Analyzer written in pure Rust

japanese nlp rust

Last synced: 05 Nov 2024

https://github.com/generall/entitycategoryprediction

Model for predicting the categories of entities from their mentions

allennlp classification mentions nlp

Last synced: 14 Oct 2024

https://github.com/philipmay/stsb-multi-mt

Machine translated multilingual STS benchmark dataset.

dataset multilingual nlp

Last synced: 28 Oct 2024

https://github.com/jayyip/cws-tensorflow

A Chinese word segmentation model based on TensorFlow

nlp tensorflow word-segmentation

Last synced: 11 Nov 2024

https://github.com/warpy-ai/tgs

Terminal Generative Shell

ai bash nlp shell t5-small terminal

Last synced: 13 Aug 2024

https://github.com/karma9874/seq2seq-chatbot

Seq2Seq-based chatbot model with a bidirectional RNN and attention mechanism, built with TensorFlow, trained on the Cornell Movie-Dialogs Corpus, and deployed on a Flask server

attention-mechanism bidirectional-lstm chatbot deep-learning flask nlp question-answering seq2seq tensorflow

Last synced: 06 Nov 2024

https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks

Deep Learning for NLP Python Notebooks in PyTorch and TensorFlow

deeplearning emotion nlp pytorch rnn sentiment-analysis tensorflow

Last synced: 13 Oct 2024

https://github.com/gaussalgo/adaptor

ACL 2022: Adaptor: a library to easily adapt a language model to your own task, domain, or custom objective(s).

domain-adaptation multi-objective-optimization ner nlp pytorch robustness text-classification text-generation transformers

Last synced: 08 Nov 2024

https://github.com/loomchild/segment

Program used to split text into segments

nlp segmentation srx

Last synced: 14 Nov 2024

https://github.com/vzhong/corenlp-docker

Docker image for Stanford CoreNLP

corenlp docker nlp

Last synced: 14 Oct 2024

https://github.com/tlack/hairytext

A data labeling and NLP tool for Elixir (uses Spacy)

elixir entity-recognition nlp nlp-machine-learning phoenix-live-view spacy text-classification

Last synced: 28 Oct 2024

https://github.com/princeton-nlp/rationale-robustness

NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790

interpretability nlp robustness

Last synced: 11 Nov 2024