Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/pyurbans/urbans

A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.

artificial-intelligence data-science machine-translation nlp python

Last synced: 02 Aug 2024

https://github.com/banyh/PyStanfordNLP

A Python Wrapper of Stanford Chinese Segmenter

nlp postagging python-wrapper stanford stanford-chinese-segmenter

Last synced: 02 Aug 2024

https://github.com/fursovia/geometric_embedding

"Zero-Training Sentence Embedding via Orthogonal Basis" paper implementation

embeddings nlp

Last synced: 03 Aug 2024

https://github.com/proycon/deepfrog

An NLP suite powered by deep learning

deep-learning deep-neural-networks dutch folia frog nlp transformers

Last synced: 03 Aug 2024

https://github.com/cmccomb/rust-stop-words

Common stop words in a variety of languages

languages natural-language-procressing nlp nltk rust-crate stopwords

Last synced: 04 Aug 2024

https://github.com/systats/textlearnR

A simple collection of well-performing NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.

classification hyperparameter-optimization keras nlp r text-mining

Last synced: 05 Aug 2024

https://github.com/MilaNLProc/bertlang

A web interface to understand language-specific BERT models

artificial-intelligence bert-model machine-learning nlp nlp-machine-learning

Last synced: 28 Aug 2024

https://github.com/sno2/bertml

Use common pre-trained ML models in Deno!

bert deno machine-learning nlp rust

Last synced: 17 Aug 2024

https://github.com/jaron/sciencegraph

A comprehensive knowledge graph of scientific concepts

knowledge-graph neo4j nlp question-answering

Last synced: 01 Aug 2024

https://github.com/yuanxiaosc/Deep_dynamic_contextualized_word_representation

TensorFlow code and pre-trained models for "A Dynamic Word Representation Model Based on Deep Context". It combines the ideas of the BERT model and ELMo's deep contextual word representations.

bert elmo nlp transformer

Last synced: 01 Aug 2024

https://github.com/hsgodhia/squad_rasor_nn

PyTorch implementation of the RaSoR paper "Learning Recurrent Span Representations for Extractive Question Answering" (Lee et al. 2016) and experiments with various neural components

deep-learning machine-comprehension nlp pytorch

Last synced: 07 Aug 2024

https://github.com/yuanxiaosc/Deep_dynamic_word_representation

TensorFlow code and pre-trained models for "A Dynamic Word Representation Model Based on Deep Context". It combines the ideas of the BERT model and ELMo's deep contextual word representations.

bert elmo nlp transformer

Last synced: 22 Aug 2024

https://github.com/go-air/dupi

A tool to find all duplicates in large sets of text documents.

analysis analytics golang index nlp search

Last synced: 01 Aug 2024

https://github.com/SapienzaNLP/xl-amr

XL-AMR is a sequence-to-graph cross-lingual AMR parser that exploits transfer learning (EMNLP2020).

abstract-meaning-representation amr amr-graphs amr-parsing natural-language-processing nlp semantic-parsing translations

Last synced: 31 Jul 2024

https://github.com/snipsco/snips-nlu-parsers

Rust crate for entity parsing

entity-recognition entity-resolution nlp nlu rust

Last synced: 01 Aug 2024

https://github.com/reinfer/blingfire-rs

Rust wrapper for the BlingFire tokenization library

machine-learning nlp rust rust-wrapper tokenizer

Last synced: 04 Aug 2024

https://github.com/Etwas-Builders/Twitter-Source-Bot

Ever wanted to know the source of a tweet? Just @whosaidthis_bot and I'll tell you where it came from

bot mozilla-builders nlp source-verify twitter-bot twitter-source-bot web-scraping

Last synced: 01 Aug 2024

https://github.com/ppke-nlpg/purepos

PurePos is an open source hybrid morphological tagger.

hungarian morphological-analysis nlp parser pos-tagger tagger

Last synced: 03 Aug 2024

https://github.com/brunoarine/findlike

Command-line tool that finds lexically similar documents in relation to a reference text file or ad-hoc query

bm25 nlp similarity-search tfidf

Last synced: 07 Aug 2024

https://github.com/JackHCC/Arxiv-NLP-Reporter

Automatically fetches the latest NLP-related papers from arXiv every day

arxiv automation nlp

Last synced: 02 Aug 2024

https://github.com/bluelovers/node-segment

Chinese word segmentation module for Simplified and Traditional Chinese, built on Node.js and using web novels as sample data

chinese javascript nlp nodejs segment typescript

Last synced: 02 Aug 2024

https://github.com/dair-ai/nlp_highlights

✨ A report of the most important NLP highlights (A Yearly Report - 2018, 2019)

deep-learning machine-learning nlp

Last synced: 01 Aug 2024

https://github.com/JnRMnT/ZemberekDotNet

ZemberekDotNet is the .NET Port of Zemberek-NLP (Natural Language Processing tools for Turkish).

csharp language machine-learning morphology natural-language-processing nlp nuget turkish zemberek zemberek-nlp

Last synced: 02 Aug 2024

https://github.com/ndabAP/assocentity

Package assocentity returns the mean distance from tokens to an entity and its synonyms

go golang natural-language-processing nlp social-sciences tokenizer

Last synced: 30 Jul 2024

https://github.com/adriacabeza/DeepCatalan

🤖 Deep Catalan: Bringing the Catalan language closer to deep learning using ULMFiT.

catalan catalan-language classificador fastai fine-tuning nlp pytorch ulmfit

Last synced: 30 Jul 2024

https://github.com/varunon9/chat-reply-suggestions

Automatic reply suggestions for chat messages and emails (like Gmail and LinkedIn), built using the rasa_nlu framework.

chat-reply chatbot nlp rasa rasa-nlu

Last synced: 01 Aug 2024

https://github.com/helboukkouri/embedding-visualization

This is a project for visualizing word embeddings based on the work of Andrei Kashcha (@anvaka).

fasttext glove graphs nlp visualization word-embeddings word2vec

Last synced: 03 Sep 2024

https://github.com/alexandrevl/supersummarizeai

Unleash the power of AI with SuperSummarizeAI! Effortlessly extract, condense, and clip content from webpages and YouTube videos using ChatGPT, turning endless streams of content into digestible summaries.

beautifulsoup chatgpt content-analysis multilingual nlp openai papperclip text text-processing text-summarization web-scraping youtube

Last synced: 02 Aug 2024

https://github.com/tech4germany/bam-inclusify

INCLUSIFY is a tool to support the practical use of diversity-sensitive language in German.

diversity equality german govtech language nlp react t4g tech4germany

Last synced: 03 Aug 2024

https://github.com/Rajan-sust/WikiTextCorpusDownloader

A Language Independent Wikipedia Text Corpus Downloader

gensim nlp python3 tensorflow wikipedia

Last synced: 31 Jul 2024

https://github.com/trinker/sentimentpy

A Python port of the #rstats sentimentr package

emotion nlp polarity sentiment text-mining

Last synced: 05 Aug 2024

https://github.com/Cartus/AMR-Parser

Better Transition-Based AMR Parsing with a Refined Search Space (authors' DyNet implementation for the EMNLP18 paper)

aligner amr-parser nlp semantic-parsing transition-based-parser

Last synced: 02 Aug 2024

https://github.com/moj-analytical-services/pq-tool

Tool to analyse past parliamentary questions with visualisation in RShiny

clustering latent-semantic-analysis nlp shiny

Last synced: 13 Aug 2024

https://github.com/MaartenGr/Reviewer

Tool for extracting and analyzing IMDB reviews

bert disney imdb ner nlp sentiment-analysis

Last synced: 04 Aug 2024

https://github.com/Pushkar1853/Cover-Generator

Application of OpenAI tools such as Whisper, DALL-E, and ChatGPT to generate album covers from audio

chat-gpt computer-vision dall-e music nlp openai stable-diffusion whisper-ai

Last synced: 31 Jul 2024

https://github.com/Captmoonshot/data-bore

A Django REST API with an embedded ML model for sentiment analysis of movie reviews.

dash dashboard django django-rest-framework machine-learning nlp pandas plotly powerbi python

Last synced: 08 Aug 2024

https://github.com/contefranz/OpTop

Optimal topic identification from a pool of Latent Dirichlet Allocation models

latent-dirichlet-allocation lda model-selection natural-language-processing nlp text-mining topic-modeling

Last synced: 05 Aug 2024

https://github.com/simonepri/varname-seq2seq

📄 Source code variable naming using a seq2seq architecture

nlp pytorch rnn seq2seq

Last synced: 30 Jul 2024

https://github.com/doubledaibo/compcaption_neurips2018

A Neural Compositional Paradigm for Image Captioning

compositionality image-captioning neurips-2018 nlp

Last synced: 31 Jul 2024

https://github.com/minhd-vu/toxicity-filter

Natural language processing API to detect toxic chat.

flask nlp python

Last synced: 01 Aug 2024

https://github.com/UCSB-NLP-Chang/ULD

Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference'

large-language-models nlp pytorch transformers unlearning

Last synced: 31 Jul 2024

https://github.com/ariya/tebakmasa

Infer the date and time from the general description in Bahasa Indonesia

bahasa bahasa-indonesia date indonesian nlp time timestamp

Last synced: 01 Aug 2024

https://github.com/mhezarei/ai-bot

2020 AI bot challenge (ai-bot.ir) repository. This program answers a given question with a specific format and subject.

bert nlp persian-nlp

Last synced: 04 Aug 2024

https://github.com/praekelt/feersum-nlu-api-wrappers

Swagger spec and generated Python language wrappers for the FeersumNLU HTTP REST API for building intelligent chatbots.

chatbot-framework nlp nlp-machine-learning

Last synced: 13 Aug 2024

https://github.com/AlbertSuarez/casescan

🔍 Similarity-based search of clinical cases, specialized in COVID-19

nlp python react similarity-search

Last synced: 30 Jul 2024

https://github.com/derintelligence/az-summarization

Abstractive summarization for Azerbaijani language

azerbaijan dataset language linguistics nlp summarization

Last synced: 02 Aug 2024

https://github.com/alvations/myth

Myanmar and Thai Language Resources

machine-translation myanmar nlp thai

Last synced: 30 Jul 2024

https://github.com/uds-lsv/Multi-tasking_Learning_With_Unreliable_Labels

Extends the NLNN algorithm proposed by Bekker & Goldberger to a multi-task learning setup in order to handle noisy labels. Artificial annotators are often used to extend low-resource data; this setup aims to generate clean labeled training data from such annotators.

machine-learning nlp noise-reduction

Last synced: 01 Aug 2024

https://github.com/strayMat/tag_serve

Deployable Neural Tagger implementation for Named Entity Recognition

bilstm-crf deep-learning docker-image flask flask-application machine-learning ner neural-network nlp pytorch tagger

Last synced: 03 Sep 2024

https://github.com/BrianWeinstein/googlenlp

An Interface to Google's Cloud Natural Language API

api cran google-cloud-platform nlp r

Last synced: 13 Aug 2024

https://github.com/clnnn/chat-summarizer

💬 Real-time chat application prototype that can summarise the entire chat log

angular flask huggingface-transformer java ngrx nlp python websocket

Last synced: 31 Jul 2024

https://github.com/patrick-miller/textbook-concept-map

Build a concept map from textbooks using DBpedia Spotlight

concept-map dbpedia-spotlight educational-technology nlp

Last synced: 01 Aug 2024

https://github.com/denismosolov/alice-entities-library

A set of named entities for the Yandex.Dialogs platform. Use it when building Alice skills.

alice-sdk alice-skills nlp yandex-dialogs

Last synced: 03 Aug 2024

https://github.com/pharo-ai/Polyglot

A library for Natural Language Processing

natural-language-processing nlp pharo

Last synced: 03 Aug 2024

https://github.com/oneapi-src/disease-prediction

AI Starter Kit for implementing an AI-based NLP disease prediction system using Intel® Extension for PyTorch and Intel® Neural Compressor

deep-learning nlp pytorch

Last synced: 01 Aug 2024

https://github.com/mochi-co/ngrams

A Go n-gram indexer for natural language processing with modular tokenizers and data stores

bigrams go golang language-model natural-language-processing ngram ngrams nlp tokenization trigrams

Last synced: 02 Aug 2024

https://github.com/webpolis/musai

Machine learning-powered music generation. Full-featured tokenizer, customization options, and high-quality output files. Integration with music production tools.

deep-learning generative-art large-language-models llm machine-learning midi music music-generation nlp recurrent-neural-networks rnn text-generation tokenizer vae variational-autoencoder

Last synced: 05 Aug 2024

https://github.com/codeastra2/ChatGPTDevFriendly

A wrapper over the ChatGPT Python APIs for a better developer experience.

api chatgpt chatgpt-api chatgpt-api-wrapper chatgpt-bot chatgpt-python chatgpt-sdk gpt3 gpt4-api nlp openai-api python

Last synced: 01 Aug 2024

https://github.com/thomas-chauvet/names_transliteration

Neural Machine Translation (NMT) applied to transliterating names from Arabic characters to Latin characters (romanization).

arabic characters cli data dataset deep-learning latin neural-network nlp nmt romanization seq2seq translation transliteration typer-cli

Last synced: 03 Aug 2024

https://github.com/seanghay/khmernormalizer

A missing toolkit for Khmer Natural Language Processing.

khmer nlp normalization normalizer verbalization

Last synced: 01 Aug 2024

https://github.com/minnesotanlp/Quantifying-Annotation-Disagreement

Official implementation of Wan et al.'s paper "Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information" (AAAI 2023)

aaai ai annotation natural-language-processing nlp roberta

Last synced: 01 Aug 2024

https://github.com/Davisy/Texthero-Python-Toolkit

Texthero is a simple Python toolkit for working with text-based datasets. It provides quick, low-effort functions to preprocess text, represent it, map it into vectors, and visualize it in just a couple of lines of code.

machine-learning natural-language-processing nlp preprocessing python

Last synced: 01 Aug 2024

https://github.com/Joppewouts/belabBERT

🤧 belabBERT: Repository for a new Dutch language model based on the RoBERTa architecture

bert language-model nlp roberta

Last synced: 03 Aug 2024

https://github.com/aman5319/Classification-Report

This repo helps track model weights, biases, and gradients during training, along with the loss, and gives detailed insight for evaluating classification models

classification image-classification loss-plotting metrics-visualization model-visualization nlp pytorch sklearn tensorboard tensorboard-pytorch tensorboard-visualization text-classification

Last synced: 03 Aug 2024

https://github.com/sinaahmadi/KurdishTokenization

Tokenization resources for Kurdish (Sorani & Kurmanji dialects)

kurdish kurdish-language-processing kurmanji natural-language-processing nlp sorani tokenization

Last synced: 03 Aug 2024

https://github.com/DidierRLopes/similarstocks

This repository identifies similar stocks based on their descriptions, using NLP models

finance nlp similarity-search stocks

Last synced: 01 Aug 2024

https://github.com/sinaahmadi/KurdishMT

Towards Machine Translation for the Kurdish Language

kurdish kurdish-language-processing less-resource-languages machine-translation nlp

Last synced: 03 Aug 2024

https://github.com/maxoodf/tgnews

Telegram Data Clustering Contest (Bossy Gnu's submission)

cpp document-clustering document-embedding document-similarity nlp nlp-machine-learning telegram word2vec

Last synced: 01 Aug 2024