Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and analyze human (natural) language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/leoneversberg/llm-chatbot-rag

A local LLM chatbot with RAG for PDF input files

chatbot llm nlp rag

Last synced: 08 Aug 2024

https://github.com/ahmedbesbes/audiolizr

A BentoML-powered API to transcribe audio and make sense of it

bentoml bentoml-service docker nlp openai openai-whisper pytube speech-recognition t5 torch transformers

Last synced: 07 Aug 2024

https://github.com/Flight-School/sentences

A command-line utility that splits natural language text into sentences.

cli macos nlp sentence-tokenizer swift

Last synced: 05 Aug 2024

https://github.com/gentaiscool/indonesian-nlp

A curated list of research papers and resources on Indonesian languages

deep-learning indonesian javanese local local-languages machine-learning nlp papers research speech sundanese survey

Last synced: 08 Nov 2024

https://github.com/stanfordnlp/stanza-train

Model training tutorials for the Stanza Python NLP Library

natural-language-processing nlp stanza

Last synced: 08 Nov 2024

https://github.com/amazon-science/recode

Releasing code for "ReCode: Robustness Evaluation of Code Generation Models"

code-generation large-language-models nlp robustness

Last synced: 12 Nov 2024

https://github.com/dair-ai/notebooks

🔬 Sharing your data science notebooks with the community has never been this easy.

artificial-intelligence deep-learning machine-learning nlp

Last synced: 10 Nov 2024

https://github.com/ropensci-archive/geoparser

⛔ ARCHIVED ⛔

geocoding geoparser nlp peer-reviewed r r-package rstats

Last synced: 05 Aug 2024

https://github.com/cocoa-ai/namescoremldemo

🏷 iOS11 demo application for predicting gender from first names.

classification coreml coreml-models gender-classification ios machine-learning nlp swift swift4

Last synced: 07 Nov 2024

https://github.com/seanlee97/clfzoo

A deep text classifiers library.

nlp tensorflow text-classification

Last synced: 27 Oct 2024

https://github.com/chrismattmann/lucene-geo-gazetteer

Uses Apache Lucene, OpenNLP, and GeoNames to extract locations from text and geocode them.

allcountries apache gazetteer geoindex geonames irds lucene nlp nlp-machine-learning opennlp

Last synced: 30 Oct 2024

https://github.com/aws-solutions/content-localization-on-aws

Automatically generate multi-language subtitles using AWS AI/ML services. Machine generated subtitles can be edited to improve accuracy and downstream tracks will automatically be regenerated based on the edits. Built on Media Insights Engine (https://github.com/awslabs/aws-media-insights-engine)

amazon-comprehend amazon-polly amazon-transcribe amazon-translate audio aws-media-insights-engine captions content-analysis localisation localization media mie nlp nlp-machine-learning speech-to-text subtitles video video-on-demand vod

Last synced: 08 Nov 2024

https://github.com/macournoyer/utterance_parser

Extract intent and entities from natural language utterances

extracts-intent nlp slot-filling

Last synced: 09 Nov 2024

https://github.com/mchmarny/tsignal

Analyzing social media sentiment and its impact on stock market

analytics golang nasdaq nlp sentiment-analysis twitter

Last synced: 08 Nov 2024

https://github.com/snnclsr/ner

Turkish Named Entity Recognition

ner nlp

Last synced: 10 Oct 2024

https://github.com/psolbach/metadoc

Aviation grade news article metadata extraction

extraction metadata news nlp perceptron

Last synced: 08 Nov 2024

https://github.com/hiyouga/pban-pytorch

A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis, PyTorch implementation.

aspect-based-sentiment-analysis attention-model deep-learning natural-language-processing nlp pytorch sentiment-analysis

Last synced: 27 Oct 2024

https://github.com/liebeck/spacy-sentiws

German sentiment scores with SentiWS as extension for spaCy

nlp spacy spacy-extension spacy-pipeline

Last synced: 14 Oct 2024

https://github.com/bastienbot/nlp-js-tools-french

POS tagger, lemmatizer, and stemmer for the French language in JavaScript

lemmatization lemmatizer nlp postagging postgresql stemmer stemming tokenization tokenizer

Last synced: 28 Aug 2024

https://github.com/xxjwxc/gohanlp

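Golang RESTful client for HanLP. Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing.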

ai dependency-parser hanlp named-entity-recognition natural-language-processing nlp pos-tagging semantic-parsing text-classification

Last synced: 28 Oct 2024

https://github.com/bangla-rag/porag

Fully Configurable RAG Pipeline for Bengali Language RAG Applications. Supports both Local and Huggingface Models, Built with Langchain.

ai bengali bengali-nlp chromadb langchain llama3 llm nlp rag transformers

Last synced: 10 Oct 2024

https://github.com/syzer/sentiment-analyser

ML that can extract German and English sentiment

english german nlp nlp-library node-js nodejs sentiment-analyser sentiment-analysis

Last synced: 28 Oct 2024

https://github.com/thisisiron/nmt-attention-tf2

👫 Effective Approaches to Attention-based Neural Machine Translation, implemented in TensorFlow 2.0

attention lstm natural-language-processing neural-machine-translation nlp nmt tensorflow tensorflow2 tf2 translation

Last synced: 08 Nov 2024

https://github.com/MiuLab/FlowDelta

FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension

machine-comprehension nlp pytorch question-answering

Last synced: 07 Aug 2024

https://github.com/anjum48/commonlitreadabilityprize

4th Place solution for the Kaggle CommonLit Readability Prize

huggingface kaggle nlp pytorch transformers

Last synced: 14 Oct 2024

https://github.com/adirthaborgohain/ner-re

A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3. Given a text, the pipeline extracts the entities it was trained on, disambiguates them to their normalized forms through an Entity Linker connected to a Knowledge Base, and assigns relations between the entities, if any.

named-entity-recognition nlp relation-extraction spacy transformers

Last synced: 09 Nov 2024

https://github.com/GermanT5/wikipedia2corpus

Wikipedia text corpus for self-supervised NLP model training

corpus german-nlp machine-learning nlp somajo wikipedia wikipedia-corpus

Last synced: 31 Oct 2024

https://github.com/kbogas/medknow

Medical Relations and Entities Extraction

biomedical metamap neo4j nlp relation-extraction semrep umls

Last synced: 28 Oct 2024

https://github.com/nikhilbarhate99/char-rnn-pytorch

Minimal implementation of Multi-layer Recurrent Neural Networks (LSTM) for character-level language modelling in PyTorch

char-rnn deep-learning lstm natural-language-generation natural-language-processing nlp pytorch pytorch-implementation pytorch-nlp pytorch-tutorial rnn

Last synced: 13 Nov 2024

https://github.com/rainmaker712/nlp_ryan

Study for Natural Language Processing & Deep Learning Framework

chatbot deep-learning machine-comprehension machine-learning nlp python pytorch scala spark tensorflow

Last synced: 13 Nov 2024

https://github.com/neomatrix369/chatbot-conversations

Chatbot conversations: a demo application showing how two (or more) chatbots can talk to each other. The logic used to build Eliza (along with an NLP model) powers the chatbots.

ai chat-application chatbot eliza eliza-chatbot graalvm helidon helidon-example java ml nlp python quarkus text

Last synced: 14 Oct 2024

https://github.com/datawhalechina/whale-paper

Datawhale paper sharing: reading cutting-edge papers and sharing technical innovations

cv nlp papers recommendation-system

Last synced: 09 Nov 2024

https://github.com/johncmunson/react-taggy

A simple zero-dependency React component for tagging user-defined entities within a block of text.

component entities named-entity-recognition natural-language ner nlp react react-component

Last synced: 28 Aug 2024

https://github.com/mirusu400/clova-x

Unofficial API for CLOVA X

api clova clovaai hacktoberfest llm naver naver-api nlp

Last synced: 06 Nov 2024

https://github.com/bnosac/rdrpostagger

R package for Ripple Down Rules-based Part-Of-Speech Tagging (RDRPOS). Supports more than 45 languages.

java multi-language natural-language-processing nlp pos pos-tagging r r-package tagging

Last synced: 11 Nov 2024

https://github.com/alexeyev/keras-generating-sentences-from-a-continuous-space

Text Variational Autoencoder inspired by the paper 'Generating Sentences from a Continuous Space' Bowman et al. https://arxiv.org/abs/1511.06349

deep-learning deeplearning keras keras-implementations nlp text-generation vae variational-autoencoder

Last synced: 11 Nov 2024

https://github.com/xfgryujk/taobaoanalysis

A project for practicing NLP by analyzing Taobao reviews

crawler nlp taobao

Last synced: 08 Nov 2024

https://github.com/michaelaquilina/hashedindex

Python package providing an Inverted Index implementation using dictionaries

indexing nlp nlp-machine-learning numpy pandas python2 python3 text-processing

Last synced: 28 Oct 2024

https://github.com/news-r/gensimr

📝 Topic Modeling for Humans

nlp r rstats topic-modeling

Last synced: 05 Aug 2024

https://github.com/hyperparticle/lemmatag

A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, Arabic, etc.)

deep-learning lemmatization machine-learning natural-language-processing neural-network nlp pos-tagging tensorflow

Last synced: 14 Nov 2024

https://github.com/selimfirat/bilkent-turkish-writings-dataset

Turkish writings dataset that promotes creativity, content, composition, grammar, spelling and punctuation.

bilkent-university creative-writing dataset nlp nlp-datasets pdf-conversion turkish turkish-language

Last synced: 10 Oct 2024

https://github.com/koichiyasuoka/unidic2ud

Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese

dependency-parser japanese-language nlp

Last synced: 16 Nov 2024

https://github.com/sea-snell/calm-dialogue

Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems"

deep-learning language-model nlp python pytorch reinforcement-learning

Last synced: 27 Oct 2024

https://github.com/cyclecycle/spacy-pattern-builder

Reverse engineer patterns for use with spaCy's DependencyMatcher

nlp python spacy

Last synced: 10 Oct 2024

https://github.com/nlppln/nlppln

NLP pipeline software using common workflow language

cwl nlp pipeline text-mining workflow

Last synced: 23 Oct 2024

https://github.com/livingbio/syntaxnet_wrapper

A Python Wrapper for Google SyntaxNet

google-syntaxnet nlp python python-wrapper syntaxnet

Last synced: 09 Nov 2024

https://github.com/omarsar/clinical_nlp_elastic

Clinical NLP Analysis with Elasticsearch and Kibana

elastic elasticsearch kibana linguistics machine-learning mental-health nlp

Last synced: 28 Oct 2024

https://github.com/x-lance/mobile-env

A Universal Platform for Training and Evaluation of Mobile Interaction

decision-making information-ui infoui interaction-platform nlp rl-environments rl-platform

Last synced: 12 Nov 2024

https://github.com/ivan-bilan/nlp-and-data-science-spotlights

Regular spotlights of underrated NLP and Data Science GitHub repositories

data-science deep-learning natural-language-processing nlp spotlight

Last synced: 08 Nov 2024

https://github.com/paulbricman/semantica

Extending conceptual thinking with semantic embeddings.

creativity embeddings nlp tools-for-thought wordembeddings

Last synced: 17 Nov 2024

https://github.com/wit-ai/android-voice-demo

Example on how to build a voice-enabled Android app with Wit.ai

android machine-learning nlp nlu voice wit witai

Last synced: 15 Nov 2024

https://github.com/writer/replacy

spaCy match and replace, maintaining conjugation

nlp spacy

Last synced: 01 Nov 2024

https://github.com/omarsar/nlp_pycon

Material for PyCon 2019 NLP Tutorial

deep machine-learning nlp pytorch

Last synced: 28 Oct 2024

https://github.com/pyunits/pyunit-ner

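NER (named entity recognition) model: fast, efficient, and simple, with one-click Docker deployment for calling the model. Recognizes address, person-name, and organization-name entities.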

docker nlp python3

Last synced: 12 Nov 2024

https://github.com/georgezouq/awosome-ai-in-social-media

💻 A collection of AI and bot uses in social media: WeChat, Facebook, Twitter, Instagram, Weibo, TikTok, etc.

facebook ins nlp social-media social-network social-network-analysis twitter wechat

Last synced: 10 Nov 2024

https://github.com/wri-dssg-omdena/policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

active-learning bert data-science document-classification environmental huggingface incentives landscape-restoration lda machine-learning nlp policy sbert scraping scrapy sentence-transformers spyder text-classification topic transformers

Last synced: 30 Oct 2024

https://github.com/Ermlab/PoLitBert

Polish RoBERTa model trained on Polish literature, Wikipedia, and OSCAR. The main assumption is that high-quality text yields a good model.

nlp polish roberta text-corpus

Last synced: 11 Nov 2024

https://github.com/miroozyx/BERT_with_keras

A Keras version of Google's BERT model

bert deep-learning nlp tensorflow

Last synced: 02 Nov 2024

https://github.com/nathankleyn/ruby-nlp

Various NLP tools for Ruby

nlp ruby

Last synced: 06 Nov 2024

https://github.com/sayakpaul/bert-for-mobile

Compares the DistilBERT and MobileBERT architectures for mobile deployments.

bert distilbert mobile mobile-bert nlp tensorflow-lite

Last synced: 23 Oct 2024

https://github.com/nitotm/efficient-language-detector-js

Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.

javascript language language-detection language-detector language-identification natural-language natural-language-processing nlp nodejs

Last synced: 12 Oct 2024

https://github.com/pszemraj/ai-msgbot

Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.

ai aitextgen chat-application chatbot deep-learning deepspeed deployment gpt-2 gpt-j gpt-j-6b gradio huggingface huggingface-transformers natural-language-processing nlp nlp-parsing telegram telegram-bot text-generation transformers

Last synced: 03 Oct 2024

https://github.com/alan-turing-institute/robots-in-disguise

Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.

deep-learning diffusion-models foundation-model hut23 language-models large-language-models machine-learning nlp transformers

Last synced: 13 Nov 2024

https://github.com/dzieciou/pystempel

Python port of Stempel, an algorithmic stemmer for Polish language.

nlp

Last synced: 26 Oct 2024

https://github.com/nlpir-team/nlpir-python

NLPIR-python: a Python wrapper and toolkit for NLPIR

cws nlp nlpir

Last synced: 14 Nov 2024

https://github.com/linuxscout/arabicstopwords

Arabic Stop Word List

arabic-nlp language nlp

Last synced: 02 Nov 2024

https://github.com/google-research-datasets/swim-ir

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.

cross-lingual datasets deep-learning information-retrieval machine-learning multilingual natural-language-processing neural-information-retrieval nlp training-data

Last synced: 08 Nov 2024

https://github.com/zlsh80826/msmarco

Machine comprehension trained on MSMARCO with a modified S-NET extraction

cntk extraction-net machine-comprehension msmarco nlp question-answering s-net

Last synced: 28 Oct 2024

https://github.com/aliosm/simplerepresentations

Easy-to-use text representations extraction library based on the Transformers library.

embeddings nlp transformers

Last synced: 27 Oct 2024

https://github.com/JackHCC/Chinese-Tokenization

Chinese word segmentation implemented with traditional methods (N-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.), and pre-training methods (BERT, etc.)

bert-crf bilstm-crf hmm-viterbi-algorithm ngram nlp tokenization

Last synced: 18 Nov 2024

https://github.com/hellohaptik/HINT3

This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020's Insights Workshop https://insights-workshop.github.io/ Preprint for the paper is available here https://arxiv.org/abs/2009.13833

conversational-ai datasets dialogue-systems nlp

Last synced: 16 Nov 2024

https://github.com/uminosachi/open-llm-webui

This repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).

chatbot ggml gradio huggingface language-model llama llama2 llama3 llava llava-llama3 llm nlp transformers

Last synced: 10 Oct 2024

https://github.com/staticdev/human-readable

Lib to make data intended for machines, readable to humans.

formatting humanizable humanization natural-language-processing nlp readable

Last synced: 16 Nov 2024

https://github.com/vasilescur/parse_context

Use GPT-3 to process human conversations and extract context, identify information that would be useful, and suggest data sources to get that information. Intended for a voice assistant.

ai assistants gpt-3 natural-language nlp semantic-analysis

Last synced: 16 Nov 2024

https://github.com/thisiscetin/textoken

Simple and customizable text tokenization gem.

nlp ruby rubynlp tokenization

Last synced: 07 Nov 2024

https://github.com/hyunwoongko/bert2bert-summarization

Abstractive summarization using Bert2Bert framework.

bert nlp summarization

Last synced: 28 Oct 2024

https://github.com/princeton-vl/attach-juxtapose-parser

Code for the paper "Strongly Incremental Constituency Parsing with Graph Neural Networks"

machine-learning neurips-2020 nlp parsing

Last synced: 09 Nov 2024

https://github.com/eimg/myanmar-text-breaker

Syllable and word breaker (boundary segmentation) for Myanmar text in JavaScript

javascript nlp

Last synced: 25 Oct 2024

https://github.com/peaceiris/actions-suggest-related-links

A GitHub Action to suggest related or similar issues, documents, and links. Based on the power of NLP and fastText.

actions fasttext github-actions issue-management nlp

Last synced: 31 Oct 2024

https://github.com/proycon/analiticcl

An approximate string matching (fuzzy-matching) system for spelling correction, normalisation, or post-OCR correction

approximate-string-matching fuzzy-matching nlp normalization spelling-correction

Last synced: 14 Nov 2024

https://github.com/Smat26/Roman-Urdu-Dataset

Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources

data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp

Last synced: 04 Aug 2024

https://github.com/codewithzichao/deepclassifier

DeepClassifier aims to be a general-purpose text classification model library. It makes building any text classification task easy and user-friendly.

deep-learning deepclassifier nlp pytorch text-classification torch

Last synced: 07 Nov 2024