Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article proposing a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.

https://github.com/kavgan/word_cloud

Python word cloud library for use within Jupyter notebook and Python apps.

cloud-library jupyter-notebook nlp python visualization word-cloud wordcloud

Last synced: 30 Oct 2024

https://github.com/yoshoku/suika

Suika 🍉 is a Japanese morphological analyzer written in pure Ruby

morphological-analysis nlp postagger ruby tokenizer

Last synced: 10 Nov 2024

https://github.com/chatopera/chatopera.feishu

Launch intelligent conversational bot services through the Feishu Open Platform and the Chatopera bot platform. Chatbot, Feishu, Lark.

ai bot chatbot chatopera dialog feishu lark machine-learning nlp nlu python python3

Last synced: 30 Oct 2024

https://github.com/stas00/porting

Helper scripts and notes that were used while porting various nlp models

nlp porting python

Last synced: 22 Oct 2024

https://github.com/edwardcooper/piidetect

A package to build an end-to-end pipeline for detecting personally identifiable information from text.

nlp pii pii-detection word2vec

Last synced: 11 Nov 2024

https://github.com/sciss/ws4j

WordNet Similarity for Java provides an API for several Semantic Relatedness/Similarity algorithms. Mirror of https://codeberg.org/sciss/ws4j

nlp wordnet

Last synced: 09 Nov 2024

https://github.com/explosion/assets

💥 Explosion Assets

machine-learning nlp spacy spacy-nlp

Last synced: 07 Oct 2024

https://github.com/xavidop/dialogflow-cx-cli

The missing Dialogflow CX CLI to interact with your projects

cli cxcli dialogflow dialogflow-cx dialogflowcx golang nlp nlu test-automation testing-tools

Last synced: 31 Oct 2024

https://github.com/gunale0926/sorsa

SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

deep-learning fine-tuning llama lora machine-learning nlp peft python pytorch rwkv sorsa svd transformer

Last synced: 26 Oct 2024

https://github.com/Lipairui/textgo

Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!

bert nlp text-classification text-preprocessing text-representation text-search text-similarity

Last synced: 07 Aug 2024

https://github.com/aphp/eds-pseudo

EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports

edsnlp nlp pseudonymisation

Last synced: 03 Sep 2024

https://github.com/coosto/dutch-word-embeddings

Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts.

coosto dutch nlp word2vec word2vec-model wordembeddings

Last synced: 03 Aug 2024

https://github.com/jaykef/avachat

AvaChat is a realtime AI chat demo with animated talking heads. It uses Large Language Models (GPT, API2D GPT4, Claude) as text inputs to D-ID's image-to-video talking head model (via the D-ID stream API).

avatar llms nlp

Last synced: 29 Oct 2024

https://github.com/kinosal/cowriter

Write 10x faster using OpenAI's GPT-3 based Davinci model to autocomplete your text

ai gpt machine-learning nlp

Last synced: 27 Oct 2024

https://github.com/dongjunlee/dmn-tensorflow

TensorFlow implementation of 'Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (2015)'

babi-tasks dynamic-memory-network hb-experiment natural-language-processing nlp question-answering tensorflow

Last synced: 08 Nov 2024

https://github.com/OpenSextant/Xponents

Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.

document-conversion geocoding geonames geoparsing geotagging information-extraction nlp solr tika

Last synced: 05 Nov 2024

https://github.com/explosion/spacy-huggingface-hub

🤗 Push your spaCy pipelines to the Hugging Face Hub

huggingface machine-learning ml-models models natural-language-processing nlp spacy

Last synced: 07 Oct 2024

https://github.com/argosopentech/metaltranslate

Customizable machine translation in C++

machine-learning nlp nlp-machine-learning translation

Last synced: 08 Nov 2024

https://github.com/salesforce/query-focused-sum

Official code repository for "Exploring Neural Models for Query-Focused Summarization".

deep-learning machine-learning neural-network nlp question-answering summarization

Last synced: 08 Nov 2024

https://github.com/kenlimmj/rouge

A JavaScript implementation of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) evaluation metric for summaries.

bootstrapping-statistics evaluation-metric jackknifing nlp rouge summarization

Last synced: 10 Nov 2024

https://github.com/osu-nlp-group/amplegcg

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM

adversarial-attacks gcg nlp safety

Last synced: 11 Nov 2024

https://github.com/skroutz/turkish_stemmer

A simple Turkish stemming library

nlp stemmer

Last synced: 11 Nov 2024

https://github.com/tokestermw/spacy_grammar

✒️ Language Tool style grammar handling with spaCy 2.0

grammar nlp spacy spacy-nlp

Last synced: 07 Nov 2024

https://github.com/yuvalpinter/m3gm

Max-Margin Markov Graph Models for WordNet (EMNLP 2018)

markov-model nlp relation-extraction semantics wordnet

Last synced: 27 Oct 2024

https://github.com/KehaoWu/Jinyong-Corpus

Dictionary of Jin Yong's 15 novels

corpus-data nlp

Last synced: 07 Nov 2024

https://github.com/zamgi/lingvo--ner-ru

Named entity recognition (NER) in Russian texts

linguistics lingvo named-entity-recognition natural-language-processing ner nlp nlp-machine-learning

Last synced: 05 Nov 2024

https://github.com/yasinkuyu/turkish.js

Turkish Suffix Library for JavaScript: Turkish inflectional and derivational suffixes

javascript nlp stem vowel

Last synced: 06 Nov 2024

https://github.com/lunarwhite/covid-social-analysis

Apply ML to Weibo sentiment: sentiment analysis and visualization of Weibo posts during the COVID-19 pandemic

crawling data-analysis machine-learning nlp python vizualization

Last synced: 06 Nov 2024

https://github.com/kehaowu/jinyong-corpus

Dictionary of Jin Yong's 15 novels

corpus-data nlp

Last synced: 13 Oct 2024

https://github.com/saareliad/FTPipe

FTPipe and related pipeline model parallelism research.

deep-neural-networks distributed-training fine-tuning nlp pipeline-parallelism t5

Last synced: 07 Nov 2024

https://github.com/ecohealthalliance/epitator

EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and EIDR Connect.

disease-surveillance epidemiology geonames nlp spacy toponym-resolution

Last synced: 14 Oct 2024

https://github.com/dpressel/textrank-js

TextRank algorithm implementation in JavaScript

nlp textrank

Last synced: 28 Oct 2024

https://github.com/danieldeutsch/repro

Repro is a library for easily running code from published papers via Docker.

docker machine-learning nlp reproducibility reproducible-research

Last synced: 06 Nov 2024

https://github.com/rangilyu/llama.mmengine

Training LLaMA language model with MMEngine! It supports LoRA fine-tuning!

alpaca fine-tuning language-model llama lora nlp

Last synced: 22 Oct 2024

https://github.com/greenelab/pubtator

Retrieve and process PubTator annotations

data nlp pubmed pubtator snorkel text-mining tool

Last synced: 13 Nov 2024

https://github.com/perone/feste

Feste is a free and open-source framework allowing scalable composition of NLP tasks using a graph execution model that is optimized and executed by specialized schedulers.

deep-learning language-model machine-learning nlp

Last synced: 28 Oct 2024

https://github.com/sanghviharshit/pocket-tagger

📖👓🏷 Tag your getpocket.com articles automatically using natural language processing

articles getpocket google-cloud natural-language-processing nlp pocket scraper tag

Last synced: 30 Oct 2024

https://github.com/applenob/simple_crf

Simple Conditional Random Field implementation in Python

crf nlp python3

Last synced: 07 Nov 2024

https://github.com/IndexFziQ/KMRC-Papers

A list of recent papers regarding knowledge-based machine reading comprehension.

knowledge knowledge-base machine-reading-comprehension nlp paper reading-comprehension

Last synced: 13 Nov 2024

https://github.com/bentoml/transformers-nlp-service

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

llm llmops mlops model-deployment model-inference-service model-serving nlp nlp-machine-learning online-inference transformer

Last synced: 13 Nov 2024

https://github.com/nschneid/arabic-tagger

AQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training

arabic arabic-language arabic-nlp arabic-wikipedia java named-entities nlp nlp-machine-learning sequence-tagger tagger

Last synced: 08 Nov 2024

https://github.com/megagonlabs/t5-japanese

Code to pre-train Japanese T5 models

natural-language-processing nlp t5 transformer

Last synced: 10 Nov 2024

https://github.com/machinelearningzh/simply-simplify-language

Use machine learning to make your institutional communication more understandable and inclusive.

anthropic einfachesprache leichtesprache llm llms mistral mistralai natural-language-processing nlp openai plainlanguage python spacy streamlit

Last synced: 14 Oct 2024

https://github.com/shibing624/text-feature

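Text feature extraction for novels, papers, argumentative essays, and similar texts; extracts word, sentence, and dependency-relation features. Developed in Python.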

article feature nlp textrank

Last synced: 22 Oct 2024

https://github.com/tmalsburg/txl.el

Emacs extension providing direct access to DeepL's machine translation API.

emacs language language-technology machine-translation nlp

Last synced: 27 Oct 2024

https://github.com/palewire/storysniffer

Inspect a URL and estimate if it contains a news story

data-journalism journalism jupyter-notebook machine-learning news nlp python scikit-learn

Last synced: 11 Oct 2024

https://github.com/microsoft/vistalk

A JavaScript toolkit for Natural Language-based Visualization Authoring

nlp nx reactjs tensorflowjs transformer vega vega-lite visualization

Last synced: 07 Oct 2024

https://github.com/sfischer13/python-arpa

🐍 Python library for n-gram models in ARPA format

arpa computational-linguistics language-model library lm nlp python python-3

Last synced: 01 Nov 2024

https://github.com/ysy1216/firewallm

By calling FirewaLLM, users can preserve the accuracy of the large model while greatly reducing the risk of privacy leakage when interacting with it. We believe that FirewaLLM is a privacy-protected ChatGPT interaction platform.

chatbot chatgpt firewall flask llm nlp privacy python web

Last synced: 09 Nov 2024

https://github.com/ardauzunoglu/rte-speech-generator

Natural Language Processing to generate new speeches for the President of Turkey.

natural-language-processing nlp politics python speech-processing tensorflow turkce turkish turkish-nlp

Last synced: 12 Nov 2024

https://github.com/leoneversberg/llm-chatbot-rag

A local LLM chatbot with RAG for PDF input files

chatbot llm nlp rag

Last synced: 08 Aug 2024

https://github.com/ahmedbesbes/audiolizr

A bentoML-powered API to transcribe audio and make sense of it

bentoml bentoml-service docker nlp openai openai-whisper pytube speech-recognition t5 torch transformers

Last synced: 07 Aug 2024

https://github.com/Flight-School/sentences

A command-line utility that splits natural language text into sentences.

cli macos nlp sentence-tokenizer swift

Last synced: 05 Aug 2024

https://github.com/gentaiscool/indonesian-nlp

A curated list of research papers and resources on Indonesian languages

deep-learning indonesian javanese local local-languages machine-learning nlp papers research speech sundanese survey

Last synced: 08 Nov 2024

https://github.com/x-tabdeveloping/neofuzz

Blazing fast fuzzy text search for Python.

fuzzy llm nlp python search semantic

Last synced: 14 Nov 2024

https://github.com/kommunicate-io/kommunicate-web-sdk

Kommunicate Web Gen AI Chatbot and Live Chat Plugin

ai chat chatbot chatbots kommunicate live-chat nlp openai support support-chat webplugin

Last synced: 15 Nov 2024

https://github.com/amazon-science/recode

Releasing code for "ReCode: Robustness Evaluation of Code Generation Models"

code-generation large-language-models nlp robustness

Last synced: 12 Nov 2024

https://github.com/stanfordnlp/stanza-train

Model training tutorials for the Stanza Python NLP Library

natural-language-processing nlp stanza

Last synced: 08 Nov 2024

https://github.com/ropensci-archive/geoparser

⛔ ARCHIVED ⛔

geocoding geoparser nlp peer-reviewed r r-package rstats

Last synced: 05 Aug 2024

https://github.com/dair-ai/notebooks

🔬 Sharing your data science notebooks with the community has never been this easy.

artificial-intelligence deep-learning machine-learning nlp

Last synced: 10 Nov 2024

https://github.com/mchmarny/tsignal

Analyzing social media sentiment and its impact on stock market

analytics golang nasdaq nlp sentiment-analysis twitter

Last synced: 08 Nov 2024

https://github.com/aws-solutions/content-localization-on-aws

Automatically generate multi-language subtitles using AWS AI/ML services. Machine generated subtitles can be edited to improve accuracy and downstream tracks will automatically be regenerated based on the edits. Built on Media Insights Engine (https://github.com/awslabs/aws-media-insights-engine)

amazon-comprehend amazon-polly amazon-transcribe amazon-translate audio aws-media-insights-engine captions content-analysis localisation localization media mie nlp nlp-machine-learning speech-to-text subtitles video video-on-demand vod

Last synced: 08 Nov 2024

https://github.com/adirthaborgohain/ner-re

A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3. Given a text, the pipeline extracts the entities it was trained on, disambiguates them to their normalized forms through an Entity Linker connected to a Knowledge Base, and assigns relations between the entities, if any.

named-entity-recognition nlp relation-extraction spacy transformers

Last synced: 09 Nov 2024

https://github.com/snnclsr/ner

Turkish Named Entity Recognition

ner nlp

Last synced: 10 Oct 2024

https://github.com/hiyouga/pban-pytorch

A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis, PyTorch implementation.

aspect-based-sentiment-analysis attention-model deep-learning natural-language-processing nlp pytorch sentiment-analysis

Last synced: 27 Oct 2024

https://github.com/thisisiron/nmt-attention-tf2

👫 Effective Approaches to Attention-based Neural Machine Translation implemented in TensorFlow 2.0

attention lstm natural-language-processing neural-machine-translation nlp nmt tensorflow tensorflow2 tf2 translation

Last synced: 08 Nov 2024

https://github.com/cocoa-ai/namescoremldemo

🏷 iOS11 demo application for predicting gender from first names.

classification coreml coreml-models gender-classification ios machine-learning nlp swift swift4

Last synced: 07 Nov 2024

https://github.com/psolbach/metadoc

Aviation grade news article metadata extraction

extraction metadata news nlp perceptron

Last synced: 08 Nov 2024

https://github.com/cocoa-ai/NamesCoreMLDemo

🏷 iOS11 demo application for predicting gender from first names.

classification coreml coreml-models gender-classification ios machine-learning nlp swift swift4

Last synced: 09 Aug 2024

https://github.com/xxjwxc/gohanlp

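Golang RESTful client for HanLP: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional conversion, natural language processing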

ai dependency-parser hanlp named-entity-recognition natural-language-processing nlp pos-tagging semantic-parsing text-classification

Last synced: 28 Oct 2024

https://github.com/bastienbot/nlp-js-tools-french

POS tagger, lemmatizer and stemmer for the French language in JavaScript

lemmatization lemmatizer nlp postagging postgresql stemmer stemming tokenization tokenizer

Last synced: 28 Aug 2024

https://github.com/bangla-rag/porag

Fully Configurable RAG Pipeline for Bengali Language RAG Applications. Supports both Local and Huggingface Models, Built with Langchain.

ai bengali bengali-nlp chromadb langchain llama3 llm nlp rag transformers

Last synced: 10 Oct 2024

https://github.com/anjum48/commonlitreadabilityprize

4th Place solution for the Kaggle CommonLit Readability Prize

huggingface kaggle nlp pytorch transformers

Last synced: 14 Oct 2024

https://github.com/syzer/sentiment-analyser

ML that can extract German and English sentiment

english german nlp nlp-library node-js nodejs sentiment-analyser sentiment-analysis

Last synced: 28 Oct 2024