Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/welding-torch/excel-anonymizer

A Python script that anonymizes an Excel file and synthesizes new data in its place.

data-science microsoft nlp pandas presidio privacy

Last synced: 07 Nov 2024

https://github.com/kaleidophon/nlp-uncertainty-zoo

Model zoo for different kinds of uncertainty quantification methods used in Natural Language Processing, implemented in PyTorch.

deep-learning lstm nlp nlp-machine-learning package python pytorch rnn transformers uncertainty-estimation uncertainty-neural-networks uncertainty-quantification

Last synced: 10 Oct 2024

https://github.com/ChenghaoMou/pytorch-pQRNN

Implementation of pQRNN in PyTorch

nlp pqrnn pytorch text-classification

Last synced: 16 Nov 2024

https://github.com/kootenpv/spacy_api

Server/client around spaCy to load spaCy only once

api machine-learning nlp spacy

Last synced: 14 Oct 2024

https://github.com/kennethenevoldsen/spacy-wrap

spaCy-wrap is a wrapper library for spaCy that lets you include existing fine-tuned transformers from Hugging Face in your spaCy pipeline.

deep-learning huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch spacy spacy-extension spacy-extensions spacy-models spacy-nlp spacy-pipeline spacy-transformers text-classification transformers

Last synced: 12 Oct 2024

https://github.com/hsm207/bert_attn_viz

Visualize BERT's self-attention layers on text classification tasks

attention bert explainable-ai nlp tensorflow

Last synced: 28 Oct 2024

https://github.com/onesuper/HuggingFace-Datasets-Text-Quality-Analysis

Retrieves Parquet files from Hugging Face, then identifies and quantifies junky data, duplication, contamination, and biased content in a dataset using pandas

dataset huggingface-datasets llm machine-learning nlp streamlit text-processing

Last synced: 09 Aug 2024

https://github.com/khakhulin/compressed-transformer

Compression of NMT transformer model with tensor methods

compression deep-learning mnist nlp nmt pytorch tensor-train transformer translation tucker

Last synced: 04 Aug 2024

https://github.com/ailln/nlp-roadmap

🗺️ A learning roadmap for natural language processing

natural-language-processing nlp roadmap sequence-labeling word-embedding word-segmentation

Last synced: 18 Nov 2024

https://github.com/deshwalmahesh/phudge

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, using absolute or relative grading, and much more. It also contains a list of available tools, methods, repos, and code for hallucination detection, LLM evaluation, grading, and more.

ai custom-dataset evaluation feedback-collection finetuning hallucination hallucination-detection judge llm llm-evaluation ml nlp phi-3 pytorch sota

Last synced: 18 Nov 2024

https://github.com/nlpcloud/nlpcloud-js

NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...

ad-generator chatbot code-generation conversational-ai embeddings intent-classification keywords-extraction language-detection machine-translation ner nlp paraphrasing question-answering semantic-similarity sentiment-analysis text-classification text-generation text-summarization tokenization

Last synced: 07 Nov 2024

https://github.com/christabor/namebot

A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agencies for sophisticated word generation and ideation.

language name-generation naming naming-agencies nlp nltk

Last synced: 07 Nov 2024

https://github.com/s-nlp/rudetoxifier

Code and data of "Methods for Detoxification of Texts for the Russian Language" paper

nlp russian-language style-transfer

Last synced: 07 Aug 2024

https://github.com/obss/trapper

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

allennlp deep-learning natural-language-processing nlp python pytorch pytorch-transformers transformer transformers

Last synced: 27 Oct 2024

https://github.com/teticio/llama-squad

Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundation) model.

decoder fine-tuning llama2 llama3 nlp question-answering squad

Last synced: 10 Oct 2024

https://github.com/tuanacelik/should-i-follow

🦄 An NLP application just for the lols: built with Haystack to get an overview of what a user is posting about on Twitter

haystack llm nlp twitter

Last synced: 22 Oct 2024

https://github.com/chatopera/chatopera.feishu

Launch an intelligent conversational bot service through the Feishu Open Platform and the Chatopera bot platform. Chatbot, Feishu, Lark.

ai bot chatbot chatopera dialog feishu lark machine-learning nlp nlu python python3

Last synced: 30 Oct 2024

https://github.com/natasha/naeval

Comparing quality and performance of NLP systems for Russian language

evaluation nlp performance-analysis python russian

Last synced: 10 Nov 2024

https://github.com/kavgan/word_cloud

Python word cloud library for use within Jupyter notebook and Python apps.

cloud-library jupyter-notebook nlp python visualization word-cloud wordcloud

Last synced: 30 Oct 2024

https://github.com/undertheseanlp/ner

Vietnamese Named Entity Recognition

named-entity-recognition nlp

Last synced: 11 Nov 2024

https://github.com/yoshoku/suika

Suika 🍉 is a Japanese morphological analyzer written in pure Ruby

morphological-analysis nlp postagger ruby tokenizer

Last synced: 10 Nov 2024

https://github.com/stas00/porting

Helper scripts and notes that were used while porting various NLP models

nlp porting python

Last synced: 22 Oct 2024

https://github.com/jaykef/avachat

AvaChat is a realtime AI chat demo with animated talking heads. It uses large language models (GPT, API2D GPT4, Claude) as text inputs to D-ID's image-to-video talking head model (via the D-ID stream API).

avatar llms nlp

Last synced: 29 Oct 2024

https://github.com/edwardcooper/piidetect

A package to build an end-to-end pipeline for detecting personally identifiable information from text.

nlp pii pii-detection word2vec

Last synced: 11 Nov 2024

https://github.com/coosto/dutch-word-embeddings

Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts.

coosto dutch nlp word2vec word2vec-model wordembeddings

Last synced: 17 Nov 2024

https://github.com/explosion/assets

💥 Explosion Assets

machine-learning nlp spacy spacy-nlp

Last synced: 07 Oct 2024

https://github.com/xavidop/dialogflow-cx-cli

The missing Dialogflow CX CLI to interact with your projects

cli cxcli dialogflow dialogflow-cx dialogflowcx golang nlp nlu test-automation testing-tools

Last synced: 15 Nov 2024

https://github.com/aphp/eds-pseudo

EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports

edsnlp nlp pseudonymisation

Last synced: 03 Sep 2024

https://github.com/Lipairui/textgo

Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!

bert nlp text-classification text-preprocessing text-representation text-search text-similarity

Last synced: 07 Aug 2024

https://github.com/sciss/ws4j

WordNet Similarity for Java provides an API for several Semantic Relatedness/Similarity algorithms. Mirror of https://codeberg.org/sciss/ws4j

nlp wordnet

Last synced: 09 Nov 2024

https://github.com/osu-nlp-group/amplegcg

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs

adversarial-attacks gcg nlp safety

Last synced: 11 Nov 2024

https://github.com/skroutz/turkish_stemmer

A simple Turkish stemming library

nlp stemmer

Last synced: 11 Nov 2024

https://github.com/argosopentech/metaltranslate

Customizable machine translation in C++

machine-learning nlp nlp-machine-learning translation

Last synced: 08 Nov 2024

https://github.com/kinosal/cowriter

Write 10x faster using OpenAI's GPT-3 based Davinci model to autocomplete your text

ai gpt machine-learning nlp

Last synced: 27 Oct 2024

https://github.com/OpenSextant/Xponents

Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.

document-conversion geocoding geonames geoparsing geotagging information-extraction nlp solr tika

Last synced: 05 Nov 2024

https://github.com/explosion/spacy-huggingface-hub

🤗 Push your spaCy pipelines to the Hugging Face Hub

huggingface machine-learning ml-models models natural-language-processing nlp spacy

Last synced: 07 Oct 2024

https://github.com/kenlimmj/rouge

A JavaScript implementation of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) evaluation metric for summaries.

bootstrapping-statistics evaluation-metric jackknifing nlp rouge summarization

Last synced: 10 Nov 2024

https://github.com/salesforce/query-focused-sum

Official code repository for "Exploring Neural Models for Query-Focused Summarization".

deep-learning machine-learning neural-network nlp question-answering summarization

Last synced: 08 Nov 2024

https://github.com/prihoda/golem

Open-source chatbot framework for Python developers. Batteries included 🔋🔋

bot chatbot dialog-management messenger nlp python telegram witai

Last synced: 16 Nov 2024

https://github.com/dongjunlee/dmn-tensorflow

TensorFlow implementation of 'Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (2015)'

babi-tasks dynamic-memory-network hb-experiment natural-language-processing nlp question-answering tensorflow

Last synced: 08 Nov 2024

https://github.com/tokestermw/spacy_grammar

✒️ LanguageTool-style grammar handling with spaCy 2.0

grammar nlp spacy spacy-nlp

Last synced: 07 Nov 2024

https://github.com/lunarwhite/covid-social-analysis

Apply ML to Weibo sentiment: sentiment analysis and visualization of Weibo posts during the COVID-19 pandemic.

crawling data-analysis machine-learning nlp python vizualization

Last synced: 06 Nov 2024

https://github.com/KehaoWu/Jinyong-Corpus

Dictionary of Jin Yong's 15 novels

corpus-data nlp

Last synced: 07 Nov 2024

https://github.com/zamgi/lingvo--ner-ru

Named entity recognition (NER) in Russian texts

linguistics lingvo named-entity-recognition natural-language-processing ner nlp nlp-machine-learning

Last synced: 05 Nov 2024

https://github.com/ecohealthalliance/epitator

EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and EIDR Connect.

disease-surveillance epidemiology geonames nlp spacy toponym-resolution

Last synced: 14 Oct 2024

https://github.com/yuvalpinter/m3gm

Max-Margin Markov Graph Models for WordNet (EMNLP 2018)

markov-model nlp relation-extraction semantics wordnet

Last synced: 27 Oct 2024

https://github.com/yasinkuyu/turkish.js

Turkish Suffix Library for JavaScript: Turkish inflectional and derivational suffixes

javascript nlp stem vowel

Last synced: 06 Nov 2024

https://github.com/kehaowu/jinyong-corpus

Dictionary of Jin Yong's 15 novels

corpus-data nlp

Last synced: 13 Oct 2024

https://github.com/saareliad/FTPipe

FTPipe and related pipeline model parallelism research.

deep-neural-networks distributed-training fine-tuning nlp pipeline-parallelism t5

Last synced: 07 Nov 2024

https://github.com/IndexFziQ/KMRC-Papers

A list of recent papers regarding knowledge-based machine reading comprehension.

knowledge knowledge-base machine-reading-comprehension nlp paper reading-comprehension

Last synced: 13 Nov 2024

https://github.com/jina-ai/example-multimodal-fashion-search

Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP

computer-vision deep-learning neural-search nlp python

Last synced: 16 Nov 2024

https://github.com/danieldeutsch/repro

Repro is a library for easily running code from published papers via Docker.

docker machine-learning nlp reproducibility reproducible-research

Last synced: 06 Nov 2024

https://github.com/dpressel/textrank-js

TextRank algorithm implementation in JavaScript

nlp textrank

Last synced: 28 Oct 2024

https://github.com/rangilyu/llama.mmengine

Training LLaMA language model with MMEngine! It supports LoRA fine-tuning!

alpaca fine-tuning language-model llama lora nlp

Last synced: 22 Oct 2024

https://github.com/greenelab/pubtator

Retrieve and process PubTator annotations

data nlp pubmed pubtator snorkel text-mining tool

Last synced: 13 Nov 2024

https://github.com/applenob/simple_crf

Simple conditional random field (CRF) implementation in Python

crf nlp python3

Last synced: 07 Nov 2024

https://github.com/sanghviharshit/pocket-tagger

📖👓🏷Tag your getpocket.com articles automatically using natural language processing

articles getpocket google-cloud natural-language-processing nlp pocket scraper tag

Last synced: 30 Oct 2024

https://github.com/perone/feste

Feste is a free and open-source framework allowing scalable composition of NLP tasks using a graph execution model that is optimized and executed by specialized schedulers.

deep-learning language-model machine-learning nlp

Last synced: 28 Oct 2024

https://github.com/machinelearningzh/simply-simplify-language

Use machine learning to make your institutional communication more understandable and inclusive.

anthropic einfachesprache leichtesprache llm llms mistral mistralai natural-language-processing nlp openai plainlanguage python spacy streamlit

Last synced: 14 Oct 2024

https://github.com/bentoml/transformers-nlp-service

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

llm llmops mlops model-deployment model-inference-service model-serving nlp nlp-machine-learning online-inference transformer

Last synced: 13 Nov 2024

https://github.com/megagonlabs/t5-japanese

Code to pre-train Japanese T5 models

natural-language-processing nlp t5 transformer

Last synced: 10 Nov 2024

https://github.com/tmalsburg/txl.el

Emacs extension providing direct access to DeepL's machine translation API.

emacs language language-technology machine-translation nlp

Last synced: 27 Oct 2024

https://github.com/nschneid/arabic-tagger

AQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training

arabic arabic-language arabic-nlp arabic-wikipedia java named-entities nlp nlp-machine-learning sequence-tagger tagger

Last synced: 08 Nov 2024

https://github.com/shibing624/text-feature

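Text feature extraction for texts such as novels, papers, and argumentative essays; extracts features such as words, sentences, and dependency relations. Developed in Python.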

article feature nlp textrank

Last synced: 22 Oct 2024

https://github.com/palewire/storysniffer

Inspect a URL and estimate if it contains a news story

data-journalism journalism jupyter-notebook machine-learning news nlp python scikit-learn

Last synced: 11 Oct 2024

https://github.com/ysy1216/firewallm

By calling FirewaLLM, users can preserve the accuracy of the large model while greatly reducing the risk of privacy leakage when interacting with it. FirewaLLM is intended as a privacy-protected ChatGPT interaction platform.

chatbot chatgpt firewall flask llm nlp privacy python web

Last synced: 09 Nov 2024

https://github.com/sfischer13/python-arpa

🐍 Python library for n-gram models in ARPA format

arpa computational-linguistics language-model library lm nlp python python-3

Last synced: 01 Nov 2024

https://github.com/kommunicate-io/kommunicate-web-sdk

Kommunicate Web Gen AI Chatbot and Live Chat Plugin

ai chat chatbot chatbots kommunicate live-chat nlp openai support support-chat webplugin

Last synced: 15 Nov 2024

https://github.com/microsoft/vistalk

A JavaScript toolkit for Natural Language-based Visualization Authoring

nlp nx reactjs tensorflowjs transformer vega vega-lite visualization

Last synced: 07 Oct 2024

https://github.com/x-tabdeveloping/neofuzz

Blazing fast fuzzy text search for Python.

fuzzy llm nlp python search semantic

Last synced: 14 Nov 2024