Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/liaad/tweet2story

Repository for the Tweet2Story framework for the extraction of narratives from tweets.

dataset narrative-extraction nlp python

Last synced: 10 Nov 2024

https://github.com/dayyass/neural-machine-translation

Pipeline for training Stanford Seq2Seq Neural Machine Translation using PyTorch.

deep-learning natural-language-processing neural-machine-translation nlp pytorch seq2seq seq2seq-attention-model

Last synced: 14 Oct 2024

https://github.com/winkjs/wink-jaro-distance

An Implementation of Jaro Distance Algorithm by Matthew A. Jaro

jaro jaro-distance jaro-similarity natural-language-processing nlp string-matching

Last synced: 09 Nov 2024

https://github.com/juliasilge/ibm-ai-day

Presentation for IBM Community Day AI

machine-learning nlp nlp-machine-learning r tidytext

Last synced: 13 Oct 2024

https://github.com/cloudera/cml_amp_few-shot_text_classification

Perform topic classification on news articles in several limited-labeled data regimes.

bert few-shot-learning nlp text-embedding zero-shot-classification

Last synced: 07 Nov 2024

https://github.com/sunitroy2703/google-summer-of-code-2021-tensorflow

📌Final submission for Google Summer of Code at @Tensorflow ❤️

android bert google-summer-of-code gsoc ios java nlp swift tensorflow tensorflow-lite tflite

Last synced: 23 Oct 2024

https://github.com/qanastek/drbert

DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains

bert biomedical french learning machine machine-learning medical ml nlp nlp-machine-learning taln text

Last synced: 12 Oct 2024

https://github.com/ct83/bunyip

Bunyip is a Chrome extension that detects AI-generated text. It helps users spot fake news articles that may have been generated automatically rather than written by a real human.

artificial-intelligence chrome-extension chrome-extensions deep-learning gpt gpt-2 gpt-detector machine-learning natural-language-processing nlp nlp-machine-learning openai python python3 serverless-applications

Last synced: 11 Oct 2024

https://github.com/nikimanoledaki/finbot-api

🤖 API for Ubb, a chatbot that trains a Natural Language Processing model using NLTK & TensorFlow to answer questions about personal finance, built with Django

api chatbot django nlp nltk python tensorflow tensorflow-models

Last synced: 10 Oct 2024

https://github.com/hankcs/gohanlp

Golang RESTful Client for HanLP

natural-language-processing nlp

Last synced: 13 Oct 2024

https://github.com/stefan-it/gc4lm

GC4LM: A Colossal (Biased) language model for German

gc4lm german language-model nlp

Last synced: 23 Oct 2024

https://github.com/oarriaga/luvina

High-level Natural Language Processing (NLP) for Python.

natural-language-processing nlp nltk python spacy

Last synced: 14 Oct 2024

https://github.com/paradite/techspeak

📃 Generate random sentences with tech terms

context-free-grammar generator javascript nlp sentence-generator

Last synced: 23 Oct 2024

https://github.com/jaayperez/facebook-ai-bot

Facebook A.I. chatbot in Node.js that connects to a Facebook page to offer a conversational user interface, powered by Google's AI technology.

ai artificial-intelligence bot chatbot dialogflow dialogflow-v2 express facebook facebook-messenger facebook-messenger-bot javascript messenger natural-language-processing nlp node nodejs

Last synced: 28 Oct 2024

https://github.com/cathalgarvey/whatlang-py

Simple bindings to the whatlang Rust package

language language-detection language-detector nlp

Last synced: 11 Oct 2024

https://github.com/trykatchup/tesi-triennale

Design of tools based on deep neural networks for detecting similarity between passwords (Bachelor's thesis, Computer Engineering, Alma Mater Studiorum, Università di Bologna)

artificial-intelligence deep-neural-networks nlp password password-similarity security-privacy-machine-learning

Last synced: 13 Oct 2024

https://github.com/kotartemiy/topic-labeled-news-dataset

100k+ topic labeled news articles published from thousands of news websites

media news nlp topic topic-modeling topics

Last synced: 17 Nov 2024

https://github.com/sap-samples/acl2019-commonsense

Source code for the paper "Attention Is (not) All You Need for Commonsense Reasoning" published at ACL 2019.

commonsense-reasoning machine-learning nlp sample

Last synced: 15 Nov 2024

https://github.com/koichiyasuoka/guwencombo

Tokenizer, POS-tagger, and dependency-parser for Classical Chinese

ancient-chinese classical-chinese literary-chinese nlp

Last synced: 16 Nov 2024

https://github.com/mapmeld/aoc_reply_dataset

Building a dataset of Twitter replies for unsupervised learning / bot-blocking

abuse-detection nlp nlp-machine-learning scraping-tool twitter

Last synced: 15 Nov 2024

https://github.com/shuxiaobo/text-representation

Text representation resources: papers, code, reviews, datasets, blogs, theses, and so on.

benchmark competition embeddings nlp representation-learning scholars sentence-embeddings text-classification thesis transfer-learning

Last synced: 17 Nov 2024

https://github.com/koichiyasuoka/supar-kanbun

Tokenizer, POS-tagger, and dependency-parser for Classical Chinese

ancient-chinese classical-chinese literary-chinese nlp

Last synced: 16 Nov 2024

https://github.com/kingabzpro/fastapi-ml-project

Learning and building an API using FastAPI

api fastapi nlp spacy testing

Last synced: 17 Nov 2024

https://github.com/aditeyabaral/calbert

CalBERT - Code-mixed Adaptive Language representations using BERT, published at AAAI-MAKE 2022

bert code-mixed deep-learning machine-learning natural-language-processing nlp transformer

Last synced: 16 Nov 2024

https://github.com/medspacy/sectionizer

A rule-based Python module for splitting documents into sections.

clinical-nlp medspacy nlp nlp-library pipeline spacy

Last synced: 11 Nov 2024

https://github.com/adriacabeza/DeepCatalan

🤖 Deep Catalan: Bringing the Catalan language closer to deep learning using ULMFiT.

catalan catalan-language classificador fastai fine-tuning nlp pytorch ulmfit

Last synced: 26 Oct 2024

https://github.com/worldbank/wb-nlp-apps

This repository contains the NLP modeling components and web application implementations of a project for knowledge and data discovery funded by the Knowledge for Change Program (KCP) and the Joint Data Center on Forced Displacement (JDC).

data-discovery lda machine-learning nlp python topic-modeling word2vec

Last synced: 10 Nov 2024

https://github.com/hourout/tensordata

CV, NLP, DM datasets Toolkit for Machine Learning.

cv data-mining datasets machine-learning nlp

Last synced: 05 Nov 2024

https://github.com/analyticalmonk/pyspark_nlp_workshop

Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP"

databricks databricks-notebooks distributed-computing nlp pyspark spark spark-nlp workshop

Last synced: 08 Nov 2024

https://github.com/erfanzar/agentx

AgentX is an open-source library that helps people run LLMs on their own computers or serve them as easily as possible, with support for multiple backends such as PyTorch, llama.cpp, Ollama, and EasyDeL.

easydel jax llama-cpp llama-cpp-python machine-learning nlp ollama

Last synced: 22 Oct 2024

https://github.com/nishiwen1214/at_papers

Must-read papers on Adversarial training for neural networks!

adversarial-training generalization nlp robustness

Last synced: 19 Nov 2024

https://github.com/hit-scir/abacus

Abacus Code LLM (珠算), a large language model for code

code-generation large-language-model nlp

Last synced: 10 Nov 2024

https://github.com/nishiwen1214/superglue-bert4keras

SuperGLUE baseline code based on bert4keras

baseline bert bert4keras keras nlp nlu superglue

Last synced: 19 Nov 2024

https://github.com/direct-phonology/dphon

Uncover Old Chinese textual parallels based on sound

chinese-traditional nlp phonology python text-analysis

Last synced: 06 Nov 2024

https://github.com/jackfsuia/shampoosalesagent

A minimal LLM sales agent framework for fast deployment and benchmarking of sales agents. Supports OpenAI models, Claude, HuggingFace models, Gemini, Ernie (文心一言 4.0), Baichuan (百川), Qwen (通义千问), Moonshot (月之暗面), GLM (智谱), and Deepseek.

agent ai framework gpt llm machine-learning nlp recommendation-system retail salesperson selling-platform shampoo shopping

Last synced: 14 Nov 2024

https://github.com/arne-cl/ppi_graphkernel

all-paths graph kernel for protein-protein interaction extraction

graph-kernel natural-language-processing nlp ppi protein-protein-interaction python

Last synced: 10 Nov 2024

https://github.com/deepraj1729/tchatbot-api

A Flask REST API to serve trained ChatBots using Tensorflow Serving and Docker Containers

api-rest chatbot deep-learning flask flask-restful framwork keras nlp preprocessing requests tensorflow tf-serving

Last synced: 12 Nov 2024

https://github.com/alexandrevl/supersummarizeai

Unleash the power of AI with SuperSummarizeAI! Effortlessly extract, condense, and clip content from webpages and YouTube videos using ChatGPT, turning endless streams of content into digestible summaries.

beautifulsoup chatgpt content-analysis multilingual nlp openai papperclip text text-processing text-summarization web-scraping youtube

Last synced: 09 Nov 2024

https://github.com/wittline/recommendation-system

Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)

bert bm25 nlp python recommender-system recsys text-analysis tf-idf word2vec

Last synced: 14 Oct 2024

https://github.com/alvarobartt/ea-associate-ds

Electronic Arts (EA) NLP Assignment for: Associate Data Scientist

data-science electronic-arts nlp recruitment-task

Last synced: 14 Oct 2024

https://github.com/luozhouyang/tplinker

TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking

entity-extraction nlp pytorch-implementation relation-extraction

Last synced: 10 Nov 2024

https://github.com/helboukkouri/embedding-visualization

This is a project for visualizing word embeddings based on the work of Andrei Kashcha (@anvaka).

fasttext glove graphs nlp visualization word-embeddings word2vec

Last synced: 03 Sep 2024

https://github.com/devmount/neural-network-pos-tagger

Train and evaluate neural network language models for POS tagging, and tag input sentences using a trained model.

embeddings feedforward-neural-network neural-network neural-networks nlp part-of-speech-tagger pos-tagger pos-tagging recurrent-neural-networks word-embeddings

Last synced: 27 Oct 2024

https://github.com/aquadzn/deploy-transformers

Easily deploy a state-of-the-art language model from HuggingFace's Transformers

deployment gpt-2 language-model nlp pytorch pytorch-transformers transformers web-app

Last synced: 14 Oct 2024

https://github.com/harisbinzia/urdu-word-segmentation

Urdu Word Segmentation using Conditional Random Fields (CRFs)

crf nlp segmentation urdu

Last synced: 07 Nov 2024

https://github.com/tech4germany/bam-inclusify

INCLUSIFY is a tool to support the practical use of diversity-sensitive language in German.

diversity equality german govtech language nlp react t4g tech4germany

Last synced: 16 Nov 2024

https://github.com/omarsar/nlp_with_tensorflow

NLP tutorials I have written using TensorFlow

cnn deep-learning nlp rnn tensorflow

Last synced: 13 Oct 2024

https://github.com/quadrismegistus/cadence

Rhythm analysis toolkit in Python

nlp python rhythm

Last synced: 23 Oct 2024

https://github.com/johannkm/goex-search

(Winner | Capital One) A Yelp search app that summarizes reviews using Watson and Aylien Text API

aylien golang google-places-api nlp vue watson-api yelp-fusion-api

Last synced: 13 Oct 2024

https://github.com/dcavar/spacy-json-nlp

spaCy wrapper for JSON-NLP.

json natural-language-processing nlp spacy

Last synced: 18 Oct 2024

https://github.com/nyandwi/deep_learning_with_tensorflow

Deep Learning with TensorFlow for basic neural networks tasks, computer vision and natural language processing.

computer-vision deep-learning machine-learning nlp tensorflow

Last synced: 11 Nov 2024

https://github.com/orgoro/white-2-black

The official code to reproduce results from the NAACL 2019 paper: White-to-Black: Efficient Distillation of Black-Box Adversarial Attacks

adversarial-attacks adversarial-networks nlp toxic-comment-classification toxicity

Last synced: 23 Oct 2024

https://github.com/fargolo/textgraphs.jl

Graph representations of text

graphs julia nlp

Last synced: 10 Nov 2024

https://github.com/kenlimmj/fightin-words

A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.

bayesian-methods evaluation-metrics nlp scikit-learn

Last synced: 12 Oct 2024

https://github.com/ztjhz/t5-jax

JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

jax natural-language-processing nlp nlp-model t5

Last synced: 28 Oct 2024

https://github.com/rahul-jha98/justjoking.ai

Using a Transformer to learn a language model and generate short jokes

gpt-2 joke jokegenerator language-model nlg nlp tensorflow2 transformer-models

Last synced: 19 Nov 2024

https://github.com/maxim5/cs224n-2020-winter

All lecture notes, slides, and assignments from Stanford's CS224n: Natural Language Processing with Deep Learning class

cs224n deep-learning machine-learning nlp stanford-nlp

Last synced: 05 Nov 2024

https://github.com/kemingy/plane

A text processing tool that includes tag (HTML, URL, email) extraction and removal, punctuation normalization, simple segmentation, and so on.

chinese-nlp data-cleaning nlp preprocess regex tokenization tokenizer

Last synced: 27 Oct 2024

https://github.com/jianzhnie/multimodaltookit

Incorporate Image, Text and Tabular Data with HuggingFace Transformers

image machine-learning multimodal nlp tabular-data text transformer transformers

Last synced: 27 Oct 2024

https://github.com/gatoreducator/gatorminer

A visual text mining and analysis tool, based on natural language processing, for student Markdown reflection documents in the Department of Computer Science at Allegheny College.

nlp spacy streamlit textmining

Last synced: 12 Oct 2024

https://github.com/shreyaskarnik/pagepilot

Summarize URLs using the Kagi Universal Summarizer and Read Out Loud

ai kagi nlp summarization

Last synced: 27 Oct 2024

https://github.com/liamca/medical-ner-search

Leveraging Apache cTAKES and Azure Search to Build a Medical Search App

azure azure-search ctakes medical natural-language-processing ner nlp search-engine text-analytics

Last synced: 18 Nov 2024

https://github.com/salesforce/bite

Code for "Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding" (EMNLP 2020).

nlp python tokenizer

Last synced: 08 Nov 2024

https://github.com/allan-nava/go-bard

Go package that returns responses from Google Bard through its API.

bard bard-api chatbot go-library golang google google-bard-api google-bard-go googlebard llm nlp

Last synced: 27 Oct 2024

https://github.com/Rajan-sust/WikiTextCorpusDownloader

A Language Independent Wikipedia Text Corpus Downloader

gensim nlp python3 tensorflow wikipedia

Last synced: 29 Oct 2024

https://github.com/anakin87/who-killed-laura-palmer

Simple Question Answering system, based on data crawled from Twin Peaks Wiki. It is built using 🔍 Haystack, an awesome open-source framework for building search systems that work intelligently over large document collections.

haystack information-retrieval natural-language-processing neural-search nlp python question-answering semantic-search space streamlit streamlit-webapp transformers

Last synced: 23 Oct 2024

https://github.com/thinkwee/eda_zh_bert

Chinese version code for the paper "EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks"

augmentation bert chinese-nlp eda nlp nlp-toolkit

Last synced: 27 Oct 2024

https://github.com/potamides/unsupervised-metrics

Library for experimenting with state-of-the-art evaluation metrics like UScore

evaluation machine-translation metrics nlp sentsim uscore xmoverscore

Last synced: 27 Oct 2024

https://github.com/tianxiaomo/cail2019_rc

China AI and Law Challenge (中国法研杯), CAIL 2019

cail2019 nlp squad

Last synced: 14 Nov 2024

https://github.com/ekote/ai-on-microsoft-azure

Microsoft is building the Polish Digital Valley. As part of this initiative, we took on the challenge of building cloud skills among 150,000 people in Poland. One element of the initiative is a dedicated course on cloud computing and artificial intelligence for engineering and master's students at the Warsaw University of Technology.

artificial-intelligence azure azure-cognitive-services azure-functions azure-machine azure-machine-learning cloud cloudcomputing cognitive-services computer-vision machine-learning nlp

Last synced: 12 Oct 2024

https://github.com/tuanacelik/anthropic-hackathon

🧠 Workshop Notebook and assets for the Anthropic Hackathon

anthropic claude llm nlp prompt-engineering

Last synced: 23 Oct 2024

https://github.com/zamgi/lingvo--classify

Automatic classification of Russian-language text

classification linguistics lingvo natural-language-processing nlp nlp-machine-learning text-classification

Last synced: 05 Nov 2024

https://github.com/tchoutri/botfuel-elixir-sdk

An Elixir SDK for the Botfuel NLP chatbot platform.

botfuel chatbot elixir nlp sdk

Last synced: 05 Nov 2024

https://github.com/salesforce/adversarial-polyglots

Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)

adversarial-attacks adversarial-examples adversarial-training code-mixing multilingual nlp robustness

Last synced: 08 Nov 2024

https://github.com/eellak/gsoc2019-text-extraction

GSoC 2019: Development of a Tool for Extracting Quantitative Text Profiles

computational-linguistics electron-react gsoc-2019 nlp

Last synced: 08 Nov 2024

https://github.com/bryanlimy/rnn-lie-detector

TensorFlow RNN-based Lie Detector on the CSC Deceptive Speech Dataset

gru lstm neural-network nlp rnn tensorflow

Last synced: 23 Oct 2024

https://github.com/clam004/minichatgpt

Annotated tutorial of the Hugging Face TRL repo for reinforcement learning from human feedback, connecting the equations of PPO and GAE to the lines of code in the PyTorch implementation

deep-learning deep-reinforcement-learning fine-tuning language-model large-language-models nlp pytorch reinforcement-learning reinforcement-learning-from-human-feedback transformers

Last synced: 15 Nov 2024