Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article proposing a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2024-11-20 00:15:35 UTC
https://github.com/jackaduma/ai-waf
AI driven Web Application Firewall
ai classification-algorithm cyber-security cybersecurity deep-learning machine-learning natural-language-processing neural-network nlp nlp-deep-learning nlp-machine-learning text-classification textcnn waf web-application-firewall webapplicationfirewall
Last synced: 11 Nov 2024
https://github.com/liaad/tweet2story
Repository for the Tweet2Story framework for the extraction of narratives from tweets.
dataset narrative-extraction nlp python
Last synced: 10 Nov 2024
https://github.com/dayyass/neural-machine-translation
Pipeline for training Stanford Seq2Seq Neural Machine Translation using PyTorch.
deep-learning natural-language-processing neural-machine-translation nlp pytorch seq2seq seq2seq-attention-model
Last synced: 14 Oct 2024
https://github.com/thisisiron/transformer-tf2
🤖 Transformer implemented in TensorFlow 2.0
attention attention-is-all-you-need natural-language-processing nlp self-attention tensorflow tensorflow2 tf2 transformer translation
Last synced: 08 Nov 2024
https://github.com/linuxscout/mishtar
Mishtar: Named and temporal entities chunker
arabic-language arabic-nlp chunking named-entity-recognition nlp temporal-entities-chunker
Last synced: 25 Oct 2024
https://github.com/winkjs/wink-jaro-distance
An Implementation of Jaro Distance Algorithm by Matthew A. Jaro
jaro jaro-distance jaro-similarity natural-language-processing nlp string-matching
Last synced: 09 Nov 2024
https://github.com/wannaphong/isannlp
Isan NLP
natural-language-processing nlp thai-language thai-nlp
Last synced: 08 Nov 2024
https://github.com/inphyt/imdb_sentiment_analysis_bert
BERT Sentiment Classification on the IMDb Large Movie Review Dataset.
bert bert-model data-mining data-mining-algorithms data-mining-python data-science machine-learning machine-learning-algorithms natural-language-processing nlp nlp-machine-learning scikit-learn sentiment-analysis sentiment-classification spacy spacy-models spacy-nlp
Last synced: 12 Nov 2024
https://github.com/riccorl/ipa
NLP Preprocessing Pipeline Wrappers
lemmatization model natural-language-processing nlp part-of-speech-tagger pipeline preprocessing spacy stanza tagging token tokenizer wrapper
Last synced: 14 Oct 2024
https://github.com/juliasilge/ibm-ai-day
Presentation for IBM Community Day AI
machine-learning nlp nlp-machine-learning r tidytext
Last synced: 13 Oct 2024
https://github.com/aahouzi/stock-price-forecasting
An overview of various quantitative techniques and trading strategies for predicting stock prices, based on historical data from YahooFinance.
arima-model bollinger-bands data-analytics finance financial-analysis fintech kdj macd-divegence-strategy macd-indicator momentum-trading-strategy moving-average nlp quantitative-finance stochastic-oscillator stock stock-market time-series trading vader-sentiment-analysis
Last synced: 27 Oct 2024
https://github.com/cloudera/cml_amp_few-shot_text_classification
Perform topic classification on news articles in several limited-labeled data regimes.
bert few-shot-learning nlp text-embedding zero-shot-classification
Last synced: 07 Nov 2024
https://github.com/sunitroy2703/google-summer-of-code-2021-tensorflow
📌Final submission for Google Summer of Code at @Tensorflow ❤️
android bert google-summer-of-code gsoc ios java nlp swift tensorflow tensorflow-lite tflite
Last synced: 23 Oct 2024
https://github.com/qanastek/drbert
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
bert biomedical french learning machine machine-learning medical ml nlp nlp-machine-learning taln text
Last synced: 12 Oct 2024
https://github.com/ct83/bunyip
Bunyip is a Chrome extension that detects AI-generated text, helping users spot fake news articles that may have been generated automatically rather than written by a real human.
artificial-intelligence chrome-extension chrome-extensions deep-learning gpt gpt-2 gpt-detector machine-learning natural-language-processing nlp nlp-machine-learning openai python python3 serverless-applications
Last synced: 11 Oct 2024
https://github.com/nikimanoledaki/finbot-api
🤖 API for Ubb, a chatbot that trains a Natural Language Processing model using NLTK & TensorFlow to answer questions about personal finance, built with Django
api chatbot django nlp nltk python tensorflow tensorflow-models
Last synced: 10 Oct 2024
https://github.com/hankcs/gohanlp
Golang RESTful Client for HanLP
natural-language-processing nlp
Last synced: 13 Oct 2024
https://github.com/stefan-it/gc4lm
GC4LM: A Colossal (Biased) language model for German
gc4lm german language-model nlp
Last synced: 23 Oct 2024
https://github.com/oarriaga/luvina
High-level Natural Language Processing (NLP) for Python.
natural-language-processing nlp nltk python spacy
Last synced: 14 Oct 2024
https://github.com/paradite/techspeak
📃 Generate random sentences with tech terms
context-free-grammar generator javascript nlp sentence-generator
Last synced: 23 Oct 2024
https://github.com/jaayperez/facebook-ai-bot
Facebook A.I. chatbot in Node.js that connects to a Facebook page for a new way to interact through a conversational user interface, powered by Google’s robust AI technology.
ai artificial-intelligence bot chatbot dialogflow dialogflow-v2 express facebook facebook-messenger facebook-messenger-bot javascript messenger natural-language-processing nlp node nodejs
Last synced: 28 Oct 2024
https://github.com/cathalgarvey/whatlang-py
Simple bindings to the whatlang Rust package
language language-detection language-detector nlp
Last synced: 11 Oct 2024
https://github.com/delph-in/pydmrs
A library for manipulating DMRS structures
computational-linguistics delph-in dependency-graph dmrs formal-semantics hpsg linguistics minimal-recursion-semantics mrs natural-language natural-language-processing nlp python semantics
Last synced: 07 Aug 2024
https://github.com/amir9ume/urdu_ghazals_rekhta
Dataset for Urdu Ghazals
data dataset language-model machine-learning nlp parser rekhta urdu
Last synced: 18 Nov 2024
https://github.com/trykatchup/tesi-triennale
Design of tools based on deep neural networks for detecting similarity between passwords (Bachelor's thesis, Computer Engineering, Alma Mater Studiorum, Università di Bologna)
artificial-intelligence deep-neural-networks nlp password password-similarity security-privacy-machine-learning
Last synced: 13 Oct 2024
https://github.com/adilzouitine/pyfeel
Python package for emotion analysis in French
data-analysis data-mining data-science emotion emotion-analysis nlp nlp-library opinion-mining python
Last synced: 13 Oct 2024
https://github.com/kotartemiy/topic-labeled-news-dataset
100k+ topic labeled news articles published from thousands of news websites
media news nlp topic topic-modeling topics
Last synced: 17 Nov 2024
https://github.com/sap-samples/acl2019-commonsense
Source code for the paper "Attention Is (not) All You Need for Commonsense Reasoning" published at ACL 2019.
commonsense-reasoning machine-learning nlp sample
Last synced: 15 Nov 2024
https://github.com/koichiyasuoka/guwencombo
Tokenizer POS-tagger and Dependency-parser for Classical Chinese
ancient-chinese classical-chinese literary-chinese nlp
Last synced: 16 Nov 2024
https://github.com/katanaml/table-query-model
Table Query with ML
huggingface-transformers machine-learning nlp
Last synced: 13 Nov 2024
https://github.com/mapmeld/aoc_reply_dataset
Building a dataset of Twitter replies for unsupervised learning / bot-blocking
abuse-detection nlp nlp-machine-learning scraping-tool twitter
Last synced: 15 Nov 2024
https://github.com/shuxiaobo/text-representation
Text representation resources, such as papers, code, reviews, datasets, blogs, and theses.
benchmark competition embeddings nlp representation-learning scholars sentence-embeddings text-classification thesis transfer-learning
Last synced: 17 Nov 2024
https://github.com/koichiyasuoka/supar-kanbun
Tokenizer POS-tagger and Dependency-parser for Classical Chinese
ancient-chinese classical-chinese literary-chinese nlp
Last synced: 16 Nov 2024
https://github.com/aditeyabaral/calbert
CalBERT - Code-mixed Adaptive Language representations using BERT, published at AAAI-MAKE 2022
bert code-mixed deep-learning machine-learning natural-language-processing nlp transformer
Last synced: 16 Nov 2024
https://github.com/medspacy/sectionizer
A rule-based Python module for splitting documents into sections.
clinical-nlp medspacy nlp nlp-library pipeline spacy
Last synced: 11 Nov 2024
https://github.com/adriacabeza/DeepCatalan
🤖 Deep Catalan: Bringing the Catalan language closer to deep learning using ULMFiT.
catalan catalan-language classificador fastai fine-tuning nlp pytorch ulmfit
Last synced: 26 Oct 2024
https://github.com/worldbank/wb-nlp-apps
This repository contains the NLP modeling components and web application implementations of a project for knowledge and data discovery funded by the Knowledge for Change Program (KCP) and the Joint Data Center on Forced Displacement (JDC).
data-discovery lda machine-learning nlp python topic-modeling word2vec
Last synced: 10 Nov 2024
https://github.com/hourout/tensordata
CV, NLP, DM datasets Toolkit for Machine Learning.
cv data-mining datasets machine-learning nlp
Last synced: 05 Nov 2024
https://github.com/analyticalmonk/pyspark_nlp_workshop
Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP"
databricks databricks-notebooks distributed-computing nlp pyspark spark spark-nlp workshop
Last synced: 08 Nov 2024
https://github.com/erfanzar/agentx
AgentX is an open-source library that helps people run LLMs on their own computers or serve them as easily as possible, with support for multiple backends such as PyTorch, llama.cpp, Ollama, and EasyDeL.
easydel jax llama-cpp llama-cpp-python machine-learning nlp ollama
Last synced: 22 Oct 2024
https://github.com/nishiwen1214/at_papers
Must-read papers on Adversarial training for neural networks!
adversarial-training generalization nlp robustness
Last synced: 19 Nov 2024
https://github.com/hit-scir/abacus
Abacus Code LLM (珠算), a large language model for code
code-generation large-language-model nlp
Last synced: 10 Nov 2024
https://github.com/dayyass/language-modeling
Pipeline for training Language Models using PyTorch.
decoding deep-learning gpt-2 language-modeling lstm natural-language-processing ngrams nlp python pytorch rnn sampling text-generation
Last synced: 14 Oct 2024
https://github.com/nishiwen1214/superglue-bert4keras
SuperGLUE baseline code based on bert4keras
baseline bert bert4keras keras nlp nlu superglue
Last synced: 19 Nov 2024
https://github.com/direct-phonology/dphon
Uncover Old Chinese textual parallels based on sound
chinese-traditional nlp phonology python text-analysis
Last synced: 06 Nov 2024
https://github.com/jackfsuia/shampoosalesagent
A minimal LLM sales agent framework for fast deployment and benchmarking of sales agents. Supports OpenAI models, Claude, HuggingFace models, Gemini, Ernie (文心一言 4.0), Baichuan (百川), Qwen (通义千问), Moonshot (月之暗面), GLM (智谱), and Deepseek.
agent ai framework gpt llm machine-learning nlp recommendation-system retail salesperson selling-platform shampoo shopping
Last synced: 14 Nov 2024
https://github.com/arne-cl/ppi_graphkernel
all-paths graph kernel for protein-protein interaction extraction
graph-kernel natural-language-processing nlp ppi protein-protein-interaction python
Last synced: 10 Nov 2024
https://github.com/deepraj1729/tchatbot-api
A Flask REST API to serve trained ChatBots using Tensorflow Serving and Docker Containers
api-rest chatbot deep-learning flask flask-restful framwork keras nlp preprocessing requests tensorflow tf-serving
Last synced: 12 Nov 2024
https://github.com/alexandrevl/supersummarizeai
Unleash the power of AI with SuperSummarizeAI! Effortlessly extract, condense, and clip content from webpages and YouTube videos using ChatGPT. Turning endless streams of content into digestible summaries.
beautifulsoup chatgpt content-analysis multilingual nlp openai papperclip text text-processing text-summarization web-scraping youtube
Last synced: 09 Nov 2024
https://github.com/riccorl/chinese-word-segmentation-pytorch
Chinese Word Segmentation task based on BERT and implemented in PyTorch
bert bert-embeddings chinese chinese-word-segmentation classification cws deep-learning embeddings neural-network nlp pytorch segmentation token transformer
Last synced: 08 Nov 2024
https://github.com/wittline/recommendation-system
Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)
bert bm25 nlp python recommender-system recsys text-analysis tf-idf word2vec
Last synced: 14 Oct 2024
https://github.com/alvarobartt/ea-associate-ds
Electronic Arts (EA) NLP Assignment for: Associate Data Scientist
data-science electronic-arts nlp recruitment-task
Last synced: 14 Oct 2024
https://github.com/luozhouyang/tplinker
TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking
entity-extraction nlp pytorch-implementation relation-extraction
Last synced: 10 Nov 2024
https://github.com/helboukkouri/embedding-visualization
This is a project for visualizing word embeddings based on the work of Andrei Kashcha (@anvaka).
fasttext glove graphs nlp visualization word-embeddings word2vec
Last synced: 03 Sep 2024
https://github.com/devmount/neural-network-pos-tagger
Train and evaluate neural network language models for POS tagging, tag input sentences according to a trained model.
embeddings feedforward-neural-network neural-network neural-networks nlp part-of-speech-tagger pos-tagger pos-tagging recurrent-neural-networks word-embeddings
Last synced: 27 Oct 2024
https://github.com/aquadzn/deploy-transformers
Easily deploy a state-of-the-art language model from HuggingFace's Transformers
deployment gpt-2 language-model nlp pytorch pytorch-transformers transformers web-app
Last synced: 14 Oct 2024
https://github.com/explosion/spacy-loggers
📟 Logging utilities for spaCy
logging machine-learning natural-language-processing nlp python spacy
Last synced: 07 Oct 2024
https://github.com/harisbinzia/urdu-word-segmentation
Urdu Word Segmentation using Conditional Random Fields (CRFs)
Last synced: 07 Nov 2024
https://github.com/tech4germany/bam-inclusify
INCLUSIFY is a tool to support the practical use of diversity-sensitive language in German.
diversity equality german govtech language nlp react t4g tech4germany
Last synced: 16 Nov 2024
https://github.com/omarsar/nlp_with_tensorflow
NLP tutorials I have written using TensorFlow
cnn deep-learning nlp rnn tensorflow
Last synced: 13 Oct 2024
https://github.com/quadrismegistus/cadence
Rhythm analysis toolkit in Python
Last synced: 23 Oct 2024
https://github.com/johannkm/goex-search
(Winner | Capital One) A Yelp search app that summarizes reviews using Watson and Aylien Text API
aylien golang google-places-api nlp vue watson-api yelp-fusion-api
Last synced: 13 Oct 2024
https://github.com/dcavar/spacy-json-nlp
spaCy wrapper for JSON-NLP.
json natural-language-processing nlp spacy
Last synced: 18 Oct 2024
https://github.com/nyandwi/deep_learning_with_tensorflow
Deep Learning with TensorFlow for basic neural networks tasks, computer vision and natural language processing.
computer-vision deep-learning machine-learning nlp tensorflow
Last synced: 11 Nov 2024
https://github.com/orgoro/white-2-black
The official code to reproduce results from the NAACL 2019 paper: White-to-Black: Efficient Distillation of Black-Box Adversarial Attacks
adversarial-attacks adversarial-networks nlp toxic-comment-classification toxicity
Last synced: 23 Oct 2024
https://github.com/kenlimmj/fightin-words
A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.
bayesian-methods evaluation-metrics nlp scikit-learn
Last synced: 12 Oct 2024
https://github.com/ztjhz/t5-jax
JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
jax natural-language-processing nlp nlp-model t5
Last synced: 28 Oct 2024
https://github.com/rahul-jha98/justjoking.ai
Using a Transformer to learn a language model and generate short jokes
gpt-2 joke jokegenerator language-model nlg nlp tensorflow2 transformer-models
Last synced: 19 Nov 2024
https://github.com/julesbelveze/text-summarizer
Text Summarizer implemented in PyTorch
attention-mechanism attention-seq2seq nlp pointer-generator pytorch seq2seq text-summarization
Last synced: 20 Oct 2024
https://github.com/maxim5/cs224n-2020-winter
All lecture notes, slides and assignments from CS224n: Natural Language Processing with Deep Learning class by Stanford
cs224n deep-learning machine-learning nlp stanford-nlp
Last synced: 05 Nov 2024
https://github.com/kemingy/plane
A text processing tool that includes tag (HTML, URL, email) extraction and removal, punctuation normalization, simple segmentation, and so on.
chinese-nlp data-cleaning nlp preprocess regex tokenization tokenizer
Last synced: 27 Oct 2024
https://github.com/cohere-ai/public-demos
Public demos using the Cohere platform!
embeddings nlp sentiment-analysis text-classification text-generation text-summarization
Last synced: 07 Oct 2024
https://github.com/jianzhnie/multimodaltookit
Incorporate Image, Text and Tabular Data with HuggingFace Transformers
image machine-learning multimodal nlp tabular-data text transformer transformers
Last synced: 27 Oct 2024
https://github.com/gatoreducator/gatorminer
A visualized text mining and analysis tool for student markdown reflection documents based on Natural language processing in the Dept of CS at Allegheny College.
nlp spacy streamlit textmining
Last synced: 12 Oct 2024
https://github.com/yuanjie-ai/inlp
https://pypi.org/project/iNLP/
nlp nlp-apis nlp-keywords-extraction nlp-library nlp-machine-learning nlp-parsing nlp-resources
Last synced: 16 Oct 2024
https://github.com/shreyaskarnik/pagepilot
Summarize URLs using the Kagi Universal Summarizer and Read Out Loud
Last synced: 27 Oct 2024
https://github.com/liamca/medical-ner-search
Leveraging Apache cTAKES and Azure Search to Build a Medical Search App
azure azure-search ctakes medical natural-language-processing ner nlp search-engine text-analytics
Last synced: 18 Nov 2024
https://github.com/salesforce/bite
Code for "Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding" (EMNLP 2020).
Last synced: 08 Nov 2024
https://github.com/allan-nava/go-bard
Go package that returns response of Google Bard through API.
bard bard-api chatbot go-library golang google google-bard-api google-bard-go googlebard llm nlp
Last synced: 27 Oct 2024
https://github.com/Rajan-sust/WikiTextCorpusDownloader
A Language Independent Wikipedia Text Corpus Downloader
gensim nlp python3 tensorflow wikipedia
Last synced: 29 Oct 2024
https://github.com/anakin87/who-killed-laura-palmer
Simple Question Answering system, based on data crawled from Twin Peaks Wiki. It is built using 🔍 Haystack, an awesome open-source framework for building search systems that work intelligently over large document collections.
haystack information-retrieval natural-language-processing neural-search nlp python question-answering semantic-search space streamlit streamlit-webapp transformers
Last synced: 23 Oct 2024
https://github.com/thinkwee/eda_zh_bert
Chinese version code for the paper "EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks"
augmentation bert chinese-nlp eda nlp nlp-toolkit
Last synced: 27 Oct 2024
https://github.com/potamides/unsupervised-metrics
Library for experimenting with state-of-the-art evaluation metrics like UScore
evaluation machine-translation metrics nlp sentsim uscore xmoverscore
Last synced: 27 Oct 2024
https://github.com/KompleteAI/xllm
🦖 X—LLM: Simple & Cutting Edge LLM Finetuning
alpaca cerebras chatgpt deep-learning deep-neural-networks deeplearning falcon gpt language-model large-language-models llama2 llama2-7b llm mistal mistral mistralai natural-language-processing nlp openai vicuna
Last synced: 05 Aug 2024
https://github.com/ekote/ai-on-microsoft-azure
Microsoft is building the Polish Digital Valley (Polska Dolina Cyfrowa). As part of this initiative, we have taken on the challenge of building cloud skills among 150,000 people in Poland. One element of the initiative is a dedicated course in the engineering and master's degree programs at the Warsaw University of Technology (Politechnika Warszawska), devoted to cloud computing and artificial intelligence.
artificial-intelligence azure azure-cognitive-services azure-functions azure-machine azure-machine-learning cloud cloudcomputing cognitive-services computer-vision machine-learning nlp
Last synced: 12 Oct 2024
https://github.com/tuanacelik/anthropic-hackathon
🧠 Workshop Notebook and assets for the Anthropic Hackathon
anthropic claude llm nlp prompt-engineering
Last synced: 23 Oct 2024
https://github.com/zamgi/lingvo--classify
Automatic classification of Russian-language text
classification linguistics lingvo natural-language-processing nlp nlp-machine-learning text-classification
Last synced: 05 Nov 2024
https://github.com/dellison/dependencytrees.jl
Dependency parsing in Julia
computational-linguistics dependency-parsing natural-language-processing nlp nlp-dependency-parsing
Last synced: 16 Nov 2024
https://github.com/umitkaanusta/mint-youtube
Comment analytics tool for YouTube videos
channel comment nlp self-hosted text-analysis text-classification video youtube youtube-api youtube-videos
Last synced: 23 Oct 2024
https://github.com/salesforce/adversarial-polyglots
Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)
adversarial-attacks adversarial-examples adversarial-training code-mixing multilingual nlp robustness
Last synced: 08 Nov 2024
https://github.com/eellak/gsoc2019-text-extraction
GSoC 2019: Development of a Tool for Extracting Quantitative Text Profiles
computational-linguistics electron-react gsoc-2019 nlp
Last synced: 08 Nov 2024
https://github.com/bryanlimy/rnn-lie-detector
TensorFlow RNN-based Lie Detector on the CSC Deceptive Speech Dataset
gru lstm neural-network nlp rnn tensorflow
Last synced: 23 Oct 2024
https://github.com/clam004/minichatgpt
Annotated tutorial of the Hugging Face TRL repo for reinforcement learning from human feedback, connecting equations from PPO and GAE to the lines of code in the PyTorch implementation
deep-learning deep-reinforcement-learning fine-tuning language-model large-language-models nlp pytorch reinforcement-learning reinforcement-learning-from-human-feedback transformers
Last synced: 15 Nov 2024