Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
- GitHub: https://github.com/topics/nlp
- Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing
- Created by: Alan Turing
- Aliases: natural-language-processing, nlp-machine-learning, nlp-resources
- Last updated: 2025-01-09 00:24:16 UTC
https://github.com/amazon-science/refined
ReFinED is an efficient and accurate entity linking (EL) system.
entity-extraction entity-linking entity-resolution nlp pytorch
Last synced: 08 Jan 2025
https://github.com/explosion/displacy-ent
💥 displaCy-ent.js: An open-source named entity visualiser for the modern web
css javascript named-entities natural-language-processing nlp spacy visualization
Last synced: 25 Sep 2024
https://github.com/milaan9/Python_Natural_Language_Processing
A complete guide to natural language processing (NLP) in Python, covering techniques for implementing NLP, including parsing and text processing, and how to use NLP for text feature engineering.
bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching
Last synced: 25 Dec 2024
https://github.com/dkpro/dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
dkpro java natural-language-processing nlp uima uima-components
Last synced: 03 Jan 2025
https://github.com/milaan9/python_natural_language_processing
A complete guide to natural language processing (NLP) in Python, covering techniques for implementing NLP, including parsing and text processing, and how to use NLP for text feature engineering.
bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching
Last synced: 07 Jan 2025
https://github.com/MaartenGr/Concept
Concept Modeling: Topic Modeling on Images and Text
computer-vision image-processing nlp topic-modeling
Last synced: 05 Nov 2024
https://github.com/fanhuaandluomu/parselawdocuments
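Performs a series of analyses on collected legal documents, including automatic segmentation according to formatting conventions, case similarity computation, case clustering, and recommendation of legal provisions (experiments are currently based on marriage cases and can be extended to other domains).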
Last synced: 12 Nov 2024
https://github.com/iPieter/RobBERT
A Dutch RoBERTa-based language model
bert bert-model language-model nlp nlp-resources roberta transformers
Last synced: 17 Nov 2024
https://github.com/pszemraj/vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
audio audio-processing keyword keyword-extraction nlp python sentence sentence-boundary-detection speech speech-recognition speech-to-text spelling-correction transcription transformer video video-processing video-summarisation video-summarization wav2vec2 whisper
Last synced: 05 Jan 2025
https://github.com/stanford-oval/genie-toolkit
The Genie open-source toolkit for voice assistants (formerly known as Almond)
hacktoberfest natural-language nlp semantic-parsers voice-assistant
Last synced: 06 Jan 2025
https://github.com/textvec/textvec
Text vectorization tool to outperform TFIDF for classification tasks
machine-learning natural-language-processing nlp python text-analysis text-classification text-processing tf-idf
Last synced: 05 Jan 2025
https://github.com/dusty-nv/jetson-voice
ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT
deep-learning jetson jetson-nano nlp pytorch speech-recognition tensorrt text-to-speech
Last synced: 09 Jan 2025
https://github.com/WZBSocialScienceCenter/tmtoolkit
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
evaluation nlp parallel-processing python socialscience text-processing topic-modeling
Last synced: 13 Nov 2024
https://github.com/yanndubs/hash-embeddings
PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.
embeddings hashing nips nips-challenge nlp pytorch reproducible-research word-embeddings
Last synced: 27 Oct 2024
https://github.com/sajjjadayobi/PersianQA
Persian (Farsi) Question Answering Dataset (+ Models)
dataset farsi natural-language-processing nlp persian-language persian-nlp question-answering reading-comprehension squad
Last synced: 20 Nov 2024
https://github.com/franck-dernoncourt/pubmed-rct
PubMed 200k RCT dataset: a large dataset for sequential sentence classification.
corpus machine-learning medical nlp randomized-controlled-trials sentence-classification
Last synced: 01 Dec 2024
https://github.com/adamlui/chatgpt-auto-refresh
↻ Keeps ChatGPT sessions fresh to avoid network errors + Cloudflare checks
ai artificial-intelligence chat chatbot chatgpt chatgpt3 cloudflare gpt gpt-3 gpt-4 greasemonkey javascript machine-learning ml nlp openai reloader userscript userscripts
Last synced: 03 Jan 2025
https://github.com/explosion/jupyterlab-prodigy
🧬 A JupyterLab extension for annotating data with Prodigy
active-learning annotation annotation-tool artificial-intelligence computer-vision data-annotation data-science jupyter jupyterlab labeling-tool machine-learning machine-teaching natural-language-processing nlp prodigy spacy
Last synced: 06 Jan 2025
https://github.com/dair-ai/dair-ai.github.io
Home of DAIR.AI
ai data-science education machine-learning nlp
Last synced: 08 Jan 2025
https://github.com/soumyadip007/microsoft-student-partner-workshop-learning-materials-ai-nlp
This repository contains all code and materials for the current session, including the required code on natural language processing and artificial intelligence.
ai cloud distributed-networking microsoft nlp peer-to-peer workshop
Last synced: 10 Jan 2025
https://github.com/intelligo-mn/neuro
🔮 Neuro.js is a machine learning library for building AI assistants and chatbots.
ai ai-assistants bot chat-bot chat-bots chatbot machine-learning natural-language-processing nlp nodejs
Last synced: 04 Jan 2025
https://github.com/guotong1988/NL2SQL-RULE
Content Enhanced BERT-based Text-to-SQL Generation https://arxiv.org/abs/1910.07179
bert deep-learning knowledge knowledge-representation nl2sql nlp pytorch rule-inject-to-model semantic-parsing text2sql
Last synced: 11 Nov 2024
https://github.com/d5555/tageditor
🏖TagEditor - Annotation tool for spaCy
annotation annotation-tool coreference-resolution data-science labeling-tool machine-learning named-entities named-entity-recognition natural-language-processing neural-networks neuralcoref nlp spacy spacy-visualizer tagging-tool text-annotation text-tagging training-data
Last synced: 19 Dec 2024
https://github.com/thammegowda/nllb-serve
Meta's "No Language Left Behind" models served as web app and REST API
machine-translation multilingual nlp transformers translation
Last synced: 05 Jan 2025
https://github.com/shreyansh26/annotated-ml-papers
Annotations of the interesting ML papers I read
annotated-paper bert deep-learning gpt gpt-2 machine-learning megatron-lm nlp papers-annotations research-paper transformers xlnet
Last synced: 14 Nov 2024
https://github.com/obss/jury
Comprehensive NLP Evaluation System
datasets evaluate evaluation huggingface machine-learning metrics natural-language-processing nlp nlp-evaluation python pytorch transformers
Last synced: 04 Jan 2025
https://github.com/dair-ai/emotion_dataset
😄 Dataset for Emotion Recognition Research
dataset machine-learning nlp pytorch
Last synced: 27 Dec 2024
https://github.com/ines/spacy-js
🎀 JavaScript API for spaCy with Python REST API
javascript natural-language-processing nlp python rest-api spacy
Last synced: 06 Jan 2025
https://github.com/ropensci/tokenizers
Fast, Consistent Tokenization of Natural Language Text
nlp peer-reviewed r r-package rstats text-mining tokenizer
Last synced: 22 Nov 2024
https://github.com/victorqribeiro/hntitlenator
Test your HN title against a neural network
javascript natural-language-processing neural-network nlp nlp-machine-learning
Last synced: 17 Nov 2024
https://github.com/microsoft/presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers
Last synced: 04 Jan 2025
https://github.com/ShawnyXiao/2017-CCF-BDCI-AIJudge
2017 CCF BDCI, "Let AI Be the Judge" (preliminary round): 7th/415 (Top 1.68%)
2017 bdci ccf data-mining multiclass-classification nlp
Last synced: 01 Nov 2024
https://github.com/tlatkowski/multihead-siamese-nets
Implementation of Siamese Neural Networks built upon multihead attention mechanism for text semantic similarity task.
attention deep-architectures deep-learning deep-neural-networks multihead-attention multihead-attention-networks natural-language-processing nlp paraphrase paraphrase-identification python3 quora-question-pairs semantic-similarity sentence-similarity siamese-cnn siamese-lstm siamese-neural-network snli tensorflow text-similarity
Last synced: 09 Jan 2025
https://github.com/Attempto/APE
Parser for Attempto Controlled English (ACE)
ace attempto cnl nlp swi-prolog
Last synced: 14 Nov 2024
https://github.com/sorgerlab/indra
INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.
bioinformatics biology computational-biology indra modeling nlp pysb sbml systems-biology
Last synced: 04 Jan 2025
https://github.com/explosion/spacymoji
💙 Emoji handling and meta data for spaCy with custom extension attributes
emoji emoji-unicode emojis natural-language-processing nlp spacy spacy-extension spacy-pipeline
Last synced: 05 Jan 2025
https://github.com/beader/ruijin_round2
Ruijin Hospital MMC AI-Assisted Knowledge Graph Construction Competition, second round
nlp relation-extraction tianchi
Last synced: 12 Nov 2024
https://github.com/princeton-nlp/trime
[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674
Last synced: 11 Nov 2024
https://github.com/d5555/TagEditor
🏖TagEditor - Annotation tool for spaCy
annotation annotation-tool coreference-resolution data-science labeling-tool machine-learning named-entities named-entity-recognition natural-language-processing neural-networks neuralcoref nlp spacy spacy-visualizer tagging-tool text-annotation text-tagging training-data
Last synced: 18 Nov 2024
https://github.com/mannefedov/compling_nlp_hse_course
Materials for the computational linguistics course at the School of Linguistics, HSE University (NRU HSE)
computational-linguistics course hse machine-learning natural-language-processing nlp python
Last synced: 13 Nov 2024
https://github.com/s3nh/text-detector
A tool that allows you to detect and translate text.
craft crnn deep-learning nlp ocr-recognition pytorch recognition scene-text-detection scene-text-detectors text text-processing text-recognition
Last synced: 03 Nov 2024
https://github.com/martinomensio/spacy-universal-sentence-encoder
Google USE (Universal Sentence Encoder) for spaCy
models nlp spacy tensorflow-hub use
Last synced: 06 Jan 2025
https://github.com/maxent-ai/converse
Conversational text Analysis using various NLP techniques
callcenter-analysis conversational-ai emotion-recognition huggingface machine-learning nlp nlu pytorch scikit-learn sentiment-analysis spacy speech-to-text text text-mining topic-modeling transformers
Last synced: 03 Jan 2025
https://github.com/humansignal/label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
bert data-labeling label-studio natural-language-processing natural-language-understanding nlp pytorch-transformers text-labeling transformers
Last synced: 19 Dec 2024
https://github.com/HumanSignal/label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
bert data-labeling label-studio natural-language-processing natural-language-understanding nlp pytorch-transformers text-labeling transformers
Last synced: 27 Nov 2024
https://github.com/daspartho/prompt-extend
Extending Stable Diffusion prompts with suitable style cues using text generation
deep-learning gpt-2 huggingface-spaces huggingface-transformers machine-learning nlp prompt stable-diffusion text-generation
Last synced: 15 Dec 2024
https://github.com/simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
data-transformation domain-specific-language dsl nlp parsing
Last synced: 10 Dec 2024
https://github.com/opensemanticsearch/open-semantic-entity-search-api
Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names
api disambiguation entity-extraction knowledge-graph knowledgebase linked-data linked-data-api linkeddata named-entities named-entity-recognition natural-language-processing nlp python reconciliation reconciliation-service rest-api semantic semantic-analysis semantic-annotation thesaurus
Last synced: 27 Oct 2024
https://github.com/ooguz/turkce-kufur-karaliste
A blacklist for the Turkish language
blacklist blacklisting bots nlp nlp-keywords-extraction turkce turkce-kufur-karaliste turkish turkish-language turkish-translation
Last synced: 21 Dec 2024
https://github.com/tugstugi/mongolian-nlp
Useful resources for Mongolian NLP
deep-learning language-model mongolian natural-language-processing nlp pytorch speech-recognition text-to-speech
Last synced: 09 Jan 2025
https://github.com/ownthink/semantic
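Semantic understanding / spoken language understanding. The project includes lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition) and spoken language understanding (domain classification, slot filling, intent recognition).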
Last synced: 20 Dec 2024
https://github.com/blopa/magento-chatbot
Magento Chatbot Integration with Telegram, Messenger, Whatsapp, WeChat, Skype and wit.ai.
bot chatbot facebook facebook-messenger magento magento-chatbot magento-module magento2 magento2-chatbot magento2-module messenger messenger-api nlp skype telegram telegram-api wechat whatsapp witai
Last synced: 10 Oct 2024
https://github.com/PaddlePaddle/PALM
A fast, flexible, extensible, and easy-to-use NLP framework for large-scale pretraining and multi-task learning.
baidu multi-task-learning nlp paddlepaddle pretrain-model transformers
Last synced: 27 Nov 2024
https://github.com/xatkit-bot-platform/xatkit
The simplest way to build all types of smart chatbots and digital assistants
bot chatbot-framework chatbots conversational-ai digital-assistant dsl low-code nlp no-code
Last synced: 07 Nov 2024
https://github.com/anujvyas/natural-language-processing-projects
This repository consists of all my NLP Projects
lemmatization natural-language-processing nlp nltk python sentiment-analysis stemming text-classification wordcloud
Last synced: 19 Dec 2024
https://github.com/cohere-ai/sandbox-conversant-lib
Conversational AI tooling & personas built on Cohere's LLMs
chatbot chatbot-framework chatbots cohere conversational-agent conversational-ai conversational-bots dialogue-generation dialogue-systems large-language-models llm nlp
Last synced: 06 Jan 2025
https://github.com/dengbocong/text-similarity
Text similarity (matching) computation, providing baselines, training, inference, metric analysis... The code comes in both TensorFlow and PyTorch versions.
bert deep-learning mechine-learing model nlp pytorch similarity text-classification transformer
Last synced: 09 Jan 2025
https://github.com/dccuchile/wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
bias-detection bias-reduction fairness-ai fairness-ml library nlp nlp-library python3 word-embedding-evaluation word-embedding-fairness word-embeddings
Last synced: 22 Nov 2024
https://github.com/fiddler-labs/fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
ai-observability evaluation generative-ai langchain llms nlp robustness
Last synced: 05 Jan 2025
https://github.com/yohasebe/wp2txt
A command-line toolkit to extract text content and category data from Wikipedia dump files
corpus machine-learning nlp ruby wikipedia wikipedia-dump
Last synced: 04 Jan 2025
https://github.com/princeton-nlp/cofipruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
bert model-compression nlp pruning
Last synced: 11 Nov 2024
https://github.com/uzay-g/espial
Espial is an engine for automated organization and discovery of personal knowledge
knowledge knowledge-graph nlp python
Last synced: 27 Oct 2024
https://github.com/salvatorera/tutorial
Tutorials on machine learning, artificial intelligence, and data science with mathematical explanations and reusable code (in Python and R)
artificial-intelligence bioinformatics biology computer-vision convolutional-neural-networks data-science deep-learning graph image machine-learning natural-language-processing nlp python r streamlit streamlit-webapp tutorial tutorials vision-transformer
Last synced: 04 Jan 2025
https://github.com/Uzay-G/espial
Espial is an engine for automated organization and discovery of personal knowledge
knowledge knowledge-graph nlp python
Last synced: 01 Nov 2024
https://github.com/j2kao/fcc_nn_research
(somewhat) cleaned-up notebooks used in researching public comments for FCC Proceeding 17-108 (Net Neutrality Repeal)
Last synced: 29 Nov 2024
https://github.com/cyberzhg/keras-xlnet
Implementation of XLNet that can load pretrained checkpoints
glue keras language-model nlp xlnet
Last synced: 27 Sep 2024
https://github.com/CyberZHG/keras-xlnet
Implementation of XLNet that can load pretrained checkpoints
glue keras language-model nlp xlnet
Last synced: 16 Nov 2024
https://github.com/RelevanceAI/relevanceai
Home of the AI workforce - Multi-agent system, AI agents & tools
clustering computer-vision embeddings natural-language-processing nlp python search search-engine unstructured-data vector-database vector-search
Last synced: 23 Dec 2024
https://github.com/HKUSTDial/NL2SQL_Handbook
A continuously updated handbook that helps readers track the latest NL2SQL techniques in the literature and provides practical guidance for researchers and practitioners.
awesome finetuning llms nl-to-code nl-to-sql nl2sql nlp nlp-resources survey text-to-sql text2sql tutorial
Last synced: 02 Nov 2024
https://github.com/hscspring/all4nlp
All For NLP, especially Chinese.
ai deeplearning machinelearning nlp
Last synced: 03 Jan 2025
https://github.com/iterative/example-get-started
Get started DVC project (NLP, random forest)
dvc example machine-learning nlp python random-forest reproducibility reproducible reproducible-research
Last synced: 14 Nov 2024
https://github.com/avidale/compress-fasttext
Tools for shrinking fastText models (in gensim format)
fasttext fasttext-embeddings nlp python word-embeddings
Last synced: 13 Nov 2024
https://github.com/balavenkatesh3322/NLP-pretrained-model
A collection of Natural language processing pre-trained models.
deep-learning deep-neural-networks keras machine-learning model mxnet natural-language-processing neural-networks nlp nlp-machine-learning python python3 pytorch tensorflow text text-classification text-to-number text-to-speech
Last synced: 27 Nov 2024
https://github.com/ethancaballero/improved-dynamic-memory-networks-dmn-plus
Theano Implementation of DMN+ (Improved Dynamic Memory Networks) from the paper by Xiong, Merity, & Socher at MetaMind, http://arxiv.org/abs/1603.01417 (Dynamic Memory Networks for Visual and Textual Question Answering)
babi-tasks deep-learning deep-neural-networks neural-network nlp question-answering
Last synced: 15 Dec 2024
https://github.com/akshaynagpal/w2n
Convert number words (e.g. twenty one) to numeric digits (21)
nlp numeric-digits python word-to-number
Last synced: 22 Nov 2024
https://github.com/IlyaGusev/summarus
Models for automatic abstractive summarization
deep-learning machine-learning nlp pytorch summarization
Last synced: 04 Nov 2024
https://github.com/balavenkatesh3322/nlp-pretrained-model
A collection of Natural language processing pre-trained models.
deep-learning deep-neural-networks keras machine-learning model mxnet natural-language-processing neural-networks nlp nlp-machine-learning python python3 pytorch tensorflow text text-classification text-to-number text-to-speech
Last synced: 10 Nov 2024
https://github.com/rylans/getlang
Natural language detection package in pure Go
language-model natural-language nlp
Last synced: 26 Oct 2024
https://github.com/prrao87/fine-grained-sentiment
A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.
fasttext flair nlp python pytorch sentiment-analysis text-classification transformers
Last synced: 08 Jan 2025
https://github.com/algolisted-org/algolisted
Algolisted is an AI-powered platform dedicated to assisting computer science students in preparing for placements and internships. Our services include tracking and analytics across various platforms and topics.
ai css firebase hacktoberfest-2023 javascript mern-stack ml nlp python3 react-js web-scraping
Last synced: 05 Jan 2025
https://github.com/anthonymrios/pymetamap
Python wrapper for MetaMap
biomedical-informatics named-entity-recognition natural-language-processing nlp python
Last synced: 07 Jan 2025
https://github.com/doc-analysis/xfund
XFUND: A Multilingual Form Understanding Benchmark
dataset natural-language-processing nlp
Last synced: 01 Dec 2024
https://github.com/princeton-nlp/optiprompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
Last synced: 11 Nov 2024
https://github.com/google-research/turkish-morphology
A two-level morphological analyzer for Turkish.
google morphological-analyser morphology natural-language-processing natural-language-understanding nlp turkish
Last synced: 07 Jan 2025
https://github.com/princeton-nlp/OptiPrompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
Last synced: 19 Nov 2024
https://github.com/doc-analysis/XFUND
XFUND: A Multilingual Form Understanding Benchmark
dataset natural-language-processing nlp
Last synced: 06 Nov 2024
https://github.com/natasha/navec
Compact high quality word embeddings for Russian language
embeddings glove nlp python quantization russian word2vec
Last synced: 05 Jan 2025
https://github.com/indix/whatthelang
Lightning Fast Language Prediction 🚀
fasttext language-detection languages nlp python
Last synced: 09 Jan 2025
https://github.com/kuutsav/information-retrieval
Neural information retrieval / Semantic search / Bi-encoders
information-retrieval machine-learning nlp semantic-search
Last synced: 15 Nov 2024
https://github.com/ahkarami/great-deep-learning-tutorials
A Great Collection of Deep Learning Tutorials and Repositories
computer-vision deep-learning deep-learning-tutorial deep-neural-networks gan machine-learning nlp
Last synced: 13 Nov 2024
https://github.com/crazyofapple/Reading_groups
A list of papers and resources on large language models, including courses, papers, demos, and figures
chatgpt gpt-3 gpt-4 large-language-models llm llms natural-language-processing nlp
Last synced: 10 Nov 2024
https://github.com/geekjr/quickai
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
ai artificial-intelligence bert deep-learning dl easy-to-use fast gpt gpt-neo huggingface-transformers ml neural-network nlp object-detection python pytorch quickai research tensorflow2 yolo
Last synced: 09 Nov 2024
https://github.com/platisd/duplicate-code-detection-tool
A simple Python3 tool to detect similarities between files within a repository
Last synced: 07 Jan 2025
https://github.com/NPCai/Open-IE-Papers
Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.
information-extraction literature-review nlp openie papers relation-extraction tuples
Last synced: 10 Nov 2024