Projects in Awesome Lists tagged with stemming
A curated list of projects in awesome lists tagged with stemming.
https://github.com/hunspell/hunspell
The most popular spellchecking library.
natural-language-processing spell-check spell-checker spell-checking-engine spellcheck spellchecker stemming
Last synced: 29 Apr 2025
https://github.com/vngrs-ai/vnlp
State-of-the-art, lightweight NLP tools for the Turkish language. Developed by VNGRS.
deasciifier deep-learning dependency-parsing fasttext morphological-analysis morphological-disambiguation named-entity-recognition nlp normalization number-to-words part-of-speech-tagging sentence-splitting sentence-tokenizer sentiment-analysis spelling-correction stemming stopword-removal turkish-nlp word-embeddings word2vec
Last synced: 14 Jan 2026
https://github.com/milaan9/python_natural_language_processing
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.
bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching
Last synced: 09 Apr 2025
https://github.com/milaan9/Python_Natural_Language_Processing
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.
bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching
Last synced: 28 Aug 2025
https://github.com/anujvyas/natural-language-processing-projects
This repository consists of all my NLP Projects
lemmatization natural-language-processing nlp nltk python sentiment-analysis stemming text-classification wordcloud
Last synced: 20 Aug 2025
https://github.com/anujvyas/Natural-Language-Processing-Projects
This repository consists of all my NLP Projects
lemmatization natural-language-processing nlp nltk python sentiment-analysis stemming text-classification wordcloud
Last synced: 29 Mar 2025
https://github.com/words/stemmer
Fast Porter stemmer implementation
natural-language porter stemmer stemming
Last synced: 12 Dec 2025
https://github.com/Qutuf/Qutuf
Qutuf (قُطُوْف): An Arabic Morphological analyzer and Part-Of-Speech tagger as an Expert System.
arabic arabic-language arabic-morphology arabic-nlp arabic-tagger expert-system heavy-stemming lemmatization light-stemming morphological-analysis overdue-tagging part-of-speech-tagger pattern-matching pos-tagging premature-tagging role-based root-extraction rooting stemmer stemming
Last synced: 07 May 2025
https://github.com/biolab/orange3-text
🍊 📄 Text Mining add-on for Orange3
bag-of-words lemmatization newspapers nltk orange sentiment-analysis stemming stopwords text text-analysis text-mining twitter
Last synced: 14 Aug 2025
https://github.com/tokenmill/beagle
Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.
clojure java lucene luwak nlp real-time-search stemming stored-query-engine stream-search
Last synced: 23 Jul 2025
https://github.com/abhinav-26/ai-chatbot
This is my Artificial Intelligence project, in which we build an AI contextual chatbot.
ai ai-assignments ai-chatbot ai-project chatbot contextual-chatbot deep-learning hacktoberfest hacktoberfest2020 machine-learning natural-language-processing neural-networks nlp stemming tensorflow tflearn
Last synced: 22 Mar 2025
https://github.com/trinker/textstem
Tools for fast text stemming & lemmatization
lemmatization r stemming text-mining
Last synced: 16 Mar 2025
https://github.com/bastienbot/nlp-js-tools-french
POS tagger, lemmatizer and stemmer for the French language in JavaScript
lemmatization lemmatizer nlp postagging postgresql stemmer stemming tokenization tokenizer
Last synced: 01 Aug 2025
https://github.com/words/lancaster-stemmer
Lancaster stemming algorithm
lancaster natural-language stemmer stemming
Last synced: 05 Apr 2026
https://github.com/master/spark-stemming
Spark MLlib wrapper for the Snowball framework
Last synced: 04 Jul 2025
https://github.com/fzn0x/idnaive
🧠 A Simple Node.js Naive Bayes Library.
hacktoberfest javascript multi-language naive-bayes stemming
Last synced: 23 Mar 2025
https://github.com/kangfend/bahasa
Natural language toolkit for Indonesian Language (Bahasa)
bahasa indonesia natural-language-processing nlp nlp-python python sastrawi stemmer stemming
Last synced: 21 Jan 2026
https://github.com/dbklim/uk_stemmer
A small modification of the stemmer for the Ukrainian language (https://github.com/Amice13/ukr_stemmer)
natural-language-processing nlp stemmer stemmers stemming stemming-algorithm uk ukr ukrainian ukrainian-morphology
Last synced: 29 Apr 2025
https://github.com/scaraux/swift-porter-stemmer-2
⛄ A Swift wrapper over the Porter Stemmer 2 / libstemmer
porter-stemmer-algorithm porter-stemmer-v2 snowball stemming swift
Last synced: 11 Jun 2025
https://github.com/dariasmyr/fts-engine
A modular full-text search engine in Go with instant indexing, pluggable indexers, and configurable pre-search filters.
fulltext-search fuzzy-search ngram-analysis ngrams stemming trie
Last synced: 01 Apr 2026
https://github.com/ksdkamesh99/phony-news-classifier
Phony News Classifier is a repository containing an analysis of a natural language processing application, i.e. a fake news classifier, built with text preprocessing strategies such as bag of words, TF-IDF vectorization, lemmatization, and stemming, using Naive Bayes and a deep learning RNN (LSTM), with the detailed accuracy results recorded in the repository.
bag-of-words deep-learning fake-news lemmatization lstm-neural-networks multinomial-naive-bayes naive-bayes-classifier natural-language-processing python3 stemming tfidfvectorizer
Last synced: 12 May 2025
https://github.com/mtumilowicz/elasticsearch7-ngrams-fuzzy-shingles-stemming-workshop
Gentle introduction to basic elasticsearch constructs boosting search: ngrams, shingles, stemmers, suggesters and fuzzy queries.
edge-ngram elasticsearch fuzzy-query fuzzy-search kibana ngram search-as-you-type shingles stemmer stemming suggester workshop workshop-materials
Last synced: 11 Apr 2025
https://github.com/labrijisaad/twitter-sentiment-analysis-with-python
I aim in this project to analyze the sentiment of tweets provided from the Sentiment140 dataset by developing a machine learning sentiment analysis model involving the use of classifiers. The performance of these classifiers is then evaluated using accuracy and F1 scores.
accuracy-score bernoulli-naive-bayes confusion-matrix f1-score lemmatization logistic-regression machine-learning nlp roc-auc-curve sentiment-analysis sentiment140-dataset stemming support-vector-machine tokenization twitter-sentiment-analysis
Last synced: 08 Apr 2025
https://github.com/singhpratyush/index-search-query
Inverted Index, Query Formulation and Ranking from Scratch in Python
indexing multithreading pipenv python query query-building ranking searching stemming
Last synced: 12 Apr 2025
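For readers unfamiliar with the term, the inverted index named in the entry above can be sketched roughly as follows. This is a minimal illustration and not code from index-search-query: the function names `build_inverted_index` and `search_all`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each lowercased token to the set of document ids containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenization (an assumption; real systems also
        # strip punctuation and usually stem each token first).
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_all(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


if __name__ == "__main__":
    # Hypothetical sample documents, invented for this sketch.
    docs = {
        1: "stemming reduces words",
        2: "porter stemming algorithm",
        3: "inverted index",
    }
    index = build_inverted_index(docs)
    print(search_all(index, "porter stemming"))
```

A stemmer would typically be applied to both the indexed tokens and the query terms, so that different inflections of a word hit the same posting list.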
https://github.com/assem-ch/snowball-sublime-syntax
Snowball framework syntax definition for Sublime Text 3
snowball stemming sublime-text syntax-highlighting
Last synced: 19 Feb 2026
https://github.com/iamkankan/natural-language-processing-nlp-tutorial
NLP tutorials and guidelines to learn efficiently
bigrams bow cbow glove lemmatization one-hot-encoding stemming stopwords tf-idf-vectorizer tokenization unigram word-embeddings word2vec
Last synced: 08 Jan 2026
https://github.com/prakharjadaun/feature-extraction-for-spam-email-detection
Implemented preprocessing steps, feature extraction techniques and a Naive Bayes classifier in C++. Moreover, we have also implemented all the steps using Python for comparative analysis.
bag-of-words-cpp email-spam-classifier naive-bayes-classifier-cpp nlp-machine-learning stemming text-classification
Last synced: 07 May 2025
https://github.com/shaadclt/twitter-hashtag-analysis
This project provides a website that allows users to analyze real-time tweets from Twitter based on a specific hashtag. The website includes a tweet sentiment analyzer to determine the sentiment (positive, negative, or neutral) of the collected tweets.
lemmization logistic-regression nltk stemming textblob wordcloud
Last synced: 10 Apr 2025
https://github.com/mrrefactoring/multilingual-stemmer
A Node.js WebAssembly implementation of some popular Snowball stemming algorithms
javascript nodejs stemmer stemmers stemming stemming-algorithm webassembly
Last synced: 16 Dec 2025
https://github.com/maxpatiiuk/porter-stemming
TypeScript implementation of the Porter Stemmer algorithm
Last synced: 22 Mar 2025
https://github.com/antonbaumann/german-go-stemmer
An efficient implementation of the German porter-stemming algorithm in Golang.
language-processing nlp porter-stemmer snowball stemming stemming-algorithm
Last synced: 05 Mar 2026
https://github.com/openderocknlp/extract-lemmatized-nonstop-words
Extracts a pure list of stemmed words of a text filtered by stop words
javascript lemma nlp npm stemming stopwords tokenizer
Last synced: 06 Oct 2025
https://github.com/mhmdio/arabic-stemming-toolkit
Improving Arabic Light Stemming in Information Retrieval Systems
arabic arabic-nlp gui information-retrieval java master master-thesis stemming stemming-algorithm stop-words terrier
Last synced: 21 Aug 2025
https://github.com/anishlearnstocode/nlp-playground
Small code snippets written in Python covering fundamental concepts in NLP used in all major NLP projects.
lemmatization natural-language-processing nlp porter-stemmer stemming
Last synced: 10 Apr 2025
https://github.com/fardinhash/chatbot-deep-learning
This chatbot was built with a combination of deep learning, the Natural Language Toolkit (NLTK), and a PyTorch model. The author states that the highest accuracy was achieved here.
ai-chatbot chatbot deep-learning lemmatization machine-learning ml natural-language-processing natural-language-toolkit nlp nltk python pytorch pytorch-model stemming tokenization
Last synced: 22 Sep 2025
https://github.com/burhanharoon/urdu-stemmer
A simple Python-based Urdu stemmer which tries to find a stem word from a list of affixes.
python python3 stemming stemming-algorithm urdu urdu-language urdu-nlp urdu-text-processsing
Last synced: 19 Apr 2026
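The affix-list approach mentioned in the entry above can be illustrated with a short sketch. It is not code from urdu-stemmer: the English suffix list, the `MIN_STEM_LENGTH` threshold, and the function name `strip_suffix` are hypothetical stand-ins chosen only to show the general technique of removing the longest known affix.

```python
# Hypothetical suffix list for illustration only; a real stemmer would use
# a curated affix list for its target language.
SUFFIXES = ("ingly", "ness", "ing", "ed", "ly", "s")

# Assumed safeguard: never shorten a word below this many characters.
MIN_STEM_LENGTH = 3


def strip_suffix(word: str) -> str:
    """Remove the longest matching suffix, if enough of the word remains."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        remaining = len(word) - len(suffix)
        if word.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
            return word[:remaining]
    return word  # no suffix matched, or the stem would be too short


if __name__ == "__main__":
    for word in ("walking", "kindness", "cats", "is"):
        print(word, "->", strip_suffix(word))
```

Trying longer suffixes first avoids stripping only a trailing "s" when a longer suffix such as "ness" applies, and the minimum length keeps short words like "is" intact.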
https://github.com/mmahmoodictbd/solr-analysis-bn
Solr / Lucene Bangla Analyzer, Stem Filter, Stemmer.
bangla bengali solr solr-plugin solr-search stemmer stemming
Last synced: 26 Mar 2025
https://github.com/krisharul26/text-classification-dbpedia-ontology-classes-using-lstm
Text classification is the task of assigning a set of predefined categories to free text. Text classifiers can be used to organize, structure, and categorize pretty much anything. For example, new articles can be organized by topics, support tickets can be organized by urgency, chat conversations can be organized by language, brand mentions can be organized by sentiment, and so on.
attention-mechanism bagofwords flask-application gensim-doc2vec gensim-word2vec glove-embeddings lemmatization lstm-neural-networks nlp-machine-learning nltk-python restapi-framework rnn-tensorflow stemming tensorflow2 word2vec-embeddinngs word2vec-model
Last synced: 22 Jan 2026
https://github.com/francescopaolol/learningnlp
This repository contains what I'm learning about NLP
cbow constituency-grammar dependency-grammar document-clustering feature-enginering gensim glove lda lda-model lsi-model nltk python semantic-analysis sentiment-analysis skip-gram stemming text-corpora-using text-wrangling topics-modeling word2vec
Last synced: 11 Apr 2025
https://github.com/abdullahashfaqvirk/NLP-Workshops
Embark on your NLP journey by learning essential techniques through a series of notebooks designed to kickstart your career in this field.
lemmatization named-entity-recognition nlp nltk notebooks pos-tagging python stemming stopwords tokenization workshops
Last synced: 27 Sep 2025
https://github.com/rajspeaks/machine-learning-approach-to-bengali-corpus-tokenization-stemming-pos-tagging-using-bnltk
Machine Learning approach to Bengali Corpus POS Tagging using BNLTK. This is an experimental project under the mentorship of Prof. Sandipan Ganguly, HIT-K.
bengali bengali-dataset bengali-language-processing bengali-natural-language-processing bengali-nlp english machine-learning natural-language-processing natural-language-understanding nlp nlp-library nlp-machine-learning postagger postagging rajdeep-das rajspeaks stemmer stemming tokenizer-parser
Last synced: 04 Apr 2025
https://github.com/alisoltanirad/reason
Python NLP ToolBox
classification clustering lemmatization machine-learning natural-language-processing nlp stemming tagging tokenization
Last synced: 14 Jan 2026
https://github.com/mrseanryan/data-type-predictor
Given the name of a property or attribute like 'BrandName' or 'AmountReceived', try to predict a data type like String, Boolean, Integer...
ai data-classification data-types nlp stemming
Last synced: 08 Nov 2025
https://github.com/anuprulez/similar_galaxy_tools
Finding similarity in Galaxy tools
deep-learning deepnlp galaxy-project gradient-descent information-retrieval machine-learning nlp-machine-learning nltk stemming
Last synced: 03 Jul 2025
https://github.com/shutterstock/stemming-exceptions
A collection of stemming exceptions for different languages.
Last synced: 28 Jan 2026
https://github.com/arssite/naturalinguisticprogramming
Repo Related to Natural Language Processing and Social Media Analytics.
deep-learning lemmatization named-entity-recognition natural-language-processing social-network-analysis socialmediaanalytics stemming stopwords tokenization
Last synced: 27 Feb 2026
https://github.com/cosmoduende/r-twitter
Explore your Twitter activity with R: Sentiment Analysis and Data Visualization. How to analyze your Twitter account (or any account), discover your habits and sentiments with the "rtweet" package and NLP.
data-analysis data-visualization lemmatization nlp nlp-library nlp-resources nltk nltk-library r-package r-programming r-studio rtweet stemming twitter twitter-api twitter-data twitter-data-analysis twitter-data-extraction twitter-sentiment-analysis udpipe
Last synced: 10 Oct 2025
https://github.com/hernanmd/libstemmer
Pharo uFFI wrapper for the Porter Stemmer algorithm
pharo pharo-smalltalk porter-stemmer smalltalk stemming stemming-algorithm
Last synced: 27 Feb 2026
https://github.com/tomsquest/lucene-stemmers
Stem words like Lucene (port of Lucene's stemmers to JavaScript)
Last synced: 24 Jun 2026
https://github.com/moindalvs/text_mining_nlp
Natural Language Processing
bag-of-words classifier data-science fake-news lemmatization nlp pipeline sentiment-analysis sentiment-classification spacy spacy-pipeline stemming text-classification text-mining tfidf tokenization vectorizer
Last synced: 16 Apr 2026
https://github.com/aquilax/go-stemmer
Bulgarian language stemmer library in Go for the BULSTEM rules
Last synced: 15 Mar 2025
https://github.com/aarryasutar/hate_speech_detection
This project aims to detect hate speech on Twitter using advanced NLP and machine learning techniques, exploring feature extraction methods like TF-IDF and sentiment analysis, and evaluating models such as Logistic Regression and SVM.
confusion-matrix doc2vec gensim logistic-regression matplotlib naive-bayes nltk numpy pandas python random-forest scikit-learn seaborn stemming stopwords-removal svm tf-idf-vectorizer tokenization vader word-cloud
Last synced: 09 Apr 2026
https://github.com/skourtsidisgiorgos/neural_networks_ntua
Projects for the ECE class "Neural Networks and Intelligent Systems", 2020-2021
bag-of-words classification clustering cnn-keras deep-learning ece intelligent-systems knn lemmatization machine-learning neural-networks reinforcement-learning stemming svm
Last synced: 06 May 2026
https://github.com/iiiioreo/data-cleaning-w-gui
AIO Data Cleaning: Python application using Tkinter for text file manipulation, featuring functions such as case conversion, lemmatization, stemming, and more.
ai datacleaning lemmatization python stemming text-editor text-to-pdf tkinter
Last synced: 30 Mar 2025
https://github.com/hangsbreaker/stemming-ind
JavaScript, PHP, and Python stemming for Bahasa Indonesia
javascript nodejs php stem stemmer stemming stemming-algorithm
Last synced: 07 May 2026
https://github.com/thomasbrockmeier/kpss_py3
Kraaij-Pohlmann Snowball Stemmer
dutch language language-processing linguistics nlp stemmer stemming stemming-algorithm
Last synced: 29 May 2026
https://github.com/gopireddy99/daily_ad_nlp_assignments
AD Training classes in NLP - Daily Assignments
cleaning-text regularexpression stemming stopwords textprocessing tokenization
Last synced: 12 Aug 2025
https://github.com/ewdlop/nlpnote
NLP (Natural Language Processing) notes. https://en.wikipedia.org/wiki/Natural_language_processing
attention-mechanism bag-of-words bert entity-recognition gpt holonymy-meronymy hypernymy-hyponymy inverted-index language-model large-language-models lemmatization n-gram natural-language-processing part-of-speech-tagging sentiment-analysis sequence-to-sequence stemming stopwords tf-idf word-embeddings
Last synced: 09 Aug 2025
https://github.com/sayande01/natural_language_processing
This repository contains Jupyter notebooks and Python scripts that cover foundational concepts and practical implementations of NLP preprocessing techniques. Each topic is accompanied by clear explanations and code examples using the Natural Language Toolkit (NLTK) library.
bag-of-words natural-language-processing nltk stemming word2vec
Last synced: 06 Apr 2025
https://github.com/04bhavyaa/sms-spam-classification-system
A Machine Learning project that identifies whether a given message is spam or not. It uses Natural Language Processing (NLP) techniques (Stemming and TF-IDF Vectorization) for text transformation and a trained Multinomial Naive Bayes Classifier for predictions.
bernoulli-naive-bayes nlp-machine-learning nltk-library spam-classification stemming streamlit tfidf-vectorizer
Last synced: 24 Apr 2026
https://github.com/chandkund/sms-spam-detection
The goal is to develop a classification model that can accurately differentiate between spam and non-spam messages. This is crucial for applications like email filtering, SMS spam detection, and improving overall user experience by reducing the influx of unwanted or malicious content.
matplotlib nlp-machine-learning numpy pandas seaborn stemming tfidf-vectorizer tokenization
Last synced: 19 Jan 2026
https://github.com/jigyasag18/fake-news-prediction-project
The Fake News Prediction App Repository offers a machine learning project that focuses on identifying the authenticity of news articles as fake or real. It uses a dataset of 20,000 articles and employs methods such as TF-IDF vectorization and the Porter stemming algorithm, achieving around 97% classification accuracy with a logistic regression model.
data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming vectorization
Last synced: 08 Jun 2026
https://github.com/igoraugust0/information-retrieval
ℹ️ Information Retrieval models implemented in Python
boolean-model information-retrieval inverted-index matplotlib nltk pickle precision-recall prettytable python stemming tf-idf tokenization vector-space-model
Last synced: 21 Jul 2025
https://github.com/atharvapathak/customer_sentiment_analysis
Customer sentiment analysis is the process of using natural language processing (NLP) and machine learning techniques to analyze and understand the feelings, opinions, and attitudes expressed by customers in textual data, such as reviews, feedback, and social media posts.
cnn naive-bayes nlp nltk spacy stemming text-mining tokenization
Last synced: 21 Feb 2026
https://github.com/youssef155/sentiment_analysis
Sentiment Analysis For Restaurant Reviews
flask jupyter-notebook nlp pkl-model python stemming stopwords text-cleaning
Last synced: 12 May 2026
https://github.com/fusi3/natural_language_coursework
Assessing the impact of different pre-processing techniques for classifying the sentiment of movie reviews
bag-of-words latent-semantic-analysis lemmatization multilayer-perceptron nlp sentiment-analysis stemming support-vector-machines tfidf
Last synced: 18 Mar 2025
https://github.com/yeremi/stopwords
A lightweight and efficient PHP library tailored for developers working on Natural Language Processing (NLP) tasks in Brazilian Portuguese.
elasticsearch extract-information fulltext-search indexing-querying natural-language-processing php portuguese search-engine snowball stemming stop stop-words stopwords
Last synced: 09 Feb 2026
https://github.com/cyberfantics/naturallanguageprocessing
A comprehensive repository for the Natural Language Processing course, featuring lecture notes, slides, and practical implementations of key NLP concepts using Python and popular libraries.
chatbots hacktoberfest lemmatization nltk nltk-python spacy-nlp stemming tokenization transformer
Last synced: 12 Jun 2026
https://github.com/mayankmittal29/duplifinder-quora-clone-catcher
An advanced system for detecting semantically duplicate question pairs using cutting-edge NLP techniques. Combines traditional ML models (XGBoost, SVM, Random Forest) with deep learning architectures (BiLSTM, Siamese Networks, Transformers) and contextual embeddings (BERT, RoBERTa). Features engineered using token similarity, fuzzy matching, and em
bert bilstm cross-validation eda fastext fuzzy-matching glove numpy pandas python3 quora-question-pairs random-forest roberta seaborn stemming svm tf-idf transformers word2vec xgboost
Last synced: 15 Apr 2026
https://github.com/tggo/steblo
Zero-dependency rule-based Ukrainian stemmer in pure Go
bleve golang nlp stemmer stemming text-processing ukrainian ukrainian-language
Last synced: 03 Jun 2026
https://github.com/abinashsahoo007/project-resume-classification
The document classification solution should significantly reduce the manual human effort in the HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.
corpus count-vectorizer label-encoding lemmitization machine-learning nltk part-of-speech-tagging resume-classification spacy stemming text-mining text-preprocessing textract tfidf-vectorizer tokenization wordcloud
Last synced: 02 Feb 2026
https://github.com/natnaelhhaile/Text-Similarity-Analysis
bag-of-words cosine-similarity data-analysis machine-learning natural-language-processing nltk-python one-hot-encoding python stemming stop-word-removal stop-words text-mining text-processing text-similarity-analysis tf tf-idf tokenization
Last synced: 11 Apr 2025
https://github.com/natnaelhhaile/text-similarity-analysis
bag-of-words cosine-similarity data-analysis machine-learning natural-language-processing nltk-python one-hot-encoding python stemming stop-word-removal stop-words text-mining text-processing text-similarity-analysis tf tf-idf tokenization
Last synced: 20 Apr 2026
https://github.com/somjit101/nlp-stackeroverflow-tag-prediction
A multi-class classification problem where the objective is to read a question posted on the popular reference website, StackOverflow and predict the primary topics it deals with, i.e. tags which the question will be associated with.
bag-of-words countvectorizer logistic-regression multi-class-classification multiclass-logistic-regression natural-language-processing nlp one-vs-rest onevsrestclassifier stackoverflow-tags stemming text-mining tf-idf tfidf-vectorizer word-cloud
Last synced: 05 Jun 2026
https://github.com/mina-faridi/document-ranking-with-galago
Galago-related homework for an Information Retrieval course
bm25 document-ranking galago information-retrieval map ndcg pivoted-length-normalisation recall stemming tokenizing university-of-tehran
Last synced: 25 Apr 2026
https://github.com/atheeralzhrani/nlp_projects
NLP projects that I worked on using different natural language processing libraries.
nlp-datasets nltk-library rnn-lstm rnn-pytorch rnn-tensorflow spacy-nlp stemming stopwords tokenization
Last synced: 19 Aug 2025
https://github.com/atheeralzhrani/arabic_nlp
This repository contains projects focused on Arabic Natural Language Processing (NLP)
arabic-dataset arabic-language arabic-language-dataset arabic-nlp arabic-text-classification arabic-text-detection arabic-text-recognition huggingface spacy-nlp stemming text-preprocessing tokenization
Last synced: 16 Oct 2025
https://github.com/sdpdas/sm_sentiment_analysis
Sentiment classification using natural language processing (NLP) with pandas, NumPy, scikit-learn, and NLTK, applying logistic regression as a supervised model. The pickle library is used to save the model so it can be run anywhere.
logistic-regression machine-learning nlp scikit-learn sentiment-analysis stemming vectorizer
Last synced: 03 Jan 2026
https://github.com/gfyoung/stemming
Web Application for Counting Words in a Document
flask-application python stemming
Last synced: 15 Mar 2025
https://github.com/arya-io/nlp-explorer
NLP Explorer is an interactive Streamlit app that lets users explore various NLP techniques like Tokenization, POS Tagging, Stemming, Lemmatization, and NER. It provides real-time analysis of text, making it a great tool for learning and experimenting with NLP concepts.
datascience lemmatization machinelearning naturallanguageprocessing ner nlp nltk postagging python stemming streamlit textanalysis textprocessing tokenization
Last synced: 01 May 2026
https://github.com/kajuberdut/porter2
A Python wrapper around surgebase's porter2 implementation.
nlp snowball stemming stemming-porters
Last synced: 20 May 2026
https://github.com/fnando/stemmers
Stemming and language detection bindings for Ruby
gem language-detection ruby stemming
Last synced: 30 Jun 2025
https://github.com/jigyasag18/fake-news-prediction-app
The Fake News Prediction App Repository offers a machine learning project that focuses on identifying the authenticity of news articles as fake or real. It uses a dataset of 20,000 articles and employs methods such as TF-IDF vectorization and lemmatization, achieving ~95% classification accuracy with a random forest classifier model.
data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming streamlit streamlit-webapp vectorization
Last synced: 11 Apr 2026
https://github.com/madhurimarawat/natural-language-processing-in-python
This repository contains Natural Language Processing programs in the Python programming language.
basic-programs bigram-model chunker description-of-code hmm-model lemmata lemmatization n-grams natural-language-processing parts-of-speech-tagging penn-treebank pos-penn-treebank python regular-expression stem-words stemming tokenization trigram-model unigram-model word-frequency-count
Last synced: 10 Apr 2026
https://github.com/nurfawaiq/ir-stemming-nazief
Information Retrieval - Stemming Nazief
information-retrieval stemming
Last synced: 18 Mar 2025
https://github.com/rachakondaganesh/using-nlp-online-and-retail-order-review-project
Analyzed customer reviews from online and retail orders. Performed sentiment analysis, keyword extraction, and topic modeling to identify trends, satisfaction drivers, and pain points. Used Python (NLTK, spaCy) and visualization tools to present actionable insights for improving customer experience and product strategy.
bivariate-analysis lambda matplotlib-pyplot nlp-machine-learning pandas sent-tokenize stemming stopwords trivariate-analysis unicode-data univariate-analysis word-tokenization
Last synced: 12 Apr 2026