Projects in Awesome Lists tagged with tfidf-vectorizer
A curated list of projects in awesome lists tagged with tfidf-vectorizer.
https://github.com/zayedrais/documentsearchengine
Document Search Engine project with TF-IDF and Google's Universal Sentence Encoder model
data-science deep-learning document-search document-similarity juypter machine-learning python python-text-analysis semantic-search semantic-search-engine tensorflow tensorflow-models tensorflow-tutorials text-analysis text-search text-semantic-similarity tfidf tfidf-text-analysis tfidf-vectorizer universal-sentence-encoder
Last synced: 02 May 2025
https://github.com/soumyajit4419/ai_for_social_good
Using natural language processing to analyze the sentiments of people and detect suicidal ideation on online social content.
lstm natural-language-processing random-forest tfidf-vectorizer web-scraping
Last synced: 30 Apr 2025
https://github.com/ksdkamesh99/spam-classifier
A natural language processing project on SMS data that predicts whether an SMS is spam or ham. It compares the accuracy of various ML algorithms (multinomial Naive Bayes, logistic regression, SVM, decision trees) and uses data cleaning and processing techniques such as PorterStemmer, CountVectorizer, TF-IDF Vectorizer, and WordNetLemmatizer. It is also implemented with an LSTM and word embeddings, reaching an accuracy of 97.84%.
bag-of-words count-vectorizer decision-tree-classifier embeddings logistic-regression lstm-neural-networks multinomial-naive-bayes naive-bayes-classifier porter-stemmer sms-spam-detection support-vector-machines tfidf-vectorizer wordnetlemmatizer
Last synced: 12 May 2025
https://github.com/rjarman/bus-mama
Bus-Mama is a bus-tracking mobile application for students of BSMRSTU. It shows the available routes, the buses, and their exact locations. Real-time tracking addresses a problem students have faced for many years: they often miss their buses or cannot keep to the bus schedule. Because the university runs many buses, students can easily catch one if they know where and when it will pass by. My goal is to track the buses and build hardware, a mobile application, and a machine learning solution to solve the issue, so that students stop missing buses and use them efficiently. GPS trackers attached to every bus report its current position and sync automatically with the server. The Bus-Mama mobile application, installed on students' phones, shows the real-time position of each bus on Google Maps. Every bus has its own marker, and clicking the marker shows the details of that bus: how far away it is, which direction it is coming from, how long it will take to arrive, how much longer it will take if there is traffic, and so on. A search option looks up the details of a specific bus, and a list of all buses gives sufficient detail about each. Every student has an account through which they can access bus data. Another main objective is the Bus-Mama Chatbot, which works in the Bengali language so that students can easily ask about the buses. For now, it can converse only about bus-related information.
The Chatbot cannot yet hold a conversation on anything other than bus-related questions. If asked anything else, it cannot answer the question; instead, it replies with a tag for the question. Because the Chatbot is built for the Bengali language, it uses the "trie" data structure for lemmatization. A library has been designed to lemmatize Bengali words, and almost 63,205 Bengali words have been lemmatized with it to train the SVM machine learning model.
angular bangla chatbot distancematrixservice googlemap gps iot javascript lemmatization machine-learning mongodb nodejs nosql python scss socket svm tfidf-vectorizer trie typescript
Last synced: 09 Mar 2026
https://github.com/venkat-0706/twalyze
Twitter sentiment analysis project using machine learning to classify tweets and understand audience mood, opinions, and behavior trends in real-time.
logistic-regression machine-learning model-evaluation naive-bayes-classifier pandas python scikitlearn-machine-learning tfidf-vectorizer tokenization
Last synced: 07 May 2026
https://github.com/vubacktracking/deep-neural-network-vietnamese-student-feedback-sentiment-analysis
Vietnamese Student Feedback Sentiment Analysis
deep-learning keras-tensorflow lstm-model lstm-neural-networks lstm-sentiment-analysis nlp sentiment-analysis streamlit streamlit-webapp tensforflow tfidf-vectorizer vietnamese-nlp word2vec word2vec-model
Last synced: 12 Apr 2025
https://github.com/abhishtagatya/text2meme
🖼️ Text2Meme is a Meme Classification Experiment based on Caption Text (Implemented as a Discord Bot)
discord-bot kaggle linear-svc meme-generator tfidf-vectorizer
Last synced: 06 May 2025
https://github.com/faizann24/authorship-attribution
Authorship Attribution with Machine Learning
authorship authorship-attribution cybersecurity machine-learning random-forest scikit-learn tfidf-vectorizer
Last synced: 28 Apr 2025
https://github.com/chiraag-kakar/fund
An NLP model that detects fake news and classifies a piece of news as REAL or FAKE, trained on a dataset provided by Kaggle.
confusion-matrix fake-news machine-learning-algorithms news-article passive-aggressive-classifier project sklearn tf-idf tfidf-text-analysis tfidf-vectorizer tfidfvectorizer
Last synced: 07 May 2025
https://github.com/zenwor/equilibrium
🗞️ Article Management System
cosine-similarity crud data-science machine-learning python pytorch tfidf tfidf-vectorizer
Last synced: 06 Feb 2026
https://github.com/shaadclt/password-strength-checker-randomforestclassifier
This project is a password strength checker that utilizes a Random Forest Classifier to determine the strength of a given password. The Random Forest Classifier is trained on a dataset of passwords labeled with their corresponding strength levels.
random-forest-classifier tfidf-vectorizer
Last synced: 10 Oct 2025
https://github.com/faseeh41/ai_for_social_good
Using natural language processing to analyze the sentiments of people and detect suicidal ideation on online social content.
lstm natural-language-processing random-forest tfidf-vectorizer web-scraping
Last synced: 25 Feb 2026
https://github.com/psychomita/intellicv
IntelliCV is an AI-driven platform for efficient and intelligent resume screening.
jupyter-notebook numpy pandas python scikit-learn seaborn streamlit svc-model tfidf-vectorizer
Last synced: 19 Apr 2025
https://github.com/sarrabenyahia/datamuse
Docker webapp on django - Parisian Culture - Datamuse
django django-application django-compose docker linux nlp nlp-machine-learning recommendation-engine tfidf-vectorizer webapp webapplication
Last synced: 02 Feb 2026
https://github.com/pmadruga/ds-jobindex
Machine learning techniques (NLP) applied to the jobindex.dk dataset
bert deep-learning machine-learning natural-language-processing nlp python pytorch tfidf-vectorizer transformers
Last synced: 19 Feb 2026
https://github.com/moindalvs/sentiment_analysis_on_-elon_musk_tweets
Perform sentiment analysis on Elon Musk's tweets (Elon-musk.csv)
bag-of-words cleaning-data elon-musk feature-engineering nlp nltk polarity sentiment-analysis sentiment-intensity sentiment-polarity spacy subjectivity text-mining text-processing textblob-sentiment-analysis tfidf tfidf-vectorizer tokenizer tweet-analysis twitter-sentiment-analysis
Last synced: 21 Apr 2026
https://github.com/rkbeatss/prepr-ml-challenge
Prepr's Machine Learning Challenge
jupyter-notebook machine-learning plotly python3 randomforestregressor tfidf-vectorizer
Last synced: 02 May 2026
https://github.com/saheedniyi02/krecommend
A python package for creating content-based text recommender systems on pandas dataframes and SQLAlchemy tables
cosine-similarity flask-sqlalchemy nlp numpy pandas python recommendation-algorithms recommendation-engine recommendation-system recommender-system scikit-learn sql sqlalchemy sqlite3 tfidf-vectorizer
Last synced: 10 Mar 2026
https://github.com/ayusharma03/codsoft_internship
CodSoft internship projects: an SMS spam prediction model, customer churn prediction, and a movie classification system based on the movie's summary
bag-of-words codsoft codsoft-internship codsoft-machine-learning codsoft-virtual-internship codsoftinternship machine-learning nltk tfidf-vectorizer
Last synced: 29 Jan 2026
https://github.com/tushard48/sms-spam-detection
This repository contains code and models for identifying spam SMS messages. It utilizes machine learning techniques to classify messages as spam or ham (non-spam).
machine-learning spam-detection streamlit tfidf-vectorizer
Last synced: 19 May 2026
https://github.com/antonio-f/multilabel-classification
Predict tags on StackOverflow with linear models - Week 1 assignment of Coursera's Natural Language Processing course from the Advanced Machine Learning Specialization.
bag-of-words logistic-regression multilabel-classification nltk-library one-vs-rest sklearn-library tfidf tfidf-vectorizer
Last synced: 30 Mar 2025
https://github.com/chengetanaim/sentimentanalysisforfinancialnewsnotebook
Building the model of a financial news sentiment classifier. Financial news headlines will be classified as positive, negative or neutral (from an investor point of view)
logistic-regression machine-learning natural-language-processing scikit-learn tfidf-vectorizer
Last synced: 04 May 2026
https://github.com/shubhamgoyal575/spam_detective
This project uses machine learning to classify messages as spam or ham based on text analysis. It includes data preprocessing, feature extraction (TF-IDF), and classification models like Logistic Regression and Naive Bayes for accurate spam detection. Built with Python and Scikit-Learn. 🚀
count-vectorizer data-analysis data-analytics data-cleaning data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis logistic-regression machine-learning machine-learning-algorithms naive-bayes natural-language-processing spam-detection tfidf-vectorizer
Last synced: 02 Jul 2025
https://github.com/singhkunwardeep/twitter_sentiment_analysis
A machine learning project that classifies Twitter sentiment into positive and negative categories using Logistic Regression and TF-IDF vectorization. This project involves data preprocessing, feature extraction, model training, and evaluation of the sentiment of tweets. Built with Python, NLTK, and Scikit-learn.
logistic-regression nltk-python pandas-dataframe python3 scikit-learn tfidf-vectorizer
Last synced: 05 May 2026
https://github.com/priyam-hub/inside-medium
Inside-Medium is an AI-powered content recommendation engine designed to help readers find the most relevant and high-quality Medium articles based on their interests or selected articles.
natural-language-processing non-negative-matrix-factorization tfidf-vectorizer
Last synced: 25 Jul 2025
https://github.com/akarshankapoor7/automated-complaint-triage-system-using-nlp-and-machine-learning
Automated Severity Classification of Forum Complaints for Resolution Teams - Emphasizes automation and the end goal for resolution teams.
data-science datamining kmeans-clustering naive-bayes-classifier nlp tfidf-vectorizer
Last synced: 27 Mar 2025
https://github.com/parag000/content-based-movie-recommender
This project builds a content-based movie recommendation system using the TMDB dataset. By combining metadata features like cast, genres, and directors into a "metadata soup," it calculates movie similarity with vectorizers (Count) and cosine similarity. Ideal for learning content-based filtering and text vectorization techniques.
cosine-similarity countvectorizer recommendation-system scikit-learn tfidf-vectorizer vectorization
Last synced: 18 Apr 2026
https://github.com/sahiltiwariiii/email-spam-classifier
This model tells you whether an email is spam or not
dataanalysis datacleaning datascience eda machine-learning nlp-machine-learning nltk numpy pandas python scikit-learn streamlit streamlit-webapp tfidf-vectorizer wordcloud-visualization wordtovec
Last synced: 09 Apr 2026
https://github.com/armanjscript/rag-driven-generative-ai
Generative AI has made remarkable strides in creating human-like text, images, and even code. However, traditional models like GPT rely solely on pre-trained knowledge, which can lead to outdated, inaccurate, or hallucinated responses. Retrieval-Augmented Generation (RAG) addresses these limitations. This repository offers various types of RAG.
cosine-similarity langchain langchain-ollama qwen2-5 spacy spacy-nlp tfidf tfidf-vectorizer wordnet
Last synced: 09 Apr 2026
https://github.com/dynamicanupam/classification_of_customer_complaints_using_nlp
Create a solution that will help in identifying the type of complaint ticket raised by the customers of a multinational bank using NLP and Topic Modelling (NMF)
nlp nmf tfidf-vectorizer topicmodelling
Last synced: 22 Aug 2025
https://github.com/somjit101/nlp-stackeroverflow-tag-prediction
A multi-class classification problem where the objective is to read a question posted on the popular reference website, StackOverflow and predict the primary topics it deals with, i.e. tags which the question will be associated with.
bag-of-words countvectorizer logistic-regression multi-class-classification multiclass-logistic-regression natural-language-processing nlp one-vs-rest onevsrestclassifier stackoverflow-tags stemming text-mining tf-idf tfidf-vectorizer word-cloud
Last synced: 06 Mar 2025
https://github.com/faraazarsath/customer-segmentation_e_commerce
Customer Segmentation of E commerce purchase database
kmeans-clustering latent-semantic-analysis natural-language-processing tfidf-vectorizer
Last synced: 19 Nov 2025
https://github.com/lightxlk/smbdunlp
Making a project for detecting bots and fraud in social media using Deep Learning & NLP.
bot botdetection histgram-gradient-boosting kde nlp-machine-learning random-forest shap social-media tfidf-vectorizer
Last synced: 16 May 2025
https://github.com/srijaadhya12/project-to-interview
Your ultimate interview preparation for personal project related questions
flask gemini-api random-forest-classifier react sklearn tailwind tfidf-vectorizer
Last synced: 11 Apr 2026
https://github.com/sanjanahombal/sentiment-analysis-using-neural-networks
This project explores sentiment analysis using neural networks
convolutional-neural-networks countvectorizer deep-learning keras matplotlib neural-network numpy python tensorflow tfidf-vectorizer
Last synced: 06 Jan 2026
https://github.com/tahirzia-1/nlp-textclassify
A hands-on NLP project comparing classic ML models (Naïve Bayes, SVM, Logistic Regression) and ANNs for text classification using SMS Spam and 20 Newsgroups datasets.
adam-optimizer ann cbow deep-learning lemmatization logistic-regression naive-bayes-classifier nlp nlp-machine-learning skipgram-algorithm svm tensorflow tfidf tfidf-vectorizer tokenization vectorization word2vec
Last synced: 12 Apr 2026
https://github.com/sambhu431/medicine-recommendation-system
The project aims to recommend medicines based on product uses similarity, side effects, and product review weightages. Powered by NLP techniques like TF-IDF and Cosine Similarity, the system provides intelligent and user-centric recommendations.
cosine-similarity flask machine-learning medicine medicine-recommendation medicine-search pickle recommendation-system tfidf tfidf-vectorizer
Last synced: 09 Apr 2025
https://github.com/supriya811106/twitter-sentiment-analysis
Analyzing the mood of tweets! We sort tweets on popular topics into positive, negative, or neutral categories to gauge public opinion. See what Twitter really thinks!
bernoulli-naive-bayes jupyter-notebook matplotlib nlp-machine-learning nltk numpy pandas python scikit-learn seaborn sentiment-analysis text-classification tfidf-vectorizer wordcloud
Last synced: 05 Apr 2026
https://github.com/ankulmaurya88/zomato-content-based-restaurant
Content-based restaurant recommendation system using Zomato data with TF-IDF and cosine similarity.
content-based-recommendation data-science machine-learning python3 recommender-system tfidf-vectorizer zomato
Last synced: 21 May 2026
https://github.com/ishanmk/spam-email-classifier
nlp regex spam spam-detection tfidf-vectorizer tokenization
Last synced: 28 Feb 2025
https://github.com/mrtaz77/fitymi-fake-it-till-you-make-it
Fake news detection using machine learning
fake-news-detection passiveaggressiveclassifier tfidf-vectorizer
Last synced: 05 Jul 2025
https://github.com/arufonsekun/covid-topic-modeling
Covid news topic modeling using TFIDF feature extractor and non-negative matrix factorization (NMF)
covid-19 nlp spacy-nlp tfidf-vectorizer
Last synced: 17 Mar 2025
https://github.com/Wardah26Nabilah/SMSGuard-Intelligent-Spam-SMS-Detection-System
📱 Detect spam SMS messages using a Machine Learning system, classifying texts as Spam or Ham with high accuracy and efficiency.
classification-model cybersecurity data-science jupyter-notebook logistic-regression-algorithm machine-learning ml-project naive-bayes-classifier python scikit-learn-python sms-spam-detection spam-detection-machine-learning svm-classifier text-classification-python tfidf-vectorizer
Last synced: 14 Jan 2026
https://github.com/team-denis/hackyeah2024
Hackyeah2024 Cybersecurity Task
ai cybersecurity fraud-detection machine-learning tfidf-vectorizer
Last synced: 09 Apr 2025
https://github.com/otuemre/emailphishingdetection
A real-time phishing email detection system using Machine Learning (SVM, Logistic Regression, Naive Bayes) with FastAPI backend and custom domain deployment.
cybersecurity fastapi huggingface machine-learning nlp real-time scikit-learn spam-detection svm-classifier tfidf-vectorizer
Last synced: 13 Apr 2026
https://github.com/rid17pawar/sentiment-analysis-model-experiments
Experiments in sentiment analysis using ML algorithms, namely Logistic Regression and Naive Bayes, with TF-IDF, one-hot encoding, and bag-of-words vectorization; different MLP and RNN models, viz. LSTM, GRU, and Bidirectional LSTM; and lastly, the state-of-the-art BERT model
bag-of-words bert bidirectional-lstm gru logistic-regression lstm ml-algorithms naive-bayes neural-networks one-hot-encoding rnn sentiment-analysis sentiment-classification text-vectorization tfidf tfidf-vectorizer transformer-architecture twitter-sentiment-analysis
Last synced: 30 May 2026
https://github.com/soumyapro/movie-recommendation-system
A machine learning model to recommend movies. The model is built entirely in Python using cosine similarity. This type of recommendation system takes as input a movie that the user currently likes, then analyzes the movie's content, popularity, etc. to find other movies with similar content.
cosine-similarity tfidf-vectorizer
Last synced: 01 Mar 2025
https://github.com/04bhavyaa/sms-spam-classification-system
A Machine Learning project that identifies whether a given message is spam or not. It uses Natural Language Processing (NLP) techniques (Stemming and TF-IDF Vectorization) for text transformation and a trained Multinomial Naive Bayes Classifier for predictions.
bernoulli-naive-bayes nlp-machine-learning nltk-library spam-classification stemming streamlit tfidf-vectorizer
Last synced: 24 Apr 2026
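For readers unfamiliar with the classifier named in the entry above, here is a minimal sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It is not taken from the repository: it uses only the Python standard library, works on raw whitespace-split term counts, and omits the stemming and TF-IDF steps the description mentions. The function names and the toy messages are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(messages, labels):
    """Collect class counts, per-class term counts, and the vocabulary."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for text, label in zip(messages, labels):
        term_counts[label].update(text.lower().split())
    vocab = {term for counts in term_counts.values() for term in counts}
    return class_counts, term_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, term_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # Add-one smoothing: every vocabulary term gets one extra count.
        denom = sum(term_counts[label].values()) + len(vocab)
        score = math.log(n / total)  # log prior
        for term in text.lower().split():
            if term in vocab:  # words never seen in training are ignored
                score += math.log((term_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for this sketch.
messages = [
    "win free prize now",
    "free cash win",
    "see you at lunch",
    "lunch at noon tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(messages, labels)

spam_guess = predict_nb(model, "free prize")
ham_guess = predict_nb(model, "lunch tomorrow")
```

In a real pipeline the counts would typically be replaced by the output of a vectorizer, and a library implementation would be preferable to this hand-rolled version.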
https://github.com/kush1912/phocket---ml-internship
This repository consists of machine learning models, deep learning models, and some NLP tasks such as topic modelling, sequence generation, sentiment analysis, and a recommendation system
black-friday classification-algorithims decison-trees keywords-extraction knn model-selection n-grams natural-language-processing nlp nlp-keywords-extraction pre-processing random-forest roc-curve sentiment-analysis sequence-to-sequence svm-classifier tensorflow tfidf-vectorizer topic-modeling twitter-sentiment-analysis
Last synced: 19 Apr 2026
https://github.com/chandkund/sms-spam-detection
The goal is to develop a classification model that can accurately differentiate between spam and non-spam messages. This is crucial for applications like email filtering, SMS spam detection, and improving overall user experience by reducing the influx of unwanted or malicious content.
matplotlib nlp-machine-learning numpy pandas seaborn stemming tfidf-vectorizer tokenization
Last synced: 19 Jan 2026
https://github.com/fyt3rp4til/tfidf-emotiondetection
multinomial-naive-bayes n-grams random-forest spacy tfidf-vectorizer
Last synced: 24 Feb 2026
https://github.com/soumyapro/sms-spam-classifier
A machine learning project that detects spam SMS messages using natural language processing techniques. The model analyzes text messages and accurately classifies them as spam or legitimate (ham).
multinomial-naive-bayes nltk sklearn tfidf-vectorizer tokenizer
Last synced: 15 Apr 2026
https://github.com/floressek/languageprocessinglab
Collection of Natural Language Processing laboratory exercises exploring text processing, linguistic analysis, and statistical methods.
pca-analysis tfidf-vectorizer word-frequency-analysis
Last synced: 31 Jan 2026
https://github.com/inddrsingh/email-sms-spam-classifier
Given a text, the ML model can predict whether it is "SPAM" or "NOT SPAM"
machine-learning-algorithms naive-bayes-classifier python3 tfidf-vectorizer vectorization
Last synced: 15 Feb 2026
https://github.com/rohansardar/speechflowguard
A machine learning web API that detects toxic language in user comments using classical ML
docker logistic-regression machine-learning python3 scikit-learn tf-idf tfidf-text-analysis tfidf-vectorizer
Last synced: 17 Apr 2026
https://github.com/somjit101/nlp-casestudy-amazon-fine-foods-review
Efficient sentence encoding and vectorization techniques applied to customer reviews from product pages of the popular e-commerce website Amazon, using proven NLP techniques for sentiment analysis.
amazon-fine-food-reviews amazon-fine-food-reviews-dataset featurization natural-language-processing nlp text-classification text-preprocessing tfidf-vectorizer vectorization word2vec
Last synced: 20 Apr 2026
https://github.com/abdelrahman-amen/active_learning_in_nlp_using_small_text_technique
This project demonstrates active learning for text classification using the Small-Text library on the IMDB dataset. A logistic regression model is trained iteratively, selecting the most uncertain samples for labeling with a smart query strategy. The approach highlights efficient learning with minimal labeled data, improving model performance.
activelearning imdb logistic-regression nlp python sklearn smalltext tfidf-vectorizer uncertainty
Last synced: 20 Apr 2026
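The entry above describes selecting "the most uncertain samples" for labeling but does not say which query strategy is used. As an illustration only, the sketch below shows least-confidence sampling, one common uncertainty strategy; it does not use the Small-Text API, and the function name and probability values are hypothetical.

```python
def least_confident(probabilities, k):
    """Return the indices of the k samples with the lowest top-class probability.

    `probabilities` holds one list of class probabilities per unlabeled sample,
    as a classifier such as logistic regression would produce.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:k]


# Hypothetical predicted probabilities for four unlabeled samples.
probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.51, 0.49],
]
query = least_confident(probs, k=2)  # samples 3 and 1 are closest to a coin flip
```

In an active learning loop, the returned indices would be sent for labeling, added to the training set, and the model retrained before the next query round.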
https://github.com/jash271/news_classifier
Classifies news text to True or Fake
fake-news nlp pipelines pkl python sklearn tf-idf tfidf-vectorizer
Last synced: 20 Apr 2026
https://github.com/jeffrine/inverted-index-search-engine
A Document Search Engine with TF-IDF.
python semantic-search text-search tfidf-vectorizer
Last synced: 27 Apr 2026
https://github.com/chandadiya2004/movie-recommendation-system
A Movie Recommendation System built using TfidfVectorizer and cosine similarity. The model processes a large dataset of movies and recommends similar movies based on a given input movie by analyzing textual features and calculating similarity scores.
cosine-similarity numpy pandas python sklearn tfidf-vectorizer
Last synced: 29 Apr 2026
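The combination named in the entry above, TF-IDF vectors compared by cosine similarity, can be sketched in a few lines. This is not the repository's code: it is a standard-library approximation that uses the smoothed IDF formula ln((1 + n) / (1 + df)) + 1 and L2 normalisation, which are scikit-learn's `TfidfVectorizer` defaults, but with plain whitespace tokenisation as a simplification. The function names and the toy catalogue are invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (a dict) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def recommend(titles, descriptions, query_title, top_n=2):
    """Rank every other title by the similarity of its description."""
    vectors = tfidf_vectors(descriptions)
    q = titles.index(query_title)
    scored = [
        (cosine(vectors[q], vec), title)
        for i, (vec, title) in enumerate(zip(vectors, titles))
        if i != q
    ]
    return [title for _, title in sorted(scored, reverse=True)[:top_n]]


# Hypothetical toy catalogue, invented for this sketch.
titles = ["Space Wars", "Galaxy Quest", "Love Story"]
descriptions = [
    "space battle alien fleet",
    "alien space crew comedy",
    "romance love paris",
]
top = recommend(titles, descriptions, "Space Wars", top_n=1)
```

Because the vectors are normalised up front, the cosine similarity reduces to a sparse dot product, which is why no division appears in `cosine`.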
https://github.com/vishal815/language_predictor_ml_nlp
Click below to check out the website of this ML-NLP project
coderun data-science deep-learning githubproject huggingface language language-detection language-prediction machine-learning ml nlp nlp-machine-learning open-source streamlit textanalysis tfidf-vectorizer vishal vishal-lazrus vishallazrus
Last synced: 30 Apr 2026
https://github.com/kaustavmodak/business-aided-customer-feedback-assessment-system
A Streamlit-based sentiment analysis app that classifies customer reviews into Positive, Neutral, or Negative using a pre-trained ML model
framework machine-learning matplotlib nlp nltk numpy pandas pickle regex scikit-learn seaborn sentiment-analysis streamlt tfidf-vectorizer
Last synced: 03 May 2026
https://github.com/chengetanaim/sentimentanalysisforfinancialnews
This is a Django application for predicting whether the sentiment of a financial news headline is positive, negative or neutral (from an investor point of view)
beautifulsoup4 chartjs django html-css-javascript logistic-regression machine-learning natural-language-processing scikit-learn tfidf-vectorizer webscraping
Last synced: 10 May 2026
https://github.com/gui-sitton/detectnegativereviews
Create a model to classify reviews as positive and negative.
catboost logisticregression nltk nltk-python random-forest-classifier text-analysis text-classification tfidf-vectorizer xgboost
Last synced: 18 Mar 2025
https://github.com/kkeshav1101/nlp
Based on Natural Language Programming Lab coursework as a part of my degree
bag-of-words keras-tensorflow lstm nlp nltk python rnn-tensorflow tensorflow tfidf-vectorizer word2vec
Last synced: 11 May 2026
https://github.com/tonmoytalukder/analysis-on-ml-model-s-classification-of-bengali-hate-speech-in-different-social-contexts
4th Year 2nd Semester Pattern Recognition Lab Project.
machine-learning nlp-machine-learning tfidf-vectorizer
Last synced: 26 Mar 2025
https://github.com/jash271/topic-modeling
Segregating Quora Questions to 8 Categories
nlp nmf-decomposition sklearn tfidf-vectorizer topic-modeling wordcloud
Last synced: 15 May 2026
https://github.com/ramneek2003/movie-recommendation-system
Developed as a warm-up project, this machine learning-based movie recommendation system utilizes cosine similarity to find and suggest similar films. By combining content-based filtering with popularity metrics, it provides personalized movie recommendations based on user preferences and trends, enhancing the overall user experience.
cosine-similarity machine-learning tfidf-vectorizer
Last synced: 15 May 2026
https://github.com/himank-khatri/spamham
NLP models trained using Bag of Words (BoW) and Term Frequency - Inverse Document Frequency (TF-IDF) to classify SMS as Spam or Ham.
bag-of-words naive-bayes-algorithm nlp nlp-machine-learning spam-detection tfidf-vectorizer
Last synced: 02 Mar 2025
https://github.com/souravxbera/movie-recommendation
Movie Recommender - A Smart Movie Recommendation System, built using NLP, TF-IDF & FastAPI
ml nlp-machine-learning tfidf-vectorizer
Last synced: 15 May 2026
https://github.com/jeffreywijaya100/ecommerce-product-textmining
Classification modeling using product data from an e-commerce site, following the given requirements
classification-report count-vectorizer hyperparameter-tuning machine-learning nltk optuna random-forest-classifier svm-classifier text-mining text-representation tfidf-vectorizer
Last synced: 11 Jul 2025
https://github.com/abinashsahoo007/project-resume-classification
The document classification solution should significantly reduce the manual human effort in the HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.
corpus count-vectorizer label-encoding lemmitization machine-learning nltk part-of-speech-tagging resume-classification spacy stemming text-mining text-preprocessing textract tfidf-vectorizer tokenization wordcloud
Last synced: 02 Feb 2026
https://github.com/veer-parikh/amazon-review-helpfulness
A machine learning project that predicts the helpfulness of Amazon customer reviews using NLP techniques, TF-IDF, and a Random Forest classifier.
amazon-reviews machine-learning natural-language-processing random-forest sentiment-analysis tfidf-vectorizer
Last synced: 21 Jun 2025
https://github.com/beenish-ishtiaq/dep-task-2-spam-email-classifier
This project focuses on building a classifier to distinguish between spam and ham emails using Logistic Regression. Key steps include data preprocessing, feature extraction with TF-IDF vectorization, and model evaluation with accuracy metrics and a confusion matrix.
data-science email-filtering logistic-regression machine-learning natural-language-processing python spam-detection text-classification tfidf-vectorizer
Last synced: 17 May 2026
https://github.com/jeffreywijaya100/youtube-comment-textmining
Scraping at least 100 Indonesian-language YouTube comments related to machine learning
api-key count-vectorizer machine-learning scraping text-mining tfidf-vectorizer word-cloud youtube-api-v3 youtube-comment-scraper
Last synced: 28 Mar 2025
https://github.com/pedrofracassi/insper-nlp-relevance-search
Search for Bluesky posts using TF-IDF to rank the relevance of the results
Last synced: 27 Mar 2025
https://github.com/ardra-a-h/resume-parser
artificial-intelligence cosine-similarity machine-learning nlp sklearn tfidf-vectorizer
Last synced: 06 Jan 2026
https://github.com/aasthaj28/ai-for-social-good
Using natural language processing to analyze the sentiments of people and detect suicidal ideation on online social content.
lstm natural-language-processing random-forest tfidf-vectorizer web-scraping
Last synced: 05 Apr 2025
https://github.com/sanjanahombal/study-on-sentiment-analysis
This project explores the optimal combination of Bag-of-Words and TF-IDF vectorization with Naive Bayes and SVM for sentiment analysis. It evaluates performance using accuracy, precision, recall, and F1-score, addressing ethical concerns like data privacy and bias to improve sentiment classification in real-world applications.
bag-of-words confusionmatrix googlecollab gridsearch-crossvalidation matplotlib-pyplot naive-bayes-classifier numpy pandas seaborn sklearn svm-classifier tfidf-vectorizer
Last synced: 07 Jan 2026
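The four evaluation metrics named in the entry above can all be derived from true-positive, false-positive, and false-negative counts. The following is a minimal standard-library sketch for the binary case, not the project's evaluation code; the function name and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
scores = binary_metrics(y_true, y_pred)
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.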
https://github.com/sanjurajveer/moview_review_analysis_nlp
Analysing movie reviews using NLP and categorising them into good and bad
nlp-machine-learning nltk-python perplexity tfidf-vectorizer tsne-algorithm
Last synced: 25 Jun 2025
https://github.com/vbhatsaccnt/recommendation_engine
This is a content based movie recommendation engine.
data-science movie-recommendation nltk-python recommender-system tfidf-vectorizer
Last synced: 20 Oct 2025
https://github.com/abideen-olawuwo/netflix
A recommendation engine
linear-kernel matplotlib netflix numpy pandas plotly python seaborn sklearn tfidf-vectorizer
Last synced: 12 Apr 2026
https://github.com/sridharyadav07/ai--powered-task-management-system
An intelligent Task Management System that integrates Sentiment Analysis, Task Optimization, and Forecasting to streamline project and task handling. This AI-powered tool is designed to assist teams and project managers in making data-driven decisions by understanding emotional context, forecasting productivity, and optimizing workload distribution
arima flask joblib jupyter-notebook naive-bayes-classifier nltk numpy pandas pickle-file python randomforestregressor scikit-learn stopwords-removal streamlit tfidf-vectorizer
Last synced: 08 Apr 2026
https://github.com/bhaskrr/restaurant-reviews-5-class-rating-prediction-
This repo contains the dataset and notebook for the kaggle restaurant reviews five class rating prediction
kaggle-dataset machine-learning natural-language-processing randomoversampler rating-prediction tfidf-vectorizer
Last synced: 27 Jun 2025
https://github.com/pthmhatre/stylescribe-using-generative-adversarial-network
A fashion AI-based model capable of generating images from textual descriptions. The model should take natural language text as input and generate images that visually represent the given text. This text-to-image generation system bridges the gap between textual descriptions and visual content.
deep-neural-networks flask-application generative-adversarial-network generative-ai googlecloudplatform hyperparameter-tuning keras-tensorflow neural-networks nlp os pillow rdp-connection scipy sklearn-metrics spacy-nlp texttoimage tfidf-vectorizer
Last synced: 30 Jan 2026
https://github.com/snehawk20/log_anomaly_detection
Detecting anomalous log entries
logistic-regression tfidf-vectorizer
Last synced: 10 Sep 2025
https://github.com/FaraazArsath/Customer-segmentation_E_commerce
Customer Segmentation of E commerce purchase database
kmeans-clustering latent-semantic-analysis natural-language-processing tfidf-vectorizer
Last synced: 15 Sep 2025
https://github.com/ahmad-ali-rafique/mail-spam-detection-ml
This repository contains a machine learning project for email spam detection. It includes data preprocessing, model training, evaluation, and deployment using Python and scikit-learn.
artificial-intelligence data-science dataanalysis datavisualization linear-regression machine-learning modeling scikitlearn-machine-learning spam-detection tfidf-vectorizer
Last synced: 05 Mar 2025
https://github.com/thangtran3112/machine-learning
NLP, Neural networks, pytorch, tensorflow, AWS Sagemaker fine-tuning
artificial-neural-networks aws-bedrock aws-sagemaker gensim gru-neural-networks keras lemmatization lstm-neural-networks nltk numpy one-hot-encoding pandas python recurrent-neural-network scikit-learn tensorflow tfidf-vectorizer word2vec
Last synced: 15 Feb 2026
https://github.com/kush1912/text-classification
This project was done out of interest in learning the differences between classification algorithms and deriving a solid conclusion from the comparison. It scrapes data from YouTube related to six different classes and then classifies it using different classification algorithms.
beautifulsoup4 naive-bayes-algorithm neural-network randomforest selenium-webdriver svm-classifier text-classification tfidf-vectorizer webscrapping xgboost-algorithm youtube
Last synced: 09 Apr 2026