scikit-learn
scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.
- GitHub: https://github.com/topics/scikit-learn
- Wikipedia: https://en.wikipedia.org/wiki/Scikit-learn
- Repo: https://github.com/scikit-learn/scikit-learn
- Created by: David Cournapeau
- Released: January 05, 2010
- Related Topics: scikit, python
- Aliases: sklearn
- Last updated: 2026-06-23 00:27:46 UTC
https://github.com/somenath203/movie-recommender-system
Click below to check out the website
content-based-recommendation cosine-similarity huggingface-spaces movie-recommender-system python recommender-system scikit-learn streamlit streamlit-webapp
Last synced: 13 Apr 2026
https://github.com/emanuel-poblano/stock-market-predictor
An end-to-end Python stock price prediction project that pulls real market data, performs feature engineering, trains a machine learning model, and predicts the next-day closing price of a stock.
matplotlib pandas python scikit-learn yfinance
Last synced: 13 Apr 2026
https://github.com/prakharchoudhary/mlchallenge-2
My submission for machine learning challenge #2, organised by hackerEarth.
adaboost gradient-boosting-classifier jupyter-notebook machine-learning python scikit-learn
Last synced: 13 Apr 2026
https://github.com/johnnixon6972/cirrhosis-outcomes-prediction
This leverages advanced machine learning techniques to predict patient outcomes for those suffering from cirrhosis. Utilizing a comprehensive dataset from a Mayo Clinic study, this project explores various data imputation methods and class balancing techniques to enhance prediction accuracy.
ai algorithms analytics artificial-intelligence machine-learning ml pandas python3 scikit-learn
Last synced: 13 Apr 2026
https://github.com/tnleite/loan-approval-prediction
This repository presents a predictive model for loan approval, focused on minimizing default risk. Using EDA and machine learning algorithms (Random Forest, XGBoost), we tuned the threshold to maximize recall for defaulters, contributing to efficient risk management.
classification-algorithm data-science exploratory-data-analysis machine-learning-algorithms machine-learning-models matplotlib numpy scikit-learn scipy seaborn xgboost-classifier
Last synced: 13 Apr 2026
https://github.com/tusharpandey003/iris-flower-classification
Iris flower classification using KNN and Random forest algorithm
data-science iris iris-classification iris-data iris-dataset iris-detection iris-flower-classification iris-flowers knn-classification machine-learning-algorithms random-forest scikit-learn streamlit
Last synced: 13 Apr 2026
https://github.com/khushi130404/placemetrix
A machine learning-based placement prediction app using IQ and CQPA as inputs. Built with Python in Jupyter Notebook, leveraging scikit-learn, pandas, and matplotlib.
jupyter-notebook machine-learning matplotlib python scikit-learn
Last synced: 13 Apr 2026
https://github.com/sanchariii/order_amt_prediction
Order Amount Prediction is a machine learning project that predicts customer order amounts based on past behavior. It includes milestones for data cleaning, exploratory data analysis, feature engineering, and model building. The framework can be customized to suit specific needs and provides insights for better decision-making.
jupyter-notebook machine-learning python scikit-learn
Last synced: 13 Apr 2026
https://github.com/akashwav/fake-news-detection
📰 A complete NLP project that takes a news dataset, builds a highly accurate classification model, and deploys it as a live web application using Streamlit and GitHub.
data-science fake-news-detection machine-learning nlp nltk pandas python scikit-learn spacy streamlit text-classification
Last synced: 11 Apr 2026
https://github.com/kalpthakkar/jobpilot-ai
JobPilot AI is a next-generation, AI-powered job application and management platform that automates the end-to-end process of job searching, intelligent application submission, and workflow analytics. It combines state-of-the-art AI, ML, NLP, and cloud technologies to deliver a seamless, highly customizable, and extensible solution for job seekers.
artificial-intelligence automation beautifulsoup chromadb fastapi gmail-api jobs langchain llm lxml nlp nltk ollama python3 pywinauto rag scikit-learn selenium sqlite-database
Last synced: 08 Apr 2026
https://github.com/danishtalpur/sentiview-website
SentiView is a sentiment analysis tool designed to analyze and interpret the emotions behind tweets on Twitter. The platform processes textual data from user-generated tweets to determine the sentiment behind them—whether they are positive, negative, or neutral.
css flask html java naive-bayes-classifier scikit-learn twitter-sentiment-analysis
Last synced: 16 Apr 2026
https://github.com/blleshi/neural_network_binary_classification
Venture Funding with Deep Learning (Neural Network Binary Classification)
binary-classification binary-crossentropy deep-learning hdf5 keras neural-network neural-network-model pandas preprocessing-data scikit-learn standard-scaler tensorflow venture-funding
Last synced: 16 Apr 2026
https://github.com/joewlos/fantasy_football_monte_carlo_draft_simulator
Monte Carlo Fantasy Football Draft Simulator Featuring FastAPI, NextUI, and ODMantic
fantasy-football monte-carlo nextjs nextui odmantic pydantic python scikit-learn
Last synced: 13 Apr 2026
https://github.com/lingumd/neural_network_charity_analysis
Machine learning and neural networks used to create a binary classifier capable of predicting whether applicants will be successful if funded by Alphabet Soup.
deep-learning machine-learning matplotlib-pyplot neural-networks onehotencoder pandas scikit-learn seaborn standardscaler tensorflow
Last synced: 13 Apr 2026
https://github.com/kianoushamirpour/end_to_end_text_classification
Developing feature engineering pipelines, building packages, automating tests, and creating FastAPI endpoints.
apache-airflow ci docker-compose factory-design-pattern fastapi feast grafana hyperopt mlflow prometheus pytorch scikit-learn tox transformers xgboost-classifier
Last synced: 08 Apr 2026
https://github.com/no-country-simulation/s16-21-n-data-bi
Analysis of COVID-19: insights into the evolution of the pandemic and its impact on 5 South American countries.
eda etl machine-learning matplotlib pandas powerbi python scikit-learn seabron streamlit
Last synced: 28 Apr 2025
https://github.com/ifte-13/early-stage-brain-stroke-detection
Predictive Analysis & Early Detection of Brain stroke using Machine Learning Algorithm
decision-tree imbalanced-learn knn matplotlib numpy pandas random-forest scikit-learn seaborn
Last synced: 06 Jul 2025
https://github.com/santoshn86/dlp-ev-system-for-pa-optimization
This system is a game-changer, enabling smarter energy management through predictive insights and personalized optimization strategies.
aiml django flask keras pytorch scikit-learn tensorflow typescript
Last synced: 13 Apr 2026
https://github.com/adityasreevatsak/smartflow
SmartFlow enhances bike-sharing efficiency by combining deep reinforcement learning with agentic AI. The RL model optimizes bike distribution, while agentic AI coordinates real-time actions, like alerting truck drivers. This scalable approach ensures smart decisions and timely execution for urban transport.
agentic-ai jupyter-notebook keras open-ai-gym pandas python pytorch reinforcement-learning scikit-learn seaborn stable-baselines3 tensorflow
Last synced: 13 Apr 2026
https://github.com/towaquimbayo/comp-4948
BCIT Computer Systems Technology (CST) - COMP 4948 (Predictive Machine Learning)
feature-selection keras matplotlib numpy pandas predictive-analytics predictive-modeling python pytorch regression scikit-learn sklearn statsmodels tensorflow
Last synced: 13 Apr 2026
https://github.com/dummumounika/ecommerce-sales-categorization
This repository contains Python code for text classification and analysis of e-commerce sales data. The script processes textual descriptions of products and categorizes them into predefined categories using a Naive Bayes classifier. It also includes various analysis and visualization methods to explore the dataset.
machine-learning matplotlib-pyplot ntlk numpy pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/1401dev/customer-lifetime-value-prediction
A data science project leveraging Python and Scikit-Learn to build predictive models that estimate customer lifetime value (CLV). Includes data cleaning, feature engineering, and model selection to identify key drivers of CLV, supporting strategic decision-making in customer retention and marketing.
clv clv-analysis customer-retention data-analysis dataprocessing feature-engineering machine-learning marketing-analytics predictive-modeling python regression-analysis scikit-learn
Last synced: 06 May 2026
https://github.com/sorabh-kapoor/face-recognition-attendance-system
The Facial Recognition System is an AI-powered application built with Flask and designed to detect and recognize faces with high accuracy. This system can be integrated into various applications, including security systems, attendance management, and identity verification.
flask flask-application knn ml numpy opencv pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/intscription/machine-learning
Machine Learning and its advanced concepts
adaboost numpy pandas pca-analysis pipeline random-forest scikit-learn svm
Last synced: 28 Apr 2026
https://github.com/gsmafra/sklearn-dummies
Scikit-learn label binarizer with support for missing values
data-science machine-learning pandas python scikit-learn
Last synced: 15 Jan 2026
https://github.com/pramodyasahan/spaceship-titanic
This repository features a machine learning model designed to predict whether passengers of a space travel company are likely to be transported. The model employs CatBoostClassifier, a machine learning algorithm known for handling categorical data effectively.
machine-learning numpy pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/apfirebolt/spam_email_classifier
An Email classifier using CountVectorizer and Naive Bayes strategy. PyQt5 is used for GUI
count-vectorizer naive-bayes-classifier pandas pyqt5 python scikit-learn
Last synced: 08 May 2026
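The CountVectorizer plus Naive Bayes approach named in the description above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the toy emails and labels are invented here, and MultinomialNB is assumed as the Naive Bayes variant.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus, invented for illustration only.
emails = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]

# CountVectorizer turns each email into word counts;
# MultinomialNB models those counts per class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

prediction = model.predict(["claim your free prize"])[0]
print(prediction)
```

Wrapping both steps in one pipeline keeps the vocabulary learned at training time tied to the classifier, so new text is vectorized the same way at prediction time.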
https://github.com/muscaanmnmnm/breast-cancer-detector
A predictive model for breast cancer detection using K-Nearest Neighbors, demonstrating the impact of feature scaling on model performance and recall.
breast-cancer-wisconsin data-science feature-scaling jupyter-notebook knn-classification machine-learning pandas-dataframe python-3 scikit-learn
Last synced: 06 Sep 2025
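One way to see the effect of feature scaling on K-Nearest Neighbors, as the description above mentions, is to compare recall with and without a scaler. The sketch below is an assumption-laden illustration rather than the repository's notebook: it uses the breast cancer dataset bundled with scikit-learn, and the split, n_neighbors=5, and StandardScaler are choices made here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# KNN on raw features: distances are dominated by large-valued columns.
unscaled = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# KNN after standardization: every feature contributes on a similar scale.
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scaled.fit(X_train, y_train)

# In this bundled dataset, target 0 is the malignant class.
recall_unscaled = recall_score(y_test, unscaled.predict(X_test), pos_label=0)
recall_scaled = recall_score(y_test, scaled.predict(X_test), pos_label=0)
print(f"recall without scaling: {recall_unscaled:.3f}")
print(f"recall with scaling:    {recall_scaled:.3f}")
```

Fitting the scaler inside the pipeline means it only sees the training split, which avoids leaking test-set statistics into the comparison.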
https://github.com/1adore1/deadlock-match-tracker-bot
Telegram bot for tracking real-time Deadlock matches for top 250 players of the leaderboard. Fetches match data and predicts winners using a machine learning model.
aiogram api deadlock optuna pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/vatshayan/pokemon-analysis
Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning
artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn
Last synced: 30 May 2026
https://github.com/vyjayanthipolapragada/kmeans_clustering_customer_analysis
Using the KMeans algorithm to analyse real customer datasets and draw valuable insights to boost business strategy
algorithms analysis customer-data jupyter-notebook kmeans-clustering machine-learning matplotlib pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/adrianmarino/knn-cf-rec-sys
Similarity CF based RecSys examples
python recommender-system scikit-learn
Last synced: 08 May 2026
https://github.com/ot-code/coca-cola-stock-prediction
This repo compares four predictive models—Linear Regression, ARIMA, XGBoost, and LSTM—to forecast Coca‑Cola FEMSA stock closing prices using Python and five years of historical data.
arima csv linear-regression lstm-neural-networks mae matplotlib mse numpy pandas python r2 scikit-learn seaborn tensorflow-keras xgboost
Last synced: 13 Apr 2026
https://github.com/nikhilakki/predicting-the-gender-of-the-riders-of-new-york-s-citi-bikes
Predicting the Gender of the riders of New York Citi Bikes (2015-2017)
data-science decision-trees feature-engineering machine-learning pandas python scikit-learn
Last synced: 13 Apr 2026
https://github.com/shruthin4/news-articles-classification
Classifying News Articles using Machine Learning and NLP techniques. Built an end-to-end text classification pipeline using TF-IDF vectorization and models like Logistic Regression and SVM. Includes exploratory data analysis, model evaluation, and deployment-ready artifacts.
data-analysis data-science logistic-regression machine-learning model news-classification nlp python scikit-learn svm tf-idf-vectorization
Last synced: 13 Apr 2026
https://github.com/grandechowhiskey/fcc-data_analysis-projects
A collection of projects completed as part of the FreeCodeCamp "Data Analysis with Python" certification. These projects cover statistical calculations, data visualization, and trend analysis using real-world datasets.
data-analysis data-visualization matplotlib pandas python3 scikit-learn seaborn
Last synced: 01 May 2026
https://github.com/arindal1/breast-cancer-detection
A simple Neural Network model to detect Breast Cancer.
machine-learning neaural-network scikit-learn tensorflow
Last synced: 13 Apr 2026
https://github.com/otuemre/emailphishingdetection
A real-time phishing email detection system using Machine Learning (SVM, Logistic Regression, Naive Bayes) with FastAPI backend and custom domain deployment.
cybersecurity fastapi huggingface machine-learning nlp real-time scikit-learn spam-detection svm-classifier tfidf-vectorizer
Last synced: 13 Apr 2026
https://github.com/NoName115/Bachelor-thesis
Bachelor thesis - Determination of Gun Type and Position in Image Scene
bachelor-thesis classification computer-vision fit gun keras machine-learning scikit-image scikit-learn vut
Last synced: 11 Mar 2025
https://github.com/dineshh912/analysis_stock_price_data
Experiment analysis of stock price data with python3
data-analysis data-visualization financial-data python3 scikit-learn stock-price-prediction
Last synced: 24 Apr 2026
https://github.com/veronsheva/global_food_wastage
Global Food Wastage Analysis
analysis data data-analitics pandas predictions python scikit-learn seaborn visualization
Last synced: 18 Apr 2026
https://github.com/nicolascoiado/nivel-mar
This project performs a detailed analysis of global mean sea level (GMSL), using a public dataset that covers historical measurements. The goal is to explore trends, calculate the average rate of rise, and visualize the data through charts.
google-colab jupyter-notebook matplotlib numpy pandas python python3 scikit-learn
Last synced: 11 Mar 2025
https://github.com/thinker84/real-time-stock-price-prediction-and-market-analysis-using-machine-learning
Real-time stock price prediction app using LSTM, Streamlit, and historical data (2010–2023). Forecasts next 10 days & visualizes trends.
data-science django lstm machine-learning numpy pandas pandas-datareader scikit-learn stock-market stock-price-prediction stooq streamlit yahoo-finance yahoo-finance-api
Last synced: 13 Jul 2025
https://github.com/hetuvpatel/brain-stroke-prediction
Machine Learning project for predicting stroke risk using healthcare data. Includes EDA, preprocessing, SMOTE, feature selection (RFE), evaluation of Logistic Regression, Decision Tree, Random Forest, KNN, SVM, and Stacked Ensemble models.
data-mining ensemble-learning healthcare machine-learning predictive-modeling python rfe scikit-learn smote
Last synced: 17 May 2026
https://github.com/omar7001-b/data-miner
DataMiner is an interactive web application for data mining and machine learning. It helps users upload, clean, transform, and analyze datasets while building predictive models — all through a simple and powerful Streamlit interface.
data-cleaning data-mining data-preprocessing data-science data-visualization interactive-dashboards pandas python scikit-learn streamlit
Last synced: 28 Apr 2025
https://github.com/pksvv/machinelearning_svm
Various implementations of Support Vector Machine Algo
machine-learning python scikit-learn support-vector-machine
Last synced: 04 May 2026
https://github.com/nirmaldeepponnada/codeclauseinternshipproject2
Python, NLTK, Scikit-Learn, Pandas, NumPy, Pickle, SciPy, and JSON are used for text preprocessing, feature engineering, multi-label classification, and model persistence.
nltk numpy pandas pickle python scikit-learn scipy
Last synced: 07 Apr 2026
https://github.com/mastermindromii/car-price-prediction-model
A regression project that predicts car prices using Linear Regression.
linear-regression matplotlib numpy pandas python scikit-learn seaborn
Last synced: 13 Apr 2026
https://github.com/dmarks84/coursework_project_network-analysis-node-link-prediction
Project for University of Michigan Applied Data Science Specialization -- Analyzed network nodes and edges, developing custom features based on various scoring metrics; used features to train classifier model to predict node attribute (employee salary type) and future edges (employee connections)
classification cross-validation data-reporting databases eda grid-search matplotlib network-analysis numpy pandas python scikit-learn statistics supervised-ml visualization
Last synced: 13 Apr 2026
https://github.com/dharma-acha/imageclassification
This project is an interactive Streamlit web application using the VGG-13 model to classify images from the CIFAR-10 dataset. Users can upload images to receive real-time predictions and visual explanations of the model's decisions. The goal is to accurately classify images into one of the ten CIFAR-10 classes: airplanes, automobiles, birds, cats,
colab-notebook matplotlib numpy pandas python3 pytorch scikit-learn seaborn streamlit
Last synced: 13 Apr 2026
https://github.com/vishant007/annadataa
A Website For Farmers To Guide Them Regarding Crop Production In Their Native Language
django flask-application google-collab kaggle machine-learning-algorithms numpy pandas python3 scikit-learn
Last synced: 13 Apr 2026
https://github.com/farrajota/kaggle_house_prices
Kaggle's house prices competition
docker jupyter-notebook kaggle kaggle-competition kaggle-house-prices notebook pyspark python scikit-learn
Last synced: 12 Apr 2026
https://github.com/aml-hassan-abd-el-hamid/finding-donors-for-charityml
Predicting salary of the people based on various data about them
machine-learning python scikit-learn supervised-learning udacity-machine-learning-nanodegree
Last synced: 08 May 2026
https://github.com/ozcankyo28/ds-ml-bootcamp
📊 Master data science and machine learning in one month with hands-on projects, covering the complete ML workflow from data collection to deployment.
data-science datascience jose-portilla lgbm lgbmregressor machine-learning matplotlib-pyplot python regression-models scikit-learn seaborn tensorflow udemy-course-project udemy-machine-learning
Last synced: 14 Apr 2026
https://github.com/codecraft-sanju/medvisionai-medical-image-ai-vision.
MedVisionAI is an AI-powered platform that analyzes ultrasound images to detect PCOS and provide actionable recommendations. Using CNN-based deep learning and generative AI, it ensures fast, accurate diagnosis, reduces errors, and supports clinicians with instant insights all while maintaining patient privacy and compliance.
deep-learning fastapi gemini-api genai keras-tensorflow machine-learning matplotlib python react scikit-learn seaborn tailwindcss tensorflow
Last synced: 07 Sep 2025
https://github.com/hilalozdemirbuyukasik/deep-learning
A collection of deep learning projects demonstrating RNNs, BiLSTMs, CNNs, and basic neural networks applied to time series forecasting, text sentiment analysis, image classification, and tabular data tasks, with examples of data preprocessing, model training, evaluation, and visualization.
bilstm cnn keras matplotlib nn numpy rnn scikit-learn tensorflow
Last synced: 12 Apr 2026
https://github.com/cesar312/python-data-science-toolbox
A collection of useful data science tools and techniques
data-science jupyter-notebook pandas python scikit-learn statistics visualization
Last synced: 13 Apr 2026
https://github.com/djb15/machine-learning-project
Machine learning project 2018 - Imperial College London
machine-learning project python3 scikit-learn scikitlearn-machine-learning university university-project
Last synced: 27 Apr 2026
https://github.com/nurulashraf/telco-customer-churn-prediction-model
This repository contains a Telco Customer Churn Prediction project using machine learning. It includes data preprocessing, exploratory data analysis, feature engineering, and model development to predict customer churn. Key tools used are Python, Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn.
churn-prediction classification-model customer-churn data-visualization exploratory-data-analysis machine-learning predictive-analytics python scikit-learn
Last synced: 16 Mar 2025
https://github.com/wilfordaf/ml-sect-introduction-task
Test task for a student association
classic-machine-learning keras machine-learning regression-models scikit-learn
Last synced: 28 Feb 2025
https://github.com/mmerlyn/analysis-of-tomato-prices
Forecasting tomato prices in Karnataka using machine learning to help farmers make better crop planning and selling decisions.
css flask html matplotlib numpy pandas python scikit-learn seaborn
Last synced: 06 Jul 2025
https://github.com/lukacerr/lovelytics
Lovelytics technical task for AI engineer position
ai-agents deepagents langchain ml python scikit-learn
Last synced: 31 May 2026
https://github.com/yanne0800/lung_cancer_prediction
This project predicts lung cancer risks using machine learning models like Random Forest, Logistic Regression, and SVM. It analyzes patient data with features such as age, smoking habits, and symptoms. Data preprocessing, visualization, and performance evaluation ensure accurate predictions for early diagnosis.
algorithm classification cnn decision-tree-classifier decision-trees deep-learning gradientboosting keras lung-cancer medical-image-processing navies-bayes-classifer neuralnetworks python scikit-learn
Last synced: 05 May 2026
https://github.com/okerx/spotifymoods
A simple ML model to classify Spotify tracks using audio features.
machine-learning pandas python scikit-learn
Last synced: 09 May 2026
https://github.com/lingumd/cryptocurrencies
Unsupervised machine learning models used to group the cryptocurrencies to help prepare for a new investment.
concatenate elbow-curves get-dummies hvplot jupyterlab kmeans matplotlib-pyplot minmaxscaler pandas path pca-analysis plotly-express scikit-learn unsupervised-machine-learning
Last synced: 13 Apr 2026
https://github.com/sasank-sasi/subtheme-sentiment-analysis-for-review
"Comprehensive Subtheme Sentiment Analysis of Customer Reviews Using Advanced NLP Techniques"
matplotlib natural-language-processing nltk plotly python scikit-learn spacy vader-sentiment-analysis
Last synced: 05 Feb 2026
https://github.com/anirudh-pulavarthy/car-evaluation-using-smote
machine-learning python scikit-learn smote-sampling
Last synced: 24 Apr 2026
https://github.com/subratamondal1/heart-attack-prediction
Heart Attack Prediction of patients based on the required data. Data Ingestion - Data Preparation - Exploratory Data Analysis (EDA) - Modelling - Evaluation.
data-analysis data-science data-visualization kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python3 scikit-learn seaborn
Last synced: 09 Apr 2026
https://github.com/duruii/contest-dingtalkcup2-a
2nd "DingTalk Cup" College Student Big Data Challenge (2023): analysis of smartphone user monitoring data
data-mining machine-learning pandas scikit-learn xgboost
Last synced: 12 Mar 2025
https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer
Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics like classification report showcase the effectiveness of the optimized model.
breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm
Last synced: 05 Feb 2026
https://github.com/hotequil/computer-vision
Study about computer vision.
jupyter-notebook matplotlib numpy python scikit-learn
Last synced: 13 Apr 2026
https://github.com/javi-cc/python-ml-portcanto
Portcanto is a project that simulates a bicycle route. Four types of cyclists are defined, distinguished by the time they take to complete the route. The goal is to discover the 4 patterns with the KMeans clustering algorithm.
clustering docker docker-compose kmeans machine-learning mlfow pydoc pylint python scikit-learn testing venv
Last synced: 13 Apr 2026
https://github.com/oceanuz/car-price-regression
A comprehensive ML evaluation and improvement notebook for a car price prediction model. It covers scoring with r2, cross-validation, overfitting/underfitting diagnosis, and polynomial regression. Ridge regression is applied to reduce overfitting, and GridSearchCV is used to find the best alpha hyperparameter.
cross-validation data-science grid-search hyperparameter-tuning machine-learning machine-learning-models model-evaluation overfitting python regression ridge-regression scikit-learn
Last synced: 11 Dec 2025
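The combination named in the description above (polynomial features, Ridge regression, and GridSearchCV over alpha) might look roughly like the sketch below. It is not the repository's notebook: the data is synthetic rather than car prices, and the polynomial degree, alpha grid, and 5-fold setting are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic one-feature regression problem (hypothetical stand-in data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(scale=0.5, size=200)

# A deliberately flexible degree-5 expansion, regularized by Ridge.
pipeline = make_pipeline(
    PolynomialFeatures(degree=5, include_bias=False),
    StandardScaler(),
    Ridge(),
)

# make_pipeline names each step after its lowercased class name,
# so the Ridge penalty is addressed as "ridge__alpha".
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
print("best alpha:", best_alpha)
print("mean cross-validated r2:", round(search.best_score_, 3))
```

Searching alpha with cross-validation rather than a single train/test split gives a less noisy estimate of how much regularization the model needs.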
https://github.com/18mahi/digital_cave
An intermediate-level deep learning project that compares Convolutional Neural Networks (CNN) and Multi-Layer Perceptrons (MLP) on the MNIST handwritten digits dataset. This project demonstrates data augmentation, learning rate scheduling, and visual comparison of model performance
cnn confusion-matrix data-augmentation data-science deep-learning evaluation-metrics jupyter-notebook keras learning-rate-scheduler machine-learning matplotlib mlp numpy python3 scikit-learn seaborn tensorflow
Last synced: 13 Apr 2026
https://github.com/pranavsp108/time-series-forcasting
A time-series forecasting project to predict hourly energy consumption using Python, Pandas, and an XGBoost regression model.
data-analysis data-science energy-consumption forecasting matplotlib numpy pandas python scikit-learn sustainability time-series xgboost
Last synced: 10 Apr 2026
https://github.com/ahmadbuilds/fake-news-classifier
Classifies news articles as real or fake using an NLP pipeline with TF-IDF + n-grams and machine learning models. Includes text preprocessing, feature engineering, model training, and evaluation.
fastapi logistic-regression matplotlib n-grams nextjs nltk numpy pandas python3 random-forest-classifier react scikit-learn seaborn supervised-learning tf-idf typescript xgboost-classifier
Last synced: 11 Apr 2026
https://github.com/pranavsp108/financial-fraud-detection
A comprehensive machine learning project for detecting financial fraud using XGBoost and LightGBM, with a focus on advanced feature engineering, class imbalance handling, and hyperparameter tuning.
classification-model data-science feature-engineering fraud-detection hyperparameter-tuning lightgbm machine-learning pandas python scikit-learn xgboost
Last synced: 04 May 2026
https://github.com/ivanswetz/banana_shelf-life_prediction
Goal: Predict how many days a banana has left before spoiling (“days to death”) based on a photo. This project demonstrates an end-to-end machine learning pipeline: image preprocessing, feature extraction, supervised & semi-supervised learning, and model deployment.
image-processing machine-learning opencv python random-forest scikit-learn supervised-learning
Last synced: 04 May 2026
https://github.com/imehranasgari/mlflow_starter
This project is a hands-on guide to the complete end-to-end MLflow workflow, designed as an educational resource. It demonstrates how MLflow is used in practice for experiment tracking, model versioning, and ensuring a reproducible MLOps lifecycle, focusing on the methodology and best practices rather than high model accuracy.
data-science experiment-tracking mlflow mlops model-registry python scikit-learn
Last synced: 11 May 2026
https://github.com/njorogepaul-moghul/house-price-predictions-kaggle-competition-
Built a predictive model for the Kaggle House Prices competition using feature engineering and LightGBM, achieving strong leaderboard performance.
data-science house-price-prediction-with-lightgbm kaggle-competition lightgbm machine-learning predicting-home-values-using-machine-learning random-forest scikit-learn
Last synced: 15 May 2026
https://github.com/ytalk/deep-learning
A repository dedicated to my journey of learning and experimenting with Deep Learning. It contains various pipelines and implementations on different datasets, exploring models (MLPs, LSTMs, CNNs) and techniques (Regression, Classification, etc.) with a focus on TensorFlow and Keras.
data-science deep-learning keras machine-learning neural-networks pandas python scikit-learn tensorflow
Last synced: 30 Dec 2025
https://github.com/abhay-rudatala/resume-analyzer
Intelligent Resume Analysis System using Machine Learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), SpaCy NER for information extraction, and interactive Streamlit web app with custom UI. Built with Python, Scikit-learn, and deployed on Streamlit Cloud.
classification machine-learning named-entity-recognition nlp portfolio-project python resume-analysis scikit-learn spacy streamlit
Last synced: 06 May 2026
https://github.com/manishrajmss13/regression_project
A predictive machine learning model to forecast the Algerian Forest Fire FWI using Python, Scikit-learn, and Statsmodels. Includes complete data cleaning and EDA.
data-cleaning-and-preprocessing data-science eda feature-engineering learning-by-doing linear-regression machine-learning python regression scikit-learn statsmodel
Last synced: 09 May 2026
https://github.com/smusab9152/bpm_pred_songs
ML project to predict the Beats Per Minute (BPM) of a song using various audio features. This is a submission for the Kaggle Playground Series (S04E02). The notebook covers a full data science workflow, including EDA, handling skewed data with log transformations, feature scaling, and building various regressions
data-science jupyter-notebook kaggle-competition machine-learning pandas regression scikit-learn
Last synced: 11 May 2026
https://github.com/jobanjps089/mental_wellness
This project is a Flask web application that predicts mental wellness levels based on lifestyle factors such as screen time, sleep hours, and work-related screen exposure. It uses a Machine Learning model trained in Google Colab and deployed via Hugging Face Spaces for public access.
flask joblib puthon3 scikit-learn
Last synced: 16 May 2026
https://github.com/kostadinlambov/time-series-forecasting
This project evaluates the predictive performance of a CNN-LSTM Hybrid deep learning model for Bitcoin price movement prediction.
keras-tensorflow matplotlib-pyplot mlflow numpy optuna pandas python scikit-learn seaborn statsmodels ta-lib tensorflow
Last synced: 07 Apr 2026
https://github.com/chirindaopensource/strapsim_portfolio_similarity_metric
End-to-End Python implementation of STRAPSim: a novel portfolio similarity metric from Li et al. (2025). Combines Random Forest proximity learning with residual-aware bipartite matching to quantify economic substitutability between ETF baskets. Full replication pipeline included.
asset-management bipartite-matching corporate-bonds etf-analysis fixed-income jupyter-notebook machine-learning numba pandas portfolio-optimization portfolio-similarity proximity-matrix python quantitative-finance random-forest research-replication scikit-learn similarity-metrics statistical-analysis supervised-learning
Last synced: 28 Apr 2026
https://github.com/javedfazlulahf/customer-churn-prediction
📊 Predict customer churn in telecom using machine learning to enhance retention strategies and drive better business outcomes.
churn-prediction cross-validation data-science factorization-machines imbalanced-learn libsvm machine-learning model-evaluation pipelines plotly scikit-learn seaborn shap-values spark-ml survival-analysis tensorflow watson-studio xgboost4j
Last synced: 11 May 2026
https://github.com/affan005-ai/tesla-stock-prediction
This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models
data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn
Last synced: 05 Oct 2025
https://github.com/kianaabrisham/stroke-prediction-ml-pipeline
Clinical ML pipeline with ROC/PR and interpretability
class-imbalance clinical-data healthcare interpretability machine-learning pandas pipeline precision-recall roc-auc scikit-learn
Last synced: 05 Oct 2025
https://github.com/blue-catblues/tieba-integratedanalysis
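Python final course project: a comprehensive data analysis of Baidu Tieba, covering web scraping (scrapy), statistical analysis (pandas), visualization (matplotlib), and machine learning classification (scikit-learn).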
matplotlib nlp-machine-learning pandas python scikit-learn scrapy seaborn
Last synced: 05 Oct 2025
https://github.com/nihanthbhargav/time-series-stock-market
This project combines computer vision and NLP by segmenting pet images with a U-Net model and generating captions using CNN-RNN/LSTM. Using the Oxford-IIIT Pets dataset, it demonstrates a unified pipeline that integrates pixel-level segmentation with automatic caption generation for meaningful image understanding.
matplotlib numpy pandas plotly python scikit-learn seaborn
Last synced: 11 Apr 2026
https://github.com/disney35/stock-prices-dashboard
A dashboard to analyze, predict, and visualize stock prices using Python & LSTM
ema jupyter-notebook keras macd matplotlib-pyplot mfi numpy pandas python rsi scikit-learn sma streamlit tenserflow yfinance
Last synced: 12 Apr 2026
https://github.com/khaifara/klafisikasi_jeruk_faiz_kece
Step-by-step machine learning classification with StandardScaler, OneHotEncoder, OrdinalEncoder, ColumnTransformer, Pipeline, classification report, and confusion matrix, plus deployment using Streamlit.
machine-learning scikit-learn streamlit
Last synced: 05 Oct 2025
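The preprocessing-plus-classifier workflow named in the klafisikasi_jeruk_faiz_kece entry above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the column names, the synthetic data, the label rule, and the choice of LogisticRegression are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Synthetic, hypothetical fruit data: every column name and value is invented.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "diameter_cm": rng.normal(7.0, 1.0, n),
    "weight_g": rng.normal(150.0, 20.0, n),
    "color": rng.choice(["green", "yellow", "orange"], n),
    "skin": rng.choice(["smooth", "medium", "rough"], n),
})
# Hypothetical label rule: larger, heavier fruit is labelled "good".
df["quality"] = np.where(df["diameter_cm"] + df["weight_g"] / 50 > 10, "good", "poor")

# Scale numeric columns, one-hot encode the nominal column,
# and ordinal-encode the column whose categories have a natural order.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["diameter_cm", "weight_g"]),
    ("nominal", OneHotEncoder(handle_unknown="ignore"), ["color"]),
    ("ordinal", OrdinalEncoder(categories=[["smooth", "medium", "rough"]]), ["skin"]),
])

# Chaining preprocessing and the classifier keeps the test split out of fitting.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df.drop(columns="quality")
y = df["quality"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred, labels=["good", "poor"])
report = classification_report(y_test, y_pred)
print(cm)
print(report)
```

Because the fitted Pipeline holds both the transformers and the classifier, it can be saved as a single object and loaded by a serving app such as Streamlit.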
https://github.com/veerchaudhary0708/credit-fraud-detection
An end-to-end machine learning project to detect credit fraud using XGBoost.
datascience fintech fraud-detection machinelearning scikit-learn xgboost
Last synced: 18 May 2026
https://github.com/inesruizblach/data-science-project
A data science project exploring Portuguese "Vinho Verde" wine quality prediction. Features EDA, feature engineering, ML models, and evaluation using Python, pandas, scikit-learn, and visualization tools.
binary-classification classification data-science exploratory-data-analysis feature-engineering imbalanced-learn jupyter-notebook machine-learning model-evaluation pandas regression scikit-learn seaborn uci-dataset wine-quality
Last synced: 09 May 2026
https://github.com/kianaabrisham/naive-bayes-sentiment
Sentiment classification using Multinomial NB (scratch + sklearn)
bag-of-words naive-bayes nlp scikit-learn sentiment-analysis text-classification
Last synced: 14 May 2026
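The "scratch + sklearn" pairing in the naive-bayes-sentiment entry above can be illustrated with a small sketch. It is not the repository's code: the four-sentence corpus and the helper names fit_multinomial_nb and predict_multinomial_nb are invented here, and the scratch version assumes Laplace smoothing with alpha = 1 so that it matches scikit-learn's default.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented corpus; 1 = positive, 0 = negative.
texts = [
    "great movie loved it",
    "loved the acting great fun",
    "terrible plot boring movie",
    "boring and terrible acting",
]
labels = np.array([1, 1, 0, 0])

# Bag-of-words: each row counts how often each vocabulary word occurs.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts).toarray()


def fit_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed per-class log word probabilities."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + alpha
    log_likelihood = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_likelihood


def predict_multinomial_nb(X, classes, log_prior, log_likelihood):
    """Pick the class with the highest log posterior (up to a constant)."""
    scores = X @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]


classes, log_prior, log_likelihood = fit_multinomial_nb(X, labels)

X_new = vectorizer.transform(["great fun movie", "boring terrible plot"]).toarray()
scratch_pred = predict_multinomial_nb(X_new, classes, log_prior, log_likelihood)

# Reference implementation with the same smoothing parameter.
sklearn_pred = MultinomialNB(alpha=1.0).fit(X, labels).predict(X_new)

print(scratch_pred, sklearn_pred)
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together, which is why both versions sum log probabilities instead of multiplying raw ones.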
https://github.com/scorchinghot/core-machine-learning-exploration
This repository provides a hands-on exploration of classical machine learning algorithms applied to the MovieLens 100k dataset, aiming to build intuition and understanding of core ML concepts.
core-ml data-science hands-on machine-learning ml-algorithms python scikit-learn tutorial
Last synced: 05 Oct 2025
https://github.com/vedanty3/bulldozer-price-prediction
A machine learning project that builds a model to predict the sale price of bulldozers.
andrew-ng-machine-learning ensemble-machine-learning gridsearchcv jupyter-notebook machine-learning matplotlib numpy pandas python randomforestregressor randomizedsearchcv scikit-learn ztm
Last synced: 05 Apr 2026