An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/sumdiboii/loan-prediction-decision-trees

A Decision Tree Classifier was implemented to predict personal loan acceptance using a dataset of 5,000 customers. Key features included income, education, mortgage, and credit card usage. The model achieved 97% accuracy, with 92% precision and 76% recall for positive loan predictions, validated using a classification report and confusion matrix.

classification data-visualisation decision-trees loan-prediction machine-learning python scikit-learn supervised-learning

Last synced: 07 May 2026

https://github.com/nicovandenhooff/wids-datathon-2022

This repository contains my solution to the 2022 Women in Data Science Kaggle competition, which earned a top 10% leaderboard standing.

catboost data-visualization datascience energy-consumption ensemble-learning exploratory-data-analysis kaggle lightgbm machine-learning scikit-learn women-in-data-science xgboost

Last synced: 07 May 2026

https://github.com/alphacrypto246/titanic-survival

This project leverages machine learning techniques to predict passenger survival in the Titanic disaster using the Kaggle Titanic dataset. It includes data preprocessing, exploratory data analysis (EDA), and model building with algorithms like Logistic Regression and Random Forests to achieve reliable predictions.

logistic-regression machine-learning machine-learning-algorithms python scikit-learn scikitlearn-machine-learning

Last synced: 07 May 2026

https://github.com/andrewsy1004/linear-regression-model-for-house-price-prediction

A linear regression model to predict house prices based on features like size, location, and number of rooms. This project demonstrates the application of machine learning in real estate price estimation.

linear-regression python scikit-learn xgbregressor

Last synced: 07 May 2026

https://github.com/dynle/2020f-ml

2020F Keio University - Machine Learning Laboratory

machine-learning python scikit-learn

Last synced: 07 May 2026

https://github.com/tedim52/discjockey

a content-based recommender system for your party playlist preferences

jupyter-notebook matplotlib pandas scikit-learn spotify-web-api

Last synced: 07 May 2026

https://github.com/cnoret/hexa-watts

Interactive data visualization and machine learning app for energy consumption analysis and prediction in France, built with Streamlit. (Text in French)

data-visualization electricity-forecasting energy-analysis france machine-learning scikit-learn streamlit

Last synced: 07 May 2026

https://github.com/mark-mdo47/family-machine-learning-project-2017

We are doing a two-part machine learning project this summer with scikit-learn and Keras/TensorFlow.

machine-learning python scikit-learn tensorflow

Last synced: 07 May 2026

https://github.com/moustafamohamed01/mall-customer-segmentation-data

Customer segmentation using K-Means clustering based on annual income and spending score.

data-science data-visualization k-means-clustering machine-learning python scikit-learn unsupervised-learning

Last synced: 08 May 2026

https://github.com/aravindnathan02/machine-learning-projects

Machine Learning and Deep Learning projects that mainly focus on predictive modeling.

deep-learning machine-learning neural-networks predictive-modeling python scikit-learn tensorflow

Last synced: 08 May 2026

https://github.com/prajjwal6969/recommender-system-using-python

A collection of content-based recommendation systems for songs and movies using Python and machine learning.

content-based-filtering cosine-similarity machine-learning movie-recommendation python recommender-system scikit-learn song-recommendation

Last synced: 08 May 2026

https://github.com/jatin-mehra119/churn_modeling

This repository is dedicated to predicting customer churn using machine learning techniques. It includes comprehensive scripts for data preprocessing, model training, and evaluation, along with detailed visualizations and insights.

classification-model datavisualization pandas scikit-learn

Last synced: 08 May 2026

https://github.com/deepanshkhurana/udacityproject-prediciting-boston-housing-prices

This is a Udacity Project for the Machine Learning Nanodegree. Here, we are trying to predict Boston Housing Prices using sklearn.

data-analysis data-science machine-learning python scikit-learn udacity

Last synced: 08 May 2026

https://github.com/gregoritsch3/dl_cv_e2e_potatodiseaseclassification

A guided CodeBasics Deep Learning Project where a Convolutional Model is deployed onto a Website (FastAPI) and Mobile App (React Native, Google Cloud). Its purpose is the classification of potato plant images into "healthy", "Early Blight" and "Late Blight" categories.

cnn-classification gcp model-deployment scikit-learn tensorflow

Last synced: 08 May 2026

https://github.com/oriolventur/assignment-2-model-creation

Assignment 2 from Artificial Intelligence 1 course: Model creation using synthetic data and scikit-learn.

jupyter-notebook model-creation python scikit-learn

Last synced: 08 May 2026

https://github.com/labex-labs/supervised-learning-regression

Supervised Learning: Regression | This repo collects 7 programming lab exercises for Supervised Learning: Regression. Supervised learning. If you are hearing or reading this term for the first time, then it may be completely unclear what it means. Don't worry. In this lab, you will get a comp...

challenges course exercises hands-on labex labs machine-learning playgroud programming scikit-learn

Last synced: 08 May 2026

https://github.com/shingiraibhengesa/house-price-predictor

A machine learning project that predicts house prices based on user input features such as square footage, number of bedrooms, and more.

machine-learning-models matplotlib numpy python scikit-learn seaborn

Last synced: 09 May 2026

https://github.com/vijaykumarr1452/customer-churn-prediction

Analysis of a telecom company's data, with insights gained to reduce customer churn.

anaconda jupyter-notebook machine-learning pandas prediction scikit-learn

Last synced: 09 May 2026

https://github.com/ahmed122000/ml_model_deployment

The HR Analytics: Job Change Predictor is a Flask-based web application that uses machine learning to predict whether an employee will stay with a company or leave. It allows users to train models, evaluate their performance, and make predictions based on employee data, providing valuable insights for HR decision-making.

classification flask machine-learning python3 rest-api scikit-learn

Last synced: 09 May 2026

https://github.com/santiagoasp98/spam-detection

SMS spam detection using Logistic Regression and Multinomial Naive Bayes.

classification logistic-regression machine-learning multinomial-naive-bayes python scikit-learn spam-detection

Last synced: 09 May 2026

https://github.com/l1ght14/customer-churn-prediction

Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.

churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom

Last synced: 09 May 2026

https://github.com/mayankanand007/yfraud

Credit card fraud detection platform using scikit-learn and xgboost 💳

knearest-neighbor-algorithm linear-regression machine-learning predictive-analytics python3 scikit-learn svm xgboost

Last synced: 09 May 2026

https://github.com/akwardhan/loan-default-prediction-xgboost-streamlit

Full-scale loan default prediction system using XGBoost, trained on 1.3M LendingClub loans. Includes feature-rich preprocessing, class imbalance handling, recall-focused ML pipeline, and Streamlit web deployment for real-time borrower risk scoring.

credit-risk data-science google-colab loan-default-prediction machine-learning python real-world-project scikit-learn streamlit xgboost

Last synced: 09 May 2026

https://github.com/otuemre/viginids

VigiNIDS: A machine learning-based system for detecting malicious network traffic using the UNSW-NB15 dataset. It distinguishes between normal and attack activities, providing a data-driven approach to network security.

classification cybersecurity intrusion-detection-system machine-learning network-intrusion-detection python scikit-learn unsw-nb15 xgboost

Last synced: 09 May 2026

https://github.com/mpolinowski/multi-dimensional-scaling

Multidimensional Scaling is a family of statistical methods that focus on creating mappings of items based on distance.

matplotlib-pyplot multi-dimensional-scaling python scikit-learn

Last synced: 09 May 2026

https://github.com/saahilanande/naivebayes

Implementing a Naive Bayes classifier from scratch for sentiment analysis of the IMDB dataset

machine-learning naive-bayes-classifier python-3 scikit-learn

Last synced: 09 May 2026

https://github.com/thanh12273203/hotel-booking-cancellation-prediction

Binary classification on hotel booking cancellations.

classification machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/adadalshabab/human-stress-analysis-greadsearch-classifier

The project leverages data from physiological signals, self-reported surveys, behavioral observations, or other relevant sources to infer and analyze stress levels.

classification knn-classification machine-learning machine-learning-algorithms matplotlib pandas scikit-learn

Last synced: 09 May 2026

https://github.com/rajan-bhateja/aqi-predictor

Different models trained on Indian cities to predict AQI

machine-learning-algorithms model-comparison neural-networks scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/samuelson777/iris-flower-classification

Iris Flower Classification: A machine learning project that classifies iris flowers into three species based on sepal and petal dimensions. Includes data exploration, visualization, and model evaluation using Python and scikit-learn.

classification data-science data-visualization iris-dataset jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/suvasish114/house-price-estimation

A machine learning model that estimates housing prices in California using California census data

jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/vivprime/diabetes-prediction-system

MERISKILL INTERNSHIP: To predict whether an individual has diabetes or not

django html scikit-learn

Last synced: 09 May 2026

https://github.com/laavanjan/real_estate_price_prediction

This project predicts the house price per unit area based on various real estate features using a Linear Regression model. The application is built with Dash, a Python framework for building interactive web apps.

dash linear-regression pandas scikit-learn

Last synced: 10 May 2026

https://github.com/macdon112/credit-card-fraud-detection

Comparing ML models (Random Forest, KNN, Decision Tree) for credit card fraud detection using SMOTE and stratified cross-validation.

classification data-analysis fraud-detection imbalanced-data machine-learning python scikit-learn

Last synced: 10 May 2026

https://github.com/hassanislam463/nyc_airbnb_eda

This project is a comprehensive data analysis of Airbnb listings in New York City, exploring pricing trends, seasonality effects, host market dynamics, rental preferences, and revenue estimation. It provides valuable insights for hosts, investors, and policymakers to optimize Airbnb operations and understand the short-term rental landscape in NYC.

exploratory-data-analysis matplotlib python scikit-learn seaborn

Last synced: 10 May 2026

https://github.com/ejw-data/ml-classification-credit-risk

Compares several machine learning classification models to determine whether to approve or reject a loan request

classification python scikit-learn

Last synced: 10 May 2026

https://github.com/i30101/mathworks2024

Coding tools for 2024 MathWorks Math Modeling Challenge

machine-learning mathematical-modelling python scikit-learn

Last synced: 10 Jun 2026

https://github.com/alphacrypto246/student-learning-style-prediction

An interactive web application built with Streamlit that predicts a student's preferred learning style (visual, auditory, or kinesthetic) using machine learning, aiding educators in personalizing teaching strategies.

machine-learning scikit-learn scikitlearn-machine-learning streamlit

Last synced: 11 May 2026

https://github.com/mpolinowski/tstochastic-neighbor-embedding

Improve Data Quality by discarding non-correlating, noisy Dimensions

matplotlib-pyplot python scikit-learn t-sne

Last synced: 11 May 2026

https://github.com/monarch1108/customerinsights-kmeans

Understanding customers using KMeans and RFM (recency, frequency, and monetary) analysis

data-analysis data-visualization kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn

Last synced: 11 May 2026

https://github.com/theladev/machine-learning

This repository focuses on showing you my personal projects and interests in Machine Learning and Data Science. Hope you enjoy it.

data-science machine-learning machine-learning-models pandas python scikit-learn

Last synced: 11 May 2026

https://github.com/matheusadc/valorizai

A project that aims to predict house prices.

jupyter-notebook pandas scikit-learn

Last synced: 11 May 2026

https://github.com/johannesvc/data-science-portfolio

A curated portfolio of applied data science projects focused on machine learning, NLP, and social impact.

academic-portfolio data-science deep-learning keras machine-learning media-bias nlp pandas scikit-learn

Last synced: 11 May 2026

https://github.com/deaneeth/churn-prediction-model-training

Step-by-step guide to building machine learning models for customer churn prediction, continuing from the data preprocessing phase. The repo covers training, evaluation, and saving of models, with weekly updates.

churn-prediction data-science-projects jupyter-notebook machine-learning model-evaluation model-training model-training-and-evaluation python scikit-learn

Last synced: 11 May 2026

https://github.com/ananyagubba/bike-sharing-demand-prediction

Using machine learning techniques, the model learns from features such as weather conditions, time of day, season, and holiday information to forecast hourly or daily demand.

machine-learning python scikit-learn seaborn

Last synced: 11 May 2026

https://github.com/sharvesh1401/inverse-design-patch-antenna

A machine learning approach to the inverse design of microstrip patch antennas by predicting optimal physical dimensions from desired performance metrics.

antenna-design deep-learning engineering-project gradio jupyter-notebook machine-learning patch-antenna python regression-model scikit-learn

Last synced: 11 May 2026

https://github.com/rajireddy15/student_grade_pred

A machine learning project to predict student final grades using academic and demographic data. Built with pandas, scikit-learn, and visualized with seaborn and matplotlib to gain insights and support early intervention for students.

academic-insights data-science eda education-analytics grade-prediction machine-learning ml-project pandas regression-models scikit-learn student-performance-analysis

Last synced: 11 May 2026

https://github.com/cptanalatriste/copycat-detector

A Naive-Bayes classifier for detecting plagiarism.

amazon-sagemaker naive-bayes-classifier scikit-learn

Last synced: 12 May 2026

https://github.com/shubhamkarampure/asl-streamlit-signlingo

Streamlit-based web app for teaching sign language through real-time hand gesture recognition.

learning-exercise mediapipe opencv-python python scikit-learn sign-language streamlit-webapp

Last synced: 12 May 2026

https://github.com/xunchiasg/nyc_property_sales

Exploratory Data Analysis of rolling property sales data in NYC from March 2023-2025

matplotlib-pyplot plotly python scikit-learn

Last synced: 12 May 2026

https://github.com/g-eoj/kaggle-rotten-tomatoes

Movie review sentiment analysis with the Stanford parsed Rotten Tomatoes dataset.

cross-validation nlp nltk rotten-tomatoes scikit-learn

Last synced: 12 May 2026

https://github.com/mateusoliveira30/house-prices

This project was developed for the Kaggle competition "House Prices - Advanced Regression Techniques." The goal is to predict house sale prices using advanced regression techniques, including feature engineering, Random Forests, and Gradient Boosting.

kaggle-competition machine-learning scikit-learn

Last synced: 13 May 2026

https://github.com/johanneswiesner/skplot

A Python package for extracting, plotting and reporting information from one or multiple sklearn classification & prediction pipelines.

plotting python scikit-learn sklearn visualization

Last synced: 14 May 2026

https://github.com/janek1842/mlbyjan-sandbox

Testbed for private ML investigations

ml scikit-learn

Last synced: 14 May 2026

https://github.com/breezy-codes/machine-learning-for-spam-sms

Real-time SMS spam detection using ML models in simulated cellular networks. Compares 4 algorithms with comprehensive performance analysis.

logistic-regression machine-learning naive-bayes network-simulation random-forest research scikit-learn spam-sms spam-sms-detection svm telecommunication

Last synced: 14 May 2026

https://github.com/fulviofavilla/cvd-prediction-ml

Comparative ML analysis for CVD prediction. Winner of the 2023 HPCC Systems Poster Competition.

data-science ecl healthcare hpcc-systems machine-learning pandas python scikit-learn

Last synced: 11 Jun 2026

https://github.com/arjunan-k/medical_insurance

Project to analyze and forecast medical insurance costs of patients using data science framework.

medical-insurance scikit-learn tableau

Last synced: 12 Jun 2026

https://github.com/nayutalienx/osu-skill-predictor

ML-powered osu! pass probability & accuracy predictor with real-time overlay. Standalone Windows bundle available.

fastapi machine-learning osu overlay predictor scikit-learn

Last synced: 14 Jun 2026

https://github.com/tomdewildt/interactive-and-explainable-ai-design

Code for The Interactive And Explainable AI Design course of my master's degree

jupyter lime numpy pandas python scikit-learn shap

Last synced: 18 Jun 2026

https://github.com/royxlead/production-drift-detection

Production ML monitoring library - KL, PSI, MMD, and ADWIN drift detectors with empirical benchmarks, confidence tracking, and a 6-page FastAPI dashboard.

data-drift drift-detection fastapi kl-divergence mlops mmd model-monitoring production-ml psi pytorch scikit-learn uncertainty-quantification

Last synced: 23 Jun 2026

https://github.com/josepablodmg/python--linear-regression---housing-exercise

A predictive analysis exploring the relationship between household characteristics and median income in California. Using linear regression, the project investigates whether blocks with fewer households correspond to higher median incomes.

california data-analysis data-science exploratory-data-analysis housing-data linear-regression machine-learning python regression scikit-learn statistics visualization

Last synced: 05 Oct 2025

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/vivekky57/car-price-prediction

Now you can get a car's price with this wonderful end-to-end project.

flask machine-learning machine-learning-algorithms python python3 random-forest-classifier scikit-learn

Last synced: 13 Apr 2026

https://github.com/dearabhin/girlfriend-predictor

Using machine learning to solve the ultimate college classification problem. A fun project applying Python and Logistic Regression to predict relationship outcomes based on a (hilariously) synthetic dataset. 📊❤️

classification data-science fun-project google-colab jyputer-notebook jypyternotebook logistic-regression machine-learning pandas python scikit-learn

Last synced: 06 Oct 2025

https://github.com/sora468/best-of-ml-python

🏆 Discover top-ranked Python libraries for machine learning, updated weekly to help you find the best tools for your projects.

airport airport-simulation chatgpt configuration data-analysis data-science data-visualization data-visualizations gpt keras machine-learning nlp python scikit-learn tensorflow transformer usg-ai-training-data usg-artificial-intelligence

Last synced: 09 May 2026

https://github.com/muellerconstantin/house-prices

Data analysis about house prices in Ames (Iowa) with advanced regression techniques.

dvc jupyter-notebook python python3 scikit-learn

Last synced: 14 Apr 2026

https://github.com/harris-giki/e-comdataanalysis_ml

E-commerce Customer Analysis with Linear Regression: analyzes customer behavior within an e-commerce setting and predicts yearly customer spending based on various features using a linear regression model.

development ecommerce linear-regression machine-learning model prediction-model python scikit-learn

Last synced: 14 Apr 2026

https://github.com/dukebw/ml-model-selection

Machine learning model selection using Dlib and scikit-learn.

dlib machine-learning ranking scikit-learn

Last synced: 07 Oct 2025

https://github.com/sducournau/ign_lidar_hd_dataset

🏗️ Comprehensive Python library for processing IGN LiDAR HD data into machine learning-ready datasets for Building Level of Detail (LOD) classification. Features GPU/CPU processing, smart data management, and complete ML pipeline integration.

building-classification data-processing dataset france geospatial gis ign lidar lidar-hd numpy point-cloud scikit-learn

Last synced: 20 Jan 2026

https://github.com/albarji/teachingcontainer

A Docker container I use for my lectures

docker keras machine-learning scikit-learn

Last synced: 14 Apr 2026

https://github.com/prarthana-singh/bangalore-house-price-predictor

🏡 Bangalore House Price Prediction: a Machine Learning model to predict house prices in Bangalore using real estate data. Built with Linear Regression, Python, Pandas, NumPy, and Scikit-Learn.

data-analysis eda house-price-prediction linear-regression machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 19 Apr 2026

https://github.com/jyablonski/nba_elt_mlflow

ML Pipeline for NBA ELT Project

python scikit-learn

Last synced: 17 Jan 2026

https://github.com/arish-mhrjn/aimodelinspector

A fairly comprehensive Python library allowing for exploration, self-education, and categorization of AI models

ai analysis coreml-models diffusers diffusion-models ggml hdf5-format jax model-discovery model-insights openvino-models pytorch scikit-learn scikitlearn-machine-learning

Last synced: 07 Oct 2025

https://github.com/r-gg/ml-37

Amazon Reviews ~ Sentiment analysis evaluation: fine-tuned BERT vs LSTM. (+ Extensive Data Mining & Visualization)

bert deep-learning ipynb-jupyter-notebook lstm machine-learning python scikit-learn uni-project

Last synced: 05 Feb 2026

https://github.com/shubhamsoni98/classification-with-random-forest-1

To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.

algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization

Last synced: 18 Jan 2026

https://github.com/madsondeluna/mvp_pucrio_data_analytics_and_machine_learning

MVP for the Machine Learning & Analytics sprint (40530010056_20250_01) of the postgraduate program in Data Science and Analytics at PUC-Rio.

comparative-analysis data-analytics data-science machine-learning-algorithms postgraduate-course python pytorch scikit-learn

Last synced: 03 May 2026

https://github.com/pragati928/cancer-severity-prediction-ml

📊 End-to-end data science project predicting cancer severity using Python, EDA, and Random Forests, focusing on lifestyle and genetic factors.

data-analysis-python data-science-projects eda machine-learning pandas-python random-forest scikit-learn visualizations

Last synced: 08 Oct 2025

https://github.com/hiftd22/wpattern

📈 Analyze and visualize stock patterns with wPattern for better trading insights. Simplify your investment decisions through data-driven analysis.

cli finance financial-analysis financial-data matplotlib numpy pandas pattern-recognition python scikit-learn stock-scanner technical-analysis yfinance

Last synced: 29 Apr 2026

https://github.com/pranavsp108/market_basket_analysis-instacart

Customer segmentation and market basket analysis using the Instacart dataset with Python, Pandas, and K-Means clustering.

customer-segmentation-and-buying-behavior data-analysis data-visualization instacart jupyter-notebook kmeans-clustering market-basket-analysis pandas python scikit-learn

Last synced: 05 May 2026