An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/njorogepaul-moghul/house-price-predictions-kaggle-competition-

Built a predictive model for the Kaggle House Prices competition using feature engineering and LightGBM, achieving strong leaderboard performance.

data-science house-price-prediction-with-lightgbm kaggle-competition lightgbm machine-learning predicting-home-values-using-machine-learning random-forest scikit-learn

Last synced: 15 May 2026

https://github.com/inesruizblach/data-science-project

A data science project exploring Portuguese "Vinho Verde" wine quality prediction. Features EDA, feature engineering, ML models, and evaluation using Python, pandas, scikit-learn, and visualization tools.

binary-classification classification data-science exploratory-data-analysis feature-engineering imbalanced-learn jupyter-notebook machine-learning model-evaluation pandas regression scikit-learn seaborn uci-dataset wine-quality

Last synced: 09 May 2026

https://github.com/kianaabrisham/naive-bayes-sentiment

Sentiment classification using Multinomial NB (scratch + sklearn)

bag-of-words naive-bayes nlp scikit-learn sentiment-analysis text-classification

Last synced: 14 May 2026

https://github.com/dearabhin/girlfriend-predictor

Using machine learning to solve the ultimate college classification problem. A fun project applying Python and Logistic Regression to predict relationship outcomes based on a (hilariously) synthetic dataset. 📊❤️

classification data-science fun-project google-colab jyputer-notebook jypyternotebook logistic-regression machine-learning pandas python scikit-learn

Last synced: 06 Oct 2025

https://github.com/harris-giki/e-comdataanalysis_ml

E-commerce Customer Analysis with Linear Regression: analyzes customer behavior in an e-commerce setting and predicts yearly customer spending from various features using a linear regression model.

development ecommerce linear-regression machine-learning model prediction-model python scikit-learn

Last synced: 14 Apr 2026

https://github.com/prarthana-singh/bangalore-house-price-predictor

šŸ” Bangalore House Price Prediction – A Machine Learning model to predict house prices in Bangalore using real estate data. Built with Linear Regression, Python, Pandas, NumPy, and Scikit-Learn.

data-analysis eda house-price-prediction linear-regression machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 19 Apr 2026

https://github.com/jyablonski/nba_elt_mlflow

ML Pipeline for NBA ELT Project

python scikit-learn

Last synced: 17 Jan 2026

https://github.com/madsondeluna/mvp_pucrio_data_analytics_and_machine_learning

MVP for the Machine Learning & Analytics sprint (40530010056_20250_01) of the Postgraduate Program in Data Science and Analytics at PUC-Rio.

comparative-analysis data-analytics data-science machine-learning-algorithms postgraduate-course python pytorch scikit-learn

Last synced: 03 May 2026

https://github.com/workwithchaimaa/codealpha_diseaseprediction

Complete ML pipeline for binary classification to predict heart disease. Includes data preprocessing, model comparison (Logistic Regression, RF), hyperparameter tuning, and feature importance analysis.

classification heart-disease machine-learning python random-forest scikit-learn

Last synced: 08 Oct 2025

https://github.com/manjotkaurgill/agritech

Enter details of your soil and weather, and find the most suitable crop for farming. With our advanced AI system, you can make informed decisions and optimize your agricultural practices.

flask generative-ai insight-generation machine-learning matplotlib mongodb nextjs numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/jlee9503/telecommunication-churn

Analyze key factors influencing customer churn using Python data analytics techniques. Explore these factors through data preprocessing, exploratory data analysis (EDA), and predictive modeling.

data-analysis data-visualization matplotlib pandas python scikit-learn

Last synced: 18 Jan 2026

https://github.com/himanshkr03/loan_default_prediction_using_machine_learning

This repository contains a Python-based project that uses machine learning to predict loan defaults. It explores data preprocessing, feature engineering, and model training techniques to build a predictive model for assessing loan risk.

data-science finance loan-default-prediction machine-learning pandas prediction-model python risk-assessment scikit-learn

Last synced: 14 Apr 2026

https://github.com/vishalgaud17/stroke

A simple Streamlit web app that predicts stroke risk based on user input features like age, BMI, glucose level, and lifestyle factors, using a pre-trained machine learning model.

machine-learning numpy pandas python scikit-learn streamlit

Last synced: 14 Apr 2026

https://github.com/shadmanshaikh/ml_algo_from_scratch

All standard machine learning algorithms from scratch in Python 🐍

classification deep-learning machine-learning neural-nets python regression scikit-learn

Last synced: 09 May 2026

https://github.com/divyajnanakshi-cloud/phishing-detector

This project presents a Phishing Detection System implemented as a Chrome Extension designed to help users determine whether a website is legitimate or malicious

chrome-extension css hashing-algorithm html javascript python qr-code-processing random-forest scikit-learn steganography

Last synced: 14 Apr 2026

https://github.com/sharvesh1401/battsense

BattSense is a machine learning project focused on predicting the State of Health (SOH) of lithium-ion batteries using operational parameters such as voltage, current, temperature, and capacity. The model enables accurate, data-driven diagnostics for battery performance monitoring in electric vehicles and portable devices.

battery-diagnostics battery-health battery-health-prediction battery-soh data-analysis electric-vehicles energy-storage machine-learning predictive-maintenance python regression scikit-learn

Last synced: 07 May 2026

https://github.com/alisonmitchell/titanic

Exploration of a subset of the Titanic passenger manifest to create a predictive classification model to determine which passengers were more likely to survive.

deep-learning keras machine-learning matplotlib numpy pandas python scikit-learn scipy seaborn tensorflow

Last synced: 14 Apr 2026

https://github.com/khushirajurkar/exoplanet-habitability-prediction-model

Predicts whether an exoplanet is habitable using ML. Handles class imbalance with ADASYN, tests multiple models, and saves the best one. Includes confusion matrices, ROC curves, and a clean Jupyter notebook

adasyn astroinformatics confusion-matrix exoplanets logistic-regression machine-learning multiclass-classification python roc-curve scikit-learn smote

Last synced: 06 May 2026

https://github.com/lorenzorottigni/ml-breast-cancer

Machine Learning Python bootcamp: Support Vector Machines using the breast cancer dataset

ipynb machine-learning numpy pandas python scikit-learn seaborn support-vector-machines

Last synced: 14 Apr 2026

https://github.com/sudarshanc00/smishing

This project aims to classify text messages to detect potential smishing (SMS phishing) attacks. Using machine learning, the project provides a classifier that can differentiate between legitimate messages and smishing attempts, helping to prevent scams.

nltk numpy pandas python scikit-learn scipy

Last synced: 14 Apr 2026

https://github.com/stewartpark/sklearn2gem

⚡ sklearn2gem ports your scikit-learn model into a fast Ruby C binding!

ruby rubygem scikit-learn sklearn

Last synced: 01 Mar 2026

https://github.com/gabrielmazzotta/nlp-clustering--movie-similarity-from-plot-summaries

A Python-based movie recommendation system leveraging NLP and clustering techniques. This project includes data processing, vectorization of plot summaries, and the implementation of recommendation algorithms to suggest similar movies based on user input.

clustering cosine-similarity hierarchical-clustering kmeans lemmatization nlp recommendation-engine scikit-learn similarity-score spacy tokenization

Last synced: 21 Jan 2026

https://github.com/angelarreola/ai_notes

Notes for the "Artificial Intelligence" course, intended for later extraction by a language model so that it can give personalized answers based on the information in this repository.

ai matplotlib numpy pandas phaserjs python scikit-learn

Last synced: 21 Jan 2026

https://github.com/taimoorkhan10/ai-fairness-explainability-toolkit

AI Fairness and Explainability Toolkit (AFET) is an open-source project aimed at providing tools and frameworks to assess, visualize, and mitigate bias in machine learning models. It supports multiple ML frameworks and offers a comprehensive suite of metrics and visualization components to enhance model transparency and fairness.

ai bias-detection data-science ethical-ai explainable-artificial-intelligence fairness machine-learning mlops model-interpretation open-source python responsible-ai scikit-learn

Last synced: 19 Jan 2026

https://github.com/waikato-datamining/shallowflow-sklearn

scikit-learn support for the shallowflow Python workflow system.

python3 scikit-learn sklearn workflow-engine

Last synced: 14 Apr 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors that affect sleep quality and the temporal patterns in a single subject's sleep data.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/ayorick23/python-data-science-cheat-sheet

A quick, practical guide to essential Python syntax, commands, and functions for data science. Ideal for recalling how to use the most common libraries, such as NumPy, Pandas, Matplotlib, and Scikit-learn, in your day-to-day analyses.

cheat-sheet data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning matplotlib ml numpy pandas python scikit-learn scipy seaborn statistics sympy tensorflow

Last synced: 07 Apr 2026

https://github.com/harris-giki/cancerdetectionmodel_ml

Simple Logistic Regression and Neural Network machine learning models that predict whether a breast tumor is malignant or benign based on input features extracted from a breast cancer dataset.

cancer-detection development keras keras-tensorflow logistic-regression machine-learning neural-network scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/josancamon19/boston_housing

Predicting Boston Housing Prices for Udacity Machine Learning Nanodegree

boston-housing-price-prediction machine-learning machine-learning-nanodegree scikit-learn udacity

Last synced: 21 Apr 2026

https://github.com/gregoritsch3/ml_eda_classification_diabetes

An EDA and Machine Learning Classification exercise on the Diabetes dataset demonstrating SQLAlchemy data import from an SQL database (PostgreSQL), pre-processing pipelines, ANOVA, 9 scikit-learn ML models, hyperparameter tuning for the best-performing one, and feature importance.

anova machine-learning matplotlib numpy pandas pipelines scikit-learn seaborn sql sqlalchemy statistics

Last synced: 14 Apr 2026

https://github.com/mecha-aima/fake-bills-detection

This Python project implements a simple classification model comparison using scikit-learn to classify banknotes as either "Authentic" or "Counterfeit" based on four features

classification-model machine-learning model-selection scikit-learn

Last synced: 27 Jan 2026

https://github.com/smahala02/svm-machine-learning

This repository provides an in-depth tutorial and practical implementation of Support Vector Machines (SVM) for classification tasks, using Python and popular data science libraries.

classification data-science machine-learning python scikit-learn svm

Last synced: 30 Jan 2026

https://github.com/pradeep31747/smartsuggest-personalized_product_recommendations

This project implements a personalized product recommendation system using machine learning techniques to enhance user experience and drive engagement.

jupyter-notebook keras numpy pandas pyhton scikit-learn sql tensorflow vscode

Last synced: 28 Jan 2026

https://github.com/rahul-120/crop_recom

This project is a Machine Learning based Crop Recommendation System built using Flask. It helps farmers or users decide the most suitable crop to grow based on soil nutrients and environmental conditions.

crop-recommendation-system flask flask-application machine-learning python3 scikit-learn

Last synced: 02 May 2026

https://github.com/asut00/Machine-Learning-Program_42AI

Comprehensive Machine Learning path by 42AI: hands-on modules on regression, gradient descent, and real-world ML applications.

linear-regression machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 27 Oct 2025

https://github.com/allwin107/loan-prediction-web-app

A Flask-based loan prediction web app using a Random Forest model to predict loan approval based on user input. It includes a clean, responsive UI, form validation, and real-time prediction display.

classification data-processing deployment flask loan-prediction machine-learning python random-forest-classifier scikit-learn web-application

Last synced: 15 Apr 2026

https://github.com/asherk7/house-price-prediction

House Prices - Advanced Regression Techniques - Predict sales prices and practice feature engineering, RFs, and gradient boosting

data-science numpy pandas regression scikit-learn

Last synced: 15 Apr 2026

https://github.com/beatrizandradeds/sistema-recomendacao-filmes

🎬 Movie Recommendation System using ML | Text vectorization, cosine similarity, and NLP with Python

content-based-filtering cosine-similarity data-science data-science-projects machine-learning natural-language-processing nlp portfolio python recommendation-system scikit-learn

Last synced: 29 Apr 2026

https://github.com/bangaji313/recommender-system-movielens

Movie recommendation system project using content-based and collaborative filtering. Submission for the Applied Machine Learning module at Coding Camp 2025.

collaborative-filtering content-based-filtering data-science deep-learning dicoding jupyter-notebook keras movie-recommendation movielens pandas python recommender-system scikit-learn tensorflow

Last synced: 15 Apr 2026

https://github.com/itssahilwhat/ai-fundamentals

A curated collection of fundamental AI concepts, algorithms, and code implementations — including Machine Learning, Deep Learning, and Computer Vision — built from scratch and with practical examples.

computer-vision deep-learning machine-learning numpy pandas python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/emv271828/diabetes_cdc_uci_machine_learning

Second assessment for the Artificial Intelligence course at Universidade Federal Fluminense.

jupyter-notebook machine-learning pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/sarmad426/ai

AI basic to advanced featuring Machine Learning, Deep Learning and Data Science.

ai data-science deep-learning hugging-face machine-learning numpy pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/samiyaalizaidi/nn-ml-homeworks

Homework solutions for CPE-4903: Neural Networks & Machine Learning at Kennesaw State University.

machine-learning machine-learning-workflow neural-networks numpy scikit-learn

Last synced: 15 Apr 2026

https://github.com/as1467/canada-per-capita-income-prediction

This project is a simple machine learning exercise to predict Canada's per capita income based on historical data. The dataset was sourced from the CodeBasics GitHub repository and is used here to practice linear regression as part of my machine learning studies.

machine-learning matplotlib-pyplot pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/aerojam95/math70076-data-science-cw2

This repository presents the second coursework for the MATH70076 Data Science module at Imperial College London, where the project showcases different machine and deep learning models for image classification

data-science deep-learning machine-learning python3 pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/max00358/sign_language_detection

A sign language detector that recognizes the ASL (American Sign Language) alphabet

mediapipe opencv scikit-learn

Last synced: 09 Feb 2026

https://github.com/sachinh123/cognitive-customer-insights-with-watson-ai

This project analyzes customer data to provide insights for personalized services, behavior prediction, and improved support.

flask ibm-cloud ibm-watson-assistant ibm-watson-nlu nltk python scikit-learn

Last synced: 10 Feb 2026

https://github.com/nurulashraf/predictive-maintenance-analysis-for-machine-failure-prevention

Predictive maintenance analysis for machine failure prevention using sensor data and ML. Built a Random Forest model and Gradio dashboard to identify high-risk machines for proactive maintenance.

data-science failure-prediction gradio industrial-iot machine-learning power-bi predictive-maintenance python scikit-learn

Last synced: 16 Apr 2026

https://github.com/sabin74/fake_news_detection

This project implements a Fake News Detection system using Python, Natural Language Processing (NLP), and machine learning. It classifies news articles as Real or Fake based on their textual content.

fake-news-detection kaggle-dataset multinomial-naive-bayes passive-aggressive-classifier python3 regex scikit-learn

Last synced: 16 Apr 2026

https://github.com/mindkerchief/baselineml

A collection of machine learning tasks performed during my computer science studies, majoring in intelligent systems.

decision-tree dummy gaussian-mixture-models kmeans-clustering linear-regression logistic-regression machine-learning matplotlib numpy pandas random-forest scikit-learn seaborn tensorflow

Last synced: 16 Apr 2026

https://github.com/selcia25/iris-dataset-classification

☘This repository contains a Python script for classifying the Iris dataset using the Random Forest algorithm.

data-processing iris-classification pandas random-forest-classifier scikit-learn

Last synced: 16 Apr 2026

https://github.com/s0fft/airline-passenger-satisfaction

Airline-Customer-Model — Machine Learning Project on: Scikit-learn / Pandas / Matplotlib / Seaborn

jupyter-notebook mashine-learning matplotlib pandas python3 scikit-learn seaborn

Last synced: 12 Feb 2026

https://github.com/sergeimakarovv/energy-data-analytics-ml

Analyzing global data on sustainable energy, predicting CO2 emissions per capita

machine-learning pandas plotly python scikit-learn streamlit

Last synced: 12 Feb 2026

https://github.com/gliuck/diabetesprediction

Machine learning exam project, focused on predicting diabetes based on health and demographic data. The project uses models like Logistic Regression, KNN, SVM and NN to analyze and predict the likelihood of diabetes in individuals.

machine-learning machine-learning-models numpy-library pandas-library prediction-model python scikit-learn

Last synced: 14 Feb 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

【Sprinkle some star dust on this repo! ⭐️ It's good karma!】 A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/hlexnc/project-arepo

Data-driven stroke risk assessment & personalized recommendations, powered by machine-learning and an NLU-driven chatbot.

chatbot data-analysis docker docker-compose machine-learning nlu-chatbot python rasa scikit-learn sklearn streamlit

Last synced: 15 Feb 2026

https://github.com/mgesteban/analyzing_car_prices

A comprehensive data science project analyzing factors that drive used car prices to provide actionable insights for used car dealerships.

crisp-dm data-science lasso-regression linear-regression machine-learning one-hot-encoding pandas ridge-regression scikit-learn

Last synced: 15 Feb 2026

https://github.com/sridharyadav07/machine-learning-project-combined-cycle-power-plant-

In this project, multiple machine learning models, including Linear Regression, Decision Tree Regression, and Random Forest Regression, were implemented to predict the target variable and evaluated using metrics such as RMSE, MAE, and R-squared. The performance of these models was compared, and the Random Forest Regressor was found.

data-processing decisiontreeregressor linear-regression metrics-evaluation python random-forest-regressor scikit-learn

Last synced: 16 Apr 2026

https://github.com/pramodyasahan/health-insurance-cost-prediction

This project focuses on predicting health insurance costs using a polynomial regression model. By employing machine learning techniques in Python, the project aims to accurately estimate insurance costs based on various personal attributes. The model takes into account several features, including age, sex, BMI, number of children, and smoking status.

machine-learning matplotlib numpy pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/piotrwnuczek/cloudprediction

Predicting cloud task execution time using AI/ML

matplotlib pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/khaymanii/calories-burnt-prediction-model

This model was built using Python and XGBoost Regression algorithm

matplotlib numpy pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/archish27/pythontutorial

Python Programming Tutorial for new geeks who want to learn python from scratch to deal with various applications

matplotlib numpy pandas pygame python python-2 python-3 scikit-learn soup

Last synced: 01 Apr 2026

https://github.com/shreeparab1890/indian-cricketer-classifier

This notebook builds a model that predicts which Indian cricketer appears in a given image. The project covers 8 Indian cricketers and builds a model to classify a given image among these 8 cricketers.

image-classification matplotlib numpy opencv pandas python random-forest-classifier scikit-learn sklearn streamlit

Last synced: 01 Apr 2026

https://github.com/capsuleismail/income-census-prediction

Predict whether annual income of an individual exceeds $50K per annum based on census data. Also known as "Census Income" dataset.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 16 Apr 2026

https://github.com/drorata/mnist-examples

ML examples for the MNIST dataset

machine-learning ml mnist python scikit-learn torch

Last synced: 19 Apr 2026

https://github.com/junya737/weighted-pls-regression

A Python implementation of Weighted Partial Least Squares Regression with support for sample weights.

machine-learning partial-least-squares-regression scikit-learn

Last synced: 17 Apr 2026

https://github.com/danicc097/python-ml-app

Various [arguably useless] Machine Learning services with gRPC and OpenTelemetry for demo purposes

grpc-python opentelemetry scikit-learn

Last synced: 17 Apr 2026

https://github.com/amirmohammadgholampour/mall-customer-segmentation

Project for segmenting customers in a shopping mall using the Clustering algorithm.

numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/anshvaid4/ml_practice

This is the new repository, where I have added all the notebooks demonstrating the usage of various transformers and models for Supervised and Unsupervised algorithms

anaconda jupyter-notebook machine-learning machine-learning-algorithms python scikit-learn

Last synced: 17 Apr 2026

https://github.com/orliluq/inmersion-datos-python

Develop machine learning models to predict the probability of customer credit default, using different classification algorithms (Logistic Regression, Decision Trees, Random Forest, Naive Bayes).

colab-notebook numpy pandas python scikit-learn

Last synced: 02 Apr 2026