An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/manu-karenite/medical-insurance-cost-predictor

Medical Insurance Cost Generator is a linear-regression-based predictor that estimates the cost a person has to pay when buying medical insurance.

kaggle-dataset linear-regression machine-learning matplotlib numpy pandas python3 reactjs scikit-learn

Last synced: 15 Apr 2026

https://github.com/emv271828/diabetes_cdc_uci_machine_learning

Second assessment for the Artificial Intelligence course at Universidade Federal Fluminense.

jupyter-notebook machine-learning pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/christiansandovalgarcia01-creator/megaline-plan-classifier

Classification model to recommend the Smart vs. Ultra plan (Megaline). 60/20/20 split, RandomForest as the winning model, test accuracy ≥ 0.75. Includes a confusion matrix and classification report. Stack: Python, Pandas, scikit-learn, Jupyter.

classification data-science jupyter-notebook machine-learning python random-forest scikit-learn telecom

Last synced: 15 Apr 2026

https://github.com/moustafamohamed01/breast-cancer-prediction

A machine learning model built with PyTorch to predict if a tumor is malignant or benign using the Breast Cancer Dataset. The model uses a neural network to classify the data and shows how to train, evaluate, and visualize results.

ai data-science deep-learning machine-learning neural-network python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/nikitalpopov/evotor_champ

Solution for the Evotor data challenge.

data-analysis data-science python scikit-learn

Last synced: 15 Apr 2026

https://github.com/idaraabasiudoh/telco-churn-logistic-regression

A predictive model using logistic regression to identify customers likely to churn from a telecommunications company.

logistic-regression machine-learning python3 scikit-learn

Last synced: 01 Feb 2026

https://github.com/nits2612/data-science-projects

Portfolio of data science projects I completed during PGP AI/ML, for self-learning, and as a hobby.

data data-science dataanalysis deep deep-learning keras machine-learning matplotlib numpy opencv pandas python scikit-learn seaborn surprise-python tensorflow transfer-learning

Last synced: 01 Feb 2026

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to forecast sales based on TV, Newspaper, and Online Advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 07 Feb 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 classes in 9 parts. Advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/danicaalana/wine-dataset-decision-tree

This project was developed as part of Digital Skill Fair (DSF) 35.0 - Data Science by Dibimbing. It uses the Wine Recognition Dataset from scikit-learn, which contains the results of a chemical analysis of wines grown in the same region of Italy by three different cultivators.

data data-analysis-python data-science decision-tree-classification machine-learning python scikit-learn wine-dataset

Last synced: 18 Apr 2026

https://github.com/max00358/sign_language_detection

A sign language detector that recognizes the ASL (American Sign Language) alphabet.

mediapipe opencv scikit-learn

Last synced: 09 Feb 2026

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 26 Feb 2026

https://github.com/brossend/automl_bank_project

Automated ML pipeline for the UCI Bank Marketing dataset: ETL, Optuna-based AutoML, model evaluation, MLflow logging, pytest tests, Docker, and CI/CD.

automl bank-marketing binary-classification ci-cd classification data-science docker docker-compose etl github-actions gitlab-ci machine-learning ml-pipeline mlflow model-monitoring optuna pytest python scikit-learn uci-dataset

Last synced: 02 Jun 2026

https://github.com/0eix/ibm-ds-spacex-falcon9

IBM Professional data science certificate Final Project Notebooks

data-science data-visualization exploratory-data-analysis ibm poetry scikit-learn shap

Last synced: 11 Feb 2026

https://github.com/nurulashraf/predictive-maintenance-analysis-for-machine-failure-prevention

Predictive maintenance analysis for machine failure prevention using sensor data and ML. Built a Random Forest model and Gradio dashboard to identify high-risk machines for proactive maintenance.

data-science failure-prediction gradio industrial-iot machine-learning power-bi predictive-maintenance python scikit-learn

Last synced: 16 Apr 2026

https://github.com/sabin74/fake_news_detection

This project implements a Fake News Detection system using Python, Natural Language Processing (NLP), and machine learning. It classifies news articles as Real or Fake based on their textual content.

fake-news-detection kaggle-dataset multinomial-naive-bayes passive-aggressive-classifier python3 regex scikit-learn

Last synced: 16 Apr 2026

https://github.com/sanjiv856/machine_learning_scikit-learn

Repository for machine learning in Python using Scikit-learn.

pipelines python scikit-learn sklearn titanic-kaggle titanic-survival-prediction

Last synced: 27 Feb 2026

https://github.com/codedby-mozz/habits_vs_academic_performance

This repository contains a Jupyter Notebook that explores the relationship between student lifestyle habits and academic performance. It demonstrates the process of data loading, exploratory data analysis (EDA), correlation analysis, and the development of a predictive model using linear regression to predict exam scores based on daily habits.

linear-regression python scikit-learn

Last synced: 16 Apr 2026

https://github.com/cego669/dirtycategoriesencoding

Repository containing two classes (StringAgglomerativeEncoder and StringDistanceEncoder) useful for grouping or visualizing the distance between dirty categorical variables. They are compatible with the scikit-learn API.

category clustering dimensionality-reduction dirty hierarchical-clustering machine-learning scikit-learn singular-value-decomposition svd

Last synced: 11 Feb 2026

https://github.com/mindkerchief/baselineml

A collection of machine learning tasks performed during my computer science studies, majoring in intelligent systems.

decision-tree dummy gaussian-mixture-models kmeans-clustering linear-regression logistic-regression machine-learning matplotlib numpy pandas random-forest scikit-learn seaborn tensorflow

Last synced: 16 Apr 2026

https://github.com/selcia25/iris-dataset-classification

☘ This repository contains a Python script for classifying the Iris dataset using the Random Forest algorithm.

data-processing iris-classification pandas random-forest-classifier scikit-learn

Last synced: 16 Apr 2026

https://github.com/s0fft/airline-passenger-satisfaction

Airline-Customer-Model — Machine Learning Project on: Scikit-learn / Pandas / Matplotlib / Seaborn

jupyter-notebook mashine-learning matplotlib pandas python3 scikit-learn seaborn

Last synced: 12 Feb 2026

https://github.com/zsailer/skspline

A scikit-learn interface to SciPy's splines.

scikit-learn scipy

Last synced: 16 Apr 2026

https://github.com/gliuck/diabetesprediction

Machine learning exam project, focused on predicting diabetes based on health and demographic data. The project uses models like Logistic Regression, KNN, SVM and NN to analyze and predict the likelihood of diabetes in individuals.

machine-learning machine-learning-models numpy-library pandas-library prediction-model python scikit-learn

Last synced: 14 Feb 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/hlexnc/project-arepo

Data-driven stroke risk assessment & personalized recommendations, powered by machine-learning and an NLU-driven chatbot.

chatbot data-analysis docker docker-compose machine-learning nlu-chatbot python rasa scikit-learn sklearn streamlit

Last synced: 15 Feb 2026

https://github.com/smuralee/machine-learning-samples

Machine learning samples

pytorch scikit-learn

Last synced: 15 Feb 2026

https://github.com/mgesteban/analyzing_car_prices

A comprehensive data science project analyzing factors that drive used car prices to provide actionable insights for used car dealerships.

crisp-dm data-science lasso-regression linear-regression machine-learning one-hot-encoding pandas ridge-regression scikit-learn

Last synced: 15 Feb 2026

https://github.com/quran-yeamen/serverlifecycleml

Predictive modeling of server lifecycle stages using synthetic data and machine learning.

data-science machine-learning predictive-modeling python scikit-learn synthetic-data

Last synced: 15 Feb 2026

https://github.com/paultheal1en/dsc-fact-checking

Fact-checking project classifying claims as SUPPORTED, REFUTED, or NEI. Uses ANN, DNN, RNN, CNN, Random Forest, PhoBERT, and Sentence Transformers.

deep-learning fact-checking keras machine-learning nlp phobert random-forest scikit-learn sentence-transformers tensorflow transformers

Last synced: 16 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-combined-cycle-power-plant-

In this project, multiple machine learning models, including Linear Regression, Decision Tree Regression, and Random Forest Regression, were implemented to predict the target variable and evaluated using metrics such as RMSE, MAE, and R-squared. The performance of these models was compared, and the Random Forest Regressor was found.

data-processing decisiontreeregressor linear-regression metrics-evaluation python random-forest-regressor scikit-learn

Last synced: 16 Apr 2026

https://github.com/hafidaso/predicting-industrial-machine-downtime-level-3

This project aims to develop a predictive model using machine learning techniques to forecast machine failures based on historical operational data.

imbalanced-learning numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/sasanka14/water_quality_predictions

Water Quality Prediction - College Project 🌊💧 Predicts water potability (safe/unsafe) using ML models like XGBoost & Random Forest. Features data preprocessing, feature importance, model evaluation, and visualizations. Built with Python, Pandas, Scikit-learn & Seaborn for analysis. 🚀

anaconda jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/lorenzorottigni/ml-advertising

Machine learning Python bootcamp: logistic regression on an advertising dataset.

ipynb logistic-regression machine-learning numpy pandas python scikit-learn seaborn

Last synced: 16 Apr 2026

https://github.com/sergeimakarovv/solar-panel-detection

Applying deep learning models to detect solar panel installations in satellite imagery and estimating their generation capacity

albumentations convolutional-neural-networks deep-learning geopandas pandas pvlib python pytorch rasterio scikit-learn wms-service

Last synced: 16 Apr 2026

https://github.com/khaymanii/calories-burnt-prediction-model

This model was built using Python and the XGBoost regression algorithm.

matplotlib numpy pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/thekartikeyamishra/customer-retention-predictor

The Customer Retention Predictor is a Python-based tool designed to help businesses predict customer churn using historical data. This project is particularly beneficial for small businesses and MSMEs in India, allowing them to identify customers at risk of leaving and take proactive measures to retain them.

joblib machine-learning numpy pandas python scikit-learn tinker

Last synced: 16 Apr 2026

https://github.com/archish27/pythontutorial

Python programming tutorial for new geeks who want to learn Python from scratch and work with various applications.

matplotlib numpy pandas pygame python python-2 python-3 scikit-learn soup

Last synced: 01 Apr 2026

https://github.com/shreeparab1890/indian-cricketer-classifier

This notebook builds a model that predicts an Indian cricketer from a given image. The project covers 8 Indian cricketers and builds a model to classify a given image among these 8 cricketers.

image-classification matplotlib numpy opencv pandas python random-forest-classifier scikit-learn sklearn streamlit

Last synced: 01 Apr 2026

https://github.com/capsuleismail/income-census-prediction

Predict whether an individual's annual income exceeds $50K based on census data. Also known as the "Census Income" dataset.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 16 Apr 2026

https://github.com/bkamapantula/discover

Code search utility to assist developer workflows via code discovery. Currently uses a TF-IDF estimator.

developer-tools python scikit-learn tf-idf

Last synced: 16 Apr 2026

https://github.com/grupoguerreroherrera/ethical-ai-recruitment-audit

Bias audit toolkit reproducing the recruitment AI case from Activity 6, Unit 3, Electiva II Inteligencia Artificial Avanzada (Elective II: Advanced Artificial Intelligence). Empirical analysis with reweighing mitigation, Model Card documentation, and APA 7 references.

academic-project algorithmic-auditing artificial-intelligence bias-mitigation disparate-impact ethical-ai fairness machine-learning model-card python random-forest recruitment-bias reweighing scikit-learn unesco-ai-ethics

Last synced: 03 Jun 2026

https://github.com/drorata/mnist-examples

ML examples for the MNIST dataset

machine-learning ml mnist python scikit-learn torch

Last synced: 19 Apr 2026

https://github.com/supershivam5/python_projects

💻 Python programming with NumPy, Pandas, Matplotlib. 🌟 Love exploring new technologies. Check out my projects!

matplotlib-pyplot numpy pandas scikit-learn seaborn

Last synced: 17 Apr 2026

https://github.com/junya737/weighted-pls-regression

A Python implementation of Weighted Partial Least Squares Regression with support for sample weights.

machine-learning partial-least-squares-regression scikit-learn

Last synced: 17 Apr 2026

https://github.com/erikglz/coap-mtd

Repository for an IoT security project implementing Moving Target Defense (MTD) through CoAP protocol randomization to mitigate spoofing attacks and enhance adaptive security.

coap-protocol cybersecurity iot machine-learning python scikit-learn spoofing

Last synced: 17 Apr 2026

https://github.com/vaishnavis03/finlatics_ml_program

This repository contains the .ipynb files for 3 datasets, along with a PPT for each. The datasets included are Facebook Marketplace Data, Sales Prediction Data, and Wine Quality data.

correlation data-analysis data-science data-visualization knn linear-regression machine-learning matplotlib numpy pandas random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/dimdasci/car-price-prediction-demo

Demo project of EDA and regression task solution: Pandas, Jupyter Notebook, Scikit-learn, LightGBM

eda lightgbm-regressor regression scikit-learn

Last synced: 03 Jun 2026

https://github.com/iamwatchdogs/cardiovascular-risk-prediction

This mini-project uses machine learning algorithms to predict possible risks of heart disease by analyzing given data.

jupyter-notebook machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/akshitvats026/heart_disease_prediction

An ML-based Heart Disease Prediction System that predicts the likelihood of heart disease based on user health parameters. Built using Python, Pandas, and Scikit-learn, the system performs data preprocessing, trains a predictive model, and provides real-time predictions with high accuracy.

accuracy-score logistic-regression machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/broodhoney/blue-book-for-bulldozers

This repository holds a project that solves a regression problem: predicting the future sales of bulldozers. It comes from a Kaggle competition.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/raphael-ufrj/analise_algodao

Historical analysis of cotton planting; analysis of planting based on climate and historical data.

analysis data-science data-visualization dataset docker pandas provenance python python3 scikit-learn seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/anshvaid4/ml_practice

This is the new repository, where I have added all the notebooks demonstrating the use of various transformers and models for supervised and unsupervised algorithms.

anaconda jupyter-notebook machine-learning machine-learning-algorithms python scikit-learn

Last synced: 17 Apr 2026

https://github.com/isshiki/machine-learning-with-python

Repository distributing the notebooks used in the @IT article series 『Pythonで学ぶ「機械学習」入門』 (Introduction to Machine Learning with Python).

data-science machine-learning machinelearning-python python scikit-learn

Last synced: 17 Apr 2026

https://github.com/orliluq/inmersion-datos-python

Develop machine learning models to predict the probability of customer credit default, using different classification algorithms (Logistic Regression, Decision Trees, Random Forest, Naive Bayes).

colab-notebook numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/felixamaladhas/amazon-reviews-sentiment-analysis

This is a sentiment analysis project that classifies Amazon product reviews as positive or negative using machine learning techniques.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/mayankyadav23/shipment-pricing-prediction

Shipment Pricing Prediction 📦🔍 is a machine learning project that forecasts shipment prices based on various supply chain factors. Using advanced regression models, it provides valuable insights 📊 to optimize pricing strategies in the supply chain analytics domain.

data-visulization flask ineuron-ai machine-learning python scikit-learn shipment-and-pricing

Last synced: 02 Apr 2026

https://github.com/otuemre/obesity-classification

Machine learning project to classify obesity levels based on health metrics like age, sex, height, weight, and BMI.

classification data-science healthcare machine-learning obesity-classification scikit-learn

Last synced: 17 Apr 2026

https://github.com/a-poor/sample-model-serve

Demo for using Flask to serve a scikit-learn model as an API

api data-science docker flask machine-learning scikit-learn

Last synced: 30 Apr 2026

https://github.com/mangesh-balkawade/pythonautomationsscripts

This repository contains the Python automation scripts, machine learning case studies, and Python projects that I have written to learn automation and ML using Python.

automation data-science machine-learning-algorithms matplotlib mongodb pandas python3 scikit-learn seaborn webscraping

Last synced: 13 Apr 2026

https://github.com/rosieoh/emergency_dataanalysis

Open data analysis: data analysis for policy proposals on the emergency medical system.

ipython matplotlib numpy pandas python scikit-learn scipy

Last synced: 04 Apr 2026

https://github.com/yelamankarassay/personal-health-wellness-dashboard

A Streamlit-based dashboard for visualizing and analyzing personal daily data—weight, mood, meals, sleep, and more. This project uses pandas, plotly, matplotlib, seaborn, scikit-learn, and wordcloud to present insights about your health and daily habits.

matplotlib pandas plotly scikit-learn seaborn wordcloud

Last synced: 17 Apr 2026

https://github.com/belzebu013/prever_nivel_colesterol

AI project using a multiple linear regression algorithm to predict an individual's cholesterol level.

ia jupiter-notebook pandas python regressao-linear-multipla scikit-learn

Last synced: 17 Apr 2026

https://github.com/shaharband/calcofi-oceanographic-analysis

This repository contains an analysis of the CalCOFI (California Cooperative Oceanic Fisheries Investigations) dataset, which represents one of the longest and most complete time series of oceanographic and larval fish data in the world.

pandas regression scikit-learn

Last synced: 10 May 2026

https://github.com/mnj-tothetop/english-handwritten-characters-recognizer

A handwritten English character recognizer [0-9, A-Z, a-z] built using a dataset of 3409 images. TensorFlow, Keras, Scikit-learn, and OpenCV were used to implement the Convolutional Neural Network (CNN). Matplotlib and Seaborn were used to visualize the data.

artificial-intelligence convolutional-neural-networks keras matplotlib opencv-python scikit-learn seaborn tensorflow

Last synced: 18 Apr 2026

https://github.com/27ahmad/movie-recommendation-system

Welcome to the Movie Recommendation System! This project uses Streamlit to provide personalized movie recommendations based on user preferences and similarity.

movie-recommendation numpy pandas python scikit-learn

Last synced: 04 Apr 2026

https://github.com/minhtran241/ml-dl-llm-genai

Showcasing ML/DL fundamentals, paper implementations, deep learning models, and other projects. The purpose of this repository is to provide a playground for me to explore and learn about PyTorch, deep learning, and generative AI.

deep-learning generative-ai llm machine-learning paper-implementations pytorch scikit-learn

Last synced: 18 Apr 2026

https://github.com/justsecret123/nba-players-stats-analysis

A quick interactive Notebook to visualize some NBA players stats (points, assists, steals, blocks...) and totals, rankings and comparisons. Feel free to add any player in the .csv data files. 🏀

csv ipython-notebook ipywidgets jupyter-notebook jupyterlab matplotlib pandas python scikit-learn seaborn

Last synced: 18 Apr 2026

https://github.com/gattsu001/telecom-churn-predictor

Predicts which telecom customers are likely to churn with 95% accuracy using engineered features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

churn-prediction classification classification-algorithm customer-retention data-science data-visualization feature-engineering joblib jupyter-notebook machine-learning pandas scikit-learn supervised-learning svm

Last synced: 18 Apr 2026

https://github.com/gregoritsch3/ml_clustering_eda_customersegmentation

An EDA and Machine Learning Clustering exercise on the Mall Customer Segmentation synthetic dataset demonstrating the use of KMeans Clustering and the Elbow Method. The clustering algorithm successfully segments the customer base into groups distinguishable by their annual income and spending score.

clustering kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/pedroteixeiraw/variational_quantum_circuit_binary_classification

This project focuses on developing a Variational Quantum Circuit capable of performing Binary Classification between two classes: red wine and white wine, based on their characteristics using machine learning.

binary-classification cost-function json machine-learning matplotlib numpy pandas qiskit qiskit-machine-learning quantum-machine-learning scikit-learn training-data variational-circuit

Last synced: 04 Apr 2026

https://github.com/abdul-rafay19/california-housing-price-prediction

This project predicts California housing prices using machine learning regression models, including Random Forests and Decision Trees. It covers data preprocessing, exploratory analysis, model training, and hyperparameter tuning to optimize performance.

decision-trees gridsearchcv linear-regression matplotlib numpy pandas python random-forest randomsearch-cv scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/alainlebret/python-et-ia-1

Personal resources for the "Python & IA" (Python & AI) course in the 2nd year of GPSE at ENSICAEN.

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 04 Apr 2026

https://github.com/yashsonaar/machine-learning-tasks

This repository contains machine learning tasks, including classification, a recommendation system, and a fraud detection system.

classification jupyter-notebook machine-learning numpy pandas prediction python scikit-learn testing

Last synced: 04 Apr 2026

https://github.com/anushrey10/fuel_efficiency_predictor

Welcome to the Fuel Efficiency Predictor! This advanced tool uses machine learning to predict your vehicle's fuel efficiency based on various characteristics.

decision-tree gradient-boosting-classifier html-css-javascript linear-regression machile-learning matplotlib python random-forest scikit-learn tailwindcss

Last synced: 18 Apr 2026