An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/glennx1/heartdrive

ML-powered heart disease predictor using Streamlit, featuring data preprocessing, visualization, and user input interface.

matplotlib pandas python scikit-learn seaborn streamlit

Last synced: 29 Apr 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/allwin107/loan-prediction-web-app

A Flask-based loan prediction web app using a Random Forest model to predict loan approval based on user input. It includes a clean, responsive UI, form validation, and real-time prediction display.

classification data-processing deployment flask loan-prediction machine-learning python random-forest-classifier scikit-learn web-application

Last synced: 15 Apr 2026

https://github.com/samjoesilvano/airline_ticket_fare_prediction

Airline Fare Prediction using Machine Learning focuses on developing a Random Forest model to predict flight prices, achieving an R² score of 0.804. The project includes hyperparameter tuning using RandomizedSearchCV, alongside extensive data preprocessing and feature engineering to ensure robust model performance.

airline-fare-prediction data-preprocessing data-visualization feature-engineering feature-selection hyperparameter-tuning machine-learning pandas python random-forest randomizedsearchcv regression-analysis scikit-learn

Last synced: 15 Apr 2026

https://github.com/shahaba83/airplane-ticket-cancellation

This project tries to predict the probability that a buyer will cancel a plane ticket.

datatime numpy pandas python scikit-learn seaborn

Last synced: 25 Feb 2026

https://github.com/asherk7/house-price-prediction

House Prices - Advanced Regression Techniques - Predict sales prices and practice feature engineering, RFs, and gradient boosting

data-science numpy pandas regression scikit-learn

Last synced: 15 Apr 2026

https://github.com/chengetanaim/beatrecommendersystembackend

A system for music producers and rappers/singers. It implements a recommendation feature for music uploaded by producers, using collaborative filtering to recommend songs to users.

fastapi scikit-learn sqlalchemy unsupervised-learning

Last synced: 06 Feb 2026

https://github.com/lau1944/coronavirus-world-prediction

Trend of confirmed coronavirus cases around the world

coronavirus pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/jaypanchal9/fraud-detection-case-study

A comprehensive case study applying machine learning techniques to detect fraudulent transactions effectively.

machine-learning matplotlib numpy pandas python3 scikit-learn seaborn xgboost

Last synced: 15 Apr 2026

https://github.com/bangaji313/recommender-system-movielens

Movie recommendation system project using content-based and collaborative filtering. Submission for the Applied Machine Learning module at Coding Camp 2025.

collaborative-filtering content-based-filtering data-science deep-learning dicoding jupyter-notebook keras movie-recommendation movielens pandas python recommender-system scikit-learn tensorflow

Last synced: 15 Apr 2026

https://github.com/itssahilwhat/ai-fundamentals

A curated collection of fundamental AI concepts, algorithms, and code implementations — including Machine Learning, Deep Learning, and Computer Vision — built from scratch and with practical examples.

computer-vision deep-learning machine-learning numpy pandas python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/tamk-kol/project_orbital_data_analysis

The goal of this project is to develop an automatic method to detect orbital maneuvers using machine learning.

matplotlib numpy pandas scikit-learn

Last synced: 30 Jan 2026

https://github.com/diiblo/la-poste-predictive-flux

Daily prediction of parcel flow in La Poste sorting centers. Complete pipeline: data generation, LightGBM modeling, orchestration via Airflow (Docker), PostgreSQL storage, and an interactive Streamlit dashboard. Project completed during the Mastère 2 Data Engineering program at ECE Paris.

airflow docker postgresql scikit-learn streamlit

Last synced: 31 Jan 2026

https://github.com/gunjangyl/iris-detection

The Iris Detection Project classifies different species of Iris flowers using machine learning techniques. It analyzes four key features—sepal length, sepal width, petal length, and petal width—to predict one of three classes: Setosa, Versicolor, or Virginica. The project uses algorithms like KNN, Decision Trees, or SVM for classification. Model pe

knn-classification matplotlib python scikit-learn seaborn

Last synced: 15 Apr 2026

https://github.com/manu-karenite/medical-insurance-cost-predictor

Medical Insurance Cost Generator is a linear-regression-based predictor that estimates the cost a person has to pay when buying medical insurance.

kaggle-dataset linear-regression machine-learning matplotlib numpy pandas python3 reactjs scikit-learn

Last synced: 15 Apr 2026

https://github.com/emv271828/diabetes_cdc_uci_machine_learning

Second assessment for the Artificial Intelligence course at Universidade Federal Fluminense.

jupyter-notebook machine-learning pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 15 Apr 2026

https://github.com/sarmad426/ai

AI from basic to advanced, featuring Machine Learning, Deep Learning, and Data Science.

ai data-science deep-learning hugging-face machine-learning numpy pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/christiansandovalgarcia01-creator/megaline-plan-classifier

Classification model for recommending the Smart vs. Ultra plan (Megaline). 60/20/20 split, RandomForest as the winning model, test accuracy ≥ 0.75. Includes a confusion matrix and classification report. Stack: Python, Pandas, scikit-learn, Jupyter.

classification data-science jupyter-notebook machine-learning python random-forest scikit-learn telecom

Last synced: 15 Apr 2026

https://github.com/samiyaalizaidi/nn-ml-homeworks

Homework solutions for CPE-4903: Neural Networks & Machine Learning at Kennesaw State University.

machine-learning machine-learning-workflow neural-networks numpy scikit-learn

Last synced: 15 Apr 2026

https://github.com/as1467/canada-per-capita-income-prediction

This project is a simple machine learning exercise to predict Canada's per capita income based on historical data. The dataset was sourced from the CodeBasics GitHub repository and is used here to practice linear regression as part of my machine learning studies.

machine-learning matplotlib-pyplot pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/moustafamohamed01/breast-cancer-prediction

A machine learning model built with PyTorch to predict if a tumor is malignant or benign using the Breast Cancer Dataset. The model uses a neural network to classify the data and shows how to train, evaluate, and visualize results.

ai data-science deep-learning machine-learning neural-network python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/nikitalpopov/evotor_champ

Solution for the Evotor data challenge

data-analysis data-science python scikit-learn

Last synced: 15 Apr 2026

https://github.com/idaraabasiudoh/telco-churn-logistic-regression

A predictive model using logistic regression to identify customers likely to churn from a telecommunications company.

logistic-regression machine-learning python3 scikit-learn

Last synced: 01 Feb 2026

https://github.com/nits2612/data-science-projects

Portfolio of data science projects I completed during PGP AI/ML, for self-learning, and as a hobby.

data data-science dataanalysis deep deep-learning keras machine-learning matplotlib numpy opencv pandas python scikit-learn seaborn surprise-python tensorflow transfer-learning

Last synced: 01 Feb 2026

https://github.com/khanovico/python-stock-analyzer

A web app implemented in Python with several data science frameworks, enabling online stock trend analysis.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 02 Feb 2026

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to forecast sales based on TV, Newspaper, and Online Advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 07 Feb 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 classes in 9 parts. Advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/sharkb8t/credit-risk-classification

Demonstrates my ability to use Jupyter Notebook with scikit-learn to train and evaluate a machine learning model.

jupyter-notebook numpy pandas pathlib python scikit-learn

Last synced: 15 Apr 2026

https://github.com/aerojam95/math70076-data-science-cw2

This repository presents the second coursework for the MATH70076 Data Science module at Imperial College London. The project showcases different machine learning and deep learning models for image classification.

data-science deep-learning machine-learning python3 pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/danicaalana/wine-dataset-decision-tree

This project was developed as part of Digital Skill Fair (DSF) 35.0 - Data Science by Dibimbing. It uses the Wine Recognition Dataset from scikit-learn, which contains the results of a chemical analysis of wines grown in the same region in Italy by three different cultivators.

data data-analysis-python data-science decision-tree-classification machine-learning python scikit-learn wine-dataset

Last synced: 18 Apr 2026

https://github.com/max00358/sign_language_detection

A sign language detector that recognizes the ASL (American Sign Language) alphabet.

mediapipe opencv scikit-learn

Last synced: 09 Feb 2026

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 26 Feb 2026

https://github.com/brossend/automl_bank_project

Automated ML pipeline for the UCI Bank Marketing dataset: ETL, Optuna-based AutoML, model evaluation, MLflow logging, pytest tests, Docker, and CI/CD.

automl bank-marketing binary-classification ci-cd classification data-science docker docker-compose etl github-actions gitlab-ci machine-learning ml-pipeline mlflow model-monitoring optuna pytest python scikit-learn uci-dataset

Last synced: 02 Jun 2026

https://github.com/sachinh123/cognitive-customer-insights-with-watson-ai

This project analyzes customer data to provide insights for personalized services, behavior prediction, and improved support.

flask ibm-cloud ibm-watson-assistant ibm-watson-nlu nltk python scikit-learn

Last synced: 10 Feb 2026

https://github.com/0eix/ibm-ds-spacex-falcon9

Final project notebooks for the IBM Data Science Professional Certificate

data-science data-visualization exploratory-data-analysis ibm poetry scikit-learn shap

Last synced: 11 Feb 2026

https://github.com/nurulashraf/predictive-maintenance-analysis-for-machine-failure-prevention

Predictive maintenance analysis for machine failure prevention using sensor data and ML. Built a Random Forest model and Gradio dashboard to identify high-risk machines for proactive maintenance.

data-science failure-prediction gradio industrial-iot machine-learning power-bi predictive-maintenance python scikit-learn

Last synced: 16 Apr 2026

https://github.com/sabin74/fake_news_detection

This project implements a Fake News Detection system using Python, Natural Language Processing (NLP), and machine learning. It classifies news articles as Real or Fake based on their textual content.

fake-news-detection kaggle-dataset multinomial-naive-bayes passive-aggressive-classifier python3 regex scikit-learn

Last synced: 16 Apr 2026

https://github.com/sanjiv856/machine_learning_scikit-learn

Repository for machine learning in Python using Scikit-learn.

pipelines python scikit-learn sklearn titanic-kaggle titanic-survival-prediction

Last synced: 27 Feb 2026

https://github.com/codedby-mozz/habits_vs_academic_performance

This repository contains a Jupyter Notebook that explores the relationship between student lifestyle habits and academic performance. It demonstrates the process of data loading, exploratory data analysis (EDA), correlation analysis, and the development of a predictive model using linear regression to predict exam scores based on daily habits.

linear-regression python scikit-learn

Last synced: 16 Apr 2026

https://github.com/cego669/dirtycategoriesencoding

Repository containing two classes (StringAgglomerativeEncoder and StringDistanceEncoder) useful for grouping or visualizing the distance between dirty categorical variables. They are compatible with the scikit-learn API.

category clustering dimensionality-reduction dirty hierarchical-clustering machine-learning scikit-learn singular-value-decomposition svd

Last synced: 11 Feb 2026

https://github.com/c2ramel/autonomous-semantic-discovery

An unsupervised machine learning engine that utilizes Non-negative Matrix Factorization (NMF) to autonomously extract and visualize latent semantic topics from the 20 Newsgroups dataset.

data-visualization machine-learning nlp nmf python scikit-learn unsupervised-learning

Last synced: 16 Apr 2026

https://github.com/mindkerchief/baselineml

A collection of machine learning tasks performed during my computer science studies, majoring in intelligent systems.

decision-tree dummy gaussian-mixture-models kmeans-clustering linear-regression logistic-regression machine-learning matplotlib numpy pandas random-forest scikit-learn seaborn tensorflow

Last synced: 16 Apr 2026

https://github.com/selcia25/iris-dataset-classification

☘ This repository contains a Python script for classifying the Iris dataset using the Random Forest algorithm.

data-processing iris-classification pandas random-forest-classifier scikit-learn

Last synced: 16 Apr 2026

https://github.com/arshc0der/n.o.v.a-geospatial-ozone-predictor

An AI-powered geospatial intelligence dashboard for predicting atmospheric ozone levels using 27 years of NASA data. Features 3D climate mapping and live satellite tracking.

atmospheric-science climate-tech dashboard-ui data-visualization desktop-app geospatial-analysis gis machine-learning matplotlib ozone-prediction pandas python random-forest-regressor satellite-tracking scikit-learn tkinter windows-executable

Last synced: 01 Mar 2026

https://github.com/s0fft/airline-passenger-satisfaction

Airline-Customer-Model — Machine Learning Project on: Scikit-learn / Pandas / Matplotlib / Seaborn

jupyter-notebook mashine-learning matplotlib pandas python3 scikit-learn seaborn

Last synced: 12 Feb 2026

https://github.com/zsailer/skspline

A scikit-learn interface to SciPy's spline.

scikit-learn scipy

Last synced: 16 Apr 2026

https://github.com/sergeimakarovv/energy-data-analytics-ml

Analyzing global data on sustainable energy, predicting CO2 emissions per capita

machine-learning pandas plotly python scikit-learn streamlit

Last synced: 12 Feb 2026

https://github.com/manjit-baishya-datascience/spam-email-detection

This project demonstrates how to build a spam detection system using Natural Language Processing (NLP) and machine learning techniques.

imblearn nlp nlp-machine-learning nltk scikit-learn spam-detection

Last synced: 12 Feb 2026

https://github.com/gliuck/diabetesprediction

Machine learning exam project, focused on predicting diabetes based on health and demographic data. The project uses models like Logistic Regression, KNN, SVM and NN to analyze and predict the likelihood of diabetes in individuals.

machine-learning machine-learning-models numpy-library pandas-library prediction-model python scikit-learn

Last synced: 14 Feb 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/mattia-hulathduwage/wine-quality-analyzer

A machine learning project that analyzes wine quality using clustering, regression, and classification techniques. The model predicts wine quality scores based on chemical properties and determines the most influential features affecting quality.

machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 16 Apr 2026

https://github.com/hlexnc/project-arepo

Data-driven stroke risk assessment & personalized recommendations, powered by machine-learning and an NLU-driven chatbot.

chatbot data-analysis docker docker-compose machine-learning nlu-chatbot python rasa scikit-learn sklearn streamlit

Last synced: 15 Feb 2026

https://github.com/smuralee/machine-learning-samples

Machine learning samples

pytorch scikit-learn

Last synced: 15 Feb 2026

https://github.com/mgesteban/analyzing_car_prices

A comprehensive data science project analyzing factors that drive used car prices to provide actionable insights for used car dealerships.

crisp-dm data-science lasso-regression linear-regression machine-learning one-hot-encoding pandas ridge-regression scikit-learn

Last synced: 15 Feb 2026

https://github.com/quran-yeamen/serverlifecycleml

Predictive modeling of server lifecycle stages using synthetic data and machine learning.

data-science machine-learning predictive-modeling python scikit-learn synthetic-data

Last synced: 15 Feb 2026

https://github.com/paultheal1en/dsc-fact-checking

Fact-checking project classifying claims as SUPPORTED, REFUTED, or NEI. Uses ANN, DNN, RNN, CNN, Random Forest, PhoBERT, and Sentence Transformers.

deep-learning fact-checking keras machine-learning nlp phobert random-forest scikit-learn sentence-transformers tensorflow transformers

Last synced: 16 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-combined-cycle-power-plant-

In this project, multiple machine learning models, including Linear Regression, Decision Tree Regression, and Random Forest Regression, were implemented to predict the target variable and evaluated using metrics such as RMSE, MAE, and R-squared. The performance of these models was compared, and the Random Forest Regressor was found.

data-processing decisiontreeregressor linear-regression metrics-evaluation python random-forest-regressor scikit-learn

Last synced: 16 Apr 2026

https://github.com/hafidaso/predicting-industrial-machine-downtime-level-3

This project aims to develop a predictive model using machine learning techniques to forecast machine failures based on historical operational data.

imbalanced-learning numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/sasanka14/water_quality_predictions

Water Quality Prediction - College Project 🌊💧 Predicts water potability (safe/unsafe) using ML models like XGBoost & Random Forest. Features data preprocessing, feature importance, model evaluation, and visualizations. Built with Python, Pandas, Scikit-learn & Seaborn for analysis. 🚀

anaconda jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/lorenzorottigni/ml-advertising

Machine Learning Python bootcamp: logistic regression on an advertising dataset

ipynb logistic-regression machine-learning numpy pandas python scikit-learn seaborn

Last synced: 16 Apr 2026

https://github.com/pramodyasahan/health-insurance-cost-prediction

This project focuses on predicting health insurance costs using a polynomial regression model. By employing machine learning techniques in Python, the project aims to accurately estimate insurance costs based on various personal attributes. The model takes into account several features including age, sex, BMI, number of children, smoking status etc

machine-learning matplotlib numpy pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/piotrwnuczek/cloudprediction

Predicting cloud task execution time using AI/ML

matplotlib pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/silky-x0/spam-detector

A machine learning algorithm to detect spam emails and the like.

jupyter-notebook nltk-python pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/sergeimakarovv/solar-panel-detection

Applying deep learning models to detect solar panel installations in satellite imagery and estimating their generation capacity

albumentations convolutional-neural-networks deep-learning geopandas pandas pvlib python pytorch rasterio scikit-learn wms-service

Last synced: 16 Apr 2026