An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/krish57-bit/diabetes-prediction-

A comprehensive machine learning pipeline to predict the onset of diabetes using the PIMA Indian Diabetes dataset. This includes data cleaning, visualization, outlier detection, standardization, SMOTE-based imbalance handling, and multiple classification algorithms (Logistic Regression, Naive Bayes, and KNN).

classification data-science diabetes healthcare jupyter-notebook machine-learning python scikit-learn smote

Last synced: 07 May 2026

https://github.com/itsdawei/qsc-airplane

A regression model predicting airline stock prices based on public flight data.

regression-analysis scikit-learn statsmodels stock-price-prediction

Last synced: 17 May 2026

https://github.com/balavenkatesh3322/loan-default-prediction

An end-to-end machine learning project to predict loan default risk. Includes Exploratory Data Analysis (EDA), feature engineering, a Gradient Boosting model, and a proposed system architecture for deployment.

data-science deep-learning feature-engineering gradient-boosting loan-default-prediction machine-learning scikit-learn tutorial-exercises

Last synced: 17 May 2026

https://github.com/myself-aas/predict-influence-of-social-media-and-other-socio-demographic-factors-on-study-duration

'The Study Duration Prediction Web App' uses machine learning to predict student study time based on factors like GPA, family background, social media engagement, and personal influences. Built with Flask and scikit-learn, it offers personalized insights into how lifestyle choices affect academic performance and study habits.

flask-application machine-learning machine-learning-algorithms prediction-model python scikit-learn scikitlearn-machine-learning webapp

Last synced: 23 Jul 2025

https://github.com/tabotcharlesbessong/python-errors

This repository will contain all the Python errors I encounter in my life as a Python developer, plus their solutions.

matplotlib-animation matplotlib-pyplot numpy pandas python-script python3 scikit-learn seaborn

Last synced: 16 Apr 2026

https://github.com/anandparayil/sign-language-translator

Real-Time AI-Based Sign Language Translator using MediaPipe, Random Forest, and Tkinter GUI.

jupyter-notebook mediapipe opencv python pyttsx3 scikit-learn tkinter-gui

Last synced: 07 Apr 2026

https://github.com/Gamowy/Music-Classification

Music genre classification using a k-nearest neighbors classifier on the GTZAN dataset.

machinelearning python scikit-learn university-assignment

Last synced: 17 Jul 2025

https://github.com/tomas542/dl_examples

Examples of Machine Learning, Deep Learning, Natural Language Processing and so on

computer-vision cv deep-learning dl keras machine-learning ml natural-language-processing nlp numpy python pytorch scikit-learn

Last synced: 08 Apr 2026

https://github.com/notshrirang/m2connex

M2ConneX is an all-encompassing platform specifically crafted for MMCOE alumni, enabling seamless communication, networking, and collaboration. It provides tailored recommendations for connections, posts, and job opportunities based on each user's unique skills and experience.

django django-rest-framework scikit-learn

Last synced: 28 Jun 2025

https://github.com/colinwu0403/weatherpredictor

ML model that predicts future weather temperatures. Dataset taken from NOAA's Climate Data Online

pandas scikit-learn

Last synced: 02 May 2026

https://github.com/gmork2/covid-19

A mathematical analysis of the infection growth

coronavirus covid-19 jupyter-notebook numpy pandas python scikit-learn

Last synced: 08 Apr 2026

https://github.com/beolawork-art/novabank-churn-analysis

NovaBank has noticed that customers are closing accounts or going inactive, and they want to understand why.

data-analysis data-science-projects data-visualization eda machine-learning numpy pandas python scikit-learn sql

Last synced: 08 Apr 2026

https://github.com/lorenzorottigni/ml-lending-club

Machine learning Python bootcamp: random forest classifier on the LendingClub dataset

ipynb machine-learning numpy pandas python random-forest-classifier scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/kristishqau/sentimentanalysis_nlp

A project for sentiment analysis of tweets using various NLP techniques and machine learning models.

datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost

Last synced: 01 May 2026

https://github.com/shahbazshaddy/explainable-multimodal-ai-for-breast-cancer-and-pneumonia-prediction

A deep learning-based framework integrating explainable multimodal AI for accurate prediction and transparent diagnosis of breast cancer and pneumonia.

deep-learning explainable-ai grad-cam groq-api llm machine-learning matplotlib multimodal numpy pandas python pytorch scikit-learn seaborn streamlit

Last synced: 08 Apr 2026

https://github.com/vladstudennikov/diabetes-prediction-app

ML-powered web app built with Laravel and Vue.js to predict diabetes risk based on users' daily habits and behavior

cypress data-analysis diabetes-prediction fastapi inertiajs laravel matplotlib medicine ml pandas php scikit-learn seaborn vuejs

Last synced: 08 Apr 2026

https://github.com/ashrw/handwritten_digit_recognizer

A handwritten digit recognition system using Python and Scikit-learn to preprocess images and classify digits with a trained SVM model.

ml python scikit-learn

Last synced: 03 Jan 2026

https://github.com/jhylin/ml1-1_small_mols_in_chembl

Polars dataframe library and logistic regression in scikit-learn (update)

logistic-regression machine-learning parquet-files polars-dataframe scikit-learn

Last synced: 03 Jan 2026

https://github.com/andrewsy1004/mask-detection

Mask detection system capable of identifying individuals with or without masks

kaggle keras python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/hariprasath-v/machinehack_analytics_olympiad_2023

Create a machine learning model to determine the likelihood of a customer defaulting on a loan based on credit history, payment behavior, and account details.

binaryclassification catboost exploratory-data-analysis machine-learning numpy pandas python scikit-learn shap

Last synced: 08 Apr 2026

https://github.com/ismaelvr1999/air-quality-clustering

This project focuses on analyzing air quality data and categorizing it into clusters using the K-Means algorithm.

jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 05 Mar 2026

https://github.com/luona-zhang/kaggle-data-science-competitions

This repository contains code developed for participating in Kaggle Data Science competitions.

fitting-algorithm machine-learning model-evaluation numpy pandas scikit-learn seaborn tensorflow

Last synced: 07 Apr 2026

https://github.com/priyanshulathi/cancer-diagnosis-prediction-model

A Machine Learning project to predict cancer malignancy using K-Nearest Neighbor, Support Vector Machine, and Decision Tree algorithms.

machine-learning numpy pandas python scikit-learn

Last synced: 03 Jan 2026

https://github.com/rinuya/ml-cancer-diagnosis

Binary classification using MLP & Random Forest

ml mlp random-forest scikit-learn

Last synced: 03 Jan 2026

https://github.com/barraharrison/seoul-bike-sharing

Performing EDA on a Kaggle dataset to look at the distribution of Seoul's bike-sharing system

jupyterlab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 23 Jul 2025

https://github.com/barbarahayd/com410-ml

Machine learning class activities

decision-tree scikit-learn

Last synced: 01 May 2026

https://github.com/ledsouza/deep-learning-noticias

This project aims to build two machine learning models: one to classify news articles into different categories and another to perform text autocomplete by predicting the next word in a sentence. The provided dataset consists of articles from a news website, already preprocessed and stored in a CSV file.

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 08 Mar 2026

https://github.com/alphacrypto246/insurance-charges-prediction

The Predicting Insurance Charges project uses Decision Tree Regression to predict insurance charges based on features like age, sex, BMI, and smoking habits. It involves data preprocessing, feature scaling, and model evaluation with metrics like MAE and R².

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 03 May 2026

https://github.com/hazim-hf/machine-learning

This repository contains materials and implementations related to machine learning concepts, techniques, and algorithms. The focus is on building self-learning computer systems that improve through experience and data. The course explores fundamental and advanced topics in machine learning, with applications in Big Data across various fields.

decision-trees neural-network pytorch reinforcement-learning scikit-learn support-vector-machine tensorflow unsupervised-learning

Last synced: 07 Apr 2026

https://github.com/radoslawregula/geo-music-classification

Jupyter notebook implementing a classification solution to the geographical origins of music problem.

classification jupyter-notebook machine-learning pandas python random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/tejaswirupa/early-prediction-of-diabetes-risk-using-machine-learning

Built a predictive model using CDC health data to identify individuals at risk of developing diabetes. Achieved 90.6% F1-score using Logistic Regression and revealed key health indicators like BMI and blood pressure as top predictors.

data-science datacleaning exploratory-data-analysis modelevaluation preprocessing-data python scikit-learn supervised-machine-learning

Last synced: 15 Jul 2025

https://github.com/shakeel-data/amazon-sales-forecasting-python-bigquery-ml

An end-to-end analytics project using Python, SQL, & ML to forecast Amazon sales and segment customers. We build predictive models (LightGBM, Prophet) and clustering (KMeans) to deliver actionable insights for revenue growth and targeted marketing.

bigquery kmeans-clustring lightgbm linear-regression prophet-facebook scikit-learn

Last synced: 09 May 2026

https://github.com/eljandoubi/disasterresponsepipeline

The project aims to build a Natural Language Processing (NLP) model that categorizes messages in real time.

flask nltk numpy pandas plotly scikit-learn scipy sqlalchemy

Last synced: 09 Apr 2026

https://github.com/armahdavi/data_pipeline_analytics_statistics_ML_PM_PSD_residential_QFF

Data pipelines, processing code, statistical models, descriptive statistics, plot visualizations, and machine learning from Mahdavi & Siegel (2021) (Indoor Air). Project milestone: 2017-2020. Full-length article: https://onlinelibrary.wiley.com/doi/abs/10.1111/ina.12782

data-science data-visualization dust hvac indoor-air-quality jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats spyder spyder-python-ide statistics

Last synced: 17 Sep 2025

https://github.com/anthippi/naive-bayes-imdb-classification

A custom Naive Bayes classifier for sentiment analysis of movie reviews from the IMDb dataset, utilizing feature selection based on Information Gain and comparing its performance with scikit-learn's BernoulliNB.

classification imdb matplotlib naive-bayes-classifier numpy pandas scikit-learn sklearn

Last synced: 09 Apr 2026

https://github.com/riyajain255/customer-segmentation-for-e-commerce

This project analyzes online retail data to segment customers using K-Means clustering and build classification models to predict those segments based on purchasing behavior.

customer-segmentation data-analysis kmeans-clustering logistic-regression machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn-plots

Last synced: 02 Apr 2026

https://github.com/rohanbanerjee1234567-cell/prediction-of-expected-salary-using-machine-learning

My first project repository, containing a machine learning project written in Python. The problem statement was to train a model on the given dataset and use it to predict the expected salary of employees with similar profiles.

exploratory-data-analysis linearregression matplotlib-pyplot numpy pandas randomforest randomforestregressor scikit-learn scikitlearn-machine-learning searborn visualization

Last synced: 27 Apr 2026

https://github.com/rajan-bhateja/machine-learning-with-python

Machine learning algorithms implemented using Scikit-learn

classification clustering machine-learning regression scikit-learn sklearn

Last synced: 17 May 2026

https://github.com/sabbadini10/job4you

Job4You is an AI-powered job application assistant that streamlines the entire application process. Built on Angular and Firebase with GPT-4 integration.

angular api ats-optimization cover-letter email-automation firebase jobforall openai-api python resume-builder scikit-learn sheraz sherazhussain sherazhussain546

Last synced: 04 Mar 2026

https://github.com/praatibhsurana/breast-cancer-prediction-svm

An SVM classifier coded in Python using Scikit-Learn to classify whether a patient's tumor is malignant or benign.

kaggle-dataset linear-classifier machine-learning-algorithms python scikit-learn svm-classifier

Last synced: 16 May 2026

https://github.com/pramodyasahan/learn-ml

This repository serves as both a personal learning diary and a resource for others interested in understanding and applying machine learning concepts. The projects are categorized based on the type of ML model and are implemented in Python using libraries like scikit-learn, pandas, and numpy.

classification clustering machine-learning matplotlib numpy pandas regression scikit-learn supervised-learning unsupervised-learning

Last synced: 07 Apr 2026

https://github.com/jofaval/ionosphere

Binary Classification of Ionosphere signals at Goose Bay, Labrador in 1988

data-analysis data-science data-visualization deep-learning google-colab keras machine-learning python scikit-learn tensorflow uci xgboost

Last synced: 09 Apr 2026

https://github.com/jain1shh/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 09 Apr 2026

https://github.com/dhanraj-parigi/diabetes_prediction_app

🩺 A simple and interactive web app that predicts diabetes using 🧠 machine learning. 🚀 Built with Python, Streamlit, and the 🧮 Pima Indians Diabetes dataset.

ai-in-healthcare classification data-science diabetes-prediction health-check healthcare-ai jupyter-notebook machine-learning ml-project pandas python random-forest scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/mirzaazwad/tymbert

TYMBert is our submission for NCIM 2025, a spam classifier that makes use of knowledge distillation to compress the model while preserving accuracy

bert huggingface-transformers knowledge-distillation machine-learning matplotlib numpy pandas python3 scikit-learn tiny-bert torch

Last synced: 09 Apr 2026

https://github.com/shafaq-aslam/predicting-heart-disease-risk-with-logistic-regression-techniques

Develop a predictive model using logistic regression techniques to assess heart disease risk based on patient health metrics and data analysis.

data-analysis heart-disease logistic-regression machine-learning machine-learning-models matplotlib numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/stefagnone/text_adventure_game

A text-based adventure game project using Python fundamentals

matplotlib numpy pandas python r scikit-learn seaborn sql

Last synced: 09 Apr 2026

https://github.com/shauryashaurya/marty_mcfly

Code, text, and notebooks for an Introduction to Machine Learning tutorial using open-source tools

anaconda jupyter-notebooks machine-learning machine-learning-tutorials notebook numpy python regression scikit-learn scipy tutorial

Last synced: 09 Apr 2026

https://github.com/aneeshmurali-n/ann-diabetes-prediction

Predicting diabetes progression using an Artificial Neural Network (ANN). This project leverages the scikit-learn diabetes dataset for training and evaluation. Includes data preprocessing, model building, and performance visualization.

ann data-preprocessing data-visualization deep-learning diabetes-prediction exploratory-data-analysis keras machine-learning matplotlib neural-network numpy pandas regression scikit-learn seaborn tensorflow visualization

Last synced: 07 Apr 2026

https://github.com/abdiasarsene/developpement_tableau_de_bord_de_la_chaine_approvisionnement_power_bi

Develop a complete solution to visualize, analyze, and predict supply chain data.

ci-cd docker fastapi github-actions mysql-database randomizedsearchcv scikit-learn seaborn-plots

Last synced: 23 Jun 2025

https://github.com/rajan-bhateja/Machine-Learning-with-Python

ML/DL projects done using sklearn and TensorFlow

machine-learning scikit-learn sklearn

Last synced: 28 Jul 2025

https://github.com/ajxxxs/spotify-music-analysis

Spotify music analysis (web-scraped playlists across 3 states): trends, features, and a music recommendation system.

matplotlib numpy panda scikit-learn seaborn

Last synced: 28 Jul 2025

https://github.com/anuranjanjain/cardioguide

A project that I created for DSN 2 at VIT. As its name suggests, it helps you check for any abnormalities with your heart by giving a "Heart Risk Assessment".

chartjs chatbot flask-application mlmodel pandas pickle python rest-api scikit-learn

Last synced: 20 Jan 2026

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/rafay-imraan/email-spam-filtering

Machine learning models that filter spam emails from a dataset downloaded from kaggle.com.

machine-learning ml pandas python scikit-learn xgboost

Last synced: 20 Jan 2026

https://github.com/dmarks84/coursework_project_ml-classifier-eval-selection

Project for University of Michigan Applied Data Science Specialization -- Predicted viewer engagement based on features related to video metrics; evaluated a large set of classifiers under different scoring metrics to select the "optimal" one.

classification cross-validation data-modeling data-reporting data-visualization databases dataframes eda grid-search matplotlib numpy pandas python scikit-learn statistics supervised-ml

Last synced: 02 Apr 2026

https://github.com/iamjuniorb/d499-supervised-learning

This machine learning class presents the end-to-end process of investigating data through a machine learning lens.

machine-learning project python python3 scikit-learn scikit-learn-python scikitlearn-machine-learning supervised-learning supervised-machine-learning

Last synced: 16 May 2026

https://github.com/kavyachouhan/fake-news-detection-dravidian-language

This repository contains the code and resources for a machine learning project focused on detecting fake news in the Malayalam language, developed as part of the IITM-PAN BS AI-ML Challenge.

jupyter-notebook machine-learning numy pandas python scikit-learn

Last synced: 08 Feb 2026

https://github.com/antrita/stroke_prediction_model

A model that combines Kaggle's Stroke Prediction Dataset with live weather and air quality data to implement an FDA-compliant MLOps pipeline, with a focus on healthcare regulations and real-time inference.

ai data-analysis deep-learning kaggle-dataset machine-learning prediction-model random-forest real-time scikit-learn streamlit weather-api xgboost

Last synced: 07 May 2026

https://github.com/nathadriele/transaction_fraud_prevention_pipeline

A solution for detecting and preventing fraud in financial transactions, combining machine learning, business rules, and advanced statistical analysis. The system offers an interactive dashboard for real-time monitoring, data analysis, and fraud alert management.

data-analysis data-visualization docker fraud-prevention machine-learning matplotlib numpy pandas pipeline pytest python scikit-learn scipy seaborn streamlit tensorflow transaction xgboost

Last synced: 10 Apr 2026

https://github.com/akapich/clustermatic

Python AutoML library for clustering tasks

automl clustering machine-learning scikit-learn

Last synced: 11 Feb 2026

https://github.com/fikri-rouzan/energy-consumption-prediction

Final Project for the AI/ML Weekly Class by Google Developer Group on Campus (GDGoC) UIN Jakarta.

jupyter-notebook matplotlib numpy pandas python scikit-learn scipy seaborn

Last synced: 07 Apr 2026

https://github.com/080bct12alex/nepalestate

A real estate price prediction web app using machine learning, Next.js and Flask

flask-api mlp-regresor nextjs scikit-learn

Last synced: 31 Jul 2025

https://github.com/lanhhoang/toronto-bicycle-thefts-classifier

A predictive service that uses Toronto Police Open Data to classify whether a bike is likely to be returned or not

clustering decision-trees flask logistic-regression machine-learning python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/bgmp/svm

Support Vector Machine implementation written in Python

iris-dataset scikit-learn svm

Last synced: 31 Jul 2025

https://github.com/muhdhammad/machine-learning

Crafted for hands-on learning and implementation of ML with scikit-learn

data-science jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/manshreet27/mrs

This Movie Recommendation System is a web-based application built using Python and Streamlit, designed to provide movie recommendations based on user preferences. It utilizes TMDb API for fetching real-time movie details and Kaggle's TMDB 5000 Movies dataset for content-based filtering.

numpy pandas python scikit-learn streamlit tmdb-5000-movies-dataset-from-kaggle tmdb-api-for-fetching-real-time-movie-data

Last synced: 07 Apr 2026

https://github.com/vigneshvaranasi/breast_cancer_detection

This project employs machine learning, focusing on Logistic Regression, to detect breast cancer using tumor-related features. The dataset is preprocessed, and the model achieves 100% accuracy on the test set. The goal is to gain insights into breast cancer factors and provide an effective detection solution.

jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-bankruptcy-prevention-

The project explores multiple machine learning algorithms and evaluates their performance using various metrics, such as accuracy and confusion matrices. The models tested include Logistic Regression, K-Nearest Neighbors (KNN), Naive Bayes, and Support Vector Machine (SVM). In addition, regularization techniques (L1, L2) are used to avoid overfitting.

data-preprocessing evaluation machine-learning-models matplotlib-pyplot modelbuilding modeldeployment numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/presizhai/iris-predictor-fastapi

A web application for predicting the species of Iris flowers using a machine learning model trained on the Iris dataset, built with FastAPI, a modern web framework for building APIs.

essemblelearning fastapi python random-forest-classifier scikit-learn uvicorn

Last synced: 25 Dec 2025

https://github.com/oroszgy/cookiecutter-ml-flask

Cookiecutter template for training and serving machine learning models with scikit-learn, spaCy, Flask, and Docker

docker flask flask-application machine-learning nlp rest-api scikit-learn spacy

Last synced: 09 Apr 2026

https://github.com/kiapanahi/handson-machine-learning-book-playground

Sample code and exercises based on the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"

machine-learning python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/mtlh/fyp_prempredict

In PremPredict, players predict all Premier League games, competing against the algorithm and other users across a full season and scoring points for every correct result or prediction.

django prediction premierleague python scikit-learn tailwindcss

Last synced: 09 Apr 2026

https://github.com/alexgoodison/boxbox

F1 Race Visualiser & Overtake Prediction Model 🏎️

fastapi keras nextjs scikit-learn

Last synced: 09 Apr 2026

https://github.com/mrmalik2512/catsvsdog.github.io

A CNN model integrated with a Flask backend. The model is trained on image data of dogs and cats and is integrated with a website that predicts whether a given image shows a dog or a cat.

deep-learning numpy python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/lefteris-souflas/modern-slavery-analysis

Jupyter notebook using machine learning techniques to explore the complex drivers of modern slavery. Models from a research paper are replicated and evaluated. The work also includes filling in missing data, training regression models, and analyzing feature importance.

decision-tree feature-importance grid-search-cv imputation jupyter-notebook lasso-regression linear-regression matplotlib mean-absolute-error numpy pandas preprocessing principal-component-analysis python3 random-forest ridge-regression scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/sabin74/spam_mail_detection

A machine learning project to classify SMS messages as Spam or Ham (Not Spam) using Natural Language Processing (NLP) techniques and Scikit-learn. This binary classification task uses the UCI SMS Spam Collection Dataset and implements various models including Naive Bayes, SVM, and Logistic Regression with performance tuning.

gridsearchcv nltk python scikit-learn smote sms-spam-detection uci-machine-learning

Last synced: 04 May 2026