An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/daniel-furman/recfeatureselect

Feature selection functions that (1) use the multicollinearity matrix, recursively removing features until a Spearman correlation threshold is reached, and (2) run Forward Stepwise Selection on an ensemble scikit-learn model (with options for hyperparameter optimization, HPO).

correlation-threshold machine-learning modeling multicollinearity recursion recursive-algorithm scikit-learn spearman-rho

Last synced: 03 Jan 2026
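
The correlation-threshold idea in the recfeatureselect entry above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `drop_correlated`, the tie-breaking rule (drop the member of the pair with the higher mean correlation to all features), and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def drop_correlated(df, threshold=0.8):
    """Return the columns kept after iteratively pruning correlated pairs.

    Hypothetical helper: repeatedly finds the most correlated pair
    (absolute Spearman rho) and drops one member until no pair
    exceeds the threshold.
    """
    kept = list(df.columns)
    while len(kept) > 1:
        corr = df[kept].corr(method="spearman").abs()
        values = corr.to_numpy().copy()
        np.fill_diagonal(values, 0.0)  # ignore self-correlation
        if values.max() <= threshold:
            break
        i, j = np.unravel_index(values.argmax(), values.shape)
        # Assumed rule: drop the feature more correlated with the rest.
        drop = kept[i] if values[i].mean() >= values[j].mean() else kept[j]
        kept.remove(drop)
    return kept


# Synthetic data: "a_copy" is a near-duplicate of "a"; "b" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
frame = pd.DataFrame(
    {"a": a, "a_copy": 2 * a + 0.01 * rng.normal(size=200), "b": b}
)
selected = drop_correlated(frame, threshold=0.8)
print(selected)
```

Only one of the two near-duplicate columns should survive, while the independent column is kept.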

https://github.com/myself-aas/predict-influence-of-social-media-and-other-socio-demographic-factors-on-study-duration

'The Study Duration Prediction Web App' uses machine learning to predict student study time based on factors like GPA, family background, social media engagement, and personal influences. Built with Flask and scikit-learn, it offers personalized insights into how lifestyle choices affect academic performance and study habits.

flask-application machine-learning machine-learning-algorithms prediction-model python scikit-learn scikitlearn-machine-learning webapp

Last synced: 23 Jul 2025

https://github.com/tabotcharlesbessong/python-errors

This repository will contain all the Python errors I encounter in my life as a Python programmer, plus their solutions

matplotlib-animation matplotlib-pyplot numpy pandas python-script python3 scikit-learn seaborn

Last synced: 16 Apr 2026

https://github.com/tomas542/dl_examples

Examples of Machine Learning, Deep Learning, Natural Language Processing and so on

computer-vision cv deep-learning dl keras machine-learning ml natural-language-processing nlp numpy python pytorch scikit-learn

Last synced: 08 Apr 2026

https://github.com/notshrirang/m2connex

M2ConneX is an all-encompassing platform specifically crafted for MMCOE alumni, enabling seamless communication, networking, and collaboration. It provides tailored recommendations for connections, posts, and job opportunities based on each user's unique skills and experience.

django django-rest-framework scikit-learn

Last synced: 28 Jun 2025

https://github.com/colinwu0403/weatherpredictor

ML model that predicts future weather temperatures. Dataset taken from NOAA's Climate Data Online

pandas scikit-learn

Last synced: 02 May 2026

https://github.com/gmork2/covid-19

A mathematical analysis of the infection growth

coronavirus covid-19 jupyter-notebook numpy pandas python scikit-learn

Last synced: 08 Apr 2026

https://github.com/beolawork-art/novabank-churn-analysis

NovaBank has noticed that customers are closing accounts or going inactive, and they want to understand why.

data-analysis data-science-projects data-visualization eda machine-learning numpy pandas python scikit-learn sql

Last synced: 08 Apr 2026

https://github.com/lorenzorottigni/ml-lending-club

Machine Learning python bootcamp: random forest classifier on LendingClub dataset

ipynb machine-learning numpy pandas python random-forest-classifier scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/shahbazshaddy/explainable-multimodal-ai-for-breast-cancer-and-pneumonia-prediction

A deep learning-based framework integrating explainable multimodal AI for accurate prediction and transparent diagnosis of breast cancer and pneumonia.

deep-learning explainable-ai grad-cam groq-api llm machine-learning matplotlib multimodal numpy pandas python pytorch scikit-learn seaborn streamlit

Last synced: 08 Apr 2026

https://github.com/vladstudennikov/diabetes-prediction-app

ML-powered web app built with Laravel and Vue.js to predict diabetes risk based on users' daily habits and behavior

cypress data-analysis diabetes-prediction fastapi inertiajs laravel matplotlib medicine ml pandas php scikit-learn seaborn vuejs

Last synced: 08 Apr 2026

https://github.com/ashrw/handwritten_digit_recognizer

A handwritten digit recognition system using Python and Scikit-learn to preprocess images and classify digits with a trained SVM model.

ml python scikit-learn

Last synced: 03 Jan 2026

https://github.com/jhylin/ml1-1_small_mols_in_chembl

Polars dataframe library and logistic regression in scikit-learn (update)

logistic-regression machine-learning parquet-files polars-dataframe scikit-learn

Last synced: 03 Jan 2026

https://github.com/andrewsy1004/mask-detection

Mask detection system capable of identifying individuals with or without masks

kaggle keras python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/hariprasath-v/machinehack_analytics_olympiad_2023

Create a machine learning model to determine the likelihood of a customer defaulting on a loan based on credit history, payment behavior, and account details.

binaryclassification catboost exploratory-data-analysis machine-learning numpy pandas python scikit-learn shap

Last synced: 08 Apr 2026

https://github.com/priyanshulathi/cancer-diagnosis-prediction-model

A Machine Learning project to predict cancer malignancy using K-Nearest Neighbor, Support Vector Machine, and Decision Tree algorithms.

machine-learning numpy pandas python scikit-learn

Last synced: 03 Jan 2026

https://github.com/rinuya/ml-cancer-diagnosis

Binary classification using MLP & Random Forest

ml mlp random-forest scikit-learn

Last synced: 03 Jan 2026

https://github.com/barraharrison/seoul-bike-sharing

Performing EDA on a Kaggle dataset to look at the distribution of Seoul's bike-sharing system

jupyterlab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 23 Jul 2025

https://github.com/ledsouza/deep-learning-noticias

This project aims to build two machine learning models: one to classify news articles into different categories and another to perform text autocomplete by predicting the next word in a sentence. The provided dataset consists of articles from a news site, already preprocessed and stored in a CSV file.

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 08 Mar 2026

https://github.com/alphacrypto246/insurance-charges-prediction

The Predicting Insurance Charges project uses Decision Tree Regression to predict insurance charges based on features like age, sex, BMI, and smoking habits. It involves data preprocessing, feature scaling, and model evaluation with metrics like MAE and R².

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 03 May 2026
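
As a rough illustration of the workflow in the insurance-charges-prediction entry above (decision tree regression evaluated with MAE and R²), here is a minimal sketch. The data is synthetic and the relationship between features and charges is invented purely for the example; it is not the repository's dataset or code.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
n = 500
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 5, size=n)
smoker = rng.integers(0, 2, size=n)
# Synthetic target: a made-up relationship, for illustration only.
charges = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 1000, size=n)

X = np.column_stack([age, bmi, smoker])
X_train, X_test, y_train, y_test = train_test_split(
    X, charges, test_size=0.2, random_state=0
)

# A shallow tree limits overfitting on this small sample.
model = DecisionTreeRegressor(max_depth=5, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)  # average absolute error
r2 = r2_score(y_test, pred)              # share of variance explained
print(f"MAE: {mae:.1f}  R^2: {r2:.3f}")
```

Tree-based models split on thresholds, so feature scaling is left out of this sketch; it would matter more for distance-based or linear models.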

https://github.com/radoslawregula/geo-music-classification

Jupyter notebook implementing a classification solution to the geographical origins of music problem.

classification jupyter-notebook machine-learning pandas python random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/eljandoubi/disasterresponsepipeline

The project aims to build a Natural Language Processing (NLP) model that categorizes messages in real time.

flask nltk numpy pandas plotly scikit-learn scipy sqlalchemy

Last synced: 09 Apr 2026

https://github.com/armahdavi/data_pipeline_analytics_statistics_ML_PM_PSD_residential_QFF

Data pipelines, processing code, statistical modeling, descriptive statistics, plot visualizations, and machine learning from Mahdavi & Siegel (2021) (Indoor Air). Project milestone: 2017-2020. Full-length article: https://onlinelibrary.wiley.com/doi/abs/10.1111/ina.12782

data-science data-visualization dust hvac indoor-air-quality jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats spyder spyder-python-ide statistics

Last synced: 17 Sep 2025

https://github.com/riyajain255/customer-segmentation-for-e-commerce

This project analyzes online retail data to segment customers using K-Means clustering and build classification models to predict those segments based on purchasing behavior.

customer-segmentation data-analysis kmeans-clustering logistic-regression machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn-plots

Last synced: 02 Apr 2026

https://github.com/rohanbanerjee1234567-cell/prediction-of-expected-salary-using-machine-learning

My first project repository, containing a machine learning project written in Python. The problem statement was to train a model on the given dataset and use it to predict the expected salary of employees with similar profiles.

exploratory-data-analysis linearregression matplotlib-pyplot numpy pandas randomforest randomforestregressor scikit-learn scikitlearn-machine-learning searborn visualization

Last synced: 27 Apr 2026

https://github.com/rajan-bhateja/machine-learning-with-python

Machine learning algorithms implemented using Scikit-learn

classification clustering machine-learning regression scikit-learn sklearn

Last synced: 17 May 2026

https://github.com/sabbadini10/job4you

Job4You is an AI-powered job application assistant that streamlines the entire application process. Built on Angular and Firebase with GPT-4 integration.

angular api ats-optimization cover-letter email-automation firebase jobforall openai-api python resume-builder scikit-learn sheraz sherazhussain sherazhussain546

Last synced: 04 Mar 2026

https://github.com/jofaval/ionosphere

Binary Classification of Ionosphere signals at Goose Bay, Labrador in 1988

data-analysis data-science data-visualization deep-learning google-colab keras machine-learning python scikit-learn tensorflow uci xgboost

Last synced: 09 Apr 2026

https://github.com/jain1shh/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 09 Apr 2026

https://github.com/dhanraj-parigi/diabetes_prediction_app

🩺 A simple and interactive web app that predicts diabetes using 🧠 machine learning. 🚀 Built with Python, Streamlit, and the 🧮 Pima Indians Diabetes dataset.

ai-in-healthcare classification data-science diabetes-prediction health-check healthcare-ai jupyter-notebook machine-learning ml-project pandas python random-forest scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/mirzaazwad/tymbert

TYMBert is our submission for NCIM 2025, a spam classifier that makes use of knowledge distillation to compress the model while preserving accuracy

bert huggingface-transformers knowledge-distillation machine-learning matplotlib numpy pandas python3 scikit-learn tiny-bert torch

Last synced: 09 Apr 2026

https://github.com/shafaq-aslam/predicting-heart-disease-risk-with-logistic-regression-techniques

Develop a predictive model using logistic regression techniques to assess heart disease risk based on patient health metrics and data analysis.

data-analysis heart-disease logistic-regression machine-learning machine-learning-models matplotlib numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/stefagnone/text_adventure_game

A text-based adventure game project using Python fundamentals

matplotlib numpy pandas python r scikit-learn seaborn sql

Last synced: 09 Apr 2026

https://github.com/shauryashaurya/marty_mcfly

Code, text and notebooks on a tutorial for Introduction to Machine Learning using open sources

anaconda jupyter-notebooks machine-learning machine-learning-tutorials notebook numpy python regression scikit-learn scipy tutorial

Last synced: 09 Apr 2026

https://github.com/rajan-bhateja/Machine-Learning-with-Python

ML/DL projects done using sklearn and TensorFlow

machine-learning scikit-learn sklearn

Last synced: 28 Jul 2025

https://github.com/ajxxxs/spotify-music-analysis

Spotify music analysis of web-scraped playlists (across 3 states): trends, features, and a music recommendation system.

matplotlib numpy panda scikit-learn seaborn

Last synced: 28 Jul 2025

https://github.com/anuranjanjain/cardioguide

A project I created for DSN 2 at VIT. As its name suggests, it helps you check for heart abnormalities by providing a "Heart Risk Assessment".

chartjs chatbot flask-application mlmodel pandas pickle python rest-api scikit-learn

Last synced: 20 Jan 2026

https://github.com/rafay-imraan/email-spam-filtering

Machine learning models that filter spam emails from a dataset downloaded from kaggle.com.

machine-learning ml pandas python scikit-learn xgboost

Last synced: 20 Jan 2026

https://github.com/iamjuniorb/d499-supervised-learning

This machine learning class presents the end-to-end process of investigating data through a machine learning lens.

machine-learning project python python3 scikit-learn scikit-learn-python scikitlearn-machine-learning supervised-learning supervised-machine-learning

Last synced: 16 May 2026

https://github.com/kavyachouhan/fake-news-detection-dravidian-language

This repository contains the code and resources for a machine learning project focused on detecting fake news in the Malayalam language, developed as part of the IITM-PAN BS AI-ML Challenge.

jupyter-notebook machine-learning numy pandas python scikit-learn

Last synced: 08 Feb 2026

https://github.com/antrita/stroke_prediction_model

A model that combines Kaggle's Stroke Prediction Dataset with live weather/air quality data to implement an FDA-compliant MLOps pipeline, demonstrating expertise in healthcare regulations and real-time inference.

ai data-analysis deep-learning kaggle-dataset machine-learning prediction-model random-forest real-time scikit-learn streamlit weather-api xgboost

Last synced: 07 May 2026

https://github.com/nathadriele/transaction_fraud_prevention_pipeline

A solution for detecting and preventing fraud in financial transactions, combining machine learning, business rules, and advanced statistical analysis. The system offers an interactive dashboard for real-time monitoring, data analysis, and fraud alert management.

data-analysis data-visualization docker fraud-prevention machine-learning matplotlib numpy pandas pipeline pytest python scikit-learn scipy seaborn streamlit tensorflow transaction xgboost

Last synced: 10 Apr 2026

https://github.com/080bct12alex/nepalestate

A real estate price prediction web app using machine learning, Next.js and Flask

flask-api mlp-regresor nextjs scikit-learn

Last synced: 31 Jul 2025

https://github.com/lanhhoang/toronto-bicycle-thefts-classifier

A predictive service that uses Toronto Police Open Data to classify whether a stolen bike is likely to be returned

clustering decision-trees flask logistic-regression machine-learning python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/bgmp/svm

Support Vector Machine implementation written in Python

iris-dataset scikit-learn svm

Last synced: 31 Jul 2025

https://github.com/vigneshvaranasi/breast_cancer_detection

This project employs machine learning, focusing on Logistic Regression, to detect breast cancer using tumor-related features. The dataset is preprocessed, and the model achieves 100% accuracy on the test set. The goal is to gain insights into breast cancer factors and provide an effective detection solution.

jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-bankruptcy-prevention-

The project explores multiple machine learning algorithms and evaluates their performance using various metrics, such as accuracy and confusion matrices. The models tested include Logistic Regression, K-Nearest Neighbors (KNN), Naive Bayes, and Support Vector Machine (SVM). In addition, regularization techniques (L1, L2) are used to avoid overfitting.

data-preprocessing evaluation machine-learning-models matplotlib-pyplot modelbuilding modeldeployment numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/apal21/tensorflow-pima-indians-dataset-classification

Pima Indians Dataset classification using TensorFlow Linear Classifier and DNN Classifier.

classification deep-neural-networks kaggle linear-classifier pandas pima-indians-dataset scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/presizhai/iris-predictor-fastapi

A web application for predicting the species of Iris flowers using a machine learning model trained with the Iris dataset, with FastAPI, a modern web framework for building APIs.

essemblelearning fastapi python random-forest-classifier scikit-learn uvicorn

Last synced: 25 Dec 2025

https://github.com/oroszgy/cookiecutter-ml-flask

Cookiecutter template for training and serving machine learning models with scikit-learn, spaCy, Flask and Docker

docker flask flask-application machine-learning nlp rest-api scikit-learn spacy

Last synced: 09 Apr 2026

https://github.com/kiapanahi/handson-machine-learning-book-playground

Sample code and practice exercises based on the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"

machine-learning python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/mtlh/fyp_prempredict

In PremPredict, players predict all Premier League games. Compete against the algorithm and other users across a full season, scoring points for every correct result/prediction.

django prediction premierleague python scikit-learn tailwindcss

Last synced: 09 Apr 2026

https://github.com/0xunkn0wn4m1r/data_engineering_banking_project

🏦 Build a complete data engineering workflow for a banking system, showcasing ETL processes, data transformations, and an interactive financial dashboard.

automation data-analysis data-cleaning data-science feature-engineering fintech-bank flask-api loan-default-prediction machine-learning mlops model-explainability numpy postgresql scikit-learn segmentation shap sql unsupervised-learning

Last synced: 09 Apr 2026

https://github.com/alexgoodison/boxbox

F1 Race Visualiser & Overtake Prediction Model 🏎️

fastapi keras nextjs scikit-learn

Last synced: 09 Apr 2026

https://github.com/ranimeshehata/feed-forward-neural-network-on-mnist

A PyTorch-based project for classifying the MNIST dataset using Feed Forward Neural Networks, including training, validation, results and visualization.

feedforward-neural-network matplotlib mnist python3 pytorch scikit-learn torchvision

Last synced: 11 Apr 2026

https://github.com/mrmalik2512/catsvsdog.github.io

A CNN model integrated with a Flask backend. The model is trained on images of dogs and cats and connected to a website that predicts whether a given image shows a dog or a cat.

deep-learning numpy python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/lefteris-souflas/modern-slavery-analysis

Jupyter notebook using machine learning techniques to explore the complex drivers of modern slavery. Models from a research paper are replicated and evaluated. Steps also include filling missing data, training regression models, and analyzing feature importance.

decision-tree feature-importance grid-search-cv imputation jupyter-notebook lasso-regression linear-regression matplotlib mean-absolute-error numpy pandas preprocessing principal-component-analysis python3 random-forest ridge-regression scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/manishkumarpatel07/heartattack_risk_prediction

"Heart Attack Risk Prediction" uses machine learning to estimate the likelihood of a heart attack based on user-provided data like physical attributes, symptoms, and medical history. This system enables remote screening, identifying high-risk individuals, and easing medical system burdens by providing early, data-driven health risk assessments.

boruta knn-algorithm matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/idaraabasiudoh/credit_card_fraud_detection

This repository contains a machine learning project focused on detecting credit card fraud using Decision Tree and Support Vector Machine (SVM) classifiers.

data-analysis jupyter-notebook machine-learning python3 scikit-learn snapml

Last synced: 19 Feb 2026

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/ashishsingh789/bcg_virtual_internship

This repository showcases my BCG X virtual internship project on customer churn analysis for PowerCo, covering business understanding, EDA, feature engineering, and modeling using Python and machine learning.

data-manipulation data-science dataanalysis datavisualization eda machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/kasraskari/tumor-predict

Streamlit app for predicting tumor malignancy using logistic regression.

logistic-regression machine-learning numpy pandas python scikit-learn streamlit tumor-detection

Last synced: 09 Apr 2026

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/kuldeep-gif/interactive-gesture-speech-system

An interactive AI system that translates real-time hand gestures into audible speech and converts spoken words into visual gestures using OpenCV and MediaPipe.

computer-vision gesture-recognition hci machine-learning mediapipe opencv python scikit-learn speech-recognition

Last synced: 09 Apr 2026

https://github.com/jianninapinto/bandersnatch

This project implements classification models using Random Forest, XGBoost, and Support Vector Machine algorithms, with oversampling and undersampling techniques to handle imbalanced classes, to predict the rarity of monsters.

altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost

Last synced: 29 Sep 2025

https://github.com/macromrit/air-flick

Transfer files through the air with just a gesture. Push. Pull. Done.

css cv2 fastapi html js media-pipe peer2peer python random-forest-classifier restful-api scikit-learn websockets

Last synced: 09 Apr 2026

https://github.com/anusha-me/customer_churn_analysis

Predict and analyze telecom customer churn using machine learning techniques and business dashboards. This end-to-end project includes data preprocessing, EDA, model evaluation (SVM, XGBoost), real-time Streamlit deployment, and Power BI dashboard reporting. Built for actionable insights and decision support.

churn-prediction classification-model customer-analytics dashboard data-science eda machine-learning powerbi predictive-analytics python scikit-learn streamlit svm telecom xgboost

Last synced: 29 Apr 2026

https://github.com/subhas-pramanik-09/mediscan-ai

A smart and scalable ML-powered health prediction system that can help detect the risk of three major diseases: diabetes, heart disease, and Parkinson's disease

jupyter-notebook logistic-regression machine-learning numpy pandas scikit-learn streamlit svm-classifier

Last synced: 09 Apr 2026

https://github.com/omdoshi13/pricing-of-laptops-using-ml

Data Analysis, training Machine Learning models, and Model Evaluation and Refinement for Pricing of Laptops dataset.

data-analysis data-analysis-project datascience google-colab jupyter-notebook machine-learning matplotlib model-evaluation model-refinement numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026
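
The linear-regression-insurance-premium entry above describes fitting a simple linear regression of premium on age and scoring it with MSE and RMSE. A minimal sketch of that kind of analysis is shown below, using synthetic data with an assumed slope and noise level rather than the repository's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
# Synthetic data: the intercept, slope, and noise are invented for illustration.
age = rng.uniform(18, 70, size=200)
premium = 100 + 12 * age + rng.normal(0, 20, size=200)

# scikit-learn expects a 2-D feature matrix, hence the reshape.
X = age.reshape(-1, 1)
model = LinearRegression().fit(X, premium)
pred = model.predict(X)

mse = mean_squared_error(premium, pred)
rmse = float(np.sqrt(mse))  # RMSE is in the same units as the premium
print(f"slope={model.coef_[0]:.2f}  MSE={mse:.1f}  RMSE={rmse:.1f}")
```

The scores here are computed on the training data for brevity; a held-out split would give a more honest estimate of prediction error.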

https://github.com/pejpero/machine_learning

This repository contains two comprehensive machine learning projects using scikit-learn, demonstrating ensemble learning with a Voting Classifier and the comparison of linear and polynomial regression models on different datasets.

ensemble-learning linear-regression logistic-regression machine-learning polynomial-regression random-forest scikit-learn svm

Last synced: 09 Feb 2026
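
To make the ensemble idea in the pejpero/machine_learning entry above concrete, here is a minimal Voting Classifier sketch. The choice of base estimators follows the entry's tags (logistic regression, random forest, SVM), but the dataset, hyperparameters, and hard-voting setting are assumptions for the example, not the repository's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification problem, for illustration only.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hard voting: each fitted model casts one vote and the majority class wins.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("svc", SVC(random_state=0)),
    ],
    voting="hard",
)
voter.fit(X_train, y_train)
accuracy = voter.score(X_test, y_test)
print(f"Ensemble accuracy: {accuracy:.3f}")
```

Switching to `voting="soft"` averages predicted probabilities instead, which would require `SVC(probability=True)`.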

https://github.com/praditaw/patient-los-prediction

Predicting patient Length of Stay (LoS) using machine learning to provide insights for hospital operational efficiency.

exploratory-data-analysis feature-engine healthcare-analysis huggingface-spaces hyperparameter-tuning length-of-stay los-prediction machine-learning pandas scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/gerardo1909/proyecto_nba_mvp

Final project for the course "Introduction to Machine Learning" in the Bachelor's in Data Science program (UNSAM). 2C-2023

machine-learning nba notebooks-jupyter pandas python random-forest scikit-learn

Last synced: 03 Oct 2025

https://github.com/impesud/ai-finops-platform

AI FinOps is an AI-powered platform for cloud cost optimization and forecasting. Built with FastAPI, Python, and modern MLOps tools, it allows teams to track multi-cloud usage, detect anomalies, and predict future expenses using real-time data and machine learning.

aws docker fastapi jupyter mlflow python react scikit-learn statsmodels tailwindcss terraform xgboost

Last synced: 09 Apr 2026