An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/ayaarbi/prediction_des_maladies_cardiovasculaires_avec_ml

This project, developed as part of a Machine Learning course, uses supervised classification algorithms to predict the presence of cardiovascular disease from medical data published on Kaggle.

cardiovascular-diseases jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 07 May 2026

https://github.com/n1k1f0rm/car-price-predict

Predict a car's price from its characteristics.

fastapi ml scikit-learn streamlit

Last synced: 07 May 2026

https://github.com/z-fran/walmart-store-sales-forecasting

Data analysis and machine learning solution in Python for the Kaggle competition Walmart Recruiting - Store Sales Forecasting.

machine-learning sales-analysis sales-forecasting sales-prediction scikit-learn walmart-sales-forecasting

Last synced: 07 May 2026

https://github.com/garimarao24/customer-churn-project

This repository contains a Customer Churn Prediction project that leverages Machine Learning techniques to predict customer churn and segment customers using clustering.

customer-churn kmeans-clustering logistic-regression machine-learning pca scikit-learn

Last synced: 07 May 2026

https://github.com/govind-prakash/machinelearning

A collection of my machine learning projects, tutorial exercises, algorithm implementations, and related code.

decision-trees gradientboostinclassifier linear-regression logistic-regression scikit-learn unsupervised-learning

Last synced: 07 May 2026

https://github.com/rishi035/advanced-house-price-predictions

This is my first project; I also entered it in a Kaggle competition.

linear-regression machine-learning python random random-forest regressor-models scikit-learn

Last synced: 07 May 2026

https://github.com/tony123105/comp4423_garbage_classification

Garbage classification using traditional machine learning approaches (HOG, LBP, SIFT features with SVM, KNN, Random Forest classifiers) and an ensemble method to categorize waste into 10 types.

computer-vision feature-extraction garbage-classification hog image-classification knn lbp machine-learning opencv python random-forest scikit-learn sift svm

Last synced: 07 May 2026

https://github.com/saswatamcode/datascienceapi

This is a RESTful API built using Flask and Scikit-Learn. It provides a host of Classification and Regression algorithms that can be used readily and returns results in the form of predictions, confusion matrices, accuracy scores and more.

api flask ml python3 scikit-learn

Last synced: 07 May 2026

https://github.com/pspanoudakis/machine-learning-nlp

NLP 🤖 📖 projects on Vaccine Sentiment Classification 💉 and Question Answering 💬

bert-fine-tuning glove-embeddings neural-networks pytorch question-answering rnn scikit-learn sentiment-classification softmax-regression squad

Last synced: 07 May 2026

https://github.com/sumdiboii/loan-prediction-decision-trees

A Decision Tree Classifier was implemented to predict personal loan acceptance using a dataset of 5,000 customers. Key features included income, education, mortgage, and credit card usage. The model achieved 97% accuracy, with 92% precision and 76% recall for positive loan predictions, validated using a classification report and confusion matrix.

classification data-visualisation decision-trees loan-prediction machine-learning python scikit-learn supervised-learning

Last synced: 07 May 2026

https://github.com/nicovandenhooff/wids-datathon-2022

This repository contains my solution for the 2022 Women in Data Science Kaggle competition, which reached a top 10% leaderboard standing.

catboost data-visualization datascience energy-consumption ensemble-learning exploratory-data-analysis kaggle lightgbm machine-learning scikit-learn women-in-data-science xgboost

Last synced: 07 May 2026

https://github.com/alphacrypto246/titanic-survival

This project leverages machine learning techniques to predict passenger survival in the Titanic disaster using the Kaggle Titanic dataset. It includes data preprocessing, exploratory data analysis (EDA), and model building with algorithms like Logistic Regression and Random Forests to achieve reliable predictions.

logistic-regression machine-learning machine-learning-algorithms python scikit-learn scikitlearn-machine-learning

Last synced: 07 May 2026

https://github.com/andrewsy1004/linear-regression-model-for-house-price-prediction

A linear regression model to predict house prices based on features like size, location, and number of rooms. This project demonstrates the application of machine learning in real estate price estimation.

linear-regression python scikit-learn xgbregressor

Last synced: 07 May 2026

https://github.com/dynle/2020f-ml

2020F Keio University - Machine Learning Laboratory

machine-learning python scikit-learn

Last synced: 07 May 2026

https://github.com/mwasifanwar/automl_framework

Comprehensive AutoML framework that automates data preprocessing, feature engineering, model selection, hyperparameter tuning, and deployment. Features neural architecture search and automated data cleaning pipelines.

automl automl-algorithms data-science data-science-projects feature-engineering feature-engineering-algorithm feature-engineering-ml hyperparameter-optimization machine-learning machine-learning-algorithms machine-learning-models mlops mlops-workflow python scikit-learn scikit-learn-python

Last synced: 07 May 2026

https://github.com/tedim52/discjockey

a content-based recommender system for your party playlist preferences

jupyter-notebook matplotlib pandas scikit-learn spotify-web-api

Last synced: 07 May 2026

https://github.com/jimmymugendi/bulding-a-decision-tree-to-predict-customer-churn

This repo describes building a decision tree to predict customer churn in a given organisation.

accuracy-score decision-tree-classifier matplotlib-pyplot numpy pandas-dataframe scikit-learn

Last synced: 07 May 2026

https://github.com/cnoret/hexa-watts

Interactive data visualization and machine learning app for energy consumption analysis and prediction in France, built with Streamlit. (Text in French)

data-visualization electricity-forecasting energy-analysis france machine-learning scikit-learn streamlit

Last synced: 07 May 2026

https://github.com/mark-mdo47/family-machine-learning-project-2017

We are doing a two-part Machine Learning project this summer with SciKit-Learn and Keras/TensorFlow

machine-learning python scikit-learn tensorflow

Last synced: 07 May 2026

https://github.com/henrytseng/example_docker_scikit-learn

A quick example of using Scikit-Learn from a Docker container

docker scikit-learn

Last synced: 08 May 2026

https://github.com/moustafamohamed01/mall-customer-segmentation-data

Customer segmentation using K-Means clustering based on annual income and spending score.

data-science data-visualization k-means-clustering machine-learning python scikit-learn unsupervised-learning

Last synced: 08 May 2026

https://github.com/anusha-me/disease-x-detection-ml-project

A machine learning classification system for early detection of Disease X based on patient symptoms using Python, Scikit-learn, and Streamlit.

classification data-science disease-prediction healthcare-ai machine-learning medicaldata scikit-learn streamlit

Last synced: 08 May 2026

https://github.com/aravindnathan02/machine-learning-projects

Machine Learning and Deep Learning projects that mainly focus on predictive modeling.

deep-learning machine-learning neural-networks predictive-modeling python scikit-learn tensorflow

Last synced: 08 May 2026

https://github.com/samjoesilvano/password_strength_prediction_using_nlp

Developed a predictive model to categorize passwords as Strong, Good, or Weak, enhancing security and reducing breach risks. The project involves cleaning and analyzing data from an SQL database, using the TF-IDF technique for transformation, and implementing a Logistic Regression model to achieve accurate classifications.

data-analysis data-classification data-cleaning data-visualization logistic-regression machine-learning natural-language-processing pandas password-security password-strength python scikit-learn sql tf-idf

Last synced: 08 May 2026

https://github.com/thekartikeyamishra/data-preprocessor

A Google Colab module for interactive data preprocessing. Handles missing values, categorical encoding (One-Hot, Label), and numerical scaling (Standard, MinMax). Outputs a cleaned dataset.

ipywidgets numpy pandas python scikit-learn

Last synced: 08 May 2026

https://github.com/prajjwal6969/recommender-system-using-python

A collection of content-based recommendation systems for songs and movies using Python and machine learning.

content-based-filtering cosine-similarity machine-learning movie-recommendation python recommender-system scikit-learn song-recommendation

Last synced: 08 May 2026

https://github.com/jatin-mehra119/churn_modeling

This repository is dedicated to predicting customer churn using machine learning techniques. It includes comprehensive scripts for data preprocessing, model training, and evaluation, along with detailed visualizations and insights.

classification-model datavisualization pandas scikit-learn

Last synced: 08 May 2026

https://github.com/mpolinowski/local-linear-embedding

Improve Data Quality by discarding non-correlating, noisy Dimensions

locally-linear-embedding pyplot python scikit-learn

Last synced: 08 May 2026

https://github.com/deepanshkhurana/udacityproject-prediciting-boston-housing-prices

This is a Udacity Project for the Machine Learning Nanodegree. Here, we are trying to predict Boston Housing Prices using sklearn.

data-analysis data-science machine-learning python scikit-learn udacity

Last synced: 08 May 2026

https://github.com/samkazan/fraud-detection-ml

Machine learning models for enhanced fraud detection in e-commerce transactions, exploring feature engineering, distance prediction, and clustering analysis.

clustering data-science data-visualization dataanalytics dbscan eda hierarchical-clustering kmeans-clustering knn-imputer matplotlib mlxtend python scikit-learn seaborn xgboost

Last synced: 08 May 2026

https://github.com/gregoritsch3/dl_cv_e2e_potatodiseaseclassification

A guided CodeBasics Deep Learning Project where a Convolutional Model is deployed onto a Website (FastAPI) and Mobile App (React Native, Google Cloud). Its purpose is the classification of potato plant images into "healthy", "Early Blight" and "Late Blight" categories.

cnn-classification gcp model-deployment scikit-learn tensorflow

Last synced: 08 May 2026

https://github.com/icejan/predicton-systems

Various systems that train on data and generate a prediction

lightfm machine-learning numpy python scikit-learn

Last synced: 08 May 2026

https://github.com/oriolventur/assignment-2-model-creation

Assignment 2 from Artificial Intelligence 1 course: Model creation using synthetic data and scikit-learn.

jupyter-notebook model-creation python scikit-learn

Last synced: 08 May 2026

https://github.com/seyha1007/amazon-reviews-analysis

🧐 This project analyzes Amazon Fine Food Reviews to investigate whether negative reviews are more emotionally intense and lexically repetitive than positive ones. Using R, we apply sentiment analysis and lexical diversity metrics to uncover patterns in consumer review language.

acp amazon-reviews bert data-analytics glove jupyter-notebook lstm-sentiment-analysis machine-learning nltk random-forest scikit-learn sentiment-classification sentimental-analysis support-vector-machine

Last synced: 08 May 2026

https://github.com/labex-labs/supervised-learning-regression

Supervised Learning: Regression | This repo collects 7 programming lab exercises for Supervised Learning: Regression. Supervised learning. If you are hearing or reading this term for the first time, then it may be completely unclear what it means. Don't worry. In this lab, you will get a comp...

challenges course exercises hands-on labex labs machine-learning playgroud programming scikit-learn

Last synced: 08 May 2026

https://github.com/sundarmd/breast-cancer-detection

Breast-Cancer-Detection is a machine learning project that utilizes logistic regression to predict whether a tumor is benign or malignant based on the Breast Cancer Wisconsin (Diagnostic) dataset. The project demonstrates data preprocessing, model training, and evaluation using the `scikit-learn` library.

logistic-regression machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/shingiraibhengesa/house-price-predictor

A machine learning project that predicts house prices based on user input features such as square footage, number of bedrooms, and more.

machine-learning-models matplotlib numpy python scikit-learn seaborn

Last synced: 09 May 2026

https://github.com/aasjunior/mlapp-api

This API provides endpoints for applying machine learning algorithms such as K-Nearest Neighbors (KNN), Decision Tree, and Genetic Algorithm. Built as an assignment for the Mobile Lab/Natural Computing course in the 5th semester of the Multiplatform Software Development program.

fastapi machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/vijaykumarr1452/customer-churn-prediction

Analysis of a telecom company's data, with insights gained to reduce customer churn.

anaconda jupyter-notebook machine-learning pandas prediction scikit-learn

Last synced: 09 May 2026

https://github.com/davidrpugh/kaust-cs-294w

Course materials for KAUST CS 294W

deep-learning machine-learning pytorch scikit-learn

Last synced: 09 May 2026

https://github.com/ahmed122000/ml_model_deployment

The HR Analytics: Job Change Predictor is a Flask-based web application that uses machine learning to predict whether an employee will stay with a company or leave. It allows users to train models, evaluate their performance, and make predictions based on employee data, providing valuable insights for HR decision-making.

classification flask machine-learning python3 rest-api scikit-learn

Last synced: 09 May 2026

https://github.com/radoslawregula/iris-classification

Jupyter notebook implementing an efficient machine learning method to classify flowers from the Iris data set.

classification iris-dataset jupyter-notebook machine-learning python scikit-learn softmax-classifier

Last synced: 09 May 2026

https://github.com/santiagoasp98/spam-detection

SMS spam detection using Logistic Regression and Multinomial Naive Bayes.

classification logistic-regression machine-learning multinomial-naive-bayes python scikit-learn spam-detection

Last synced: 09 May 2026

https://github.com/l1ght14/customer-churn-prediction

Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.

churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom

Last synced: 09 May 2026

https://github.com/alphacrypto246/employee-attrition

This project analyzes employee attrition data to uncover key factors driving employee turnover. Using Python, it employs data preprocessing, exploratory data analysis, and machine learning models to predict attrition and provide actionable insights for improving employee retention strategies.

decision-tree-classifier machine-learning machine-learning-algorithms python scikit-learn scikitlearn-machine-learning

Last synced: 09 May 2026

https://github.com/mayankanand007/yfraud

Credit card fraud detection platform using scikit-learn and xgboost 💳

knearest-neighbor-algorithm linear-regression machine-learning predictive-analytics python3 scikit-learn svm xgboost

Last synced: 09 May 2026

https://github.com/akwardhan/loan-default-prediction-xgboost-streamlit

Full-scale loan default prediction system using XGBoost, trained on 1.3M LendingClub loans. Includes feature-rich preprocessing, class imbalance handling, recall-focused ML pipeline, and Streamlit web deployment for real-time borrower risk scoring.

credit-risk data-science google-colab loan-default-prediction machine-learning python real-world-project scikit-learn streamlit xgboost

Last synced: 09 May 2026

https://github.com/peterchain/titanic

Script for the Titanic dataset for evaluating which passengers survived

kaggle machine-learning pandas-dataframe python3 scikit-learn

Last synced: 09 May 2026

https://github.com/otuemre/viginids

VigiNIDS: A machine learning-based system for detecting malicious network traffic using the UNSW-NB15 dataset. It distinguishes between normal and attack activities, providing a data-driven approach to network security.

classification cybersecurity intrusion-detection-system machine-learning network-intrusion-detection python scikit-learn unsw-nb15 xgboost

Last synced: 09 May 2026

https://github.com/roggersanguzu/tomato-disease-detector

This project uses transfer learning with MobileNetV2 to accurately classify tomato leaf diseases including Mosaic Virus, Septoria Leaf Spot, Blight, and Healthy leaves.

deep-learning python scikit-learn transfer-learning

Last synced: 09 May 2026

https://github.com/mpolinowski/multi-dimensional-scaling

Multidimensional Scaling is a family of statistical methods that focus on creating mappings of items based on distance.

matplotlib-pyplot multi-dimensional-scaling python scikit-learn

Last synced: 09 May 2026

https://github.com/callmerajesh/ames-housing-price-prediction

Predicting house prices using Decision Tree Regressor on the Ames dataset

ames-housing data-science decision-tree machine-learning python regression scikit-learn

Last synced: 09 May 2026

https://github.com/saahilanande/naivebayes

Implementing a Naive Bayes classifier from scratch for sentiment analysis of the IMDB dataset

machine-learning naive-bayes-classifier python-3 scikit-learn

Last synced: 09 May 2026

https://github.com/thanh12273203/hotel-booking-cancellation-prediction

Binary classification on hotel booking cancellations.

classification machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/adadalshabab/human-stress-analysis-greadsearch-classifier

The project leverages data from physiological signals, self-reported surveys, behavioral observations, or other relevant sources to infer and analyze stress levels.

classification knn-classification machine-learning machine-learning-algorithms matplotlib pandas scikit-learn

Last synced: 09 May 2026

https://github.com/malisha4065/flightdelaypredictiongroup99

This project focuses on predicting flight delays in the United States domestic air traffic system from more than 500,000 records using machine learning techniques. Leveraging a dataset from the Bureau of Transportation Statistics for the year 2020, we aim to develop a predictive model that can anticipate flight delays with 93.1% accuracy.

k-nearest-neighbors machine-learning python scikit-learn support-vector-machine

Last synced: 09 May 2026

https://github.com/rajan-bhateja/aqi-predictor

Different models trained on Indian Cities to predict AQI

machine-learning-algorithms model-comparison neural-networks scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/jaswanthv99/basic_ml-model_understanding

This project explains basic ML models (KNN, Naive Bayes, Logistic Regression, SVM, and a neural network)

matplotlib-python pandas-python scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/samuelson777/iris-flower-classification

Iris Flower Classification: A machine learning project that classifies iris flowers into three species based on sepal and petal dimensions. Includes data exploration, visualization, and model evaluation using Python and scikit-learn.

classification data-science data-visualization iris-dataset jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/piras-s/braincancerclassifier

Classifying brain tumors using Gaussian Naive Bayes with MRI-derived features. Includes feature selection, model evaluation, prediction uncertainty, and probability calibration.

baysian-inference calibrated-classification classification data-visualization feature-selection machine-learning medical-imaging naive-bayes-classifier python scikit-learn uncertainty-estimation

Last synced: 09 May 2026

https://github.com/njaffe/eda_example_2025

Sample end-to-end data analysis walkthrough using Python and Scikit-learn.

data-science data-visualization jupyter-notebooks machine-learning python regression scikit-learn

Last synced: 09 May 2026

https://github.com/suvasish114/house-price-estimation

A machine learning model that estimates housing prices in California using the California census data

jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/vivprime/diabetes-prediction-system

MERISKILL INTERNSHIP: Predicting whether or not an individual has diabetes

django html scikit-learn

Last synced: 09 May 2026

https://github.com/bhoomikaniranjan/pulmotrainer

A Deep Learning-based Lung Cancer Detection application using a 3D CNN model with TensorFlow and OpenCV, featuring an interactive Tkinter GUI for easy data processing and training.

matplotlib numpy-pandas opencv python scikit-learn seaborn tensorflow-keras

Last synced: 09 May 2026

https://github.com/mpolinowski/fisher-discriminant-analysis

LDA is a widely used dimensionality reduction technique built on Fisher’s linear discriminant.

linear-discriminant-analysis matplotlib-pyplot python scikit-learn

Last synced: 10 May 2026

https://github.com/laavanjan/real_estate_price_prediction

This project predicts the house price per unit area based on various real estate features using a Linear Regression model. The application is built with Dash, a Python framework for building interactive web apps.

dash linear-regression pandas scikit-learn

Last synced: 10 May 2026

https://github.com/amirdora/python_ml_supervisedlearning_example

Building Classification Models with scikit-learn

machine-learning python3 scikit-learn

Last synced: 10 May 2026

https://github.com/macdon112/credit-card-fraud-detection

Comparing ML models (Random Forest, KNN, Decision Tree) for credit card fraud detection using SMOTE and stratified cross-validation.

classification data-analysis fraud-detection imbalanced-data machine-learning python scikit-learn

Last synced: 10 May 2026

https://github.com/chengetanaim/sentimentanalysisforfinancialnews

This is a Django application for predicting whether the sentiment of a financial news headline is positive, negative or neutral (from an investor's point of view)

beautifulsoup4 chartjs django html-css-javascript logistic-regression machine-learning natural-language-processing scikit-learn tfidf-vectorizer webscraping

Last synced: 10 May 2026

https://github.com/hassanislam463/nyc_airbnb_eda

This project is a comprehensive data analysis of Airbnb listings in New York City, exploring pricing trends, seasonality effects, host market dynamics, rental preferences, and revenue estimation. It provides valuable insights for hosts, investors, and policymakers to optimize Airbnb operations and understand the short-term rental landscape in NYC.

exploratory-data-analysis matplotlib python scikit-learn seaborn

Last synced: 10 May 2026

https://github.com/ejw-data/ml-classification-credit-risk

Compares several machine learning classification models to determine whether to approve or reject a loan request

classification python scikit-learn

Last synced: 10 May 2026

https://github.com/afonsojramos/feup-iart

Projects developed for Artificial Intelligence class.

feup feup-iart iart neural-network python scikit-learn tensorflow

Last synced: 10 May 2026

https://github.com/tnleite/real-estate-opportunities-analysis

This repository presents an analysis of opportunities in the real estate market, combining time series, clustering, and forecasting to identify states with the greatest growth potential and guide efficient expansion strategies.

catboostregressor cluster-analysis data-science kmeans-clustering lightgbm-regressor machine-learning-algorithms numpy regression-models scikit-learn xgboost-regression

Last synced: 10 May 2026

https://github.com/ankur-krgarg/credit-risk

Predict credit risk using machine learning (LogReg, Random Forest). Built clean pipeline with EDA, modeling, and visualizations.

classification credit-risk-analysis data-science eda machine-learning portfolio python remote-read scikit-learn

Last synced: 10 May 2026

https://github.com/i30101/mathworks2024

Coding tools for 2024 MathWorks Math Modeling Challenge

machine-learning mathematical-modelling python scikit-learn

Last synced: 10 May 2026

https://github.com/alphacrypto246/student-learning-style-prediction

An interactive web application built with Streamlit that predicts a student's preferred learning style (visual, auditory, or kinesthetic) using machine learning, aiding educators in personalizing teaching strategies.

machine-learning scikit-learn scikitlearn-machine-learning streamlit

Last synced: 11 May 2026

https://github.com/vijaykumarr1452/ipl-first-innings-score-prediction-deployment

Deployment of IPL Score Prediction Analyser Model (https://github.com/vijaykumarr1452/IPL-First-Innings-Score-Prediction)

css deployment gunicorn html machine-learning ml predictive-analytics python scikit-learn

Last synced: 11 May 2026

https://github.com/mpolinowski/tstochastic-neighbor-embedding

Improve Data Quality by discarding non-correlating, noisy Dimensions

matplotlib-pyplot python scikit-learn t-sne

Last synced: 11 May 2026