scikit-learn
scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.
- GitHub: https://github.com/topics/scikit-learn
- Wikipedia: https://en.wikipedia.org/wiki/Scikit-learn
- Repo: https://github.com/scikit-learn/scikit-learn
- Created by: David Cournapeau
- Released: January 05, 2010
- Related Topics: scikit, python
- Aliases: sklearn
- Last updated: 2026-07-02 00:27:34 UTC
https://github.com/tbarlow12/learn-it-your-way
Using Python Flask, I wanted to create a simple web API that allows users to upload a dataset, choose one or more models, store them server side, and then hit an endpoint to get a prediction.
flask machine-learning python scikit-learn tensorflow
Last synced: 29 Apr 2026
https://github.com/gabrielfmcoelho/heart-disease-webapp
flask-application healthcare machine-learning scikit-learn webapp
Last synced: 30 Apr 2026
https://github.com/lhm30/scikitlearn_clustermap_rdkit_bicluster_molecules
Hacked-together code to bicluster molecules using RDKit and scikit-learn
biclustering cheminformatics circular-fingerprints clustering ecfp-4 ecfp4 molecular-fingerprints molecular-similarity molecules rdkit scikit scikit-learn similarity smiles
Last synced: 30 Apr 2026
https://github.com/ledsouza/machine-learning-semisupervisionado
This project uses semi-supervised machine learning algorithms to classify milk quality as high, medium, or low.
data-science joblib machine-learning machine-learning-algorithms pandas python scikit-learn
Last synced: 30 Apr 2026
https://github.com/pramodyasahan/grade-predictor
This project aims to predict student performance based on various features such as job, study time, failures, absences, and first and second period grades. The project utilizes a linear regression model from the scikit-learn library in Python.
machine-learning matplotlib numpy pandas python regression scikit-learn
Last synced: 30 Apr 2026
https://github.com/fikri-rouzan/burnaway-capstone-data-science
Interactive analytics dashboard for mapping the physical factors and work patterns that trigger burnout in software developers.
jupyter-notebook matplotlib pandas pillow plotly python scikit-learn seaborn statsmodels streamlit
Last synced: 08 Jun 2026
https://github.com/ttsudipto/recurrence-pred-genomics
ML-based prediction of NSCLC recurrence with gene expression data
boruta gene-expression imbalanced-learn machine-learning mcfs multilayer-perceptron non-small-cell-lung-cancer python r random-forest recurrence-prediction rna-seq scikit-learn smote support-vector-machine
Last synced: 30 Apr 2026
https://github.com/kumailn/machinelearning
Machine learning with Python
machine-learning python scikit-learn tensorflow
Last synced: 30 Apr 2026
https://github.com/boladjivinny/fire-prediction
Notebook for the Fighting Fire with Data hackathon on Zindi. Ranked 5th on the public leaderboard and 8th on the private leaderboard. https://zindi.africa/hackathons/cmu-africa-fighting-fire-with-data
feature-engineering hackhathon machine-learning regression scikit-learn stacking
Last synced: 30 Apr 2026
https://github.com/harshitwaldia/disease_detection
A disease detection system using Random Forest Classifier and GUI in Python, identifying illnesses based on user symptoms.
pandas-python python3 random-forest-classifier scikit-learn tkinter-gui
Last synced: 01 May 2026
https://github.com/vansh-khaneja/spam-email-detection
This is a spam email detection model
machine-learning naive-bayes-classifier scikit-learn spam-detection
Last synced: 01 May 2026
https://github.com/kristishqau/sentimentanalysis_nlp
A project for sentiment analysis of tweets using various NLP techniques and machine learning models.
datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost
Last synced: 01 May 2026
https://github.com/clinton-mwachia/machine-learning-with-python
machine learning with python
machine-learning python regression scikit-learn
Last synced: 01 May 2026
https://github.com/sundanc/btcprediction
Predict Bitcoin prices based on historical data using machine learning techniques
bitcoin-prediction keras machine-learning pandas python python3 scikit-learn scikitlearn-machine-learning
Last synced: 02 May 2026
https://github.com/mehtadigisha/iris-flower-classification
Iris Flower Classification
accuracy-score classification-report data-analysis data-visualization eda iris-classification machine-learning matplotlib pandas prediction python scikit-learn seaborn svc-model svm-model visualization
Last synced: 03 May 2026
https://github.com/viniciusds2020/ml_pycaret_classificacao
System for preprocessing and training machine learning models with PyCaret. A low-code methodology for MLOps processes.
machine-learning mlops preprocessing pycaret python scikit-learn
Last synced: 03 May 2026
https://github.com/arrhythmia-detection/authorprovidedfeaturescombineddt
Deploys a vanilla Decision Tree for Arrhythmia classification using Chapman ECG dataset on Arduino UNO board
arduino-uno arrhythmia-classification atmega328p chapman-ecg decision-tree-classifier eloquent scikit-learn
Last synced: 09 Jun 2026
https://github.com/zhenglinlei/zdmp
Industry 4.0 Optimization with Machine Learning AI
industry-4 knn-classification machine-learning pandas python scikit-learn
Last synced: 03 May 2026
https://github.com/apfirebolt/movie_recommendation_using_scikitlearn_and_pyqt5
A movie recommendation system built using a KNN model from the scikit-learn library. GUI components are powered by PyQt5, a library for creating GUI applications in Python.
cosine-similarity jupyter-notebook knn-algorithm movie-recommedation pandas python scikit-learn
Last synced: 03 May 2026
https://github.com/pramodyasahan/binary-classifier
This repository houses the code for a machine learning model designed to predict customer churn. The model is built using Support Vector Machine (SVM) from the scikit-learn library and incorporates preprocessing, pipeline, and grid search techniques for optimal performance.
Last synced: 03 May 2026
https://github.com/gaarutyunov/savn-project
Community detection in large open source projects
community-detection gephi graph graphql jupyter-notebook networkx python3 scikit-learn sqlachemy
Last synced: 03 May 2026
https://github.com/atchayaah/home-value-insights-kc
Data-driven project predicting King County housing prices using EDA, regression models, and ML techniques, developed as part of IBM’s Data Analysis with Python course on Coursera.
joblib matplotlib numpy pandas pickle python scikit-learn seaborn
Last synced: 03 May 2026
https://github.com/darenr/gradientboostingmachines
Notebooks exploring strengths and weaknesses of GBM based classifiers
jupyter-notebook lightgbm pandas scikit-learn xgboost
Last synced: 03 May 2026
https://github.com/ceodaniyal/telecom_customer_churn_prediction
A machine learning project that predicts whether a telecom customer will churn (leave the service) using customer demographics, account information, and service usage. The repository includes data preprocessing, model training (with logistic regression), feature scaling, and example predictions.
classification customer-churn-prediction data-science logistic-regression machine-learning ml-project pandas prediction python scikit-learn streamlit telecom
Last synced: 04 May 2026
https://github.com/baponkar/scikit-logisticregression-application
A simple, detailed application analysis of the scikit-learn LogisticRegression model.
classification-algorithm logistic-regression machine-learning python3 scikit-learn
Last synced: 04 May 2026
https://github.com/marionchaff/real-estate-price-prediction-france
Real estate price prediction using the French public DVF database
data-analysis dvf-data machine-learning price-prediction python real-estate scikit-learn
Last synced: 04 May 2026
https://github.com/mariiasam/stroke-prediction
A model for predicting the risk of stroke in a patient
balanced-random-forest-classifier decission-tree-classifier gradient-boosting imbalanced-learning joblib logistic-regression matplotlib numpy random-forest-classifier scikit-learn seaborn streamlit
Last synced: 04 May 2026
https://github.com/dakii24/credit-card-fraud-detection
This repository contains a machine learning project focused on detecting fraudulent credit card transactions. The project includes data preprocessing, model training, and evaluation to identify and prevent fraudulent activities.
capstone-project class-imbalance classification-algorithm credit-card credit-card-fraud data-science decision-trees fraud machine-learning open-data python scikit-learn svm svm-classifier
Last synced: 04 May 2026
https://github.com/msikorski93/protein-tertiary-structure
Performing a regression task for estimating residue size based on given physicochemical properties of protein tertiary structures (CASP 5-9).
bioinformatics gradient-boosting multilayer-perceptron-network protein-structure-prediction regression-algorithms scikit-learn tensorflow
Last synced: 04 May 2026
https://github.com/aqueeqazam/machine-learning-using-scikit
This repository contains all of the algorithms used to train the machine learning models using the Scikit library.
Last synced: 04 May 2026
https://github.com/pierrealexandre78/deathpredict
Predict the hospital mortality rate using machine learning for patients admitted to the ICU (Intensive Care Unit)
healthcare hospital machine-learning predictions python random-forest-classifier scikit-learn xgboost-classifier
Last synced: 05 May 2026
https://github.com/simpl1fy/spam-classifier-project
A web application to classify spam texts or emails.
multinomial-naive-bayes nltk python render scikit-learn text-classification
Last synced: 05 May 2026
https://github.com/celineboutinon/bookworms
OpenClassrooms Data Analyst 2022-2023 - Projet 6
apriori-algorithm data-analysis data-analytics data-visualisation dataframes matplotlib-pyplot mlxtend numpy pandas python scikit-learn scikit-posthocs scikitlearn seaborn statsmodels
Last synced: 05 May 2026
https://github.com/smaddanki/pattern-pursuit-challenge
A personal challenge to build a production-ready trading signal system for S&P 500 stocks using deep learning. This project progresses from basic ML models to a complete trading infrastructure, focusing on 5-day forward return prediction and signal generation.
deep-learning machine-learning pytorch quantative-trading quantitative-finance quantitative-research scikit-learn
Last synced: 05 May 2026
https://github.com/patilsukanya/house-price-prediction
Libraries Used
matplotlib numpy pandas scikit-learn seaborn
Last synced: 05 May 2026
https://github.com/rohra-mehak/sciencesync
System for Personalized Google Scholar Alerts Processing and Data Management, and provision of ML based clustering analysis
agglomerative-clustering clustering crossref-api customtkinter google-api google-scholar graph-api machine-learning numpy pandas python3 scientific-article-analysis scikit-learn sqlite3
Last synced: 05 May 2026
https://github.com/rohit1901/py-cluster
Classifier and Cluster Analysis in Data Science
classification clustering data-science k-means-clustering machine-learning pytest python python3 ruff scikit-learn
Last synced: 05 May 2026
https://github.com/aysenurcftc/breast_cancer_streamlit
Breast Cancer Wisconsin Dataset Classifier with Scikit-learn and Streamlit
breast-cancer classification gridsearch scikit-learn streamlit
Last synced: 05 May 2026
https://github.com/kefrankk/ml-fraud-detection
I built a predictive model to detect fraud in financial transactions.
Last synced: 05 May 2026
https://github.com/kunalpisolkar24/dsbda_lab
Collection of practical codes for Savitribai Phule Pune University's Data Science and Big Data Analytics Laboratory (310256).
data-analytics data-preprocessing data-science data-wrangling descriptive-statistics linear-regression logistic-regression mapreduce scala scikit-learn sppu-computer-engineering tf-idf
Last synced: 05 May 2026
https://github.com/sevilaymuni/project-no.6-tree-based-models
Random Forest Assisted Suggestions for Salifort Motors Employee Retention: Plan, Analyze, Construct and Execute
data-science decision-trees evaluation-metrics gridsearchcv logistic-regression machine-learning matplotlib python random-forest-classifier scikit-learn seaborn-plots
Last synced: 05 May 2026
https://github.com/sadmansakib93/mental-resilience-analysis-using-machine-learning
Utilized supervised and unsupervised ML techniques to analyze mental health and resilience levels of medical students [Project completed in December 2019]
artificial-intelligence classification clustering correlation linear-regression machine-learning machine-learning-algorithms mental-health python regression resilience scikit-learn statistical-analysis
Last synced: 06 May 2026
https://github.com/samia35-2973/living-type-classification-from-codon-usage
Machine learning project to classify living types based on codon usage data using Random Forest and XGBoost classifiers.
classification codon-usage data-cleaning data-preprocessing excel exploratory-data-analysis living-type machine-learning python random-forest-classifier scikit-learn supervised-learning xgboost-classifier
Last synced: 06 May 2026
https://github.com/billgewrgoulas/recommendation-systems
Algorithms for joke rating prediction using the joke data-set from Kaggle.
algorithm clustering collaborative-filtering machine-learning numpy pandas recommender-system scikit-learn scypi
Last synced: 06 May 2026
https://github.com/adesartika33/proyek-analisis-data-dataset-iris
This project aims to analyze the Iris dataset, one of the classic datasets in machine learning and data science. The dataset consists of 150 samples of Iris flowers from three species (Setosa, Versicolor, and Virginica).
classification data-science data-visualization eda exploratory-data-analysis iris-dataset machine-learning python random-forest scikit-learn
Last synced: 06 May 2026
https://github.com/deshwalx/diabetes-prediction-svm
My first ML project using SVM to predict diabetes
beginner-project classification diabetes machine-learning python scikit-learn svm svm-classifier
Last synced: 06 May 2026
https://github.com/ejw-data/ml-playground
Testing the limitations, inabilities, and strengths of models with synthetic data
machine-learning python scikit-learn
Last synced: 06 May 2026
https://github.com/cycle-sync-ai/student-score-analysis
A data-driven student performance analysis project using UCI dataset (396 students, 33 features). Implements machine learning models (K-means, PCA, Decision Tree, Random Forest, Linear Regression) to analyze academic patterns and predict student scores based on lifestyle, health, and study habits.
clustering clustering-algorithm decision-trees feature-engineering learning-management-system linear-regression machine-learning machine-learning-algorithms matplotlib numpy pandas pca pickle prediction prediction-algorithm scikit-learn score seaborn student
Last synced: 06 May 2026
https://github.com/kartheekdama/salary-prediction
This salary prediction model leverages machine learning techniques, including Random Forest, Decision Tree, and Linear Regression, to estimate salaries based on individual attributes such as age, gender, education level, job title, and years of experience. The Random Forest model outperforms the others, achieving the highest R-squared score.
decision-tree exploratory-data-analysis feature-importance linear-regression machine-learning random-forest scikit-learn
Last synced: 06 May 2026
https://github.com/galaxy092/samsung-innovation-campus-big-data-capstone-project
Samsung Innovation Campus Big Data Capstone Project - Weather Prediction
hadoop jupyter-notebook pandas pyspark scikit-learn sparksql
Last synced: 06 May 2026
https://github.com/jbizzlefoshizzle/ibm_capstone_project
Used K-means clustering and mapping libraries to determine best cities in San Diego to open a Mexican restaurant
beautifulsoup4 folium-maps geopy pandas-python scikit-learn
Last synced: 06 May 2026
https://github.com/michael95-m/packaging-insurance-claim-model
Packaging regression model from scikit-learn
feature-engineering machine-learning python python-package scikit-learn
Last synced: 07 May 2026
https://github.com/kirillshiryaev61/customer_activity_prediction
Predicting declines in customer purchasing activity in an online store. The ML-based model identifies customers at risk of churn in order to improve retention. Educational project.
jupyter pandas python scikit-learn
Last synced: 07 May 2026
https://github.com/garimarao24/customer-churn-project
This repository contains a Customer Churn Prediction project that leverages Machine Learning techniques to predict customer churn and segment customers using clustering.
customer-churn kmeans-clustering logistic-regression machine-learning pca scikit-learn
Last synced: 07 May 2026
https://github.com/rishi035/advanced-house-price-predictions
This is my first project; I also entered it in a Kaggle competition.
linear-regression machine-learning python random random-forest regressor-models scikit-learn
Last synced: 07 May 2026
https://github.com/pspanoudakis/machine-learning-nlp
NLP 🤖 📖 projects on Vaccine Sentiment Classification 💉 and Question Answering 💬
bert-fine-tuning glove-embeddings neural-networks pytorch question-answering rnn scikit-learn sentiment-classification softmax-regression squad
Last synced: 07 May 2026
https://github.com/andrewsy1004/linear-regression-model-for-house-price-prediction
A linear regression model to predict house prices based on features like size, location, and number of rooms. This project demonstrates the application of machine learning in real estate price estimation
linear-regression python scikit-learn xgbregressor
Last synced: 07 May 2026
https://github.com/tedim52/discjockey
a content-based recommender system for your party playlist preferences
jupyter-notebook matplotlib pandas scikit-learn spotify-web-api
Last synced: 07 May 2026
https://github.com/cnoret/hexa-watts
Interactive data visualization and machine learning app for energy consumption analysis and prediction in France, built with Streamlit. (Text in French)
data-visualization electricity-forecasting energy-analysis france machine-learning scikit-learn streamlit
Last synced: 07 May 2026
https://github.com/moustafamohamed01/mall-customer-segmentation-data
Customer segmentation using K-Means clustering based on annual income and spending score.
data-science data-visualization k-means-clustering machine-learning python scikit-learn unsupervised-learning
Last synced: 08 May 2026
https://github.com/jahanostg/linear-regression_ml-algorithm
Linear Regression Algorithm
colab-notebook matplotlib numpy pandas scikit-learn seaborn
Last synced: 08 May 2026
https://github.com/samjoesilvano/password_strength_prediction_using_nlp
Developed a predictive model to categorize passwords as Strong, Good, or Weak, enhancing security and reducing breach risks. The project involves cleaning and analyzing data from an SQL database, using the TF-IDF technique for transformation, and implementing a Logistic Regression model to achieve accurate classifications.
data-analysis data-classification data-cleaning data-visualization logistic-regression machine-learning natural-language-processing pandas password-security password-strength python scikit-learn sql tf-idf
Last synced: 08 May 2026
https://github.com/jatin-mehra119/churn_modeling
This repository is dedicated to predicting customer churn using machine learning techniques. It includes comprehensive scripts for data preprocessing, model training, and evaluation, along with detailed visualizations and insights.
classification-model datavisualization pandas scikit-learn
Last synced: 08 May 2026
https://github.com/msikorski93/detecting-panic-disorder
Panic disorder detection using machine learning techniques.
artificial-neural-networks classification knn logistic-regression machine-learning panic-disorder random-forest scikit-learn sgd svm tensorflow xgboost
Last synced: 08 May 2026
https://github.com/vijaykumarr1452/customer-churn-prediction
Analysis of a telecom company's data, with insights gained to reduce customer churn.
anaconda jupyter-notebook machine-learning pandas prediction scikit-learn
Last synced: 09 May 2026
https://github.com/santiagoasp98/spam-detection
SMS spam detection using Logistic Regression and Multinomial Naive Bayes.
classification logistic-regression machine-learning multinomial-naive-bayes python scikit-learn spam-detection
Last synced: 09 May 2026
https://github.com/alphacrypto246/employee-attrition
This project analyzes employee attrition data to uncover key factors driving employee turnover. Using Python, it employs data preprocessing, exploratory data analysis, and machine learning models to predict attrition and provide actionable insights for improving employee retention strategies.
decision-tree-classifier machine-learning machine-learning-algorithms python scikit-learn scikitlearn-machine-learning
Last synced: 09 May 2026
https://github.com/callmerajesh/ames-housing-price-prediction
Predicting house prices using Decision Tree Regressor on the Ames dataset
ames-housing data-science decision-tree machine-learning python regression scikit-learn
Last synced: 09 May 2026
https://github.com/saahilanande/naivebayes
Implementing a Naive Bayes classifier from scratch for sentiment analysis of the IMDB dataset
machine-learning naive-bayes-classifier python-3 scikit-learn
Last synced: 09 May 2026
https://github.com/malisha4065/flightdelaypredictiongroup99
This project focuses on predicting flight delays in the United States domestic air traffic system from more than 500,000 records using machine learning techniques. Leveraging a dataset from the Bureau of Transportation Statistics for the year 2020, we aim to develop a predictive model that can anticipate flight delays with 93.1% accuracy.
k-nearest-neighbors machine-learning python scikit-learn support-vector-machine
Last synced: 09 May 2026
https://github.com/samuelson777/iris-flower-classification
Iris Flower Classification: A machine learning project that classifies iris flowers into three species based on sepal and petal dimensions. Includes data exploration, visualization, and model evaluation using Python and scikit-learn.
classification data-science data-visualization iris-dataset jupyter-notebook machine-learning python scikit-learn
Last synced: 09 May 2026
https://github.com/suvasish114/house-price-estimation
A machine learning model that estimates housing prices in California using the California census data
jupyter-notebook machine-learning python scikit-learn
Last synced: 09 May 2026
https://github.com/bhoomikaniranjan/pulmotrainer
A Deep Learning-based Lung Cancer Detection application using a 3D CNN model with TensorFlow and OpenCV, featuring an interactive Tkinter GUI for easy data processing and training.
matplotlib numpy-pandas opencv python scikit-learn seaborn tensorflow-keras
Last synced: 09 May 2026
https://github.com/mpolinowski/fisher-discriminant-analysis
LDA is a widely used dimensionality reduction technique built on Fisher’s linear discriminant.
linear-discriminant-analysis matplotlib-pyplot python scikit-learn
Last synced: 10 May 2026
https://github.com/laavanjan/real_estate_price_prediction
This project predicts the house price per unit area based on various real estate features using a Linear Regression model. The application is built with Dash, a Python framework for building interactive web apps.
dash linear-regression pandas scikit-learn
Last synced: 10 May 2026
https://github.com/macdon112/credit-card-fraud-detection
Comparing ML models (Random Forest, KNN, Decision Tree) for credit card fraud detection using SMOTE and stratified cross-validation.
classification data-analysis fraud-detection imbalanced-data machine-learning python scikit-learn
Last synced: 10 May 2026
https://github.com/aneeshmurali-n/nlp-emotion-classification-in-text
Develop machine learning models to classify emotions in text samples.
bag-of-words data emotion-classification feature-extraction machine-learning naive-bayes natural-language-processing nlp nltk preprocessing python scikit-learn svm text-classification tf-idf tokenizer vectorizer
Last synced: 10 May 2026
https://github.com/zescalante/data1030-final-project
Final project for DATA1030
data-science machine-learning scikit-learn
Last synced: 10 May 2026
https://github.com/tnleite/real-estate-opportunities-analysis
This repository presents an analysis of opportunities in the real estate market, combining time series, clustering, and forecasting to identify the states with the greatest growth potential and guide efficient expansion strategies.
catboostregressor cluster-analysis data-science kmeans-clustering lightgbm-regressor machine-learning-algorithms numpy regression-models scikit-learn xgboost-regression
Last synced: 10 May 2026
https://github.com/i30101/mathworks2024
Coding tools for 2024 MathWorks Math Modeling Challenge
machine-learning mathematical-modelling python scikit-learn
Last synced: 10 Jun 2026
https://github.com/alphacrypto246/student-learning-style-prediction
An interactive web application built with Streamlit that predicts a student's preferred learning style (visual, auditory, or kinesthetic) using machine learning, aiding educators in personalizing teaching strategies.
machine-learning scikit-learn scikitlearn-machine-learning streamlit
Last synced: 11 May 2026
https://github.com/mpolinowski/tstochastic-neighbor-embedding
Improve Data Quality by discarding non-correlating, noisy Dimensions
matplotlib-pyplot python scikit-learn t-sne
Last synced: 11 May 2026
https://github.com/pdoup/atml-notebooks
Proposed assignment notebooks for Advanced Topics in Machine Learning tasks
active-learning cost-sensitive-learning imbalanced-data machine-learning multi-instance-learning multi-label-classification numpy scikit-learn
Last synced: 11 May 2026
https://github.com/anras5/criteo-search-data
EDA and statistical tests on CriteoSearchData dataset
data-science pandas scikit-learn statistics
Last synced: 11 May 2026
https://github.com/cplaza0997/py-ml
Machine learning
clustering linear-regression logistic-regression ml pyspark python scikit-learn sparkml
Last synced: 11 May 2026
https://github.com/rajireddy15/student_grade_pred
A machine learning project to predict student final grades using academic and demographic data. Built with pandas, scikit-learn, and visualized with seaborn and matplotlib to gain insights and support early intervention for students.
academic-insights data-science eda education-analytics grade-prediction machine-learning ml-project pandas regression-models scikit-learn student-performance-analysis
Last synced: 11 May 2026
https://github.com/xunchiasg/nyc_property_sales
Exploratory Data Analysis of rolling property sales data in NYC from March 2023-2025
matplotlib-pyplot plotly python scikit-learn
Last synced: 12 May 2026
https://github.com/srosalino/prediction_of_seoul_bikes_demand
The objective of this project is to predict the number of bicycles needed to be made available each hour in order to make the service as efficient as possible
cross-validation data-exploration-and-preprocessing hyperparameter-tuning machine-learning regularization-methods scikit-learn
Last synced: 13 May 2026
https://github.com/msikorski93/heart-failure-prediction
The goal of this repository was to perform binary classification based on features collected from respondents (age, cholesterol level, fasting blood sugar, thallium stress test results, etc.).
classification knn-classifier logistic-regression random-forest-classifier roc-curves scikit-learn svm-classifier
Last synced: 13 May 2026
https://github.com/fgebhart/handson-ml
hands-on machine learning notebooks collection
jupyter-notebook machine-learning scikit-learn
Last synced: 13 May 2026
https://github.com/janek1842/mlbyjan-sandbox
Testbed for private ML investigations
Last synced: 14 May 2026
https://github.com/fulviofavilla/cvd-prediction-ml
Comparative ML analysis for CVD prediction. Winner of the 2023 HPCC Systems Poster Competition.
data-science ecl healthcare hpcc-systems machine-learning pandas python scikit-learn
Last synced: 11 Jun 2026
https://github.com/neelimabonangi/defect-detection-hot-rolling
Defect Detection in Hot Rolling Using Machine Learning
classification data-analysis data-science defect-detection jupyter-notebook machine-learning manufacturing numpy pandas predictive-analytics python random-forest scikit-learn
Last synced: 12 Jun 2026
https://github.com/jayemscript/lab-to-code
A complete Python learning roadmap for scientists and researchers — covering data science, biology, chemistry, physics, and mathematics with curated libraries, tools, and resources.
bioinformatics chemistry data-science jupyter-notebook machine-learning mathematics numpy pandas physics python research roadmap scientific-computing scikit-learn
Last synced: 19 Jun 2026
https://github.com/royxlead/production-drift-detection
Production ML monitoring library - KL, PSI, MMD, and ADWIN drift detectors with empirical benchmarks, confidence tracking, and a 6-page FastAPI dashboard.
data-drift drift-detection fastapi kl-divergence mlops mmd model-monitoring production-ml psi pytorch scikit-learn uncertainty-quantification
Last synced: 23 Jun 2026
https://github.com/imosudi/model_training
Breast Cancer Diagnosis: Logistic Regression, Random Forest, k-NN, and Decision Tree classifier models with feature importance analysis. Includes data exploration, train/test splitting, feature scaling, cross-validation, and model evaluation metrics with confusion matrices and decision boundary visualisation.
classification data-science decision-tree educational feature-importance k-nearest-neighbors linear-regression machine-learning model-evaluation python3 random-forest scikit-learn
Last synced: 25 Jun 2026
https://github.com/sundarmd/breast-cancer-detection
Breast-Cancer-Detection is a machine learning project that utilizes logistic regression to predict whether a tumor is benign or malignant based on the Breast Cancer Wisconsin (Diagnostic) dataset. The project demonstrates data preprocessing, model training, and evaluation using the `scikit-learn` library.
logistic-regression machine-learning python scikit-learn
Last synced: 09 May 2026
https://github.com/jeus0522/7-explore-different-classifier-ml-app
A project exploring various classification algorithms, showcasing their implementation, comparison, and evaluation using Python and scikit-learn.
k-nearest-neighbours knn random-forest scikit-learn streamlit support-vector-machine svm
Last synced: 21 Jan 2026
https://github.com/fanyicharllson/mobile-money-transaction-analysis
Machine learning pipeline for classifying mobile money users (MTN MoMo & Orange Money) into activity segments — CSC 3221 Final Project, ICT University Cameroon.
cameroon data-science ict-university jupyter jupyter-notebook machine-learning mtn-momo orange-money python scikit-learn
Last synced: 31 May 2026