An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/Gamowy/Music-Classification

Music genre classification using a k-nearest neighbors classifier on the GTZAN dataset

machinelearning python scikit-learn university-assignment

Last synced: 17 Jul 2025

https://github.com/anandparayil/sign-language-translator

Real-Time AI-Based Sign Language Translator using MediaPipe, Random Forest, and Tkinter GUI.

jupyter-notebook mediapipe opencv python pyttsx3 scikit-learn tkinter-gui

Last synced: 07 Apr 2026

https://github.com/balavenkatesh3322/loan-default-prediction

An end-to-end machine learning project to predict loan default risk. Includes Exploratory Data Analysis (EDA), feature engineering, a Gradient Boosting model, and a proposed system architecture for deployment.

data-science deep-learning feature-engineering gradient-boosting loan-default-prediction machine-learning scikit-learn tutorial-exercises

Last synced: 17 May 2026

https://github.com/itsdawei/qsc-airplane

A regression model predicting airline stock prices based on public flight data.

regression-analysis scikit-learn statsmodels stock-price-prediction

Last synced: 17 May 2026

https://github.com/krish57-bit/diabetes-prediction-

A comprehensive machine learning pipeline to predict the onset of diabetes using the PIMA Indian Diabetes dataset. This includes data cleaning, visualization, outlier detection, standardization, SMOTE-based imbalance handling, and multiple classification algorithms (Logistic Regression, Naive Bayes, and KNN).

classification data-science diabetes healthcare jupyter-notebook machine-learning python scikit-learn smote

Last synced: 07 May 2026

https://github.com/kishankrishna1/spam-classifier

Developed a Machine Learning-based Spam Classifier using Multinomial Naive Bayes to identify and filter spam messages with high precision

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/daniil-leshchev/spotify_ml

Track Popularity Prediction based on Spotify Data

eda keras ml pandas scikit-learn

Last synced: 12 Apr 2026

https://github.com/ahmedheakl/diabetes_classification_svm

Classifying whether patients have diabetes using a Support Vector Machine model.

machine-learning python scikit-learn

Last synced: 13 Apr 2026

https://github.com/netcodez/climate-prediction-pipeline

Predicting London's climate using machine learning techniques. This project forecasts mean temperature in degrees Celsius (°C) using various regression models and logs experiments with MLflow.

huggingface machine-learning mlflow mlflow-tracking mlflow-tracking-server mlops python scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/nicolasvauche/vinylexplore_ml

VinyleXplore is an intelligent vinyl recommendation engine based on the user's mood and listening context. It uses FastAPI to expose a REST API and scikit-learn to train a machine learning model that improves the relevance of its suggestions.

machine-learning python scikit-learn vinyle

Last synced: 17 May 2026

https://github.com/nafis2508/mobile-price-predictor

Machine learning project that classifies mobile phones into price ranges (low, medium, high, very high) based on hardware specifications.

classification data-science eda jupyter-notebook kagle knn logistic-regression machine-learning mobile-price-prediction python scikit-learn xgboost

Last synced: 24 Jun 2026

https://github.com/lefteris-souflas/the-algorithmic-approach-to-winning-guess-who

This repository provides a systematic approach to winning the "Guess Who?" game through advanced machine learning techniques. It offers a comprehensive methodology for enhancing gameplay strategy and optimizing decision-making processes with meticulous attention to detail.

decision-tree drawio gradient-boosting graphviz-dot lightgbm machine-learning matplotlib numpy pandas python random-forest scikit-learn

Last synced: 09 Apr 2026

https://github.com/szymonrucinski/pippi-lang

Elegant 📑 text preprocessing pipeline 🚰 available as pip package 🐍 based on scikit-learn pipeline. Combines Transformer and Column Transformer into a single object.

data-cleaning data-science nlp pipeline scikit-learn

Last synced: 30 Apr 2026

https://github.com/pratishtha-abrol/sentimentanalysis

Logistic Regression: A sentiment analysis case study

logistic-regression nltk-python scikit-learn sentiment-analysis

Last synced: 17 May 2026

https://github.com/pradeep-r04/attendiq

AttendIQ is a Face Recognition Attendance System designed to automate and streamline the attendance process with precision and ease. By leveraging real-time face detection and recognition technology, AttendIQ eliminates the need for manual roll calls or ID-based check-ins. The system captures facial data during a quick registration process.

csv cv2 kneighborsclassifier numpy os pandas pickle python scikit-learn streamlit time

Last synced: 02 Apr 2026

https://github.com/capac/higher-education-students-performance-evaluation

Machine learning project for evaluating higher education student performance

docker evidently grafana mlflow postgresql prefect python scikit-learn xgboost

Last synced: 09 Apr 2026

https://github.com/adirbella37/safety-analytics-project

Final project in Safety Management: analytics and predictive modeling for occupational incidents. Includes EDA, logistic regression, Poisson/Negative Binomial with overdispersion checks, ROC/AUC, and prediction exercises.

classification data-visualization drunk-and-drive eda logistic-regression matplotlib negative-binomial numpy occupational-safety overdispersion pandas poisson-regression python road-safety roc-auc scikit-learn seaborn statmodels

Last synced: 09 Apr 2026

https://github.com/urvee1810/bitcoin-price-forecasting-using-arma

The analysis reveals the challenges of predicting Bitcoin prices during highly volatile periods and demonstrates how traditional time series models perform under different market conditions. The project includes comparative analysis of model performance during stable and volatile market phases.

arima arma augmented-dickey-fuller-test feature-engineering machine-learning matplotlib mplfina numpy pandas python random-forest randomforestregressor scikit-learn seaborn statsmodels time-series-analysis

Last synced: 06 Mar 2026

https://github.com/rizquuula/sentimentanalyzenaivebayes

Sentiment analysis using the Naive Bayes method with "One time learning" and "Continuous Learning"

machine-learning naive-bayes nlp python scikit-learn sentiment-analysis text-classification

Last synced: 17 May 2026

https://github.com/mitchmedeiros/mlcompare

Quickly compare machine learning models across libraries and datasets.

huggingface-datasets kaggle machine-learning openml pytorch scikit-learn xgboost

Last synced: 02 Feb 2026

https://github.com/paulinhok14/csgo-datascience-project

📊 Analysis of CS:GO grenade usage patterns and their impact on match outcomes using data science and statistical methods.

matplotlib mlflow numpy python scikit-learn scipy seaborn

Last synced: 30 Dec 2025

https://github.com/surajsanap/technohack_mlinternship

1) Wine Quality Analysis and Classification, 2) Movie Review Sentiment Analysis, 3) Diabetes Prediction Using Machine Learning

deep-learning machine-learning pandas python scikit-learn

Last synced: 08 May 2025

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of Titanic passengers using the Kaggle Titanic Disaster dataset. The dataset contains passenger information such as age, gender, and class. Several machine learning algorithms are applied and compared to predict each passenger's chance of survival as accurately as possible.

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 07 Feb 2026

https://github.com/priyanshul28/ml_regression_eda_waiterstip

An EDA and Machine Learning Regression exercise on the Waiter's Tip dataset demonstrating the use of Linear Regression, Neural Network Regressors, Decision Trees, Random Forests, Linear SVR, XGBoost, etc. The models are optimized using hyperparameter tuning through GridSearchCV.

eda machine-learning regression scikit-learn seaborn

Last synced: 17 May 2026

https://github.com/blaz-cerpnjak/student-dropout-prediction

Student dropout predictions based on grades and other info. Classification problem with MLPClassifier.

classification machine-learning mlpclassifier neural-networks poetry predicting-student-dropout python scikit-learn scikit-learn-pipelines

Last synced: 17 May 2026

https://github.com/monish-nallagondalla/cement_strength_prediction

The Cement Strength Prediction project uses machine learning to predict the compressive strength of cement based on its components, such as Cement, Fly Ash, Water, Superplasticizer, Coarse Aggregate, Fine Aggregate, and Age. The goal is to forecast compressive strength (MPa) for optimized cement production and quality control.

cement-strength-prediction construction-industry data-analysis data-preprocessing data-science data-visualization feature-engineering machine-learning predictive-modeling python regression-analysis scikit-learn

Last synced: 11 May 2026

https://github.com/arrhythmia-detection/authorprovidedfeaturescombineddtoptimized

Deploys an optimized Decision Tree for Arrhythmia classification using Chapman ECG dataset on Arduino UNO board

arduino-uno arrhythmia-classification atmega328p chapman-ecg decision-tree-classifier eloquent scikit-learn

Last synced: 17 May 2026

https://github.com/christianconchari/bike-sharing-demand

This repository contains the final practical assignment for the course Aprendizaje de Máquina II (Machine Learning II) of the Specialization in Artificial Intelligence (CEIA) at the School of Engineering of the University of Buenos Aires (FIUBA).

airflow docker fastapi machine-learning mlflow python scikit-learn

Last synced: 20 Jan 2026

https://github.com/jingjing-jin/purchase-behavior-analysis

Purchase Behavior Analysis for Targeted Customer Segmentation

clustering-algorithm data-mining machine-learning python scikit-learn

Last synced: 20 Jan 2026

https://github.com/alpha597/music_classification_ml

A project which compares different machine learning algorithms' accuracy in music genre classification of a large dataset.

machine-learning pandas python scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/simranjeet97/spam-classification

Spam Classification Using Natural Language Processing (NLP), Scikit-Learn Library, and Bayesian Method.

data-science emails kaggle kaggle-dataset naive-bayes-classifier nlp-machine-learning nltk-python python scikit-learn spam-classification

Last synced: 11 Apr 2026

https://github.com/vasu7052/spam-classifier

This is a machine learning project that detects whether a given sentence is spam, using Python and Keras.

keras keras-neural-networks python3 scikit-learn spam-classification tensorflow

Last synced: 11 Apr 2026

https://github.com/karimosman89/resume-screening

Screen resumes to identify the best candidates. Build a machine learning model that screens resumes and ranks candidates based on job descriptions. Streamline the hiring process for HR departments by automating candidate screening.

machine-learning-algorithms nlp-machine-learning nltk-python python scikit-learn spacy text-processing

Last synced: 29 Apr 2026

https://github.com/zohaib-cheema/defacto

DeFacto is a machine learning-based tool that classifies fake news articles using a hybrid model built with Scikit-learn, TensorFlow, and Keras. The system analyzes social and political content to detect deception in news stories and social media posts, providing a reliable solution to address the growing issue of misinformation.

flask git keras numpy pandas r scikit-learn tensorflow

Last synced: 07 Apr 2026

https://github.com/ankitjha2202/sentiment_analysis

A simple web application that performs sentiment analysis using logistic regression to predict whether a given text has a positive, negative or neutral sentiment.

classification logistic-regression nlp scikit-learn sentiment

Last synced: 28 Mar 2025

https://github.com/hmasdev/ssbgm

Score Based Generative Model with scikit-learn

generative-model scikit-learn

Last synced: 17 May 2026

https://github.com/anuragkush2527/vibesync-3.0

Sentiment analysis in social media involves using natural language processing (NLP) and machine learning to analyze users' opinions, emotions, and attitudes expressed in posts, comments, and reviews. It helps in understanding public sentiment, monitoring trends, and making data-driven decisions.

expressjs fastapi mongodb nltk nodejs numpy pandas python reactjs scikit-learn sentiment-analysis tensorflow

Last synced: 16 Oct 2025

https://github.com/ledsouza/nlp-article-classification

This project aims to develop a machine learning model capable of classifying news articles into different categories based on their titles. Two different word embedding models (CBOW and Skip-gram) are trained and used to vectorize the article titles. These vectorized representations are then used to train a Logistic Regression classifier.

gensim-word2vec natural-language-processing nlp nlp-machine-learning pandas python scikit-learn spacy spacy-nlp

Last synced: 11 Apr 2026

https://github.com/nk-works/creditflow-ai

CreditFlow AI predicts loan defaulters using Artificial Neural Networks (ANNs). This model uses historical loan data to predict the likelihood of default for new loan applications.

ai artificial-neural-networks deep-learning jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn seaborn tensorflow

Last synced: 24 Jun 2025

https://github.com/ahmed-maher77/signlink___graduation-project

AI-Powered Sign Language Translator | A web and mobile app that bridges communication gaps for the deaf and hard-of-hearing community by translating English and Arabic sign language into real-time text and speech, and converting spoken words into text during video calls.

csharp fastapi firebase-realtime-database flutter framer-motion javascript microsoft-dot-net-technologies numpy opencv python pytorch reactjs scikit-learn scss-framework sign-language-recognizer sign-language-translation sql-server tailwindcss webrtc websockets

Last synced: 07 Apr 2026

https://github.com/marktheo/bike-sharing-demand

Jupyter Notebook - Predicting bike rental numbers based on climate and temporal data

decision-tree-classifier decision-tree-regression jupyter-notebook machine-learning scikit-learn

Last synced: 18 May 2026

https://github.com/bhazel/dockerfiles

Some Dockerfiles for working with specific technologies or learning resources.

docker dockerfile ocaml python rails ruby scikit-learn

Last synced: 10 Apr 2026

https://github.com/suvanwita/safescope

Women safety pattern analyzer using public crime datasets, DBSCAN hotspot clustering, Isolation Forest anomaly detection, geospatial heatmaps, and explainable risk scoring to surface historical incident patterns and time-aware safety insights

civic-tech crime-analysis dbscan isolation-forest machine-learning plotly python risk-scoring scikit-learn streamlit women-safety

Last synced: 25 Jun 2026

https://github.com/prarthana-singh/heart-attack-prediction-model

A Machine Learning model that predicts the risk of a heart attack based on health parameters like cholesterol levels, blood pressure, BMI, smoking habits, and age. Built using Classification models, Scikit-Learn, Pandas, and Python.

classification data-analysis data-science heart-attack-prediction logistic-regression machine-learning numpy pandas python scikit-learn

Last synced: 25 Jun 2025

https://github.com/aryanpillai2007/credit-card-fraud-detection

The primary goal of this project is to develop a comprehensive fraud detection system that enhances the security and trustworthiness of financial transactions.

anomaly-detection classification credit-card-fraud data-preprocessing data-science data-visualization fraud-detection imbalanced-data logistic-regression machine-learning outlier-detection pca pca-analysis python roc-curve scikit-learn

Last synced: 18 May 2026

https://github.com/akshaypatra/cardiovascular_disease_detection

AI-driven ECG classification model that detects cardiovascular abnormalities such as arrhythmia and atrial fibrillation using a hybrid CNN-LSTM deep learning approach.

keras matplotlib numpy pandas python3 scikit-learn seaborn tensorflow wfdb

Last synced: 14 Apr 2026

https://github.com/kanika300393/loan_prediction

This project implements a Loan Prediction system using Support Vector Machine (SVM). It includes data preprocessing, visualization of features like income and education, and model evaluation. The goal is to predict loan approval based on the dataset. Clone the repo to explore the code and improve the model.

data-science machine-learning numpy pandas python scikit-learn svm-classifier

Last synced: 09 Apr 2026

https://github.com/simon2k/stock-price-prediction-evaluation

This project is intended to present a small evaluation of different types of regression models for predicting AAPL stock prices.

evaluation machine-learning numpy pandas predicting-stock-prices scikit-learn

Last synced: 07 Apr 2026

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 02 Jan 2026

https://github.com/satheesh-meadi/real_time_financial_risk_dashboard

Financial Risk Analysis Dashboard 🚀. An interactive Streamlit dashboard designed for analyzing and visualizing portfolio performance. Features include CAPM analysis, portfolio optimization, efficient frontier visualization, and real-time stock data to help optimize investments.

numpy pandas plotly plotly-express python3 scikit-learn streamlit yfinance

Last synced: 05 Apr 2026

https://github.com/capsuleismail/drybeanuci

Data Science Project with Model comparison.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 18 May 2026

https://github.com/sbera01/credit-card-approval-predictor

End-to-end Machine Learning project to predict credit card approval decisions using real-world financial features. Includes EDA, model training, and deployment-ready architecture

credit-card-approval-prediction data-analysis machine-learning python scikit-learn streamlit

Last synced: 24 Dec 2025

https://github.com/adhishnanda/motion-based-german-learning-app

AI-powered language learning app with gesture recognition (MediaPipe + ML/DL models), real-time interaction, spaced repetition, and full React/TypeScript UI. Demonstrates ML engineering, computer vision, and frontend expertise.

capstone-project computer-vision data-science deep-learning gesture-recognition interactive interactive-learning machine-learning mediapipe portfolio-project pose-estimation react scikit-learn tensorflow typescript

Last synced: 07 Apr 2026

https://github.com/ramyacp14/sentimentanalysis

Implements a sentiment analysis model to determine the emotional tone behind text, helping understand attitudes, opinions, and emotions in online mentions.

machine-learning natural-language-processing nltk numpy pandas python scikit-learn

Last synced: 07 Apr 2026

https://github.com/jerinpious/house-price-prediction

This project is a machine learning-based application to predict house prices. A frontend interface has been developed using Streamlit to make the prediction process user-friendly for regular customers. The project is structured

data-analysis data-engineering data-science eda machine-learning pandas python random-forest scikit-learn streamlit

Last synced: 05 Apr 2026

https://github.com/dadvaiahpavan/ats-system

This AI-driven Applicant Tracking System (ATS) is a cutting-edge solution designed to revolutionize the recruitment process by providing intelligent resume analysis and matching capabilities.

google-generativeai nltk pandas plotly python-docx scikit-learn spacy streamlit

Last synced: 05 Apr 2026

https://github.com/jainish-prajapati/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 30 Dec 2025

https://github.com/myahninsi/credit_card_fraud_detection

This repository is for the Neural Networks and Deep Learning Course - Assignment 1, focusing on credit card fraud detection. The project utilizes a machine learning model to predict whether a transaction is fraudulent using a synthetic credit card dataset.

matplotlib numpy pandas pickle python scikit-learn seaborn streamlit

Last synced: 09 Apr 2026

https://github.com/jupitvq/simple-uib-assistant

A simple machine learning-based chatbot that helps students by providing information about academics and administration at UIB.

chatbot machine-learning scikit-learn virtual-assistant

Last synced: 05 Apr 2025

https://github.com/abdulshaikh55/ml-involuntary-denied-boarding

A machine learning model that predicts whether you will be denied boarding on your flight.

first-timers ipynb machine-learning scikit-learn

Last synced: 29 Apr 2026

https://github.com/karthikarajagopal44/data-analysis-using-python-libraries-

The COVID-19 pandemic has significantly impacted India, necessitating a detailed analysis of the virus’s spread within the country. In this project, we explore an India-specific COVID-19 dataset, leveraging Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-cleaning data-visualization matplotlib numpy pandas python python3 scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/yuji1702/ai--powered-triage-system

This project implements a machine learning-based triage system for emergency rooms, which classifies patients based on their symptoms and vitals using a Random Forest Classifier. The system features real-time patient data integration, a user-friendly GUI built with Tkinter, and secure patient data encryption using Fernet from the cryptography lib

cryptography data-imputation data-preprocessing data-security encryption gui healthcare machine-learning matplotlib medical-data python random-forest realt-time scikit-learn seaborn tkinter triage-system

Last synced: 05 Apr 2025

https://github.com/akansharajput280799/data-driven-insights-into-job-satisfaction-and-compensation-trends

This project analyzes 2020 employee data to identify factors influencing job satisfaction, performance, and salary differences, offering insights for improving engagement and workplace strategies.

cluster-analysis colab-notebook data-cleaning descriptive-statistics factor-analysis hypothesis-testing jupyter-notebook matplotlib python scikit-learn seaborn t-test visualization

Last synced: 18 Apr 2026

https://github.com/fahrettinsolak/ai-map-based-geographic-clustering-project

This project focuses on clustering crime incidents in San Francisco using the K-Means algorithm. The dataset is obtained from Kaggle and contains information about crime types, geographical coordinates, and other relevant features. The goal is to identify crime hotspots through geographic clustering and visualize the clusters on an interactive map.

artificial-intelligence deep-learning elbow-method jupyter-notebook machine-learning numpy openstreetmap pandas phyton plotly scikit-learn standardscaler

Last synced: 05 Apr 2026

https://github.com/venky-1710/superhero-recruitment

Superhero Recruitment System predicts hero selection using machine learning. Users input hero attributes through a web interface. A Random Forest model analyzes abilities, strengths, weaknesses, success rates, and missions completed. The Flask app displays results, showing if a hero is selected.

css flask html numpy pandas python scikit-learn

Last synced: 07 Apr 2026

https://github.com/josugoar/digit-recognizer

Digit recognizer full stack web app and classifier

flask jquery opencv scikit-learn

Last synced: 12 Sep 2025

https://github.com/tapas-gope/telecommunication-customer-churn

This project involves predicting customer churn in a telecommunications company using machine learning techniques, exploring various features' impact, optimizing models, and identifying key factors influencing churn.

feature-engineering matplotlib-pyplot model-evaluation-and-validation numpy pandas python scikit-learn

Last synced: 12 Sep 2025

https://github.com/narendhiran-dev/predictive-analytics-for-repayment-predictions

A machine learning API built with Python, FastAPI, and Scikit-learn to predict borrower repayment risk based on historical payment data. A FinTech risk assessment system that uses a Random Forest model to predict a borrower's future repayment behavior and serves the prediction via a REST API.

data-science fastapi fintech loan-prediction loan-prediction-analysis machine-learning machine-learning-algorithms predictive-modeling python random-forest random-forest-classifier random-forest-regression rest-api risk-assessment scikit-learn scikit-learn-api scikit-learn-python

Last synced: 13 Apr 2026

https://github.com/sabin74/loan_approval_prediction

This project predicts whether a loan application will be approved or not using machine learning classification models. The dataset used is from Kaggle’s Loan Prediction problem. The goal is to build a robust model to assist banks or financial institutions in making automated loan approval decisions.

classification-models kaggel-dataset loan-approval-prediction matplotlib-seaborn pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/nickklos10/seriea_machine_learning_predictions_2025

This project involves scraping data, processing it, and building machine learning models to predict the standings for the 2024-2025 Serie A season.

beatifulsoup data-scraping keras matplotlib pandas scikit-learn shap tensorflow

Last synced: 13 Apr 2026

https://github.com/sneha1012/ml-dl

Implementing concepts and algorithms from scratch.

deep-learning machine-learning matplotlib numpy-tutorial scikit-learn

Last synced: 18 May 2026

https://github.com/altescy/xsklearn

Expanded scikit-learn for my research

python scikit-learn

Last synced: 21 Mar 2025

https://github.com/tanaybhadula/ml-preprocessing-cli

A Python CLI tool that preprocesses datasets for supervised learning, saving users time. Input data can be preprocessed using simple commands, and the preprocessed dataset can be downloaded later.

cli data-cleaning data-preprocessing machine-learning pandas python scikit-learn

Last synced: 10 May 2026

https://github.com/konnik88/heart-disease-ml-practice

Practice notebook on heart-disease risk with a small/noisy dataset: EDA → preprocessing → classic ML baselines (scikit-learn). Not for clinical use

classification eda healthcare heart-disease imbalanced-data jupyter-notebook machine-learning model-evaluation optuna reproducibility scikit-learn

Last synced: 18 May 2026