scikit-learn
scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.
- GitHub: https://github.com/topics/scikit-learn
- Wikipedia: https://en.wikipedia.org/wiki/Scikit-learn
- Repo: https://github.com/scikit-learn/scikit-learn
- Created by: David Cournapeau
- Released: January 05, 2010
- Related Topics: scikit, python
- Aliases: sklearn
- Last updated: 2026-06-25 00:23:58 UTC
https://github.com/anirudh-pulavarthy/car-evaluation-using-smote
machine-learning python scikit-learn smote-sampling
Last synced: 24 Apr 2026
https://github.com/subratamondal1/heart-attack-prediction
Heart attack prediction for patients based on the required data. Workflow: data ingestion, data preparation, exploratory data analysis (EDA), modelling, and evaluation.
data-analysis data-science data-visualization kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python3 scikit-learn seaborn
Last synced: 09 Apr 2026
https://github.com/duruii/contest-dingtalkcup2-a
2nd "DingTalk Cup" College Student Big Data Challenge (2023): analysis of smartphone user monitoring data
data-mining machine-learning pandas scikit-learn xgboost
Last synced: 12 Mar 2025
https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer
Using an SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning with GridSearchCV. Evaluation metrics such as the classification report show the effectiveness of the optimized model.
breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm
Last synced: 05 Feb 2026
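The before/after comparison described in this entry can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes scikit-learn's bundled `load_breast_cancer` dataset, and the parameter grid and train/test split are made up for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled Wisconsin breast cancer dataset stands in for the project's data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: SVC with default hyperparameters on standardized features.
baseline = make_pipeline(StandardScaler(), SVC())
baseline.fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)

# Tuned: search a small, illustrative grid with 5-fold cross-validation.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=5)
search.fit(X_train, y_train)
tuned_acc = search.score(X_test, y_test)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"tuned accuracy:    {tuned_acc:.3f}  best params: {search.best_params_}")
print(classification_report(y_test, search.predict(X_test)))
```

Putting the scaler inside the pipeline means it is refit on each cross-validation fold, so the search does not leak information from the validation folds.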
https://github.com/hotequil/computer-vision
Study about computer vision.
jupyter-notebook matplotlib numpy python scikit-learn
Last synced: 13 Apr 2026
https://github.com/javi-cc/python-ml-portcanto
Portcanto is a project that simulates a bicycle route. Four types of cyclists are defined, differing in the time they take to complete the route. The goal is to discover the four patterns with the KMeans clustering algorithm.
clustering docker docker-compose kmeans machine-learning mlfow pydoc pylint python scikit-learn testing venv
Last synced: 13 Apr 2026
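As a rough sketch of the idea in this entry, KMeans can recover four groups from simulated route times. The mean times, spread, and single-feature layout below are hypothetical and chosen only for illustration; the project's own simulation may use different features.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical mean route times (seconds) for four cyclist types; made up for illustration.
true_means = [3000.0, 3600.0, 4200.0, 4800.0]
times = np.concatenate([rng.normal(m, 60.0, size=50) for m in true_means])
X = times.reshape(-1, 1)  # KMeans expects a 2-D array: (n_samples, n_features)

# Fit KMeans with the number of clusters set to the known number of cyclist types.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
centers = np.sort(kmeans.cluster_centers_.ravel())

print("recovered cluster centers (s):", centers.round())
```

Because the simulated groups are well separated, the sorted cluster centers should land close to the hypothetical means used to generate the data.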
https://github.com/oceanuz/car-price-regression
A comprehensive ML evaluation and improvement notebook for a car price prediction model. It covers scoring with R², cross-validation, overfitting/underfitting diagnosis, and polynomial regression. Ridge regression is applied to reduce overfitting, and GridSearchCV is used to find the best alpha hyperparameter.
cross-validation data-science grid-search hyperparameter-tuning machine-learning machine-learning-models model-evaluation overfitting python regression ridge-regression scikit-learn
Last synced: 11 Dec 2025
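The combination of polynomial features, Ridge regularization, and a grid search over alpha might look like the sketch below. It uses synthetic data as a stand-in for the car price dataset, and the polynomial degree and alpha grid are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a price dataset: one feature, quadratic trend plus noise.
X = rng.uniform(-3, 3, size=(200, 1))
y = 2.0 * X[:, 0] ** 2 - X[:, 0] + rng.normal(0, 1.0, size=200)

# A deliberately flexible degree-5 expansion; Ridge shrinks the extra coefficients.
model = make_pipeline(
    PolynomialFeatures(degree=5, include_bias=False),
    StandardScaler(),
    Ridge(),
)

# Illustrative alpha grid, scored by cross-validated R^2.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
search = GridSearchCV(model, {"ridge__alpha": alphas}, cv=5, scoring="r2")
search.fit(X, y)

print("best alpha:", search.best_params_["ridge__alpha"])
print("cross-validated R^2:", round(search.best_score_, 3))
```

Standardizing the polynomial features before Ridge keeps the penalty from falling unevenly on high-degree terms, which have much larger raw magnitudes.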
https://github.com/18mahi/digital_cave
An intermediate-level deep learning project that compares Convolutional Neural Networks (CNN) and Multi-Layer Perceptrons (MLP) on the MNIST handwritten digits dataset. This project demonstrates data augmentation, learning rate scheduling, and visual comparison of model performance.
cnn confusion-matrix data-augmentation data-science deep-learning evaluation-metrics jupyter-notebook keras learning-rate-scheduler machine-learning matplotlib mlp numpy python3 scikit-learn seaborn tensorflow
Last synced: 13 Apr 2026
https://github.com/pranavsp108/time-series-forcasting
A time-series forecasting project to predict hourly energy consumption using Python, Pandas, and an XGBoost regression model.
data-analysis data-science energy-consumption forecasting matplotlib numpy pandas python scikit-learn sustainability time-series xgboost
Last synced: 10 Apr 2026
https://github.com/ahmadbuilds/fake-news-classifier
Classifies news articles as real or fake using an NLP pipeline with TF-IDF + n-grams and machine learning models. Includes text preprocessing, feature engineering, model training, and evaluation.
fastapi logistic-regression matplotlib n-grams nextjs nltk numpy pandas python3 random-forest-classifier react scikit-learn seaborn supervised-learning tf-idf typescript xgboost-classifier
Last synced: 11 Apr 2026
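A TF-IDF plus n-grams text pipeline of the kind this entry describes can be sketched as below. The tiny corpus and labels are invented for illustration, and logistic regression is just one of the model types the entry's tags mention; the repository's actual preprocessing and models may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus; labels: 1 = fake, 0 = real.
texts = [
    "Scientists confirm new vaccine passes clinical trial",
    "City council approves budget for road repairs",
    "Central bank holds interest rates steady",
    "Local school wins national science award",
    "Shocking secret cure doctors do not want you to know",
    "Aliens built the pyramids says anonymous insider",
    "Miracle pill melts fat overnight without exercise",
    "Celebrity secretly replaced by clone, insiders claim",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# ngram_range=(1, 2) adds word bigrams to the usual unigram features.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True, stop_words="english"),
    LogisticRegression(),
)
clf.fit(texts, labels)

vocab = clf.named_steps["tfidfvectorizer"].vocabulary_
bigrams = sorted(term for term in vocab if " " in term)
proba = clf.predict_proba(["Miracle cure shocks doctors"])

print("number of features:", len(vocab))
print("example bigrams:", bigrams[:5])
print("P(real), P(fake):", proba[0].round(3))
```

With only eight training sentences the predicted probabilities are not meaningful; the point is the shape of the pipeline, where the vectorizer and classifier are fit and applied as one estimator.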
https://github.com/pranavsp108/financial-fraud-detection
A comprehensive machine learning project for detecting financial fraud using XGBoost and LightGBM, with a focus on advanced feature engineering, class imbalance handling, and hyperparameter tuning.
classification-model data-science feature-engineering fraud-detection hyperparameter-tuning lightgbm machine-learning pandas python scikit-learn xgboost
Last synced: 04 May 2026
https://github.com/ivanswetz/banana_shelf-life_prediction
Goal: Predict how many days a banana has left before spoiling (“days to death”) based on a photo. This project demonstrates an end-to-end machine learning pipeline: image preprocessing, feature extraction, supervised & semi-supervised learning, and model deployment.
image-processing machine-learning opencv python random-forest scikit-learn supervised-learning
Last synced: 04 May 2026
https://github.com/imehranasgari/mlflow_starter
This project is a hands-on guide to the complete end-to-end MLflow workflow, designed as an educational resource. It demonstrates how MLflow is used in practice for experiment tracking, model versioning, and ensuring a reproducible MLOps lifecycle, focusing on the methodology and best practices rather than high model accuracy.
data-science experiment-tracking mlflow mlops model-registry python scikit-learn
Last synced: 11 May 2026
https://github.com/njorogepaul-moghul/house-price-predictions-kaggle-competition-
Built a predictive model for the Kaggle House Prices competition using feature engineering and LightGBM, achieving strong leaderboard performance.
data-science house-price-prediction-with-lightgbm kaggle-competition lightgbm machine-learning predicting-home-values-using-machine-learning random-forest scikit-learn
Last synced: 15 May 2026
https://github.com/ytalk/deep-learning
A repository dedicated to my journey of learning and experimentation in Deep Learning. It contains several pipelines and implementations on different datasets, exploring models (MLPs, LSTMs, CNNs) and techniques (regression, classification, etc.) with a focus on TensorFlow and Keras.
data-science deep-learning keras machine-learning neural-networks pandas python scikit-learn tensorflow
Last synced: 30 Dec 2025
https://github.com/abhay-rudatala/resume-analyzer
Intelligent Resume Analysis System using Machine Learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), spaCy NER for information extraction, and an interactive Streamlit web app with a custom UI. Built with Python and scikit-learn, and deployed on Streamlit Cloud.
classification machine-learning named-entity-recognition nlp portfolio-project python resume-analysis scikit-learn spacy streamlit
Last synced: 06 May 2026
https://github.com/manishrajmss13/regression_project
A predictive machine learning model to forecast the Algerian Forest Fire FWI using Python, Scikit-learn, and Statsmodels. Includes complete data cleaning and EDA.
data-cleaning-and-preprocessing data-science eda feature-engineering learning-by-doing linear-regression machine-learning python regression scikit-learn statsmodel
Last synced: 09 May 2026
https://github.com/smusab9152/bpm_pred_songs
ML project to predict the Beats Per Minute (BPM) of a song using various audio features. This is a submission for the Kaggle Playground Series (S04E02). The notebook covers a full data science workflow, including EDA, handling skewed data with log transformations, feature scaling, and building various regressions
data-science jupyter-notebook kaggle-competition machine-learning pandas regression scikit-learn
Last synced: 11 May 2026
https://github.com/jobanjps089/mental_wellness
This project is a Flask web application that predicts mental wellness levels based on lifestyle factors such as screen time, sleep hours, and work-related screen exposure. It uses a Machine Learning model trained in Google Colab and deployed via Hugging Face Spaces for public access.
flask joblib puthon3 scikit-learn
Last synced: 16 May 2026
https://github.com/kostadinlambov/time-series-forecasting
This project evaluates the predictive performance of a CNN-LSTM Hybrid deep learning model for Bitcoin price movement prediction.
keras-tensorflow matplotlib-pyplot mlflow numpy optuna pandas python scikit-learn seaborn statsmodels ta-lib tensorflow
Last synced: 07 Apr 2026
https://github.com/chirindaopensource/strapsim_portfolio_similarity_metric
End-to-End Python implementation of STRAPSim: a novel portfolio similarity metric from Li et al. (2025). Combines Random Forest proximity learning with residual-aware bipartite matching to quantify economic substitutability between ETF baskets. Full replication pipeline included.
asset-management bipartite-matching corporate-bonds etf-analysis fixed-income jupyter-notebook machine-learning numba pandas portfolio-optimization portfolio-similarity proximity-matrix python quantitative-finance random-forest research-replication scikit-learn similarity-metrics statistical-analysis supervised-learning
Last synced: 28 Apr 2026
https://github.com/javedfazlulahf/customer-churn-prediction
📊 Predict customer churn in telecom using machine learning to enhance retention strategies and drive better business outcomes.
churn-prediction cross-validation data-science factorization-machines imbalanced-learn libsvm machine-learning model-evaluation pipelines plotly scikit-learn seaborn shap-values spark-ml survival-analysis tensorflow watson-studio xgboost4j
Last synced: 11 May 2026
https://github.com/affan005-ai/tesla-stock-prediction
This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models
data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn
Last synced: 05 Oct 2025
https://github.com/kianaabrisham/stroke-prediction-ml-pipeline
Clinical ML pipeline with ROC/PR and interpretability
class-imbalance clinical-data healthcare interpretability machine-learning pandas pipeline precision-recall roc-auc scikit-learn
Last synced: 05 Oct 2025
https://github.com/blue-catblues/tieba-integratedanalysis
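Python final course project: a comprehensive data analysis of Baidu Tieba, covering web scraping (scrapy), statistical analysis (pandas), visualization (matplotlib), and machine learning classification (scikit-learn)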
matplotlib nlp-machine-learning pandas python scikit-learn scrapy seaborn
Last synced: 05 Oct 2025
https://github.com/nihanthbhargav/time-series-stock-market
This project combines computer vision and NLP by segmenting pet images with a U-Net model and generating captions using CNN-RNN/LSTM. Using the Oxford-IIIT Pets dataset, it demonstrates a unified pipeline that integrates pixel-level segmentation with automatic caption generation for meaningful image understanding.
matplotlib numpy pandas plotly python scikit-learn seaborn
Last synced: 11 Apr 2026
https://github.com/disney35/stock-prices-dashboard
A dashboard to analyze, predict, and visualize stock prices using Python & LSTM
ema jupyter-notebook keras macd matplotlib-pyplot mfi numpy pandas python rsi scikit-learn sma streamlit tenserflow yfinance
Last synced: 12 Apr 2026
https://github.com/khaifara/klafisikasi_jeruk_faiz_kece
Step-by-step machine learning classification with StandardScaler, OneHotEncoder, OrdinalEncoder, ColumnTransformer, Pipeline, classification report, and confusion matrix, plus deployment using Streamlit
machine-learning scikit-learn streamlit
Last synced: 05 Oct 2025
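The components named in this entry fit together roughly as in the sketch below. The fruit features, category names, labels, and the choice of logistic regression are all hypothetical; the repository's real columns and classifier are not stated in the listing.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Hypothetical columns: diameter_cm, weight_g, color (nominal), ripeness (ordinal).
X = np.array(
    [
        [7.2, 180.0, "orange", "ripe"],
        [7.5, 190.0, "orange", "ripe"],
        [6.9, 170.0, "yellow", "ripe"],
        [7.0, 175.0, "orange", "ripe"],
        [5.1, 110.0, "green", "unripe"],
        [5.4, 120.0, "green", "unripe"],
        [6.0, 140.0, "yellow", "overripe"],
        [5.8, 130.0, "yellow", "overripe"],
    ],
    dtype=object,
)
y = np.array(["good", "good", "good", "good", "bad", "bad", "bad", "bad"])

# Route each column group to its own transformer (columns selected by index).
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), [0, 1]),
        ("nominal", OneHotEncoder(handle_unknown="ignore"), [2]),
        ("ordinal", OrdinalEncoder(categories=[["unripe", "ripe", "overripe"]]), [3]),
    ]
)

# Assumption: logistic regression as the classifier; any scikit-learn classifier fits here.
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(X, y)

# Evaluated on the training rows only to keep the example short.
pred = model.predict(X)
cm = confusion_matrix(y, pred)
print(classification_report(y, pred, zero_division=0))
print(cm)
```

Because preprocessing and the classifier live in one `Pipeline`, the fitted object can be serialized and loaded in a Streamlit app, which then only needs to call `predict` on raw input rows.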
https://github.com/veerchaudhary0708/credit-fraud-detection
An end-to-end machine learning project to detect credit fraud using XGBoost.
datascience fintech fraud-detection machinelearning scikit-learn xgboost
Last synced: 18 May 2026
https://github.com/inesruizblach/data-science-project
A data science project exploring Portuguese "Vinho Verde" wine quality prediction. Features EDA, feature engineering, ML models, and evaluation using Python, pandas, scikit-learn, and visualization tools.
binary-classification classification data-science exploratory-data-analysis feature-engineering imbalanced-learn jupyter-notebook machine-learning model-evaluation pandas regression scikit-learn seaborn uci-dataset wine-quality
Last synced: 09 May 2026
https://github.com/kianaabrisham/naive-bayes-sentiment
Sentiment classification using Multinomial NB (scratch + sklearn)
bag-of-words naive-bayes nlp scikit-learn sentiment-analysis text-classification
Last synced: 14 May 2026
https://github.com/scorchinghot/core-machine-learning-exploration
This repository provides a hands-on exploration of classical machine learning algorithms applied to the MovieLens 100k dataset, aiming to build intuition and understanding of core ML concepts.
core-ml data-science hands-on machine-learning ml-algorithms python scikit-learn tutorial
Last synced: 05 Oct 2025
https://github.com/vedanty3/bulldozer-price-prediction
A machine learning project that aims to build a model to predict the sale price of bulldozers.
andrew-ng-machine-learning ensemble-machine-learning gridsearchcv jupyter-notebook machine-learning matplotlib numpy pandas python randomforestregressor randomizedsearchcv scikit-learn ztm
Last synced: 05 Apr 2026