An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/seyha1007/amazon-reviews-analysis

🧐 This project analyzes Amazon Fine Food Reviews to investigate whether negative reviews are more emotionally intense and lexically repetitive than positive ones. Using R, we apply sentiment analysis and lexical diversity metrics to uncover patterns in consumer review language.

acp amazon-reviews bert data-analytics glove jupyter-notebook lstm-sentiment-analysis machine-learning nltk random-forest scikit-learn sentiment-classification sentimental-analysis support-vector-machine

Last synced: 08 May 2026

https://github.com/sundarmd/breast-cancer-detection

Breast-Cancer-Detection is a machine learning project that utilizes logistic regression to predict whether a tumor is benign or malignant based on the Breast Cancer Wisconsin (Diagnostic) dataset. The project demonstrates data preprocessing, model training, and evaluation using the `scikit-learn` library.

logistic-regression machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/aasjunior/mlapp-api

Esta API fornece endpoints para aplicar algoritmos de aprendizado de máquina, como K-Nearest Neighbors (KNN), Árvore de Decisão e Algoritmo Genético. Realizado como tarefa da disciplina de Laboratório Mobile/Computação Natural no 5º Semestre de Desenvolvimento de Software Multiplataforma.

fastapi machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/davidrpugh/kaust-cs-294w

Course materials for KAUST CS 294W

deep-learning machine-learning pytorch scikit-learn

Last synced: 09 May 2026

https://github.com/santiagoasp98/spam-detection

SMS spam detection using Logistic Regression and Multinomial Naive Bayes.

classification logistic-regression machine-learning multinomial-naive-bayes python scikit-learn spam-detection

Last synced: 09 May 2026

https://github.com/mayankanand007/yfraud

Credit card fraud detection platform using scikit-learn and xgboost 💳

knearest-neighbor-algorithm linear-regression machine-learning predictive-analytics python3 scikit-learn svm xgboost

Last synced: 09 May 2026

https://github.com/akwardhan/loan-default-prediction-xgboost-streamlit

Full-scale loan default prediction system using XGBoost, trained on 1.3M LendingClub loans. Includes feature-rich preprocessing, class imbalance handling, recall-focused ML pipeline, and Streamlit web deployment for real-time borrower risk scoring.

credit-risk data-science google-colab loan-default-prediction machine-learning python real-world-project scikit-learn streamlit xgboost

Last synced: 09 May 2026

https://github.com/mpolinowski/multi-dimensional-scaling

Multidimensional Scaling is a family of statistical methods that focus on creating mappings of items based on distance.

matplotlib-pyplot multi-dimensional-scaling python scikit-learn

Last synced: 09 May 2026

https://github.com/saahilanande/naivebayes

Implimenting Naive Bayes classifier from scratch for sentiment analysis of IMDB dataset

machine-learning naive-bayes-classifier python-3 scikit-learn

Last synced: 09 May 2026

https://github.com/thanh12273203/hotel-booking-cancellation-prediction

Binary classification on hotel booking cancellations.

classification machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/adadalshabab/human-stress-analysis-greadsearch-classifier

The project leverages data from physiological signals, self-reported surveys, behavioral observations, or other relevant sources to infer and analyze stress levels.

classification knn-classification machine-learning machine-learning-algorithms matplotlib pandas scikit-learn

Last synced: 09 May 2026

https://github.com/jaswanthv99/basic_ml-model_understanding

This project explains basic ML-Models(KNN, Naive bayes, Logistic Regression, SVM, A neural N/W)

matplotlib-python pandas-python scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/piras-s/braincancerclassifier

Classifying brain tumors using Gaussian Naive Bayes with MRI-derived features. Includes feature selection, model evaluation, prediction uncertainty, and probability calibration.

baysian-inference calibrated-classification classification data-visualization feature-selection machine-learning medical-imaging naive-bayes-classifier python scikit-learn uncertainty-estimation

Last synced: 09 May 2026

https://github.com/suvasish114/house-price-estimation

A machine learning model that estimate housing prices in California using the California census data

jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/amirdora/python_ml_supervisedlearning_example

Building Classification Models with scikit-learn

machine-learning python3 scikit-learn

Last synced: 10 May 2026

https://github.com/macdon112/credit-card-fraud-detection

Comparing ML models (Random Forest, KNN, Decision Tree) for credit card fraud detection using SMOTE and stratified cross-validation.

classification data-analysis fraud-detection imbalanced-data machine-learning python scikit-learn

Last synced: 10 May 2026

https://github.com/hassanislam463/nyc_airbnb_eda

This project is a comprehensive data analysis of Airbnb listings in New York City, exploring pricing trends, seasonality effects, host market dynamics, rental preferences, and revenue estimation. It provides valuable insights for hosts, investors, and policymakers to optimize Airbnb operations and understand the short-term rental landscape in NYC.

exploratory-data-analysis matplotlib python scikit-learn seaborn

Last synced: 10 May 2026

https://github.com/ejw-data/ml-classification-credit-risk

Compares several machine learning classification models to determine whether to approve or reject a loan request

classification python scikit-learn

Last synced: 10 May 2026

https://github.com/i30101/mathworks2024

Coding tools for 2024 MathWorks Math Modeling Challenge

machine-learning mathematical-modelling python scikit-learn

Last synced: 10 Jun 2026

https://github.com/vijaykumarr1452/ipl-first-innings-score-prediction-deployment

Deployment of IPL Score Prediction Analyser Model. https://github.com/vijaykumarr1452/IPL-First-Innings-Score-Prediction)

css deployment gunicorn html machine-learning ml predictive-analytics python scikit-learn

Last synced: 11 May 2026

https://github.com/mpolinowski/tstochastic-neighbor-embedding

Improve Data Quality by discarding non-correlating, noisy Dimensions

matplotlib-pyplot python scikit-learn t-sne

Last synced: 11 May 2026

https://github.com/bheemisme/brain-tumor-classification

brain tumor classification using machin learning

deep-learning machine-learning pytorch scikit-learn xgboost

Last synced: 11 May 2026

https://github.com/theladev/machine-learning

This repository is focus on show u my personal projects and interests on Machine Learning and Data Science. Hope u enjoy it.

data-science machine-learning machine-learning-models pandas python scikit-learn

Last synced: 11 May 2026

https://github.com/johannesvc/data-science-portfolio

A curated portfolio of applied data science projects focused on machine learning, NLP, and social impact.

academic-portfolio data-science deep-learning keras machine-learning media-bias nlp pandas scikit-learn

Last synced: 11 May 2026

https://github.com/ananyagubba/bike-sharing-demand-prediction

Using machine learning techniques, the model learns from features such as weather conditions, time of day, season, and holiday information to forecast hourly or daily demand.

machine-learning python scikit-learn seaborn

Last synced: 11 May 2026

https://github.com/sharvesh1401/inverse-design-patch-antenna

A machine learning approach to the inverse design of microstrip patch antennas by predicting optimal physical dimensions from desired performance metrics.

antenna-design deep-learning engineering-project gradio jupyter-notebook machine-learning patch-antenna python regression-model scikit-learn

Last synced: 11 May 2026

https://github.com/cptanalatriste/copycat-detector

A Naive-Bayes classifier for detecting plagiarism.

amazon-sagemaker naive-bayes-classifier scikit-learn

Last synced: 12 May 2026

https://github.com/xunchiasg/nyc_property_sales

Exploratory Data Analysis of rolling property sales data in NYC from March 2023-2025

matplotlib-pyplot plotly python scikit-learn

Last synced: 12 May 2026

https://github.com/capsuleismail/rt-iot2022

RT-IoT2022 is a dataset obtained from a real-time IoT infrastructure. This project aims to compare the accuracy of three machine learning models: XGBoost and LGBMClassifier.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 12 May 2026

https://github.com/srosalino/prediction_of_seoul_bikes_demand

The objective of this project is to predict the number of bicycles needed to be made available each hour in order to make the service as efficient as possible

cross-validation data-exploration-and-preprocessing hyperparameter-tuning machine-learning regularization-methods scikit-learn

Last synced: 13 May 2026

https://github.com/msikorski93/heart-failure-prediction

The subject of this repository was to perform binary classification based on respondent's collected features (age, cholesterol level, fasting blood sugar, thallium stress test results, etc.).

classification knn-classifier logistic-regression random-forest-classifier roc-curves scikit-learn svm-classifier

Last synced: 13 May 2026

https://github.com/fgebhart/handson-ml

hands-on machine learning notebooks collection

jupyter-notebook machine-learning scikit-learn

Last synced: 13 May 2026

https://github.com/johanneswiesner/skplot

A python package for extracting, plotting and reporting information from one or multiple sklearn classification & prediction pipelines.

plotting python scikit-learn sklearn visualization

Last synced: 14 May 2026

https://github.com/janek1842/mlbyjan-sandbox

Testbed for private ML investigations

ml scikit-learn

Last synced: 14 May 2026

https://github.com/fulviofavilla/cvd-prediction-ml

Comparative ML analysis for CVD prediction. Winner of the 2023 HPCC Systems Poster Competition.

data-science ecl healthcare hpcc-systems machine-learning pandas python scikit-learn

Last synced: 11 Jun 2026

https://github.com/arjunan-k/medical_insurance

Project to analyze and forecast medical insurance costs of patients using data science framework.

medical-insurance scikit-learn tableau

Last synced: 12 Jun 2026

https://github.com/jayemscript/lab-to-code

A complete Python learning roadmap for scientists and researchers — covering data science, biology, chemistry, physics, and mathematics with curated libraries, tools, and resources.

bioinformatics chemistry data-science jupyter-notebook machine-learning mathematics numpy pandas physics python research roadmap scientific-computing scikit-learn

Last synced: 19 Jun 2026

https://github.com/royxlead/production-drift-detection

Production ML monitoring library - KL, PSI, MMD, and ADWIN drift detectors with empirical benchmarks, confidence tracking, and a 6-page FastAPI dashboard.

data-drift drift-detection fastapi kl-divergence mlops mmd model-monitoring production-ml psi pytorch scikit-learn uncertainty-quantification

Last synced: 23 Jun 2026

https://github.com/zsailer/skspline

A Scikit-learn interface on Scipy's spline.

scikit-learn scipy

Last synced: 16 Apr 2026

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

【Sprinkle some star dust on this repo! ⭐️ It's good karma!】A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/smuralee/machine-learning-samples

Machine learning samples

pytorch scikit-learn

Last synced: 15 Feb 2026

https://github.com/mgesteban/analyzing_car_prices

A comprehensive data science project analyzing factors that drive used car prices to provide actionable insights for used car dealerships.

crisp-dm data-science lasso-regression linear-regression machine-learning one-hot-encoding pandas ridge-regression scikit-learn

Last synced: 15 Feb 2026

https://github.com/paultheal1en/dsc-fact-checking

Fact-checking project classifying claims as SUPPORTED, REFUTED, or NEI. Uses ANN, DNN, RNN, CNN, Random Forest, PhoBERT, and Sentence Transformers.

deep-learning fact-checking keras machine-learning nlp phobert random-forest scikit-learn sentence-transformers tensorflow transformers

Last synced: 16 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-combined-cycle-power-plant-

This project is focused on Multiple machine learning models, including Linear Regression, Decision Tree Regression, and Random Forest Regression, were implemented to predict the target variable and evaluated using various metrics like RMSE, MAE, and R-squared. The performance of these models was compared, and the Random Forest Regressor was found.

data-processing decisiontreeregressor linear-regression metrics-evaluation python random-forest-regressor scikit-learn

Last synced: 16 Apr 2026

https://github.com/sasanka14/water_quality_predictions

Water Quality Prediction - College Project 🌊💧 Predicts water potability (safe/unsafe) using ML models like XGBoost & Random Forest. Features data preprocessing, feature importance, model evaluation, and visualizations. Built with Python, Pandas, Scikit-learn & Seaborn for analysis. 🚀

anaconda jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/pramodyasahan/health-insurance-cost-prediction

This project focuses on predicting health insurance costs using a polynomial regression model. By employing machine learning techniques in Python, the project aims to accurately estimate insurance costs based on various personal attributes. The model takes into account several features including age, sex, BMI, number of children, smoking status etc

machine-learning matplotlib numpy pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/piotrwnuczek/cloudprediction

Predicting cloud task execution time using AI/ML

matplotlib pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/silky-x0/spam-detector

An machine learning algorithm to detect spam emails or such.

jupyter-notebook nltk-python pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/sergeimakarovv/solar-panel-detection

Applying deep learning models to detect solar panel installations in satellite imagery and estimating their generation capacity

albumentations convolutional-neural-networks deep-learning geopandas pandas pvlib python pytorch rasterio scikit-learn wms-service

Last synced: 16 Apr 2026

https://github.com/meiyor/abatech_ai_test

This repository contains the files for deploying an Exploratory Data Analysis (EDA) for participant demographic and company-based data collected by the outsourcing service given by the company ABATech located in Colombia. This repository also includes the evaluation of three different classifiers to decode the level of satisfaction of the users.

keras python scikit-learn scikitlearn-machine-learning tensorflow

Last synced: 16 Apr 2026

https://github.com/archish27/pythontutorial

Python Programming Tutorial for new geeks who want to learn python from scratch to deal with various applications

matplotlib numpy pandas pygame python python-2 python-3 scikit-learn soup

Last synced: 01 Apr 2026

https://github.com/dan-niles/iris-ml

Machine learning on the Iris dataset

iris-dataset machine-learning scikit-learn

Last synced: 16 Apr 2026

https://github.com/sahiltiwariiii/dssp

Predicting student math scores ! This project utilizes advanced machine learning techniques and MLOps tools like DVC and MLflow to predict a student's math score based on various factors such as gender, race/ethnicity, parental level of education, lunch type, test preparation course, writing etc

docker dotenv dvc flask machine-learning mlflow mlops mysql mysql-connector-python numpy pandas pymysql python python-dotenv scikit-learn seaborn sklearn-library statistics streamlit

Last synced: 27 Mar 2026

https://github.com/sanikamal/deep-learning-atoz

A collection of deep learning architectures ,model, code snippets, tips and mini projects.

computer-vision deep-learning nlp scikit-learn skimage tensorflow

Last synced: 16 Apr 2026

https://github.com/bkamapantula/discover

Code search utility to assist developer workflows via code discovery. Currently uses TF-IDF estimator.

developer-tools python scikit-learn tf-idf

Last synced: 16 Apr 2026

https://github.com/grupoguerreroherrera/ethical-ai-recruitment-audit

Bias audit toolkit reproducing the recruitment AI case from Activity 6 — Unidad 3, Electiva II Inteligencia Artificial Avanzada. Empirical analysis with reweighing mitigation, Model Card documentation, and APA 7 references.

academic-project algorithmic-auditing artificial-intelligence bias-mitigation disparate-impact ethical-ai fairness machine-learning model-card python random-forest recruitment-bias reweighing scikit-learn unesco-ai-ethics

Last synced: 03 Jun 2026

https://github.com/supershivam5/python_projects

💻 Python programming with Numpy, Pandas, Matplotlib.🌟 Love exploring new technologies. Check out my projects!

matplotlib-pyplot numpy pandas scikit-learn seaborn

Last synced: 17 Apr 2026

https://github.com/junya737/weighted-pls-regression

A Python implementation of Weighted Partial Least Squares Regression with support for sample weights.

machine-learning partial-least-squares-regression scikit-learn

Last synced: 17 Apr 2026

https://github.com/erikglz/coap-mtd

Repository for an IoT security project implementing Moving Target Defense (MTD) through CoAP protocol randomization to mitigate spoofing attacks and enhance adaptive security.

coap-protocol cybersecurity iot machine-learning python scikit-learn spoofing

Last synced: 17 Apr 2026

https://github.com/zenklinov/regression_logistic_-_sentiment_analysis_movie_data

This repository contains code for performing sentiment analysis using scikit-learn and logistic regression

llm natural-language-processing nlp nltk scikit-learn sentiment-analysis

Last synced: 10 May 2026

https://github.com/dimdasci/car-price-prediction-demo

Demo project of EDA and regression task solution: Pandas, Jupyter Notebook, Scikit-learn, LightGBM

eda lightgbm-regressor regression scikit-learn

Last synced: 03 Jun 2026

https://github.com/kheriberto/knn_project

This is a simple project that uses dummie data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/iamwatchdogs/cardiovascular-risk-prediction

This mini-project uses machine learning algorithms to predict possible risks of heart disease by analyzing given data.

jupyter-notebook machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/akshitvats026/heart_disease_prediction

An ML-based Heart Disease Prediction System that predicts the likelihood of heart disease based on user health parameters. Built using Python, Pandas, and Scikit-learn, the system performs data preprocessing, trains a predictive model, and provides real-time predictions with high accuracy.

accuracy-score logistic-regression machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/broodhoney/blue-book-for-bulldozers

This repository holds the project which solves a regression problem on predicting the futures sales of bulldozers. This is from a kaggle competition.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/raphael-ufrj/analise_algodao

Análise histórica de plantio de algodão, analise do plantio com base no clima e nos dados históricos.

analysis data-science data-visualization dataset docker pandas provenance python python3 scikit-learn seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/anshvaid4/ml_practice

This is the new repository, where I have added all the notebooks demonstrating the usage of various transformers and models for Supervised and Unsupervised algorithms

anaconda jupyter-notebook machine-learning machine-learning-algorithms python scikit-learn

Last synced: 17 Apr 2026

https://github.com/prashver/end-to-end-model-deployment-on-aws

Student Performance Analysis with Machine Learning analyzes factors impacting student outcomes using a robust machine learning pipeline. Achieving an impressive R2 score, it predicts student performance effectively. With extensive data preprocessing and deployment on AWS Elastic Beanstalk, it ensures scalability and high availability.

amazon-web-services aws-elastic-beanstalk end-to-end-deployment flask machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/orliluq/inmersion-datos-python

Desarrollar modelos de machine learning para predecir la probabilidad de incumplimiento crediticio de los clientes, utilizando diferentes algoritmos de clasificación (Regresión Logística, Árboles de Decisión, Random Forest, Naive Bayes).

colab-notebook numpy pandas python scikit-learn

Last synced: 02 Apr 2026