An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/priyanshulathi/air-quality-index-prediction

Machine learning based air quality index prediction using environmental and pollutant data to classify and forecast pollution levels.

machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 19 Jan 2026

https://github.com/khushirajurkar/exoplanet-habitability-prediction-model

Predicts whether an exoplanet is habitable using ML. Handles class imbalance with ADASYN, tests multiple models, and saves the best one. Includes confusion matrices, ROC curves, and a clean Jupyter notebook

adasyn astroinformatics confusion-matrix exoplanets logistic-regression machine-learning multiclass-classification python roc-curve scikit-learn smote

Last synced: 06 May 2026

https://github.com/jeus0522/7-explore-different-classifier-ml-app

A project exploring various classification algorithms, showcasing their implementation, comparison, and evaluation using Python and scikit-learn.

k-nearest-neighbours knn random-forest scikit-learn streamlit support-vector-machine svm

Last synced: 21 Jan 2026

https://github.com/fanyicharllson/mobile-money-transaction-analysis

Machine learning pipeline for classifying mobile money users (MTN MoMo & Orange Money) into activity segments — CSC 3221 Final Project, ICT University Cameroon.

cameroon data-science ict-university jupyter jupyter-notebook machine-learning mtn-momo orange-money python scikit-learn

Last synced: 31 May 2026

https://github.com/katjaweb/king-county-house-price-prediction

This project aims to predict house prices based on various features such as square footage, number of rooms or location.

machine-learning python regression scikit-learn

Last synced: 19 Jan 2026

https://github.com/lorenzorottigni/ml-breast-cancer

Machine Learning python bootcamp: Support Vector Machines using breast cancer dataset

ipynb machine-learning numpy pandas python scikit-learn seaborn support-vector-machines

Last synced: 14 Apr 2026

https://github.com/manome/python-supervised-learning

This project provides sample code for performing supervised learning.

conformal-prediction scikit-learn supervised-learning

Last synced: 19 Jan 2026

https://github.com/soumyapro/parkinson-disease-prediction

This project predicts Parkinson's disease using machine learning models.

logistic-regression numpy pandas scikit-learn svc xgboost

Last synced: 19 Jan 2026

https://github.com/sudarshanc00/smishing

This project aims to classify text messages to detect potential smishing (SMS phishing) attacks. Using machine learning, the project provides a classifier that can differentiate between legitimate messages and smishing attempts, helping to prevent scams.

nltk numpy pandas python scikit-learn scipy

Last synced: 14 Apr 2026

https://github.com/afkewolczyk/data_science_bootcamp

A data science project to learn data science essentials such as: pandas, Matplotlib, Scikit learn

ai data-science machine-learning pandas scikit-learn

Last synced: 07 May 2026

https://github.com/stewartpark/sklearn2gem

⚡ sklearn2gem ports your scikit-learn model into a fast ruby C binding!

ruby rubygem scikit-learn sklearn

Last synced: 01 Mar 2026

https://github.com/ricardorobledo/paymentcardfrauddetection2025

Comparative analysis of probabilistic classification models for credit card fraud detection, focusing on model calibration and threshold optimization in highly imbalanced datasets.

imbalanced-learn matplotlib numpy pandas python3 scikit-learn search

Last synced: 14 Apr 2026

https://github.com/chaakshay/heartdrive

A Streamlet-based tool that analyzes cardiovascular health data, predicts population risk using ML, and suggests targeted government actions like awareness campaigns, health checkups, and policy changes.

csv mathplotlib numpy pandas pandas-dataframe python scikit-learn seaborn streamlit

Last synced: 05 Apr 2026

https://github.com/gabrielmazzotta/nlp-clustering--movie-similarity-from-plot-summaries

A Python-based movie recommendation system leveraging NLP and clustering techniques. This project includes data processing, vectorization of plot summaries, and the implementation of recommendation algorithms to suggest similar movies based on user input.

clustering cosine-similarity hierarchical-clustering kmeans lemmatization nlp recommendation-engine scikit-learn similarity-score spacy tokenization

Last synced: 21 Jan 2026

https://github.com/nikshithmenta/fake-news-detector

This repository contains a Streamlit web app designed for fake news detection. Users can input a news article, and the app will predict whether it's real or fake based on its content. It also allows users to choose between different vectorizers (TF-IDF or Bag of Words) and classifiers (Linear SVM or Naive Bayes) to customize the prediction model.

bag-of-words fake-news-detection linear-svc naive-bayes-classifier scikit-learn streamlit-application tf-idf

Last synced: 15 May 2026

https://github.com/1adityakadam/tweet-classification-using-nlp-techniques

This project classifies tweets as toxic or non-toxic using NLP and machine learning. It includes preprocessing, feature engineering, and models like Logistic Regression, Random Forest, and XGBoost on labeled tweet datasets. Technologies: Python, Pandas, NLTK, Scikit-learn, XGBoost.

nltk pandas python scikit-learn xgboost

Last synced: 05 May 2026

https://github.com/sudothearkknight/15-machinelearningprojects

A curation of 15 Machine Learning projects in various fields that are helping me gain a better understanding of the different machine learning tools, techniques, algorithms and methodalogies.

classification-algorithm machine-learning machine-learning-algorithms natural-language-processing pycharm-ide python3 regression-models scikit-learn scikitlearn-machine-learning spam-detection

Last synced: 19 Jan 2026

https://github.com/angelarreola/ai_notes

Notas de la materia "Inteligencia Artificial" para su posterior extraccion mediante algun modelo de lenguaje que nos permita dar respuestas personalizadas con base a la informacion presente en este repositorio.

ai matplotlib numpy pandas phaserjs python scikit-learn

Last synced: 21 Jan 2026

https://github.com/taimoorkhan10/ai-fairness-explainability-toolkit

AI Fairness and Explainability Toolkit (AFET) is an open-source project aimed at providing tools and frameworks to assess, visualize, and mitigate bias in machine learning models. It supports multiple ML frameworks and offers a comprehensive suite of metrics and visualization components to enhance model transparency and fairness.

ai bias-detection data-science ethical-ai explainable-artificial-intelligence fairness machine-learning mlops model-interpretation open-source python responsible-ai scikit-learn

Last synced: 19 Jan 2026

https://github.com/srijon57/disease-detector

Practice python3 detecting diseases based on symptoms

flask machine-learning pandas python3 scikit-learn typescript vite

Last synced: 11 Apr 2026

https://github.com/waikato-datamining/shallowflow-sklearn

scikit-learn support for the shallowflow Python workflow system.

python3 scikit-learn sklearn workflow-engine

Last synced: 14 Apr 2026

https://github.com/nisch-mhrzn/house_prediction

This project predicts house prices using data exploration, feature engineering, and machine learning models like Linear Regression and Random Forest. It demonstrates how to optimize models and evaluate their performance to accurately forecast house prices.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project analyzing the sleep quality affecting factors and temporal patterns in the sleeping data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/angelalim88/jakarta-air-quality-index-classification

This project classifies Jakarta's Air Quality Index (AQI) from 2010 to 2023 using machine learning models (Random Forest, MLP, SVM) based on pollutant concentrations.

data-analysis data-visua machine-learning scikit-learn tensorflow

Last synced: 13 Oct 2025

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/wasifsohail5/amusic-ai_powered_musicrecommendationsystem

AMUSIC is an AI-driven music recommendation system that helps users discover personalized songs. Using Python, Streamlit, and Scikit-learn, it offers smart recommendations, advanced search, and interactive music insights. Users can save favorites, create playlists, and export data for a seamless music discovery experience.

joblib k-nearest-neighbours matplotlib minmaxscaler numpy pandas pickle plotly python scikit-learn seaborn streamlit

Last synced: 14 Oct 2025

https://github.com/ayorick23/python-data-science-cheat-sheet

Guía rápida y práctica de sintaxis, comandos y funciones esenciales de Python para Ciencia de Datos. Perfecta para recordar cómo usar las librerías más comunes como NumPy, Pandas, Matplotlib y Scikit-learn en tus análisis diarios.

cheat-sheet data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning matplotlib ml numpy pandas python scikit-learn scipy seaborn statistics sympy tensorflow

Last synced: 07 Apr 2026

https://github.com/yahiazakaria445/ensemble-learning-voting-classifier

Ensemble Learning Using KNN, Naive Bayes, Decision Tree on Biomechanical Data

matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/aasjunior/machinelearningapp

O Machine Learning App é um aplicativo desenvolvido com Kotlin, Android Studio e Jetpack Compose, para aplicação de algoritmos de aprendizado de máquina e exibição dos resultados. Realizado como tarefa da disciplina de Laboratório Mobile/Computação Natural no 5º Semestre de Desenvolvimento de Software Multiplataforma.

fastapi jetpack-compose kotlin-android machine-learning material-design scikit-learn

Last synced: 18 Apr 2026

https://github.com/avik-pal/kaggle-titanic

Predicting whether a given set of people survive on the Titanic

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 14 Apr 2026

https://github.com/mahambilalandahaan/week8

K-Means Deep Dive: Clustering analysis with Elbow and Silhouette methods in Python

clustering data-visualization jupyter-notebook k-means machine-learning python scikit-learn unsupervised-learning

Last synced: 20 Apr 2026

https://github.com/harris-giki/cancerdetectionmodel_ml

Simple Logistic Regression and Neural Network powered Machine Learning models that predicts whether a breast tumor is malignant or benign based on input features extracted from a breast cancer dataset.

cancer-detection development keras keras-tensorflow logistic-regression machine-learning neural-network scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/josancamon19/boston_housing

Predicting Boston Housing Prices for Udacity Machine Learning Nanodegree

boston-housing-price-prediction machine-learning machine-learning-nanodegree scikit-learn udacity

Last synced: 21 Apr 2026

https://github.com/moanassiddiqui/handsonml_ml

This is the complete part I of the Hands-On Machine Learning book which was about the classical machine learning models.

hands-on machine-learning scikit-learn

Last synced: 14 Mar 2026

https://github.com/gregoritsch3/ml_eda_classification_diabetes

An EDA and Machine Learning Classification exercise on the Diabetes dataset demonstrating the use of SQLAlchemy data import from an SQL database (PostgreSQL), Pre-processing Pipelines, ANOVA, 9 ScikitLearn ML models, Hyperparamter Tuning for the best performing one, and feature importance.

anova machine-learning matplotlib numpy pandas pipelines scikit-learn seaborn sql sqlalchemy statistics

Last synced: 14 Apr 2026

https://github.com/ljubogdan/solar-cycle-lstm

This project predicts sunspot activity using an LSTM model for time series data. Built with TensorFlow and Keras, it uses Huber loss for outlier handling and MAE for performance evaluation. The dataset, sourced from Kaggle or SIDC, spans over 270 years of monthly sunspot data.

conv1d huber-loss-regression kaggle keras lstm machine matplotlib numpy pandas scikit-learn seaborn solar sunspots tensorflow time-ser

Last synced: 13 Apr 2026

https://github.com/szymon-budziak/ai_football_game_analysis

Football game analysis using YOLOv8 for object detection, Optical Flow for motion tracking, speed and distance calculations, perspective transformation, and K-Means clustering for pixel segmentation.

ai computer-vision kmeans object-detection optical-flow python3 pytorch roboflow scikit-learn segmentation supervision ultralytics yolov8

Last synced: 03 May 2026

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

Проект демонстрирует применение моделей машинного обучения на основе деревьев решений и случайного леса для классификации набора данных Iris. Включает в себя загрузку данных, обучение моделей, оценку производительности и визуализацию результатов. Предназначен для изучения основ машинного обучения и анализа данных.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/sunilvarma-l/liverdiseaseprediction

"Streamlit app to predict liver disease risk using a machine learning model based on patient input data."

machine-learning matplotlib numpy pandas pickle python scikit-learn seaborn streamlit

Last synced: 13 Apr 2026

https://github.com/farhad-here/predict_student_performance

Predict Student Performance, is a data analysis and machine learning project aimed at predicting students' final performance (g3) based on demographic, family, and academic features. The project supports both Regression (predicting exact grades) and classification (Pass/Fail categories).

classification data-analysis data-visualization linear-regression machine-learning numpy pandas postgresql powerbi scikit-learn streamlit

Last synced: 14 Apr 2026

https://github.com/darkdk123/customer-churn-prediction-innobytes

Predicting Customer churns as an Internship project at Innobytes services.

data-science python scikit-learn streamlit xgboost-classifier

Last synced: 14 Apr 2026

https://github.com/hassan11196/churn-nn

A simple Churn Predictor using Scikit's Multi-Layer Perceptron Classifier

jupyter-notebook machine-learning ml neural-network python scikit-learn

Last synced: 14 Apr 2026

https://github.com/mecha-aima/fake-bills-detection

This Python project implements a simple classification model comparison using scikit-learn to classify banknotes as either "Authentic" or "Counterfeit" based on four features

classification-model machine-learning model-selection scikit-learn

Last synced: 27 Jan 2026

https://github.com/icepanorama/internship-visualizations-and-demonstrations

A collection of some of the programs that I've written over the course of my internship.

artificial-intelligence machine-learning matplotlib numpy pandas python3 pytorch scikit-learn

Last synced: 14 Apr 2026

https://github.com/mahdi-meyghani/movie-recommendation-system

A Python-based movie recommendation system utilizing popularity-based, content-based, and collaborative filtering models with data science and machine learning techniques.

data-analysis data-science machine-learning recommendation-system scikit-learn scikitlearn-machine-learning

Last synced: 23 Jan 2026

https://github.com/shubhamsoni98/prediction-with-binomial-logistic-regression

To predict client subscription to term deposits and optimize marketing strategies by identifying potential subscribers.

binomial data data-science eda machine-learning matplotlib pipeline python scikit-learn seaborn sklearn sql visualization

Last synced: 06 Feb 2026

https://github.com/haseeeb21/machine-learning-models

Machine Learning Models trained on Scikit-learn datasets. This repository contains the code files and saved models trained on Toy datasets (Classification & Regression), and Real World dataset.

anaconda classification classification-models jupyter-notebook knn knn-classification machine-learning machine-learning-algorithms python3 regression regression-models scikit-learn scikit-learn-python scikitlearn-machine-learning svm svm-classifier vscode

Last synced: 07 May 2026

https://github.com/messierandromeda/sentiment-analysis

Sentiment analysis with the IMDB movie review dataset.

imdb-dataset python scikit-learn sentiment-analysis

Last synced: 28 Jan 2026

https://github.com/bilgenurbekar/turkishcyberbullying

Contains fine-tuned BERT models and results in the text classification category using Turkish social media data

bert-fine-tuning huggingface-transformers matplotlib numpy pandas python pytorch scikit-learn transformers

Last synced: 07 Mar 2026

https://github.com/alexliap/sk_serve

Deployment of a Scikit-Learn model and it's column transformations made easy.

machine-learning mlops model-deployment scikit-learn

Last synced: 24 Oct 2025

https://github.com/jofaval/sonar

Binary Classification of Sonar Signals of Rocks and Metal cylinders in 1987

data-analysis data-science data-visualization machine-learning python scikit-learn sonar uci

Last synced: 09 Apr 2026

https://github.com/santiagoenriquega/ez-animate

A Python package for creating Matplotlib animations with minimal code. Built to quickly visualize model behavior.

animation machine-learning matplotlib python scikit-learn

Last synced: 15 Mar 2026

https://github.com/luliatuccu/weather_analysis

This project highlights a combination of data science techniques and Python programming to explore real-world weather data.

data-preprocessing eda feature-engineering machine-learning matplotlib numpy pandas regex scikit-learn seab seaborn weather weather-patterns

Last synced: 02 Apr 2026

https://github.com/smahala02/svm-machine-learning

This repository provides an in-depth tutorial and practical implementation of Support Vector Machines (SVM) for classification tasks, using Python and popular data science libraries.

classification data-science machine-learning python scikit-learn svm

Last synced: 30 Jan 2026

https://github.com/pradeep31747/smartsuggest-personalized_product_recommendations

This project implements a personalized product recommendation system using machine learning techniques to enhance user experience and drive engagement.

jupyter-notebook keras numpy pandas pyhton scikit-learn sql tensorflow vscode

Last synced: 28 Jan 2026

https://github.com/adriantomin/bulldozer-price-prediction

Predicting the Sale Price of Bulldozers Using Machine Learning 🚜💰 This project uses machine learning to predict bulldozer sale prices based on historical data from the Kaggle Bluebook for Bulldozers competition. The goal is to minimize the RMSLE between actual and predicted prices.

data-science jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 23 Jan 2026

https://github.com/mohammad95labbaf/churn-prediction

This project aims to predict customer churn using machine learning algorithms. The project includes data preprocessing, feature engineering, and model evaluation.

adaboost bagging churn churn-analysis churn-prediction decisiontree ensemble-learning knn randomforest scikit-learn sklearn svm voting

Last synced: 23 Jan 2026

https://github.com/elprofesoriqo/chrome-extension-gmail-spam-filter

Chrome extension that automatically identifies and moves emails marked as spam to the spam folder in Gmail.

api-client chrome-extension firebase-database javascript machine-learning python scikit-learn

Last synced: 09 Apr 2026

https://github.com/sonaligill/olympics-analysis

The outcome of this project is an interactive streamlit web application that visualizes the analysis of Olympic data while rendering different aspects of Olympic history, compare country performances, and gain insights into athlete demographics.

numpy plotly python scikit-learn scipy streamlit

Last synced: 28 Jan 2026

https://github.com/rahul-120/crop_recom

This project is a Machine Learning based Crop Recommendation System built using Flask. It helps farmers or users decide the most suitable crop to grow based on soil nutrients and environmental conditions.

crop-recommendation-system flask flask-application machine-learning python3 scikit-learn

Last synced: 02 May 2026

https://github.com/ficaan/ml-dl-projects

A collection of Machine Learning and Deep Learning projects implemented with frameworks including PyTorch, TensorFlow and scikit-learn.

deep-learning deep-learning-projects machine-learning machine-learning-projects pytorch scikit-learn tensorflow

Last synced: 27 Oct 2025

https://github.com/asut00/Machine-Learning-Program_42AI

Comprehensive Machine Learning path by 42AI: hands-on modules on regression, gradient descent, and real-world ML applications.

linear-regression machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 27 Oct 2025

https://github.com/raulmaulidhino-dev/ml_modelling_regression

There are many factors that influence the grades/scores of students. One of the factors is study hours. In this mini analysis project, there are 3 models that will learn and predict the relation between study hours of students and their scores in an exam/test. This project will result the best ML model to solve the problem.

data data-analysis-python data-science eda machine-learning scikit-learn

Last synced: 28 Jan 2026

https://github.com/anuranjanjain/video-upscaler

A WebAPP designed for upscalling video to HD Resolution using custom Denoise filter and OpenCV

artificial-intelligence opencv python scikit-learn tailwindcss

Last synced: 24 Jan 2026

https://github.com/sahraiidle/email-spam-detector

Email/SMS spam detector with a Flask UI/API, tuned ML models (TF‑IDF + SVM/LogReg/NB), and a ready-to-run web form plus JSON endpoint for predictions.

data machine-learning numpy pandas python randomforest scikit-learn spam-classifier spam-detection svm

Last synced: 24 Jan 2026

https://github.com/pyzit/recommandation-engine-in-drf-sk-learn

Full Stack Movie Recommendation System Project made in Django REST Framework and React JS

api django django-rest-framework movies reactjs recommender-system scikit-learn

Last synced: 28 Jan 2026

https://github.com/glennx1/heartdrive

ML-powered heart disease predictor using Streamlit, featuring data preprocessing, visualization, and user input interface.

matplotlib pandas python scikit-learn seaborn streamlit

Last synced: 29 Apr 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/allwin107/loan-prediction-web-app

A Flask-based loan prediction web app using a Random Forest model to predict loan approval based on user input. It includes a clean, responsive UI, form validation, and real-time prediction display.

classification data-processing deployment flask loan-prediction machine-learning python random-forest-classifier scikit-learn web-application

Last synced: 15 Apr 2026

https://github.com/samjoesilvano/airline_ticket_fare_prediction

Airline Fare Prediction using Machine Learning focuses on developing a Random Forest model to predict flight prices, achieving an R² score of 0.804. The project includes hyperparameter tuning using RandomizedSearchCV, alongside extensive data preprocessing and feature engineering to ensure robust model performance.

airline-fare-prediction data-preprocessing data-visualization feature-engineering feature-selection hyperparameter-tuning machine-learning pandas python random-forest randomizedsearchcv regression-analysis scikit-learn

Last synced: 15 Apr 2026

https://github.com/shahaba83/airplane-ticket-cancellation

In this project, we try to predict the possibility of canceling the plane ticket by the buyer

datatime numpy pandas python scikit-learn seaborn

Last synced: 25 Feb 2026

https://github.com/asherk7/house-price-prediction

House Prices - Advanced Regression Techniques - Predict sales prices and practice feature engineering, RFs, and gradient boosting

data-science numpy pandas regression scikit-learn

Last synced: 15 Apr 2026

https://github.com/chengetanaim/beatrecommendersystembackend

A system for music producers and rappers/singers. I was trying to implement the product recommendation feature for music uploaded by producers. I used the collaborative filtering algorithm to be able to recommend songs to users.

fastapi scikit-learn sqlalchemy unsupervised-learning

Last synced: 06 Feb 2026

https://github.com/lau1944/coronavirus-world-prediction

Coronavirus Case Confirmed Trend Around The World

coronavirus pandas python scikit-learn

Last synced: 15 Apr 2026