An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classical machine learning. It is built on top of SciPy.

https://github.com/h-sarhan/hate-speech-classifier

Automatic Detection of Hate Speech and Offensive Content

nlp python scikit-learn

Last synced: 22 Apr 2026

https://github.com/hoccyy/house-price-prediction

Machine learning model built with Scikit-learn to predict house prices based on various features.

linear-regression machine-learning ml pickle prediction-model scikit-learn scikitlearn-machine-learning

Last synced: 24 Apr 2026

https://github.com/capsuleismail/parkinsons-telemonitoring-dataset

Dataset used to predict Parkinson’s disease severity based on biomedical voice measurements.

data-science jupyter-notebook machinelearning-python scikit-learn

Last synced: 25 Apr 2026

https://github.com/bp0609/decision-tree-implementation-from-scratch

This repo contains a from-scratch decision tree implementation covering all four cases: (i) discrete features, discrete output; (ii) discrete features, real output; (iii) real features, discrete output; (iv) real features, real output.

decision-tree-classifier decision-tree-regressor scikit-learn

Last synced: 26 Apr 2026

https://github.com/leolion3/smartnanotubes-smellinspector-companion

Companion software for the SmellInspector Devices from SmartNanoTubes. Allows specifying substances, connecting multiple devices, collecting data and performing machine learning.

docker machine-learning python3 reactjs scikit-learn smartnanotubes smellinspector

Last synced: 27 Apr 2026

https://github.com/toscdom/spam_detection

This repository contains a project focused on analyzing and classifying emails to detect spam. It includes training a machine learning classifier for spam detection, identifying key topics in spam emails using NLP techniques, and calculating semantic distances to evaluate topic similarity. Tools used include Python libraries such as NLP frameworks.

classifier nlp nltk scikit-learn semantic-analysis spam-detection

Last synced: 27 Apr 2026

https://github.com/sundanc/movierecommendation

Movie recommendation system based on user input. Built with Streamlit

movie-recommendation-app python scikit-learn scikitlearn-machine-learning streamlib

Last synced: 27 Apr 2026

https://github.com/tillscode/personal-finance-ml-analysis

Machine learning analysis of personal financial data with predictive modeling and interactive dashboard

dashboard data-analysis finance machine-learning python scikit-learn

Last synced: 28 Apr 2026

https://github.com/serdaraydem1r/10dayaichallenge101

During the 10-day camp, we learned the basics of machine learning through hands-on coding.

artificial-intelligence machine-learning-algorithms model-evaluation-and-selection scikit-learn

Last synced: 28 Apr 2026

https://github.com/hai4320/ml_ai_notebook

All my notes about ML, AI, and data science

ai machine-learning numpy pandas scikit-learn

Last synced: 28 Apr 2026

https://github.com/dwade-eng/amazon-product-recommender-prototype-

This project is a content-based product recommendation engine inspired by Amazon's "Customers who viewed this item also viewed" feature. It uses a dataset of product metadata and user interactions to suggest similar items based on product titles, brands, and categories using TF-IDF vectorization and cosine similarity.

html numpy pandas python3 scikit-learn

Last synced: 28 Apr 2026
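
The TF-IDF and cosine-similarity approach described for the product recommender entry above might look roughly like the sketch below. This is not the repository's code: the `products` list and the `recommend` helper are hypothetical, and real product metadata would be richer than these one-line strings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical product metadata (title, brand, category merged into one string).
products = [
    "Acme wireless mouse electronics",
    "Acme wireless keyboard electronics",
    "Orchard organic green tea grocery",
    "Orchard organic black tea grocery",
]


def recommend(index, texts, top_n=2):
    """Return indices of the top_n items most similar to texts[index]."""
    # Turn each product string into a TF-IDF weighted term vector.
    matrix = TfidfVectorizer().fit_transform(texts)
    # Cosine similarity between the query item and every item.
    scores = cosine_similarity(matrix[index], matrix).ravel()
    scores[index] = -1.0  # exclude the query item itself
    ranked = scores.argsort()[::-1]  # highest similarity first
    return [int(i) for i in ranked[:top_n]]


similar = recommend(0, products, top_n=1)
print(products[similar[0]])
```

Fitting the vectorizer once and caching the matrix would be the natural next step for a larger catalogue; it is refit on each call here only to keep the sketch short.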

https://github.com/brenofariasdasilva/dagster-education-model

Dagster Education Model using Dagster 1.3.11 and Python 3.7.17.

dagster makefile matplotlib pandas pyenv python3 scikit-learn seaborn shellscript

Last synced: 28 Apr 2026

https://github.com/findthehead/pentestpayload

A KNN-based search and modification engine for web application payloads, with a red Flask-based GUI

knn-classification knn-regression machine-learning pentest-tool scikit-learn websecurity

Last synced: 28 Apr 2026

https://github.com/catcoder27/ai-portfolio

Reusable ML scaffold: notebooks, model cards, reports

data-science kaggle machine-learning pandas scikit-learn

Last synced: 28 Apr 2026

https://github.com/incalculable-driverslicence975/data-projects-portfolio

📊 Showcase data projects that highlight analytics, machine learning, and MLOps with reproducible code and clear business insights.

ai computer-vision dashboard data-science-projects data-visualization deep-learning etl excel finance hadoop hiveq keras machine-learning nlp pandas portfolio-project scikit-learn tableau-dashboards

Last synced: 28 Apr 2026

https://github.com/abhi227070/car-price-prediction

This project implements a machine learning model to predict the price of cars based on various features such as mileage, manufacturing date, fuel type, and more. Users can input car information, and the model will estimate the price of the car based on the provided data. This tool can be useful for both car buyers and sellers to estimate car price.

data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression regression-models scikit-learn scikitlearn-machine-learning

Last synced: 28 Apr 2026

https://github.com/rakibhhridoy/customersegmentation-clustering

Customer segmentation is heavily used in business. It is a necessary skill for business intelligence and applied machine learning engineers. This project shows a fairly basic way of doing customer segmentation, a task that is quite easy in Python.

agglomerative-clustering clustering-algorithm customer ecommerce kmeans-clustering machine-learning scikit-learn scikitlearn-machine-learning segmentation unsupervised-learning unsupervised-machine-learning

Last synced: 28 Apr 2026

https://github.com/belsabbagh/employee-turnover-and-customer-churn-classification

A data science project that tests multiple models on an employee turnover and customer churn problem

machine-learning pandas python scikit-learn

Last synced: 28 Apr 2026

https://github.com/alessine/predicting_pirate_attack_success

Using machine learning to predict the success or failure of pirate attacks; developed during the Data Science Bootcamp at Propulsion Academy

bokeh fine-tuning interactive-visualizations machine-learning modelling overfitting plotly prediction scikit-learn

Last synced: 28 Apr 2026

https://github.com/therayyanshariff/cinereview

A Machine Learning web app for sentiment analysis, using a Scikit-learn NLP model with a custom-styled Streamlit UI.

machine-learning nlp python scikit-learn sentiment-analysis streamlit

Last synced: 04 May 2026

https://github.com/skypse/santander-coders-data_science-course

Data Science course, offered by Santander, using Python!

jupyter-notebook numpy pandas-python python scikit-learn

Last synced: 29 Apr 2026

https://github.com/jarif87/text-key-extractor

A Django web app that uses TF-IDF to extract keywords from text, featuring a modern, responsive UI with animated gradients and glassmorphism.

django-application keywords-extraction pandas python scikit-learn

Last synced: 29 Apr 2026

https://github.com/shibin08/sentiment-analysis-movie-reviews

A sentiment analysis project on IMDb movie reviews using Natural Language Processing (NLP) techniques. Text data is cleaned, vectorized using TF-IDF, and classified using machine learning models like Logistic Regression and Random Forest. Achieved high accuracy in distinguishing positive and negative reviews.

logistic-regression machine-learning movie-reviews natural-language-processing random-forest scikit-learn sentiment-analysis text-classification tf-idf

Last synced: 29 Apr 2026

https://github.com/vaishnavijain25/pca-based-digit-classification

A machine learning project that uses Principal Component Analysis (PCA) for dimensionality reduction and Logistic Regression for classifying handwritten digit images from the scikit-learn digits dataset.

digit-recognition dimensionality-reduction image-classification logistic-regression machine-learning pca-analysis scikit-learn

Last synced: 29 Apr 2026
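
As a rough illustration of the PCA plus Logistic Regression combination named in the digit classification entry above, the following sketch uses scikit-learn's built-in digits dataset. The number of components (20), the scaling step, and the train/test split are assumptions made for this example, not settings taken from the repository.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 8x8 handwritten digit images, flattened to 64 features each.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale, reduce 64 features to 20 principal components (assumed value),
# then classify with logistic regression.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Wrapping the steps in a pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking test data into the dimensionality reduction.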

https://github.com/m-muecke/text-normalizer

Text normalizer integration for sklearn.pipeline.Pipeline class

nlp nltk python scikit-learn

Last synced: 29 Apr 2026

https://github.com/fx31337/predict_zigzag

Prototype code to predict zigzag pattern prices.

machine-learning ml scikit-learn

Last synced: 29 Apr 2026

https://github.com/adnanrahin/sentiment_classification_logistic_regeression

Sentiment analysis extracts subjective information from source material. It is widely used in modern business to understand the business module, product quality, and the consumer's point of view regarding the products or the business.

logistic-regression machine-learning natural-language-processing preprocessing python3 scikit-learn

Last synced: 29 Apr 2026

https://github.com/mateluky/covid19-patient-status-prediction

Machine learning model to predict COVID-19 patient status from clinical data, using Python and scikit-learn for healthcare decision support.

classification clinical-decision-support covid19 data-science disease-prediction healthcare jupyter-notebook machine-learning medical-data open-source python scikit-learn

Last synced: 29 Apr 2026

https://github.com/saikumar787/car_price_prediction_using_linear-regression

A machine learning project to predict the selling price of used cars using regression techniques. Includes data preprocessing, model training, evaluation, and testing on new data.

car-price-prediction-with-machine-learning data-analysis joblib jupiter-notebook linear-regression-models model-deployment python scikit-learn standardscaler

Last synced: 29 Apr 2026

https://github.com/anrsgrl/regressions

This project contains examples of Linear, Polynomial, and Logistic Regression models implemented using Python. Explore how different regression techniques can be applied to various datasets 🤖

deep-learning linear-regression logistic-regression mahine-learning polynomial-regression regression scikit-learn

Last synced: 29 Apr 2026

https://github.com/pdoup/ml-codes

Python source files and notebooks for the Machine Learning course weekly tasks

machine-learning scikit-learn

Last synced: 29 Apr 2026

https://github.com/andreaschatzopoulos/face-landmark-detector

Facial landmark detection using HOG features and Ridge Regression. Simple, effective, and fast – no deep learning required.

computer-vision face-detection hog image-processing landmark-detection python ridge-regression scikit-learn

Last synced: 29 Apr 2026

https://github.com/karimosman89/energy-consumption-forecasting

Predict future energy consumption based on historical data. Create a model that predicts energy consumption in households or businesses to optimize energy distribution and reduce costs. Assist energy companies in planning and managing supply efficiently.

arima lstm matplotlib pandas python scikit-learn

Last synced: 29 Apr 2026

https://github.com/matheusvazdata/retail-sales-forecast-linreg-sklearn

Minimal project for retail sales forecasting using linear regression (scikit-learn).

forecasting linear-regression machine-learning matplotlib numpy pandas scikit-learn

Last synced: 29 Apr 2026

https://github.com/karmaniket/gtavcontrol

Created a dataset of different hand gestures and trained an ML model for real-time in-game control of GTA V. Have fun!

gaming gta5 machine-learning mediapipe opencv python3 scikit-learn

Last synced: 29 Apr 2026

https://github.com/mertafacan/fertilizer-prediction-kaggle-playground-s05e06

Top 9% in Kaggle Playground Series - Predicting Optimal Fertilizers - Season 5, Episode 6

catboost kaggle kaggle-competition machine-learning optuna scikit-learn xgboost

Last synced: 29 Apr 2026

https://github.com/diestok/bmlb2025

Material for the BMLB2025 course

classification keras learning machine regression scikit-learn

Last synced: 29 Apr 2026

https://github.com/dhruvv1402/spam-detection-python-

This project is a Spam Detection System built using Python. It classifies SMS messages as spam or ham (not spam) using machine learning techniques.

countvectorizer kaggle-dataset nlp-machine-learning nltk numpy pandas python scikit-learn supervised-machine-learning tf-idf

Last synced: 01 May 2026

https://github.com/dmschauer/aws-sagemaker-deployment-test

A simple test to see how deploying a machine learning model on AWS SageMaker, and thus turning it into an API, works. Since scikit-learn models require fewer dependencies than, for example, TensorFlow models, I used them for this test, following a tutorial.

aws boto3 python sagemaker scikit-learn

Last synced: 02 May 2026

https://github.com/viniciusds2020/ml_pycaret_classificacao

Preprocessing and training system for machine learning models using PyCaret. A low-code methodology for MLOps processes.

machine-learning mlops preprocessing pycaret python scikit-learn

Last synced: 03 May 2026

https://github.com/alessandromonolo/fraud-detection-binary-classification-model

This project builds a machine learning model to classify fraudulent clients using a banking dataset. Data preprocessing, statistical analysis, and feature selection were performed before training KNN and Random Forest Classifier. Model performance was evaluated using accuracy, precision, recall, and F1-score.

classification-model fraud-detection knn-classification machine-learning pandas python random-forest scikit-learn statistical-analysis

Last synced: 03 May 2026

https://github.com/zhenglinlei/zdmp

Industry 4.0 Optimization with Machine Learning AI

industry-4 knn-classification machine-learning pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/kaustavmodak/business-aided-customer-feedback-assessment-system

A Streamlit-based sentiment analysis app that classifies customer reviews as Positive, Neutral, or Negative using a pre-trained ML model

framework machine-learning matplotlib nlp nltk numpy pandas pickle regex scikit-learn seaborn sentiment-analysis streamlt tfidf-vectorizer

Last synced: 03 May 2026

https://github.com/srilaasya/breast-cancer-classifier

Used several Python libraries to make a K-Nearest Neighbor classifier that is trained to predict whether a patient has breast cancer

knearest-neighbor-classifier python scikit-learn

Last synced: 03 May 2026

https://github.com/codejsha/machine-learning-examples

Examples of machine learning using scikit-learn

machine-learning scikit-learn

Last synced: 04 May 2026

https://github.com/dakii24/credit-card-fraud-detection

This repository contains a machine learning project focused on detecting fraudulent credit card transactions. The project includes data preprocessing, model training, and evaluation to identify and prevent fraudulent activities.

capstone-project class-imbalance classification-algorithm credit-card credit-card-fraud data-science decision-trees fraud machine-learning open-data python scikit-learn svm svm-classifier

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/chathumiamarasinghe/nn-training-model

A comprehensive project for training neural networks to solve real-world problems. This repository includes customizable code for building, training, and evaluating neural network architectures using popular deep learning frameworks.

jupyter-notebook matplotlib numpy phyton scikit-learn

Last synced: 04 May 2026

https://github.com/simpl1fy/spam-classifier-project

A web application to classify spam texts or emails.

multinomial-naive-bayes nltk python render scikit-learn text-classification

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/zuhairzia/customer-segmentation

Customer segmentation using KMeans clustering to analyze demographics, income, and spending. Helps businesses with targeted marketing and customer insights.

joblib matplotlib numpy pandas scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/antoniskl/un-general-debate-corpus-classification

The aim of this project is to classify UNGDC speeches with regard to climate change. As a secondary objective, it examines the correlation between these speeches and the forestation and happiness index of the countries.

classification data-science jupyter-notebook machine-learning nlp python regression scikit-learn text-classification text-preprocessing

Last synced: 05 May 2026

https://github.com/vanilladucky/housing-prediction

This is a data analytics and machine learning project that I undertook using a housing dataset from Kaggle in order to put my machine learning knowledge into practice.

data-science machine-learning python scikit-learn

Last synced: 05 May 2026

https://github.com/sevilaymuni/project-no.6-tree-based-models

Random Forest Assisted Suggestions for Salifort Motors Employee Retention: Plan, Analyze, Construct and Execute

data-science decision-trees evaluation-metrics gridsearchcv logistic-regression machine-learning matplotlib python random-forest-classifier scikit-learn seaborn-plots

Last synced: 05 May 2026

https://github.com/grandechowhiskey/fcc-machine_learning-boilerplates

A collection of projects completed as part of the FreeCodeCamp "Machine Learning with Python" certification. These projects focus on implementing machine learning models, data preprocessing, and predictive analysis using libraries like scikit-learn and TensorFlow.

ai ml python3 scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/sadmansakib93/mental-resilience-analysis-using-machine-learning

Utilized supervised and unsupervised ML techniques to analyze mental health and resilience levels of medical students [Project completed in December 2019]

artificial-intelligence classification clustering correlation linear-regression machine-learning machine-learning-algorithms mental-health python regression resilience scikit-learn statistical-analysis

Last synced: 06 May 2026

https://github.com/kaoutarmi/predition_price-old-cars

This car price prediction project uses machine learning to estimate the value of vehicles based on their characteristics.

car-price-prediction data-preprocessing data-science decision-tree feature-engineering machine-learning regression scikit-learn

Last synced: 06 May 2026

https://github.com/felipesbonatti/case-credit-risk-prediction

Credit risk classification project built with Python, Scikit-learn, and Pandas. Demonstrates an end-to-end machine learning workflow: data preprocessing, feature engineering, training of multiple algorithms, and performance evaluation with metrics such as AUC-ROC.

credit-risk machine-learning predictive-modeling python scikit-learn

Last synced: 06 May 2026

https://github.com/ejw-data/ml-playground

Testing the limitations, inabilities, and strengths of models with synthetic data

machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/cycle-sync-ai/student-score-analysis

A data-driven student performance analysis project using UCI dataset (396 students, 33 features). Implements machine learning models (K-means, PCA, Decision Tree, Random Forest, Linear Regression) to analyze academic patterns and predict student scores based on lifestyle, health, and study habits.

clustering clustering-algorithm decision-trees feature-engineering learning-management-system linear-regression machine-learning machine-learning-algorithms matplotlib numpy pandas pca pickle prediction prediction-algorithm scikit-learn score seaborn student

Last synced: 06 May 2026

https://github.com/barbarpotato/applied-data-science-with-python-specialization

This skills-based specialization is intended for learners who have a basic Python or programming background and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques.

data-science matplotlib pandas scikit-learn

Last synced: 06 May 2026

https://github.com/josepablodmg/python--linear-regression-advertising

A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.

advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization

Last synced: 06 May 2026

https://github.com/ccastleberry/hands_on_machine_learning

Notebooks and files created while working through the book Hands on Machine Learning

data-science jupyter-notebook scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/galaxy092/samsung-innovation-campus-big-data-capstone-project

Samsung Innovation Campus Big Data Capstone Project - Weather Prediction

hadoop jupyter-notebook pandas pyspark scikit-learn sparksql

Last synced: 06 May 2026

https://github.com/kianaabrisham/svm-from-scratch

Linear SVM from scratch with hinge loss + decision boundaries

classification from-scratch fundamentals hinge-loss numpy optimization scikit-learn svm

Last synced: 07 May 2026

https://github.com/kirillshiryaev61/customer_activity_prediction

Predicting a decline in customer purchasing activity in an online store. An ML-based model identifies customers at risk of churn in order to improve retention. Educational project.

jupyter pandas python scikit-learn

Last synced: 07 May 2026

https://github.com/rishi035/advanced-house-price-predictions

This is my first project, with which I also participated in a Kaggle competition

linear-regression machine-learning python random random-forest regressor-models scikit-learn

Last synced: 07 May 2026

https://github.com/nicovandenhooff/wids-datathon-2022

This repository contains my solution for the 2022 Women in Data Science Kaggle competition, which obtained a top 10% leaderboard standing.

catboost data-visualization datascience energy-consumption ensemble-learning exploratory-data-analysis kaggle lightgbm machine-learning scikit-learn women-in-data-science xgboost

Last synced: 07 May 2026

https://github.com/tedim52/discjockey

a content-based recommender system for your party playlist preferences

jupyter-notebook matplotlib pandas scikit-learn spotify-web-api

Last synced: 07 May 2026

https://github.com/moustafamohamed01/mall-customer-segmentation-data

Customer segmentation using K-Means clustering based on annual income and spending score.

data-science data-visualization k-means-clustering machine-learning python scikit-learn unsupervised-learning

Last synced: 08 May 2026
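
K-Means segmentation on annual income and spending score, as described in the entry above, could be sketched as follows. The data here is synthetic and the choice of three clusters is an assumption for illustration; the repository's actual dataset and cluster count are not shown in this listing.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: columns are annual income and spending score.
rng = np.random.default_rng(0)
centers = np.array([[20.0, 80.0], [60.0, 50.0], [100.0, 15.0]])
customers = np.vstack(
    [c + rng.normal(scale=3.0, size=(30, 2)) for c in centers]
)

# Standardize so both features contribute comparably to distances.
scaled = StandardScaler().fit_transform(customers)

# Assumed number of segments; in practice this is often chosen
# with an elbow plot or silhouette scores.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

# Summarize each segment in the original units.
for k in range(3):
    income, score = customers[labels == k].mean(axis=0)
    print(f"segment {k}: mean income={income:.1f}, mean spending score={score:.1f}")
```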

https://github.com/thekartikeyamishra/data-preprocessor

A Google Colab module for interactive data preprocessing. Handles missing values, categorical encoding (One-Hot, Label), and numerical scaling (Standard, MinMax). Outputs a cleaned dataset

ipywidgets numpy pandas python scikit-learn

Last synced: 08 May 2026

https://github.com/icejan/predicton-systems

Various systems that train on data and generate a prediction

lightfm machine-learning numpy python scikit-learn

Last synced: 08 May 2026

https://github.com/aasjunior/mlapp-api

This API provides endpoints for applying machine learning algorithms such as K-Nearest Neighbors (KNN), Decision Tree, and Genetic Algorithm. Completed as an assignment for the Mobile Laboratory/Natural Computing course in the 5th semester of Multiplatform Software Development.

fastapi machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/santiagoasp98/spam-detection

SMS spam detection using Logistic Regression and Multinomial Naive Bayes.

classification logistic-regression machine-learning multinomial-naive-bayes python scikit-learn spam-detection

Last synced: 09 May 2026

https://github.com/alphacrypto246/employee-attrition

This project analyzes employee attrition data to uncover key factors driving employee turnover. Using Python, it employs data preprocessing, exploratory data analysis, and machine learning models to predict attrition and provide actionable insights for improving employee retention strategies.

decision-tree-classifier machine-learning machine-learning-algorithms python scikit-learn scikitlearn-machine-learning

Last synced: 09 May 2026

https://github.com/peterchain/titanic

Script for the Titanic dataset for evaluating which passengers survived

kaggle machine-learning pandas-dataframe python3 scikit-learn

Last synced: 09 May 2026

https://github.com/roggersanguzu/tomato-disease-detector

This project uses transfer learning with MobileNetV2 to accurately classify tomato leaf diseases including Mosaic Virus, Septoria Leaf Spot, Blight, and Healthy leaves.

deep-learning python scikit-learn transfer-learning

Last synced: 09 May 2026

https://github.com/callmerajesh/ames-housing-price-prediction

Predicting house prices using Decision Tree Regressor on the Ames dataset

ames-housing data-science decision-tree machine-learning python regression scikit-learn

Last synced: 09 May 2026