An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/iamwatchdogs/cardiovascular-risk-prediction

This mini-project uses machine learning algorithms to predict possible risks of heart disease by analyzing given data.

jupyter-notebook machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/luthfiwulandari/machine-learning-breast-cancer

This project is a simple application that uses logistic regression to detect breast cancer. It classifies tumors as either malignant or benign based on the dataset provided by Scikit-learn.

datascience jupyter logistic-regression machine-learning python scikit-learn

Last synced: 01 May 2026

https://github.com/simrandalal/semantic-book-recommender

A semantic content-based book recommender using sentence-transformer embeddings, cosine similarity, and a Streamlit interface.

dotenv huggingface-transformers nlp-machine-learning pandas python scikit-learn similarity-search streamlit

Last synced: 05 Apr 2026

https://github.com/murugavl/flower-prediction

Flower Prediction is a machine learning project that uses the Iris dataset to classify iris flowers into three species: Setosa, Versicolor, and Virginica. The project includes data analysis, model training with various algorithms, and deployment via a Flask web application for user-friendly predictions.

flask machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 Apr 2026

https://github.com/jlee9503/medical-readmission

Conduct an analysis of medical readmission status using hospital patient data and the Social Determinants of Health dataset. Identify key factors influencing readmission rates to provide insights for improving healthcare outcomes.

python random-forest-regression scikit-learn tableau

Last synced: 01 May 2026

https://github.com/taqsblaze/hush

Hush: A lightweight, context-aware text toxicity classifier. Leveraging NLP and Random Forest ensemble learning to detect and mitigate harmful language in real-time. Built for efficiency, safety, and cleaner digital communication.

content-moderation machine-learning nlp random-forest safety-tools scikit-learn text-classification toxicity-detection

Last synced: 05 Apr 2026

https://github.com/kheriberto/knn_project

This is a simple project that uses dummy data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/deliprofesor/game-search-volume-prediction-machine-learning-models-and-forecasting

This repository uses machine learning models like Random Forest, XGBoost, LightGBM, and time-series forecasting with Prophet to predict game search volumes. Additionally, Grid Search is applied for hyperparameter tuning of the LightGBM model.

data-cleaning data-science data-visualization feature-selection forecasting-models game-search grid-search hyperparameter-tuning lightgbm machine-learning pandas prophet python random-forest scikit-learn time-series-analysis time-series-forecasting xgboost

Last synced: 18 Apr 2026

https://github.com/danicc097/python-ml-app

Various [arguably useless] Machine Learning services with gRPC and OpenTelemetry for demo purposes

grpc-python opentelemetry scikit-learn

Last synced: 17 Apr 2026

https://github.com/malick08012/heart-disease-prediction

A machine learning project that predicts the risk of heart disease based on patient health data. Includes data cleaning, EDA, visualization, model training, evaluation and feature importance analysis

artificial-intelligence heartdisease-prediction logistic-regression machine-learning python scikit-learn

Last synced: 18 Apr 2026

https://github.com/manalisbhavsar/mall-customers-clustering

Applies K-Means clustering to mall customer data, segmenting customers by annual income and spending score to identify patterns and group customers for targeted marketing.

data-analysis data-visualization matplotlib numpy pandas python scikit-learn

Last synced: 18 Apr 2026

https://github.com/dimdasci/car-price-prediction-demo

Demo project of EDA and regression task solution: Pandas, Jupyter Notebook, Scikit-learn, LightGBM

eda lightgbm-regressor regression scikit-learn

Last synced: 03 Jun 2026

https://github.com/jeffandyalltogether/mlrecommendationsystem

Project code for a recommendation system for Amazon using collaborative filtering, ranking, and matrix factorization to enhance customer satisfaction and product discovery.

eda matplotlib pandas python scikit-learn seaborn tensorflow

Last synced: 05 Apr 2026

https://github.com/naren1704/ml-approach-for-employee-performance-prediction

A Flask UI that predicts employee performance using a trained XGBoost model.

css flask html python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/abhinav330/instagram-influencers-analysis

This Jupyter Notebook focuses on preprocessing and visualizing data from an Instagram profiles dataset. It includes data loading, inspection, visualization, and some data preprocessing steps.

data data-science data-visualization exploratory-data-analysis exploratory-data-visualizations influncer-products instagram scikit-learn sklearn

Last synced: 08 Jun 2026

https://github.com/rishi-sutar/healwise-ai-your-way-to-wellness

Healwise-AI is a health diagnostic tool that uses a Support Vector Classifier (SVC) model to predict diseases based on user-reported symptoms. After predicting, it offers detailed health advice, including descriptions, diets, medications, and workouts related to the diagnosis.

machine-learning scikit-learn support-vector-machine

Last synced: 30 Apr 2026

https://github.com/jarif87/tune-popularity-app

Flask web app to predict song popularity using CatBoost. Enter five song features for instant predictions. Modern, responsive UI, no CSRF for development.

catboost-classifier eda flask-application matplotlib-python music-classification python scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/vaishnavis03/finlatics_ml_program

This repository contains the .ipynb files for 3 datasets, along with a PPT for each. The datasets included are Facebook Marketplace Data, Sales Prediction Data, and Wine Quality data.

correlation data-analysis data-science data-visualization knn linear-regression machine-learning matplotlib numpy pandas random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/sjain2580/simple-linear-regression-model

This project demonstrates a simple, yet robust, multiple linear regression model built with Python and scikit-learn to predict median house values in California.

joblib linear-regression matplotlib matplotlib-pyplot numpy python scikit-learn

Last synced: 30 Apr 2026

https://github.com/ledsouza/machine-learning-semisupervisionado

This project uses semi-supervised machine learning algorithms to classify milk quality as high, medium, or low.

data-science joblib machine-learning machine-learning-algorithms pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/das-debjit/emotion-detection

A simple ML-powered web app for real-time emotion detection from text using Streamlit and TF-IDF-based classification.

machine-learning nlp python scikit-learn sentiment-analysis streamlit text-classification tfidf web-app

Last synced: 30 Apr 2026

https://github.com/andrewjmack/cryptoclustering

The purpose of this project is to use Python and unsupervised learning to predict whether cryptocurrencies are affected by 24-hour or 7-day price changes. Methods for analysis include K-Means clustering and dimensionality reduction through Principal Component Analysis ("PCA").

jupyter-notebook pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/maguids/supervised-learning---video-games

This project consists of exploratory data analysis and the application of supervised learning models for classification using a Video Games dataset. Second Semester of the First Year of the Bachelor's Degree in Artificial Intelligence and Data Science.

jupyter-notebook machine-learning matplotlib numpy pandas scikit-learn seaborn supervised-learning

Last synced: 30 Apr 2026

https://github.com/rokuu010/boxing-match-predictor

Machine learning project to predict the outcomes of pro boxing matches using dataset and web-scraped data.

boxing data-science machine-learning prediction-model python scikit-learn selenium sports-analytics

Last synced: 30 Apr 2026

https://github.com/pramodyasahan/grade-predictor

This project aims to predict student performance based on various features such as job, study time, failures, absences, and first and second period grades. The project utilizes a linear regression model from the scikit-learn library in Python.

machine-learning matplotlib numpy pandas python regression scikit-learn

Last synced: 30 Apr 2026

https://github.com/smakde/learning-resource-recommender

A lightweight recommender that helps you discover your next learning resource. It blends patterns from similar users with content keywords, and explains each suggestion in the UI.

als content-based-filtering evaluation-metrics explainable-ai hybrid-recommender implicit-feedback implicit-lib lightfm logistic-matrix-factorization mapk matrix-factorization ndcg pandas precision-at-k python recommender-system scikit-learn streamlit tf-idf top-n-recommendations

Last synced: 30 Apr 2026

https://github.com/sayed-ashfaq/delhivery-dataanalysis

In this project, I conducted basic analysis, feature engineering, normalization, and outlier handling, along with statistical and non-parametric testing to extract insights.

feature-engineering normalization outlier-detection pandas python scikit-learn statistcal-tests statistical-analysis

Last synced: 30 Apr 2026

https://github.com/moritzkoerber/tune_preprocessing_algos

Files for this blog post: https://moritzkoerber.github.io/python/tutorial/2019/11/18/blogpost/

cross-validation hyperparameter-tuning machine-learning python scikit-learn

Last synced: 30 Apr 2026

https://github.com/tinaland101/credit-risk-classification

The purpose of this project is to build a credit risk classification model using machine learning techniques. This model helps identify the creditworthiness of borrowers based on historical lending data. Specifically, it uses a logistic regression model to predict whether a loan is healthy (0) or high-risk (1).

numpy pandas pathlib scikit-learn

Last synced: 30 Apr 2026

https://github.com/fikri-rouzan/student-stress-levels-classification

A machine learning modeling project for classifying university students' stress levels based on academic and psychological input parameters.

joblib jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn streamlit

Last synced: 08 Jun 2026

https://github.com/fikri-rouzan/burnaway-capstone-data-science

An interactive analytics dashboard for mapping the physical factors and work patterns that trigger burnout in software developers.

jupyter-notebook matplotlib pandas pillow plotly python scikit-learn seaborn statsmodels streamlit

Last synced: 08 Jun 2026

https://github.com/samuelpillai/machine-learning-classification-regression-nlp

A curated collection of machine learning mini-projects covering classification, regression, and natural language processing (NLP). This project demonstrates model training, evaluation, feature engineering, and pipeline integration using real-world datasets and Python tools like Scikit-learn, pandas, and NLTK.

classification data-analysis data-science data-visualization feature-engineering jupyter-notebook machine-learning ml-pipeline model-evaluation nlp python regression-models scikit-learn supervised-learning text-mining

Last synced: 30 Apr 2026

https://github.com/abhivur/connections-ai

Contributors: Meet Gamdha, Gaurav Nimmagadda

bert python scikit-learn word2vec

Last synced: 30 Apr 2026

https://github.com/dhruvv1402/spam-detection-python-

This project is a Spam Detection System built using Python. It classifies SMS messages as spam or ham (not spam) using machine learning techniques.

countvectorizer kaggle-dataset nlp-machine-learning nltk numpy pandas python scikit-learn supervised-machine-learning tf-idf

Last synced: 01 May 2026

https://github.com/kumailn/machinelearning

Machine learning with Python

machine-learning python scikit-learn tensorflow

Last synced: 30 Apr 2026

https://github.com/dharma-acha/explanability_in_deepneuralnetworks

Our project aims to enhance the transparency and trustworthiness of the VGG model in critical fields like healthcare imaging and self-driving cars. By integrating explainability methods into the VGG model for image classification, we will clarify its decision-making process.

colab-notebook matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/fbarffmann/credit-risk-classification

Classified 19,000+ loans as high-risk or healthy using logistic regression. Achieved 100% precision for healthy loans and 84% precision for high-risk loans.

classification credit-risk data-analysis logistic-regression machine-learning model-evaluation pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/boladjivinny/fire-prediction

Notebook for the Fighting Fire with Data hackathon on Zindi. Ranked 5th on the public leaderboard and 8th on the private leaderboard. https://zindi.africa/hackathons/cmu-africa-fighting-fire-with-data

feature-engineering hackhathon machine-learning regression scikit-learn stacking

Last synced: 30 Apr 2026

https://github.com/zenklinov/regression_logistic_-_sentiment_analysis_movie_data

This repository contains code for performing sentiment analysis using scikit-learn and logistic regression

llm natural-language-processing nlp nltk scikit-learn sentiment-analysis

Last synced: 10 May 2026

https://github.com/themihirmathur/mihir-clickpost-data-science-intern-round-1-assignment-submission

The objective of this project is to predict the predicted_exact_sla, which is the number of days between the shipment and delivery of an order, using historical shipment data.

data-science machine-learning pandas python random-forest-regression scikit-learn

Last synced: 30 Apr 2026

https://github.com/harshitwaldia/disease_detection

A disease detection system using Random Forest Classifier and GUI in Python, identifying illnesses based on user symptoms.

pandas-python python3 random-forest-classifier scikit-learn tkinter-gui

Last synced: 01 May 2026

https://github.com/fadlani-aditya/iris-plant-classification

This project focuses on classifying different species of Iris flowers using the Random Forest algorithm. The dataset, sourced from Scikit-learn, contains four key features: sepal length, sepal width, petal length, and petal width, which are used to predict the flower species (Setosa, Versicolor, and Virginica).

agriculture data-science iris-dataset machine-learning python scikit-learn supervised-learning

Last synced: 01 May 2026

https://github.com/erikglz/coap-mtd

Repository for an IoT security project implementing Moving Target Defense (MTD) through CoAP protocol randomization to mitigate spoofing attacks and enhance adaptive security.

coap-protocol cybersecurity iot machine-learning python scikit-learn spoofing

Last synced: 17 Apr 2026

https://github.com/myahninsi/customer-segmentation-recommendation-ml

This project addressed challenges in understanding customer behavior and personalizing shopping experiences for an e-commerce platform. Developed ML solutions including K-Means clustering for segmentation, Random Forest regression for CLV prediction, and collaborative filtering for product recommendations.

collaborative-filtering k-means-clustering pandas python random-forest scikit-learn

Last synced: 01 May 2026

https://github.com/luizassimoes/sklearn-kaggle-titanic

This repository was created to store all the code for tackling the Titanic challenge on Kaggle.

kaggle machine-learning scikit-learn

Last synced: 02 May 2026

https://github.com/dmschauer/aws-sagemaker-deployment-test

I ran a simple test to see how deploying a machine learning model on AWS SageMaker, and thus turning it into an API, works. Since scikit-learn models require fewer dependencies than, for example, TensorFlow models, I went with them for this test. To do so I used a tutorial.

aws boto3 python sagemaker scikit-learn

Last synced: 02 May 2026

https://github.com/danishzulfiqar/language-detection-nlp-model

This machine learning model is designed to accurately detect and classify text in 18 languages using NLP

fastapi jupyter-notebook machine-learning natural-language-processing scikit-learn

Last synced: 01 May 2026

https://github.com/diiblo/la-poste-predictive-flux

Daily prediction of parcel flow in La Poste sorting centers. Full pipeline: data generation, LightGBM modeling, orchestration via Airflow (Docker), PostgreSQL storage, and an interactive Streamlit dashboard. Project completed as part of the Mastère 2 in Data Engineering at ECE Paris.

airflow docker postgresql scikit-learn streamlit

Last synced: 31 Jan 2026

https://github.com/manu-karenite/medical-insurance-cost-predictor

Medical Insurance Cost Generator is a linear-regression-based predictor that estimates the cost a person has to pay when buying medical insurance.

kaggle-dataset linear-regression machine-learning matplotlib numpy pandas python3 reactjs scikit-learn

Last synced: 15 Apr 2026

https://github.com/himanshugoyal77/shell-detection-frontend

Fraud detection for companies using machine learning and Django.

django scikit-learn

Last synced: 01 May 2026

https://github.com/arturovaine/n8n-nodes-sklearn

Custom n8n nodes for integrating scikit-learn machine learning algorithms into your n8n workflows.

machine-learning n8n n8n-nodes scikit-learn sklearn

Last synced: 08 Jun 2026

https://github.com/deepthipathlawath20/emotion-recognition-bimodal

Bimodal emotion recognition (face + speech) with feature-level fusion and classic ML classifiers.

audio computer-vision emotion-recognition knn mfcc multimodal navie-bayes-algorithm python scikit-learn svm tensorflow

Last synced: 01 May 2026

https://github.com/sundanc/btcprediction

Predict Bitcoin prices based on historical data using machine learning techniques

bitcoin-prediction keras machine-learning pandas python python3 scikit-learn scikitlearn-machine-learning

Last synced: 02 May 2026

https://github.com/kristishqau/sentimentanalysis_nlp

A project for sentiment analysis of tweets using various NLP techniques and machine learning models.

datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost

Last synced: 01 May 2026

https://github.com/anshvaid4/ml_practice

This is the new repository where I have added all the notebooks demonstrating the use of various transformers and models for supervised and unsupervised algorithms.

anaconda jupyter-notebook machine-learning machine-learning-algorithms python scikit-learn

Last synced: 17 Apr 2026

https://github.com/isshiki/machine-learning-with-python

Repository distributing the notebooks used in the @IT article series 『Pythonで学ぶ「機械学習」入門』 (Introduction to Machine Learning with Python).

data-science machine-learning machinelearning-python python scikit-learn

Last synced: 17 Apr 2026

https://github.com/prashver/end-to-end-model-deployment-on-aws

Student Performance Analysis with Machine Learning analyzes factors impacting student outcomes using a robust machine learning pipeline. Achieving an impressive R2 score, it predicts student performance effectively. With extensive data preprocessing and deployment on AWS Elastic Beanstalk, it ensures scalability and high availability.

amazon-web-services aws-elastic-beanstalk end-to-end-deployment flask machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/orliluq/inmersion-datos-python

Develop machine learning models to predict the probability of customer credit default using different classification algorithms (logistic regression, decision trees, random forest, naive Bayes).

colab-notebook numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/felixamaladhas/amazon-reviews-sentiment-analysis

This is a sentiment analysis project that classifies Amazon product reviews as positive or negative using machine learning techniques.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/mayankyadav23/shipment-pricing-prediction

Shipment Pricing Prediction 📦🔍 is a machine learning project that forecasts shipment prices based on various supply chain factors. Using advanced regression models, it provides valuable insights 📊 to optimize pricing strategies in the supply chain analytics domain.

data-visulization flask ineuron-ai machine-learning python scikit-learn shipment-and-pricing

Last synced: 02 Apr 2026

https://github.com/otuemre/obesity-classification

Machine learning project to classify obesity levels based on health metrics like age, sex, height, weight, and BMI.

classification data-science healthcare machine-learning obesity-classification scikit-learn

Last synced: 17 Apr 2026

https://github.com/a-poor/sample-model-serve

Demo for using Flask to serve a scikit-learn model as an API

api data-science docker flask machine-learning scikit-learn

Last synced: 30 Apr 2026

https://github.com/nikhilgugwad/sentiment-analysis

Sentiment analysis for the Kannada language to classify Kannada sentences into different emotions.

numpy pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/barbarahayd/com410-ml

Machine learning class activities.

decision-tree scikit-learn

Last synced: 01 May 2026

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 17 Apr 2026

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/rohansardar/speechflowguard

A machine learning web API that detects toxic language in user comments using classical ML

docker logistic-regression machine-learning python3 scikit-learn tf-idf tfidf-text-analysis tfidf-vectorizer

Last synced: 17 Apr 2026

https://github.com/raphael-ufrj/analise_algodao

Historical analysis of cotton planting; analysis of planting based on climate and historical data.

analysis data-science data-visualization dataset docker pandas provenance python python3 scikit-learn seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/mangesh-balkawade/pythonautomationsscripts

This repository contains the Python automation scripts, machine learning case studies, and Python projects that I have written to learn automation and ML using Python.

automation data-science machine-learning-algorithms matplotlib mongodb pandas python3 scikit-learn seaborn webscraping

Last synced: 13 Apr 2026

https://github.com/satyas567/weatherdataanalysis

Comprehensive Weather Data Analysis with Python: Explore trends, visualize patterns, detect outliers, and predict temperature using humidity and wind speed

jupyter-notebook linear-regression matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/mnitin-reddy/content-based-recommendation-system-using-deep-learning

A content-based movie recommendation system using deep learning to predict user ratings by leveraging user and movie features. The system integrates neural networks for feature extraction, utility scripts for data processing, and supports both new and existing user recommendations.

deep-learning keras neural-networks numpy pandas python scikit-learn tensorflow

Last synced: 03 Apr 2026

https://github.com/rosieoh/emergency_dataanalysis

Open data analysis: data analysis for proposing policy measures for the emergency medical system.

ipython matplotlib numpy pandas python scikit-learn scipy

Last synced: 04 Apr 2026

https://github.com/pablonunes/houseprediction

This is a simple model to predict housing prices in King County, Washington. Uses scikit-learn, NumPy, Seaborn, pandas, and SciPy.

housing-data housing-prices scikit-learn scikitlearn-machine-learning seaborn

Last synced: 17 Apr 2026

https://github.com/anastasiaschmidt1/sqli-detection-ml

UNIVERSITY PROJECT: Detection of SQL injection attacks using machine learning (SVM model)

bht-berlin machine-learning scikit-learn sqli svm

Last synced: 02 May 2026

https://github.com/yelamankarassay/personal-health-wellness-dashboard

A Streamlit-based dashboard for visualizing and analyzing personal daily data: weight, mood, meals, sleep, and more. This project uses pandas, plotly, matplotlib, seaborn, scikit-learn, and wordcloud to present insights about your health and daily habits.

matplotlib pandas plotly scikit-learn seaborn wordcloud

Last synced: 17 Apr 2026