An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/barbarpotato/applied-data-science-with-python-specialization

This skills-based specialization is intended for learners who have a basic python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network.

data-science matplotlib pandas scikit-learn

Last synced: 06 May 2026

https://github.com/erikglz/coap-mtd

Repository for an IoT security project implementing Moving Target Defense (MTD) through CoAP protocol randomization to mitigate spoofing attacks and enhance adaptive security.

coap-protocol cybersecurity iot machine-learning python scikit-learn spoofing

Last synced: 17 Apr 2026

https://github.com/zenklinov/regression_logistic_-_sentiment_analysis_movie_data

This repository contains code for performing sentiment analysis using scikit-learn and logistic regression

llm natural-language-processing nlp nltk scikit-learn sentiment-analysis

Last synced: 10 May 2026

https://github.com/vaishnavis03/finlatics_ml_program

This repository contains the .ipynb files for 3 datasets, along with a PPT for each. The datasets included are Facebook Marketplace Data, Sales Prediction Data, and Wine Quality data.

correlation data-analysis data-science data-visualization knn linear-regression machine-learning matplotlib numpy pandas random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/dimdasci/car-price-prediction-demo

Demo project of EDA and regression task solution: Pandas, Jupyter Notebook, Scikit-learn, LightGBM

eda lightgbm-regressor regression scikit-learn

Last synced: 03 Jun 2026

https://github.com/danicc097/python-ml-app

Various [arguably useless] Machine Learning services with gRPC and OpenTelemetry for demo purposes

grpc-python opentelemetry scikit-learn

Last synced: 17 Apr 2026

https://github.com/bhavyac16/flairifyme

FlairifyMe is a Reddit Flair Detector for r/india subreddit, that takes a post's URL as user input and predicts the flair for the post using a model generated by Logistic Regression.

flair-prediction flask hacktoberfest linear-svm logistic-regression naive-bayes-classifier nltk praw-reddit reddit-flair-detector scikit-learn scraped-data subreddit text-classification

Last synced: 06 May 2026

https://github.com/sahilmate/ebm-breast-cancer-classifier

This repository implements an Explainable Boosting Machine (EBM) model for breast cancer classification using scikit-learn and interpret. The project includes data preprocessing, model training, accuracy evaluation, and feature importance visualization.

breast-cancer-classification data-visualization explainable-boosting-machine feature-importance interpret machine-learning scikit-learn

Last synced: 06 May 2026

https://github.com/kheriberto/knn_project

This is a simple project that uses dummie data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/iamwatchdogs/cardiovascular-risk-prediction

This mini-project uses machine learning algorithms to predict possible risks of heart disease by analyzing given data.

jupyter-notebook machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/amirmohammadgholampour/mall-customer-segmentation

Project for segmenting customers in a shopping mall using the Clustering algorithm.

numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/akshitvats026/heart_disease_prediction

An ML-based Heart Disease Prediction System that predicts the likelihood of heart disease based on user health parameters. Built using Python, Pandas, and Scikit-learn, the system performs data preprocessing, trains a predictive model, and provides real-time predictions with high accuracy.

accuracy-score logistic-regression machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/broodhoney/blue-book-for-bulldozers

This repository holds the project which solves a regression problem on predicting the futures sales of bulldozers. This is from a kaggle competition.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/satyas567/weatherdataanalysis

Comprehensive Weather Data Analysis with Python: Explore trends, visualize patterns, detect outliers, and predict temperature using humidity and wind speed

jupyter-notebook linear-regression matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/raphael-ufrj/analise_algodao

Análise histórica de plantio de algodão, analise do plantio com base no clima e nos dados históricos.

analysis data-science data-visualization dataset docker pandas provenance python python3 scikit-learn seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/sudarshanc00/smishing

This project aims to classify text messages to detect potential smishing (SMS phishing) attacks. Using machine learning, the project provides a classifier that can differentiate between legitimate messages and smishing attempts, helping to prevent scams.

nltk numpy pandas python scikit-learn scipy

Last synced: 14 Apr 2026

https://github.com/josepablodmg/python--linear-regression-advertising

A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.

advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization

Last synced: 06 May 2026

https://github.com/soumyapro/parkinson-disease-prediction

This project predicts Parkinson's disease using machine learning models.

logistic-regression numpy pandas scikit-learn svc xgboost

Last synced: 19 Jan 2026

https://github.com/anras5/criteo-search-data

EDA and statistical tests on CriteoSearchData dataset

data-science pandas scikit-learn statistics

Last synced: 11 May 2026

https://github.com/anshvaid4/ml_practice

This is the new repository, where I have added all the notebooks demonstrating the usage of various transformers and models for Supervised and Unsupervised algorithms

anaconda jupyter-notebook machine-learning machine-learning-algorithms python scikit-learn

Last synced: 17 Apr 2026

https://github.com/isshiki/machine-learning-with-python

連載『Pythonで学ぶ「機械学習」入門』(@IT)で使用するノートブックが配布されているリポジトリです。

data-science machine-learning machinelearning-python python scikit-learn

Last synced: 17 Apr 2026

https://github.com/prashver/end-to-end-model-deployment-on-aws

Student Performance Analysis with Machine Learning analyzes factors impacting student outcomes using a robust machine learning pipeline. Achieving an impressive R2 score, it predicts student performance effectively. With extensive data preprocessing and deployment on AWS Elastic Beanstalk, it ensures scalability and high availability.

amazon-web-services aws-elastic-beanstalk end-to-end-deployment flask machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/orliluq/inmersion-datos-python

Desarrollar modelos de machine learning para predecir la probabilidad de incumplimiento crediticio de los clientes, utilizando diferentes algoritmos de clasificación (Regresión Logística, Árboles de Decisión, Random Forest, Naive Bayes).

colab-notebook numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/felixamaladhas/amazon-reviews-sentiment-analysis

This is a sentiment analysis project that classifies Amazon product reviews as positive or negative using machine learning techniques.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/manome/python-supervised-learning

This project provides sample code for performing supervised learning.

conformal-prediction scikit-learn supervised-learning

Last synced: 19 Jan 2026

https://github.com/mayankyadav23/shipment-pricing-prediction

Shipment Pricing Prediction 📦🔍 is a machine learning project that forecasts shipment prices based on various supply chain factors. Using advanced regression models, it provides valuable insights 📊 to optimize pricing strategies in the supply chain analytics domain.

data-visulization flask ineuron-ai machine-learning python scikit-learn shipment-and-pricing

Last synced: 02 Apr 2026

https://github.com/otuemre/obesity-classification

Machine learning project to classify obesity levels based on health metrics like age, sex, height, weight, and BMI.

classification data-science healthcare machine-learning obesity-classification scikit-learn

Last synced: 17 Apr 2026

https://github.com/a-poor/sample-model-serve

Demo for using Flask to serve a scikit-learn model as an API

api data-science docker flask machine-learning scikit-learn

Last synced: 30 Apr 2026

https://github.com/nikhilgugwad/sentiment-analysis

Sentiment analysis for the Kannada language to classify Kannada sentences into different emotions.

numpy pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/johannesvc/data-science-portfolio

A curated portfolio of applied data science projects focused on machine learning, NLP, and social impact.

academic-portfolio data-science deep-learning keras machine-learning media-bias nlp pandas scikit-learn

Last synced: 11 May 2026

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 17 Apr 2026

https://github.com/rohansardar/speechflowguard

A machine learning web API that detects toxic language in user comments using classical ML

docker logistic-regression machine-learning python3 scikit-learn tf-idf tfidf-text-analysis tfidf-vectorizer

Last synced: 17 Apr 2026

https://github.com/lorenzorottigni/ml-breast-cancer

Machine Learning python bootcamp: Support Vector Machines using breast cancer dataset

ipynb machine-learning numpy pandas python scikit-learn seaborn support-vector-machines

Last synced: 14 Apr 2026

https://github.com/mangesh-balkawade/pythonautomationsscripts

This is the repository which contains the python automations scripts and machine learning case studies , and Python Projects that I have write to learn automations and ML using python.

automation data-science machine-learning-algorithms matplotlib mongodb pandas python3 scikit-learn seaborn webscraping

Last synced: 13 Apr 2026

https://github.com/mnitin-reddy/content-based-recommendation-system-using-deep-learning

A content-based movie recommendation system using deep learning to predict user ratings by leveraging user and movie features. The system integrates neural networks for feature extraction, utility scripts for data processing, and supports both new and existing user recommendations.

deep-learning keras neural-networks numpy pandas python scikit-learn tensorflow

Last synced: 03 Apr 2026

https://github.com/rosieoh/emergency_dataanalysis

오픈데이터분석-응급의료체계 방안 정책 제안 데이터 분석

ipython matplotlib numpy pandas python scikit-learn scipy

Last synced: 04 Apr 2026

https://github.com/pablonunes/houseprediction

This a simple model to predict housing price in King County in Washingthon. Uses Scikit Learn, Numpy. Seaborn, Pandas, Scipy.

housing-data housing-prices scikit-learn scikitlearn-machine-learning seaborn

Last synced: 17 Apr 2026

https://github.com/arish-mhrjn/aimodelinspector

A fairly comprehensive Python library allowing for exploration, self-education and categorizaton of AI models

ai analysis coreml-models diffusers diffusion-models ggml hdf5-format jax model-discovery model-insights openvino-models pytorch scikit-learn scikitlearn-machine-learning

Last synced: 07 Oct 2025

https://github.com/yelamankarassay/personal-health-wellness-dashboard

A Streamlit-based dashboard for visualizing and analyzing personal daily data—weight, mood, meals, sleep, and more. This project uses pandas, plotly, matplotlib, seaborn, scikit-learn, and wordcloud to present insights about your health and daily habits.

matplotlib pandas plotly scikit-learn seaborn wordcloud

Last synced: 17 Apr 2026

https://github.com/belzebu013/prever_nivel_colesterol

Projeto de IA com algoritmo de Regressão Linear múltipla para prever o nível de colesterol de um individuo.

ia jupiter-notebook pandas python regressao-linear-multipla scikit-learn

Last synced: 17 Apr 2026

https://github.com/galaxy092/samsung-innovation-campus-big-data-capstone-project

Samsung Innovation Campus Big Data Capstone Project - Weather Prediction

hadoop jupyter-notebook pandas pyspark scikit-learn sparksql

Last synced: 06 May 2026

https://github.com/mryutaro/spla3clip

spla3clip: キル・デスした時刻を自動で解析するスプラトゥーン3用ツール

fastapi python react scikit-learn typescript

Last synced: 04 Apr 2026

https://github.com/rickyarians/ai-ml-nlp

Directory Machine Learning, Deep Learning, Artificial Int, Natural Language Processing Project

deep-learning machine-learning modeling python scikit-learn tensorflow

Last synced: 04 Apr 2026

https://github.com/shaharband/calcofi-oceanographic-analysis

This repository contains an analysis of the CalCOFI (California Cooperative Oceanic Fisheries Investigations) dataset, which represents one of the longest and most complete time series of oceanographic and larval fish data in the world.

pandas regression scikit-learn

Last synced: 10 May 2026

https://github.com/samudraneel05/stanford-open-policing

The Stanford Open Policing Project (SOPP) aims to bring transparency to police interactions by collecting and analyzing data on traffic stops across the United States. It accumulates a vast dataset on traffic stops, encompassing details such as demographics, location, and outcomes.

clustering heirarchical-clustering k-means-clustering machine-learning matplotlib pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/mnj-tothetop/english-handwritten-characters-recognizer

A handwritten english character recognizer [0-9, A-Z, a-z] made by using a Dataset of 3409 images. Tensorflow, Keras, Scikit-learn, and OpenCV was used to implement the Convolution Neural Network (CNN). Matplotlib and Seaborn were used to visualize the data.

artificial-intelligence convolutional-neural-networks keras matplotlib opencv-python scikit-learn seaborn tensorflow

Last synced: 18 Apr 2026

https://github.com/27ahmad/movie-recommendation-system

Welcome to the Movie Recommendation System! This project uses Streamlit to provide personalized movie recommendations based on user preferences and similarity.

movie-recommendation numpy pandas python scikit-learn

Last synced: 04 Apr 2026

https://github.com/bjpcjp/scikit-learn

Updates in progress. Jupyter workbooks will be added as time allows.

python python3 scikit-learn

Last synced: 18 Apr 2026

https://github.com/minhtran241/ml-dl-llm-genai

Showcasing ML/DL fundamentals, paper implementations, deep learning models, and other projects. The purpose of this repository is to provide a playground for me to explore and learn about PyTorch, deep learning, and generative AI.

deep-learning generative-ai llm machine-learning paper-implementations pytorch scikit-learn

Last synced: 18 Apr 2026

https://github.com/justsecret123/nba-players-stats-analysis

A quick interactive Notebook to visualize some NBA players stats (points, assists, steals, blocks...) and totals, rankings and comparisons. Feel free to add any player in the .csv data files. 🏀

csv ipython-notebook ipywidgets jupyter-notebook jupyterlab matplotlib pandas python scikit-learn seaborn

Last synced: 18 Apr 2026

https://github.com/jbizzlefoshizzle/ibm_capstone_project

Used K-means clustering and mapping libraries to determine best cities in San Diego to open a Mexican restaurant

beautifulsoup4 folium-maps geopy pandas-python scikit-learn

Last synced: 06 May 2026

https://github.com/gattsu001/telecom-churn-predictor

Predicts which telecom customers are likely to churn with 95% accuracy using engineered features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

churn-prediction classification classification-algorithm customer-retention data-science data-visualization feature-engineering joblib jupyter-notebook machine-learning pandas scikit-learn supervised-learning svm

Last synced: 18 Apr 2026

https://github.com/rescurib/random_forest_arduino_uno

Ejemplo de implementación de un clasificador de bosque aleatorio en un Arduino UNO usando scikit-learn y m2cgen.

arduino scikit-learn tinyml

Last synced: 18 Apr 2026

https://github.com/katjaweb/king-county-house-price-prediction

This project aims to predict house prices based on various features such as square footage, number of rooms or location.

machine-learning python regression scikit-learn

Last synced: 19 Jan 2026

https://github.com/tanim-mishkat/data-science-prediction-model-pds-course-

Diabetes Progression Prediction Using Regression Analysis: This project uses regression analysis in Python to predict diabetes progression based on medical and physiological data. Includes data preprocessing, model training, evaluation, and visualizations.

data-science machine-learning python regression scikit-learn

Last synced: 19 Apr 2026

https://github.com/gregoritsch3/ml_clustering_eda_customersegmentation

An EDA and Machine Learning Clustering exercise on the Mall Customer Segmentation synthetic dataset demonstrating the use of KMeans Clustering and the Elbow Method. The clustering algorithm successfully segments the customer base into groups distinguishable by their annual income and spending score.

clustering kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/pedroteixeiraw/variational_quantum_circuit_binary_classification

This project focuses on developing a Variational Quantum Circuit capable of performing Binary Classification between two classes: red wine and white wine, based on their characteristics using machine learning.

binary-classification cost-function json machine-learning matplotlib numpy pandas qiskit qiskit-machine-learning quantum-machine-learning scikit-learn training-data variational-circuit

Last synced: 04 Apr 2026

https://github.com/sentinel-ml/sentinel_ai

Machine Learning Model to detect fraud in financial systems

ai python pytorch scikit-learn security security-tools tensorflow

Last synced: 04 Apr 2026

https://github.com/abdul-rafay19/california-housing-price-prediction

This project predicts California housing prices using machine learning regression models, including Random Forests and Decision Trees. It covers data preprocessing, exploratory analysis, model training, and hyperparameter tuning to optimize performance.

decision-trees gridsearchcv linear-regression matplotlib numpy pandas python random-forest randomsearch-cv scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/alainlebret/python-et-ia-1

Ressources personnelles du cours "Python & IA" en 2e année GPSE à l'ENSICAEN

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 04 Apr 2026

https://github.com/adhadse/hands-on-machine-learning-book-notes-and-practice

This repo holds the Jupyter notebooks and datasets containing notes/comments on things I learned from this book. Feel free to use and learned from them.

data-science deep-learning jupyter-notebooks keras machine-learning python scikit-learn tensorflow

Last synced: 04 Apr 2026

https://github.com/mnitin-reddy/a-b-testing-and-regression-analysis-for-ad-performance-optimization

Analyzed the performance of Facebook and AdWords ads using A/B testing and regression analysis to identify trends, correlations, and cost-effectiveness. Key insights included distribution of clicks and conversions, monthly trends, and cost-per-conversion analysis to optimize ROI.

abtesting data-science hypothesis-testing machine-learning matplotlib numpy pandas scikit-learn scipy seaborn statsmodels

Last synced: 04 Apr 2026

https://github.com/yashsonaar/machine-learning-tasks

This repository has machine learning tasks which include classification, recommendation system, fraud detection system

classification jupyter-notebook machine-learning numpy pandas prediction python scikit-learn testing

Last synced: 04 Apr 2026

https://github.com/anushrey10/fuel_efficiency_predictor

Welcome to the Fuel Efficiency Predictor! This advanced tool uses machine learning to predict your vehicle's fuel efficiency based on various characteristics.

decision-tree gradient-boosting-classifier html-css-javascript linear-regression machile-learning matplotlib python random-forest scikit-learn tailwindcss

Last synced: 18 Apr 2026

https://github.com/chengetanaim/high-school-alcoholism-and-academic-performance

Student Alcoholism and Academic Performance Data Analysis

jupyter-notebook scikit-learn

Last synced: 18 Apr 2026

https://github.com/giacomolat/object-detection-sperimental-thesis-for-degree

In this repository is my experimental thesis work on the recognition of museum works through object detection techniques.

convolutional-neural-networks detectron2 jupyter-notebook machine-learning neural-networks object-detection python pytorch rcnn rcnn-model scikit-learn

Last synced: 18 Apr 2026

https://github.com/eugen-goebel/predictive-analytics-agent

Automated ML pipeline — data profiling, preprocessing, model training, and evaluation report generation

automation data-science docker machine-learning predictive-analytics python scikit-learn streamlit

Last synced: 05 Apr 2026

https://github.com/sundanc/weatherprediction

This project implements a weather prediction system that predicts the temperature based on real-time weather data, including features like humidity, wind speed, and day-related features (day of the week, month

machine-learning machinelearning numpy pandas programming python scikit-learn scikitlearn-machine-learning weather-prediction

Last synced: 18 Apr 2026

https://github.com/rajireddy15/student_grade_pred

A machine learning project to predict student final grades using academic and demographic data. Built with pandas, scikit-learn, and visualized with seaborn and matplotlib to gain insights and support early intervention for students.

academic-insights data-science eda education-analytics grade-prediction machine-learning ml-project pandas regression-models scikit-learn student-performance-analysis

Last synced: 11 May 2026

https://github.com/akhundmuzzammil/energyconsumptionprediction

This repository contains code and resources for training a linear regression model to predict energy consumption based on various building parameters.

data-analysis energy-consumption linear-regression machine-learning python scikit-learn streamlit visualization

Last synced: 18 Apr 2026

https://github.com/hariprasath-v/machinehack-analytics-olympiad-2022

Create a machine learning model to help an insurance company understand which claims are worth rejecting and the claims which should be accepted for reimbursement.

catboost-classifier exploratory-data-analysis logloss machinehack numpy optuna pandas python scikit-learn shap

Last synced: 18 Apr 2026

https://github.com/alezoon/movie-revenue-prediction

Sk-learn practice using Linear Regression, ML workflow practice.

jupyter machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 05 Apr 2026

https://github.com/fanyicharllson/mobile-money-transaction-analysis

Machine learning pipeline for classifying mobile money users (MTN MoMo & Orange Money) into activity segments — CSC 3221 Final Project, ICT University Cameroon.

cameroon data-science ict-university jupyter jupyter-notebook machine-learning mtn-momo orange-money python scikit-learn

Last synced: 31 May 2026

https://github.com/simrandalal/semantic-book-recommender

A semantic content-based book recommender using sentence-transformer embeddings, cosine similarity, and a Streamlit interface.

dotenv huggingface-transformers nlp-machine-learning pandas python scikit-learn similarity-search streamlit

Last synced: 05 Apr 2026

https://github.com/murugavl/flower-prediction

Flower Prediction is a machine learning project that uses the Iris dataset to classify iris flowers into three species: Setosa, Versicolor, and Virginica. The project includes data analysis, model training with various algorithms, and deployment via a Flask web application for user-friendly predictions.

flask machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 Apr 2026

https://github.com/royxlead/production-drift-detection

Production ML monitoring library - KL, PSI, MMD, and ADWIN drift detectors with empirical benchmarks, confidence tracking, and a 6-page FastAPI dashboard.

data-drift drift-detection fastapi kl-divergence mlops mmd model-monitoring production-ml psi pytorch scikit-learn uncertainty-quantification

Last synced: 23 Jun 2026

https://github.com/taqsblaze/hush

Hush: A lightweight, context-aware text toxicity classifier. Leveraging NLP and Random Forest ensemble learning to detect and mitigate harmful language in real-time. Built for efficiency, safety, and cleaner digital communication.

content-moderation machine-learning nlp random-forest safety-tools scikit-learn text-classification toxicity-detection

Last synced: 05 Apr 2026

https://github.com/deliprofesor/game-search-volume-prediction-machine-learning-models-and-forecasting

This repository uses machine learning models like Random Forest, XGBoost, LightGBM, and time-series forecasting with Prophet to predict game search volumes. Additionally, Grid Search is applied for hyperparameter tuning of the LightGBM model.

data-cleaning data-science data-visualization feature-selection forecasting-models game-search grid-search hyperparameter-tuning lightgbm machine-learning pandas prophet python random-forest scikit-learn time-series-analysis time-series-forecasting xgboost

Last synced: 18 Apr 2026