An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/khaymanii/calories-burnt-prediction-model

This model was built using Python and XGBoost Regression algorithm

matplotlib numpy pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/archish27/pythontutorial

Python Programming Tutorial for new geeks who want to learn python from scratch to deal with various applications

matplotlib numpy pandas pygame python python-2 python-3 scikit-learn soup

Last synced: 01 Apr 2026

https://github.com/capsuleismail/income-census-prediction

Predict whether annual income of an individual exceeds $50K per annum based on census data. Also known as "Census Income" dataset.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 16 Apr 2026

https://github.com/priyanshul28/ml_eda_regression_energyconsumptionforecasting

An EDA and Machine Learning Time-Series Regression Forecasting exercise on the PMJE Energy Consumption dataset demonstrating time-series analysis and the use of Time-Series Split, XGBoost, etc. The model is optimized using hyperparameter tuning through GridSearchCV. A Rob Mulla guided exercise.

forcasting machine-learning numpy pandas scikit-learn time-series-analysis

Last synced: 17 Apr 2026

https://github.com/erikglz/coap-mtd

Repository for an IoT security project implementing Moving Target Defense (MTD) through CoAP protocol randomization to mitigate spoofing attacks and enhance adaptive security.

coap-protocol cybersecurity iot machine-learning python scikit-learn spoofing

Last synced: 17 Apr 2026

https://github.com/dimdasci/car-price-prediction-demo

Demo project of EDA and regression task solution: Pandas, Jupyter Notebook, Scikit-learn, LightGBM

eda lightgbm-regressor regression scikit-learn

Last synced: 03 Jun 2026

https://github.com/amirmohammadgholampour/mall-customer-segmentation

Project for segmenting customers in a shopping mall using the Clustering algorithm.

numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/broodhoney/blue-book-for-bulldozers

This repository holds the project which solves a regression problem on predicting the futures sales of bulldozers. This is from a kaggle competition.

matplotlib numpy pandas python scikit-learn

Last synced: 02 Apr 2026

https://github.com/prashver/end-to-end-model-deployment-on-aws

Student Performance Analysis with Machine Learning analyzes factors impacting student outcomes using a robust machine learning pipeline. Achieving an impressive R2 score, it predicts student performance effectively. With extensive data preprocessing and deployment on AWS Elastic Beanstalk, it ensures scalability and high availability.

amazon-web-services aws-elastic-beanstalk end-to-end-deployment flask machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/nikhilgugwad/sentiment-analysis

Sentiment analysis for the Kannada language to classify Kannada sentences into different emotions.

numpy pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 17 Apr 2026

https://github.com/mangesh-balkawade/pythonautomationsscripts

This is the repository which contains the python automations scripts and machine learning case studies , and Python Projects that I have write to learn automations and ML using python.

automation data-science machine-learning-algorithms matplotlib mongodb pandas python3 scikit-learn seaborn webscraping

Last synced: 13 Apr 2026

https://github.com/mnitin-reddy/content-based-recommendation-system-using-deep-learning

A content-based movie recommendation system using deep learning to predict user ratings by leveraging user and movie features. The system integrates neural networks for feature extraction, utility scripts for data processing, and supports both new and existing user recommendations.

deep-learning keras neural-networks numpy pandas python scikit-learn tensorflow

Last synced: 03 Apr 2026

https://github.com/shaharband/calcofi-oceanographic-analysis

This repository contains an analysis of the CalCOFI (California Cooperative Oceanic Fisheries Investigations) dataset, which represents one of the longest and most complete time series of oceanographic and larval fish data in the world.

pandas regression scikit-learn

Last synced: 10 May 2026

https://github.com/mnj-tothetop/english-handwritten-characters-recognizer

A handwritten english character recognizer [0-9, A-Z, a-z] made by using a Dataset of 3409 images. Tensorflow, Keras, Scikit-learn, and OpenCV was used to implement the Convolution Neural Network (CNN). Matplotlib and Seaborn were used to visualize the data.

artificial-intelligence convolutional-neural-networks keras matplotlib opencv-python scikit-learn seaborn tensorflow

Last synced: 18 Apr 2026

https://github.com/minhtran241/ml-dl-llm-genai

Showcasing ML/DL fundamentals, paper implementations, deep learning models, and other projects. The purpose of this repository is to provide a playground for me to explore and learn about PyTorch, deep learning, and generative AI.

deep-learning generative-ai llm machine-learning paper-implementations pytorch scikit-learn

Last synced: 18 Apr 2026

https://github.com/gregoritsch3/ml_clustering_eda_customersegmentation

An EDA and Machine Learning Clustering exercise on the Mall Customer Segmentation synthetic dataset demonstrating the use of KMeans Clustering and the Elbow Method. The clustering algorithm successfully segments the customer base into groups distinguishable by their annual income and spending score.

clustering kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/sentinel-ml/sentinel_ai

Machine Learning Model to detect fraud in financial systems

ai python pytorch scikit-learn security security-tools tensorflow

Last synced: 04 Apr 2026

https://github.com/adhadse/hands-on-machine-learning-book-notes-and-practice

This repo holds the Jupyter notebooks and datasets containing notes/comments on things I learned from this book. Feel free to use and learned from them.

data-science deep-learning jupyter-notebooks keras machine-learning python scikit-learn tensorflow

Last synced: 04 Apr 2026

https://github.com/yashsonaar/machine-learning-tasks

This repository has machine learning tasks which include classification, recommendation system, fraud detection system

classification jupyter-notebook machine-learning numpy pandas prediction python scikit-learn testing

Last synced: 04 Apr 2026

https://github.com/giacomolat/object-detection-sperimental-thesis-for-degree

In this repository is my experimental thesis work on the recognition of museum works through object detection techniques.

convolutional-neural-networks detectron2 jupyter-notebook machine-learning neural-networks object-detection python pytorch rcnn rcnn-model scikit-learn

Last synced: 18 Apr 2026

https://github.com/hariprasath-v/machinehack-analytics-olympiad-2022

Create a machine learning model to help an insurance company understand which claims are worth rejecting and the claims which should be accepted for reimbursement.

catboost-classifier exploratory-data-analysis logloss machinehack numpy optuna pandas python scikit-learn shap

Last synced: 18 Apr 2026

https://github.com/simrandalal/semantic-book-recommender

A semantic content-based book recommender using sentence-transformer embeddings, cosine similarity, and a Streamlit interface.

dotenv huggingface-transformers nlp-machine-learning pandas python scikit-learn similarity-search streamlit

Last synced: 05 Apr 2026

https://github.com/manalisbhavsar/mall-customers-clustering

K-Means clustering to mall customer data, segmenting customers based on their annual income and spending score. To identify patterns and group customers for targeted marketing.

data-analysis data-visualization matplotlib numpy pandas python scikit-learn

Last synced: 18 Apr 2026

https://github.com/jeffandyalltogether/mlrecommendationsystem

project code for a recommendation system for Amazon using collaborative filtering, ranking, and matrix factorization to enhance customer satisfaction and product discovery.

eda matplotlib pandas python scikit-learn seaborn tensorflow

Last synced: 05 Apr 2026

https://github.com/emilyfelker/ieee_cis_fraud_detection

Which online transactions are fraudulent? Program that uses various machine learning algorithms to detect fraud.

decision-trees kaggle logistic-regression machine-learning neural-network pandas poetry pytest python scikit-learn sklearn tensorflow xgboost

Last synced: 05 Apr 2026

https://github.com/lexxai/goit_python_ds_hw_04

Модуль 4. Класифікація та оцінка роботи моделі. Лінійна регресія: перенавчання та регуляризація

lasso-regression linear-regression numpy pandas python red regression ridge-regression scikit-learn

Last synced: 05 Apr 2026

https://github.com/vijaykumarr1452/black_friday_sales_analysis

Black Friday Sales Analysis python machine learning project using pandas and scikit-learn for data preprocessing, model training, and performance evaluation.

confusion-matrix jupyter-notebook machine-learning pandas python random-forest-classifier sales-analysis scikit-learn

Last synced: 19 Apr 2026

https://github.com/kaladabrio2020/machine-learning-with-pytorch-and-scikit-learn

Progress on the book machine learning with pytorch and scikit-learn

deep-learning implementation machine-learning python3 pytorch scikit-learn

Last synced: 20 Apr 2026

https://github.com/vyjayanthipolapragada/car_mileage_prediction

Predicting the mileage of car using the linear regression model with Scikit-learn

kaggle-titanic linear-regression machine-learning numpy pandas predictive-modeling python scikit-learn

Last synced: 20 Apr 2026

https://github.com/grandechowhiskey/harvard-cs50-ai-projects

This project contains a collection of programming assignments from CS50’s Introduction to Artificial Intelligence with Python course.

html python scikit-learn tensorflow

Last synced: 20 Apr 2026

https://github.com/himasnhu-at/freecodecamp--ml

ML Models I built for my freeCodeCamp's Machine Learning with Python certification

freecodecamp freecodecamp-project machine-learning machine-learning-algorithms matplotlib pandas python scikit-learn

Last synced: 20 Apr 2026

https://github.com/sayan-mondal2022/mlops-assignment

A project for validating the Machine learning models

machine-learning scikit-learn streamlit

Last synced: 22 Apr 2026

https://github.com/waikato-datamining/spectral-data-converter-sklearn

Scikit-learn plugins for the spectral-data-converter library.

kasperl scikit-learn sdc seppl spectral-data

Last synced: 24 Apr 2026

https://github.com/jawwad-fida/data-science-salary-estimator

A tool that estimates data science salaries (MAE ~ $ 11K) to help data scientists negotiate their income when they get a job.

data-science machine-learning project scikit-learn

Last synced: 25 Apr 2026

https://github.com/mihirmakwana03/ci7521-cw1-notebook

Multi-class classification on imbalanced data — 8 sklearn classifiers + SMOTE + ROC-AUC benchmarking. Kingston CI7521 CW1.

classification hyperparameter-tuning imbalanced-data machine-learning scikit-learn smote

Last synced: 27 Apr 2026

https://github.com/tillscode/personal-finance-ml-analysis

Machine learning analysis of personal financial data with predictive modeling and interactive dashboard

dashboard data-analysis finance machine-learning python scikit-learn

Last synced: 28 Apr 2026

https://github.com/lmriccardo/moments-learning

Repository for the First-Second Moments Learning project. In this repo you will find an implementation of a learning model to learn the relationship between time-series model parameters and the first two moments of its outputs

machine-learning mean mlp-regressor models random-forest scikit-learn time-series torch variance

Last synced: 28 Apr 2026

https://github.com/ronverse17/loan-recovery-strategy

End-to-end ML project for predicting high-risk borrowers and recommending recovery actions

classification data-science kmeans-clustering machine-learning matplotlib random-forest-classifier scikit-learn seaborn

Last synced: 28 Apr 2026

https://github.com/rajivaleaakash/customer-churn-prediction

A machine learning project focused on predicting customer churn using various data analysis and modeling techniques. The repository includes data preprocessing, feature engineering, exploratory data analysis (EDA), model training, evaluation, and visualization to help businesses identify customers at risk of leaving.

churn-prediction classification customer-churn data-analysis data-science gridsearchcv imblearn machine-learning numpy pandas pyhton randomsearchcv scikit-learn

Last synced: 28 Apr 2026

https://github.com/findthehead/pentestpayload

A KNN algorithm based Web Application Payload search and modification engine with a nice red FLASK based GUI

knn-classification knn-regression machine-learning pentest-tool scikit-learn websecurity

Last synced: 28 Apr 2026

https://github.com/nexus69420/movie-recommender-streamlit

A hybrid movie recommendation system that combines content-based filtering using NLP and collaborative filtering using SVD. Built with Python, Streamlit, and trained on TMDB and MovieLens data. Delivers personalized recommendations with a simple web interface.

collaborative-filtering content-based-recommendation data-science machine-learning nlp python recommendation-system scikit-learn streamlit svd

Last synced: 28 Apr 2026

https://github.com/arnab-0053/song-identifier

It identifies songs and artists from lyric snippets using two distinct methods - simple NLP based approach and BM25(Best Match 25) approach.

bm25 nlp nltk python rank-bm25 scikit-learn song-lyrics spotify-dataset text-preprocessing

Last synced: 28 Apr 2026

https://github.com/senaldolage/spam-text-classifier

A simple machine learning project to classify whether a given text message is spam or not

python scikit-learn

Last synced: 28 Apr 2026

https://github.com/arizdn234/spotify-api-with-colab

Crawling, Analyzing, Clustering music data from Spotify API

machile-learning scikit-learn spotify-api spotipy-library

Last synced: 28 Apr 2026

https://github.com/alessine/predicting_pirate_attack_success

Using machine learning to predict the success or failure of pirate attacks; elaborated during the Data Science Bootcamp at Propulsion Academy

bokeh fine-tuning interactive-visualizations machine-learning modelling overfitting plotly prediction scikit-learn

Last synced: 28 Apr 2026

https://github.com/michaelzheng67/ml_classification_optimizer

Algorithm that determines best machine learning classification model to use for a given dataset. Written in Python.

classification machine-learning python scikit-learn

Last synced: 29 Apr 2026

https://github.com/jarif87/text-key-extractor

A Django web app that uses TF-IDF to extract keywords from text, featuring a modern, responsive UI with animated gradients and glassmorphism.

django-application keywords-extraction pandas python scikit-learn

Last synced: 29 Apr 2026

https://github.com/pymc-learn/pymc-learn-sphinx-theme

Sphinx theme for Pymc-learn documentation

pymc3 pymc4 scikit-learn sphinx sphinx-theme

Last synced: 29 Apr 2026

https://github.com/m-muecke/text-normalizer

Text normalizer integration for sklearn.pipeline.Pipeline class

nlp nltk python scikit-learn

Last synced: 29 Apr 2026

https://github.com/gustaminas/ai_primer---flatland

A project from the AI_primer course at Vilnius university.

cnn-keras data-augmentation data-mixup dropout-keras scikit-learn shape-classification

Last synced: 29 Apr 2026

https://github.com/mateluky/covid19-patient-status-prediction

Machine learning model to predict COVID-19 patient status from clinical data, using Python and scikit-learn for healthcare decision support.

classification clinical-decision-support covid19 data-science disease-prediction healthcare jupyter-notebook machine-learning medical-data open-source python scikit-learn

Last synced: 29 Apr 2026

https://github.com/inclinedadarsh/regression-metrics

A simple jupyter notebook demonstrating how to use different metrics from 'scikit-learn' library.

jupyter-notebook machine-learning notebook scikit-learn

Last synced: 29 Apr 2026

https://github.com/anrsgrl/regressions

This project contains examples of Linear, Polynomial, and Logistic Regression models implemented using Python. Explore how different regression techniques can be applied to various datasets 🤖

deep-learning linear-regression logistic-regression mahine-learning polynomial-regression regression scikit-learn

Last synced: 29 Apr 2026

https://github.com/matheusvazdata/retail-sales-forecast-linreg-sklearn

Minimal project for retail sales forecasting using linear regression (scikit-learn).

forecasting linear-regression machine-learning matplotlib numpy pandas scikit-learn

Last synced: 29 Apr 2026

https://github.com/mertafacan/fertilizer-prediction-kaggle-playground-s05e06

Top 9% in Kaggle Playground Series - Predicting Optimal Fertilizers - Season 5, Episode 6

catboost kaggle kaggle-competition machine-learning optuna scikit-learn xgboost

Last synced: 29 Apr 2026

https://github.com/xbants/recommendation-api

🎬 Intelligent movie recommendation system with FastAPI backend, Streamlit frontend, and collaborative filtering ML. Rate movies, get personalized suggestions, and enjoy automatic model retraining.

fastapi machine-learning movie-recommedation python3 scikit-learn streamlit

Last synced: 29 Apr 2026

https://github.com/hexbyte-lab/resumatch

AI-powered resume-to-job matching tool with NLP analysis | Python + Flask + Machine Learning

cosine-similarity flask job-search machine-learning nltk portfolio-project python resume scikit-learn tfidf

Last synced: 29 Apr 2026

https://github.com/tbarlow12/learn-it-your-way

Using Python Flask, I wanted to create a simple web API that allows users to upload a dataset, choose one or more models, store them server side, and then hit an endpoint to get a prediction.

flask machine-learning python scikit-learn tensorflow

Last synced: 29 Apr 2026

https://github.com/abhinav330/instagram-influencers-analysis

This Jupyter Notebook focuses on preprocessing and visualizing data from an Instagram profiles dataset. It includes data loading, inspection, visualization, and some data preprocessing steps.

data data-science data-visualization exploratory-data-analysis exploratory-data-visualizations influncer-products instagram scikit-learn sklearn

Last synced: 08 Jun 2026

https://github.com/rishi-sutar/healwise-ai-your-way-to-wellness

Healwise-AI is a health diagnostic tool that uses a Support Vector Classifier (SVC) model to predict diseases based on user-reported symptoms. After predicting, it offers detailed health advice, including descriptions, diets, medications, and workouts related to the diagnosis.

machine-learning scikit-learn support-vector-machine

Last synced: 30 Apr 2026

https://github.com/das-debjit/emotion-detection

A simple ML-powered web app for real-time emotion detection from text using Streamlit and TF-IDF-based classification.

machine-learning nlp python scikit-learn sentiment-analysis streamlit text-classification tfidf web-app

Last synced: 30 Apr 2026

https://github.com/pramodyasahan/grade-predictor

This project aims to predict student performance based on various features such as job, study time, failures, absences, and first and second period grades. The project utilizes a linear regression model from the scikit-learn library in Python.

machine-learning matplotlib numpy pandas python regression scikit-learn

Last synced: 30 Apr 2026

https://github.com/fikri-rouzan/burnaway-capstone-data-science

Dashboard analitik interaktif untuk memetakan faktor fisik dan pola kerja pemicu burnout pada software developer.

jupyter-notebook matplotlib pandas pillow plotly python scikit-learn seaborn statsmodels streamlit

Last synced: 08 Jun 2026

https://github.com/fbarffmann/credit-risk-classification

Classified 19,000+ loans as high-risk or healthy using logistic regression. Achieved 100% precision for healthy loans and 84% precision for high-risk loans.

classification credit-risk data-analysis logistic-regression machine-learning model-evaluation pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/themihirmathur/mihir-clickpost-data-science-intern-round-1-assignment-submission

The objective of this project is to predict the predicted_exact_sla, which is the number of days between the shipment and delivery of an order, using historical shipment data.

data-science machine-learning pandas python random-forest-regression scikit-learn

Last synced: 30 Apr 2026

https://github.com/harshitwaldia/disease_detection

A disease detection system using Random Forest Classifier and GUI in Python, identifying illnesses based on user symptoms.

pandas-python python3 random-forest-classifier scikit-learn tkinter-gui

Last synced: 01 May 2026

https://github.com/kristishqau/sentimentanalysis_nlp

A project for sentiment analysis of tweets using various NLP techniques and machine learning models.

datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost

Last synced: 01 May 2026

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/luthfiwulandari/machine-learning-breast-cancer

This project is a simple application that uses logistic regression to detect breast cancer. It classifies tumors as either malignant or benign based on the dataset provided by Scikit-learn.

datascience jupyter logistic-regression machine-learning python scikit-learn

Last synced: 01 May 2026

https://github.com/anastasiaschmidt1/sqli-detection-ml

UNI-PROJEKT: Erkennung von SQL-Injection-Angriffen durch maschinelles Lernen (SVM-Modell)

bht-berlin machine-learning scikit-learn sqli svm

Last synced: 02 May 2026

https://github.com/dmschauer/aws-sagemaker-deployment-test

I did a simple test to see how deploying a machine learning model on AWS Sagemaker and thus turning it into an API works. Since scikit-learn models require less dependencies than e.g. TensorFlow models I went with them for this test. To do so I used a tutorial.

aws boto3 python sagemaker scikit-learn

Last synced: 02 May 2026

https://github.com/bishopce16/cryptocurrencies

An analysis on cryptocurrencies dataset using unsupervised machine learning, PCA algorithm, and K-means clustering.

hvplot jupyter-notebook pandas plotly python scikit-learn unsupervised-machine-learning visual-studio-code

Last synced: 02 May 2026