An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/hiftd22/wpattern

📈 Analyze and visualize stock patterns with wPattern for better trading insights. Simplify your investment decisions through data-driven analysis.

cli finance financial-analysis financial-data matplotlib numpy pandas pattern-recognition python scikit-learn stock-scanner technical-analysis yfinance

Last synced: 29 Apr 2026

https://github.com/pranavsp108/market_basket_analysis-instacart

Customer segmentation and market basket analysis using the Instacart dataset with Python, Pandas, and K-Means clustering.

customer-segmentation-and-buying-behavior data-analysis data-visualization instacart jupyter-notebook kmeans-clustering market-basket-analysis pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/enyaude/california_house_price_prediction

A California house price prediction model built in Python with linear regression and Random Forest, with Ridge and Lasso regression applied for optimization.

jupyter-notebook linear-regression python random-forest scikit-learn streamlit

Last synced: 23 Feb 2026

https://github.com/jlee9503/telecommunication-churn

Analyze key factors influencing customer churn using Python data analytics techniques. Explore these factors through data preprocessing, exploratory data analysis (EDA), and predictive modeling.

data-analysis data-visualization matplotlib pandas python scikit-learn

Last synced: 18 Jan 2026

https://github.com/himanshkr03/loan_default_prediction_using_machine_learning

This repository contains a Python-based project that uses machine learning to predict loan defaults. It explores data preprocessing, feature engineering, and model training techniques to build a predictive model for assessing loan risk.

data-science finance loan-default-prediction machine-learning pandas prediction-model python risk-assessment scikit-learn

Last synced: 14 Apr 2026

https://github.com/sohitbennett/roadsafe

A deep learning computer vision system for real-time traffic safety monitoring.

computer-vision esrgan keras numpy pandas python scikit-learn tensorflow tesseract-ocr yolov5 yolov8

Last synced: 08 Apr 2026

https://github.com/allanreda/automated-k-means-clustering-engine

An interactive K-Means clustering tool built with Flask and Scikit-Learn, supporting Excel file uploads, cluster analysis, and data export, deployed on Google Cloud Run via Docker with CI/CD integration.

cicd css data-visualization deployment docker flask google-cloud-run html javascript k-means-clustering machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 19 Jan 2026

https://github.com/katjaweb/king-county-house-price-prediction

This project aims to predict house prices based on various features such as square footage, number of rooms or location.

machine-learning python regression scikit-learn

Last synced: 19 Jan 2026

https://github.com/lorenzorottigni/ml-breast-cancer

Machine learning Python bootcamp: Support Vector Machines using the breast cancer dataset.

ipynb machine-learning numpy pandas python scikit-learn seaborn support-vector-machines

Last synced: 14 Apr 2026

https://github.com/sudarshanc00/smishing

This project aims to classify text messages to detect potential smishing (SMS phishing) attacks. Using machine learning, the project provides a classifier that can differentiate between legitimate messages and smishing attempts, helping to prevent scams.

nltk numpy pandas python scikit-learn scipy

Last synced: 14 Apr 2026

https://github.com/ricardorobledo/paymentcardfrauddetection2025

Comparative analysis of probabilistic classification models for credit card fraud detection, focusing on model calibration and threshold optimization in highly imbalanced datasets.

imbalanced-learn matplotlib numpy pandas python3 scikit-learn search

Last synced: 14 Apr 2026

https://github.com/nikshithmenta/fake-news-detector

This repository contains a Streamlit web app designed for fake news detection. Users can input a news article, and the app will predict whether it's real or fake based on its content. It also allows users to choose between different vectorizers (TF-IDF or Bag of Words) and classifiers (Linear SVM or Naive Bayes) to customize the prediction model.

bag-of-words fake-news-detection linear-svc naive-bayes-classifier scikit-learn streamlit-application tf-idf

Last synced: 15 May 2026

https://github.com/taimoorkhan10/ai-fairness-explainability-toolkit

AI Fairness and Explainability Toolkit (AFET) is an open-source project aimed at providing tools and frameworks to assess, visualize, and mitigate bias in machine learning models. It supports multiple ML frameworks and offers a comprehensive suite of metrics and visualization components to enhance model transparency and fairness.

ai bias-detection data-science ethical-ai explainable-artificial-intelligence fairness machine-learning mlops model-interpretation open-source python responsible-ai scikit-learn

Last synced: 19 Jan 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors that affect sleep quality and the temporal patterns in the sleep data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/wasifsohail5/amusic-ai_powered_musicrecommendationsystem

AMUSIC is an AI-driven music recommendation system that helps users discover personalized songs. Using Python, Streamlit, and Scikit-learn, it offers smart recommendations, advanced search, and interactive music insights. Users can save favorites, create playlists, and export data for a seamless music discovery experience.

joblib k-nearest-neighbours matplotlib minmaxscaler numpy pandas pickle plotly python scikit-learn seaborn streamlit

Last synced: 14 Oct 2025

https://github.com/yahiazakaria445/ensemble-learning-voting-classifier

Ensemble Learning Using KNN, Naive Bayes, Decision Tree on Biomechanical Data

matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/aasjunior/machinelearningapp

The Machine Learning App is an application developed with Kotlin, Android Studio, and Jetpack Compose for applying machine learning algorithms and displaying the results. It was built as an assignment for the Mobile Laboratory/Natural Computing course in the 5th semester of the Multiplatform Software Development program.

fastapi jetpack-compose kotlin-android machine-learning material-design scikit-learn

Last synced: 18 Apr 2026

https://github.com/moanassiddiqui/handsonml_ml

Complete coverage of Part I of the Hands-On Machine Learning book, which deals with classical machine learning models.

hands-on machine-learning scikit-learn

Last synced: 14 Mar 2026

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

This project demonstrates the use of machine learning models based on decision trees and random forests to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results. It is intended for learning the basics of machine learning and data analysis.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/hassan11196/churn-nn

A simple churn predictor using scikit-learn's Multi-Layer Perceptron classifier.

jupyter-notebook machine-learning ml neural-network python scikit-learn

Last synced: 14 Apr 2026

https://github.com/alexliap/sk_serve

Deployment of a Scikit-Learn model and its column transformations made easy.

machine-learning mlops model-deployment scikit-learn

Last synced: 24 Oct 2025

https://github.com/santiagoenriquega/ez-animate

A Python package for creating Matplotlib animations with minimal code. Built to quickly visualize model behavior.

animation machine-learning matplotlib python scikit-learn

Last synced: 15 Mar 2026

https://github.com/smahala02/svm-machine-learning

This repository provides an in-depth tutorial and practical implementation of Support Vector Machines (SVM) for classification tasks, using Python and popular data science libraries.

classification data-science machine-learning python scikit-learn svm

Last synced: 30 Jan 2026

https://github.com/rahul-120/crop_recom

This project is a Machine Learning based Crop Recommendation System built using Flask. It helps farmers or users decide the most suitable crop to grow based on soil nutrients and environmental conditions.

crop-recommendation-system flask flask-application machine-learning python3 scikit-learn

Last synced: 02 May 2026

https://github.com/raulmaulidhino-dev/ml_modelling_regression

Many factors influence students' grades, and study hours are one of them. In this mini analysis project, three models learn and predict the relationship between students' study hours and their exam scores. The project identifies the best ML model for the problem.

data data-analysis-python data-science eda machine-learning scikit-learn

Last synced: 28 Jan 2026

https://github.com/pyzit/recommandation-engine-in-drf-sk-learn

Full Stack Movie Recommendation System Project made in Django REST Framework and React JS

api django django-rest-framework movies reactjs recommender-system scikit-learn

Last synced: 28 Jan 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/samjoesilvano/airline_ticket_fare_prediction

Airline Fare Prediction using Machine Learning focuses on developing a Random Forest model to predict flight prices, achieving an R² score of 0.804. The project includes hyperparameter tuning using RandomizedSearchCV, alongside extensive data preprocessing and feature engineering to ensure robust model performance.

airline-fare-prediction data-preprocessing data-visualization feature-engineering feature-selection hyperparameter-tuning machine-learning pandas python random-forest randomizedsearchcv regression-analysis scikit-learn

Last synced: 15 Apr 2026

https://github.com/jaypanchal9/fraud-detection-case-study

A comprehensive case study applying machine learning techniques to detect fraudulent transactions effectively.

machine-learning matplotlib numpy pandas python3 scikit-learn seaborn xgboost

Last synced: 15 Apr 2026

https://github.com/itssahilwhat/ai-fundamentals

A curated collection of fundamental AI concepts, algorithms, and code implementations — including Machine Learning, Deep Learning, and Computer Vision — built from scratch and with practical examples.

computer-vision deep-learning machine-learning numpy pandas python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/gunjangyl/iris-detection

The Iris Detection Project classifies different species of Iris flowers using machine learning techniques. It analyzes four key features—sepal length, sepal width, petal length, and petal width—to predict one of three classes: Setosa, Versicolor, or Virginica. The project uses algorithms like KNN, Decision Trees, or SVM for classification. Model pe

knn-classification matplotlib python scikit-learn seaborn

Last synced: 15 Apr 2026

https://github.com/samiyaalizaidi/nn-ml-homeworks

Homework solutions for CPE-4903: Neural Networks & Machine Learning at Kennesaw State University.

machine-learning machine-learning-workflow neural-networks numpy scikit-learn

Last synced: 15 Apr 2026

https://github.com/as1467/canada-per-capita-income-prediction

This project is a simple machine learning exercise to predict Canada's per capita income based on historical data. The dataset was sourced from the CodeBasics GitHub repository and is used here to practice linear regression as part of my machine learning studies.

machine-learning matplotlib-pyplot pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/nikitalpopov/evotor_champ

Solution for the Evotor data challenge.

data-analysis data-science python scikit-learn

Last synced: 15 Apr 2026

https://github.com/sabin74/fake_news_detection

This project implements a Fake News Detection system using Python, Natural Language Processing (NLP), and machine learning. It classifies news articles as Real or Fake based on their textual content.

fake-news-detection kaggle-dataset multinomial-naive-bayes passive-aggressive-classifier python3 regex scikit-learn

Last synced: 16 Apr 2026

https://github.com/sanjiv856/machine_learning_scikit-learn

Repository for machine learning in Python using Scikit-learn.

pipelines python scikit-learn sklearn titanic-kaggle titanic-survival-prediction

Last synced: 27 Feb 2026

https://github.com/zsailer/skspline

A scikit-learn interface for SciPy's spline.

scikit-learn scipy

Last synced: 16 Apr 2026

https://github.com/sergeimakarovv/energy-data-analytics-ml

Analyzing global data on sustainable energy, predicting CO2 emissions per capita

machine-learning pandas plotly python scikit-learn streamlit

Last synced: 12 Feb 2026

https://github.com/mgesteban/analyzing_car_prices

A comprehensive data science project analyzing factors that drive used car prices to provide actionable insights for used car dealerships.

crisp-dm data-science lasso-regression linear-regression machine-learning one-hot-encoding pandas ridge-regression scikit-learn

Last synced: 15 Feb 2026

https://github.com/paultheal1en/dsc-fact-checking

Fact-checking project classifying claims as SUPPORTED, REFUTED, or NEI. Uses ANN, DNN, RNN, CNN, Random Forest, PhoBERT, and Sentence Transformers.

deep-learning fact-checking keras machine-learning nlp phobert random-forest scikit-learn sentence-transformers tensorflow transformers

Last synced: 16 Apr 2026

https://github.com/hafidaso/predicting-industrial-machine-downtime-level-3

This project aims to develop a predictive model using machine learning techniques to forecast machine failures based on historical operational data.

imbalanced-learning numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/silky-x0/spam-detector

A machine learning algorithm to detect spam emails and similar messages.

jupyter-notebook nltk-python pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/khaymanii/calories-burnt-prediction-model

This model was built using Python and the XGBoost regression algorithm.

matplotlib numpy pandas python scikit-learn

Last synced: 16 Apr 2026

https://github.com/meiyor/abatech_ai_test

This repository contains the files for an exploratory data analysis (EDA) of participant demographic and company data collected through the outsourcing service provided by ABATech, a company located in Colombia. It also includes the evaluation of three different classifiers to decode the users' level of satisfaction.

keras python scikit-learn scikitlearn-machine-learning tensorflow

Last synced: 16 Apr 2026

https://github.com/archish27/pythontutorial

Python programming tutorial for newcomers who want to learn Python from scratch and apply it to various applications.

matplotlib numpy pandas pygame python python-2 python-3 scikit-learn soup

Last synced: 01 Apr 2026

https://github.com/sahiltiwariiii/dssp

Predicting student math scores! This project uses advanced machine learning techniques and MLOps tools like DVC and MLflow to predict a student's math score based on factors such as gender, race/ethnicity, parental level of education, lunch type, test preparation course, writing, etc.

docker dotenv dvc flask machine-learning mlflow mlops mysql mysql-connector-python numpy pandas pymysql python python-dotenv scikit-learn seaborn sklearn-library statistics streamlit

Last synced: 27 Mar 2026

https://github.com/capsuleismail/income-census-prediction

Predict whether an individual's annual income exceeds $50K based on census data. Also known as the "Census Income" dataset.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 16 Apr 2026

https://github.com/ejw-data/proj-food-inspections

Analyzing Chicago Food Inspection data for interesting insights by combining multiple data resources and performing feature engineering.

decision-trees pandas preprocessing python scikit-learn

Last synced: 17 Apr 2026

https://github.com/zenklinov/regression_logistic_-_sentiment_analysis_movie_data

This repository contains code for performing sentiment analysis using scikit-learn and logistic regression

llm natural-language-processing nlp nltk scikit-learn sentiment-analysis

Last synced: 10 May 2026

https://github.com/iamwatchdogs/cardiovascular-risk-prediction

This mini-project uses machine learning algorithms to predict possible risks of heart disease by analyzing given data.

jupyter-notebook machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/raphael-ufrj/analise_algodao

Historical analysis of cotton planting; planting analysis based on climate and historical data.

analysis data-science data-visualization dataset docker pandas provenance python python3 scikit-learn seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/prashver/end-to-end-model-deployment-on-aws

Student Performance Analysis with Machine Learning analyzes factors impacting student outcomes using a robust machine learning pipeline. Achieving an impressive R2 score, it predicts student performance effectively. With extensive data preprocessing and deployment on AWS Elastic Beanstalk, it ensures scalability and high availability.

amazon-web-services aws-elastic-beanstalk end-to-end-deployment flask machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/nikhilgugwad/sentiment-analysis

Sentiment analysis for the Kannada language to classify Kannada sentences into different emotions.

numpy pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/bjpcjp/scikit-learn

Updates in progress. Jupyter workbooks will be added as time allows.

python python3 scikit-learn

Last synced: 18 Apr 2026

https://github.com/gregoritsch3/ml_clustering_eda_customersegmentation

An EDA and Machine Learning Clustering exercise on the Mall Customer Segmentation synthetic dataset demonstrating the use of KMeans Clustering and the Elbow Method. The clustering algorithm successfully segments the customer base into groups distinguishable by their annual income and spending score.

clustering kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/abdul-rafay19/california-housing-price-prediction

This project predicts California housing prices using machine learning regression models, including Random Forests and Decision Trees. It covers data preprocessing, exploratory analysis, model training, and hyperparameter tuning to optimize performance.

decision-trees gridsearchcv linear-regression matplotlib numpy pandas python random-forest randomsearch-cv scikit-learn scipy seaborn

Last synced: 04 Apr 2026

https://github.com/mnitin-reddy/a-b-testing-and-regression-analysis-for-ad-performance-optimization

Analyzed the performance of Facebook and AdWords ads using A/B testing and regression analysis to identify trends, correlations, and cost-effectiveness. Key insights included distribution of clicks and conversions, monthly trends, and cost-per-conversion analysis to optimize ROI.

abtesting data-science hypothesis-testing machine-learning matplotlib numpy pandas scikit-learn scipy seaborn statsmodels

Last synced: 04 Apr 2026

https://github.com/giacomolat/object-detection-sperimental-thesis-for-degree

This repository contains my experimental thesis work on recognizing museum works using object detection techniques.

convolutional-neural-networks detectron2 jupyter-notebook machine-learning neural-networks object-detection python pytorch rcnn rcnn-model scikit-learn

Last synced: 18 Apr 2026

https://github.com/deliprofesor/game-search-volume-prediction-machine-learning-models-and-forecasting

This repository uses machine learning models like Random Forest, XGBoost, LightGBM, and time-series forecasting with Prophet to predict game search volumes. Additionally, Grid Search is applied for hyperparameter tuning of the LightGBM model.

data-cleaning data-science data-visualization feature-selection forecasting-models game-search grid-search hyperparameter-tuning lightgbm machine-learning pandas prophet python random-forest scikit-learn time-series-analysis time-series-forecasting xgboost

Last synced: 18 Apr 2026

https://github.com/manalisbhavsar/mall-customers-clustering

Applies K-Means clustering to mall customer data, segmenting customers by annual income and spending score to identify patterns and group customers for targeted marketing.

data-analysis data-visualization matplotlib numpy pandas python scikit-learn

Last synced: 18 Apr 2026

https://github.com/yashrajgithub/crop-recommendation

KrishiGyaan is a web app designed to help farmers make informed decisions on crop selection. By analyzing soil and environmental factors, the app provides personalized crop recommendations, enhancing agricultural productivity and promoting sustainable farming practices.

api artificial-intelligence crop-recommendation-system data-preprocessing data-visualization json machine-learning-algorithms pickle python random-forest-classifier scikit-learn streamlit supervised-learning train-test-split user-interface

Last synced: 05 Apr 2026

https://github.com/nowon1/insurance-claim-prediction_version

This project aims to predict the insurance claim amounts based on various customer attributes using machine learning techniques. The project involves data preprocessing, exploratory data analysis, feature engineering, and model training and evaluation.

data-preprocessing data-science data-visualization exploratory-data-analysis feature-engineering insurance jupyter-notebook machine-learning numpy pandas predictive-modeling python random-forest regression-analysis scikit-learn

Last synced: 05 Apr 2026

https://github.com/thekartikeyamishra/ai-customer-feedback-summarizer

The AI Customer Feedback Summarizer is a Python-based application that processes customer feedback, extracts insights, and summarizes reviews. The basic version uses extractive summarization techniques, and the advanced version integrates sentiment analysis, visualization, and industry-specific fine-tuning.

ai chatbot gpt machine-learning matplotlib nltk pandas python scikit-learn streamlit

Last synced: 18 Apr 2026

https://github.com/vijaykumarr1452/black_friday_sales_analysis

Black Friday Sales Analysis: a Python machine learning project using pandas and scikit-learn for data preprocessing, model training, and performance evaluation.

confusion-matrix jupyter-notebook machine-learning pandas python random-forest-classifier sales-analysis scikit-learn

Last synced: 19 Apr 2026

https://github.com/namratha2301/carprice_analysisandprediction

This project analyzes factors influencing vehicle prices using a dataset of various attributes, including Engine capacity, Power, Mileage, and Seating capacity.

data-analysis data-visualization exploratory-data-analysis machine-learning pandas predictive-modeling random-forest-classifier regression scikit-learn seaborn

Last synced: 20 Apr 2026

https://github.com/tryomar/data-miner

DataMiner is an interactive web application for data mining and machine learning. It helps users upload, clean, transform, and analyze datasets while building predictive models — all through a simple and powerful Streamlit interface.

data-cleaning data-mining data-preprocessing data-science data-visualization interactive-dashboards pandas python scikit-learn streamlit

Last synced: 20 Apr 2026

https://github.com/adityapradhan202/binge-trend

Media and entertainment recommendation website with an AI-powered recommendation system.

datascience-machinelearning natural-language-processing python scikit-learn spacy-nlp

Last synced: 21 Apr 2026

https://github.com/yogeshsinghkatoch9/advanced_nyc_housing_price_prediction

A robust ensemble learning framework for advanced NYC housing price prediction, leveraging global, clustered, and local ensembles with hyperparameter tuning.

data-science ensemble-learning housing-prices machine-learning new-york python scikit-learn

Last synced: 21 Apr 2026

https://github.com/waikato-datamining/spectral-data-converter-sklearn

Scikit-learn plugins for the spectral-data-converter library.

kasperl scikit-learn sdc seppl spectral-data

Last synced: 24 Apr 2026

https://github.com/leolion3/smartnanotubes-smellinspector-companion

Companion software for the SmellInspector Devices from SmartNanoTubes. Allows specifying substances, connecting multiple devices, collecting data and performing machine learning.

docker machine-learning python3 reactjs scikit-learn smartnanotubes smellinspector

Last synced: 27 Apr 2026

https://github.com/davidrpugh/kaust-dsa-201

Course materials for KAUST DSA 201

deep-learning machine-learning pytorch scikit-learn

Last synced: 27 Apr 2026