An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/pradipnp/decisiontree-iris

Machine learning project to classify iris flowers using a decision tree

classification decision-tree iris-dataset machine-learning python scikit-learn

Last synced: 18 May 2026

https://github.com/yugalsoni18/counterfeit_review_detection

Fake review detection using TF-IDF & SVM (AUC 0.98), plus Counterfeit Risk Score with clustering & anomaly detection.

business-analytics fraud-detection isolation-forest kmeans nlp python risk-scoring scikit-learn svm tfidf

Last synced: 18 May 2026

https://github.com/bhaveshbhakta/crop-yield-prediction

Indian Crop Yield Prediction Using Machine Learning

flask machine-learning python random-forest scikit-learn webdevelopment

Last synced: 20 Apr 2026

https://github.com/idaraabasiudoh/credit_card_fraud_detection

This repository contains a machine learning project focused on detecting credit card fraud using Decision Tree and Support Vector Machine (SVM) classifiers.

data-analysis jupyter-notebook machine-learning python3 scikit-learn snapml

Last synced: 19 Feb 2026

https://github.com/callesjuan/ninjalprm

Prototype of a tool for clustering Android devices by geolocation (server)

python scikit-learn xmpp

Last synced: 20 Jan 2026

https://github.com/veranyagaka/credit-card-fraud-detection

Credit card fraud detection using data preprocessing, analysis, visualization, and machine learning to accurately identify fraudulent transactions. Final project.

ai anomaly-detection classification credit-card-fraud-detection machine-learning scikit-learn supervised-learning

Last synced: 18 May 2026

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/ashishsingh789/bcg_virtual_internship

This repository showcases my BCG X virtual internship project on customer churn analysis for PowerCo, covering business understanding, EDA, feature engineering, and modeling using Python and machine learning.

data-manipulation data-science dataanalysis datavisualization eda machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/kasraskari/tumor-predict

Streamlit app for predicting tumor malignancy using logistic regression.

logistic-regression machine-learning numpy pandas python scikit-learn streamlit tumor-detection

Last synced: 09 Apr 2026

https://github.com/aadrianleo/fashion-style-classifier

A machine learning and deep learning pipeline for fashion image classification. Combines real-world data, manual annotation, and both KNN and EfficientNet-B0 CNN models to classify images into style categories. Includes data cleaning, augmentation, model training, evaluation, and reproducible notebooks.

classification-report cnn computer-vision confusion-matrix data-augmentation data-preprocessing deep-learning efficientnet exploratory-data-analysis fashion-classification image-classification knn label-studio machine-learning model-evaluation pytorch real-world-data reproducible-research scikit-learn transfer-learning

Last synced: 11 May 2026

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/kuldeep-gif/interactive-gesture-speech-system

An interactive AI system that translates real-time hand gestures into audible speech and converts spoken words into visual gestures using OpenCV and MediaPipe.

computer-vision gesture-recognition hci machine-learning mediapipe opencv python scikit-learn speech-recognition

Last synced: 09 Apr 2026

https://github.com/jianninapinto/bandersnatch

This project implements a machine learning model using Random Forest, XGBoost, and Support Vector Machines algorithms with oversampling and undersampling techniques to handle imbalanced classes for classification tasks in the context of predicting the rarity of monsters.

altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost

Last synced: 29 Sep 2025

https://github.com/macromrit/air-flick

Transfer files through the air with just a gesture. Push. Pull. Done.

css cv2 fastapi html js media-pipe peer2peer python random-forest-classifier restful-api scikit-learn websockets

Last synced: 09 Apr 2026

https://github.com/iamriteshkoushik/skrun

18hrs Scikit Learn Course Speedrun Repo

freecodecamp machine-learning scikit-learn

Last synced: 26 Apr 2026

https://github.com/anusha-me/customer_churn_analysis

Predict and analyze telecom customer churn using machine learning techniques and business dashboards. This end-to-end project includes data preprocessing, EDA, model evaluation (SVM, XGBoost), real-time Streamlit deployment, and Power BI dashboard reporting. Built for actionable insights and decision support.

churn-prediction classification-model customer-analytics dashboard data-science eda machine-learning powerbi predictive-analytics python scikit-learn streamlit svm telecom xgboost

Last synced: 29 Apr 2026

https://github.com/subhas-pramanik-09/mediscan-ai

A smart and scalable ML-powered health prediction system that can help detect the risk of three major diseases: diabetes, heart disease, and Parkinson's disease.

jupyter-notebook logistic-regression machine-learning numpy pandas scikit-learn streamlit svm-classifier

Last synced: 09 Apr 2026

https://github.com/smpotts/student-performance-predictions-ml

Creates machine learning models to predict students' learning outcomes.

jupyter-notebook machine-learning python regression-models scikit-learn

Last synced: 12 Sep 2025

https://github.com/mmerlyn/analysis-of-tomato-prices

Forecasting tomato prices in Karnataka using machine learning to help farmers make better crop planning and selling decisions.

css flask html matplotlib numpy pandas python scikit-learn seaborn

Last synced: 06 Jul 2025

https://github.com/omdoshi13/pricing-of-laptops-using-ml

Data Analysis, training Machine Learning models, and Model Evaluation and Refinement for Pricing of Laptops dataset.

data-analysis data-analysis-project datascience google-colab jupyter-notebook machine-learning matplotlib model-evaluation model-refinement numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/vhnegrisoli/machine-learning-linguagens-programacao

Data science and machine learning project analyzing programming languages from 2004 to 2021

data-science jupyter-notebook machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026

https://github.com/pejpero/machine_learning

This repository contains two comprehensive machine learning projects using scikit-learn, demonstrating ensemble learning with a Voting Classifier and the comparison of linear and polynomial regression models on different datasets.

ensemble-learning linear-regression logistic-regression machine-learning polynomial-regression random-forest scikit-learn svm

Last synced: 09 Feb 2026

https://github.com/praditaw/patient-los-prediction

Predicting patient Length of Stay (LoS) using machine learning to provide insights for hospital operational efficiency.

exploratory-data-analysis feature-engine healthcare-analysis huggingface-spaces hyperparameter-tuning length-of-stay los-prediction machine-learning pandas scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/gerardo1909/proyecto_nba_mvp

Final practical assignment for the course "Introducción al Aprendizaje Automático" (Introduction to Machine Learning) of the Licenciatura en Ciencia de Datos (Data Science degree) at UNSAM. 2C-2023

machine-learning nba notebooks-jupyter pandas python random-forest scikit-learn

Last synced: 03 Oct 2025

https://github.com/impesud/ai-finops-platform

AI FinOps is an AI-powered platform for cloud cost optimization and forecasting. Built with FastAPI, Python, and modern MLOps tools, it allows teams to track multi-cloud usage, detect anomalies, and predict future expenses using real-time data and machine learning.

aws docker fastapi jupyter mlflow python react scikit-learn statsmodels tailwindcss terraform xgboost

Last synced: 09 Apr 2026

https://github.com/parag000/content-based-movie-recommender

This project builds a content-based movie recommendation system using the TMDB dataset. By combining metadata features like cast, genres, and directors into a "metadata soup," it calculates movie similarity with vectorizers (Count) and cosine similarity. Ideal for learning content-based filtering and text vectorization techniques.

cosine-similarity countvectorizer recommendation-system scikit-learn tfidf-vectorizer vectorization

Last synced: 18 Apr 2026

https://github.com/swetshaw/machine-learning-a-z

It contains all tutorials based on Udemy course Machine Learning A-Z.

machine-learning python scikit-learn udemy-machine-learning

Last synced: 07 Apr 2026

https://github.com/svetlanam/pycon-workshop

PyCon CZ workshop: Better data analyses and product recommendations with Instagram data

data-analysis data-science martinus matplotlib pandas pycon2016 pyconcz python scikit-learn workshop

Last synced: 09 Apr 2026

https://github.com/ayan6943/employee-attrition-prediction-with-machine-learning

Employee Attrition Prediction with Machine Learning | Analyzing HR data to predict employee turnover using Random Forest. Includes EDA, feature engineering, model training, and evaluation. Achieved 90% accuracy.

attrition employee machine-learning matplotlib numpy pandas python randomforestclassifier scikit-learn seaborn smote

Last synced: 09 Apr 2026

https://github.com/al-shafi-github/deephatedetect-explainable-bengali-abusive-comments-classification-using-transformers-and-llm

This project aims to train different models that can detect Bengali hate speech on different social media platforms and to compare the models' performance.

bangla-nlp nlp nlp-machine-learning python3 regex scikit-learn scikitlearn-machine-learning tabular-data

Last synced: 01 May 2026

https://github.com/jalijuhola/amazon-textual-reviews-recommender-

Predicting scores and making recommendations using Amazon textual reviews

numpy pandas python scikit-learn typescript

Last synced: 09 Apr 2026

https://github.com/chengetanaim/customerpersonalityanalysis

Customer Personality Analysis involves a thorough examination of a company's optimal customer profiles. This analysis facilitates a deeper understanding of customers, enabling businesses to tailor products to meet the distinct needs, behaviors, and concerns of various customer types

kmeans-clustering pandas scikit-learn

Last synced: 21 Apr 2026

https://github.com/dragonscypher/feastfinderai

Discover the best dining spots with FeastFinderAI!

folium pandas python scikit-learn sql

Last synced: 09 Apr 2026

https://github.com/ifigeneiatsiflidou/applied-statistics-project

Project for an Applied Statistics course, involving exploratory data analysis and predictive modeling of movie revenue using engineered features and multiple linear regression.

correlation-analysis data-analysis linear-regression python scikit-learn visualization

Last synced: 29 Apr 2026

https://github.com/ravi0529/e-commerce-annual-spend-model

A basic linear regression model for predicting customers' annual spending

jupyter-notebook linear-regression matplotlib numpy pandas python scikit-learn scipy

Last synced: 09 Apr 2026

https://github.com/bkaracali/crime-data-analysis

Repository for Final Project

machine-learning python scikit-learn

Last synced: 21 Apr 2026

https://github.com/sk-g/mnist_beginners

Model search in traditional machine learning algorithms (non DL) and DL starter codes on MNIST dataset. This is a good starter code for beginners trying to learn about curse of dimensionality, overfitting and other concepts in general

keras machine-learning machine-learning-algorithms mnist mnist-beginners mnist-classification mnist-dataset numpy overfitting python pytorch pytorch-implmention resnet resnet-50 scikit-learn scikitlearn-machine-learning sklearn tensorflow

Last synced: 09 Apr 2026

https://github.com/nazmul-1117/100-days-of-machine-learning

I'm Nazmul, and I'm excited to start a new journey: 100 Days of Machine Learning. It's February 8, 2025. Let's see what happens, insha'Allah.

data-science machine-learning numpy pandas-dataframe python3 scikit-learn statistics

Last synced: 11 Aug 2025

https://github.com/hariprasath-v/hackerearth-amazon-business-research-analyst-hiring-challenge

Build a machine learning model that can calculate the time the delivery person takes to deliver the order.

exploratory-data-analysis hackerearth machine-learning pandas pycaret python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/abdellatif-laghjaj/salary-scope-predictor

SalaryScope: Job Salary Predictor is a machine learning solution designed to estimate salaries from job listings. It employs a full ML pipeline from exploratory data analysis, data cleaning, and NLP on job descriptions to regression model training (Linear Regression, Random Forest, etc.) and hyperparameter tuning

data-science developer-survey feature-engineering machine-learning predictive-modeling regression salary-calculator salary-prediction scikit-learn streamlit

Last synced: 08 May 2026

https://github.com/hrolive/disaster-response-pipeline

A machine learning pipeline that categorizes disaster related messages so that they can be sent to the appropriate disaster relief agency

flask machine-learning natural-language-processing nltk pandas plotly python scikit-learn sql sqlalchemy

Last synced: 07 Apr 2026

https://github.com/bhuvan-s-prasad/-alzheimer-diagnosis

This project predicts Alzheimer’s disease using machine learning with basic MLOps integration for better organization and reproducibility. It includes data processing, model training, evaluation, and deployment, incorporating version control, automation, and experiment tracking as a first step into MLOps.

alzheimers-disease classification eda explainable-ai exploratory-data-analysis machine-learning mlops pandas python random-forest random-forest-classifier regression scikit-learn supervised-learning

Last synced: 09 Apr 2026

https://github.com/ezeparziale/tweet-clasification

🐦 Tweet sentiment analysis

bootstrap flask nltk python scikit-learn

Last synced: 09 Apr 2026

https://github.com/prakashjha1/customer-segmentation

This repository contains a customer segmentation project implemented in a Jupyter Notebook using Python. Customer segmentation is a crucial strategy for businesses aiming to understand their customer base better, enabling targeted marketing strategies and personalized customer experiences.

clustering-algorithm customer-segmentation kmeans-clustering matplotlib python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/eusha425/housing-market-analysis

Implementation of supervised learning algorithms for real estate price prediction, featuring Ridge Regression optimization, IQR-based outlier detection, and extensive feature engineering. Includes detailed visualizations, statistical analysis, and model performance comparisons using various evaluation metrics.

data-preprocessing data-science exploratory-data-analysis house-price-prediction machine-learning python scikit-learn supervised-learning

Last synced: 09 Apr 2026

https://github.com/kejiahp/fastapi-ecom-recommendation-system

Advanced recommendation system for e-commerce applications.

docker fastapi jinja2 mongodb motor pydantic python scikit-learn scikit-surprise

Last synced: 07 Apr 2026

https://github.com/amandeep-gupta19/chatbot

Created a custom chatbot using Langchain. Here's a summary of what I did: Data Extraction: I gathered data about technical courses from the Brainlox website using Langchain’s URL loaders. Embedding Creation & Storage: I converted this data into embeddings and stored it in a vector store for efficient searching. API Development: I built a Flask

data-extraction faiss-vector-database flask-restful langchain numpy scikit-learn vector-database webbaseloader

Last synced: 09 Apr 2026

https://github.com/rohansoni45/movie-recommendation-system

This project is a Content-Based Recommender System that suggests movies to users based on their preferences and watched history. The system leverages cosine similarity to find and recommend movies similar to a selected title. It is built using Python and libraries like Pandas, NumPy, and Scikit-learn.

content-based-filtering cosine-similarity data-analysis data-science machine-learning numpy pandas python recommender-system render scikit-learn

Last synced: 17 Apr 2026

https://github.com/h00n24/ikr

Classification and Recognition - project

fit ikr scikit-learn vutbr

Last synced: 18 May 2026

https://github.com/konnik88/heart-disease-ml-practice

Practice notebook on heart-disease risk with a small/noisy dataset: EDA → preprocessing → classic ML baselines (scikit-learn). Not for clinical use

classification eda healthcare heart-disease imbalanced-data jupyter-notebook machine-learning model-evaluation optuna reproducibility scikit-learn

Last synced: 18 May 2026

https://github.com/moritzkoerber/text_analysis_app

A web app that classifies the content of messages that are usually sent during disasters such as earthquakes.

flask machine-learning nltk python scikit-learn

Last synced: 09 Apr 2026

https://github.com/tanaybhadula/ml-preprocessing-cli

A Python CLI tool that preprocesses datasets for supervised learning to save users time. Input data can be preprocessed using simple commands, and the preprocessed dataset can be downloaded afterwards.

cli data-cleaning data-preprocessing machine-learning pandas python scikit-learn

Last synced: 10 May 2026

https://github.com/altescy/xsklearn

Expanded scikit-learn for my research

python scikit-learn

Last synced: 21 Mar 2025

https://github.com/nicolascoiado/mulheres-ti

This repository contains Python code for analyzing how the number of women in Information Technology (IT) has changed over the years. Using pandas for data manipulation and scikit-learn to build a linear regression model, it aims to predict how many women will be working in IT in 2024 based on historical data.

linear-regression matplotlib pandas python python3 scikit-learn

Last synced: 09 Apr 2026

https://github.com/alphacrypto246/old-car-price-prediction

The Old Car Price Prediction project predicts used car prices using features like age, mileage, and fuel type. It includes data preprocessing, model training, and visualization of trends, with easy customization for additional features or models.

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 09 Apr 2026

https://github.com/sneha1012/ml-dl

Implementing concepts and algorithms from scratch.

deep-learning machine-learning matplotlib numpy-tutorial scikit-learn

Last synced: 18 May 2026

https://github.com/tasninanika/australian-credit-approval-analysis-svm

This project uses a Support Vector Machine (SVM) Classifier to predict whether a credit application is approved (1) or denied (0) based on applicant features.

numpy pandas python3 scikit-learn svm-classifier

Last synced: 10 Apr 2026

https://github.com/lexxai/goit_python_ds_hw_03

Module 3. Classical machine learning. Overfitting. Linear regression. LaTeX formulas.

latex linear-regression matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/abdullahashfaqvirk/SMS-Spam-Detection

A machine learning application designed to classify SMS messages as spam or non-spam, offering real-time analysis to identify potentially harmful content.

css3 docker flask html5 javascript matplotlib nltk numpy pandas python scikit-learn seaborn tailwindcss xgboost

Last synced: 16 Aug 2025

https://github.com/djdhairya/football-match-prediction

In this project, we'll predict the winner of football matches in the English Premier League (EPL).

jupyter-notebook machine-learning pandas python3 requests scikit-learn vscode

Last synced: 09 Apr 2026

https://github.com/abidhasanrafi/bioadaptive-eyeml-diagnosis

A state-of-the-art ocular diagnosis tool leveraging biomimetic machine learning to analyze eye movement patterns and predict ocular conditions with clinical-grade accuracy.

eye-tracking ocular-disease-recognition scikit-learn streamlit

Last synced: 16 Aug 2025

https://github.com/martingit2/aiportal-ml-service

ML microservice for Aracanix. A Python/Flask app that trains and serves XGBoost models for predictive analytics. See the README for links to the frontend and backend.

flask fullstack machine-learning microservices pandas python scikit-learn xgboost

Last synced: 09 Apr 2026

https://github.com/1587causalai/causal-sklearn

Scikit-learn Compatible Causal Machine Learning Library - Based on CausalEngine™

cauchy-distribution causal-inference causal-machine-learning machine-learning python pytorch scikit-learn

Last synced: 17 Aug 2025

https://github.com/bruno-moura24/hand-gesture-ai

Python project that uses OpenCV, MediaPipe, and scikit-learn to detect hand gestures via webcam and classify them as numbers from 0 to 5 in real time.

computer-vision hand-gesture-recognition machine-learning mediapipe opencv python real-time-ai scikit-learn

Last synced: 28 Apr 2026

https://github.com/nickklos10/seriea_machine_learning_predictions_2025

This project involves scraping data, processing the data, and building machine learning models to predict the standings for the 2024-2025 Serie-A season.

beatifulsoup data-scraping keras matplotlib pandas scikit-learn shap tensorflow

Last synced: 13 Apr 2026

https://github.com/balajig-24/titanic_data_analysics-

Titanic Survival Prediction: a classic machine learning problem that aims to predict whether a passenger survived the Titanic disaster based on various features such as age, gender, passenger class, and more. This project demonstrates my ability to clean, analyze, and model.

jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/vijay-saravanan/advanced-human-life-detection

Portable, real-time embedded system using mmWave radar, microphone, and accelerometer sensor fusion with advanced signal processing and machine learning to detect and locate humans trapped under debris. Features rapid alerts via LCD, LED, buzzer, and is designed for Raspberry Pi deployment in disaster scenarios.

disaster-recovery dwt fft landslide machine-learning random-forest-classifier scikit-learn sensor-fusion sensors-data-collection signal-processing vital-signs

Last synced: 14 May 2026

https://github.com/sabin74/loan_approval_prediction

This project predicts whether a loan application will be approved or not using machine learning classification models. The dataset used is from Kaggle’s Loan Prediction problem. The goal is to build a robust model to assist banks or financial institutions in making automated loan approval decisions.

classification-models kaggel-dataset loan-approval-prediction matplotlib-seaborn pandas python scikit-learn

Last synced: 30 Apr 2026