An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/hrolive/disaster-response-pipeline

A machine learning pipeline that categorizes disaster-related messages so that they can be sent to the appropriate disaster relief agency.

flask machine-learning natural-language-processing nltk pandas plotly python scikit-learn sql sqlalchemy

Last synced: 07 Apr 2026

https://github.com/vhnegrisoli/machine-learning-linguagens-programacao

Data Science and Machine Learning project analyzing programming languages from 2004 to 2021

data-science jupyter-notebook machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/smpotts/student-performance-predictions-ml

Creates machine learning models to predict students' learning outcomes.

jupyter-notebook machine-learning python regression-models scikit-learn

Last synced: 12 Sep 2025

https://github.com/aadrianleo/fashion-style-classifier

A machine learning and deep learning pipeline for fashion image classification. Combines real-world data, manual annotation, and both KNN and EfficientNet-B0 CNN models to classify images into style categories. Includes data cleaning, augmentation, model training, evaluation, and reproducible notebooks.

classification-report cnn computer-vision confusion-matrix data-augmentation data-preprocessing deep-learning efficientnet exploratory-data-analysis fashion-classification image-classification knn label-studio machine-learning model-evaluation pytorch real-world-data reproducible-research scikit-learn transfer-learning

Last synced: 11 May 2026

https://github.com/veranyagaka/credit-card-fraud-detection

Credit Card Fraud Detection using data preprocessing, analysis, visualization, and machine learning to accurately identify fraudulent transactions. Final project.

ai anomaly-detection classification credit-card-fraud-detection machine-learning scikit-learn supervised-learning

Last synced: 18 May 2026

https://github.com/yugalsoni18/counterfeit_review_detection

Fake review detection using TF-IDF & SVM (AUC 0.98), plus Counterfeit Risk Score with clustering & anomaly detection.

business-analytics fraud-detection isolation-forest kmeans nlp python risk-scoring scikit-learn svm tfidf

Last synced: 18 May 2026

https://github.com/pradipnp/decisiontree-iris

Machine learning project to classify iris flowers using a decision tree

classification decision-tree iris-dataset machine-learning python scikit-learn

Last synced: 18 May 2026

https://github.com/sudarshanc00/brain-tumor-classification

This project uses a deep learning model in PyTorch to classify brain MRI images into four tumor types, aiding early diagnosis and treatment planning. Two ResNet-based models were developed and optimized, achieving high accuracy to support healthcare professionals in identifying tumor categories.

matplotlib numpy pytorch resnet scikit-learn streamlit

Last synced: 10 Apr 2026

https://github.com/martinkersner/kmeans-meetup

Presentation about k-Means for Seoul AI Meetup on July 22, 2017.

kmeans numpy python scikit-learn

Last synced: 03 May 2026

https://github.com/sanalislokuge/breast-cancer-ml-prediction

Machine Learning project using classification, regression, and ensemble techniques to predict breast cancer mortality status and survival months using clinical data. Built with scikit-learn, decision trees, logistic regression, and Naïve Bayes. Includes detailed model evaluation, data preprocessing, and interpretability.

classification data-science decision-tree ensemble-learning healthcare-analytics machine-learning ml models naive-bayes-classifier predictive-modeling regression scikit-learn

Last synced: 19 May 2026

https://github.com/anty-filidor/cyberbullying-detector

NLP bullying detector for tweets, with an ML model training pipeline, deployed as a web app with CI/CD

deployment-system flask-api machine-learning nlp python scikit-learn

Last synced: 19 May 2026

https://github.com/arrhythmia-detection/authorfeatureextracteddecisiontreeesp32s3

Deploys a vanilla non-optimized Decision Tree for Arrhythmia classification using Chapman ECG dataset on ESP32-S3 dev kit

arrhythmia-classification decisiontreeclassifier eloquent esp32-arduino esp32-s3 scikit-learn

Last synced: 19 May 2026

https://github.com/huucanh0511/startup-profitability-prediction

This project predicts startup profitability using Logistic Regression and Random Forest, analysing financial (funding amount, funding rounds, revenue), market (market share), and operational (startup age, employee count) factors. It evaluates AUC, accuracy, precision, recall, and F1-score, addressing underfitting, overfitting, and feature selection

ai-for-finance data-science financial-modelling logistic-regression machine-learning predictive-analytics python random-forest scikit-learn startup-analysis

Last synced: 19 May 2026

https://github.com/subratamondal1/machine-learning

Machine Learning Notes with tools like Numpy, Pandas, Scikit-Learn.

machine-learning numpy pandas scikit-learn

Last synced: 10 Apr 2026

https://github.com/somjit101/ds-logistic-regression

A simple implementation of the Logistic Regression Classifier on the Breast Cancer Dataset with L1 regularization and GridSearch for hyperparameter tuning.

breast-cancer-prediction breast-cancer-wisconsin grid-search grid-search-cross-validation hyperparameter-tuning logistic-regression machine-learning-algorithms regularization scikit-learn

Last synced: 19 May 2026

https://github.com/freakwill/dred

🔴 dred = dimension reduction for machine learning (suited to sklearn)

dimension-reduction scikit-learn sklearn sklearn-estimator

Last synced: 19 May 2026

https://github.com/lopez86/datascienceexamples

Examples of various data science & data analysis topics using various sources of data.

data-analysis data-science pandas scikit-learn tutorial visualization

Last synced: 13 Apr 2026

https://github.com/shubhamgoyal575/credit-card-fraud-detection

📌 Credit Card Fraud Detection using Machine Learning. This project focuses on detecting fraudulent credit card transactions using machine learning models like Random Forest, XGBoost, and Deep Learning. The dataset is preprocessed to handle class imbalance, and multiple models are evaluated based on ROC AUC Score and F1 Score.

adaboost-classifier artificial-neural-networks credit-card-fraud data-analysis data-cleaning data-preprocessing data-science data-visualization deep-learning exploratory-data-analysis lightgbm machine-learning machine-learning-algorithms random-forest-classifer scikit-learn tensorflow xgboost

Last synced: 08 Feb 2026

https://github.com/davidcgong/birddog.io

Real estate forecasting using Zillow Research data

forecasting pandas scikit-learn

Last synced: 19 May 2026

https://github.com/jazib-2004/face-mask-detection-using-cnns

Face mask detection can be very useful in environments like hospital emergency rooms or ICUs where wearing a mask is mandatory. It can also help in pandemics like COVID, where such models can detect whether a person is wearing a mask. In this project, I used a Convolutional Neural Network architecture to train a face mask detection algorithm.

convolutional-neural-networks keras object-detection python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/xprithvi/random-forest-regressor

This Jupyter notebook serves as a machine learning template to quickly make predictions and analyse feature importance in a dataset.

data-science feature-extraction machine-learning random-forest random-forest-regression scikit-learn

Last synced: 14 Mar 2025

https://github.com/lourdilene/guess-the-number

Number guessing game played between two players: human and computer. Basic Python project for studying object-oriented programming and machine learning with the scikit-learn library.

machine-learning oops-in-python scikit-learn

Last synced: 20 May 2026

https://github.com/hazz-i/codexia-chatbot

Discriminative Chatbot

chatbot nlp scikit-learn

Last synced: 19 May 2026

https://github.com/alphacrypto246/zoo-animal-classifier

A project that uses machine learning to classify animals into categories like Mammals, Birds, and Reptiles based on their characteristics.

machine-learning machine-learning-algorithms random-forest scikit-learn

Last synced: 20 May 2026

https://github.com/rohit-2301/hiresense

HireSense is an AI-powered resume classifier that uses NLP and Machine Learning to predict the best-fit job role from a PDF resume. Built with Streamlit, it features a clean UI for uploading resumes and instantly suggests roles like Data Scientist, Full Stack Developer, and DevOps Engineer.

joblib ml nlp pymupdf python scikit-learn streamlit tfidfvectorizer

Last synced: 22 Jul 2025

https://github.com/zahediparsa/ml_birkaracademy

Developed exercises and practical tasks to help students grasp key machine learning topics in a course hosted by Birkar Academy and ICDS.ai

decision-trees iris-dataset knn machine-learning mlp-classifier scikit-learn

Last synced: 02 Jan 2026

https://github.com/wuweiweiwu/zookeeper-bot

Bot for Facebook Messenger game Zookeeper using scikit-learn SVM 🐪

facebook-messenger scikit-learn svm zookeeper

Last synced: 20 May 2026

https://github.com/barbaraeguche/pyrocast

🚒 A proactive wildfire prediction and analysis tool built with React and Flask.

ai flask ml pandas react scikit-learn vite

Last synced: 08 Apr 2026

https://github.com/pramodyasahan/house-price-prediction

This repository contains the code for a machine learning model aimed at predicting housing prices. The model is based on the RandomForestRegressor algorithm from the scikit-learn library and utilizes feature selection, preprocessing, and pipeline techniques for improved performance.

machine-learning numpy pandas python scikit-learn

Last synced: 08 Apr 2026

https://github.com/bsamseth/triangular-regressor

A scikit-learn compatible implementation of a 2D triangular regressor.

scikit-learn triangulation

Last synced: 20 May 2026

https://github.com/thekartikeyamishra/aipoweredmarketingassistant

AI-Powered Marketing Assistant, an advanced tool designed to enhance your digital marketing campaigns using the power of machine learning (ML) and large language models (LLMs). This project empowers small businesses and MSMEs to create compelling content, analyze campaigns, and strategize effectively.

artificial-intelligence llm matplotlib numpy openai pandas python scikit-learn streamlit

Last synced: 08 Apr 2026

https://github.com/esha-sm/forecastx

This is an interactive web application for forecasting sales data using the ARIMA model. Users can upload their own CSV files or use a default dataset to generate forecasts and visualizations.

arima-model flask-api jupyter-notebook matplotlib pandas plotly python scikit-learn seaborn

Last synced: 27 Feb 2026

https://github.com/freakwill/nb-combination

ensemble classifier with naive bayes combination

bayes-classifier python scikit-learn

Last synced: 20 May 2026

https://github.com/mohit1106/Fraud-Detection-In-Financial-Transactions

an anomaly detection system on 284,807 transactions, achieving an AUC of ~0.972 with CNNs and Autoencoders.

autoencoders cnn-model isolation-forest keras python scikit-learn tensorflow

Last synced: 17 Oct 2025

https://github.com/jihoonerd/restricted-discriminant-analysis

RDA implementation compatible with Scikit-learn API

discriminant-analysis rda scikit-learn

Last synced: 22 Apr 2026

https://github.com/mk2345/fashionmnist-dl-ml

CNN and SVM image classifiers implemented in Keras and Scikit-Learn.

jupyter-notebook keras-tensorflow scikit-image scikit-learn

Last synced: 10 May 2026

https://github.com/kheriberto/logistic_regression_project

A project that analyses dummy data from an advertising company using logistic regression

data-analysis logistic-regression pandas python scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/khaja-shaik-21/heart-disease-prediction-system

This form allows users to enter key health details like age, blood pressure, cholesterol levels, and exercise results to predict the likelihood of heart disease. The data is submitted to the backend for processing, where a machine learning model provides a prediction. The form is styled for a clean and responsive user experience.

css3 flask-application git html5 logestic-regression numpy pandas python3 scikit-learn

Last synced: 12 Apr 2026

https://github.com/amon20044/quantum-bayes-classifiers-and-their-application-in-image-classification

Implements Quantum Bayes Classifiers (QBCs) for image classification tasks using the MNIST and Fashion-MNIST datasets, based on the research by Ming-Ming Wang and Xiao-Ying Zhang. The project includes Naïve QBC, SPODE-QBC, TAN-QBC, and Symmetric-QBC, simulated on MindQuantum.

bayesian bayesian-inference classification computing gaussian mindquantum mindspore naive-bayes-classifier qml quantum quantum-computing quantum-machine-learning research-reproduction scikit-learn spode tan

Last synced: 18 May 2026

https://github.com/abidhasanrafi/pharma-sales-analytics

A Streamlit-powered web application for analyzing pharmaceutical sales performance across teams, products, and territories.

matplotlib numpy pandas plotly sales-analysis scikit-learn seaborn streamlit

Last synced: 08 Apr 2026

https://github.com/jenil311/application-of-covid-19-spread-analysis

The objective of this project is to study the COVID-19 outbreak using basic statistical techniques and make short term predictions using ML regression methods.

covid19-tracker machine-learning regression-analysis regression-models ridge-regression scikit-learn

Last synced: 02 Jan 2026

https://github.com/mhmudfzli/loan-approval-prediction

This project demonstrates a comprehensive approach to solving a regression problem using various machine learning models. The notebook includes: Data Preprocessing, Exploratory Data Analysis (EDA), Model Training, Hyperparameter Tuning, Model Evaluation, Feature Importance

automl catboost numpy pandas python scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/lren-chuv/sklearn_to_pfa

Convert Scikit Learn models to PFA

pfa-standard scikit-learn

Last synced: 21 May 2026

https://github.com/abhishekbagdiya01/movies-recommendation-system

This repository contains the code for a movie recommendation system built using Jupyter Notebook.

aiml jupyter-notebook numpy pandas python scikit-learn

Last synced: 08 Apr 2026

https://github.com/bjornmelin/ml-algorithm-playground

🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈

algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost

Last synced: 13 May 2026

https://github.com/sreekar0101/-movie-recommendation-system-using-python

The Movie Recommendation System is designed to suggest personalized movie recommendations by analyzing extensive datasets containing movie details and credits. It utilizes the Python libraries NumPy, pandas, and scikit-learn. The system achieved a 15% improvement in accuracy compared to the baseline model by identifying key factors that influence user choice.

data-analysis data-visualization numpy-library pandas-dataframe scikit-learn seaborn-python

Last synced: 02 Jan 2026

https://github.com/douglaside/airlinedelay

[✍🏻Learn] Project aimed at analyzing flight delays, using Python algorithms and machine learning techniques to aid decision-making and identify patterns.

ai alura boxplot data-science graphics histogram machine-learning machine-learning-algorithms pandas python scikit-learn static

Last synced: 28 Jun 2025

https://github.com/achronus/data-exploration

A repository dedicated to interesting data exploration projects I've completed

data-analysis exploratory-data-analysis machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 02 Jan 2026

https://github.com/miguellopezvirues/azure_keyword_cpc

Development and deployment of a simple regression model in Azure Machine Learning.

azureml deplyment machine-learning mlflow pandas scikit-learn

Last synced: 09 May 2026

https://github.com/hawkharsh1/house-price-pridiction-model-using-ann

A deep learning-based regression model built using Artificial Neural Networks (ANN) in PyTorch to predict house prices from structured data. This project demonstrates the application of machine learning and deep learning techniques for solving real-world problems in the housing domain.

artificial-neural-networks deep-neural-networks machine-learning numpy pandas python3 pytorch scikit-learn

Last synced: 08 Apr 2026

https://github.com/chawthinn/car-price-prediction-regression-ml

This project predicts used car prices using regression models including Linear, Ridge, Random Forest, XGBoost, and LightGBM. It covers preprocessing, EDA, model evaluation, hyperparameter tuning, and model persistence using Scikit-learn and related libraries.

car-price-prediction lgbmregressor linearregression numpy pandas python scikit-learn xgbregressor

Last synced: 08 Apr 2026

https://github.com/giatraskon/machine_learning_assignments

Machine learning assignments covering regression, classification, neural networks, adversarial examples, and real-time emotion detection using Python. Includes theoretical insights and practical implementations.

adversarial-examples bayesian-inference bias-variance-tradeoff cifar10 classification deep-learning emotion-recognition iris-dataset k-nearest-neighbours keras machine-learning mnist neural-networks opencv pima-indians-diabetes python regression ridge-regression scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/haydencordeiro/terafeed

Terafeed - Addressing Zero Hunger in Africa (Sustainability Goal SDG 2)

javscript numpy pandas powerbi python scikit-learn tableau vuejs

Last synced: 08 Apr 2026

https://github.com/thekartikeyamishra/predictive-sales-analytics

The Predictive Sales Analytics tool aims to help MSMEs forecast future sales using historical data. This advanced version leverages Machine Learning for accurate predictions and provides a dashboard to visualize sales trends, seasonality, and predictions.

joblib machine-learning matplotlib pandas python scikit-learn streamlit

Last synced: 08 Apr 2026

https://github.com/rakibhhridoy/visualmachinelearning-yellowbrick

Yellowbrick wraps scikit-learn and matplotlib to create publication-ready figures and interactive data explorations. It is a diagnostic visualization platform for machine learning that allows us to steer the model selection process by helping to evaluate the performance, stability, and predictive value of our models and further assist in diagnosing the problems in our workflow.

classification hyperparameter-tuning machine-learning model-evaluation model-view-presenter model-visualization python random-forest random-forest-classifier scikit-learn visualization xgboost xgboost-algorithm yellowbrick

Last synced: 03 May 2026

https://github.com/leosolar8/mental-health-tech-ai-survey

Mental Health in Tech Survey Analysis — Applied K-means clustering, PCA, and Chi-square tests on tech industry survey data to uncover patterns between remote work practices and mental health consequences, with visualizations of key trends.

clustering data-science kmeans machine-learning mental-health pca python-project scikit-learn seaborn survey-analysis tech-industry visualization

Last synced: 08 Apr 2026

https://github.com/notshrirang/m2connex

M2ConneX is an all-encompassing platform specifically crafted for MMCOE alumni, enabling seamless communication, networking, and collaboration. It provides tailored recommendations for connections, posts, and job opportunities based on each user's unique skills and experience.

django django-rest-framework scikit-learn

Last synced: 28 Jun 2025

https://github.com/lorenzorottigni/ml-lending-club

Machine Learning Python bootcamp: random forest classifier on LendingClub dataset

ipynb machine-learning numpy pandas python random-forest-classifier scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/shahbazshaddy/explainable-multimodal-ai-for-breast-cancer-and-pneumonia-prediction

A deep learning-based framework integrating explainable multimodal AI for accurate prediction and transparent diagnosis of breast cancer and pneumonia.

deep-learning explainable-ai grad-cam groq-api llm machine-learning matplotlib multimodal numpy pandas python pytorch scikit-learn seaborn streamlit

Last synced: 08 Apr 2026

https://github.com/vladstudennikov/diabetes-prediction-app

ML-powered web app built with Laravel and Vue.js to predict diabetes risk based on users' daily habits and behavior

cypress data-analysis diabetes-prediction fastapi inertiajs laravel matplotlib medicine ml pandas php scikit-learn seaborn vuejs

Last synced: 08 Apr 2026

https://github.com/jhylin/ml1-1_small_mols_in_chembl

Polars dataframe library and logistic regression in scikit-learn (update)

logistic-regression machine-learning parquet-files polars-dataframe scikit-learn

Last synced: 03 Jan 2026

https://github.com/hariprasath-v/machinehack_analytics_olympiad_2023

Create a machine learning model to determine the likelihood of a customer defaulting on a loan based on credit history, payment behavior, and account details.

binaryclassification catboost exploratory-data-analysis machine-learning numpy pandas python scikit-learn shap

Last synced: 08 Apr 2026

https://github.com/priyanshulathi/cancer-diagnosis-prediction-model

A Machine Learning project to predict cancer malignancy using K-Nearest Neighbor, Support Vector Machine, and Decision Tree algorithms.

machine-learning numpy pandas python scikit-learn

Last synced: 03 Jan 2026

https://github.com/rinuya/ml-cancer-diagnosis

Binary classification using MLP & Random Forest

ml mlp random-forest scikit-learn

Last synced: 03 Jan 2026

https://github.com/ledsouza/deep-learning-noticias

This project aims to build two Machine Learning models: one to classify news articles into different categories and another to perform text autocomplete, predicting the next word in a sentence. The provided dataset consists of articles from a news site, already preprocessed and stored in a CSV file.

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 08 Mar 2026

https://github.com/armahdavi/data_pipeline_analytics_statistics_ML_PM_PSD_residential_QFF

Sharing all the data pipelines and processing code, statistical modelling, descriptive statistics, plot visualizations, and machine learning from Mahdavi & Siegel (2021) (Indoor Air). Project milestone: 2017 - 2020. Full-length article: https://onlinelibrary.wiley.com/doi/abs/10.1111/ina.12782

data-science data-visualization dust hvac indoor-air-quality jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats spyder spyder-python-ide statistics

Last synced: 17 Sep 2025

https://github.com/rohanbanerjee1234567-cell/prediction-of-expected-salary-using-machine-learning

My first project repository: a machine learning project built in Python. The problem statement was to train a model on the given dataset and use it to predict the expected salary of employees with similar profiles.

exploratory-data-analysis linearregression matplotlib-pyplot numpy pandas randomforest randomforestregressor scikit-learn scikitlearn-machine-learning searborn visualization

Last synced: 27 Apr 2026

https://github.com/rajan-bhateja/machine-learning-with-python

Machine learning algorithms implemented using Scikit-learn

classification clustering machine-learning regression scikit-learn sklearn

Last synced: 17 May 2026

https://github.com/sabbadini10/job4you

Job4You is an AI-powered job application assistant that streamlines the entire application process. Built on Angular and Firebase with GPT-4 integration.

angular api ats-optimization cover-letter email-automation firebase jobforall openai-api python resume-builder scikit-learn sheraz sherazhussain sherazhussain546

Last synced: 04 Mar 2026

https://github.com/jain1shh/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 09 Apr 2026

https://github.com/shafaq-aslam/predicting-heart-disease-risk-with-logistic-regression-techniques

Develop a predictive model using logistic regression techniques to assess heart disease risk based on patient health metrics and data analysis.

data-analysis heart-disease logistic-regression machine-learning machine-learning-models matplotlib numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/shauryashaurya/marty_mcfly

Code, text and notebooks on a tutorial for Introduction to Machine Learning using open sources

anaconda jupyter-notebooks machine-learning machine-learning-tutorials notebook numpy python regression scikit-learn scipy tutorial

Last synced: 09 Apr 2026

https://github.com/rajan-bhateja/Machine-Learning-with-Python

ML/DL projects done using sklearn and TensorFlow

machine-learning scikit-learn sklearn

Last synced: 28 Jul 2025