An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/id-andyyy/alfahack

📈💰 Investment propensity prediction model

catboost hackathon-project jupyter lightgbm numpy optuna pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/enricobolzonello/ml_homeworks

Homeworks for the Machine Learning Course 2022/23 @ Unipd

linear-regression machine-learning neural-network scikit-learn svm

Last synced: 11 Jun 2025

https://github.com/adarshpheonix2810/fake-job-post-detection

This project focuses on detecting fake job posts using machine learning. Fake job advertisements are often created to scam individuals by stealing personal information or money.

data-analysis deep-learning joblib machine-learning nlp-machine-learning numpy pandas python scikit-learn tkinter

Last synced: 12 Apr 2026

https://github.com/vishnu-vamshii/fraud-detection-using-machine-learning

Developed a machine learning pipeline to detect fraudulent credit card transactions, handling imbalanced data with SMOTE and scaling. Trained models like Logistic Regression and Random Forest. Conducted EDA to identify fraud patterns.

pandas python scikit-learn tensorflow

Last synced: 12 Apr 2026

https://github.com/mindlessmuse666/train-test-splitter

Analysis of Titanic passenger data and splitting it into training and test sets. Practical assignment for the course "Fundamentals of Applying Artificial Intelligence Methods in Programming".

data-analysis data-preprocessing data-visualization machine-learning pandas python scikit-learn seaborn titanic train-test-split

Last synced: 12 Apr 2026

https://github.com/oneapi-src/predictive-asset-health-analytics

AI Starter Kit for Predictive Asset Maintenance using Intel® optimized version of XGBoost

machine-learning scikit-learn

Last synced: 04 Apr 2025

https://github.com/mark1708/neurointerfaces-of-information-systems

Laboratory work on the discipline "Neurointerfaces of information systems"

numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/nicolas-giacomelli/modelo-polinomial-api-streamlit

Code for training a model that predicts salary from time at the company and level within the company, served through an API with a Streamlit front end. To get a result, enter the time at the company and the level, and the app calculates the matching salary.

api fastapi joblib machine-learning matplotlib numpy pandas pingouin pydantic scikit-learn seaborn streamlit uvicorn

Last synced: 12 Apr 2026

https://github.com/yuweaec/wine_quality_prediction

The Wine Quality Prediction project aims to predict the quality of wine based on its chemical properties using machine learning algorithms.

flask jupyter-notebook machine-learning python scikit-learn

Last synced: 11 Apr 2025

https://github.com/striderzz/ml-heart-disease-classification

Machine Learning - Heart Disease Classification Project using scikit-learn

classification-machine-learning machine-learning machine-learning-projects scikit-learn

Last synced: 16 May 2026

https://github.com/jaspreetsingh-exe/sign-language-recognition-system

Sign Language Recognition System is an AI-powered application that enables real-time sign language recognition using MediaPipe and an MLP model. It captures hand gestures, extracts landmark features, and predicts sign language letters dynamically. The project also explores MobileNetV2 and aims to expand into Text-to-Sign Language generation.

deep-learning machine-learning mediapipe mobilenetv2 neural-networks scikit-learn sign signlanguage signlanguagedetection signlanguagerecognition tensorflow

Last synced: 01 May 2026

https://github.com/gfyoung/tree-decode

Package for removing the black-box around decision trees

blackbox decision-tree machine-learning python scikit-learn

Last synced: 20 Jan 2026

https://github.com/aishwaryagm1999/insurance-workflow-management

This project is an Insurance Workflow Management System designed to streamline policy management, claims processing, and fraud detection. It includes user account management, customer feedback analysis via NLP, alert notifications through SMS, and a fraud detection model, providing a secure, efficient solution for insurance operations.

css fraud-detection html json labelimg machine-learning natural-language-processing nlp opencv python qr-code-generator random-forest-classifier scikit-learn sms-notification tensorflow textblob twilio user-interface

Last synced: 26 Dec 2025

https://github.com/imehranasgari/mlflow_starter

This project is a hands-on guide to the complete end-to-end MLflow workflow, designed as an educational resource. It demonstrates how MLflow is used in practice for experiment tracking, model versioning, and ensuring a reproducible MLOps lifecycle, focusing on the methodology and best practices rather than high model accuracy.

data-science experiment-tracking mlflow mlops model-registry python scikit-learn

Last synced: 11 May 2026

https://github.com/lilivalgo/machine-learning-projects

This repository hosts the machine learning project developed during my learning journey. It showcases my progress and the skills acquired in the field of machine learning.

lag-feature linear-regression ml-models scikit-learn scipy-stats seaborn-plots

Last synced: 28 Mar 2025

https://github.com/urvee1810/eda-time-series

A comprehensive time series analysis of French retail quarterly sales data from 2012 to 2017. The project focuses on analyzing sales patterns, seasonal decomposition, and trend analysis using various statistical techniques and visualizations.

arima-modeling data-visualization exploratory-data-analysis matplotlib numpy pandas pmdarima python scikit-learn seaborn statsmodels time-series-analysis trend-analysis

Last synced: 12 Apr 2026

https://github.com/alphacrypto246/grape-quality-prediction

The Grape Quality Prediction project uses machine learning to predict the quality of grapes based on chemical properties like acidity, sugar content, and alcohol levels. It applies regression models to forecast the quality score, helping in wine production and quality assessment.

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 19 Apr 2026

https://github.com/manojkp08/student-performance-analysis

The Student Performance Analyzer is your go-to solution for understanding and improving student performance. By blending the power of machine learning with interactive visualizations, this tool provides educators and learners with personalized insights into learning styles, performance gaps, and actionable improvements.

machine-learning numpy pandas python requests scikit-learn streamlit

Last synced: 12 Apr 2026

https://github.com/brianlesko/maze-runner

Developed a Python-based maze-crawling application using a PS5 controller interface. This project highlights skills in software-hardware integration and low-code UI design, demonstrating expertise ideal for advanced software engineering.

communication dualsense engineer engineering hacking hardware hardware-hacking interface low-code-ui mechanical-engineer mechanical-engineering protocol ps5 python robotics-engineer scikit-learn software sony streamlit ui

Last synced: 12 Apr 2026

https://github.com/crispengari/python-sklearn

💎 Introduction to machine learning with scikit-learn in Python. A quick walkthrough of the sklearn library for machine learning and understanding different machine learning algorithms.

ai artificial-intelligence classification clustering datascience jupyter-notebook machine-learning ml-python nlp python regression scikit-learn

Last synced: 13 May 2026

https://github.com/celineboutinon/faux-billets

OpenClassrooms Data Analyst 2022-2023 - Project 10

machine-learning python scikit-learn statsmodels

Last synced: 16 May 2026

https://github.com/raduldev/ml-projects

End To End Machine Learning Project guided by Krish Naik from Ineuron.

catboost dill flask-application numpy pandas python scikit-learn xgboost

Last synced: 12 Apr 2026

https://github.com/dakohhh/politicians-face-recognition

A machine learning model that classifies famous Nigerian politicians. Classification is restricted to only 4 people.

gridsearchcv jupyter-notebook machine-learning opencv python pywavelets scikit-learn

Last synced: 16 Apr 2026

https://github.com/arrhythmia-detection/arrhythmiadetectionmodels

This repository contains the ML codebase developed during the CSE713 group project.

arrhythmia-detection deep-neural-nets esp32-s3 scikit-learn tensorflow tensorflow-lite tinyml

Last synced: 12 Apr 2026

https://github.com/umasivakumar14/real_estate_ml_model

Predicts the price of a home in Bengaluru, Karnataka based on location, urbanization, total square feet, bedrooms, bathrooms, and balconies.

aws flask gridsearchcv http-requests machine-learning machine-learning-algorithms nginx pandas python scikit-learn

Last synced: 02 Feb 2026

https://github.com/usmana5809/quran-recitation-audio-classification

The Quran Recitation Audio Classification project aims to classify different recitations of the Quran using machine learning techniques. It involves preprocessing audio data, extracting features, training models, and evaluating their performance.

audio-classification classification-model islamic-studies librosa machine-learning python quran scikit-learn

Last synced: 20 Mar 2025

https://github.com/ranimeshehata/softmax-regression-on-mnist

A PyTorch-based project for classifying the MNIST dataset using Softmax Regression, including training, validation, results and visualization.

matplotlib mnist python3 pytorch scikit-learn softmax-regression torchvision

Last synced: 15 Apr 2026

https://github.com/shubhamsoni98/project_using_knn

This project applies the K-Nearest Neighbors (KNN) algorithm to predict iPhone purchases based on customer data. Using features like age, salary, and previous purchase behavior, the KNN model classifies customers into buyers and non-buyers.

anaconda analytics data data-science eda knn knn-classification machine-learning-algorithms predict project python scikit-learn tableau

Last synced: 03 Jan 2026

https://github.com/thaisgarcia/scikit-learn

I used supervised learning, specifically linear regression, to predict salaries based on the time devoted to studying each month. The trained model established a mathematical relationship between salary and study hours, adjusting its parameters during training.

pandas scikit-learn seaborn

Last synced: 08 May 2026

https://github.com/njorogepaul-moghul/house-price-predictions-kaggle-competition-

Built a predictive model for the Kaggle House Prices competition using feature engineering and LightGBM, achieving strong leaderboard performance.

data-science house-price-prediction-with-lightgbm kaggle-competition lightgbm machine-learning predicting-home-values-using-machine-learning random-forest scikit-learn

Last synced: 15 May 2026

https://github.com/sankoktas/bhi360-fall-detection

Fall detection system using Bosch BHI360 sensor data with time-series labeling, feature extraction, and machine learning (LOSO CV + Gradient Boosting).

accelerometer bhi360 bosch-sensors data-augmentation fall-detection feature-extraction gradient-boosting gyroscope human-activity-recognition label-studio loso-cross-validation machine-learning python scikit-learn sensor-data smote time-series

Last synced: 07 May 2026

https://github.com/sayan520/titanic-data-insights

Conducting data analysis on Kaggle's Titanic: Machine Learning from Disaster dataset using essential data wrangling, exploratory data analysis (EDA), and visualization techniques to uncover insights, identify patterns, and explore factors influencing passenger survival.

jupyter-notebook kaggle matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/ytalk/deep-learning

A repository dedicated to my learning journey and experimentation in deep learning. It contains various pipelines and implementations on different datasets, exploring models (MLPs, LSTMs, CNNs) and techniques (regression, classification, etc.) with a focus on TensorFlow and Keras.

data-science deep-learning keras machine-learning neural-networks pandas python scikit-learn tensorflow

Last synced: 30 Dec 2025

https://github.com/tnleite/credit-card-customer-clustering

This repository presents a credit card customer segmentation and prediction project. Using EDA, clustering (K-Means), and machine learning, the goal is to predict the group a new customer belongs to, supporting personalized marketing strategies.

classification-algorithm clustering-algorithm clustering-analysis data-science exploratory-data-analysis kmeans-clustering logistic-regression machine-learning-algorithms machine-learning-models matplotlib numpy scikit-learn seaborn

Last synced: 07 May 2026

https://github.com/rahulsm20/insurance-data

A data analytics project dealing with risk assessment and its effects in health insurance.

data-analysis data-analytics machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/lucasfranklinsilva/rnn-lstm

Failure prevention model for simulated turbines using recurrent neural networks

data-visualization deep-learning jupyter-notebook keras machine-learning neural-networks python scikit-learn

Last synced: 12 Apr 2026

https://github.com/senaayy/adhd-network-efficiency

🧠 End-to-end fMRI analysis pipeline comparing ADHD brain topology vs. Healthy Controls using Graph Theory (Global Efficiency & Clustering). Built with Nilearn, NetworkX, and Docker for reproducible neuroscience.

adhd bioinformatics brain-networks computational-neuroscience data-science docker fmri graph-theory network-analysis networkx neuroscience nilearn python scikit-learn

Last synced: 17 Jun 2026

https://github.com/chris-santiago/tsfeast

A collection of Scikit-Learn compatible time series transformers and tools.

data-science feature-engineering python scikit-learn time-series timeseries-features transformers

Last synced: 01 May 2026

https://github.com/theartificialdev/movie-recommendation-system

The primary goal of this project is to provide personalized movie recommendations to users based on their preferences and the characteristics of the movies. This is achieved through a multi-step process involving data preprocessing, text vectorization, and recommendation generation.

anaconda-environment data-science jupyter-notebook machine-learning movie-recommendation movies pandas python3 recommendation-system recommender-system scikit-learn scikitlearn-machine-learning

Last synced: 12 Apr 2026

https://github.com/sohang3112/stock-prediction-mlops

Stock Prediction MLOps group project for IIT Madras MTech (AI).

mlops python scikit-learn stock-price-prediction

Last synced: 20 Jun 2026

https://github.com/pders01/telarantula

📜 I made this for Uni. Was pretty fun. It scrapes telegram channels of known German tinfoil-hats and tries to detect the telegram channel based on the emojis that are used.

assignment python research scikit-learn scrapy

Last synced: 04 Aug 2025

https://github.com/thariniselvakumar/kidney-disease-prediction

This project is about kidney disease prediction using machine learning algorithms.

machine-learning matplotlib numpy pandas scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/jbizzlefoshizzle/linear-and-ridge-regression

The purpose of this project was to analyze and predict housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on.

linear-regression machine-learning machine-learning-algorithms regression-analysis regression-models ridge-regression scikit-learn scikitlearn-machine-learning train-test-split train-test-using-sklearn

Last synced: 16 May 2026

https://github.com/swat1563/recommendation-system

This repository features a recommendation system and analytics engine using datasets on users, organizations, contents, contacts, events, and recommendations. It includes data preprocessing, building a recommendation system, and creating visual reports with Power BI.

analytics data-analysis data-visualization engine kaggle numpy pandas powerbi powerbi-dashboards powerbi-desktop powerbi-reports python recommendation-engine recommendation-system recommender-systems scikit-learn scipy

Last synced: 07 Jan 2026

https://github.com/jpcano/boston_housing

Predicting Boston Housing Prices using supervised Machine Learning algorithms

cross-validation machine-learning numpy pandas python regression-models scikit-learn

Last synced: 12 Apr 2026

https://github.com/ccastleberry/sk-autobots

Custom data transformers using the scikit-learn API.

scikit-learn sklearn sklearn-api

Last synced: 08 Feb 2026

https://github.com/themihirmathur/soiligator

Soiligator is an advanced machine learning project designed to optimize irrigation management by predicting whether irrigation is necessary based on environmental and soil-related data.

auc-score logistic-regression machine-learning matplotlib numpy pandas python random-forest-classifier roc-curve scikit-learn seaborn standardscaler support-vector-machine

Last synced: 12 Apr 2026

https://github.com/daniel-furman/RecFeatureSelect

Feature selection functions (1) using the multi-collinearity matrix and recursively proceeding to a spearman threshold and (2) using Forward Stepwise Selection running on an ensemble sklearner (with options for HPO).

correlation-threshold machine-learning modeling multicollinearity recursion recursive-algorithm scikit-learn spearman-rho

Last synced: 09 Jul 2025

https://github.com/raghavendranhp/industrial_copper_modelling

Industrial Copper Modeling optimizes pricing decisions using advanced ML. Predict sales with accuracy, classify leads, and streamline decision-making.

classification-models copper decision-tree-classifier decision-tree-regression pickle-file predictive-modeling regression-models scikit-learn

Last synced: 16 May 2026

https://github.com/touhoue/oilpumpvibration

The project employs signal processing techniques like Hilbert transforms to extract amplitude envelopes and instantaneous frequencies, facilitating insights into the mechanical health and performance of the system.

python scikit-learn

Last synced: 07 May 2026

https://github.com/gititsid/visaverdict

An ML project to predict the possibility of US visa approval

classification python3 random-forest-classifier scikit-learn

Last synced: 03 Feb 2026

https://github.com/khanovico/energy-data-analysis

This is a cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I implemented a Docker image for running this app in cross-platform environments.

big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost

Last synced: 17 Feb 2026

https://github.com/lorenzorottigni/ml-universities

Machine learning Python bootcamp: K-means clustering with a public/private universities dataset

k-mean-clustering machine-learning numpy pandas python scikit-learn seaborn

Last synced: 05 Apr 2026

https://github.com/jprmaulion/bayesopt-gb-seismic-liquefaction-liq7

Bayesian-optimized gradient boosting for seismic liquefaction prediction with geographic stratified CV on the LIQ/7/2833 global database.

bayesian-optimization binary-classification gradient-boosting lightgbm liquefaction machine-learning python scikit-learn shap shear-wave-velocity soil-mechanics xgboost

Last synced: 29 May 2026

https://github.com/massimilianoviola/entity-matching-dblp-acm

Entity matching on the DBLP-ACM dataset

scikit-learn sentence-transformers

Last synced: 13 Jun 2026

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/aleksandarbuk/machine-learning

The Machine Learning Library repository provides a collection of scripts and tools leveraging Scikit-Learn, Pandas, and NumPy for various machine learning tasks and data analysis.

matplotlib numpy python scikit-learn tensorflow

Last synced: 16 Apr 2026

https://github.com/rririanto/thesis-projects

The computer science thesis project that I worked on when I was a student and was looking for a part time job

bag machine-learning python2 python27 scikit-learn surf

Last synced: 02 Feb 2026

https://github.com/armahdavi/data_pipeline_analytics_statistics_ml_pm_psd_residential_qff

Sharing all the data pipelines and processing code, statistical modelling, descriptive statistics, plot visualizations, and machine learning from Mahdavi & Siegel (2021) (Indoor Air). Project milestone: 2017 - 2020. Full-length article: https://onlinelibrary.wiley.com/doi/abs/10.1111/ina.12782

data-science data-visualization dust hvac indoor-air-quality jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats spyder spyder-python-ide statistics

Last synced: 11 Apr 2026

https://github.com/mohd-faizy/preprocess_ml

This repository hosts Python code that utilizes the Scikit-learn preprocessing API for data preprocessing. The code presents a comprehensive range of tools that handle missing data, scale data, encode categorical variables, and perform other functions.

data-science feature-engineering feature-engineering-algorithm feature-extraction feature-selection machine-learning outlier-detection preprocessing-data preprocessor scikit-learn

Last synced: 16 May 2026

https://github.com/namratha2301/bangalorehousepricepredictor

Predicting house prices in Bangalore based on key features of the house, such as number of rooms and size in square feet.

azure bashscript docker flake8 flask github-actions scikit-learn

Last synced: 12 Apr 2026

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 03 Apr 2025

https://github.com/filsan95/project-iot_malware_identification

This repository contains the code and data for a project that detects malware from IoT devices using a publish-subscribe model with Confluent and Databricks. The project streams IoT device data to Kafka, analyzes it, and detects malware using machine learning models such as Random Forest and Gradient Boosted Trees.

apache-kafka classification confluent databricks machine-learning-algorithms scikit-learn sql

Last synced: 16 Mar 2025

https://github.com/vishal-038/healthcare

The AI Healthcare System is a web-based application that integrates machine learning with Django to assist users in disease prediction, booking appointments with doctors, purchasing medicines, and managing lab tests.

django django-rest-framework pandas scikit-learn sqllite

Last synced: 05 May 2026

https://github.com/karimosman89/health-risk-assessment

Predict health risks based on patient data. Create a machine learning model that predicts health risks (like diabetes or heart disease) based on patient data. Help healthcare providers identify at-risk patients for early intervention.

ehr-data pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/gangula-karthik/bank-transaction-classification

Classifying bank transactions with precision—your first step towards smarter finance management 💳🤖📊

finance machine-learning nlp scikit-learn

Last synced: 09 Apr 2025

https://github.com/bilalm04/email-spam-classifier

A machine learning project that classifies emails as spam or not spam using Logistic Regression, with a deployable Flask API for real-time classification.

api flask jupyter-notebook machine-learning matplotlib nlp numpy pandas python scikit-learn

Last synced: 06 Mar 2026

https://github.com/abhay-rudatala/resume-analyzer

Intelligent Resume Analysis System using Machine Learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), SpaCy NER for information extraction, and interactive Streamlit web app with custom UI. Built with Python, Scikit-learn, and deployed on Streamlit Cloud.

classification machine-learning named-entity-recognition nlp portfolio-project python resume-analysis scikit-learn spacy streamlit

Last synced: 06 May 2026

https://github.com/otuemre/housepricingml

A machine learning project predicting house prices using regression models. Covers data preprocessing, feature engineering, and model comparison to achieve accurate results. Developed for a Kaggle competition, focusing on effective ML workflows and model interpretability.

eda encoding evaluation-metrics kaggle-competition lightgbm-regressor machine-learning matplotlib-pyplot neural-networks numpy pandas preprocessing python ridge-regression scikit-learn seaborn tensorflow xgboost-regression

Last synced: 13 Apr 2026

https://github.com/the-developer-306/fake-review-detector

This project is a machine learning-based review classification system that predicts whether a product review is GENUINE or FAKE. It preprocesses review text, analyzes sentiment, and uses numerical features like ratings and helpfulness to make predictions. The model is deployed via a Flask web application for user interaction.

classification flask logistic-regression machine-learning numpy pandas python renderdeploy scikit-learn sentiment-analysis

Last synced: 12 Apr 2026

https://github.com/henriquepmartins/ml-number-prediction

Number prediction using Logistic Regression

logistic-regression machine-learning scikit-learn

Last synced: 15 May 2026

https://github.com/boomerspine/selflearning_chatbot

Self-learning chatbot using Python

python scikit-learn

Last synced: 10 May 2026