An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/tnleite/real-estate-opportunities-analysis

This repository presents an analysis of opportunities in the real estate market, combining time series, clustering, and forecasting to identify states with the greatest growth potential and guide efficient expansion strategies.

catboostregressor cluster-analysis data-science kmeans-clustering lightgbm-regressor machine-learning-algorithms numpy regression-models scikit-learn xgboost-regression

Last synced: 10 May 2026

https://github.com/ankur-krgarg/credit-risk

Predict credit risk using machine learning (LogReg, Random Forest). Built a clean pipeline with EDA, modeling, and visualizations.

classification credit-risk-analysis data-science eda machine-learning portfolio python remote-read scikit-learn

Last synced: 10 Jun 2026

https://github.com/alphacrypto246/student-learning-style-prediction

An interactive web application built with Streamlit that predicts a student's preferred learning style (visual, auditory, or kinesthetic) using machine learning, aiding educators in personalizing teaching strategies.

machine-learning scikit-learn scikitlearn-machine-learning streamlit

Last synced: 11 May 2026

https://github.com/mpolinowski/tstochastic-neighbor-embedding

Improve Data Quality by discarding non-correlating, noisy Dimensions

matplotlib-pyplot python scikit-learn t-sne

Last synced: 11 May 2026

https://github.com/bheemisme/brain-tumor-classification

Brain tumor classification using machine learning

deep-learning machine-learning pytorch scikit-learn xgboost

Last synced: 11 May 2026

https://github.com/matheusadc/valorizai

A project that aims to predict house prices.

jupyter-notebook pandas scikit-learn

Last synced: 11 May 2026

https://github.com/johannesvc/data-science-portfolio

A curated portfolio of applied data science projects focused on machine learning, NLP, and social impact.

academic-portfolio data-science deep-learning keras machine-learning media-bias nlp pandas scikit-learn

Last synced: 11 May 2026

https://github.com/sharvesh1401/inverse-design-patch-antenna

A machine learning approach to the inverse design of microstrip patch antennas by predicting optimal physical dimensions from desired performance metrics.

antenna-design deep-learning engineering-project gradio jupyter-notebook machine-learning patch-antenna python regression-model scikit-learn

Last synced: 11 May 2026

https://github.com/g-eoj/kaggle-rotten-tomatoes

Movie review sentiment analysis with the Stanford parsed Rotten Tomatoes dataset.

cross-validation nlp nltk rotten-tomatoes scikit-learn

Last synced: 12 May 2026

https://github.com/msikorski93/heart-failure-prediction

This repository performs binary classification based on respondents' collected features (age, cholesterol level, fasting blood sugar, thallium stress test results, etc.).

classification knn-classifier logistic-regression random-forest-classifier roc-curves scikit-learn svm-classifier

Last synced: 13 May 2026

https://github.com/mateusoliveira30/house-prices

This project was developed for the Kaggle competition "House Prices - Advanced Regression Techniques." The goal is to predict house sale prices using advanced regression techniques, including feature engineering, Random Forests, and Gradient Boosting.

kaggle-competition machine-learning scikit-learn

Last synced: 13 May 2026

https://github.com/johanneswiesner/skplot

A python package for extracting, plotting and reporting information from one or multiple sklearn classification & prediction pipelines.

plotting python scikit-learn sklearn visualization

Last synced: 14 May 2026

https://github.com/breezy-codes/machine-learning-for-spam-sms

Real-time SMS spam detection using ML models in simulated cellular networks. Compares 4 algorithms with comprehensive performance analysis.

logistic-regression machine-learning naive-bayes network-simulation random-forest research scikit-learn spam-sms spam-sms-detection svm telecommunication

Last synced: 14 May 2026

https://github.com/fulviofavilla/cvd-prediction-ml

Comparative ML analysis for CVD prediction. Winner of the 2023 HPCC Systems Poster Competition.

data-science ecl healthcare hpcc-systems machine-learning pandas python scikit-learn

Last synced: 11 Jun 2026

https://github.com/muditnautiyal-21/mudra-ml

Glass-box autonomous data science in Python. Profiles data, builds leakage-safe pipelines, recommends and tunes models, and logs every decision behind the result.

automl classification clustering data-science explainable machine-learning pipeline python regression scikit-learn

Last synced: 12 Jun 2026

https://github.com/nayutalienx/osu-skill-predictor

ML-powered osu! pass probability & accuracy predictor with real-time overlay. Standalone Windows bundle available.

fastapi machine-learning osu overlay predictor scikit-learn

Last synced: 14 Jun 2026

https://github.com/tomdewildt/interactive-and-explainable-ai-design

Code for The Interactive And Explainable AI Design course of my master's degree

jupyter lime numpy pandas python scikit-learn shap

Last synced: 18 Jun 2026

https://github.com/jayemscript/lab-to-code

A complete Python learning roadmap for scientists and researchers — covering data science, biology, chemistry, physics, and mathematics with curated libraries, tools, and resources.

bioinformatics chemistry data-science jupyter-notebook machine-learning mathematics numpy pandas physics python research roadmap scientific-computing scikit-learn

Last synced: 19 Jun 2026

https://github.com/nafis2508/maternal-neonatal-outcome-prediction

Predicting Maternal and Neonatal Birth Outcomes using Machine Learning on 61,018 Healthcare Records from Kenya and Uganda

data-science decision-tree eda healthcare-ai healthcare-analytics machine-learning maternal-health predictive-modeling python random-forest scikit-learn

Last synced: 24 Jun 2026

https://github.com/imosudi/model_training

Breast Cancer Diagnosis: Logistic Regression, Random Forest, k-NN, and Decision Tree classifier models with feature importance analysis. Includes data exploration, train/test splitting, feature scaling, cross-validation, and model evaluation metrics with confusion matrices and decision boundary visualisation.

classification data-science decision-tree educational feature-importance k-nearest-neighbors linear-regression machine-learning model-evaluation python3 random-forest scikit-learn

Last synced: 25 Jun 2026

https://github.com/aishwaryagm1999/insurance-workflow-management

This project is an Insurance Workflow Management System designed to streamline policy management, claims processing, and fraud detection. It includes user account management, customer feedback analysis via NLP, alert notifications through SMS, and a fraud detection model, providing a secure, efficient solution for insurance operations.

css fraud-detection html json labelimg machine-learning natural-language-processing nlp opencv python qr-code-generator random-forest-classifier scikit-learn sms-notification tensorflow textblob twilio user-interface

Last synced: 26 Dec 2025

https://github.com/manojkp08/student-performance-analysis

The Student Performance Analyzer is your go-to solution for understanding and improving student performance. By blending the power of machine learning with interactive visualizations, this tool provides educators and learners with personalized insights into learning styles, performance gaps, and actionable improvements.

machine-learning numpy pandas python requests scikit-learn streamlit

Last synced: 12 Apr 2026

https://github.com/brianlesko/maze-runner

Developed a Python-based maze-crawling application using a PS5 controller interface. This project highlights skills in software-hardware integration and low-code UI design, demonstrating expertise ideal for advanced software engineering.

communication dualsense engineer engineering hacking hardware hardware-hacking interface low-code-ui mechanical-engineer mechanical-engineering protocol ps5 python robotics-engineer scikit-learn software sony streamlit ui

Last synced: 12 Apr 2026

https://github.com/arrhythmia-detection/arrhythmiadetectionmodels

This repository contains the ML codebase developed during the CSE713 group project

arrhythmia-detection deep-neural-nets esp32-s3 scikit-learn tensorflow tensorflow-lite tinyml

Last synced: 12 Apr 2026

https://github.com/pders01/telarantula

📜 I made this for Uni. Was pretty fun. It scrapes telegram channels of known German tinfoil-hats and tries to detect the telegram channel based on the emojis that are used.

assignment python research scikit-learn scrapy

Last synced: 04 Aug 2025

https://github.com/jprmaulion/bayesopt-gb-seismic-liquefaction-liq7

Bayesian-optimized gradient boosting for seismic liquefaction prediction with geographic stratified CV on the LIQ/7/2833 global database.

bayesian-optimization binary-classification gradient-boosting lightgbm liquefaction machine-learning python scikit-learn shap shear-wave-velocity soil-mechanics xgboost

Last synced: 29 May 2026

https://github.com/massimilianoviola/entity-matching-dblp-acm

Entity matching on the DBLP-ACM dataset

scikit-learn sentence-transformers

Last synced: 13 Jun 2026

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 03 Apr 2025

https://github.com/gangula-karthik/bank-transaction-classification

Classifying bank transactions with precision—your first step towards smarter finance management 💳🤖📊

finance machine-learning nlp scikit-learn

Last synced: 09 Apr 2025

https://github.com/otuemre/housepricingml

A machine learning project predicting house prices using regression models. Covers data preprocessing, feature engineering, and model comparison to achieve accurate results. Developed for a Kaggle competition, focusing on effective ML workflows and model interpretability.

eda encoding evaluation-metrics kaggle-competition lightgbm-regressor machine-learning matplotlib-pyplot neural-networks numpy pandas preprocessing python ridge-regression scikit-learn seaborn tensorflow xgboost-regression

Last synced: 13 Apr 2026

https://github.com/hvalfangst/azure-functions-pandas

Azure Functions for ETL operations using Pandas. Uploaded CSV files trigger data processing, calculating correlations and storing results in a JSON file. Automated deployment via GitHub Actions and Terraform.

az-204 azure azure-functions azure-functions-python pandas python scikit-learn terraform

Last synced: 12 Apr 2026

https://github.com/hrolive/recommendation-systems-ibm

Analyze the interactions that users have with articles on the IBM Watson Studio platform and make recommendations to them about new articles, using various recommendation engines.

machine-learning natural-language-processing pandas python recomendation-system scikit-learn

Last synced: 12 Apr 2026

https://github.com/adam-maz/virtual_screening

Within this repository I present scripts that can be helpful during virtual screening in drug design & development.

clusterization jupyter-notebook k-means-clustering maestro-schrodinger medicinal-chemistry molecular-fingerprints pandas python rdkit scikit-learn scoring-functions virtual-screening

Last synced: 04 May 2026

https://github.com/santiago-giordano/datascienceproject

Data Science Course Project: Causes of death around the world

apis jupyter-notebook matplotlib pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/purcellcjp/credit-risk-classification

This project used Python and scikit-learn to train and evaluate a machine learning model based on loan risk.

machine-learning numpy pandas-dataframe python scikit-learn

Last synced: 12 Apr 2026

https://github.com/sravyatogarla/movie-recommendation-system

A complete Movie Recommendation System project implementing Popularity-Based, Content-Based, and Collaborative Filtering models using the MovieLens dataset. Built with Python, Pandas, and Plotly, featuring interactive inputs and visualizations.

capstone-project collaborative-filtering content-based-filtering data-science data-visualization edureka jupyter-notebook machine-learning movie-recomendation-system movielens pandas popularity-based-filtering python recommender-system scikit-learn sql

Last synced: 13 Apr 2026

https://github.com/vikneshsrv24/customer-segmentation

Segmentation of customers based on purchasing patterns for targeted marketing.

jupyter-notebook matplotlib pandas python scikit-learn

Last synced: 13 Apr 2026

https://github.com/smaddanki/data-science

Code blocks, algorithms, and research snippets in Data Science, Machine Learning, AI & Quant Finance.

deep-learning machine-learning pytorch scikit-learn spark

Last synced: 13 Apr 2026

https://github.com/nurulashraf/polynomial-regression-manufacturing

A Python project implementing polynomial regression to analyse and predict manufacturing-related data. Features include data preprocessing, model training, and visualisation of results. Ideal for exploring machine learning applications in manufacturing process optimisation.

data-analysis data-visualization machine-learning manufacturing polynomial-regression predictive-modeling process-optimization python regression-models scikit-learn

Last synced: 16 Apr 2026

https://github.com/snigdho8869/language-detection-ai

Detect 18+ languages instantly using machine learning (BERT, LSTM, SVM) and NLP. Includes a Flask web app for real-time predictions, trained models, and detailed notebooks.

artificial-intelligence cnn deep-learning flask gru keras language-detection lstm machine-learning naive-bayes-classifier natural-language-processing nlp nltk python scikit-learn svm tensorflow text-classification web-app web-development

Last synced: 13 Apr 2026

https://github.com/divakarkumarp/pneumonia-detection

Deep learning (DL) model is a pneumonia fighter! Trained on chest X-ray images, it analyzes patterns to detect the lung infection. Imagine a digital doctor scrutinizing the X-ray, pinpointing areas that might be pneumonia. The model outputs a probability score, helping doctors confirm or rule out the illness.

cnn deep-learning docker keras python scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/laoluadewoye/skloverlay

This repository is the official location of the SKLOverlay Project. It holds everything used for the package on PyPI, including source files.

classification classification-algorithm data-science data-wrangling evaluation-metrics excel graphics graphs machine-learning machine-learning-algorithms matplotlib modeling pandas preprocessing scikit-learn

Last synced: 22 Feb 2026

https://github.com/andrewobwocha/titanicsurvival

🚢 End-to-end Python pipeline for Titanic survival classification. Demonstrates EDA, preprocessing, feature engineering, and Logistic Regression evaluation using Scikit-learn.

classification data-preprocessing data-visualization exploratory-data-analysis feature-engineering machine-learning pandas python scikit-learn titanic

Last synced: 13 Jun 2025

https://github.com/armahdavi/data_analytics_statistics_plotting_pm_airborne_sampling

All code for data pipeline processing, statistical modelling, descriptive statistics, and plot visualizations from the airborne phase of Mahdavi et al. (2021) (Environmental Pollution). Project milestone: 2018 - 2021

data-science data-visualization machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats statistics

Last synced: 13 Apr 2026

https://github.com/pratyush905/farecast-nyc-taxifare-predictor

Machine learning models to predict NYC taxi fares based on a given dataset

jupiter-notebook kaggle machine-learning matplotlib numpy python regression-models scikit-learn

Last synced: 13 Apr 2026

https://github.com/mahsayedsalem/models_utils

Writing reusable, clean machine learning code to make my life easier.

deep-learning keras keras-tensorflow machine-learning python3 scikit-learn tensorflow

Last synced: 13 Apr 2026

https://github.com/phonhay103/diabetes_svm

Diabetes Classification with SVM

classification scikit-learn streamlit svm

Last synced: 08 May 2026

https://github.com/zvdy/movie_recommendation

Movie Recommendation Search Engine using Jupyter Notebooks, Pandas, NumPy, scikit-learn, and ipywidgets

data-science jupyter-notebook machine-learning numpy pandas python scikit-learn

Last synced: 13 Apr 2026

https://github.com/johnnixon6972/cirrhosis-outcomes-prediction

This project leverages advanced machine learning techniques to predict outcomes for patients suffering from cirrhosis. Using a comprehensive dataset from a Mayo Clinic study, it explores various data imputation methods and class balancing techniques to enhance prediction accuracy.

ai algorithms analytics artificial-intelligence machine-learning ml pandas python3 scikit-learn

Last synced: 13 Apr 2026

https://github.com/khushi130404/placemetrix

A machine learning-based placement prediction app using IQ and CGPA as inputs. Built with Python in Jupyter Notebook, leveraging scikit-learn, pandas, and matplotlib.

jupyter-notebook machine-learning matplotlib python scikit-learn

Last synced: 13 Apr 2026

https://github.com/joewlos/fantasy_football_monte_carlo_draft_simulator

Monte Carlo Fantasy Football Draft Simulator Featuring FastAPI, NextUI, and ODMantic

fantasy-football monte-carlo nextjs nextui odmantic pydantic python scikit-learn

Last synced: 13 Apr 2026

https://github.com/no-country-simulation/s16-21-n-data-bi

COVID-19 analysis: insights into the evolution of the pandemic and its impact on 5 South American countries.

eda etl machine-learning matplotlib pandas powerbi python scikit-learn seabron streamlit

Last synced: 28 Apr 2025

https://github.com/santoshn86/dlp-ev-system-for-pa-optimization

This system is a game-changer, enabling smarter energy management through predictive insights and personalized optimization strategies.

aiml django flask keras pytorch scikit-learn tensorflow typescript

Last synced: 13 Apr 2026

https://github.com/1401dev/customer-lifetime-value-prediction

A data science project leveraging Python and Scikit-Learn to build predictive models that estimate customer lifetime value (CLV). Includes data cleaning, feature engineering, and model selection to identify key drivers of CLV, supporting strategic decision-making in customer retention and marketing.

clv clv-analysis customer-retention data-analysis dataprocessing feature-engineering machine-learning marketing-analytics predictive-modeling python regression-analysis scikit-learn

Last synced: 06 May 2026

https://github.com/pramodyasahan/spaceship-titanic

This repository features a machine learning model designed to predict whether passengers of a space travel company are likely to be transported. The model employs CatBoostClassifier, a machine learning algorithm known for handling categorical data effectively.

machine-learning numpy pandas python scikit-learn

Last synced: 13 Apr 2026

https://github.com/muscaanmnmnm/breast-cancer-detector

A predictive model for breast cancer detection using K-Nearest Neighbors, demonstrating the impact of feature scaling on model performance and recall.

breast-cancer-wisconsin data-science feature-scaling jupyter-notebook knn-classification machine-learning pandas-dataframe python-3 scikit-learn

Last synced: 06 Sep 2025

https://github.com/grandechowhiskey/fcc-data_analysis-projects

A collection of projects completed as part of the FreeCodeCamp "Data Analysis with Python" certification. These projects cover statistical calculations, data visualization, and trend analysis using real-world datasets.

data-analysis data-visualization matplotlib pandas python3 scikit-learn seaborn

Last synced: 01 May 2026

https://github.com/thinker84/real-time-stock-price-prediction-and-market-analysis-using-machine-learning

Real-time stock price prediction app using LSTM, Streamlit, and historical data (2010–2023). Forecasts next 10 days & visualizes trends.

data-science django lstm machine-learning numpy pandas pandas-datareader scikit-learn stock-market stock-price-prediction stooq streamlit yahoo-finance yahoo-finance-api

Last synced: 13 Jul 2025

https://github.com/pksvv/machinelearning_svm

Various implementations of Support Vector Machine Algo

machine-learning python scikit-learn support-vector-machine

Last synced: 04 May 2026

https://github.com/dharma-acha/imageclassification

This project is an interactive Streamlit web application using the VGG-13 model to classify images from the CIFAR-10 dataset. Users can upload images to receive real-time predictions and visual explanations of the model's decisions. The goal is to accurately classify images into one of the ten CIFAR-10 classes: airplanes, automobiles, birds, cats,

colab-notebook matplotlib numpy pandas python3 pytorch scikit-learn seaborn streamlit

Last synced: 13 Apr 2026

https://github.com/mmerlyn/analysis-of-tomato-prices

Forecasting tomato prices in Karnataka using machine learning to help farmers make better crop planning and selling decisions.

css flask html matplotlib numpy pandas python scikit-learn seaborn

Last synced: 06 Jul 2025

https://github.com/lukacerr/lovelytics

Lovelytics technical task for AI engineer position

ai-agents deepagents langchain ml python scikit-learn

Last synced: 31 May 2026

https://github.com/imehranasgari/mlflow_starter

This project is a hands-on guide to the complete end-to-end MLflow workflow, designed as an educational resource. It demonstrates how MLflow is used in practice for experiment tracking, model versioning, and ensuring a reproducible MLOps lifecycle, focusing on the methodology and best practices rather than high model accuracy.

data-science experiment-tracking mlflow mlops model-registry python scikit-learn

Last synced: 11 May 2026

https://github.com/njorogepaul-moghul/house-price-predictions-kaggle-competition-

Built a predictive model for the Kaggle House Prices competition using feature engineering and LightGBM, achieving strong leaderboard performance.

data-science house-price-prediction-with-lightgbm kaggle-competition lightgbm machine-learning predicting-home-values-using-machine-learning random-forest scikit-learn

Last synced: 15 May 2026

https://github.com/blue-catblues/tieba-integratedanalysis

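Python final course project: a comprehensive data analysis of Baidu Tieba, combining web crawling (scrapy), statistical analysis (pandas), visualization (matplotlib), and machine learning classification (scikit-learn).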

matplotlib nlp-machine-learning pandas python scikit-learn scrapy seaborn

Last synced: 05 Oct 2025

https://github.com/inesruizblach/data-science-project

A data science project exploring Portuguese "Vinho Verde" wine quality prediction. Features EDA, feature engineering, ML models, and evaluation using Python, pandas, scikit-learn, and visualization tools.

binary-classification classification data-science exploratory-data-analysis feature-engineering imbalanced-learn jupyter-notebook machine-learning model-evaluation pandas regression scikit-learn seaborn uci-dataset wine-quality

Last synced: 09 May 2026

https://github.com/josepablodmg/python--linear-regression---housing-exercise

A predictive analysis exploring the relationship between household characteristics and median income in California. Using linear regression, the project investigates whether blocks with fewer households correspond to higher median incomes.

california data-analysis data-science exploratory-data-analysis housing-data linear-regression machine-learning python regression scikit-learn statistics visualization

Last synced: 05 Oct 2025

https://github.com/dearabhin/girlfriend-predictor

Using machine learning to solve the ultimate college classification problem. A fun project applying Python and Logistic Regression to predict relationship outcomes based on a (hilariously) synthetic dataset. 📊❤️

classification data-science fun-project google-colab jyputer-notebook jypyternotebook logistic-regression machine-learning pandas python scikit-learn

Last synced: 06 Oct 2025

https://github.com/muellerconstantin/house-prices

Data analysis about house prices in Ames (Iowa) with advanced regression techniques.

dvc jupyter-notebook python python3 scikit-learn

Last synced: 14 Apr 2026

https://github.com/jyablonski/nba_elt_mlflow

ML Pipeline for NBA ELT Project

python scikit-learn

Last synced: 17 Jan 2026

https://github.com/r-gg/ml-37

Amazon Reviews ~ Sentiment analysis evaluation: fine-tuned BERT vs LSTM. (+ Extensive Data Mining & Visualization)

bert deep-learning ipynb-jupyter-notebook lstm machine-learning python scikit-learn uni-project

Last synced: 05 Feb 2026

https://github.com/workwithchaimaa/codealpha_diseaseprediction

Complete ML pipeline for binary classification to predict heart disease. Includes data preprocessing, model comparison (Logistic Regression, RF), hyperparameter tuning, and feature importance analysis.

classification heart-disease machine-learning python random-forest scikit-learn

Last synced: 08 Oct 2025

https://github.com/manjotkaurgill/agritech

Enter details of your soil and weather, and find the most suitable crop for farming. With our advanced AI system, you can make informed decisions and optimize your agricultural practices.

flask generative-ai insight-generation machine-learning matplotlib mongodb nextjs numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/himanshkr03/loan_default_prediction_using_machine_learning

This repository contains a Python-based project that uses machine learning to predict loan defaults. It explores data preprocessing, feature engineering, and model training techniques to build a predictive model for assessing loan risk.

data-science finance loan-default-prediction machine-learning pandas prediction-model python risk-assessment scikit-learn

Last synced: 14 Apr 2026

https://github.com/sharvesh1401/battsense

BattSense is a machine learning project focused on predicting the State of Health (SOH) of lithium-ion batteries using operational parameters such as voltage, current, temperature, and capacity. The model enables accurate, data-driven diagnostics for battery performance monitoring in electric vehicles and portable devices.

battery-diagnostics battery-health battery-health-prediction battery-soh data-analysis electric-vehicles energy-storage machine-learning predictive-maintenance python regression scikit-learn

Last synced: 07 May 2026