An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/senaayy/adhd-network-efficiency

🧠 End-to-end fMRI analysis pipeline comparing ADHD brain topology vs. Healthy Controls using Graph Theory (Global Efficiency & Clustering). Built with Nilearn, NetworkX, and Docker for reproducible neuroscience.

adhd bioinformatics brain-networks computational-neuroscience data-science docker fmri graph-theory network-analysis networkx neuroscience nilearn python scikit-learn

Last synced: 17 Jun 2026

https://github.com/chris-santiago/tsfeast

A collection of Scikit-Learn compatible time series transformers and tools.

data-science feature-engineering python scikit-learn time-series timeseries-features transformers

Last synced: 01 May 2026

https://github.com/diiblo/la-poste-predictive-flux

Daily prediction of parcel flow in La Poste sorting centers. Complete pipeline: data generation, LightGBM modeling, orchestration via Airflow (Docker), PostgreSQL storage, and an interactive Streamlit dashboard. Project carried out in the Mastère 2 Data Engineering program at ECE Paris.

airflow docker postgresql scikit-learn streamlit

Last synced: 31 Jan 2026

https://github.com/emv271828/diabetes_cdc_uci_machine_learning

Second assessment for the Artificial Intelligence course at Universidade Federal Fluminense.

jupyter-notebook machine-learning pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/jbizzlefoshizzle/linear-and-ridge-regression

The purpose of this project was to analyze and predict housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on.

linear-regression machine-learning machine-learning-algorithms regression-analysis regression-models ridge-regression scikit-learn scikitlearn-machine-learning train-test-split train-test-using-sklearn

Last synced: 16 May 2026

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 15 Apr 2026

https://github.com/eljandoubi/genre_classification

Create an ML pipeline for Genre Classification using MLflow.

hydra machine-learning mlflow numpy pandas pandas-profiling pytest scikit-learn scipy wandb

Last synced: 11 Apr 2026

https://github.com/talapanenivarshithchowdary/asteroid-detection-ml

This project uses Machine Learning to detect and classify asteroids based on trajectory and size, aiding in Near-Earth Object detection and planetary defense.

classification data-science decision-trees jupyter-notebook knn logistic-regression machine-lea matplotlib numpy pandas pillow prediction python3 random-forest scikit-learn

Last synced: 11 Apr 2026

https://github.com/audy21/datacamp

Learning portfolio documenting my progress, while taking Data Analyst & Data Science certifications from DataCamp.

data-analysis data-science machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/swarnabhaghosh/house-price-prediction-model

Built an end-to-end regression pipeline to predict house prices using Linear Regression with automated preprocessing (PowerTransform, StandardScaling) via Scikit-learn's Pipeline and ColumnTransformer.

column-transformer linear-regression matplotlib-pyplot numpy pandas pipeline python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/oadultradeepfield/galaxy10-anomaly-detection

A public API and experimental PyTorch pipeline for anomaly detection in the Galaxy10 DECals dataset using ResNet50, autoencoders, and clustering techniques

flask google-cloud-run kaggle pytorch scikit-learn

Last synced: 05 Apr 2026

https://github.com/aksoni07/movie-recommendation

A hybrid movie recommendation system designed to deliver personalized and accurate suggestions by combining user preferences, item attributes, and collaborative patterns, ensuring a seamless and engaging experience.

clustering content-based-filtering data-analysis embeddings jupyter-notebook numpy ollaborative-filtering pandas personalization python recommendation-systems scikit-learn user-item-interactions

Last synced: 11 Apr 2026

https://github.com/perpendicooler/elementary-research-for-steamboat-willie-s-store-in-poland

Elementary research for a company opening a store in a city, using Gurobi and PuLP optimization.

christofides-algorithm gurobipy numpy pandas pulp python3 scikit-learn travelling-salesman-problem

Last synced: 05 Apr 2026

https://github.com/swat1563/recommendation-system

This repository features a recommendation system and analytics engine using datasets on users, organizations, contents, contacts, events, and recommendations. It includes data preprocessing, building a recommendation system, and creating visual reports with Power BI.

analytics data-analysis data-visualization engine kaggle numpy pandas powerbi powerbi-dashboards powerbi-desktop powerbi-reports python recommendation-engine recommendation-system recommender-systems scikit-learn scipy

Last synced: 07 Jan 2026

https://github.com/billy0402/python-machine-learning

A learning project from NTUB machine learning course.

ai course jupyter-notebook python scikit-learn tensorflow

Last synced: 05 Apr 2026

https://github.com/lorenzorottigni/ml-movies

Machine Learning python bootcamp: Recommender Systems on movies dataset

ipynb machine-learning numpy pandas python recommender-system scikit-learn seaborn

Last synced: 05 Apr 2026

https://github.com/allanreda/telco-customer-churn-predictor-app

A web-based machine learning application that predicts customer churn using a logistic regression model. Built with Scikit-Learn for model training, Gradio for the user interface, and deployed on Google Cloud App Engine. The app allows users to input customer data and receive predictions on churn risk to support business decision-making.

app-engine data-visualization deployment google-cloud gradio hyperparameter-tuning logistic-regression machine-learning numpy pandas scikit-learn

Last synced: 16 Apr 2026

https://github.com/dastogirrudro/machine-learning-and-deep-learning

This is my thesis project, which I completed at university. I used machine learning and deep learning, with an LSTM as the deep learning model. It can identify aggressive spam messages. I used pandas, scikit-learn, and several other frameworks, with Python as the programming language, and tried many algorithms to improve the accuracy of the project.

deep-learning lstm machine-learning numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/pramodyasahan/titanic-survival

This repository contains a machine learning project focused on predicting the survival of passengers on the Titanic. The project uses a Support Vector Regression (SVR) model from the sklearn library and involves data preprocessing and prediction.

data-preprocessing matplotlib numpy pandas python scikit-learn support-vector-regression

Last synced: 08 Apr 2026

https://github.com/andrewjmack/credit-risk-classification

Supervised learning model trained and evaluated on loan risk for potential use in the prediction of the creditworthiness of an applicant

banking loan-prediction-analysis machine-learning pandas python scikit-learn supervised-learning

Last synced: 11 Apr 2026

https://github.com/trimoyee-g/flipkart-reviews-sentiment-analysis

A RandomForestClassifier-based sentiment analysis model for efficient binary categorization of Flipkart reviews.

machine-learning matplotlib python random-forest-classifier scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/daniel-furman/RecFeatureSelect

Feature selection functions (1) using the multi-collinearity matrix and recursively proceeding to a spearman threshold and (2) using Forward Stepwise Selection running on an ensemble sklearner (with options for HPO).

correlation-threshold machine-learning modeling multicollinearity recursion recursive-algorithm scikit-learn spearman-rho

Last synced: 09 Jul 2025

https://github.com/vasu7052/spam-classifier

This is a machine learning project to detect whether a given sentence is spam or not, using Python and Keras.

keras keras-neural-networks python3 scikit-learn spam-classification tensorflow

Last synced: 11 Apr 2026

https://github.com/alexsolov28/ml_course

Course "Machine Learning Technology"

colab-notebooks jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 Apr 2025

https://github.com/jihoonerd/restricted-discriminant-analysis

RDA implementation compatible with Scikit-learn API

discriminant-analysis rda scikit-learn

Last synced: 22 Apr 2026

https://github.com/pranavgautam29/flight-price-prediction

The Flight Price Prediction project uses machine learning to forecast flight ticket prices based on historical data. Hosted on Streamlit Community Cloud and deployed via Streamlit, this application allows users to input flight details such as departure and arrival airports, travel dates, and class to receive accurate price predictions.

machine-learning prediction-model regression scikit-learn statistical-machine-learning streamlit

Last synced: 21 Feb 2026

https://github.com/moustafamohamed01/breast-cancer-prediction

A machine learning model built with PyTorch to predict if a tumor is malignant or benign using the Breast Cancer Dataset. The model uses a neural network to classify the data and shows how to train, evaluate, and visualize results.

ai data-science deep-learning machine-learning neural-network python pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/mohit1106/Fraud-Detection-In-Financial-Transactions

An anomaly detection system on 284,807 transactions, achieving an AUC of ~0.972 with CNNs and Autoencoders.

autoencoders cnn-model isolation-forest keras python scikit-learn tensorflow

Last synced: 17 Oct 2025

https://github.com/scikit-learn/pairwise-distances-reductions-asv-suite

A dedicated asv suite for scikit-learn private PairwiseDistancesReductions

asv benchmarks cython scikit-learn

Last synced: 18 Jan 2026

https://github.com/raghavendranhp/industrial_copper_modelling

Industrial Copper Modeling optimizes pricing decisions using advanced ML. Predict sales with accuracy, classify leads, and streamline decision-making.

classification-models copper decision-tree-classifier decision-tree-regression pickle-file predictive-modeling regression-models scikit-learn

Last synced: 16 May 2026

https://github.com/karimosman89/resume-screening

Screen resumes to identify the best candidates. Build a machine learning model that screens resumes and ranks candidates based on job descriptions. Streamline the hiring process for HR departments by automating candidate screening.

machine-learning-algorithms nlp-machine-learning nltk-python python scikit-learn spacy text-processing

Last synced: 29 Apr 2026

https://github.com/nikitalpopov/evotor_champ

solution for evotor data challenge

data-analysis data-science python scikit-learn

Last synced: 15 Apr 2026

https://github.com/ebadshabbir/decision_tree_algorithm

Decision Tree Classifier for Social Network Ads: a Python implementation of a Decision Tree Classifier to predict user purchasing behavior based on age and estimated salary. Includes feature scaling, model evaluation (confusion matrix and accuracy), and visualizations of decision boundaries for both training and test sets.

decision-tree-classifier jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/idaraabasiudoh/telco-churn-logistic-regression

A predictive model using logistic regression to identify customers likely to churn from a telecommunications company.

logistic-regression machine-learning python3 scikit-learn

Last synced: 01 Feb 2026

https://github.com/khanovico/energy-data-analysis

A cloud-based model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I implemented a Docker image for running this app across platforms.

big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost

Last synced: 17 Feb 2026

https://github.com/parbhat-cpp/suicidal-ml

A machine learning/NLP-based system to identify signs of suicidal ideation from user text inputs.

bash cicd classification docker fastapi githubactions jinja2 jupyter-notebook machine-learning natural-language-processing nlp numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/pramodyasahan/model-selection

This repository explores and compares different regression models for predicting continuous outcomes. This repository includes implementations and evaluations of five key regression models. The primary goal is to demonstrate how each model works, evaluate their performance using R-squared values, and guide users in selecting the best model.

machine-learning modelselection numpy pandas python regression scikit-learn

Last synced: 08 Mar 2025

https://github.com/coder5omkar/logistic-regression-customer-churn-prediction

This project uses Logistic Regression to predict customer churn in the telecom industry. To run, clone the repository, install dependencies, and run the Jupyter notebook for full analysis and predictions.

logistic-regression ml pandas scikit-learn seaborn statistics

Last synced: 20 Apr 2026

https://github.com/vhnegrisoli/machine-learning-linguagens-programacao

Data Science and Machine Learning project analyzing programming languages from 2004 to 2021

data-science jupyter-notebook machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/abdiasarsene/customer_segmentation_for_a_marketing_campaign

Use unsupervised learning techniques to segment a company’s customers into distinct groups in order to personalize marketing campaigns, and ultimately propose specific marketing strategies for each customer segment based on the insights obtained.

acp kmeans-clustering matplotlib pandas plotly python scikit-learn seaborn

Last synced: 08 Mar 2025

https://github.com/paulinhok14/property-insight-sample

Property Insight is an app that helps you identify promising real estate opportunities, leveraging AI models to estimate a property's fair value and compare it to current prices.

ai docker fastapi python pytorch real-estate scikit-learn streamlit

Last synced: 11 Apr 2026

https://github.com/jo-minseok/global-warming-100year

🌡️ ML prediction of global temperature, sea level, Arctic ice, and carbon through 2100 [Completed]

arima-model global-warming machine-learning matplotlib numpy pandas scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/anibalalpizar/python-machine-learning-example

This code reads and preprocesses a dataset for classification using pandas, numpy, matplotlib and scikit-learn. The dataset is split into three parts for training, validation and testing. The data is then scaled and optionally oversampled for balanced classes.

machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/nfordumass/hot-seat

Machine Learning Dashboard and Engine for Predicting NFL Coach Firings

astro machine-learning react scikit-learn supabase typescript

Last synced: 09 Mar 2025

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to forecast sales based on TV, Newspaper, and Online Advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 07 Feb 2026

https://github.com/arnoldchrisoduor1/machines

Testing the limits of machines

pytorch scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/capsuleismail/drybeanuci

Data Science Project with Model comparison.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 18 May 2026

https://github.com/ayushtiwari134/machine_learning_models

A repo where I upload all the models that I train during my journey of learning Machine Learning from scratch

linear-regression logistic-regression machinelearning matplotlib numpy pandas python random-forest scikit-learn

Last synced: 11 Apr 2026

https://github.com/broodhoney/titanic-ml-from-disaster

This repository contains my analysis and solutions for the Titanic: Machine Learning from Disaster competition on Kaggle. The notebook explores the dataset, performs extensive Exploratory Data Analysis (EDA), applies feature engineering techniques, and builds predictive models to determine survival outcomes based on passenger data

machine-learning numpy pandas python scikit-learn scikitlearn-machine-learning

Last synced: 11 Apr 2026

https://github.com/shru924/ecommerce_customer_behavior_analysis

A machine learning project that analyzes and segments e-commerce customers based on behavior patterns using Python, Random Forest, and data visualization.

customer-segmentation data-analysis jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/ansh-info/industrial-scale-penicillin-simulation

Optimizing industrial-scale penicillin production using machine learning and data analysis.

jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/aerojam95/math70076-data-science-cw2

This repository presents the second coursework for the MATH70076 Data Science module at Imperial College London, where the project showcases different machine and deep learning models for image classification

data-science deep-learning machine-learning python3 pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/swetshaw/machine-learning-a-z

It contains all tutorials based on Udemy course Machine Learning A-Z.

machine-learning python scikit-learn udemy-machine-learning

Last synced: 07 Apr 2026

https://github.com/freakwill/nb-combination

ensemble classifier with naive bayes combination

bayes-classifier python scikit-learn

Last synced: 20 May 2026

https://github.com/dllllb/ds-pipeline

Data Science model pipeline based on SciKit-Learn Estimator API

data-science machine-learning python scikit-learn

Last synced: 16 Apr 2026

https://github.com/amiriiw/text_classification

Welcome to the Text Classification Project! This project is designed to train a model for classifying texts based on their emotional content and then using it to categorize new texts into corresponding emotional categories.

keras numpy pandas pickle scikit-learn tensorflow text-classification

Last synced: 20 Jan 2026

https://github.com/dinuka-rp/python-machine-learning

This repository contains the projects that I followed to learn Machine Learning with Python

machine-learning python scikit-learn

Last synced: 11 Apr 2026

https://github.com/rririanto/thesis-projects

The computer science thesis project that I worked on when I was a student and was looking for a part time job

bag machine-learning python2 python27 scikit-learn surf

Last synced: 02 Feb 2026

https://github.com/vyjayanthipolapragada/fraud_detection_creditcard

Detecting fraudulent credit card transactions by training a Decision Tree model using Scikit-learn and SnapML

classification-model data-preprocessing decision-tree-classifier kaggle-dataset machine-learning numpy pandas python scikit-learn snapml time tree-model

Last synced: 11 Apr 2026

https://github.com/infinitode/scikit-learn-decisiontreeclassifier-updater

An open-source tool to convert older Scikit-learn DecisionTreeClassifier models to the newer version.

ai classifier cli converter decisiontree python scikit-learn sklearn tools

Last synced: 31 Mar 2025

https://github.com/dharma-acha/resnet18_imageclassification_cnn

In this part of the project, we implement ResNet-18 from scratch using PyTorch and train it on an image dataset to achieve over 75% accuracy. We apply techniques to prevent overfitting and optimize performance, aiming for an accuracy of 80% or higher.

matplotlib numpy python3 pytorch scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/max00358/sign_language_detection

A sign language detector that recognizes the ASL (American Sign Language) alphabet

mediapipe opencv scikit-learn

Last synced: 09 Feb 2026

https://github.com/mohd-faizy/preprocess_ml

This repository hosts Python code that utilizes the Scikit-learn preprocessing API for data preprocessing. The code presents a comprehensive range of tools that handle missing data, scale data, encode categorical variables, and perform other functions.

data-science feature-engineering feature-engineering-algorithm feature-extraction feature-selection machine-learning outlier-detection preprocessing-data preprocessor scikit-learn

Last synced: 16 May 2026

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 26 Feb 2026

https://github.com/sbera01/credit-card-approval-predictor

End-to-end Machine Learning project to predict credit card approval decisions using real-world financial features. Includes EDA, model training, and deployment-ready architecture

credit-card-approval-prediction data-analysis machine-learning python scikit-learn streamlit

Last synced: 24 Dec 2025

https://github.com/mborrillo/ranking-ciudades-espana

End-to-end multi-criteria analysis system that evaluates 50 Spanish cities on quality of life using official data

business-intelligence data-analysis multi-criteria-decision-analysis pandas python3 quality-of-life ranking-system scikit-learn scoring-models

Last synced: 13 Jan 2026