An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/luthfiwulandari/machine-learning-breast-cancer

This project is a simple application that uses logistic regression to detect breast cancer. It classifies tumors as either malignant or benign based on the dataset provided by Scikit-learn.

datascience jupyter logistic-regression machine-learning python scikit-learn

Last synced: 01 May 2026
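A minimal sketch of the approach described for this entry: logistic regression on the breast cancer dataset bundled with scikit-learn. This is not the repository's code; the train/test split, the scaling step, and `max_iter` are assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Dataset shipped with scikit-learn; target_names are "malignant" and "benign".
data = load_breast_cancer()

# Assumed split: 80% train, 20% test, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, stratify=data.target, random_state=0
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
# Map the predicted class indices back to readable labels.
labels = [str(data.target_names[i]) for i in model.predict(X_test[:3])]
print(f"test accuracy: {accuracy:.3f}", labels)
```

Scaling is included because the dataset's features span very different ranges, which can slow the solver's convergence.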

https://github.com/jlee9503/medical-readmission

Conduct an analysis of medical readmission status using hospital patient data and the Social Determinants of Health dataset. Identify key factors influencing readmission rates to provide insights for improving healthcare outcomes.

python random-forest-regression scikit-learn tableau

Last synced: 01 May 2026

https://github.com/dhruvv1402/spam-detection-python-

This project is a Spam Detection System built using Python. It classifies SMS messages as spam or ham (not spam) using machine learning techniques.

countvectorizer kaggle-dataset nlp-machine-learning nltk numpy pandas python scikit-learn supervised-machine-learning tf-idf

Last synced: 01 May 2026

https://github.com/danishzulfiqar/language-detection-nlp-model

This machine learning model is designed to accurately detect and classify text in 18 languages using NLP

fastapi jupyter-notebook machine-learning natural-language-processing scikit-learn

Last synced: 01 May 2026

https://github.com/anastasiaschmidt1/sqli-detection-ml

University project: detecting SQL injection attacks with machine learning (SVM model)

bht-berlin machine-learning scikit-learn sqli svm

Last synced: 02 May 2026

https://github.com/maxwelllzh/linearizer

Linearizing parameters for linear regression

data-analysis machine-learning scikit-learn

Last synced: 02 May 2026

https://github.com/luizassimoes/sklearn-kaggle-titanic

This repository was created to store all the code for tackling the Titanic challenge on Kaggle.

kaggle machine-learning scikit-learn

Last synced: 02 May 2026

https://github.com/dmschauer/aws-sagemaker-deployment-test

I ran a simple test to see how deploying a machine learning model on AWS SageMaker, and thereby turning it into an API, works. Since scikit-learn models require fewer dependencies than, e.g., TensorFlow models, I used them for this test, following a tutorial.

aws boto3 python sagemaker scikit-learn

Last synced: 02 May 2026

https://github.com/sundanc/btcprediction

Predict Bitcoin prices based on historical data using machine learning techniques

bitcoin-prediction keras machine-learning pandas python python3 scikit-learn scikitlearn-machine-learning

Last synced: 02 May 2026

https://github.com/bishopce16/cryptocurrencies

An analysis of a cryptocurrency dataset using unsupervised machine learning, the PCA algorithm, and K-means clustering.

hvplot jupyter-notebook pandas plotly python scikit-learn unsupervised-machine-learning visual-studio-code

Last synced: 02 May 2026

https://github.com/pierrekieffer/datapreprocessing

Custom data preprocessing library made for machine learning

data-preparation data-preprocessing machine-learning preprocessing scikit-learn

Last synced: 02 May 2026

https://github.com/moritzkoerber/data_science_posts

This repository hosts the code for my data science related blog posts.

hyperparameter-tuning machine-learning pipeline python scikit-learn

Last synced: 03 May 2026

https://github.com/insane-group/scikit-learn-template

Generic template to bootstrap your scikit-learn project

hydra scikit-learn template

Last synced: 03 May 2026

https://github.com/viniciusds2020/ml_pycaret_classificacao

A system for preprocessing and training machine learning models using PyCaret. A low-code methodology for MLOps processes.

machine-learning mlops preprocessing pycaret python scikit-learn

Last synced: 03 May 2026

https://github.com/rohitinu6/tesla-price-prediction

A machine learning project that predicts future stock price movements using Logistic Regression, SVC, and XGBoost with engineered financial features.

data-analysis data-visualization feature-engineering financial-analysis logistic-regression machine-learning matplotlib python scikit-learn seaborn stock-market stock-price-prediction support-vector-machine time-series xgboost

Last synced: 03 May 2026

https://github.com/fandredev/ml-my-guide

My own notes on ML/DS using pandas, matplotlib, NumPy, and scikit-learn

anaconda matplotlib numpy pandas plotly scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/alessandromonolo/fraud-detection-binary-classification-model

This project builds a machine learning model to classify fraudulent clients using a banking dataset. Data preprocessing, statistical analysis, and feature selection were performed before training KNN and Random Forest Classifier. Model performance was evaluated using accuracy, precision, recall, and F1-score.

classification-model fraud-detection knn-classification machine-learning pandas python random-forest scikit-learn statistical-analysis

Last synced: 03 May 2026

https://github.com/arrhythmia-detection/authorprovidedfeaturescombineddt

Deploys a vanilla Decision Tree for arrhythmia classification, using the Chapman ECG dataset, on an Arduino UNO board

arduino-uno arrhythmia-classification atmega328p chapman-ecg decision-tree-classifier eloquent scikit-learn

Last synced: 09 Jun 2026

https://github.com/albertodiazdurana/traveline-ds-project-skeleton

Minimal Python DS project skeleton (rebooking prediction): src-layout, sklearn, MLflow, FastAPI, Docker, GitHub Actions CI. Includes an intentional data-leakage bug for code-review demos.

data-science-skeleton docker fastapi github-actions machine-learning mlflow pydantic pytest python scikit-learn

Last synced: 09 Jun 2026

https://github.com/zhenglinlei/zdmp

Industry 4.0 Optimization with Machine Learning AI

industry-4 knn-classification machine-learning pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/srisaihariharan/mic_sentiment_analysis_v

Sentiment analysis of IMDb movie reviews using Python, Scikit-learn, and TF-IDF.

machine-learning natural-language-processing nlp python scikit-learn sentiment-analysis sentiment-classification

Last synced: 03 May 2026

https://github.com/apfirebolt/movie_recommendation_using_scikitlearn_and_pyqt5

A movie recommendation system built using a KNN model from the scikit-learn library. GUI components are powered by PyQt5, a library for creating GUI applications in Python

cosine-similarity jupyter-notebook knn-algorithm movie-recommedation pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/abdiasarsene/predictive-churn-management-data-driven-customer

Use unsupervised learning techniques to segment a company’s customers into distinct groups in order to personalize marketing campaigns, and ultimately propose specific marketing strategies for each customer segment based on the insights obtained.

acp kmeans-clustering matplotlib pandas plotly python scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/furk4nbulut/uygulamalarla-makine-ogrenmesi-ve-derin-ogrenme-atolyesi

This repository covers the training program of the BTK Akademi Applied Machine Learning and Deep Learning Workshop held in Manisa. In the workshop, participants gain theoretical and practical knowledge of advanced machine learning and deep learning techniques.

matplotlib numpy pandas scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/kaustavmodak/business-aided-customer-feedback-assessment-system

A Streamlit-based sentiment analysis app that classifies customer reviews into Positive, Neutral, or Negative using a pre-trained ML model

framework machine-learning matplotlib nlp nltk numpy pandas pickle regex scikit-learn seaborn sentiment-analysis streamlt tfidf-vectorizer

Last synced: 03 May 2026

https://github.com/pramodyasahan/binary-classifier

This repository houses the code for a machine learning model designed to predict customer churn. The model is built using Support Vector Machine (SVM) from the scikit-learn library and incorporates preprocessing, pipeline, and grid search techniques for optimal performance.

numpy pandas scikit-learn

Last synced: 03 May 2026
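The combination named in this entry (preprocessing, a pipeline, and grid search around an SVM) might look roughly like the sketch below. It is an illustration only: the synthetic data stands in for a churn table, and the parameter grid and step names (`scale`, `svm`) are assumptions, not taken from the repository.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary-classification data standing in for a churn dataset (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Putting the scaler inside the pipeline means it is refit on each
# cross-validation training fold, so no validation data leaks into scaling.
pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Hypothetical grid; parameter names use the "<step>__<param>" convention.
param_grid = {"svm__C": [0.1, 1, 10], "svm__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```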

https://github.com/jonad/finding_donors

Predicting income with UCI Census Income Dataset using supervised machine learning algorithms

numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 03 May 2026

https://github.com/alestankiewicz/credit-card-fraud-detection

Credit Card Fraud Detection Exercise in Python

pandas plotly python3 scikit-learn xgboost

Last synced: 03 May 2026

https://github.com/arnavk-09/phishing-detection

🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI

csv data fastapi flask python scikit-learn

Last synced: 03 May 2026

https://github.com/thmslmr/scikitlearn-examples

💡Scikit Learn examples - Python

python scikit-learn tutorials

Last synced: 03 May 2026

https://github.com/srilaasya/breast-cancer-classifier

Used several Python libraries to make a K-Nearest Neighbor classifier that is trained to predict whether a patient has breast cancer

knearest-neighbor-classifier python scikit-learn

Last synced: 03 May 2026

https://github.com/atchayaah/home-value-insights-kc

Data-driven project predicting King County housing prices using EDA, regression models, and ML techniques, developed as part of IBM’s Data Analysis with Python course on Coursera.

joblib matplotlib numpy pandas pickle python scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/darenr/gradientboostingmachines

Notebooks exploring strengths and weaknesses of GBM based classifiers

jupyter-notebook lightgbm pandas scikit-learn xgboost

Last synced: 03 May 2026

https://github.com/lucs1590/commom_segmentations

The purpose of this repository is to document and expose code samples using common threading techniques.

computational-vision machine-learning open-source opencv python scikit-image scikit-learn segmentation sklearn

Last synced: 03 May 2026

https://github.com/samarth4023/shell-internship-2

🤖 AICTE Shell Internship - NLP Chatbot. This repository contains the implementation of a chatbot developed as part of the AICTE Shell Internship. The chatbot is designed to understand and respond to user queries using Natural Language Processing (NLP) techniques.

ai artificial-intelligence chatbot natural-language-processing nlp nltk python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/ceodaniyal/telecom_customer_churn_prediction

A machine learning project that predicts whether a telecom customer will churn (leave the service) using customer demographics, account information, and service usage. The repository includes data preprocessing, model training (with logistic regression), feature scaling, and example predictions.

classification customer-churn-prediction data-science logistic-regression machine-learning ml-project pandas prediction python scikit-learn streamlit telecom

Last synced: 04 May 2026

https://github.com/abdullahalzubaer/feature-selection-ranking

In-depth analysis regarding feature selection and ranking.

feature-ranking feature-selection random scikit-learn

Last synced: 04 May 2026

https://github.com/codejsha/machine-learning-examples

Examples of machine learning using scikit-learn

machine-learning scikit-learn

Last synced: 04 May 2026

https://github.com/vyclarks/gestational-diabetes-prediction-ml

Predicting gestational diabetes from the Pima dataset — Python (scikit-learn); reproducible notebook, metrics, and report.

healthcare-analysis machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/baponkar/scikit-logisticregression-application

A simple, detailed application analysis of the scikit-learn LogisticRegression model.

classification-algorithm logistic-regression machine-learning python3 scikit-learn

Last synced: 04 May 2026

https://github.com/danielwohlr/delivery_time_series

Time series forecasting of food delivery service data

forecasting-time-series python scikit-learn

Last synced: 04 May 2026

https://github.com/abhivur/graduate-income-forecaster

Contributors: Abdussalam Raheem, Chiara Su, and Joseph Botros

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/homebackend/pdf-title-page-splitter

Splits a pdf based on identified title pages using ML trained model

machine-learning opencv pdf-splitter pdf2image pypdf2 scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/satvikpraveen/sklearn-mastery

Enterprise-grade ML framework showcasing advanced Scikit-Learn implementations with production-ready pipelines, algorithm-optimized synthetic data generation, comprehensive evaluation suite with statistical testing, custom transformers, ensemble methods, and real-world industry applications across healthcare, finance, and manufacturing domains.

artificial-intelligence ci-cd classification custom-transformers data-science docker ensemble-methods feature-engineering fintech fraud-detection healthcare-ai hyperparameter-tuning jupyter-notebooks machine-learning mlops model-evaluation pipeline-architecture predictive-maintenance python scikit-learn

Last synced: 04 May 2026

https://github.com/joel-beck/airbnb-oslo

Price Prediction Models for Airbnb Apartments in Oslo | Winter Term 2021/22

prediction python pytorch scikit-learn

Last synced: 04 May 2026

https://github.com/bhawnamehbubani/airline-passenger-referral-program-development-with-classification-techniques

Prediction of airline passenger referrals using Logistic Regression, GridSearchCV, and TF-IDF vectorization with Python, Pandas, Scikit-learn, and Excel.

excel gridsearchcv logistic-regression pandas python3 scikit-learn tf-idf-vectorization

Last synced: 04 May 2026

https://github.com/keven-rdr/rio-airbnb-predictor

AI study using predictive models, such as a regressor, to estimate property values

airbnb ia kaggle php price regression-models scikit-learn

Last synced: 04 May 2026

https://github.com/suguru-n/temp_easyai

Hands-on machine learning program for undergraduate students

google-colab jupyter-notebook linearregression python scikit-learn

Last synced: 04 May 2026

https://github.com/dakii24/credit-card-fraud-detection

This repository contains a machine learning project focused on detecting fraudulent credit card transactions. The project includes data preprocessing, model training, and evaluation to identify and prevent fraudulent activities.

capstone-project class-imbalance classification-algorithm credit-card credit-card-fraud data-science decision-trees fraud machine-learning open-data python scikit-learn svm svm-classifier

Last synced: 04 May 2026

https://github.com/madhu26sree/diabetes-prediction

This project uses the Support Vector Machine (SVM) algorithm to predict whether a person is likely to have diabetes, using the Diabetes dataset. It covers data preprocessing, model building, and evaluation using Python.

machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/msikorski93/protein-tertiary-structure

Performing a regression task for estimating residue size based on given physicochemical properties of protein tertiary structures (CASP 5-9).

bioinformatics gradient-boosting multilayer-perceptron-network protein-structure-prediction regression-algorithms scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/chathumiamarasinghe/nn-training-model

A comprehensive project for training neural networks to solve real-world problems. This repository includes customizable code for building, training, and evaluating neural network architectures using popular deep learning frameworks.

jupyter-notebook matplotlib numpy phyton scikit-learn

Last synced: 04 May 2026

https://github.com/aqueeqazam/machine-learning-using-scikit

This repository contains all of the algorithms used to train the machine learning models using the Scikit library.

numpy scikit-learn

Last synced: 04 May 2026

https://github.com/siddhantborse/atmosviz

Atmos Viz is a Python-based project designed to analyze, visualize, and predict global temperature trends across various cities and countries using time-series analysis and advanced data science techniques. Leveraging historical climate data, this project integrates machine learning models, geospatial mapping, and interactive visualizations to unco

geopandas geospatial-analysis gis matplotlib numpy pandas plotly python scikit-learn seaborn shapefiles time timeseries-analysis timeseries-data

Last synced: 05 May 2026

https://github.com/sxv357/xtern-artificial-intelligence-work-based-assessment

This application takes in data about undergraduate college students in the state of Indiana, such as their year, major, and university, and predicts their food order.

jupyter-notebook matplotlib pandas pickle scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/pierrealexandre78/deathpredict

Predict Hospital mortality rate using Machine Learning for patients admitted in ICU (Intensive Care Unit)

healthcare hospital machine-learning predictions python random-forest-classifier scikit-learn xgboost-classifier

Last synced: 05 May 2026

https://github.com/thekartikeyamishra/resumeevaluatorapp

The Automated Resume Evaluator is a Python-based application that helps evaluate resumes against job descriptions. It calculates an Applicant Tracking System (ATS) score, which is the percentage of keywords from the job description found in the resume.

flask machine-learning matplotlib nlp nltk pypdf python scikit-learn spacy textblob

Last synced: 05 May 2026
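The ATS score in this entry is defined as the percentage of job-description keywords found in the resume. A minimal pure-Python sketch of that calculation follows; the tokenizer, the stop-word list, and the function name `ats_score` are hypothetical choices for illustration, not the repository's implementation.

```python
import re

# Hypothetical stop-word list; a real application would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def ats_score(job_description: str, resume: str) -> float:
    """Percentage of distinct job-description keywords that appear in the resume."""
    keywords = tokenize(job_description) - STOP_WORDS
    if not keywords:
        return 0.0  # no keywords to match against
    found = keywords & tokenize(resume)
    return 100.0 * len(found) / len(keywords)


job = "Python developer with scikit-learn and SQL experience"
cv = "Built Python pipelines in scikit-learn; some experience with Docker"
score = ats_score(job, cv)
print(f"ATS score: {score:.1f}%")  # 4 of 6 keywords matched
```

Here the job description yields six keywords (`python`, `developer`, `scikit`, `learn`, `sql`, `experience`), four of which occur in the resume text.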

https://github.com/himanshkr03/comparative_performance_on_fashionmnist

This repository explores various machine learning and deep learning models for classifying images from the Fashion MNIST dataset. It includes data exploration, model training, evaluation, and visualization techniques to gain insights into the classification task.

deep-learning fashion-mnist fine hybrid-model image-classification keras machine-learning scikit-learn tensorflow xgboost-algorithm

Last synced: 05 May 2026

https://github.com/simpl1fy/spam-classifier-project

A web application to classify spam texts or emails.

multinomial-naive-bayes nltk python render scikit-learn text-classification

Last synced: 05 May 2026

https://github.com/s-matke/eco-forecast

Machine learning model for predicting the European country that generates the most surplus green energy

data-science green-energy machine-learning scikit-learn supervised-learning

Last synced: 05 May 2026

https://github.com/hallowshaw/text-emotion-classification-using-lstm-and-tokenization

This repository provides a machine learning and deep learning pipeline for text emotion detection. It includes a pretrained LSTM model, tokenizer, and preprocessing steps to classify emotions such as joy, sadness, and anger from text input. Easily deployable with provided resources and scripts.

emotion-classification emotion-detection feature-engineering lstm nltk nltk-python scikit-learn scikitlearn-machine-learning sentiment-analysis sequential-models text-classification text-classification-multi-label tokenization tokenizer

Last synced: 05 May 2026

https://github.com/marconicivitavecchia/stazione-monitoraggio-ambientale

MicroPython code for the ESP32, written for a course run by our school for teachers on building an environmental monitoring station, covering Python, IoT, and artificial intelligence.

ai esp32 micropython micropython-esp32 python school-project scikit-learn

Last synced: 05 May 2026

https://github.com/zafir100100/cancer-stage-prediction

This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.

cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn

Last synced: 05 May 2026
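The workflow in this entry (fit several regressors, average their R-squared scores, report the best) can be sketched with cross-validation as below. The synthetic data and the particular three models are assumptions made for illustration; the repository's own data and model list may differ.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the real dataset (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Candidate models, keyed by a readable name.
models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean R-squared over 5 cross-validation folds for each model.
mean_r2 = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

# The best model is the one with the highest mean R-squared.
best_model = max(mean_r2, key=mean_r2.get)
print(best_model, round(mean_r2[best_model], 3))
```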

https://github.com/hitthecodelabs/petalanalyticsstreamlit

Web application developed with Streamlit that predicts the Iris flower type based on its physical features

matplotlib model numpy pickle python scikit-learn sklearn streamlit

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/smaddanki/pattern-pursuit-challenge

A personal challenge to build a production-ready trading signal system for S&P 500 stocks using deep learning. This project progresses from basic ML models to a complete trading infrastructure, focusing on 5-day forward return prediction and signal generation.

deep-learning machine-learning pytorch quantative-trading quantitative-finance quantitative-research scikit-learn

Last synced: 05 May 2026

https://github.com/markdouthwaite/lingo-demo

A demo project showing how to effectively deploy Scikit-Learn Linear Models in Go into Google Cloud Run.

go golang google-cloud-platform python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/teja-1403/coursera-machine-learning-with-python-honors

This project involves building a classifier to predict rainfall for the next day based on weather data from the Australian Government's Bureau of Meteorology. Various machine learning techniques such as Linear Regression, KNN, Decision Trees, Logistic Regression, and SVM were implemented and evaluated.

classification hierarchical-clustering machine-learning regression scikit-learn scipy

Last synced: 05 May 2026

https://github.com/zuhairzia/customer-segmentation

Customer segmentation using KMeans clustering to analyze demographics, income, and spending. Helps businesses with targeted marketing and customer insights.

joblib matplotlib numpy pandas scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohra-mehak/sciencesync

A system for processing personalized Google Scholar alerts, managing the resulting data, and providing ML-based clustering analysis

agglomerative-clustering clustering crossref-api customtkinter google-api google-scholar graph-api machine-learning numpy pandas python3 scientific-article-analysis scikit-learn sqlite3

Last synced: 05 May 2026