An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/disney35/stock-prices-dashboard

A dashboard to analyze, predict, and visualize stock prices using Python & LSTM

ema jupyter-notebook keras macd matplotlib-pyplot mfi numpy pandas python rsi scikit-learn sma streamlit tenserflow yfinance

Last synced: 12 Apr 2026

https://github.com/khaifara/klafisikasi_jeruk_faiz_kece

Step-by-step machine learning classification with StandardScaler, OneHotEncoder, OrdinalEncoder, ColumnTransformer, Pipeline, Classification Report, Confusion Matrix, and deployment using Streamlit

machine-learning scikit-learn streamlit

Last synced: 05 Oct 2025

https://github.com/veerchaudhary0708/credit-fraud-detection

An end-to-end machine learning project to detect credit fraud using XGBoost.

datascience fintech fraud-detection machinelearning scikit-learn xgboost

Last synced: 18 May 2026

https://github.com/inesruizblach/data-science-project

A data science project exploring Portuguese "Vinho Verde" wine quality prediction. Features EDA, feature engineering, ML models, and evaluation using Python, pandas, scikit-learn, and visualization tools.

binary-classification classification data-science exploratory-data-analysis feature-engineering imbalanced-learn jupyter-notebook machine-learning model-evaluation pandas regression scikit-learn seaborn uci-dataset wine-quality

Last synced: 09 May 2026

https://github.com/vedanty3/bulldozer-price-prediction

A machine learning project that builds a model to predict the sale price of bulldozers.

andrew-ng-machine-learning ensemble-machine-learning gridsearchcv jupyter-notebook machine-learning matplotlib numpy pandas python randomforestregressor randomizedsearchcv scikit-learn ztm

Last synced: 05 Apr 2026

https://github.com/smakde/learning-resource-recommender

A lightweight recommender that helps you discover your next learning resource. It blends patterns from similar users with content keywords, and explains each suggestion in the UI.

als content-based-filtering evaluation-metrics explainable-ai hybrid-recommender implicit-feedback implicit-lib lightfm logistic-matrix-factorization mapk matrix-factorization ndcg pandas precision-at-k python recommender-system scikit-learn streamlit tf-idf top-n-recommendations

Last synced: 30 Apr 2026

https://github.com/moritzkoerber/tune_preprocessing_algos

Files for this blog post: https://moritzkoerber.github.io/python/tutorial/2019/11/18/blogpost/

cross-validation hyperparameter-tuning machine-learning python scikit-learn

Last synced: 30 Apr 2026

https://github.com/tinaland101/credit-risk-classification

The purpose of this project is to build a credit risk classification model using machine learning techniques. This model helps identify the creditworthiness of borrowers based on historical lending data. Specifically, it uses a logistic regression model to predict whether a loan is healthy (0) or high-risk (1).

numpy pandas pathlib scikit-learn

Last synced: 30 Apr 2026

https://github.com/samuelpillai/machine-learning-classification-regression-nlp

A curated collection of machine learning mini-projects covering classification, regression, and natural language processing (NLP). This project demonstrates model training, evaluation, feature engineering, and pipeline integration using real-world datasets and Python tools like Scikit-learn, pandas, and NLTK.

classification data-analysis data-science data-visualization feature-engineering jupyter-notebook machine-learning ml-pipeline model-evaluation nlp python regression-models scikit-learn supervised-learning text-mining

Last synced: 30 Apr 2026

https://github.com/abhivur/connections-ai

Contributors: Meet Gamdha, Gaurav Nimmagadda

bert python scikit-learn word2vec

Last synced: 30 Apr 2026

https://github.com/kumailn/machinelearning

Machine learning with Python

machine-learning python scikit-learn tensorflow

Last synced: 30 Apr 2026

https://github.com/dharma-acha/explanability_in_deepneuralnetworks

Our project aims to enhance the transparency and trustworthiness of the VGG model in critical fields like healthcare imaging and self-driving cars. By integrating explainability methods into the VGG model for image classification, we will clarify its decision-making process.

colab-notebook matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/boladjivinny/fire-prediction

Notebook for the Fighting Fire with Data hackathon on Zindi. Ranked 5th on the public leaderboard and 8th on the private leaderboard. https://zindi.africa/hackathons/cmu-africa-fighting-fire-with-data

feature-engineering hackhathon machine-learning regression scikit-learn stacking

Last synced: 30 Apr 2026

https://github.com/fadlani-aditya/iris-plant-classification

This project focuses on classifying different species of Iris flowers using the Random Forest algorithm. The dataset, sourced from Scikit-learn, contains four key features: sepal length, sepal width, petal length, and petal width, which are used to predict the flower species (Setosa, Versicolor, and Virginica).

agriculture data-science iris-dataset machine-learning python scikit-learn supervised-learning

Last synced: 01 May 2026

https://github.com/arturovaine/n8n-nodes-sklearn

Custom n8n nodes for integrating scikit-learn machine learning algorithms into your n8n workflows.

machine-learning n8n n8n-nodes scikit-learn sklearn

Last synced: 08 Jun 2026

https://github.com/kristishqau/sentimentanalysis_nlp

A project for sentiment analysis of tweets using various NLP techniques and machine learning models.

datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost

Last synced: 01 May 2026

https://github.com/barbarahayd/com410-ml

Machine learning class activities

decision-tree scikit-learn

Last synced: 01 May 2026

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/luthfiwulandari/machine-learning-breast-cancer

This project is a simple application that uses logistic regression to detect breast cancer. It classifies tumors as either malignant or benign based on the dataset provided by Scikit-learn.

datascience jupyter logistic-regression machine-learning python scikit-learn

Last synced: 01 May 2026

https://github.com/dhruvv1402/spam-detection-python-

This project is a Spam Detection System built using Python. It classifies SMS messages as spam or ham (not spam) using machine learning techniques.

countvectorizer kaggle-dataset nlp-machine-learning nltk numpy pandas python scikit-learn supervised-machine-learning tf-idf

Last synced: 01 May 2026

https://github.com/danishzulfiqar/language-detection-nlp-model

This machine learning model is designed to accurately detect and classify text in 18 languages using NLP

fastapi jupyter-notebook machine-learning natural-language-processing scikit-learn

Last synced: 01 May 2026

https://github.com/anastasiaschmidt1/sqli-detection-ml

University project: detecting SQL injection attacks with machine learning (SVM model)

bht-berlin machine-learning scikit-learn sqli svm

Last synced: 02 May 2026

https://github.com/luizassimoes/sklearn-kaggle-titanic

This repository was created to store all the code for tackling the Titanic challenge on Kaggle.

kaggle machine-learning scikit-learn

Last synced: 02 May 2026

https://github.com/viniciusds2020/ml_pycaret_classificacao

A system for preprocessing and training machine learning models using PyCaret. A low-code methodology for MLOps processes.

machine-learning mlops preprocessing pycaret python scikit-learn

Last synced: 03 May 2026

https://github.com/fandredev/ml-my-guide

My own notes on ML/DS using pandas, matplotlib, NumPy, and scikit-learn

anaconda matplotlib numpy pandas plotly scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/alessandromonolo/fraud-detection-binary-classification-model

This project builds a machine learning model to classify fraudulent clients using a banking dataset. Data preprocessing, statistical analysis, and feature selection were performed before training KNN and Random Forest Classifier. Model performance was evaluated using accuracy, precision, recall, and F1-score.

classification-model fraud-detection knn-classification machine-learning pandas python random-forest scikit-learn statistical-analysis

Last synced: 03 May 2026

https://github.com/albertodiazdurana/traveline-ds-project-skeleton

Minimal Python DS project skeleton (rebooking prediction): src-layout, sklearn, MLflow, FastAPI, Docker, GitHub Actions CI. Includes an intentional data-leakage bug for code-review demos.

data-science-skeleton docker fastapi github-actions machine-learning mlflow pydantic pytest python scikit-learn

Last synced: 09 Jun 2026

https://github.com/apfirebolt/movie_recommendation_using_scikitlearn_and_pyqt5

A movie recommendation system built using the KNN model from the scikit-learn library. GUI components are powered by PyQt5, a library for creating GUI applications in Python.

cosine-similarity jupyter-notebook knn-algorithm movie-recommedation pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/kaustavmodak/business-aided-customer-feedback-assessment-system

A Streamlit-based sentiment analysis app that classifies customer reviews into Positive, Neutral, or Negative using a pre-trained ML model

framework machine-learning matplotlib nlp nltk numpy pandas pickle regex scikit-learn seaborn sentiment-analysis streamlt tfidf-vectorizer

Last synced: 03 May 2026

https://github.com/jonad/finding_donors

Predicting income with UCI Census Income Dataset using supervised machine learning algorithms

numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 03 May 2026

https://github.com/arnavk-09/phishing-detection

🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI

csv data fastapi flask python scikit-learn

Last synced: 03 May 2026

https://github.com/srilaasya/breast-cancer-classifier

Used several Python libraries to make a K-Nearest Neighbor classifier that is trained to predict whether a patient has breast cancer

knearest-neighbor-classifier python scikit-learn

Last synced: 03 May 2026

https://github.com/samjoesilvano/airline_ticket_fare_prediction

Airline Fare Prediction using Machine Learning focuses on developing a Random Forest model to predict flight prices, achieving an R² score of 0.804. The project includes hyperparameter tuning using RandomizedSearchCV, alongside extensive data preprocessing and feature engineering to ensure robust model performance.

airline-fare-prediction data-preprocessing data-visualization feature-engineering feature-selection hyperparameter-tuning machine-learning pandas python random-forest randomizedsearchcv regression-analysis scikit-learn

Last synced: 15 Apr 2026

https://github.com/darenr/gradientboostingmachines

Notebooks exploring strengths and weaknesses of GBM based classifiers

jupyter-notebook lightgbm pandas scikit-learn xgboost

Last synced: 03 May 2026

https://github.com/lucs1590/commom_segmentations

The purpose of this repository is to document and expose code samples using common threading techniques.

computational-vision machine-learning open-source opencv python scikit-image scikit-learn segmentation sklearn

Last synced: 03 May 2026

https://github.com/samarth4023/shell-internship-2

🤖 AICTE Shell Internship - NLP Chatbot. This repository contains the implementation of a chatbot developed as part of the AICTE Shell Internship. The chatbot is designed to understand and respond to user queries using Natural Language Processing (NLP) techniques.

ai artificial-intelligence chatbot natural-language-processing nlp nltk python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/abdullahalzubaer/feature-selection-ranking

In-depth analysis regarding feature selection and ranking.

feature-ranking feature-selection random scikit-learn

Last synced: 04 May 2026

https://github.com/vyclarks/gestational-diabetes-prediction-ml

Predicting gestational diabetes from the Pima dataset — Python (scikit-learn); reproducible notebook, metrics, and report.

healthcare-analysis machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/danielwohlr/delivery_time_series

Time series forecasting of food delivery service data

forecasting-time-series python scikit-learn

Last synced: 04 May 2026

https://github.com/homebackend/pdf-title-page-splitter

Splits a pdf based on identified title pages using ML trained model

machine-learning opencv pdf-splitter pdf2image pypdf2 scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/satvikpraveen/sklearn-mastery

Enterprise-grade ML framework showcasing advanced Scikit-Learn implementations with production-ready pipelines, algorithm-optimized synthetic data generation, comprehensive evaluation suite with statistical testing, custom transformers, ensemble methods, and real-world industry applications across healthcare, finance, and manufacturing domains.

artificial-intelligence ci-cd classification custom-transformers data-science docker ensemble-methods feature-engineering fintech fraud-detection healthcare-ai hyperparameter-tuning jupyter-notebooks machine-learning mlops model-evaluation pipeline-architecture predictive-maintenance python scikit-learn

Last synced: 04 May 2026

https://github.com/bhawnamehbubani/airline-passenger-referral-program-development-with-classification-techniques

Prediction of airline passenger referrals using Logistic Regression, GridSearchCV, and TF-IDF vectorization with Python, Pandas, Scikit-learn, and Excel.

excel gridsearchcv logistic-regression pandas python3 scikit-learn tf-idf-vectorization

Last synced: 04 May 2026

https://github.com/keven-rdr/rio-airbnb-predictor

AI study using prediction models such as a regressor to determine property value

airbnb ia kaggle php price regression-models scikit-learn

Last synced: 04 May 2026

https://github.com/dakii24/credit-card-fraud-detection

This repository contains a machine learning project focused on detecting fraudulent credit card transactions. The project includes data preprocessing, model training, and evaluation to identify and prevent fraudulent activities.

capstone-project class-imbalance classification-algorithm credit-card credit-card-fraud data-science decision-trees fraud machine-learning open-data python scikit-learn svm svm-classifier

Last synced: 04 May 2026

https://github.com/madhu26sree/diabetes-prediction

This project uses the Support Vector Machine (SVM) algorithm to predict whether a person is likely to have diabetes, using the Diabetes dataset. It covers data preprocessing, model building, and evaluation using Python.

machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/aqueeqazam/machine-learning-using-scikit

This repository contains all of the algorithms used to train the machine learning models using the Scikit library.

numpy scikit-learn

Last synced: 04 May 2026

https://github.com/sxv357/xtern-artificial-intelligence-work-based-assessment

This application takes in data about undergraduate college students in the state of Indiana, such as their year, the major they are pursuing, and the university they attend, and predicts their food order.

jupyter-notebook matplotlib pandas pickle scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/thekartikeyamishra/resumeevaluatorapp

The Automated Resume Evaluator is a Python-based application that helps evaluate resumes against job descriptions. It calculates an Applicant Tracking System (ATS) score, which is the percentage of keywords from the job description found in the resume.

flask machine-learning matplotlib nlp nltk pypdf python scikit-learn spacy textblob

Last synced: 05 May 2026
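
The ATS score described in the resumeevaluatorapp entry above (the percentage of job-description keywords found in the resume) can be illustrated with a minimal sketch. This is not the repository's code: the function names `keywords` and `ats_score` and the simple regex tokenizer are assumptions chosen for illustration, and the real project may extract keywords differently (for example with NLTK or spaCy, which appear in its tags).

```python
import re


def keywords(text: str) -> set[str]:
    """Lowercase word tokens; a hypothetical stand-in for real keyword extraction."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def ats_score(job_description: str, resume: str) -> float:
    """Percentage of job-description keywords that also appear in the resume."""
    wanted = keywords(job_description)
    if not wanted:
        # No keywords to match against, so report 0 rather than divide by zero.
        return 0.0
    found = wanted & keywords(resume)
    return 100.0 * len(found) / len(wanted)


score = ats_score(
    "Python SQL Docker Kubernetes",
    "Built Python services with Docker and SQL",
)
print(f"{score:.1f}")  # 3 of the 4 keywords match
```

Treating keywords as a set means repeated terms count once, so padding a resume with the same word many times would not raise the score in this sketch.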

https://github.com/himanshkr03/comparative_performance_on_fashionmnist

This repository explores various machine learning and deep learning models for classifying images from the Fashion MNIST dataset. It includes data exploration, model training, evaluation, and visualization techniques to gain insights into the classification task.

deep-learning fashion-mnist fine hybrid-model image-classification keras machine-learning scikit-learn tensorflow xgboost-algorithm

Last synced: 05 May 2026

https://github.com/s-matke/eco-forecast

Machine learning model for predicting which European country generates the largest surplus of green energy

data-science green-energy machine-learning scikit-learn supervised-learning

Last synced: 05 May 2026

https://github.com/marconicivitavecchia/stazione-monitoraggio-ambientale

MicroPython code for ESP32 for the course our school runs for teachers on building an environmental monitoring station, covering Python, IoT, and artificial intelligence.

ai esp32 micropython micropython-esp32 python school-project scikit-learn

Last synced: 05 May 2026

https://github.com/zafir100100/cancer-stage-prediction

This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.

cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/smaddanki/pattern-pursuit-challenge

A personal challenge to build a production-ready trading signal system for S&P 500 stocks using deep learning. This project progresses from basic ML models to a complete trading infrastructure, focusing on 5-day forward return prediction and signal generation.

deep-learning machine-learning pytorch quantative-trading quantitative-finance quantitative-research scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/nandinimarepalli/ai_ml_internship_projects

Projects completed during my AI/ML and Data Expert internship, including EDA, machine learning models, and dashboard development using Python, pandas, scikit-learn, and visualization libraries.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohansardar/iris_flower

A basic ML project on iris flower classification

data-science iris-classification iris-dataset ml python scikit-learn

Last synced: 05 May 2026

https://github.com/aysenurcftc/breast_cancer_streamlit

Breast Cancer Wisconsin Dataset Classifier with Scikit-learn and Streamlit

breast-cancer classification gridsearch scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/supernovasatsangi23/modifying-biomarker-gene-identification-for-effective-cancer-categorization

A project that focuses on implementing a hybrid approach that modifies the identification of biomarker genes for better categorization of cancer. The methodology is a fusion of MRMR filter method for feature selection, steady state genetic algorithm and a MLP classifier.

dataset deep-learning deep-neural-networks feature-selection genetic-algorithm machine-learning machine-learning-algorithms mlp-classifier mrmr neural-network numpy pandas-dataframe python python3 scikit-learn scikit-learn-python tkinter-gui tkinter-python

Last synced: 05 May 2026

https://github.com/vanilladucky/housing-prediction

This is a data analytics and machine learning project that I undertook using a housing dataset from Kaggle to put my machine learning knowledge into practice.

data-science machine-learning python scikit-learn

Last synced: 05 May 2026

https://github.com/zenitsu272/fault-detection-ml

Machine Learning based Fault Detection in machines using sensor data

artificial-intelligence decsion-tree machine-learning pandas pandas-dataframe pandas-python scikit-learn

Last synced: 05 May 2026

https://github.com/pjj11005/ml_with_pytorch_study

[Machine Learning Textbook: PyTorch Edition] -> repository for the code I studied

deep-learning graph-neural-networks machine-learning neural-networks pytorch scikit-learn transformer

Last synced: 06 May 2026

https://github.com/grandechowhiskey/fcc-machine_learning-boilerplates

A collection of projects completed as part of the FreeCodeCamp "Machine Learning with Python" certification. These projects focus on implementing machine learning models, data preprocessing, and predictive analysis using libraries like scikit-learn and TensorFlow.

ai ml python3 scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/fahrettinsolak/ai-based-salary-scale-calculation-project

This project demonstrates a Polynomial Regression model using a dataset related to experience and salary. The model is built using Python with the pandas, matplotlib, and sklearn libraries. The dataset includes information on years of experience and corresponding salary.

artificial-intelligence deep-learning jupyter-notebook machine-learning matplotlib pandas pyhton scikit-learn

Last synced: 05 May 2026

https://github.com/sadmansakib93/mental-resilience-analysis-using-machine-learning

Used supervised and unsupervised ML techniques to analyze the mental health and resilience levels of medical students [Project completed in December 2019]

artificial-intelligence classification clustering correlation linear-regression machine-learning machine-learning-algorithms mental-health python regression resilience scikit-learn statistical-analysis

Last synced: 06 May 2026

https://github.com/rishisolanke/twitter-sentiment-analysis-using-machine-learning-

A research project that classifies tweets as positive, negative, or neutral using ML algorithms (Logistic Regression, Naïve Bayes, SVM) with NLP preprocessing.

data-science data-visualization logistic-regression machine-learning ml-models naive-bayes natural-language-processing nlp scikit-learn sentiment-analysis svm text-classification twitter-data

Last synced: 06 May 2026

https://github.com/keneandita/iris-intel

Iris Flower Classifier is a simple web app built with Streamlit that predicts the species of an Iris flower based on user-input flower features. It uses pre-trained machine learning models including Logistic Regression, K-Nearest Neighbors, SVM, and Decision Tree to make real-time predictions.

iris-classification jupyter-notebook machine-learning python scikit-learn streamlit

Last synced: 06 May 2026

https://github.com/eshansugeesh/fico-score-loan-default-modeling-project

Credit risk assessment using FICO score segmentation, loan default modeling, discretization techniques, and log-likelihood evaluation for predictive analytics in financial services.

bucketing classification credit-risk customer-segmentation data-science discretization fico-score financial-analytics loan-analysis loan-default log-likelihood machine-learning numpy pandas predictive-modeling risk-modeling scikit-learn segmentation statistical-modelling

Last synced: 06 May 2026

https://github.com/nicolas-giacomelli/modelo_regressao_linear_vendas

Linear regression model for sales forecasting. Challenge from RocketSeat's AI course.

matplotlib pandas python3 scikit-learn

Last synced: 06 May 2026

https://github.com/billgewrgoulas/recommendation-systems

Algorithms for joke rating prediction using the joke data-set from Kaggle.

algorithm clustering collaborative-filtering machine-learning numpy pandas recommender-system scikit-learn scypi

Last synced: 06 May 2026

https://github.com/lazarust/jupyternotebooks

Storage spot for all my Jupyter Notebooks. Check some of them out!!

jupyter-notebook jupyter-notebooks keras scikit-learn sklearn

Last synced: 06 May 2026

https://github.com/erick957/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

🏠 Predict house prices using advanced regression techniques with this comprehensive analysis and cleaning project, from data loading to model deployment.

data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 06 May 2026

https://github.com/sabin74/boston_house_prediction

This project aims to predict the median value of owner-occupied homes in Boston suburbs using various machine learning regression models. Multiple regression techniques were applied, including Linear Regression, Decision Tree, Random Forest, Gradient Boosting and dimensionality reduction with PCA. Hyperparameter tuning was performed.

boston-housing-price-prediction hyperparameter-tuning kaggle-dataset pca-analysis python3 regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/dwade-eng/customer-lead-conversion-analysis

This project explores a real-world lead conversion dataset, using a structured machine learning pipeline to classify leads into likely or unlikely converters. It includes complete steps from data wrangling and visualization to feature engineering and model evaluation.

html matplotlib pandas python3 scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/ejw-data/ml-playground

Testing the limitations, inabilities, and strengths of models with synthetic data

machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/douglas-data-analyst/predictive-analysis

Predictive model for sales forecasting using scikit-learn and machine learning

data-science machine-learning predictive-analytics python sales-forecasting scikit-learn time-series

Last synced: 06 May 2026