An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/chaitanya1436/student_performance_analysis

A project focused on analyzing college student performance using data on department, assessment scores, and performance labels. Implemented in Google Colab, the analysis includes data preprocessing, feature scaling, and exploratory data analysis to uncover insights and prepare the data for further analysis or modeling.

data-preprocessing data-preparation exploratory-data-analysis feature-scaling google-colab numpy pandas scikit-learn

Last synced: 07 Feb 2026

https://github.com/rayyan9477/machine-learning-driven-backorder-prediction-system

A Django web application that predicts product backorders using pre-trained Random Forest Classifier, Decision Tree, and LGBM models.

matplotlib notebook numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/kr1shnasomani/sentimentscope

Sentiment analysis of movie reviews using TensorFlow and GloVe embeddings

deep-learning keras matplotlib natural-language-processing neural-networks numpy pandas scikit-learn tensorflow

Last synced: 12 Apr 2026

https://github.com/alexsomai/machine-learning-getting-started

Dummy examples and experiments to get started with Machine Learning

artificial-intelligence deep-learning machine-learning python scikit-learn

Last synced: 13 Apr 2026

https://github.com/anudeepjonnada/phishshield-ai

🛡️ PhishShield AI – An intelligent phishing email detector that uses BERT and Machine Learning to identify phishing attempts in real time. Integrated with the Gmail API, powered by Flask, React, and MongoDB for secure full-stack email analysis and threat detection.

bert flask gmail-api mongodb oauth2 python react scikit-learn

Last synced: 13 Apr 2026

https://github.com/tasninanika/will-you-survive-frontend

A full-stack machine learning app to predict Titanic passenger survival with a modern, interactive UI. Powered by FastAPI, scikit-learn, and a React frontend.

fastapi framer-motion python3 react react-router scikit-learn

Last synced: 12 Apr 2026

https://github.com/zen204/airbnb-availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 21 Jan 2026

https://github.com/mgobeaalcoba/linear_algebra_for_machine_learning

Explore fundamental linear algebra concepts essential for machine learning in this repository, with code examples and explanations. Get a solid foundation for ML!

machine-learning matplotlib numpy pandas python3 scikit-learn scipy seaborn

Last synced: 12 Apr 2026

https://github.com/mgobeaalcoba/survival_predictor_on_the_titanic_scikit_learn

Titanic Survival Predictor using Scikit-Learn: Machine learning model and analysis to predict passenger survival on the Titanic based on historical data.

matplotlib numpy pandas python3 scikit-learn seaborn titanic-dataset titanic-kaggle titanic-survival-prediction

Last synced: 10 Apr 2026

https://github.com/andrewquijano/operating_systems_ii

Creating an Intrusion Detection System

ids kdd99 nsl-kdd-dataset scikit-learn

Last synced: 17 Jan 2026

https://github.com/hayatoy/gcpml-notebook

Dockerfile with Jupyter Machine Learning environment plus Google Cloud SDK

dockerfile google-cloud-platform jupyter scikit-learn tensorflow

Last synced: 12 Apr 2026

https://github.com/felipeclarindo/energy-predict-api

API for making energy predictions.

api api-development api-rest flask pandas pickle python scikit-learn

Last synced: 13 Apr 2026

https://github.com/jersongb22/datascience_mlops_movierecommendations_project

Simulating a Data Scientist's role in a startup that aggregates streaming platforms. Building movie queries and an ML-based recommendation system with an MLOps focus. The ML model web app is deployed with Render.

data-science fastapi machine-learning matplotlib pandas python render scikit-learn stopwords

Last synced: 10 Apr 2026

https://github.com/mehmoodulhaq570/machine-learning-models

A repository of machine learning models for prediction. More specifically, it is a machine learning course for those who are interested in learning the basics of machine learning algorithms.

decision-trees gradient-descent gradient-descent-algorithm knn-algorithm linear-regression linear-regression-models logistic-regression-algorithm machine-learning-algorithms machine-learning-models ml naive-bayes-algorithm one-hot-encoding pca python random-forest-classifier scikit-learn svm-model

Last synced: 08 Apr 2025

https://github.com/shreeyas-48/creditcardfrauddetection

Project for detecting credit card fraud using neural networks and logistic regression

autoencoder keras logistic-regression matplotlib neural-networks numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/hvignolo87/marketing-campaign-classification

Real case of classification with machine learning. Analysis of real data from telemarketing campaigns of a Portuguese bank.

binary-classification data-science pandas python scikit-learn xgbclassifier xgboost

Last synced: 12 Apr 2026

https://github.com/ayberkyavuz/ml_model_server_docker_deployment

This repository contains the source code for deploying a machine learning model server.

deployment docker flask machine-learning model python random-forest scikit-learn

Last synced: 08 Apr 2026

https://github.com/medyessinkhlif/medclaimml

An AI-powered machine learning application designed to process healthcare reimbursement claims. It analyzes medical documents, client information, insurance policies, and legal regulations to predict accurate reimbursement amounts, ensuring efficiency, compliance, and fraud detection.

healthcare jest-tests mern-stack mongodb nodejs nosql numpy pytorch react scikit-learn tailwindcss

Last synced: 13 May 2025

https://github.com/jesly-joji/house-price-prediction

House Price Prediction using Linear Regression with Scikit-learn and Flask

flask regression scikit-learn

Last synced: 03 Jan 2026

https://github.com/flysirin/adstextclassification

Classification of advertisements by topic

docker excel flask pandas python pytorch scikit-learn

Last synced: 02 Jan 2026

https://github.com/its-maneeshk/fake-product-detection-system

The Fake Product Review Detection System is a machine learning-powered web application designed to analyze and detect fake reviews on eCommerce platforms. It helps users identify whether a product has genuine or manipulated reviews by leveraging Natural Language Processing (NLP) and supervised learning models.

api beautifulsoup4 fetch-api flask html-css-javascript joblib nlp-machine-learning numpy pandas python reactjs requests scikit-learn

Last synced: 05 Mar 2025

https://github.com/ksasi/boston_housing

Predicting Boston Housing Prices - Udacity

machine-learning numpy pandas python scikit-learn

Last synced: 08 Apr 2026

https://github.com/salmandeveloperz/ml_house_prediction

Project for house price prediction using classification and regression models. Includes a Docker setup for easy deployment.

classification-model clustering deep-learning machine-learning matplotlib numpy pandas python3 regression-models scikit-learn

Last synced: 10 Apr 2026

https://github.com/grachale/predict_titanik

Predicting the survival of Titanic passengers (binary classification) using decision tree and KNN models from scikit-learn.

classification decision-tree-classifier knn-classifier matplotlib pandas python scikit-learn titanic-survival-prediction

Last synced: 12 Apr 2026

https://github.com/alessandrosocc/machine-learning-project-2022

Final project for the Machine Learning course at the University of Cagliari in 2022. Analysis of a dataset using machine learning techniques combined with oversampling and undersampling. Includes a final report with the results obtained.

imblearn machine-learning matplotlib-pyplot oversampling pandas scikit-learn spambase-dataset undersampling

Last synced: 18 Jan 2026

https://github.com/alisonmitchell/boston-housing

Investigation of the Boston housing dataset to evaluate, train and test a regression model to predict house prices.

data-science machine-learning matplotlib numpy pandas python scikit-learn scipy seaborn

Last synced: 10 Apr 2026

https://github.com/mirgis/plucky-playground

A modest collection of machine learning and deep learning algorithms, along with examples implemented in diverse toolkits.

bayes bayesian deep-learning examples ipynb keras machine-learning neural-network pandas playground python3 pytorch scikit-learn sklearn statistics tensorflow

Last synced: 13 Apr 2026

https://github.com/aahnik/gdsc-ml-ds-bootcamp-2023

This repo contains files given by my seniors as well as assignments and final project done by me during the bootcamp.

data-science machine-learning ml numpy pandas python3 scikit-learn

Last synced: 28 Oct 2025

https://github.com/yancotta/anti-aging-epigenetics-ml-app

A thesis MVP for a personalized anti-aging system that analyzes genetic SNPs and lifestyle habits using ML models (Random Forest and Neural Networks) to provide risk assessments and actionable recommendations. Built with FastAPI, React, PostgreSQL, and containerized via Docker for scalability and explainability.

anti-aging bioinformatics docker explainable-ai fastapi genetics healthtech machine-learning mlops personalized-medicine pytorch reactjs scikit-learn synthetic-data thesis-project

Last synced: 16 Sep 2025

https://github.com/srikarveluvali/heart-disease-prediction-ml

This machine learning project aims to predict the presence or absence of heart disease in individuals based on a set of health-related features. By utilizing a dataset containing information about patients, we employ various machine learning techniques and data analysis to build a predictive model.

exploratory-data-analysis machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/leabrodyheine/water-pump-status-prediction

This project implements machine learning models to predict the status of water pumps in Tanzania using data from DrivenData's competition. The project includes preprocessing steps, model evaluation using cross-validation, and hyperparameter optimization with Optuna.

argparse cross-validation gradient-boosting-classifier logistic-regression machine-learning multilayer-perceptron numpy optuna pandas random-forest-classifier scikit-learn

Last synced: 11 Apr 2026

https://github.com/devspidr/ml-programs

A collection of foundational machine learning programs covering supervised and unsupervised algorithms, implemented using Python and libraries like scikit-learn, pandas, and matplotlib. Ideal for beginners and students learning core ML concepts through practical coding.

classification machine-learning-algorithms regression scikit-learn supervised-learning unsupervised-learning

Last synced: 30 May 2026

https://github.com/lucasfrag/dengue-prediction-knc

Project developed to predict dengue cases using the KNeighborsClassifier classification algorithm.

data-science knearest-neighbor-classifier machine-learning pandas python scikit-learn

Last synced: 11 Mar 2025

https://github.com/lasithaamarasinghe/movie-recommender-system

This ML model recommends movies that may align with the user's preferences based on a TF-IDF matrix.

jupyter-notebook machine-learning movie-recommendation movielens-dataset numpy pandas python regex scikit-learn tf-idf-vectorizer

Last synced: 12 Apr 2026

https://github.com/francescopaolol/titaniccompetition

My first Kaggle competition: predicting survival on the Titanic and getting familiar with ML basics.

jupyter-notebook kaggle-competition machine-learning ml pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/karimosman89/legal-document-nlp

Create a tool that uses NLP to extract key information from legal documents, contracts, or agreements. Use NLP techniques for named entity recognition and text classification. Streamline the review process for legal teams by automating information extraction.

nltk python scikit-learn spacy

Last synced: 11 Apr 2026

https://github.com/soumyagautam/sign-sense

Sign Sense is a desktop app, based on deep learning and neural networks, that converts sign language to speech: it detects hand signs in a frame and converts them to speech according to their meaning. In the other direction, it can also recognise your voice and convert it to sign language.

ai cv2 dataprocessing deep-learning keras machine-learning mediapipe moviepy-library neural-network openai-whisper scikit-learn tensorflow tkinter-python

Last synced: 10 Apr 2026

https://github.com/suundumused/weather-forecast-ai-example

The project scope is a weather forecasting model based on behavioral analysis of the last 33 hours (hour-by-hour forecast) with a Random Forest Classifier. The program automatically saves and loads the last trained model for prediction.

ai artificial-intelligence artificial-intelligence-algorithms artificial-intelligence-projects artificialintelligence scikit scikit-learn scikit-learn-python scikitlearn scikitlearn-machine-learning weather weather-conditions weather-forecast weather-information

Last synced: 20 May 2026

https://github.com/aarryasutar/credit_eda

This project focuses on cleaning and analyzing a loan application dataset to gain insights into the factors influencing loan defaults. Through systematic data cleaning, visualization, and merging with previous application data, it provides a robust foundation for further predictive modeling.

binning boxplot correlation-matrix data-cleaning data-splitting dataframe feature-engineering heatmap jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 13 Apr 2026

https://github.com/uea-geral/rna-perceptron-exercise

🤖 Artificial Neural Networks (RNA) course: training a Perceptron neuron.

jupyter-notebook neural-network numpy perceptron python scikit-learn

Last synced: 13 Apr 2026

https://github.com/camilajaviera91/prediction-of-housing-prices-using-linear-regression

This project provides tools to search for datasets on Kaggle, download and preprocess them, and perform predictions using a Linear Regression model. It includes interactive text-based user interfaces built with `curses`.

curses kaggle linear-regression matplotlib-pyplot mean-absolute-error mean-square-error numpy pandas pathlib python scikit-learn train-test-split

Last synced: 10 Apr 2026

https://github.com/samarpan-rai/serveitlearn

It wraps the FastAPI library in an extremely thin layer that lets you create an endpoint very quickly.

fastapi inference ml pypi scikit-learn

Last synced: 12 Apr 2026

https://github.com/yuvraj0412s/proactive-fraud-detection-using-machine-learning

An end-to-end machine learning project for detecting financial fraud using LightGBM, featuring in-depth EDA, advanced feature engineering, and a focus on actionable business insights.

class-imbalance classification-model data-analysis data-science data-visualization exploratory-data-analysis feature-engineering fintech fraud-detection jupyter-notebook lightgbm machine-learning pandas python scikit-learn smote

Last synced: 02 May 2026

https://gitlab.com/hylkedonker/statkit

Statistics for scikit-learn.

machine learning scikit-learn statistics

Last synced: 01 Nov 2025

https://github.com/hokagem/damagedlogginganalyzer

A project analyzing statistics on damaged logging (wood) in Germany using Python.

analysis csv csv-parser k-fold-cross-validation numpy pandas pandas-dataframe pandas-python polynomial-regression scikit-learn statistics wood

Last synced: 03 May 2026

https://github.com/sanjeetbth7/krishi-nexus

Krishi Nexus revolutionizes agriculture by delivering data-driven crop recommendations via advanced machine learning, maximizing yields and ensuring sustainable practices. This platform empowers farmers with actionable insights, optimizing investments and promoting informed decision-making for a prosperous and eco-conscious future.

api classification expressjs reactjs scikit-learn supervised-learning tail

Last synced: 18 Feb 2026

https://github.com/colinwu0403/heartbpmusic

Music discovery platform that uses machine learning to recommend a song based on your heart's BPM and your mood.

django neurokit2 scikit-learn spotify-web-api vuejs

Last synced: 05 May 2026

https://github.com/singhkunwardeep/twitter_sentiment_analysis

A machine learning project to classify Twitter sentiment into positive and negative categories using Logistic Regression and TF-IDF Vectorization. This project involves data preprocessing, feature extraction, model training, and evaluation of the sentiment of tweets. Built with Python, NLTK, and Scikit-learn.

logistic-regression nltk-python pandas-dataframe python3 scikit-learn tfidf-vectorizer

Last synced: 05 May 2026

https://github.com/rahimizadeh/prediction-api-with-flask-and-mlflow

An end-to-end machine learning project demonstrating model lifecycle management with MLflow and production deployment using Flask.

flask machine-learning mlflow mlops-workflow python random-forest-regression rest-api scikit-learn

Last synced: 13 Apr 2026

https://github.com/alessandromonolo/descriptive-texts-classification-by-usage-purposes-of-estate-properties

The project aims to identify the best model for the classification of texts derived from descriptions of assets subject to Italian judicial auctions. The employed models include both conventional models, such as Logistic Regression, Naive Bayes, SVM, and XGBoost, and neural network models, such as Fasttext and XLM-Roberta.

fasttext logistic-regression naive-bayes nlp python pytorch scikit-learn seaborn spacy svm text-classification tfidf tokenizer xgboost xlm-roberta

Last synced: 08 Apr 2026

https://github.com/evanmarshall-dev/evanmarshall-tech

Professional IT services platform featuring serverless AWS infrastructure, ML-powered service recommendations, and automated CI/CD deployment. Built to showcase full-stack development, cloud architecture, and machine learning engineering skills.

api-gateway aws ci-cd cloud-computing cloudfront devops full-stack github-actions infrastructure-as-code lambda machine-learning mlops nextjs portfolio python react s3 scikit-learn serverless terraform

Last synced: 13 Apr 2026

https://github.com/satvikpraveen/fashionmnist-analysis

A comprehensive analysis of the Fashion MNIST dataset using PyTorch. Covers data preparation, EDA, baseline modeling, and fine-tuning CNNs like ResNet. Includes modular folders for data, notebooks, and results. Features CSV exports, visualizations, metrics comparison, and a requirements.txt for easy setup. Ideal for ML workflow exploration.

computer-vision confusion-matrix convolutional-neural-networks deep-learning-algorithms exploratory-data-analysis fashion-mnist-dataset fine-tuning hyperparameter-tuning image-classification jupyter-notebook machine-learning-algorithms matplotlib-pyplot model-evaluation numpy pandas pytorch resnet-18 scikit-learn seaborn vgg

Last synced: 22 Apr 2025

https://github.com/gokulgowthams/smart-premium

An interactive Streamlit application that predicts the required premium amount for a default loan from the user's answers to a series of questions.

data-preprocessing feature-engineering git github mlflow model-deployment numpy pandas python scikit-learn streamlit xgboost

Last synced: 11 Apr 2026

https://github.com/rohitpawar001/bone_marrow_surival_prediction

Bone marrow transplants can be life-saving, but predicting patient survival is complex. In this project, I used machine learning to analyze key medical factors and improve survival predictions. I also implemented CI/CD pipelines, used MLflow for model tracking, and deployed the model on an AWS EC2 instance.

aws docker ec2-instance flask machine-learning mlflow python scikit-learn

Last synced: 08 Apr 2026

https://github.com/f-aguzzi/chemfusekit

Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.

chemometrics datafusion knn lda pca plsda scikit-learn svm

Last synced: 20 Jan 2026

https://github.com/srilaasya/handwriting-recognition-using-k-means

Used K-means clustering and scikit-learn to cluster images of handwritten digits.

handwriting-recognition k-means python scikit-learn

Last synced: 13 Apr 2026

https://github.com/sivatsk26/university-admit-eligibility-predictor

This project uses machine learning and regression methods (a statistical technique for predicting the outcome of an event) to assess users' admission eligibility for the universities they have chosen. The prediction is produced by the implemented algorithms once the user supplies the application with the required information.

html-css-javascript ibm-cloud ibm-watson linear-regression machine-learning matplotlib numpy pandas python python-flask random-forest scikit-learn

Last synced: 13 Apr 2026

https://github.com/serhatderya/house-prices---advanced-regression-techniques

This machine learning solution was developed for the "House Prices - Advanced Regression Techniques" competition on Kaggle, using several machine learning models such as Random Forest, XGBoost and LightGBM.

ai artificial-intelligence data-science ju jupyter-notebook lightgbm lightgbm-regressor machine-learning machinelearning prediction python random-forest random-forest-regression regression scikit-learn xgboost xgboost-regression

Last synced: 28 Apr 2026

https://github.com/mnitin-reddy/reducing-review-overhead-with-ml-based-application-screening

A machine learning classification project to filter out low-probability visa applications using historical data. It features an end-to-end implementation with CI/CD on AWS, achieving 93% accuracy with a KNN model optimized through Optuna, alongside integration of MLOps tools like Evidently and MLflow.

aws docker githubactions hypothesistesting machinelearning matplotlib mlflow mlops mongodb numpy optuna pandas python scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/samarthmule/chatbot

This project implements a generic chatbot using Natural Language Processing (NLP) and Machine Learning techniques. The chatbot is designed to classify user input into predefined intents and provide context-aware responses. The solution is scalable, interactive, and suitable for various domains.

chatbot internship machine-learning machine-learning-algorithms nlp nltk project-repository python python3 scikit-learn streamlit

Last synced: 13 Apr 2026

https://github.com/arnabsaha7/piezoelectric-roads-implementation

This repository implements a piezoelectric road system in Python, leveraging Pandas, NumPy, scikit-learn, Matplotlib, and Seaborn. The requirements.txt file ensures version consistency for reproducibility.

pandas-python piezoelectric roads scikit-learn

Last synced: 06 Jan 2026

https://github.com/aryank1511/wattwise

WattWise is an innovative energy-saving app that uses an Arduino-powered device to monitor and predict household electricity usage and bills in real-time.

arduino docker flask machine-learning mqtt nextjs scikit-learn

Last synced: 04 Feb 2026

https://github.com/supriya811106/healthcare-recommedation-system

A Flask-based web app that predicts diseases based on symptoms and recommends specialized doctors. It uses machine learning for accurate health predictions and location-based doctor searches.

css flask-application healthcare-application html javascript machine-learning numpy pandas recommendation-system scikit-learn

Last synced: 04 Mar 2026

https://github.com/imswappy/brain-tumor-detection

🧠 Deep learning project for brain tumor classification using MRI images. Built with transfer learning (VGG16 + fine-tuning), TensorFlow/Keras, and deployed via Streamlit. Dataset & model loaded dynamically from KaggleHub. Includes training notebook, evaluation, and interactive web app.

kagglehub keras numpy pandas scikit-learn streamlit tensorflow vgg16-model

Last synced: 13 Apr 2026

https://github.com/strcoder4007/machine-learning-deep-learning-practice

Implementation of Linear/Logistic Reg, K-NN, SVM, Clustering, K-Means, ConvNet, ResNet, MobileNet, RNN, LSTM etc. using Pandas, SciKitLearn, NumPy & TensorFlow 2

convolutional-neural-networks matplotlib scikit-learn tensorflow2

Last synced: 15 May 2026

https://github.com/jibbs1703/classic-ml-models

This repository contains scripts for developing, training and evaluating machine learning models using several Python frameworks.

aws data-preprocessing data-science deep-learning feature-engineering machine-learning multiclass-classification neural-networks predictive-modeling pyspark-mllib pytest scikit-learn xgboost-classifier

Last synced: 10 Apr 2026

https://github.com/metriccoders/metriccoders_datasets

This is the Metric Coders repository containing all the datasets for machine learning.

data datasets machine-learning natural-language-processing scikit-learn

Last synced: 08 Apr 2025

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 17 May 2026

https://github.com/takkii/pylean

Data analysis ( 🐍 💎 📈 )

analayze matplotlib numpy pandas python scikit-learn

Last synced: 09 Sep 2025

https://github.com/aryansk/mnist-deep-learning-exploration

This repository contains implementations of various deep learning approaches for the MNIST handwritten digit classification task, using both the scikit-learn and Keras frameworks.

keras machine-learning machine-learning-algorithms mnist-classification numpy python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/ayushshahh/fespn

A neural network made to predict final exam scores of students

mlp mlp-regressor multilayer-perceptron neural-network prediction-model scikit-learn

Last synced: 02 May 2026

https://github.com/edisedis777/pyspark-ml-features

A PySpark implementation of 6 lesser-known Scikit-Learn features optimized for Azure Databricks. This project translates powerful machine learning techniques from Scikit-Learn into PySpark's distributed computing framework.

azure databricks databricks-notebooks large-scale machine-learning pyspark python scikit-learn scikitlearn-machine-learning

Last synced: 13 Apr 2026