An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/sabbadini10/job4you

Job4You is an AI-powered job application assistant that streamlines the entire application process. Built on Angular and Firebase with GPT-4 integration.

angular api ats-optimization cover-letter email-automation firebase jobforall openai-api python resume-builder scikit-learn sheraz sherazhussain sherazhussain546

Last synced: 04 Mar 2026

https://github.com/jofaval/ionosphere

Binary Classification of Ionosphere signals at Goose Bay, Labrador in 1988

data-analysis data-science data-visualization deep-learning google-colab keras machine-learning python scikit-learn tensorflow uci xgboost

Last synced: 09 Apr 2026

https://github.com/jain1shh/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 09 Apr 2026

https://github.com/dhanraj-parigi/diabetes_prediction_app

🩺 A simple and interactive web app that predicts diabetes using 🧠 machine learning. 🚀 Built with Python, Streamlit, and the 🧮 Pima Indians Diabetes dataset.

ai-in-healthcare classification data-science diabetes-prediction health-check healthcare-ai jupyter-notebook machine-learning ml-project pandas python random-forest scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/mirzaazwad/tymbert

TYMBert is our submission for NCIM 2025, a spam classifier that makes use of knowledge distillation to compress the model while preserving accuracy

bert huggingface-transformers knowledge-distillation machine-learning matplotlib numpy pandas python3 scikit-learn tiny-bert torch

Last synced: 09 Apr 2026

https://github.com/shafaq-aslam/predicting-heart-disease-risk-with-logistic-regression-techniques

Develop a predictive model using logistic regression techniques to assess heart disease risk based on patient health metrics and data analysis.

data-analysis heart-disease logistic-regression machine-learning machine-learning-models matplotlib numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/stefagnone/text_adventure_game

A text-based adventure game project using Python fundamentals

matplotlib numpy pandas python r scikit-learn seaborn sql

Last synced: 09 Apr 2026

https://github.com/shauryashaurya/marty_mcfly

Code, text and notebooks on a tutorial for Introduction to Machine Learning using open sources

anaconda jupyter-notebooks machine-learning machine-learning-tutorials notebook numpy python regression scikit-learn scipy tutorial

Last synced: 09 Apr 2026

https://github.com/rajan-bhateja/Machine-Learning-with-Python

ML/DL projects done using sklearn and TensorFlow

machine-learning scikit-learn sklearn

Last synced: 28 Jul 2025

https://github.com/ajxxxs/spotify-music-analysis

spotify Music (web scraped playlists ) analysis (over 3 states) , trends, features and a music recommendation system.

matplotlib numpy panda scikit-learn seaborn

Last synced: 28 Jul 2025

https://github.com/anuranjanjain/cardioguide

This is the project that I created for DSN 2 at VIT , As its name suggests it will help you to check for any abnormalities with your heart by giving the "Heart Risk Assessment"

chartjs chatbot flask-application mlmodel pandas pickle python rest-api scikit-learn

Last synced: 20 Jan 2026

https://github.com/rafay-imraan/email-spam-filtering

Machine learning models that filter spam emails from a dataset downloaded from kaggle.com.

machine-learning ml pandas python scikit-learn xgboost

Last synced: 20 Jan 2026

https://github.com/iamjuniorb/d499-supervised-learning

This class for machine Learning presents the end-to-end process of investigating data through a machine learning lens.

machine-learning project python python3 scikit-learn scikit-learn-python scikitlearn-machine-learning supervised-learning supervised-machine-learning

Last synced: 16 May 2026

https://github.com/kavyachouhan/fake-news-detection-dravidian-language

This repository contains the code and resources for a machine learning project focused on detecting fake news in the Malayalam language, developed as part of the IITM-PAN BS AI-ML Challenge.

jupyter-notebook machine-learning numy pandas python scikit-learn

Last synced: 08 Feb 2026

https://github.com/antrita/stroke_prediction_model

A model that combines Kaggle's Stroke Prediction Dataset with live weather/air quality data to implement FDA-compliant MLOps pipeline and shows expertise in healthcare regulations and real-time inference.

ai data-analysis deep-learning kaggle-dataset machine-learning prediction-model random-forest real-time scikit-learn streamlit weather-api xgboost

Last synced: 07 May 2026

https://github.com/nathadriele/transaction_fraud_prevention_pipeline

Uma solução de detecção e prevenção de fraudes em transações financeiras, combinando Machine Learning, regras de negócio e análises estatísticas avançadas. O sistema oferece um dashboard interativo para monitoramento em tempo real, análise de dados e gestão de alertas de fraude.

data-analysis data-visualization docker fraud-prevention machine-learning matplotlib numpy pandas pipeline pytest python scikit-learn scipy seaborn streamlit tensorflow transaction xgboost

Last synced: 10 Apr 2026

https://github.com/080bct12alex/nepalestate

A real estate price prediction web app using machine learning, Next.js and Flask

flask-api mlp-regresor nextjs scikit-learn

Last synced: 31 Jul 2025

https://github.com/lanhhoang/toronto-bicycle-thefts-classifier

A predictive service using Toronto Police Open Data to provide a classification of either the bike is likely to be returned or not

clustering decision-trees flask logistic-regression machine-learning python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/bgmp/svm

Support Vector Machine implementation written in Python

iris-dataset scikit-learn svm

Last synced: 31 Jul 2025

https://github.com/vigneshvaranasi/breast_cancer_detection

This project employs machine learning, focusing on Logistic Regression, to detect breast cancer using tumor-related features. The dataset is preprocessed, and the model achieves 100% accuracy on the test set. The goal is to gain insights into breast cancer factors and provide an effective detection solution.

jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/sridharyadav07/machine-learning-project-bankruptcy-prevention-

The project explores multiple machine learning algorithms and evaluates their performance using various metrics, such as accuracy and confusion matrices. The models tested include Logistic Regression, K-Nearest Neighbors (KNN), Naive Bayes, and Support Vector Machine (SVM). In addition, regularization techniques (L1, L2) are used to avoid overfit.

data-preprocessing evaluation machine-learning-models matplotlib-pyplot modelbuilding modeldeployment numpy pandas python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/apal21/tensorflow-pima-indians-dataset-classification

Pima Indians Dataset classification using Tensorflow Linear Classifier and DNN Classifier.

classification deep-neural-networks kaggle linear-classifier pandas pima-indians-dataset scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/presizhai/iris-predictor-fastapi

A web application for predicting the species of Iris flowers using a machine learning model trained with the Iris dataset, with FastAPI, a modern web framework for building APIs.

essemblelearning fastapi python random-forest-classifier scikit-learn uvicorn

Last synced: 25 Dec 2025

https://github.com/oroszgy/cookiecutter-ml-flask

Cookiecutter template for training and serving machine learning models with scikit-learn, spacy, Flask and Docker

docker flask flask-application machine-learning nlp rest-api scikit-learn spacy

Last synced: 09 Apr 2026

https://github.com/kiapanahi/handson-machine-learning-book-playground

Sample codes and practices around the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"

machine-learning python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/mtlh/fyp_prempredict

In PremPredict, players will predict all Premier League games. Compete against the algorithm and other users across a full season. Scoring points for every correct result/prediction.

django prediction premierleague python scikit-learn tailwindcss

Last synced: 09 Apr 2026

https://github.com/0xunkn0wn4m1r/data_engineering_banking_project

🏦 Build a complete data engineering workflow for a banking system, showcasing ETL processes, data transformations, and an interactive financial dashboard.

automation data-analysis data-cleaning data-science feature-engineering fintech-bank flask-api loan-default-prediction machine-learning mlops model-explainability numpy postgresql scikit-learn segmentation shap sql unsupervised-learning

Last synced: 09 Apr 2026

https://github.com/alexgoodison/boxbox

F1 Race Visualiser & Overtake Prediction Model 🏎️

fastapi keras nextjs scikit-learn

Last synced: 09 Apr 2026

https://github.com/ranimeshehata/feed-forward-neural-network-on-mnist

A PyTorch-based project for classifying the MNIST dataset using Feed Forward Neural Networks, including training, validation, results and visualization.

feedforward-neural-network matplotlib mnist python3 pytorch scikit-learn torchvision

Last synced: 11 Apr 2026

https://github.com/mrmalik2512/catsvsdog.github.io

A CNN model integrated with flask backend the project is trained on image data of dogs and cats and integrated with a website predicts the given image is dog or a cat

deep-learning numpy python scikit-learn tensorflow

Last synced: 09 Apr 2026

https://github.com/lefteris-souflas/modern-slavery-analysis

Jupyter notebook using machine learning techniques to explore the complex drivers of modern slavery. Models from a research paper are replicated and evaluated . Actions also include filling missing data, training regression models, and analyzing feature importance.

decision-tree feature-importance grid-search-cv imputation jupyter-notebook lasso-regression linear-regression matplotlib mean-absolute-error numpy pandas preprocessing principal-component-analysis python3 random-forest ridge-regression scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/manishkumarpatel07/heartattack_risk_prediction

"Heart Attack Risk Prediction" uses machine learning to estimate the likelihood of a heart attack based on user-provided data like physical attributes, symptoms, and medical history. This system enables remote screening, identifying high-risk individuals, and easing medical system burdens by providing early, data-driven health risk assessments.

boruta knn-algorithm matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/idaraabasiudoh/credit_card_fraud_detection

This repository contains a machine learning project focused on detecting credit card fraud using Decision Tree and Support Vector Machine (SVM) classifiers.

data-analysis jupyter-notebook machine-learning python3 scikit-learn snapml

Last synced: 19 Feb 2026

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/ashishsingh789/bcg_virtual_internship

This repository showcases my BCG X virtual internship project on customer churn analysis for PowerCo, covering business understanding, EDA, feature engineering, and modeling using Python and machine learning.

data-manipulation data-science dataanalysis datavisualization eda machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/kasraskari/tumor-predict

Streamlit app for predicting tumor malignancy using logistic regression.

logistic-regression machine-learning numpy pandas python scikit-learn streamlit tumor-detection

Last synced: 09 Apr 2026

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/kuldeep-gif/interactive-gesture-speech-system

An interactive AI system that translates real-time hand gestures into audible speech and converts spoken words into visual gestures using OpenCV and MediaPipe.

computer-vision gesture-recognition hci machine-learning mediapipe opencv python scikit-learn speech-recognition

Last synced: 09 Apr 2026

https://github.com/jianninapinto/bandersnatch

This project implements a machine learning model using Random Forest, XGBoost, and Support Vector Machines algorithms with oversampling and undersampling techniques to handle imbalanced classes for classification tasks in the context of predicting the rarity of monsters.

altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost

Last synced: 29 Sep 2025

https://github.com/macromrit/air-flick

Transfer files through the air with just a gesture. Push. Pull. Done.

css cv2 fastapi html js media-pipe peer2peer python random-forest-classifier restful-api scikit-learn websockets

Last synced: 09 Apr 2026

https://github.com/anusha-me/customer_churn_analysis

Predict and analyze telecom customer churn using machine learning techniques and business dashboards. This end-to-end project includes data preprocessing, EDA, model evaluation (SVM, XGBoost), real-time Streamlit deployment, and Power BI dashboard reporting. Built for actionable insights and decision support.

churn-prediction classification-model customer-analytics dashboard data-science eda machine-learning powerbi predictive-analytics python scikit-learn streamlit svm telecom xgboost

Last synced: 29 Apr 2026

https://github.com/subhas-pramanik-09/mediscan-ai

A smart and scalable ML-powered health prediction system that can help detect the risk of three major diseases: Diabetes + Heart Disease + Parkinsons Disease

jupyter-notebook logistic-regression machine-learning numpy pandas scikit-learn streamlit svm-classifier

Last synced: 09 Apr 2026

https://github.com/omdoshi13/pricing-of-laptops-using-ml

Data Analysis, training Machine Learning models, and Model Evaluation and Refinement for Pricing of Laptops dataset.

data-analysis data-analysis-project datascience google-colab jupyter-notebook machine-learning matplotlib model-evaluation model-refinement numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026

https://github.com/pejpero/machine_learning

This repository contains two comprehensive machine learning projects using scikit-learn, demonstrating ensemble learning with a Voting Classifier and the comparison of linear and polynomial regression models on different datasets.

ensemble-learning linear-regression logistic-regression machine-learning polynomial-regression random-forest scikit-learn svm

Last synced: 09 Feb 2026

https://github.com/praditaw/patient-los-prediction

Predicting patient Length of Stay (LoS) using machine learning to provide insights for hospital operational efficiency.

exploratory-data-analysis feature-engine healthcare-analysis huggingface-spaces hyperparameter-tuning length-of-stay los-prediction machine-learning pandas scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/gerardo1909/proyecto_nba_mvp

Trabajo práctico final de la materia "Introducción al Aprendizaje Automático" de la Licenciatura en Ciencia de Datos (UNSAM). 2C-2023

machine-learning nba notebooks-jupyter pandas python random-forest scikit-learn

Last synced: 03 Oct 2025

https://github.com/impesud/ai-finops-platform

AI FinOps is an AI-powered platform for cloud cost optimization and forecasting. Built with FastAPI, Python, and modern MLOps tools, it allows teams to track multi-cloud usage, detect anomalies, and predict future expenses using real-time data and machine learning.

aws docker fastapi jupyter mlflow python react scikit-learn statsmodels tailwindcss terraform xgboost

Last synced: 09 Apr 2026

https://github.com/parag000/content-based-movie-recommender

This project builds a content-based movie recommendation system using the TMDB dataset. By combining metadata features like cast, genres, and directors into a "metadata soup," it calculates movie similarity with vectorizers (Count) and cosine similarity. Ideal for learning content-based filtering and text vectorization techniques.

cosine-similarity countvectorizer recommendation-system scikit-learn tfidf-vectorizer vectorization

Last synced: 18 Apr 2026

https://github.com/svetlanam/pycon-workshop

Pycon CZ workshop: Better data analyses and product recommendations with Instagram data

data-analysis data-science martinus matplotlib pandas pycon2016 pyconcz python scikit-learn workshop

Last synced: 09 Apr 2026

https://github.com/ayan6943/employee-attrition-prediction-with-machine-learning

Employee Attrition Prediction with Machine Learning | Analyzing HR data to predict employee turnover using Random Forest. Includes EDA, feature engineering, model training, and evaluation. Achieved 90% accuracy.

attrition employee machine-learning matplotlib numpy pandas python randomforestclassifier scikit-learn seaborn smote

Last synced: 09 Apr 2026

https://github.com/al-shafi-github/deephatedetect-explainable-bengali-abusive-comments-classification-using-transformers-and-llm

This Project aims to train different models that can detect Bengali hate speech on different social media platforms and do a comparative analysis of the models

bangla-nlp nlp nlp-machine-learning python3 regex scikit-learn scikitlearn-machine-learning tabular-data

Last synced: 01 May 2026

https://github.com/jalijuhola/amazon-textual-reviews-recommender-

predicting score and recommending using amazon textual reviews

numpy pandas python scikit-learn typescript

Last synced: 09 Apr 2026

https://github.com/chengetanaim/customerpersonalityanalysis

Customer Personality Analysis involves a thorough examination of a company's optimal customer profiles. This analysis facilitates a deeper understanding of customers, enabling businesses to tailor products to meet the distinct needs, behaviors, and concerns of various customer types

kmeans-clustering pandas scikit-learn

Last synced: 21 Apr 2026

https://github.com/dragonscypher/feastfinderai

Discover the best dining spots with FeastFinderAI!

folium pandas python scikit-learn sql

Last synced: 09 Apr 2026

https://github.com/ifigeneiatsiflidou/applied-statistics-project

Project for an Applied Statistics course, involving exploratory data analysis and predictive modeling of movie revenue using engineered features and multiple linear regression.

correlation-analysis data-analysis linear-regression python scikit-learn visualization

Last synced: 29 Apr 2026

https://github.com/ravi0529/e-commerce-annual-spend-model

A basic Linear Regression model for predicting annual customer's spending

jupyter-notebook linear-regression matplotlib numpy pandas python scikit-learn scipy

Last synced: 09 Apr 2026

https://github.com/bkaracali/crime-data-analysis

Repository for Final Project

machine-learning python scikit-learn

Last synced: 21 Apr 2026

https://github.com/sk-g/mnist_beginners

Model search in traditional machine learning algorithms (non DL) and DL starter codes on MNIST dataset. This is a good starter code for beginners trying to learn about curse of dimensionality, overfitting and other concepts in general

keras machine-learning machine-learning-algorithms mnist mnist-beginners mnist-classification mnist-dataset numpy overfitting python pytorch pytorch-implmention resnet resnet-50 scikit-learn scikitlearn-machine-learning sklearn tensorflow

Last synced: 09 Apr 2026

https://github.com/nazmul-1117/100-days-of-machine-learning

I'm Nazmul so exited to start a new journey to learn 100 Days of Machine Learning. It's February 8, 2025. I'm so exited, let's see what happened insha'Allah

data-science machine-learning numpy pandas-dataframe python3 scikit-learn statistics

Last synced: 11 Aug 2025

https://github.com/hariprasath-v/hackerearth-amazon-business-research-analyst-hiring-challenge

Build a machine learning model that can calculate the time the delivery person takes to deliver the order.

exploratory-data-analysis hackerearth machine-learning pandas pycaret python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/abdellatif-laghjaj/salary-scope-predictor

SalaryScope: Job Salary Predictor is a machine learning solution designed to estimate salaries from job listings. It employs a full ML pipeline from exploratory data analysis, data cleaning, and NLP on job descriptions to regression model training (Linear Regression, Random Forest, etc.) and hyperparameter tuning

data-science developer-survey feature-engineering machine-learning predictive-modeling regression salary-calculator salary-prediction scikit-learn streamlit

Last synced: 08 May 2026

https://github.com/bhuvan-s-prasad/-alzheimer-diagnosis

This project predicts Alzheimer’s disease using machine learning with basic MLOps integration for better organization and reproducibility. It includes data processing, model training, evaluation, and deployment, incorporating version control, automation, and experiment tracking as a first step into MLOps.

alzheimers-disease classification eda explainable-ai exploratory-data-analysis machine-learning mlops pandas python random-forest random-forest-classifier regression scikit-learn supervised-learning

Last synced: 09 Apr 2026

https://github.com/ezeparziale/tweet-clasification

:bird: Tweet sentiment analysis

bootstrap flask nltk python scikit-learn

Last synced: 09 Apr 2026

https://github.com/prakashjha1/customer-segmentation

This repository contains a customer segmentation project implemented in a Jupyter Notebook using Python. Customer segmentation is a crucial strategy for businesses aiming to understand their customer base better, enabling targeted marketing strategies and personalized customer experiences.

clustering-algorithm customer-segmentation kmeans-clustering matplotlib python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/eusha425/housing-market-analysis

Implementation of supervised learning algorithms for real estate price prediction, featuring Ridge Regression optimization, IQR-based outlier detection, and extensive feature engineering. Includes detailed visualizations, statistical analysis, and model performance comparisons using various evaluation metrics.

data-preprocessing data-science exploratory-data-analysis house-price-prediction machine-learning python scikit-learn supervised-learning

Last synced: 09 Apr 2026

https://github.com/amandeep-gupta19/chatbot

Created a custom chatbot using Langchain. Here's a summary of what I did: Data Extraction: I gathered data about technical courses from the Brainlox website using Langchain’s URL loaders. Embedding Creation & Storage: I converted this data into embeddings and stored it in a vector store for efficient searching. API Development: I built a Flask

data-extraction faiss-vector-database flask-restful langchain numpy scikit-learn vector-database webbaseloader

Last synced: 09 Apr 2026

https://github.com/moritzkoerber/text_analysis_app

A web app that classifies the content of messages that are usually sent during disasters such as earthquakes.

flask machine-learning nltk python scikit-learn

Last synced: 09 Apr 2026

https://github.com/nicolascoiado/mulheres-ti

Este repositório contém um código em Python para analisar a evolução do número de mulheres na área de Tecnologia da Informação (TI) ao longo dos anos. Utilizando pandas para manipulação de dados e scikit-learn para criar um modelo de regressão linear, o objetivo é prever quantas mulheres estarão na TI em 2024 com base em dados históricos.

linear-regression matplotlib pandas python python3 scikit-learn

Last synced: 09 Apr 2026

https://github.com/alphacrypto246/old-car-price-prediction

The Old Car Price Prediction project predicts used car prices using features like age, mileage, and fuel type. It includes data preprocessing, model training, and visualization of trends, with easy customization for additional features or models.

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 09 Apr 2026