An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/jingjing-jin/purchase-behavior-analysis

Purchase Behavior Analysis for Targeted Customer Segmentation

clustering-algorithm data-mining machine-learning python scikit-learn

Last synced: 20 Jan 2026

https://github.com/alpha597/music_classification_ml

A project that compares the accuracy of different machine learning algorithms at music genre classification on a large dataset.

machine-learning pandas python scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/simranjeet97/spam-classification

Spam Classification Using Natural Language Processing (NLP), Scikit-Learn Library, and Bayesian Method.

data-science emails kaggle kaggle-dataset naive-bayes-classifier nlp-machine-learning nltk-python python scikit-learn spam-classification

Last synced: 11 Apr 2026

https://github.com/vasu7052/spam-classifier

This is a machine learning project that detects whether a given sentence is spam, using Python and Keras.

keras keras-neural-networks python3 scikit-learn spam-classification tensorflow

Last synced: 11 Apr 2026

https://github.com/karimosman89/resume-screening

Screen resumes to identify the best candidates. Build a machine learning model that screens resumes and ranks candidates based on job descriptions. Streamline the hiring process for HR departments by automating candidate screening.

machine-learning-algorithms nlp-machine-learning nltk-python python scikit-learn spacy text-processing

Last synced: 29 Apr 2026

https://github.com/zohaib-cheema/defacto

DeFacto is a machine learning-based tool that classifies fake news articles using a hybrid model built with Scikit-learn, TensorFlow, and Keras. The system analyzes social and political content to detect deception in news stories and social media posts, providing a reliable solution to address the growing issue of misinformation.

flask git keras numpy pandas r scikit-learn tensorflow

Last synced: 07 Apr 2026

https://github.com/ankitjha2202/sentiment_analysis

A simple web application that performs sentiment analysis using logistic regression to predict whether a given text has a positive, negative or neutral sentiment.

classification logistic-regression nlp scikit-learn sentiment

Last synced: 28 Mar 2025

https://github.com/hmasdev/ssbgm

Score Based Generative Model with scikit-learn

generative-model scikit-learn

Last synced: 17 May 2026

https://github.com/anuragkush2527/vibesync-3.0

Sentiment analysis in social media involves using natural language processing (NLP) and machine learning to analyze users' opinions, emotions, and attitudes expressed in posts, comments, and reviews. It helps in understanding public sentiment, monitoring trends, and making data-driven decisions.

expressjs fastapi mongodb nltk nodejs numpy pandas python reactjs scikit-learn sentiment-analysis tensorflow

Last synced: 16 Oct 2025

https://github.com/ledsouza/nlp-article-classification

This project aims to develop a machine learning model capable of classifying news articles into different categories based on their titles. Two different word embedding models (CBOW and Skip-gram) are trained and used to vectorize the article titles. These vectorized representations are then used to train a Logistic Regression classifier.

gensim-word2vec natural-language-processing nlp nlp-machine-learning pandas python scikit-learn spacy spacy-nlp

Last synced: 11 Apr 2026

https://github.com/nk-works/creditflow-ai

CreditFlow AI predicts loan defaulters using Artificial Neural Networks (ANNs). This model uses historical loan data to predict the likelihood of default for new loan applications.

ai artificial-neural-networks deep-learning jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn seaborn tensorflow

Last synced: 24 Jun 2025

https://github.com/mathealgou/ml-jobs

This project is a machine learning exercise. The application receives a set of skills from the user and returns a job title that matches the skills entered. It uses the Random Forest algorithm to make the prediction based on a jobs dataset.

machine-learning python random-forest-classifier scikit-learn

Last synced: 11 Sep 2025

https://github.com/ahmed-maher77/signlink___graduation-project

AI-Powered Sign Language Translator | A web and mobile app that bridges communication gaps for the deaf and hard-of-hearing community by translating English and Arabic sign language into real-time text and speech, and converting spoken words into text during video calls.

csharp fastapi firebase-realtime-database flutter framer-motion javascript microsoft-dot-net-technologies numpy opencv python pytorch reactjs scikit-learn scss-framework sign-language-recognizer sign-language-translation sql-server tailwindcss webrtc websockets

Last synced: 07 Apr 2026

https://github.com/marktheo/bike-sharing-demand

Jupyter Notebook - Predicting bike rental numbers based on climate and temporal data

decision-tree-classifier decision-tree-regression jupyter-notebook machine-learning scikit-learn

Last synced: 18 May 2026

https://github.com/bhazel/dockerfiles

Some Dockerfiles for working with specific technologies or learning resources.

docker dockerfile ocaml python rails ruby scikit-learn

Last synced: 10 Apr 2026

https://github.com/prarthana-singh/heart-attack-prediction-model

A Machine Learning model that predicts the risk of a heart attack based on health parameters like cholesterol levels, blood pressure, BMI, smoking habits, and age. Built using Classification models, Scikit-Learn, Pandas, and Python.

classification data-analysis data-science heart-attack-prediction logistic-regression machine-learning numpy pandas python scikit-learn

Last synced: 25 Jun 2025

https://github.com/aryanpillai2007/credit-card-fraud-detection

The primary goal of this project is to develop a comprehensive fraud detection system that enhances the security and trustworthiness of financial transactions.

anomaly-detection classification credit-card-fraud data-preprocessing data-science data-visualization fraud-detection imbalanced-data logistic-regression machine-learning outlier-detection pca pca-analysis python roc-curve scikit-learn

Last synced: 18 May 2026

https://github.com/akshaypatra/cardiovascular_disease_detection

AI-driven ECG classification model that detects cardiovascular abnormalities such as arrhythmia and atrial fibrillation using a hybrid CNN-LSTM deep learning approach.

keras matplotlib numpy pandas python3 scikit-learn seaborn tensorflow wfdb

Last synced: 14 Apr 2026

https://github.com/kanika300393/loan_prediction

This project implements a Loan Prediction system using Support Vector Machine (SVM). It includes data preprocessing, visualization of features like income and education, and model evaluation. The goal is to predict loan approval based on the dataset. Clone the repo to explore the code and improve the model.

data-science machine-learning numpy pandas python scikit-learn svm-classifier

Last synced: 09 Apr 2026

https://github.com/simon2k/stock-price-prediction-evaluation

This project is intended to present a small evaluation of different types of regression models for predicting stock prices for AAPL.

evaluation machine-learning numpy pandas predicting-stock-prices scikit-learn

Last synced: 07 Apr 2026

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 02 Jan 2026

https://github.com/satheesh-meadi/real_time_financial_risk_dashboard

Financial Risk Analysis Dashboard 🚀. An interactive Streamlit dashboard designed for analyzing and visualizing portfolio performance. Features include CAPM analysis, portfolio optimization, efficient frontier visualization, and real-time stock data to help optimize investments.

numpy pandas plotly plotly-express python3 scikit-learn streamlit yfinance

Last synced: 05 Apr 2026

https://github.com/capsuleismail/drybeanuci

Data Science Project with Model comparison.

datascience jupyter-notebook machinelearning-python scikit-learn

Last synced: 18 May 2026

https://github.com/sbera01/credit-card-approval-predictor

End-to-end Machine Learning project to predict credit card approval decisions using real-world financial features. Includes EDA, model training, and deployment-ready architecture

credit-card-approval-prediction data-analysis machine-learning python scikit-learn streamlit

Last synced: 24 Dec 2025

https://github.com/adhishnanda/motion-based-german-learning-app

AI-powered language learning app with gesture recognition (MediaPipe + ML/DL models), real-time interaction, spaced repetition, and full React/TypeScript UI. Demonstrates ML engineering, computer vision, and frontend expertise.

capstone-project computer-vision data-science deep-learning gesture-recognition interactive interactive-learning machine-learning mediapipe portfolio-project pose-estimation react scikit-learn tensorflow typescript

Last synced: 07 Apr 2026

https://github.com/ramyacp14/sentimentanalysis

Implements a sentiment analysis model to determine the emotional tone behind text, helping understand attitudes, opinions, and emotions in online mentions.

machine-learning natural-language-processing nltk numpy pandas python scikit-learn

Last synced: 07 Apr 2026

https://github.com/jerinpious/house-price-prediction

This project is a machine learning-based application to predict house prices. A frontend interface has been developed using Streamlit to make the prediction process user-friendly for regular customers. The project is structured

data-analysis data-engineering data-science eda machine-learning pandas python random-forest scikit-learn streamlit

Last synced: 05 Apr 2026

https://github.com/dadvaiahpavan/ats-system

This AI-driven Applicant Tracking System (ATS) is a cutting-edge solution designed to revolutionize the recruitment process by providing intelligent resume analysis and matching capabilities.

google-generativeai nltk pandas plotly python-docx scikit-learn spacy streamlit

Last synced: 05 Apr 2026

https://github.com/jainish-prajapati/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 30 Dec 2025

https://github.com/myahninsi/credit_card_fraud_detection

This repository is for the Neural Networks and Deep Learning Course - Assignment 1, focusing on credit card fraud detection. The project utilizes a machine learning model to predict whether a transaction is fraudulent using a synthetic credit card dataset.

matplotlib numpy pandas pickle python scikit-learn seaborn streamlit

Last synced: 09 Apr 2026

https://github.com/jupitvq/simple-uib-assistant

A simple machine learning-based chatbot that helps students by providing information about academics and administration at UIB.

chatbot machine-learning scikit-learn virtual-assistant

Last synced: 05 Apr 2025

https://github.com/abdulshaikh55/ml-involuntary-denied-boarding

A machine learning model that predicts whether you will be denied boarding on your flight.

first-timers ipynb machine-learning scikit-learn

Last synced: 29 Apr 2026

https://github.com/karthikarajagopal44/data-analysis-using-python-libraries-

The COVID-19 pandemic has significantly impacted India, necessitating a detailed analysis of the virus’s spread within the country. In this project, we explore an India-specific COVID-19 dataset, leveraging Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-cleaning data-visualization matplotlib numpy pandas python python3 scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/yuji1702/ai--powered-triage-system

This project implements a machine learning-based triage system for emergency rooms, which classifies patients based on their symptoms and vitals using a Random Forest Classifier. The system features real-time patient data integration, a user-friendly GUI built with Tkinter, and secure patient data encryption using Fernet from the cryptography lib

cryptography data-imputation data-preprocessing data-security encryption gui healthcare machine-learning matplotlib medical-data python random-forest realt-time scikit-learn seaborn tkinter triage-system

Last synced: 05 Apr 2025

https://github.com/akansharajput280799/data-driven-insights-into-job-satisfaction-and-compensation-trends

This project analyzes 2020 employee data to identify factors influencing job satisfaction, performance, and salary differences, offering insights for improving engagement and workplace strategies.

cluster-analysis colab-notebook data-cleaning descriptive-statistics factor-analysis hypothesis-testing jupyter-notebook matplotlib python scikit-learn seaborn t-test visualization

Last synced: 18 Apr 2026

https://github.com/fahrettinsolak/ai-map-based-geographic-clustering-project

This project focuses on clustering crime incidents in San Francisco using the K-Means algorithm. The dataset is obtained from Kaggle and contains information about crime types, geographical coordinates, and other relevant features. The goal is to identify crime hotspots through geographic clustering and visualize the clusters on an interactive map.

artificial-intelligence deep-learning elbow-method jupyter-notebook machine-learning numpy openstreetmap pandas phyton plotly scikit-learn standardscaler

Last synced: 05 Apr 2026

https://github.com/venky-1710/superhero-recruitment

Superhero Recruitment System predicts hero selection using machine learning. Users input hero attributes through a web interface. A Random Forest model analyzes abilities, strengths, weaknesses, success rates, and missions completed. The Flask app displays results, showing if a hero is selected.

css flask html numpy pandas python scikit-learn

Last synced: 07 Apr 2026

https://github.com/josugoar/digit-recognizer

Digit recognizer full stack web app and classifier

flask jquery opencv scikit-learn

Last synced: 12 Sep 2025

https://github.com/tapas-gope/telecommunication-customer-churn

This project involves predicting customer churn in a telecommunications company using machine learning techniques, exploring various features' impact, optimizing models, and identifying key factors influencing churn.

feature-engineering matplotlib-pyplot model-evaluation-and-validation numpy pandas python scikit-learn

Last synced: 12 Sep 2025

https://github.com/narendhiran-dev/predictive-analytics-for-repayment-predictions

A machine learning API built with Python, FastAPI, and Scikit-learn to predict borrower repayment risk based on historical payment data. A FinTech risk assessment system that uses a Random Forest model to predict a borrower's future repayment behavior and serves the prediction via a REST API.

data-science fastapi fintech loan-prediction loan-prediction-analysis machine-learning machine-learning-algorithms predictive-modeling python random-forest random-forest-classifier random-forest-regression rest-api risk-assessment scikit-learn scikit-learn-api scikit-learn-python

Last synced: 13 Apr 2026

https://github.com/sabin74/loan_approval_prediction

This project predicts whether a loan application will be approved or not using machine learning classification models. The dataset used is from Kaggle’s Loan Prediction problem. The goal is to build a robust model to assist banks or financial institutions in making automated loan approval decisions.

classification-models kaggel-dataset loan-approval-prediction matplotlib-seaborn pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/nickklos10/seriea_machine_learning_predictions_2025

This project involves scraping data, processing the data, and building machine learning models to predict the standings for the 2024-2025 Serie-A season.

beatifulsoup data-scraping keras matplotlib pandas scikit-learn shap tensorflow

Last synced: 13 Apr 2026

https://github.com/sneha1012/ml-dl

Implementing concepts and algorithms from scratch.

deep-learning machine-learning matplotlib numpy-tutorial scikit-learn

Last synced: 18 May 2026

https://github.com/altescy/xsklearn

Expanded scikit-learn for my research

python scikit-learn

Last synced: 21 Mar 2025

https://github.com/tanaybhadula/ml-preprocessing-cli

A Python CLI tool that preprocesses datasets for supervised learning to save users time. Input data can be preprocessed with simple commands, and the preprocessed dataset can be downloaded afterwards.

cli data-cleaning data-preprocessing machine-learning pandas python scikit-learn

Last synced: 10 May 2026

https://github.com/konnik88/heart-disease-ml-practice

Practice notebook on heart-disease risk with a small/noisy dataset: EDA → preprocessing → classic ML baselines (scikit-learn). Not for clinical use

classification eda healthcare heart-disease imbalanced-data jupyter-notebook machine-learning model-evaluation optuna reproducibility scikit-learn

Last synced: 18 May 2026

https://github.com/h00n24/ikr

Classification and Recognition - project

fit ikr scikit-learn vutbr

Last synced: 18 May 2026

https://github.com/rohansoni45/movie-recommendation-system

This project is a content-based recommender system that suggests movies to users based on their preferences and watch history. The system uses cosine similarity to find and recommend movies similar to a selected title. It is built using Python and libraries like Pandas, NumPy, and Scikit-learn.

content-based-filtering cosine-similarity data-analysis data-science machine-learning numpy pandas python recommender-system render scikit-learn

Last synced: 17 Apr 2026
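
The cosine-similarity idea described in the entry above can be illustrated with a minimal sketch. This is not the repository's code: the toy catalog, the genre weights, and the `recommend` helper are hypothetical, and the sketch uses only the Python standard library rather than Pandas or Scikit-learn so that it runs without dependencies.

```python
import math

# Hypothetical toy catalog: each title maps to made-up genre weights.
MOVIES = {
    "Space Saga": {"sci-fi": 1.0, "adventure": 0.8},
    "Galaxy Quest II": {"sci-fi": 0.9, "comedy": 0.5},
    "Love in Paris": {"romance": 1.0, "comedy": 0.4},
}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse feature vectors (dicts)."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, catalog, top_n=2):
    """Return the top_n other titles ranked by similarity to `title`."""
    target = catalog[title]
    scores = [
        (other, cosine_similarity(target, features))
        for other, features in catalog.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


if __name__ == "__main__":
    for name, score in recommend("Space Saga", MOVIES):
        print(f"{name}: {score:.2f}")
```

In a real project the feature vectors would more likely come from vectorized text metadata, and a library routine such as scikit-learn's `sklearn.metrics.pairwise.cosine_similarity` could replace the hand-written function.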

https://github.com/kejiahp/fastapi-ecom-recommendation-system

Advanced recommendation system for e-commerce applications.

docker fastapi jinja2 mongodb motor pydantic python scikit-learn scikit-surprise

Last synced: 07 Apr 2026

https://github.com/hrolive/disaster-response-pipeline

A machine learning pipeline that categorizes disaster-related messages so that they can be sent to the appropriate disaster relief agency.

flask machine-learning natural-language-processing nltk pandas plotly python scikit-learn sql sqlalchemy

Last synced: 07 Apr 2026

https://github.com/swetshaw/machine-learning-a-z

Contains all tutorials based on the Udemy course Machine Learning A-Z.

machine-learning python scikit-learn udemy-machine-learning

Last synced: 07 Apr 2026

https://github.com/vhnegrisoli/machine-learning-linguagens-programacao

Data science and machine learning project analyzing programming languages from 2004 to 2021

data-science jupyter-notebook machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/smpotts/student-performance-predictions-ml

Creates machine learning models to predict students' learning outcomes.

jupyter-notebook machine-learning python regression-models scikit-learn

Last synced: 12 Sep 2025

https://github.com/iamriteshkoushik/skrun

18hrs Scikit Learn Course Speedrun Repo

freecodecamp machine-learning scikit-learn

Last synced: 26 Apr 2026

https://github.com/aadrianleo/fashion-style-classifier

A machine learning and deep learning pipeline for fashion image classification. Combines real-world data, manual annotation, and both KNN and EfficientNet-B0 CNN models to classify images into style categories. Includes data cleaning, augmentation, model training, evaluation, and reproducible notebooks.

classification-report cnn computer-vision confusion-matrix data-augmentation data-preprocessing deep-learning efficientnet exploratory-data-analysis fashion-classification image-classification knn label-studio machine-learning model-evaluation pytorch real-world-data reproducible-research scikit-learn transfer-learning

Last synced: 11 May 2026

https://github.com/veranyagaka/credit-card-fraud-detection

Credit Card Fraud Detection using data preprocessing, analysis, visualization, and machine learning to accurately identify fraudulent transactions. Final project.

ai anomaly-detection classification credit-card-fraud-detection machine-learning scikit-learn supervised-learning

Last synced: 18 May 2026

https://github.com/callesjuan/ninjalprm

Prototype of a tool for clustering Android devices by geolocation (Server)

python scikit-learn xmpp

Last synced: 20 Jan 2026

https://github.com/bhaveshbhakta/crop-yield-prediction

Indian Crop Yield Prediction Using Machine Learning

flask machine-learning python random-forest scikit-learn webdevelopment

Last synced: 20 Apr 2026

https://github.com/yugalsoni18/counterfeit_review_detection

Fake review detection using TF-IDF & SVM (AUC 0.98), plus Counterfeit Risk Score with clustering & anomaly detection.

business-analytics fraud-detection isolation-forest kmeans nlp python risk-scoring scikit-learn svm tfidf

Last synced: 18 May 2026

https://github.com/pradipnp/decisiontree-iris

Machine learning project to classify iris flowers using a decision tree

classification decision-tree iris-dataset machine-learning python scikit-learn

Last synced: 18 May 2026

https://github.com/sudarshanc00/brain-tumor-classification

This project uses a deep learning model in PyTorch to classify brain MRI images into four tumor types, aiding early diagnosis and treatment planning. Two ResNet-based models were developed and optimized, achieving high accuracy to support healthcare professionals in identifying tumor categories.

matplotlib numpy pytorch resnet scikit-learn streamlit

Last synced: 10 Apr 2026

https://github.com/saniyaacharya04/resume-scanner-using-nlp

A live resume scanning and ranking tool built with Python, Streamlit, and NLP. Upload resumes, match them to job descriptions, and generate analytics dashboards and PDF reports.

dashboard job-matching nlp pdf-parser resume-scanner scikit-learn spacy streamlit transformers

Last synced: 03 May 2026

https://github.com/martinkersner/kmeans-meetup

Presentation about k-Means for Seoul AI Meetup on July 22, 2017.

kmeans numpy python scikit-learn

Last synced: 03 May 2026

https://github.com/sanalislokuge/breast-cancer-ml-prediction

Machine Learning project using classification, regression, and ensemble techniques to predict breast cancer mortality status and survival months using clinical data. Built with scikit-learn, decision trees, logistic regression, and Naïve Bayes. Includes detailed model evaluation, data preprocessing, and interpretability.

classification data-science decision-tree ensemble-learning healthcare-analytics machine-learning ml models naive-bayes-classifier predictive-modeling regression scikit-learn

Last synced: 19 May 2026

https://github.com/anty-filidor/cyberbullying-detector

NLP bullying detector for tweets, with an ML model training pipeline, deployed as a web app with CI/CD

deployment-system flask-api machine-learning nlp python scikit-learn

Last synced: 19 May 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/arrhythmia-detection/authorfeatureextracteddecisiontreeesp32s3

Deploys a vanilla non-optimized Decision Tree for Arrhythmia classification using Chapman ECG dataset on ESP32-S3 dev kit

arrhythmia-classification decisiontreeclassifier eloquent esp32-arduino esp32-s3 scikit-learn

Last synced: 19 May 2026

https://github.com/huucanh0511/startup-profitability-prediction

This project predicts startup profitability using Logistic Regression and Random Forest, analysing financial (funding amount, funding rounds, revenue), market (market share), and operational (startup age, employee count) factors. It evaluates AUC, accuracy, precision, recall, and F1-score, addressing underfitting, overfitting, and feature selection

ai-for-finance data-science financial-modelling logistic-regression machine-learning predictive-analytics python random-forest scikit-learn startup-analysis

Last synced: 19 May 2026

https://github.com/medicharlakarthik/credit-card-fraud-detection

Credit Card Fraud Detection using machine learning to distinguish fraudulent transactions from legitimate ones. This project includes data analysis, model training, and evaluation to achieve high accuracy and recall, minimizing false negatives for better fraud detection

machine-learning python random-forest-classifier scikit-learn

Last synced: 12 Apr 2026

https://github.com/subratamondal1/machine-learning

Machine Learning Notes with tools like Numpy, Pandas, Scikit-Learn.

machine-learning numpy pandas scikit-learn

Last synced: 10 Apr 2026