An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/antim21/spamsense-ai

Classifying emails into Spam or Not Spam categories using Machine Learning techniques

machine-learning nlp python scikit-learn

Last synced: 04 May 2026

https://github.com/analitico-771/creditworthiness_classification_model

An application that trains a model using supervised learning and the imbalanced-learn library to classify the creditworthiness of borrowers.

artificial-intelligence credit-risk fintech imbalanced-learning machine-learning python quantitative-finance scikit-learn supervised-machine-learning

Last synced: 04 May 2026

https://github.com/kohlerhector/trex-tree-reward-exploration

Uses tree estimators of the MDP model, then counts the leaves that group similar transitions to perform count-based exploration.

decision-trees drl exploration rl scikit-learn stable-baselines3

Last synced: 04 May 2026

https://github.com/franpog859/titanic-competition

❄️🚢 Machine Learning project workflow reference. The model predicts whether given people survive the Titanic disaster based on, among other things, their age, sex, and names.

classification data-science kaggle machine-learning scikit-learn titanic workflow

Last synced: 05 May 2026

https://github.com/anupam0202/contextual-rag-chatbot

Contextual RAG Chatbot that processes PDF documents using the Google Gemini API

google-generativeai numpy pypdf2 scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/sigilbyte/choquet-classifier

Implementation of the Choquet classifier using the scikit-learn API design.

machine-learning regression regression-models scikit-learn scikitlearn-machine-learning

Last synced: 05 May 2026

https://github.com/codenexa/nairobi

Quantifying Integrity in the Digital Age: misinformation spreads rapidly, accountability often falters, and the lines between transparency and manipulation blur.

csv ipynb-jupyter-notebook matpotlib pkl-model python scikit-learn

Last synced: 05 May 2026

https://github.com/rakshit-vasava/predictive-analytics-for-insurance-purchase

Predicting customer insurance purchases using stacking models and SMOTE for the Homesite Quote Conversion Problem on Kaggle.

k-nearest-neighbours kaggle-competition multilayer-perceptron python random-forest scikit-learn smote support-vector-machines

Last synced: 05 May 2026

https://github.com/tromesh/sinhala-parser

The Sinhala parser project is based on Natural Language Processing (NLP).

flux-architecture natural-language-processing nlp python react scikit-learn sinhala

Last synced: 05 May 2026

https://github.com/pngo1997/yelp-business-recommender-system

Building an item-based collaborative recommendation system using embeddings for establishments from the Yelp dataset.

content-based-recommendation embeddings geo-mapping geospatial information-retrieval python recommender-system scikit-learn spacy

Last synced: 05 May 2026

https://github.com/myounus-codes/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

In this project I have cleaned the data for the model. Project Google Colab Link: https://colab.research.google.com/drive/1vQY-XEFJSdEkW2PQOSf1j13Yk8L-XXNw?usp=sharing

algorithms data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 05 May 2026

https://github.com/grachale/predict_life_expect

Predicting life expectancy (regression) using a custom random forest, linear regression, and the decision tree regressor from scikit-learn.

decision-tree-regression jupyter-notebook linear-regression pandas python random-forest regression scikit-learn

Last synced: 05 May 2026

https://github.com/intscription/python-programs

Python, from basics to advanced

numpy pandas scikit-learn

Last synced: 05 May 2026

https://github.com/sarthak-1408/rain-fall-prediction

This repository contains an end-to-end machine learning project (rainfall prediction in Australia).

heroku heroku-deployment machine-learning numpy pandas rain-fall rain-fall-prediction scikit-learn xgboost-algorithm

Last synced: 05 May 2026

https://github.com/jordandeklerk/pygridge

A scikit-learn compatible Python package for data-driven group regularized ridge regression

python regression regularized-regression scikit-learn

Last synced: 05 May 2026

https://github.com/joaoassalim/class-by-description-classifier-with-nlp

Enhancing Item Classification through Natural Language Processing: Leveraging Text Descriptions for Precise Categorization

bert fine-tuning nlp nlp-machine-learning scikit-learn sklearn tensorflow

Last synced: 06 May 2026

https://github.com/drcbeatz/machine-learning-tool

Machine Learning Tool - Train and test supervised ML algorithms (incl. binary classification and regression) on custom data sets and visualize your results without knowing how to code.

data-science data-visualization django machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/himendersharma0712/life_expectancy_pred

This repository is for a hackathon project.

jupyter-notebook machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/shubhranpara/heart-disease-predictor

I created this project as my Python term assignment. In it, I trained an ML model to predict heart disease using the scikit-learn library in Python.

google-colab jupyter-notebook machine-learning medical prediction-model python scikit-learn

Last synced: 06 May 2026

https://github.com/sandeepbalachandran/predictor

A collection of prediction algorithms for different purposes

collection jupyter-notebook machine-learning notebook predictor regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/varun-khorgade/cvinsight-ai-resume-analyzer

AI tool that analyzes resumes, extracts keywords, and matches them with job descriptions.

css django html5 nlp python scikit-learn textparse

Last synced: 06 May 2026

https://github.com/nurulashraf/ann-cancer-prediction

An Artificial Neural Network built with TensorFlow and Keras to predict breast cancer based on the Wisconsin Breast Cancer dataset.

artificial-neural-network breast-cancer-prediction deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/khaymanii/house-price-prediction-model

This model was built using Python and the XGBoost regression algorithm.

matplotlib numpy pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/elcorto/gp_playground

Explore selected topics related to Gaussian processes

gaussian-processes gpy gpytorch kernel-ridge-regression machine-learning scikit-learn tinygp

Last synced: 06 May 2026

https://github.com/mohammadvhossein/ml-gym

The ML-GYM repository showcases machine learning projects using scikit-learn, covering classification, regression, and clustering. It offers educational resources for beginners and practical examples for experienced users, complete with detailed instructions.

classification-algorithms clustering-methods cross-validation data-preprocessing data-science decision-trees feature-engineering machine-learning model-evaluation neural-networks python-programming random-forests regression-techniques scikit-learn supervised-learning unsupervised-learning

Last synced: 06 May 2026

https://github.com/omanshu209/ml-basics-2022

Machine learning (AI) models developed using the scikit-learn library in Python.

jupyter-notebook machine-learning python python3 scikit-learn

Last synced: 06 May 2026

https://github.com/glencrawford/matchmaker

A k-nearest neighbors machine learning project to perform similarity matching using a dataset of OkCupid dating profiles.

django machine-learning python scikit-learn scipy

Last synced: 06 May 2026

https://github.com/mpolinowski/isometric-mapping

Non-linear dimensionality reduction through Isometric Mapping

isomap matplotlib-pyplot python scikit-learn

Last synced: 06 May 2026

https://github.com/flexycode/ccmaclrl

🤖 This repository is intended for our Machine Learning CCMACLRL COM231ML by Professor Elizer Ponio Jr

artificial-intelligence linnear-regression machine-learning machine-learning-algorithms python random-forest scikit-learn supervised-learning tensorflow

Last synced: 07 May 2026

https://github.com/marksikaundi/handson-machinelearning

Complete Collection about Machine Learning

matplotlib pandas-python scikit-learn tensorflow

Last synced: 07 May 2026

https://github.com/cbjuan/paper-ijimai-ml-employability

Jupyter notebook developed to support the research presented in the paper "Proposing a machine learning approach to analyze and predict employment and its factors"

jupyter-notebook python research scikit-learn

Last synced: 07 May 2026

https://github.com/asut00/machine-learning-program_42ai

Comprehensive Machine Learning path by 42AI: hands-on modules on regression, gradient descent, and real-world ML applications.

linear-regression machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 07 May 2026

https://github.com/aymanmansur/insider-threat-detection-using-cert-dataset-logon-

Detecting anomalies in user logon behavior using the CERT Insider Threat Detection Dataset. This project extracts key features like session duration and logon frequency during non-working hours and applies Isolation Forest to identify suspicious activity.

matplotlib pandas python scikit-learn

Last synced: 07 May 2026

https://github.com/idaraabasiudoh/knn-customer-classification

Assigns telecommunications customers to groups to determine the service type each customer requires.

data-analysis jupyter-notebook machine-learning pyhton3 scikit-learn

Last synced: 07 May 2026

https://github.com/joseprsm/nectarine

🍑 Neural Enhanced Collaborative Tool for Automated Recommendation and INtelligent Exploration

argo-workflows recommender-systems scikit-learn tensorflow tensorflow-recommenders

Last synced: 07 May 2026

https://github.com/bala-1409/foreign-exchange-rate-time-series-data-science-project

This project will use time series analysis to forecast the exchange rate between the euro and the US dollar. The project will use a variety of statistical techniques, such as ARIMA to model the data and forecast the exchange rate.

data-analysis data-science data-visualization datapreprocessing eda exploratory-data-analysis forecasting machine-learning-algorithms model modelfitting predictive-modeling python3 scikit-learn statsmodels time-series time-series-analysis

Last synced: 07 May 2026

https://github.com/mehuaniket/blog-classifier

Blog classifier using a scikit-learn random forest.

bag-of-words blog-classifier python scikit-learn

Last synced: 07 May 2026

https://github.com/otuemre/realtimenids

Real-time network intrusion detection system using Zeek flow logs and machine learning (IsolationForest). Detects threats with both signature-based and anomaly-based techniques trained on the CSE-CIC-IDS2018 dataset.

anomaly-detection cybersecurity flow-analysis isolation-forest machine-learning network-intrusion-detection nids scapy scikit-learn zeek

Last synced: 07 May 2026

https://github.com/tddschn/hack-ncsu-2024

ML and doc part of our Hack_NCState project, built in less than 1 day | Racial Bias in Criminal Justice Visualized: Code Black

bias machine-learning scikit-learn

Last synced: 08 May 2026

https://github.com/canayter/unsupervised-machine-learning

Utilizing Python and unsupervised learning to predict if cryptocurrencies are affected by 24-hour or 7-day price changes.

k-means-clustering python scikit-learn unsupervised-machine-learning

Last synced: 08 May 2026

https://github.com/loong64/onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ai-framework deep-learning hardware-acceleration loong64 loongarch64 machine-learning neural-networks onnx pytorch scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/lakshitalearning/churninsight

Customer Churn prediction means knowing which customers are likely to leave or unsubscribe from your service.

churn-prediction data-science flask google-colab machine-learning predictive-analytics python scikit-learn user-retention web-development

Last synced: 09 May 2026

https://github.com/davidcamilo0710/hate_speech_analysis

Hate speech detection using NLP for linguistic analysis and machine learning (XGBoost) for classification with Python and SpaCy.

hate-speech-detection linguistic-analysis nlp scikit-learn spacy xgboost

Last synced: 09 May 2026

https://github.com/siam29/ensemble-majority-voting-hard

In this project, we implemented an ensemble learning approach using majority voting (hard voting) with five machine learning classifiers: DT, RF, XGBC, ANN, and KNN. The ensemble model achieved an impressive accuracy score of 99.95% and an F1 score of 85.51%.

credit-card-fraud ensemble-learning machine-learning matplotlib pandas scikit-learn

Last synced: 09 May 2026

https://github.com/t-abishek/embedded-intent-classifier

A production-grade FastAPI application that uses sentence embeddings to classify user prompts into 4 categories. Built using Python, BGE SentenceTransformer, Scikit-learn, and FastAPI.

classifier embedded huggingface pandas scikit-learn transformer

Last synced: 10 May 2026

https://github.com/mijisu0103/data-driven-decision-making-risk-analysis

This repository contains my coursework project for ECS7005P - Risk and Decision-Making for Data Science and AI. It applies probabilistic models, Bayesian networks, and decision analysis using Python and PyAgrum to evaluate risk and optimise decision-making under uncertainty.

machine-learning pandas probability-and-statistics pyagrum python quantitative-decision-making risk-assessment scikit-learn

Last synced: 10 May 2026

https://github.com/neelanjan-chakraborty/custoclarity

CUSTO CLARITY is a customer segmentation model built in Python. Using clustering on real retail datasets, it identifies 5 customer segments that unlocked strategic retail partnerships. Powered by scikit-learn, pandas, seaborn, and Matplotlib.

clustering-algorithm clustering-algorithms customer-analytics customer-segmentation data-visualization kmeans kmeans-clustering pandas python scikit-learn

Last synced: 11 May 2026

https://github.com/pngo1997/astrophysical-objects-classification

Project applies machine learning techniques to classify astrophysical objects using observational data from the Large Synoptic Survey Telescope (LSST).

adaptive-boosting-algorithm classification down-sampling gradient-boosting keras machine-learning neural-network python random-forest scikit-learn supervised-learning tensorflow time-series

Last synced: 10 May 2026

https://github.com/vaibhavs10/learn-ml

Modified notebooks (single) from kaggle.com/learn with added nuances

decision-trees machine-learning pandas random-forest scikit-learn

Last synced: 11 May 2026

https://github.com/rvats20/income-classification-using-ml

Model Training: implementing various machine learning algorithms such as Logistic Regression, Decision Trees, Random Forests, and Gradient Boosting. Model Evaluation: assessing model performance using metrics like accuracy, precision, recall, and F1-score. Hyperparameter Tuning

classification machine-learning machine-learning-algorithms ml pandas-dataframe python scikit-learn

Last synced: 11 May 2026

https://github.com/hasanulmukit/spam-email-classifier

This is a Spam Email Classifier built using Python and Streamlit. It uses a pre-trained model to predict whether an email is Spam or Not Spam. The app also provides the probability scores for both categories, enhancing transparency and reliability of the prediction.

email-classifier machine-learning nlp python scikit-learn spam-detection streamlit text-classification

Last synced: 11 May 2026

https://github.com/aravindnathan02/credit-card-fraud-detection

This repository contains a Machine Learning project aimed at detecting fraudulent credit card transactions. The goal is to build a reliable and efficient model that minimizes false positives and false negatives, ensuring financial safety and improving fraud detection capabilities.

classification-model fraud-detection logistic-regression machine-learning python random-forest scikit-learn

Last synced: 11 May 2026

https://github.com/elifftosunn/bert-bank-model

It is a Turkish BERT-based model that will analyze people's bank complaints and classify them according to one of eight categories.

countvectorizer doc2vec f1-score huggingface huggingface-transformer huggingface-transformers nlp nltk python3 scikit-learn stopwords tagged tfidf-transformer train-test-split word-tokenizer wordnetlemmatizer

Last synced: 12 May 2026

https://github.com/thevarunsharma/extracting-dominant-colors

A web application that extracts the dominant colors from an image using K-means clustering.

flask-application k-means-clustering machine-learning python scikit-learn unsupervised-learning

Last synced: 12 May 2026

https://github.com/alessiochen/setiment-analysis-ai-project

Application of sentiment analysis for the Artificial Intelligence class at UNIFI

ai andrew dataset movie-reviews scikit-learn sentiment-analysis

Last synced: 12 May 2026

https://github.com/aliy98/navigation-sensor-data-classification

Classification of a Navigation Robot Sensor Dataset Using SVM, Random Forest and Neural Network

artificial-neural-networks keras multiclass-classification random-forest scikit-learn scitos-g5 support-vector-machines

Last synced: 13 May 2026

https://github.com/ultrasage-danz/scikit-learn-ml

Machine Learning with scikit-learn by Data School

ai data data-school machine-learning macos ml scikit-learn ultrasage-dan

Last synced: 13 May 2026

https://github.com/alam025/customer-churn-prediction

🎯 Predict customer churn with 96%+ accuracy using Random Forest ML. Beautiful visualizations, production-ready code, and real business impact. Save revenue before customers leave! 🚀

churn-prediction classification customer-analytics customer-churn customer-retention data-science machine-learning pandas predictive-analytics python random-forest scikit-learn

Last synced: 11 Jun 2026

https://github.com/dhavaltaunk08/gender-classification

I did this project during my internship at IIT Guwahati. It aimed to perform gender classification in video streaming.

deep-learning librosa opencv-python python scikit-learn

Last synced: 14 May 2026

https://github.com/antoniskl/amsterdam-metro-crowdedness-prediction

This full-stack project aims to predict crowdedness at Amsterdam metro stations with RandomForest, using external factors, and to visualize the results.

amsterdam covid-19 crowded-areas dash full-stack metro prediction-model python random-forest regression scikit-learn ticketmaster-api

Last synced: 14 May 2026

https://github.com/anishshinde01/machine-learning-exercises

Python implementations of machine learning, statistics, and mathematical foundations.

linear-algebra machine-learning machine-learning-algorithms matplotlib numerical-analysis numpy python scikit-learn scipy statistics

Last synced: 11 Jun 2026

https://github.com/sivatsk26/university-admit-eligibility-predictor

This project uses machine learning and regression methods, a statistical technique for predicting the outcome of an event, to estimate users’ admission eligibility level for the universities they have chosen. The implemented algorithms produce the prediction once the user feeds the application the required information.

html-css-javascript ibm-cloud ibm-watson linear-regression machine-learning matplotlib numpy pandas python python-flask random-forest scikit-learn

Last synced: 13 Apr 2026

https://github.com/rohitpawar001/bone_marrow_surival_prediction

Bone marrow transplants can be life-saving, but predicting patient survival is complex. In this project, I used machine learning to analyze key medical factors and improve survival predictions. I also implemented CI/CD pipelines, used MLflow for model tracking, and deployed the model on an AWS EC2 instance.

aws docker ec2-instance flask machine-learning mlflow python scikit-learn

Last synced: 08 Apr 2026

https://github.com/benman1/python-time-series

Time-Series analysis, statistical and machine learning models for forecasting, regression, and classification

darts deep-learning forecasting mlforecast nixtla scikit-learn statsforecast time-series time-series-analysis

Last synced: 22 Feb 2026

https://github.com/strcoder4007/machine-learning-deep-learning-practice

Implementation of Linear/Logistic Reg, K-NN, SVM, Clustering, K-Means, ConvNet, ResNet, MobileNet, RNN, LSTM etc. using Pandas, SciKitLearn, NumPy & TensorFlow 2

convolutional-neural-networks matplotlib scikit-learn tensorflow2

Last synced: 15 May 2026

https://github.com/samarthmule/chatbot

This project implements a generic chatbot using Natural Language Processing (NLP) and Machine Learning techniques. The chatbot is designed to classify user input into predefined intents and provide context-aware responses. The solution is scalable, interactive, and suitable for various domains.

chatbot internship machine-learning machine-learning-algorithms nlp nltk project-repository python python3 scikit-learn streamlit

Last synced: 13 Apr 2026

https://github.com/abz4375/recommendersystem

A sophisticated recommender system that leverages web mining techniques to help users find hotels that match their preferences.

cosine-similarity css html javascript pandas python scikit-learn selenium selenium-webdriver

Last synced: 13 Apr 2026

https://github.com/filiplangiewicz/automltunability

📈 Analyzing the impact of hyperparameter optimization

automl machine-learning scikit-learn

Last synced: 18 Feb 2026

https://github.com/somenath203/titanic-survival-project-backend

Click the link below to check the swagger documentation of the website live

fastapi pandas python render scikit-learn seaborn titanic-survival-predictor

Last synced: 05 Apr 2026

https://github.com/jersongb22/datascience_ibm_stockpredictionlstm_project

In the IBM Advanced Data Science specialization, an interactive real-time web application was developed using LSTM networks in TensorFlow to predict stock market trends for global companies.

apache-spark data-science deep-learning lstm-neural-networks machine machine-learning plotly python scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/imswappy/brain-tumor-detection

🧠 Deep learning project for brain tumor classification using MRI images. Built with transfer learning (VGG16 + fine-tuning), TensorFlow/Keras, and deployed via Streamlit. Dataset & model loaded dynamically from KaggleHub. Includes training notebook, evaluation, and interactive web app.

kagglehub keras numpy pandas scikit-learn streamlit tensorflow vgg16-model

Last synced: 13 Apr 2026

https://github.com/supriya811106/healthcare-recommedation-system

A Flask-based web app that predicts diseases based on symptoms and recommends specialized doctors. It uses machine learning for accurate health predictions and location-based doctor searches.

css flask-application healthcare-application html javascript machine-learning numpy pandas recommendation-system scikit-learn

Last synced: 04 Mar 2026

https://github.com/virajbhutada/article-recommendation-system

This project aims to redefine content discovery by delivering personalized article recommendations tailored to individual user preferences. We use advanced machine learning techniques like PCA and K-means clustering to analyze user behavior and article characteristics to provide highly accurate recommendations.

anaconda article-recommendation clustering-algorithm data-analysis data-science keras-tensorflow machine-learning machine-learning-algorithms ml-models numpy pandas plotly python scikit-learn scipy

Last synced: 06 Jan 2026

https://github.com/asosnovsky/analyzing-blood-vessel-aneurysm

A few simple scripts to identify an aneurysm in a blood vessel (research projects)

machine-learning meanshift medical-image-processing scikit-learn

Last synced: 20 May 2026

https://github.com/khaymanii/parkinsons-disease-detection-model

This model was built with Python and the Support Vector Machine algorithm.

matplotlib numpy pandas python scikit-learn

Last synced: 19 Apr 2026

https://github.com/bahar15984/obesity-classification

Machine Learning Pipeline for Obesity Classification using Azure ML & Python

azure azure-ml classification data-science healthcare machine-learning mlops obesity pandas pipeline python scikit-learn

Last synced: 03 Nov 2025

https://github.com/takkii/pylean

Data analysis ( 🐍 💎 📈 )

analayze matplotlib numpy pandas python scikit-learn

Last synced: 09 Sep 2025

https://github.com/squadron-leader/ecopredict-ai

EcoPredict AI is a powerful, AI-driven solution for predicting Greenhouse Gas (GHG) emissions based on user-input industry data. Designed for environmental sustainability initiatives, EcoPredict AI utilizes machine learning models to deliver accurate carbon emission predictions and is deployed via Streamlit for real-time access.

epa-data linear-regression python regression-model scikit-learn streamlit

Last synced: 12 Apr 2026

https://github.com/davidyen1124/cowculator

COWCULATOR: AI-driven catering cost forecasting in Python. Trains order-level and daily time series models, exports an edge-ready JSON bundle, and includes a demo web UI.

cli data-science edge-ai forecasting github-actions machine-learning mypy pandas python ruff scikit-learn time-series uv

Last synced: 05 May 2026

https://github.com/paulj1989/bulgarian-constitutional-court-decisions

Developing NLP models for text and sentence classification using legal texts from the Bulgarian constitutional court.

keras neural-network nlp scikit-learn tensorflow tesseract

Last synced: 04 May 2026