An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/gilevatanya/yandex-practicum-projects

Case studies solved during Yandex Practicum courses.

bert bootstrap catboost keras lightgbm matplotlib nltk numpy pandas postgresql python pytorch scikit-learn scipy seaborn sql

Last synced: 06 Jan 2026

https://github.com/namratha2301/bangalorehousepricepredictor

Predicts house prices in Bangalore from key features of the house, such as the number of rooms and the size in square feet.

azure bashscript docker flake8 flask github-actions scikit-learn

Last synced: 12 Apr 2026

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 03 Apr 2025

https://github.com/filsan95/project-iot_malware_identification

This repository contains the code and data for a project that detects malware from IoT devices using a publish-subscribe model with Confluent and Databricks. The project streams IoT device data to Kafka, analyzes it, and detects malware using machine learning models such as Random Forest and Gradient Boosted Trees.

apache-kafka classification confluent databricks machine-learning-algorithms scikit-learn sql

Last synced: 16 Mar 2025

https://github.com/medicharlakarthik/credit-card-fraud-detection

Credit Card Fraud Detection using machine learning to distinguish fraudulent transactions from legitimate ones. This project includes data analysis, model training, and evaluation to achieve high accuracy and recall, minimizing false negatives for better fraud detection

machine-learning python random-forest-classifier scikit-learn

Last synced: 12 Apr 2026

https://github.com/karimosman89/health-risk-assessment

Predict health risks based on patient data. Create a machine learning model that predicts health risks (like diabetes or heart disease) based on patient data. Help healthcare providers identify at-risk patients for early intervention.

ehr-data pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/gangula-karthik/bank-transaction-classification

Classifying bank transactions with precision—your first step towards smarter finance management 💳🤖📊

finance machine-learning nlp scikit-learn

Last synced: 09 Apr 2025

https://github.com/huucanh0511/startup-profitability-prediction

This project predicts startup profitability using Logistic Regression and Random Forest, analysing financial (funding amount, funding rounds, revenue), market (market share), and operational (startup age, employee count) factors. It evaluates AUC, accuracy, precision, recall, and F1-score, addressing underfitting, overfitting, and feature selection

ai-for-finance data-science financial-modelling logistic-regression machine-learning predictive-analytics python random-forest scikit-learn startup-analysis

Last synced: 19 May 2026

https://github.com/arrhythmia-detection/authorfeatureextracteddecisiontreeesp32s3

Deploys a vanilla non-optimized Decision Tree for Arrhythmia classification using Chapman ECG dataset on ESP32-S3 dev kit

arrhythmia-classification decisiontreeclassifier eloquent esp32-arduino esp32-s3 scikit-learn

Last synced: 19 May 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/prajakta1321/authencheck

Amdocs Gen AI Graduate Hackathon 2024-25: a comprehensive fact-checking and misinformation detection system that leverages cutting-edge AI models and multiple news sources to verify information circulating on social media.

api bert-fine-tuning flask-application matplotlib ngrok-server nlp nlp-machine-learning numpy pandas python3 scikit-learn seaborn wandb

Last synced: 05 Apr 2026

https://github.com/the-developer-306/fake-review-detector

This project is a machine learning-based review classification system that predicts whether a product review is GENUINE or FAKE. It preprocesses review text, analyzes sentiment, and uses numerical features like ratings and helpfulness to make predictions. The model is deployed via a Flask web application for user interaction.

classification flask logistic-regression machine-learning numpy pandas python renderdeploy scikit-learn sentiment-analysis

Last synced: 12 Apr 2026

https://github.com/koradapavani/customer-churn-ml-project

Machine learning project to predict customer churn in telecom

churn-prediction machine-learning python scikit-learn telecom

Last synced: 04 May 2026

https://github.com/anty-filidor/cyberbullying-detector

NLP bullying detector for tweets with ML model training pipeline deployed as web-app with CICD

deployment-system flask-api machine-learning nlp python scikit-learn

Last synced: 19 May 2026

https://github.com/avtorgenii/cocktails-analysis

Preprocessing, augmentation, EDA and clustering of a cocktails dataset from TheCocktailDB. Recruiting task for SolVro science club.

matplotlib pandas scikit-learn

Last synced: 11 May 2026

https://github.com/m-rishab/job-recruitment-prediction-and-hr-dashboard-using-plotly

This project's features make it ideal for dynamic HR dashboards, offering insights into candidate profiles and recruitment processes.

correlation-analysis flask kmeans-clustering numpy pandas plotly python scikit-learn seaborn standardscaler

Last synced: 12 Apr 2026

https://github.com/sanalislokuge/breast-cancer-ml-prediction

Machine Learning project using classification, regression, and ensemble techniques to predict breast cancer mortality status and survival months using clinical data. Built with scikit-learn, decision trees, logistic regression, and Naïve Bayes. Includes detailed model evaluation, data preprocessing, and interpretability.

classification data-science decision-tree ensemble-learning healthcare-analytics machine-learning ml models naive-bayes-classifier predictive-modeling regression scikit-learn

Last synced: 19 May 2026

https://github.com/tamk-kol/project_orbital_data_analysis

The goal of this project is to develop an automatic method to detect orbital maneuvers using machine learning.

matplotlib numpy pandas scikit-learn

Last synced: 30 Jan 2026

https://github.com/hvalfangst/azure-functions-pandas

Azure Functions for ETL operations using Pandas. Uploaded CSV files trigger data processing, calculating correlations and storing results in a JSON file. Automated deployment via GitHub Actions and Terraform.

az-204 azure azure-functions azure-functions-python pandas python scikit-learn terraform

Last synced: 12 Apr 2026

https://github.com/rakibhhridoy/appliedmachinelearninghousing-regression

This project uses the Housing dataset, which contains information about houses in Boston. The data was originally part of the UCI Machine Learning Repository and has since been removed from it. It was also accessible through the scikit-learn library, although the `load_boston` loader was removed in scikit-learn 1.2. The objective is to predict house prices from the given features.

deep-learning housing-market housing-prices machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 05 Apr 2026

https://github.com/martinkersner/kmeans-meetup

Presentation about k-Means for Seoul AI Meetup on July 22, 2017.

kmeans numpy python scikit-learn

Last synced: 03 May 2026

https://github.com/simranjeet97/spam-classification

Spam Classification Using Natural Language Processing (NLP), Scikit-Learn Library, and Bayesian Method.

data-science emails kaggle kaggle-dataset naive-bayes-classifier nlp-machine-learning nltk-python python scikit-learn spam-classification

Last synced: 11 Apr 2026

https://github.com/saniyaacharya04/resume-scanner-using-nlp

A live resume scanning and ranking tool built with Python, Streamlit, and NLP. Upload resumes, match them to job descriptions, and generate analytics dashboards and PDF reports.

dashboard job-matching nlp pdf-parser resume-scanner scikit-learn spacy streamlit transformers

Last synced: 03 May 2026

https://github.com/arijit-7612/sms-spam-detection

A deep learning–based SMS Spam Detector built with BiLSTM and Keras TextVectorization. The model classifies messages as Spam or Ham with high accuracy and is deployed on Streamlit for real-time text classification with a clean and interactive user interface.

pandas python scikit-learn seaborn streamlit tensorflow

Last synced: 12 Apr 2026

https://github.com/alpha597/music_classification_ml

A project which compares different machine learning algorithms' accuracy in music genre classification of a large dataset.

machine-learning pandas python scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/diiblo/la-poste-predictive-flux

Daily prediction of parcel flow in La Poste sorting centers. Complete pipeline: data generation, LightGBM modeling, orchestration via Airflow (Docker), PostgreSQL storage, and an interactive Streamlit dashboard. Project carried out in the Mastère 2 Data Engineering program at ECE Paris.

airflow docker postgresql scikit-learn streamlit

Last synced: 31 Jan 2026

https://github.com/gwerbin/sklearn-gensim

Scikit-learn-compatible adapters for Gensim

machine-learning natural-language-processing python python3 scikit-learn

Last synced: 29 Apr 2026

https://github.com/hrolive/recommendation-systems-ibm

Analyze the interactions that users have with articles on the IBM Watson Studio platform and make recommendations to them about new articles, using various recommendation engines.

machine-learning natural-language-processing pandas python recomendation-system scikit-learn

Last synced: 12 Apr 2026

https://github.com/rubada/machine-learning-with-ruba-dabbas

Advance your skills and start your career here, by taking the online courses on Intuidemy.

course learning machine machine-learning matplotlib matplotlib-pyplot models numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/abdelrahman-amen/housing-price

Predicting housing prices with machine learning regression models. This project implements Linear Regression, Random Forest, and Decision Tree models for accurate predictions.

decision-tree housing-price-prediction linear-regression machine-learning python random-forest regression-analysis scikit-learn

Last synced: 07 May 2026

https://github.com/gunjangyl/iris-detection

The Iris Detection Project classifies different species of Iris flowers using machine learning techniques. It analyzes four key features—sepal length, sepal width, petal length, and petal width—to predict one of three classes: Setosa, Versicolor, or Virginica. The project uses algorithms like KNN, Decision Trees, or SVM for classification. Model pe

knn-classification matplotlib python scikit-learn seaborn

Last synced: 15 Apr 2026

https://github.com/sabin74/spam_mail_detection

A machine learning project to classify SMS messages as Spam or Ham (Not Spam) using Natural Language Processing (NLP) techniques and Scikit-learn. This binary classification task uses the UCI SMS Spam Collection Dataset and implements various models including Naive Bayes, SVM, and Logistic Regression with performance tuning.

gridsearchcv nltk python scikit-learn smote sms-spam-detection uci-machine-learning

Last synced: 04 May 2026

https://github.com/murugavl/crop-prediction

This Crop Prediction System utilizes machine learning to recommend suitable crops based on environmental data. It helps farmers make informed decisions by analyzing factors like soil type and climate. The system aims to enhance agricultural efficiency and productivity.

flask machine-learning python scikit-learn

Last synced: 12 Jun 2025

https://github.com/satyavardhan2k4/medical-insurance-predictor

A linear regression model that predicts medical insurance cost from features such as age, sex, and BMI. The dataset values are based on US data.

machine-learning pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/santiago-giordano/datascienceproject

Data Science Course Project: Causes of death around the world

apis jupyter-notebook matplotlib pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/mianmharoon/sentimentanalysis_coreml_emotionclassifier

Emotion classification iOS app using CoreML and SwiftUI – demo for sentiment and emotion analysis, with the model converted from Scikit-learn using coremltools.

ai coreml coreml-models emotionclassification ios machinelearning nlp python3 scikit-learn sentimentanalysis swift swiftui

Last synced: 12 Apr 2026

https://github.com/sudarshanc00/brain-tumor-classification

This project uses a deep learning model in PyTorch to classify brain MRI images into four tumor types, aiding early diagnosis and treatment planning. Two ResNet-based models were developed and optimized, achieving high accuracy to support healthcare professionals in identifying tumor categories.

matplotlib numpy pytorch resnet scikit-learn streamlit

Last synced: 10 Apr 2026

https://github.com/manu-karenite/medical-insurance-cost-predictor

Medical Insurance Cost Generator is a linear-regression-based predictor that estimates the cost a person has to pay when buying medical insurance.

kaggle-dataset linear-regression machine-learning matplotlib numpy pandas python3 reactjs scikit-learn

Last synced: 15 Apr 2026

https://github.com/bagusperdanay7/absa-with-bilstm-undergraduate-thesis

My undergraduate thesis program, "Aspect-Based Sentiment Analysis Towards Market Place Application Review Using Bidirectional Long Short-Term Memory", built with Python, Keras, and TensorFlow.

ai aspect-based-sentiment-analysis bilstm deep-learning gensim imbalanced-learning ipython-notebook keras machine-learning matplotlib natural-language-processing nltk numpy pandas python scikit-learn seaborn tensorflow

Last synced: 11 Apr 2026

https://github.com/purcellcjp/credit-risk-classification

This project used Python and scikit-learn to train and evaluate a machine learning model based on loan risk.

machine-learning numpy pandas-dataframe python scikit-learn

Last synced: 12 Apr 2026

https://github.com/samkazan/structural_discovery_of_macromolecules_data_analysis

This research project uses machine learning techniques and neural networks, in Python and R, to uncover key factors that contribute to successful protein structure discovery.

classification clustering ipython-notebook jupyter-notebook keras-neural-networks keras-tensorflow machine-learning neural-network numpy python r rmarkdown scikit-learn scipy tensorflow

Last synced: 02 Feb 2026

https://github.com/apfirebolt/titanic_survival_prediction

Titanic survival prediction GUI application using scikit-learn and PyQT5

jupyter-notebook pandas prediction pyqt5 python scikit-learn titanic-kaggle

Last synced: 06 Apr 2026

https://github.com/mpolinowski/scikit-wine-quality

Predicting Wine Quality with Several Classification Techniques using SciKit Learn.

feature-classifiers python scikit-learn

Last synced: 18 May 2026

https://github.com/kishankrishna1/spam-classifier

Developed a Machine Learning-based Spam Classifier using Multinomial Naive Bayes to identify and filter spam messages with high precision

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 Apr 2026

https://github.com/bhaveshbhakta/diabetes-prediction

Note* The hosted website link might take some time to load. Please be patient while the application initializes.

diabetes-prediction flask machine-learning python scikit-learn svm web-development

Last synced: 12 Apr 2026

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 15 Apr 2026

https://github.com/mpoojithavigneswari/sentiment-analysis

The primary goal of this project is to build a sentiment analysis model that can predict the sentiment of a given review (positive or negative).

deep-learning keras machine-learning nlp python rnn-lstm scikit-learn tensorflow

Last synced: 04 Feb 2026

https://github.com/manshreet27/mrs

This Movie Recommendation System is a web-based application built using Python and Streamlit, designed to provide movie recommendations based on user preferences. It utilizes TMDb API for fetching real-time movie details and Kaggle's TMDB 5000 Movies dataset for content-based filtering.

numpy pandas python scikit-learn streamlit tmdb-5000-movies-dataset-from-kaggle tmdb-api-for-fetching-real-time-movie-data

Last synced: 07 Apr 2026

https://github.com/pradipnp/decisiontree-iris

Machine learning project to classify iris flowers using a decision tree

classification decision-tree iris-dataset machine-learning python scikit-learn

Last synced: 18 May 2026

https://github.com/lefteris-souflas/the-algorithmic-approach-to-winning-guess-who

This repository provides a systematic approach to winning the "Guess Who?" game through advanced machine learning techniques. It offers a comprehensive methodology for enhancing gameplay strategy and optimizing decision-making processes with meticulous attention to detail.

decision-tree drawio gradient-boosting graphviz-dot lightgbm machine-learning matplotlib numpy pandas python random-forest scikit-learn

Last synced: 09 Apr 2026

https://github.com/yugalsoni18/counterfeit_review_detection

Fake review detection using TF-IDF & SVM (AUC 0.98), plus Counterfeit Risk Score with clustering & anomaly detection.

business-analytics fraud-detection isolation-forest kmeans nlp python risk-scoring scikit-learn svm tfidf

Last synced: 18 May 2026

https://github.com/idaraabasiudoh/svm_cell_classification

This repository contains code for classifying cell samples using Support Vector Machine (SVM) with Scikit-learn.

machine-learning python3 scikit-learn svm-classifier

Last synced: 19 Jan 2026

https://github.com/muhdhammad/machine-learning

Crafted for hands-on learning and implementation of ML with scikit-learn

data-science jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 07 Apr 2026

https://github.com/bhaveshbhakta/crop-yield-prediction

Indian Crop Yield Prediction Using Machine Learning

flask machine-learning python random-forest scikit-learn webdevelopment

Last synced: 20 Apr 2026

https://github.com/markdouthwaite/py-lingo

Utilities for helping you deploy Scikit-Learn models in Go (with lingo!)

hdf5 linear-models scikit-learn

Last synced: 25 Feb 2026

https://github.com/callesjuan/ninjalprm

Prototype of a tool for clustering Android devices by geolocation (Server)

python scikit-learn xmpp

Last synced: 20 Jan 2026

https://github.com/murugavl/loan_approval_prediction

This project is a Loan Approval Prediction System that uses Machine Learning to determine whether a loan application should be approved or rejected based on various factors. It is deployed using Streamlit for an interactive user experience.

jupyter-notebook machine-learning numpy pandas python random-forest-classifier scikit-learn stremlit

Last synced: 13 Apr 2026

https://github.com/sarmad426/ai

AI basic to advanced featuring Machine Learning, Deep Learning and Data Science.

ai data-science deep-learning hugging-face machine-learning numpy pandas python scikit-learn

Last synced: 15 Apr 2026

https://github.com/renatomaynard/supervised-machine-learning-models-pytorch-sklearn

This repository provides a comprehensive implementation of supervised machine learning models using PyTorch and Scikit-learn. It includes end-to-end workflows for both classification and regression tasks, covering data preprocessing, model training, evaluation, and comparison between traditional ML models

applied-machine-learning classification data-preprocessing deep-learning feature-engineering machine-learning ml-models ml-pipeline model-comparison model-evaluation python pytorch regression scikit-learn supervised-learning-algorithms

Last synced: 10 May 2026

https://github.com/alphacrypto246/stock-price-movement-prediction

A project leveraging Polynomial Regression to predict stock price movements based on historical data. Includes data preprocessing, feature engineering, visualization, and model evaluation to provide insights for informed trading decisions.

machine-learning numpy pandas polynomial-regression scikit-learn yfinance

Last synced: 13 Apr 2026

https://github.com/sebastianquintanam/arbol_decision

Decision tree in Python for recommending the best savings account based on the user's profile. Includes a synthetic dataset, training with scikit-learn, and visualization of the model.

data-science decision-tree finance machine-learning python scikit-learn

Last synced: 15 May 2026

https://github.com/vikneshsrv24/customer-segmentation

Segmentation of customers based on purchasing patterns for targeted marketing.

jupyter-notebook matplotlib pandas python scikit-learn

Last synced: 13 Apr 2026

https://github.com/mathewvieira/sistemas-de-apoio-a-decisao-av1

Decision Support Systems course - VP1 - UNI7

knn-algorithm pandas python scikit-learn streamlit

Last synced: 13 Apr 2026

https://github.com/smaddanki/data-science

Code blocks, algorithms, and research snippets in Data Science, Machine Learning, AI & Quant Finance.

deep-learning machine-learning pytorch scikit-learn spark

Last synced: 13 Apr 2026

https://github.com/afifahhadie/wine-clasificassion

This project focuses on classifying different types of wine using machine learning techniques. The dataset contains various chemical properties of wines, which are used as features to predict the wine class.

classification data-science data-visualization jupyter-notebook machine-learning machine-learning-algorithms pandas scikit-learn wine-dataset

Last synced: 13 Apr 2026

https://github.com/ironlegion88/media_bias

An end-to-end NLP pipeline to analyze ideological bias in online news media during elections. Uses sentiment analysis, topic modeling (LDA/NMF), and NER to quantify media framing.

data-analysis machine-learning media-bias nlp nltk political-science python scikit-learn sentiment-analysis spacy topic-modeling

Last synced: 13 Apr 2026

https://github.com/veranyagaka/credit-card-fraud-detection

Credit Card Fraud Detection using data preprocessing, analysis, visualization, and machine learning to accurately identify fraudulent transactions. Final project.

ai anomaly-detection classification credit-card-fraud-detection machine-learning scikit-learn supervised-learning

Last synced: 18 May 2026

https://github.com/karimosman89/credit-scoring

Evaluate the creditworthiness of individuals. Develop a credit scoring model that evaluates the creditworthiness of individuals based on historical data. Help financial institutions assess risk more accurately.

decision-trees ensemble-methods logistic-regression pandas python scikit-learn

Last synced: 13 Apr 2026

https://github.com/aadrianleo/fashion-style-classifier

A machine learning and deep learning pipeline for fashion image classification. Combines real-world data, manual annotation, and both KNN and EfficientNet-B0 CNN models to classify images into style categories. Includes data cleaning, augmentation, model training, evaluation, and reproducible notebooks.

classification-report cnn computer-vision confusion-matrix data-augmentation data-preprocessing deep-learning efficientnet exploratory-data-analysis fashion-classification image-classification knn label-studio machine-learning model-evaluation pytorch real-world-data reproducible-research scikit-learn transfer-learning

Last synced: 11 May 2026

https://github.com/iamriteshkoushik/skrun

18hrs Scikit Learn Course Speedrun Repo

freecodecamp machine-learning scikit-learn

Last synced: 26 Apr 2026

https://github.com/ejw-data/ml-clustering-crypto

Compares several machine learning clustering models to determine whether the currencies can be logically classified based on the given data

clustering python scikit-learn

Last synced: 13 Apr 2026

https://github.com/adirbella37/safety-analytics-project

Final project in Safety Management: analytics and predictive modeling for occupational incidents. Includes EDA, logistic regression, Poisson/Negative Binomial with overdispersion checks, ROC/AUC, and prediction exercises.

classification data-visualization drunk-and-drive eda logistic-regression matplotlib negative-binomial numpy occupational-safety overdispersion pandas poisson-regression python road-safety roc-auc scikit-learn seaborn statmodels

Last synced: 09 Apr 2026

https://github.com/jingjing-jin/Purchase-Behavior-Analysis

Purchase Behavior Analysis for Targeted Customer Segmentation

clustering-algorithm data-mining machine-learning python scikit-learn

Last synced: 02 Apr 2025

https://github.com/mahdimotamedi/ai-examples

AI Examples Repository showcasing machine learning and deep learning examples using Scikit-Learn and TensorFlow.

ai deep-learning machine-learning nerual-networks scikit-learn tensorflow

Last synced: 13 Apr 2026

https://github.com/vasu7052/recognizing-handwritten-digits

This is a machine learning project created in Python using Neural Networks and Supervised Learning Algorithms.

machine-learning machine-learning-algorithms numpy python scikit-learn

Last synced: 13 Apr 2026

https://github.com/samiyaalizaidi/nn-ml-homeworks

Homework solutions for CPE-4903: Neural Networks & Machine Learning at Kennesaw State University.

machine-learning machine-learning-workflow neural-networks numpy scikit-learn

Last synced: 15 Apr 2026

https://github.com/mahdibehoftadeh/polynomial-regression-co2-emissions

A simple machine learning polynomial regression that uses a large dataset to learn and predict the CO2 emissions of a car from its build features, such as engine size and number of cylinders.

machine-learning matplotlib numpy nural-network pandas polynomial-regression python scikit-learn

Last synced: 22 Feb 2026