An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/angelarreola/ai_notes

Notes for the "Artificial Intelligence" course, intended for later extraction by a language model that lets us give personalized answers based on the information in this repository.

ai matplotlib numpy pandas phaserjs python scikit-learn

Last synced: 21 Jan 2026

https://github.com/medicharlakarthik/credit-card-fraud-detection

Credit Card Fraud Detection using machine learning to distinguish fraudulent transactions from legitimate ones. This project includes data analysis, model training, and evaluation to achieve high accuracy and recall, minimizing false negatives for better fraud detection

machine-learning python random-forest-classifier scikit-learn

Last synced: 12 Apr 2026

https://github.com/taimoorkhan10/ai-fairness-explainability-toolkit

AI Fairness and Explainability Toolkit (AFET) is an open-source project aimed at providing tools and frameworks to assess, visualize, and mitigate bias in machine learning models. It supports multiple ML frameworks and offers a comprehensive suite of metrics and visualization components to enhance model transparency and fairness.

ai bias-detection data-science ethical-ai explainable-artificial-intelligence fairness machine-learning mlops model-interpretation open-source python responsible-ai scikit-learn

Last synced: 19 Jan 2026

https://github.com/waikato-datamining/shallowflow-sklearn

scikit-learn support for the shallowflow Python workflow system.

python3 scikit-learn sklearn workflow-engine

Last synced: 14 Apr 2026

https://github.com/anuragkush2527/vibesync-3.0

Sentiment analysis in social media involves using natural language processing (NLP) and machine learning to analyze users' opinions, emotions, and attitudes expressed in posts, comments, and reviews. It helps in understanding public sentiment, monitoring trends, and making data-driven decisions.

expressjs fastapi mongodb nltk nodejs numpy pandas python reactjs scikit-learn sentiment-analysis tensorflow

Last synced: 16 Oct 2025

https://github.com/nisch-mhrzn/house_prediction

This project predicts house prices using data exploration, feature engineering, and machine learning models like Linear Regression and Random Forest. It demonstrates how to optimize models and evaluate their performance to accurately forecast house prices.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors that affect sleep quality and the temporal patterns in the sleep data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/angelalim88/jakarta-air-quality-index-classification

This project classifies Jakarta's Air Quality Index (AQI) from 2010 to 2023 using machine learning models (Random Forest, MLP, SVM) based on pollutant concentrations.

data-analysis data-visua machine-learning scikit-learn tensorflow

Last synced: 13 Oct 2025

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/ledsouza/nlp-article-classification

This project aims to develop a machine learning model capable of classifying news articles into different categories based on their titles. Two different word embedding models (CBOW and Skip-gram) are trained and used to vectorize the article titles. These vectorized representations are then used to train a Logistic Regression classifier.

gensim-word2vec natural-language-processing nlp nlp-machine-learning pandas python scikit-learn spacy spacy-nlp

Last synced: 11 Apr 2026

https://github.com/huucanh0511/startup-profitability-prediction

This project predicts startup profitability using Logistic Regression and Random Forest, analysing financial (funding amount, funding rounds, revenue), market (market share), and operational (startup age, employee count) factors. It evaluates AUC, accuracy, precision, recall, and F1-score, addressing underfitting, overfitting, and feature selection

ai-for-finance data-science financial-modelling logistic-regression machine-learning predictive-analytics python random-forest scikit-learn startup-analysis

Last synced: 19 May 2026

https://github.com/dadvaiahpavan/ai-data-scientist-

AI-powered tool for dataset analysis, featuring data preprocessing, classification, regression, anomaly detection, and text analysis. Built with scikit-learn, pandas, and Plotly for visualization. Includes an interactive Streamlit web interface for real-time data analysis.

ai anomaly-detection classification data-analysis data-science machine-learning panda plotu regression scikit-learn sentiment-analysis streamlit

Last synced: 03 May 2026

https://github.com/arnab-0053/song-identifier

Identifies songs and artists from lyric snippets using two distinct methods: a simple NLP-based approach and a BM25 (Best Match 25) approach.

bm25 nlp nltk python rank-bm25 scikit-learn song-lyrics spotify-dataset text-preprocessing

Last synced: 28 Apr 2026

https://github.com/arrhythmia-detection/authorfeatureextracteddecisiontreeesp32s3

Deploys a vanilla non-optimized Decision Tree for Arrhythmia classification using Chapman ECG dataset on ESP32-S3 dev kit

arrhythmia-classification decisiontreeclassifier eloquent esp32-arduino esp32-s3 scikit-learn

Last synced: 19 May 2026

https://github.com/yahiazakaria445/ensemble-learning-voting-classifier

Ensemble Learning Using KNN, Naive Bayes, Decision Tree on Biomechanical Data

matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/cbjuan/paper-ieeeaccess-2017

Jupyter notebooks developed to support the research presented in the paper "Enabling adaptivity in web forms based on user characteristics detection through A/B testing and Machine Learning"

jupyter jupyter-notebook machine-learning pandas paper research-paper scikit-learn

Last synced: 10 May 2026

https://github.com/aasjunior/machinelearningapp

The Machine Learning App is an application developed with Kotlin, Android Studio, and Jetpack Compose for applying machine learning algorithms and displaying the results. It was built as an assignment for the Mobile Lab/Natural Computing course in the 5th semester of the Multiplatform Software Development program.

fastapi jetpack-compose kotlin-android machine-learning material-design scikit-learn

Last synced: 18 Apr 2026

https://github.com/avik-pal/kaggle-titanic

Predicting whether a given set of people survive on the Titanic

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 14 Apr 2026

https://github.com/thchilly/mlds102_py_exercises

Complete exercise sets from MLDS Practical Data Science and Applications course

data-science matplotlib numpy pandas python scikit-learn scipy tensorflow

Last synced: 06 Apr 2026

https://github.com/harris-giki/cancerdetectionmodel_ml

Simple Logistic Regression and Neural Network powered Machine Learning models that predict whether a breast tumor is malignant or benign based on input features extracted from a breast cancer dataset.

cancer-detection development keras keras-tensorflow logistic-regression machine-learning neural-network scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/josancamon19/boston_housing

Predicting Boston Housing Prices for Udacity Machine Learning Nanodegree

boston-housing-price-prediction machine-learning machine-learning-nanodegree scikit-learn udacity

Last synced: 21 Apr 2026

https://github.com/gregoritsch3/ml_eda_classification_diabetes

An EDA and Machine Learning Classification exercise on the Diabetes dataset demonstrating the use of SQLAlchemy data import from an SQL database (PostgreSQL), pre-processing pipelines, ANOVA, 9 scikit-learn ML models, hyperparameter tuning for the best performing one, and feature importance.

anova machine-learning matplotlib numpy pandas pipelines scikit-learn seaborn sql sqlalchemy statistics

Last synced: 14 Apr 2026

https://github.com/aasmirnov-webdev/data_science_projects

A collection of all completed study projects from the Yandex.Practicum "Data Science Specialist" course.

bert catboost data-science database lgbm mashine-learning matplotlib numpy pandas python pytorch scikit-learn scipy seaborn sql xgboost

Last synced: 06 Apr 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/abhi227070/car-price-prediction

This project implements a machine learning model to predict the price of cars based on various features such as mileage, manufacturing date, fuel type, and more. Users can input car information, and the model will estimate the price of the car based on the provided data. This tool can be useful for both car buyers and sellers to estimate car price.

data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression regression-models scikit-learn scikitlearn-machine-learning

Last synced: 28 Apr 2026

https://github.com/omerdduran/riskfactor-heart

This ML project predicts heart disease using logistic regression on the Cleveland Heart Disease UCI dataset, featuring advanced preprocessing and medical feature engineering, achieving 82.1% accuracy with strong cross-validation.

cardiovascular-health data-science data-visualization heart-disease-prediction logistic-regression machine-learning medical-ai scikit-learn

Last synced: 14 May 2026

https://github.com/ghufranbarcha/linear-regression-training-app

This project is a Streamlit application that allows users to upload a CSV file, select variables, and train a linear regression model. The app provides an easy-to-use interface for selecting dependent and independent variables, scaling data, applying polynomial regression, and evaluating model performance.

data-science machine-learning python scikit-learn streamlit

Last synced: 20 Apr 2026

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 02 Jan 2026

https://github.com/anty-filidor/cyberbullying-detector

NLP bullying detector for tweets with ML model training pipeline deployed as web-app with CICD

deployment-system flask-api machine-learning nlp python scikit-learn

Last synced: 19 May 2026

https://github.com/sanalislokuge/breast-cancer-ml-prediction

Machine Learning project using classification, regression, and ensemble techniques to predict breast cancer mortality status and survival months using clinical data. Built with scikit-learn, decision trees, logistic regression, and Naïve Bayes. Includes detailed model evaluation, data preprocessing, and interpretability.

classification data-science decision-tree ensemble-learning healthcare-analytics machine-learning ml models naive-bayes-classifier predictive-modeling regression scikit-learn

Last synced: 19 May 2026

https://github.com/adityapradhan202/binge-trend

Media and entertainment recommendation website with AI powered recommendation system.

datascience-machinelearning natural-language-processing python scikit-learn spacy-nlp

Last synced: 21 Apr 2026

https://github.com/sunilvarma-l/liverdiseaseprediction

"Streamlit app to predict liver disease risk using a machine learning model based on patient input data."

machine-learning matplotlib numpy pandas pickle python scikit-learn seaborn streamlit

Last synced: 13 Apr 2026

https://github.com/martinkersner/kmeans-meetup

Presentation about k-Means for Seoul AI Meetup on July 22, 2017.

kmeans numpy python scikit-learn

Last synced: 03 May 2026

https://github.com/farhad-here/predict_student_performance

Predict Student Performance is a data analysis and machine learning project aimed at predicting students' final performance (G3) based on demographic, family, and academic features. The project supports both regression (predicting exact grades) and classification (Pass/Fail categories).

classification data-analysis data-visualization linear-regression machine-learning numpy pandas postgresql powerbi scikit-learn streamlit

Last synced: 14 Apr 2026

https://github.com/mecha-aima/fake-bills-detection

This Python project implements a simple classification model comparison using scikit-learn to classify banknotes as either "Authentic" or "Counterfeit" based on four features

classification-model machine-learning model-selection scikit-learn

Last synced: 27 Jan 2026

https://github.com/mahdi-meyghani/movie-recommendation-system

A Python-based movie recommendation system utilizing popularity-based, content-based, and collaborative filtering models with data science and machine learning techniques.

data-analysis data-science machine-learning recommendation-system scikit-learn scikitlearn-machine-learning

Last synced: 23 Jan 2026

https://github.com/saniyaacharya04/resume-scanner-using-nlp

A live resume scanning and ranking tool built with Python, Streamlit, and NLP. Upload resumes, match them to job descriptions, and generate analytics dashboards and PDF reports.

dashboard job-matching nlp pdf-parser resume-scanner scikit-learn spacy streamlit transformers

Last synced: 03 May 2026

https://github.com/haseeeb21/machine-learning-models

Machine Learning Models trained on Scikit-learn datasets. This repository contains the code files and saved models trained on Toy datasets (Classification & Regression), and Real World dataset.

anaconda classification classification-models jupyter-notebook knn knn-classification machine-learning machine-learning-algorithms python3 regression regression-models scikit-learn scikit-learn-python scikitlearn-machine-learning svm svm-classifier vscode

Last synced: 07 May 2026

https://github.com/tapas-gope/telecommunication-customer-churn

This project involves predicting customer churn in a telecommunications company using machine learning techniques, exploring various features' impact, optimizing models, and identifying key factors influencing churn.

feature-engineering matplotlib-pyplot model-evaluation-and-validation numpy pandas python scikit-learn

Last synced: 12 Sep 2025

https://github.com/razalkr70/customer-segmentation-using-dataset

A data science project that segments mall customers using K-Means clustering. Based on age, income, and spending score, it identifies customer groups and visualizes them with 2D and 3D plots for targeted marketing insights.

clustering customer-segmentation data-science data-visualization kmeans machine-learning pca python scikit-learn

Last synced: 28 Apr 2026

https://github.com/messierandromeda/sentiment-analysis

Sentiment analysis with the IMDB movie review dataset.

imdb-dataset python scikit-learn sentiment-analysis

Last synced: 28 Jan 2026

https://github.com/vladstudennikov/diabetes-prediction-app

ML-powered web app built with Laravel and Vue.js to predict diabetes risk based on users' daily habits and behavior

cypress data-analysis diabetes-prediction fastapi inertiajs laravel matplotlib medicine ml pandas php scikit-learn seaborn vuejs

Last synced: 08 Apr 2026

https://github.com/nk-works/creditflow-ai

CreditFlow AI predicts loan defaulters using Artificial Neural Networks (ANNs). This model uses historical loan data to predict the likelihood of default for new loan applications.

ai artificial-neural-networks deep-learning jupyter-notebook machine-learning matplotlib numpy pandas python scikit-learn seaborn tensorflow

Last synced: 24 Jun 2025

https://github.com/ashrw/handwritten_digit_recognizer

A handwritten digit recognition system using Python and Scikit-learn to preprocess images and classify digits with a trained SVM model.

ml python scikit-learn

Last synced: 03 Jan 2026

https://github.com/jhylin/ml1-1_small_mols_in_chembl

Polars dataframe library and logistic regression in scikit-learn (update)

logistic-regression machine-learning parquet-files polars-dataframe scikit-learn

Last synced: 03 Jan 2026

https://github.com/andrewsy1004/mask-detection

Mask detection system capable of identifying individuals with or without masks

kaggle keras python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/hariprasath-v/machinehack_analytics_olympiad_2023

Create a machine learning model to determine the likelihood of a customer defaulting on a loan based on credit history, payment behavior, and account details.

binaryclassification catboost exploratory-data-analysis machine-learning numpy pandas python scikit-learn shap

Last synced: 08 Apr 2026

https://github.com/sundanc/movierecommendation

Movie recommendation system based on user input. Built with Streamlit

movie-recommendation-app python scikit-learn scikitlearn-machine-learning streamlib

Last synced: 27 Apr 2026

https://github.com/shakeel-data/amazon-sales-forecasting-python-bigquery-ml

An end-to-end analytics project using Python, SQL, & ML to forecast Amazon sales and segment customers. We build predictive models (LightGBM, Prophet) and clustering (KMeans) to deliver actionable insights for revenue growth and targeted marketing.

bigquery kmeans-clustring lightgbm linear-regression prophet-facebook scikit-learn

Last synced: 09 May 2026

https://github.com/anthippi/naive-bayes-imdb-classification

A custom Naive Bayes classifier for sentiment analysis of movie reviews from the IMDb dataset, utilizing feature selection based on Information Gain and comparing its performance with scikit-learn's BernoulliNB.

classification imdb matplotlib naive-bayes-classifier numpy pandas scikit-learn sklearn

Last synced: 09 Apr 2026

https://github.com/priyanshulathi/cancer-diagnosis-prediction-model

A Machine Learning project to predict cancer malignancy using K-Nearest Neighbor, Support Vector Machine, and Decision Tree algorithms.

machine-learning numpy pandas python scikit-learn

Last synced: 03 Jan 2026

https://github.com/rinuya/ml-cancer-diagnosis

Binary classification using MLP & Random Forest

ml mlp random-forest scikit-learn

Last synced: 03 Jan 2026

https://github.com/barraharrison/seoul-bike-sharing

Performing EDA on a Kaggle dataset to look at the distribution of Seoul's bike-sharing system

jupyterlab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 23 Jul 2025

https://github.com/shahbazshaddy/explainable-multimodal-ai-for-breast-cancer-and-pneumonia-prediction

A deep learning-based framework integrating explainable multimodal AI for accurate prediction and transparent diagnosis of breast cancer and pneumonia.

deep-learning explainable-ai grad-cam groq-api llm machine-learning matplotlib multimodal numpy pandas python pytorch scikit-learn seaborn streamlit

Last synced: 08 Apr 2026

https://github.com/ledsouza/deep-learning-noticias

This project aims to build two Machine Learning models: one to classify news articles into different categories and another to perform text autocomplete, predicting the next word in a sentence. The dataset provided consists of news articles from a news website, already preprocessed and stored in a CSV file.

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 08 Mar 2026

https://github.com/alphacrypto246/insurance-charges-prediction

The Predicting Insurance Charges project uses Decision Tree Regression to predict insurance charges based on features like age, sex, BMI, and smoking habits. It involves data preprocessing, feature scaling, and model evaluation with metrics like MAE and R².

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 03 May 2026

https://github.com/praatibhsurana/breast-cancer-prediction-svm

An SVM classifier coded in Python using Scikit-Learn to classify whether a patient's tumor is malignant or benign.

kaggle-dataset linear-classifier machine-learning-algorithms python scikit-learn svm-classifier

Last synced: 16 May 2026

https://github.com/bilgenurbekar/turkishcyberbullying

Contains fine-tuned BERT models and results in the text classification category using Turkish social media data

bert-fine-tuning huggingface-transformers matplotlib numpy pandas python pytorch scikit-learn transformers

Last synced: 07 Mar 2026

https://github.com/radoslawregula/geo-music-classification

Jupyter notebook implementing a classification solution to the geographical origins of music problem.

classification jupyter-notebook machine-learning pandas python random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/sayan-mondal2022/mlops-assignment

A project for validating machine learning models

machine-learning scikit-learn streamlit

Last synced: 22 Apr 2026

https://github.com/pramodyasahan/learn-ml

This repository serves as both a personal learning diary and a resource for others interested in understanding and applying machine learning concepts. The projects are categorized based on the type of ML model and are implemented in Python using libraries like scikit-learn, pandas, and numpy.

classification clustering machine-learning matplotlib numpy pandas regression scikit-learn supervised-learning unsupervised-learning

Last synced: 07 Apr 2026

https://github.com/aneeshmurali-n/ann-diabetes-prediction

Predicting diabetes progression using an Artificial Neural Network (ANN). This project leverages the scikit-learn diabetes dataset for training and evaluation. Includes data preprocessing, model building, and performance visualization.

ann data-preprocessing data-visualization deep-learning diabetes-prediction exploratory-data-analysis keras machine-learning matplotlib neural-network numpy pandas regression scikit-learn seaborn tensorflow visualization

Last synced: 07 Apr 2026

https://github.com/vijay-saravanan/advanced-human-life-detection

Portable, real-time embedded system using mmWave radar, microphone, and accelerometer sensor fusion with advanced signal processing and machine learning to detect and locate humans trapped under debris. Features rapid alerts via LCD, LED, buzzer, and is designed for Raspberry Pi deployment in disaster scenarios.

disaster-recovery dwt fft landslide machine-learning random-forest-classifier scikit-learn sensor-fusion sensors-data-collection signal-processing vital-signs

Last synced: 14 May 2026

https://github.com/lorenzorottigni/ml-lending-club

Machine Learning Python bootcamp: random forest classifier on LendingClub dataset

ipynb machine-learning numpy pandas python random-forest-classifier scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/abdiasarsene/developpement_tableau_de_bord_de_la_chaine_approvisionnement_power_bi

Develop a complete solution to visualize, analyze, and predict supply chain data.

ci-cd docker fastapi github-actions mysql-database randomizedsearchcv scikit-learn seaborn-plots

Last synced: 23 Jun 2025

https://github.com/rihua-tech/iris-ml-end-to-end

End-to-end Iris classification in Python: EDA → stratified CV → model comparison → SVM grid search → hold-out test → model persistence.

classification eda machine-learning python scikit-learn svm

Last synced: 14 May 2026

https://github.com/eljandoubi/disasterresponsepipeline

The project aims to build a Natural Language Processing (NLP) model to categorize messages in real time.

flask nltk numpy pandas plotly scikit-learn scipy sqlalchemy

Last synced: 09 Apr 2026

https://github.com/armahdavi/data_pipeline_analytics_statistics_ML_PM_PSD_residential_QFF

Sharing all the data pipelines and processing code, statistical modeling, descriptive statistics, plot visualizations, and machine learning from Mahdavi & Siegel (2021) (Indoor Air). Project milestone: 2017 - 2020. Full-length article: https://onlinelibrary.wiley.com/doi/abs/10.1111/ina.12782

data-science data-visualization dust hvac indoor-air-quality jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn scipy-stats spyder spyder-python-ide statistics

Last synced: 17 Sep 2025

https://github.com/h-sarhan/hate-speech-classifier

Automatic Detection of Hate Speech and Offensive Content

nlp python scikit-learn

Last synced: 22 Apr 2026

https://github.com/riyajain255/customer-segmentation-for-e-commerce

This project analyzes online retail data to segment customers using K-Means clustering and build classification models to predict those segments based on purchasing behavior.

customer-segmentation data-analysis kmeans-clustering logistic-regression machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn-plots

Last synced: 02 Apr 2026

https://github.com/rohanbanerjee1234567-cell/prediction-of-expected-salary-using-machine-learning

This is my first project repository, containing a machine learning project written in Python. The problem statement was to train a model on the given dataset and use it to predict the expected salary of an employee with a similar profile.

exploratory-data-analysis linearregression matplotlib-pyplot numpy pandas randomforest randomforestregressor scikit-learn scikitlearn-machine-learning searborn visualization

Last synced: 27 Apr 2026

https://github.com/rajan-bhateja/machine-learning-with-python

Machine learning algorithms implemented using Scikit-learn

classification clustering machine-learning regression scikit-learn sklearn

Last synced: 17 May 2026

https://github.com/beolawork-art/novabank-churn-analysis

NovaBank has noticed that customers are closing accounts or going inactive, and they want to understand why.

data-analysis data-science-projects data-visualization eda machine-learning numpy pandas python scikit-learn sql

Last synced: 08 Apr 2026

https://github.com/sabbadini10/job4you

Job4You is an AI-powered job application assistant that streamlines the entire application process. Built on Angular and Firebase with GPT-4 integration.

angular api ats-optimization cover-letter email-automation firebase jobforall openai-api python resume-builder scikit-learn sheraz sherazhussain sherazhussain546

Last synced: 04 Mar 2026