An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/maguids/supervised-learning---video-games

This project consists of exploratory data analysis and the application of supervised learning models for classification using a Video Games dataset. Second Semester of the First Year of the Bachelor's Degree in Artificial Intelligence and Data Science.

jupyter-notebook machine-learning matplotlib numpy pandas scikit-learn seaborn supervised-learning

Last synced: 30 Apr 2026

https://github.com/rokuu010/boxing-match-predictor

Machine learning project to predict the outcomes of pro boxing matches using dataset and web-scraped data.

boxing data-science machine-learning prediction-model python scikit-learn selenium sports-analytics

Last synced: 30 Apr 2026

https://github.com/sayed-ashfaq/delhivery-dataanalysis

In this project, I conducted basic analysis, feature engineering, normalization, and outlier handling, along with statistical and non-parametric testing to extract insights.

feature-engineering normalization outlier-detection pandas python scikit-learn statistcal-tests statistical-analysis

Last synced: 30 Apr 2026

https://github.com/fikri-rouzan/burnaway-capstone-data-science

Interactive analytics dashboard for mapping the physical factors and work patterns that trigger burnout in software developers.

jupyter-notebook matplotlib pandas pillow plotly python scikit-learn seaborn statsmodels streamlit

Last synced: 08 Jun 2026

https://github.com/abhivur/connections-ai

Contributors: Meet Gamdha, Gaurav Nimmagadda

bert python scikit-learn word2vec

Last synced: 30 Apr 2026

https://github.com/dharma-acha/explanability_in_deepneuralnetworks

Our project aims to enhance the transparency and trustworthiness of the VGG model in critical fields like healthcare imaging and self-driving cars. By integrating explainability methods into the VGG model for image classification, we will clarify its decision-making process.

colab-notebook matplotlib numpy pandas scikit-learn seaborn

Last synced: 30 Apr 2026

https://github.com/boladjivinny/fire-prediction

Notebook for the Fighting Fire with Data hackathon on Zindi. Ranked 5th on the public leaderboard and 8th on the private leaderboard. https://zindi.africa/hackathons/cmu-africa-fighting-fire-with-data

feature-engineering hackhathon machine-learning regression scikit-learn stacking

Last synced: 30 Apr 2026

https://github.com/themihirmathur/mihir-clickpost-data-science-intern-round-1-assignment-submission

The objective of this project is to predict the predicted_exact_sla, which is the number of days between the shipment and delivery of an order, using historical shipment data.

data-science machine-learning pandas python random-forest-regression scikit-learn

Last synced: 30 Apr 2026

https://github.com/fadlani-aditya/iris-plant-classification

This project focuses on classifying different species of Iris flowers using the Random Forest algorithm. The dataset, sourced from Scikit-learn, contains four key features: sepal length, sepal width, petal length, and petal width, which are used to predict the flower species (Setosa, Versicolor, and Virginica).

agriculture data-science iris-dataset machine-learning python scikit-learn supervised-learning

Last synced: 01 May 2026
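
The workflow described in this entry can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the train/test split, the number of trees, and the helper name `train_iris_forest` are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def train_iris_forest(seed=0):
    """Fit a random forest on the Iris data and return (model, test accuracy)."""
    iris = load_iris()  # four features: sepal length/width, petal length/width
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target,
        test_size=0.25, stratify=iris.target, random_state=seed,
    )
    # n_estimators=100 is an assumed setting, not taken from the project.
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


model, accuracy = train_iris_forest()
print(f"held-out accuracy: {accuracy:.2f}")
```

Stratifying the split keeps the three species equally represented in both the training and the test portion.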

https://github.com/arturovaine/n8n-nodes-sklearn

Custom n8n nodes for integrating scikit-learn machine learning algorithms into your n8n workflows.

machine-learning n8n n8n-nodes scikit-learn sklearn

Last synced: 08 Jun 2026

https://github.com/deepthipathlawath20/emotion-recognition-bimodal

Bimodal emotion recognition (face + speech) with feature-level fusion and classic ML classifiers.

audio computer-vision emotion-recognition knn mfcc multimodal navie-bayes-algorithm python scikit-learn svm tensorflow

Last synced: 01 May 2026

https://github.com/kristishqau/sentimentanalysis_nlp

A project for sentiment analysis of tweets using various NLP techniques and machine learning models.

datascience jupyter-notebook machine-learning nlp nltk python scikit-learn sentiment-analysis xgboost

Last synced: 01 May 2026

https://github.com/antonio-f/housing-simplemlexample

Basic example with California Housing Prices dataset from the StatLib repository using scikit-learn

housing-simplemlexample machine-learning scikit-learn simple

Last synced: 01 May 2026

https://github.com/danishzulfiqar/language-detection-nlp-model

This machine learning model is designed to accurately detect and classify text in 18 languages using NLP

fastapi jupyter-notebook machine-learning natural-language-processing scikit-learn

Last synced: 01 May 2026

https://github.com/maxwelllzh/linearizer

Linearizing parameters for linear regression

data-analysis machine-learning scikit-learn

Last synced: 02 May 2026

https://github.com/dmschauer/aws-sagemaker-deployment-test

I did a simple test to see how deploying a machine learning model on AWS SageMaker, and thus turning it into an API, works. Since scikit-learn models require fewer dependencies than e.g. TensorFlow models, I went with them for this test. To do so I used a tutorial.

aws boto3 python sagemaker scikit-learn

Last synced: 02 May 2026

https://github.com/bishopce16/cryptocurrencies

An analysis of a cryptocurrencies dataset using unsupervised machine learning, PCA, and K-means clustering.

hvplot jupyter-notebook pandas plotly python scikit-learn unsupervised-machine-learning visual-studio-code

Last synced: 02 May 2026

https://github.com/viniciusds2020/ml_pycaret_classificacao

System for preprocessing and training machine learning models using PyCaret. A low-code methodology for MLOps processes.

machine-learning mlops preprocessing pycaret python scikit-learn

Last synced: 03 May 2026

https://github.com/fandredev/ml-my-guide

My own notes on ML/DS using pandas, matplotlib, numpy, and scikit-learn.

anaconda matplotlib numpy pandas plotly scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/arrhythmia-detection/authorprovidedfeaturescombineddt

Deploys a vanilla Decision Tree for Arrhythmia classification using Chapman ECG dataset on Arduino UNO board

arduino-uno arrhythmia-classification atmega328p chapman-ecg decision-tree-classifier eloquent scikit-learn

Last synced: 09 Jun 2026

https://github.com/zhenglinlei/zdmp

Industry 4.0 Optimization with Machine Learning AI

industry-4 knn-classification machine-learning pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/srisaihariharan/mic_sentiment_analysis_v

Sentiment analysis of IMDb movie reviews using Python, Scikit-learn, and TF-IDF.

machine-learning natural-language-processing nlp python scikit-learn sentiment-analysis sentiment-classification

Last synced: 03 May 2026
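
A TF-IDF sentiment classifier of the kind this entry describes might look like the sketch below. The entry does not say which classifier is used, so `LogisticRegression` is an assumption, and the handful of hand-written reviews only stands in for the IMDb data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-written reviews standing in for IMDb data (1 = positive, 0 = negative).
reviews = [
    "a wonderful and moving film",
    "great acting and a wonderful story",
    "I loved every minute of it",
    "boring plot and awful acting",
    "a terrible waste of time",
    "I hated this awful movie",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each review into a weighted bag-of-words vector;
# the classifier (assumed here) then learns a weight per term.
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(reviews, labels)

predictions = classifier.predict(["a wonderful film", "boring and awful"])
print(predictions)
```

Wrapping both steps in one pipeline means new text is vectorized with the vocabulary learned during training.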

https://github.com/abdiasarsene/predictive-churn-management-data-driven-customer

Use unsupervised learning techniques to segment a company’s customers into distinct groups in order to personalize marketing campaigns. The ultimate goal is to propose specific marketing strategies for each customer segment based on the insights obtained.

acp kmeans-clustering matplotlib pandas plotly python scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/furk4nbulut/uygulamalarla-makine-ogrenmesi-ve-derin-ogrenme-atolyesi

This repository covers the training process of the BTK Akademi Applied Machine Learning and Deep Learning Workshop held in Manisa. In the workshop, participants gain theoretical and practical knowledge of advanced machine learning and deep learning techniques.

matplotlib numpy pandas scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/pramodyasahan/binary-classifier

This repository houses the code for a machine learning model designed to predict customer churn. The model is built using Support Vector Machine (SVM) from the scikit-learn library and incorporates preprocessing, pipeline, and grid search techniques for optimal performance.

numpy pandas scikit-learn

Last synced: 03 May 2026
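
The combination of preprocessing, a pipeline, and grid search named in this entry could be wired together roughly as below. This is a sketch under stated assumptions: the data is synthetic (`make_classification` stands in for a churn table), and the scaler and parameter grid are illustrative choices rather than the repository's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a churn table (assumption: all features are numeric).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Hypothetical grid; the "svm__" prefix routes each parameter to the SVC step.
param_grid = {"svm__C": [0.1, 1, 10], "svm__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline avoids leaking statistics from validation folds into training during the search.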

https://github.com/alestankiewicz/credit-card-fraud-detection

Credit Card Fraud Detection Exercise In Python

pandas plotly python3 scikit-learn xgboost

Last synced: 03 May 2026

https://github.com/arnavk-09/phishing-detection

🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI

csv data fastapi flask python scikit-learn

Last synced: 03 May 2026

https://github.com/srilaasya/breast-cancer-classifier

Used several Python libraries to make a K-Nearest Neighbor classifier that is trained to predict whether a patient has breast cancer

knearest-neighbor-classifier python scikit-learn

Last synced: 03 May 2026

https://github.com/atchayaah/home-value-insights-kc

Data-driven project predicting King County housing prices using EDA, regression models, and ML techniques, developed as part of IBM’s Data Analysis with Python course on Coursera.

joblib matplotlib numpy pandas pickle python scikit-learn seaborn

Last synced: 03 May 2026

https://github.com/lucs1590/commom_segmentations

The purpose of this repository is to document and expose code samples using common threading techniques.

computational-vision machine-learning open-source opencv python scikit-image scikit-learn segmentation sklearn

Last synced: 03 May 2026

https://github.com/samarth4023/shell-internship-2

🤖 AICTE Shell Internship - NLP Chatbot. This repository contains the implementation of a Chatbot using NLP, developed as part of the AICTE Shell Internship. The chatbot is designed to understand and respond to user queries using Natural Language Processing (NLP) techniques.

ai artificial-intelligence chatbot natural-language-processing nlp nltk python scikit-learn streamlit

Last synced: 04 May 2026

https://github.com/abdullahalzubaer/feature-selection-ranking

In-depth analysis regarding feature selection and ranking.

feature-ranking feature-selection random scikit-learn

Last synced: 04 May 2026

https://github.com/vyclarks/gestational-diabetes-prediction-ml

Predicting gestational diabetes from the Pima dataset — Python (scikit-learn); reproducible notebook, metrics, and report.

healthcare-analysis machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/baponkar/scikit-logisticregression-application

A simple and detailed application analysis of the scikit-learn LogisticRegression model.

classification-algorithm logistic-regression machine-learning python3 scikit-learn

Last synced: 04 May 2026

https://github.com/abhivur/graduate-income-forecaster

Contributors: Abdussalam Raheem, Chiara Su, and Joseph Botros

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/homebackend/pdf-title-page-splitter

Splits a pdf based on identified title pages using ML trained model

machine-learning opencv pdf-splitter pdf2image pypdf2 scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/satvikpraveen/sklearn-mastery

Enterprise-grade ML framework showcasing advanced Scikit-Learn implementations with production-ready pipelines, algorithm-optimized synthetic data generation, comprehensive evaluation suite with statistical testing, custom transformers, ensemble methods, and real-world industry applications across healthcare, finance, and manufacturing domains.

artificial-intelligence ci-cd classification custom-transformers data-science docker ensemble-methods feature-engineering fintech fraud-detection healthcare-ai hyperparameter-tuning jupyter-notebooks machine-learning mlops model-evaluation pipeline-architecture predictive-maintenance python scikit-learn

Last synced: 04 May 2026

https://github.com/joel-beck/airbnb-oslo

Price Prediction Models for Airbnb Apartments in Oslo | Winter Term 2021/22

prediction python pytorch scikit-learn

Last synced: 04 May 2026

https://github.com/bhawnamehbubani/airline-passenger-referral-program-development-with-classification-techniques

Prediction of airline passenger referrals using Logistic Regression, GridSearchCV, and TF-IDF vectorization with Python, Pandas, Scikit-learn, and Excel.

excel gridsearchcv logistic-regression pandas python3 scikit-learn tf-idf-vectorization

Last synced: 04 May 2026

https://github.com/suguru-n/temp_easyai

Hands-on machine learning program for undergraduate students.

google-colab jupyter-notebook linearregression python scikit-learn

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/msikorski93/protein-tertiary-structure

Performing a regression task for estimating residue size based on given physicochemical properties of protein tertiary structures (CASP 5-9).

bioinformatics gradient-boosting multilayer-perceptron-network protein-structure-prediction regression-algorithms scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/chathumiamarasinghe/nn-training-model

A comprehensive project for training neural networks to solve real-world problems. This repository includes customizable code for building, training, and evaluating neural network architectures using popular deep learning frameworks.

jupyter-notebook matplotlib numpy phyton scikit-learn

Last synced: 04 May 2026

https://github.com/aqueeqazam/machine-learning-using-scikit

This repository contains all of the algorithms used to train the machine learning models using the Scikit library.

numpy scikit-learn

Last synced: 04 May 2026

https://github.com/thekartikeyamishra/resumeevaluatorapp

The Automated Resume Evaluator is a Python-based application that helps evaluate resumes against job descriptions. It calculates an Applicant Tracking System (ATS) score, which is the percentage of keywords from the job description found in the resume.

flask machine-learning matplotlib nlp nltk pypdf python scikit-learn spacy textblob

Last synced: 05 May 2026
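
The ATS score defined in this entry, the percentage of job-description keywords found in the resume, can be computed along these lines. This is a minimal sketch: the tokenizer, the stop-word list, and the function names are hypothetical and are not taken from the application.

```python
import re

# Hypothetical stop-word list; a real evaluator would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "with", "for", "on", "or"}


def extract_keywords(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def ats_score(resume_text, job_description):
    """Percentage of job-description keywords that also appear in the resume."""
    job_keywords = extract_keywords(job_description)
    if not job_keywords:
        return 0.0  # nothing to match against
    matched = job_keywords & extract_keywords(resume_text)
    return 100.0 * len(matched) / len(job_keywords)


job = "Python developer with experience in Flask and machine learning"
resume = "Built Flask services in Python; studied machine learning."
print(f"ATS score: {ats_score(resume, job):.1f}%")
```

Here four of the six job keywords (python, flask, machine, learning) appear in the resume, so the score is about 66.7%.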

https://github.com/simpl1fy/spam-classifier-project

A web application to classify spam texts or emails.

multinomial-naive-bayes nltk python render scikit-learn text-classification

Last synced: 05 May 2026

https://github.com/hallowshaw/text-emotion-classification-using-lstm-and-tokenization

This repository provides a machine learning and deep learning pipeline for text emotion detection. It includes a pretrained LSTM model, tokenizer, and preprocessing steps to classify emotions such as joy, sadness, and anger from text input. Easily deployable with provided resources and scripts.

emotion-classification emotion-detection feature-engineering lstm nltk nltk-python scikit-learn scikitlearn-machine-learning sentiment-analysis sequential-models text-classification text-classification-multi-label tokenization tokenizer

Last synced: 05 May 2026

https://github.com/zafir100100/cancer-stage-prediction

This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.

cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn

Last synced: 05 May 2026
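
Comparing several regressors by their average R-squared score and reporting the best one, as this entry describes, might be sketched as follows. The data here is synthetic (`make_regression`), and the model settings and 5-fold cross-validation are assumptions chosen for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the project's dataset.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Average R-squared over 5 cross-validation folds for each model.
mean_r2 = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

best_model = max(mean_r2, key=mean_r2.get)
for name, score in mean_r2.items():
    print(f"{name}: {score:.3f}")
print("best:", best_model)
```

Averaging over folds gives a steadier comparison than a single train/test split, at the cost of fitting each model several times.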

https://github.com/hitthecodelabs/petalanalyticsstreamlit

Web application developed with Streamlit that predicts the Iris flower type based on its physical features

matplotlib model numpy pickle python scikit-learn sklearn streamlit

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: A machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/smaddanki/pattern-pursuit-challenge

A personal challenge to build a production-ready trading signal system for S&P 500 stocks using deep learning. This project progresses from basic ML models to a complete trading infrastructure, focusing on 5-day forward return prediction and signal generation.

deep-learning machine-learning pytorch quantative-trading quantitative-finance quantitative-research scikit-learn

Last synced: 05 May 2026

https://github.com/markdouthwaite/lingo-demo

A demo project showing how to effectively deploy Scikit-Learn Linear Models in Go into Google Cloud Run.

go golang google-cloud-platform python scikit-learn

Last synced: 05 May 2026

https://github.com/teja-1403/coursera-machine-learning-with-python-honors

This project involves building a classifier to predict rainfall for the next day based on weather data from the Australian Government's Bureau of Meteorology. Various machine learning techniques such as Linear Regression, KNN, Decision Trees, Logistic Regression, and SVM were implemented and evaluated.

classification hierarchical-clustering machine-learning regression scikit-learn scipy

Last synced: 05 May 2026

https://github.com/zuhairzia/customer-segmentation

Customer segmentation using KMeans clustering to analyze demographics, income, and spending. Helps businesses with targeted marketing and customer insights.

joblib matplotlib numpy pandas scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohra-mehak/sciencesync

System for Personalized Google Scholar Alerts Processing and Data Management, and provision of ML based clustering analysis

agglomerative-clustering clustering crossref-api customtkinter google-api google-scholar graph-api machine-learning numpy pandas python3 scientific-article-analysis scikit-learn sqlite3

Last synced: 05 May 2026

https://github.com/nandinimarepalli/ai_ml_internship_projects

Projects completed during my AI/ML and Data Expert internship, including EDA, machine learning models, and dashboard development using Python, pandas, scikit-learn, and visualization libraries.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohansardar/iris_flower

A basic ML project on the iris flower classification

data-science iris-classification iris-dataset ml python scikit-learn

Last synced: 05 May 2026

https://github.com/aysenurcftc/breast_cancer_streamlit

Breast Cancer Wisconsin Dataset Classifier with Scikit-learn and Streamlit

breast-cancer classification gridsearch scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/supernovasatsangi23/modifying-biomarker-gene-identification-for-effective-cancer-categorization

A project that focuses on implementing a hybrid approach that modifies the identification of biomarker genes for better categorization of cancer. The methodology is a fusion of the MRMR filter method for feature selection, a steady-state genetic algorithm, and an MLP classifier.

dataset deep-learning deep-neural-networks feature-selection genetic-algorithm machine-learning machine-learning-algorithms mlp-classifier mrmr neural-network numpy pandas-dataframe python python3 scikit-learn scikit-learn-python tkinter-gui tkinter-python

Last synced: 05 May 2026

https://github.com/kefrankk/ml-fraud-detection

I built a predictive model to detect fraud in financial transactions.

pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/vanilladucky/housing-prediction

This is a data analytics and machine learning project that I undertook using a housing dataset on Kaggle in order to put my machine learning knowledge into practice.

data-science machine-learning python scikit-learn

Last synced: 05 May 2026

https://github.com/divinenaman/color-extraction-api

Extract colours from images using K-means, along with a FastAPI pipeline.

fastapi k-means-clustering scikit-learn

Last synced: 05 May 2026
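
Extracting colours with K-means generally means clustering an image's pixels in RGB space and reading off the cluster centres. The sketch below shows that idea on a synthetic image. The function name `dominant_colors`, the cluster count, and the frequency ordering are assumptions, and the FastAPI layer is left out.

```python
import numpy as np
from sklearn.cluster import KMeans


def dominant_colors(image, n_colors=3, seed=0):
    """Return n_colors RGB cluster centres, ordered from most to least frequent."""
    pixels = image.reshape(-1, 3).astype(float)  # one row per pixel
    kmeans = KMeans(n_clusters=n_colors, n_init=10, random_state=seed).fit(pixels)
    counts = np.bincount(kmeans.labels_, minlength=n_colors)
    order = np.argsort(counts)[::-1]  # largest cluster first
    return kmeans.cluster_centers_[order].round().astype(int)


# Synthetic 20x20 test image: mostly red, some green, a little blue.
image = np.zeros((20, 20, 3), dtype=np.uint8)
image[:12] = (255, 0, 0)
image[12:18] = (0, 255, 0)
image[18:] = (0, 0, 255)

print(dominant_colors(image))
```

On a real photograph the centres are averages of similar pixels rather than exact colours, so the result is a palette approximation.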

https://github.com/sevilaymuni/project-no.6-tree-based-models

Random Forest Assisted Suggestions for Salifort Motors Employee Retention: Plan, Analyze, Construct and Execute

data-science decision-trees evaluation-metrics gridsearchcv logistic-regression machine-learning matplotlib python random-forest-classifier scikit-learn seaborn-plots

Last synced: 05 May 2026

https://github.com/pjj11005/ml_with_pytorch_study

[Machine Learning Textbook: PyTorch Edition] -> repository of the code I studied

deep-learning graph-neural-networks machine-learning neural-networks pytorch scikit-learn transformer

Last synced: 06 May 2026

https://github.com/grandechowhiskey/fcc-machine_learning-boilerplates

A collection of projects completed as part of the FreeCodeCamp "Machine Learning with Python" certification. These projects focus on implementing machine learning models, data preprocessing, and predictive analysis using libraries like scikit-learn and TensorFlow.

ai ml python3 scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/rishisolanke/twitter-sentiment-analysis-using-machine-learning-

A research project that classifies tweets as positive, negative, or neutral using ML algorithms (Logistic Regression, Naïve Bayes, SVM) with NLP preprocessing.

data-science data-visualization logistic-regression machine-learning ml-models naive-bayes natural-language-processing nlp scikit-learn sentiment-analysis svm text-classification twitter-data

Last synced: 06 May 2026

https://github.com/eshansugeesh/fico-score-loan-default-modeling-project

Credit risk assessment using FICO score segmentation, loan default modeling, discretization techniques, and log-likelihood evaluation for predictive analytics in financial services.

bucketing classification credit-risk customer-segmentation data-science discretization fico-score financial-analytics loan-analysis loan-default log-likelihood machine-learning numpy pandas predictive-modeling risk-modeling scikit-learn segmentation statistical-modelling

Last synced: 06 May 2026

https://github.com/billgewrgoulas/recommendation-systems

Algorithms for joke rating prediction using the joke data-set from Kaggle.

algorithm clustering collaborative-filtering machine-learning numpy pandas recommender-system scikit-learn scypi

Last synced: 06 May 2026

https://github.com/kaoutarmi/predition_price-old-cars

This car price prediction project uses machine learning to estimate vehicle values based on their characteristics.

car-price-prediction data-preprocessing data-science decision-tree feature-engineering machine-learning regression scikit-learn

Last synced: 06 May 2026

https://github.com/sabin74/boston_house_prediction

This project aims to predict the median value of owner-occupied homes in Boston suburbs using various machine learning regression models. Multiple regression techniques were applied, including Linear Regression, Decision Tree, Random Forest, Gradient Boosting and dimensionality reduction with PCA. Hyperparameter tuning was performed.

boston-housing-price-prediction hyperparameter-tuning kaggle-dataset pca-analysis python3 regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/felipesbonatti/case-credit-risk-prediction

Credit risk classification project built with Python, Scikit-learn, and Pandas. Demonstrates an end-to-end machine learning workflow: data preprocessing, feature engineering, training of multiple algorithms, and performance evaluation with metrics such as AUC-ROC.

credit-risk machine-learning predictive-modeling python scikit-learn

Last synced: 06 May 2026

https://github.com/dwade-eng/customer-lead-conversion-analysis

This project explores a real-world lead conversion dataset, using a structured machine learning pipeline to classify leads into likely or unlikely converters. It includes complete steps from data wrangling and visualization to feature engineering and model evaluation.

html matplotlib pandas python3 scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/ejw-data/ml-playground

Testing the limitations, inabilities, and strengths of models with synthetic data

machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/douglas-data-analyst/predictive-analysis

Predictive model for sales forecasting using scikit-learn and machine learning

data-science machine-learning predictive-analytics python sales-forecasting scikit-learn time-series

Last synced: 06 May 2026

https://github.com/lintangwisesa/pdb_mti_ui_lab1_k6

Lab 1 assignment for Big Data Management (Pengelolaan Data Besar), MTI UI 2023

machine-learning python3 scikit-learn

Last synced: 06 May 2026

https://github.com/barbarpotato/applied-data-science-with-python-specialization

This skills-based specialization is intended for learners who have a basic python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network.

data-science matplotlib pandas scikit-learn

Last synced: 06 May 2026

https://github.com/kartheekdama/salary-prediction

This salary prediction model leverages machine learning techniques, including Random Forest, Decision Tree, and Linear Regression, to estimate salaries based on individual attributes such as age, gender, education level, job title, and years of experience. The Random Forest model outperforms the others, achieving the highest R-squared score.

decision-tree exploratory-data-analysis feature-importance linear-regression machine-learning random-forest scikit-learn

Last synced: 06 May 2026

https://github.com/rafay-imraan/recommendation-system

A machine learning model that outputs personalized similar movie recommendations for people based on the ones they have rated positively.

machine-learning pandas python scikit-learn

Last synced: 06 May 2026