An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/idaraabasiudoh/knn-customer-classification

Assigns telecommunications customers to groups to determine the service type each customer requires.

data-analysis jupyter-notebook machine-learning pyhton3 scikit-learn

Last synced: 07 May 2026

https://github.com/joseprsm/nectarine

🍑 Neural Enhanced Collaborative Tool for Automated Recommendation and INtelligent Exploration

argo-workflows recommender-systems scikit-learn tensorflow tensorflow-recommenders

Last synced: 07 May 2026

https://github.com/md-emon-hasan/6-classification-iris-ml-apps

An ML project on the classification of the Iris dataset, demonstrating data preprocessing, model training, and evaluation using Python and scikit-learn.

classification data-science iris-classification iris-dataset iris-flower-classification predictive-modeling scikit-learn

Last synced: 26 Apr 2026

https://github.com/nirmalyabag20/crop-yield-prediction-using-machine-learning

This project uses machine learning to predict crop yields based on factors like region, crop type, rainfall, temperature, and pesticide use. By analyzing a dataset of over 28,000 records, the models provide accurate yield forecasts, helping optimize farming decisions and resource management, ultimately contributing to sustainable agriculture.

jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 06 Feb 2026

https://github.com/singhrahuldps/myscikitlearn

My implementation of some Machine Learning Algorithms from scratch.

classifier-model decision-trees machine-learning scikit-learn

Last synced: 27 Apr 2026

https://github.com/chirindaopensource/measuring_corruption_from_text_data

End-to-End Python implementation of Muço’s (2025) corruption measurement framework. Combines NLP pipeline (regex extraction, Porter stemming, TF-IDF), PCA-based dimensionality reduction, and fixed-effects OLS to quantify institutional quality from Brazilian audit reports. Includes supervised learning robustness checks and LOO sensitivity analysis.

audit-analysis brazilian-data corruption-measurement dictionary-based-classification dimensionality-reduction econometrics fixed-effects government-transparency institutional-quality natural-language-processing nltk political-economy portuguese-nlp principal-component-analysis research-replication scikit-learn supervised-learning text-as-data text-classification text-mining

Last synced: 27 Apr 2026

https://github.com/bala-1409/foreign-exchange-rate-time-series-data-science-project

This project uses time series analysis to forecast the exchange rate between the euro and the US dollar. It applies a variety of statistical techniques, such as ARIMA, to model the data and forecast the exchange rate.

data-analysis data-science data-visualization datapreprocessing eda exploratory-data-analysis forecasting machine-learning-algorithms model modelfitting predictive-modeling python3 scikit-learn statsmodels time-series time-series-analysis

Last synced: 07 May 2026

https://github.com/mrapp-ke/examplewisef1maximizer

A scikit-learn meta-estimator for multi-label classification that aims to maximize the example-wise F1 measure

machine-learning multilabel-classification scikit-learn

Last synced: 27 Apr 2026

https://github.com/texnoforge/texnomagic

TexnoMagic library for digital Magic

gmm magic numpy python recognition scikit-learn scipy

Last synced: 03 Mar 2026

https://github.com/mehuaniket/blog-classifier

Blog classifier using a scikit-learn random forest.

bag-of-words blog-classifier python scikit-learn

Last synced: 07 May 2026

https://github.com/otuemre/realtimenids

Real-time network intrusion detection system using Zeek flow logs and machine learning (IsolationForest). Detects threats with both signature-based and anomaly-based techniques trained on the CSE-CIC-IDS2018 dataset.

anomaly-detection cybersecurity flow-analysis isolation-forest machine-learning network-intrusion-detection nids scapy scikit-learn zeek

Last synced: 07 May 2026

https://github.com/rickiepark/ml-ko

Repository of Korean translations of machine learning and deep learning materials.

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 17 Apr 2026

https://github.com/antonio-f/find-duplicate-questions

Find duplicate questions on StackOverflow by their embeddings. From the Natural Language Processing course - Coursera's Advanced Machine Learning specialization.

cosine-similarity discounted-cumulative-gain embeddings gensim natural-language-processing nlp nltk scikit-learn starspace text-similarity word2vec

Last synced: 27 Apr 2026

https://github.com/tddschn/hack-ncsu-2024

ML and documentation part of our Hack_NCState project, built in less than 1 day | Racial Bias in Criminal Justice Visualized: Code Black

bias machine-learning scikit-learn

Last synced: 08 May 2026

https://github.com/canayter/unsupervised-machine-learning

Utilizing Python and unsupervised learning to predict if cryptocurrencies are affected by 24-hour or 7-day price changes.

k-means-clustering python scikit-learn unsupervised-machine-learning

Last synced: 08 May 2026

https://github.com/cool-japan/sklears

A comprehensive machine learning library in Rust, inspired by scikit-learn's intuitive API and combining it with Rust's performance and safety guarantees.

ai artificial-intelligence machine-learning rust rust-lang scikit-learn scikitlearn-machine-learning

Last synced: 26 Apr 2026

https://github.com/chengetanaim/sentimentanalysisforfinancialnewsnotebook

Building a financial news sentiment classifier. Financial news headlines are classified as positive, negative, or neutral (from an investor's point of view).

logistic-regression machine-learning natural-language-processing scikit-learn tfidf-vectorizer

Last synced: 04 May 2026

https://github.com/anarya22/heart-disease-classification

Predicting heart disease using machine learning. This notebook uses various Python-based ML and data science libraries in an attempt to build a machine learning model capable of predicting whether or not someone has heart disease based on their medical attributes.

data-cleaning data-visualization machine-learning matplotlib numpy pandas scikit-learn

Last synced: 01 May 2026

https://github.com/elifftosunn/bert-bank-model

It is a Turkish BERT-based model that will analyze people's bank complaints and classify them according to one of eight categories.

countvectorizer doc2vec f1-score huggingface huggingface-transformer huggingface-transformers nlp nltk python3 scikit-learn stopwords tagged tfidf-transformer train-test-split word-tokenizer wordnetlemmatizer

Last synced: 12 May 2026

https://github.com/iakshatgandhi/fake-news-classification-model-main

A machine learning-based project designed to classify news articles as real or fake. This system combines advanced natural language processing (NLP), robust machine learning models, and intuitive visualizations to deliver accurate and scalable predictions.

matplotlib nltk pickle python scikit-learn seaborn

Last synced: 09 Oct 2025

https://github.com/gigdevelopment10/neuralfunk

A machine learning resource library for funky ML learners

algorithm keras machine-learning optimization-algorithms py-torch python scikit-learn tensorflow

Last synced: 29 Apr 2026

https://github.com/ahmetcansolak/decision-tree-classifier-scikit-learn

A simple decision tree classifier example using scikit-learn

decision-tree-classifier python scikit-learn

Last synced: 28 Apr 2026

https://github.com/thevarunsharma/extracting-dominant-colors

A web application that extracts the dominant colors from an image using K-means clustering.

flask-application k-means-clustering machine-learning python scikit-learn unsupervised-learning

Last synced: 12 May 2026

https://github.com/jesly-joji/spam-ham-classifier

Spam/ham classifier using the Naive Bayes algorithm and NLP text preprocessing techniques.

naive-bayes-classifier nlp scikit-learn streamlit text-preprocessing

Last synced: 03 May 2026

https://github.com/official-biswadeb941/clopimedi---your-healths-trusted-care

ClopiMedi is an AI-driven healthcare application that simplifies doctor appointment bookings, offering personalized recommendations based on medical conditions to enhance patient-provider connections.

adam ai flask flask-api flask-api-backend full-stack-web-development joblib machine-learning scikit-learn tensorflow

Last synced: 28 Apr 2026

https://github.com/charmee123/krishakvriddhi-final

This site is also deployed on Replit, where you can try it: https://replit.com/@charmee123/KrishakVriddhi?v=1

bootstrap css flask html javascript machine-learning python replit scikit-learn weather-api

Last synced: 14 Apr 2026

https://github.com/nirmalyabag20/breast-cancer-prediction-using-machine-learning

This project leverages machine learning to classify breast cancer as malignant or benign based on tumor characteristics. By applying and evaluating multiple algorithms, the model achieves high accuracy, demonstrating the practical application of data-driven solutions in medical diagnostics.

logistic-regression matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Feb 2026

https://github.com/alessiochen/setiment-analysis-ai-project

Application of sentiment analysis for the Artificial Intelligence class at UNIFI

ai andrew dataset movie-reviews scikit-learn sentiment-analysis

Last synced: 12 May 2026

https://github.com/nmsby/pca-machine-learning-lab

Principal Component Analysis (PCA) implementation and analysis lab for Machine Learning. Features manual PCA implementation, scikit-learn applications, data compression, and feature extraction with detailed visualizations.

data-analysis dimensionality-reduction jupyter-notebook machine-learning numpy pca python scikit-learn visualization

Last synced: 01 May 2026

https://github.com/byigitt/smartmove

Fake data generation and analysis for Ankara metro station

ankara cv2 metro numpy pandas scikit-learn

Last synced: 03 May 2026

https://github.com/kritimbist/365-days-of-github-challenge-ai-machine-learning

This repository is part of my 365 Days Challenge: AI × Machine learning, where I channel my passion for machine learning 🤖 to learn, build, and document projects every single day for one year.

data-science data-visualization deep-learning machine-learning matplotlib numpy python scikit-learn

Last synced: 28 Apr 2026

https://github.com/md-emon-hasan/ai-from-university

🎓 Collection of academic resources, projects, and exercises related to artificial intelligence concepts learned in university coursework.

ai artificial-intelligence linear-regression logestic-regression mahcine-learning ml scikit-learn

Last synced: 17 Apr 2026

https://github.com/francescopaolol/logisticregression

Predicting survival on the Titanic and getting familiar with ML basics

jupyter-notebook kaggle logistic-regression machine-learning ml pandas scikit-learn

Last synced: 16 Apr 2026

https://github.com/aliy98/navigation-sensor-data-classification

Classification of a Navigation Robot Sensor Dataset Using SVM, Random Forest and Neural Network

artificial-neural-networks keras multiclass-classification random-forest scikit-learn scitos-g5 support-vector-machines

Last synced: 13 May 2026

https://github.com/aakanksha1406/fake-news-classifier

Identifies when an article might be fake news

keras lstm lstm-neural-networks nltk python scikit-learn tensorflow

Last synced: 13 Feb 2026

https://github.com/adithaker/falafel

🤖 A from-scratch implementation of a small-scale federated learning application.

cli-app distributed-systems federated-learning logistic-regression python scikit-learn

Last synced: 28 Apr 2026

https://github.com/h-fuzzy-logic/python-finding-nsf-award-themes

Using NLP to find themes and concepts in NSF Awards

nltk pandas python scikit-learn

Last synced: 03 May 2026

https://github.com/baggiponte/ta-statistics-for-big-data-2022

🎓 Introduction to Python and Machine Learning [UniMi • AY 2021/2022]

clustering data-science data-visualization machine-learning python scikit-learn

Last synced: 03 May 2026

https://github.com/siam29/credit-card-fraud-detection-in-real-time

This project delivers a fast and efficient fraud detection methodology, providing predictions in under a second, emphasizing the importance of both high performance and quick response times.

ensemble-machine-learning feature-selection genetic-algorithm machine-learning matplotlib pandas pca scikit-learn

Last synced: 03 May 2026

https://github.com/carmoreno/analisisaccidentalidadbogota

Data analysis of traffic accidents in Bogotá, Colombia.

data-analysis data-science jupyer-notebook matplotlib numpy pandas scikit-learn

Last synced: 17 Apr 2026

https://github.com/harshitwaldia/stock-price-prediction

An AI-driven stock market analysis dashboard that predicts next-day stock prices using a deep learning LSTM model. The project features: 🔮 AI Predictions for stock movements 🌍 Global market support (US, India, China, Japan, UK) 📊 Interactive React dashboard with charts & recent searches ⚡ Flask backend powered by TensorFlow/Keras & Yahoo Finance

dashboard flask flask-cors keras-tensorflow lstm-neural-networks machine-learning numpy react-typescript scikit-learn stock-price-prediction

Last synced: 03 May 2026

https://github.com/ivanyu/kaggle-digit-recognizer

Kaggle's "Digit Recognizer" competition

kaggle keras machine-learning scikit-learn

Last synced: 17 Apr 2026

https://github.com/loong64/onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ai-framework deep-learning hardware-acceleration loong64 loongarch64 machine-learning neural-networks onnx pytorch scikit-learn tensorflow

Last synced: 09 May 2026

https://github.com/lakshitalearning/churninsight

Customer Churn prediction means knowing which customers are likely to leave or unsubscribe from your service.

churn-prediction data-science flask google-colab machine-learning predictive-analytics python scikit-learn user-retention web-development

Last synced: 09 May 2026

https://github.com/davidcamilo0710/hate_speech_analysis

Hate speech detection using NLP for linguistic analysis and machine learning (XGBoost) for classification with Python and spaCy.

hate-speech-detection linguistic-analysis nlp scikit-learn spacy xgboost

Last synced: 09 May 2026

https://github.com/bhuvaneshwarguttula/student-performance-indicator

Understands and predicts how student performance (test scores) is affected by other variables (gender, ethnicity, parental level of education, lunch, test preparation course).

exploratory-data-analysis machine-learning pandas python scikit-learn student-performance-analysis

Last synced: 07 Mar 2026

https://github.com/vishal-038/attendance_by_face_recogination

This project is a face recognition-based attendance system that uses Python, OpenCV, Scikit-learn, Streamlit, and various other libraries like Pandas, Numpy, Datetime, and OS for different functionalities. It enables adding faces to the database, taking attendance based on face recognition, and showing live attendance through a web interface built

opencv python scikit-learn

Last synced: 14 Feb 2026

https://github.com/ultrasage-danz/scikit-learn-ml

Machine Learning with scikit-learn by Data School

ai data data-school machine-learning macos ml scikit-learn ultrasage-dan

Last synced: 13 May 2026

https://github.com/hq969/customer-churn-prediction-with-hyperparameter-optimization-and-model-deployment

A complete end-to-end machine learning project that predicts customer churn using the Telco dataset. It includes data preprocessing, exploratory data analysis (EDA), model training with Random Forest, hyperparameter tuning, evaluation, and deployment via a Flask API.

flask numpy pandas python scikit-learn xgboost

Last synced: 02 Apr 2026

https://github.com/mg380/ibm-applied-data-science-capstone

This capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization; it summarizes, in the form of a project, all the material learned during the specialization.

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 05 Mar 2026

https://github.com/the-developer-306/house-price-predictor

House Price Predictor: Harnessing machine learning algorithms to forecast housing prices in Boston, empowering buyers and sellers with accurate predictions based on key factors like location, crime rate, rooms, accessibility, and more.

csv ipynb-jupyter-notebook joblib matplotlib numpy pandas python scikit-learn

Last synced: 23 Feb 2026

https://github.com/rakibhhridoy/supportvectormachinein-medical

Support vector machines in medical disease detection. An SVM can fit both linear and non-linear data through its choice of kernel. In medical applications, the focus is on precision or recall rather than accuracy.

diabetes-prediction machine-learning medical precision-medicine recall-precision scikit-learn support-vector-machines svm

Last synced: 29 Apr 2026
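
The entry above notes that medical applications favor precision or recall over accuracy. A minimal sketch of why, using only the Python standard library: the patient counts and both "classifiers" below are made-up illustrations, not data or code from the listed repository.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical screening set: 5 sick patients (label 1) out of 100.
y_true = [1] * 5 + [0] * 95

# Hypothetical model A: always predicts "healthy".
always_healthy = [0] * 100

# Hypothetical model B: finds 4 of the 5 sick patients, with 10 false alarms.
screening = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

acc_a, prec_a, rec_a = metrics(y_true, always_healthy)
acc_b, prec_b, rec_b = metrics(y_true, screening)

print(f"A: accuracy={acc_a:.2f} precision={prec_a:.2f} recall={rec_a:.2f}")
print(f"B: accuracy={acc_b:.2f} precision={prec_b:.2f} recall={rec_b:.2f}")
```

Model A scores higher on accuracy yet detects no sick patient at all, while model B trades some accuracy for a recall of 0.8. On imbalanced data like this, accuracy alone hides the failure that matters most.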

https://github.com/akhil888binoy/intelligent-supplychain-management-system

Blockchain-powered supply chain management system with ML-driven sales prediction. Streamlines supplier-employee transactions and inventory management. Built with MERN stack, Solidity, and Flask.

blockchain decentralized-payments ethereum express flask foundry hackathon-project inventory-management machine-learning mern-stack mongodb nodejs python react sales-prediction scikit-learn smart-contracts solidity supply-chain-management wagmi

Last synced: 09 Oct 2025

https://github.com/chitralputhran/drive-curve-machine-learning-app

🚙 Drive Curve is a web application made with the help of Flask, a microframework for Python based on Werkzeug, Jinja 2, and good intentions. On the backend, a Machine Learning model is used for predicting the price of the car. The machine learning model was trained on the Automobile Dataset from the UCI Machine Learning Repository.

flask machine-learning python scikit-learn webapp

Last synced: 03 May 2026

https://github.com/RickContreras/StudentPerformancePredictionSaberPro

Classification model to predict student performance on the Saber Pro tests in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 24 Oct 2025

https://github.com/andresmg07/real-time-sign-language-translator

AI-driven real-time American Sign Language translator. Implemented using Support Vector Machines (SVM), the OpenCV library, and the MediaPipe hands module.

ai computer-vision machine-learning mediapipe opencv pattern-recognition scikit-learn support-vector-machines

Last synced: 16 Apr 2026

https://github.com/jasper-koops/easy-gscv

This library allows you to quickly train machine learning classifiers by automatically splitting the data set and using both grid search and cross validation in the training process.

classification machine-learning python3 scikit-learn

Last synced: 14 Feb 2026

https://github.com/siam29/ensemble-majority-voting-hard

In this project, we implemented an ensemble learning approach using majority voting (hard voting) with five machine learning classifiers: DT, RF, XGBC, ANN, and KNN. The ensemble model achieved an impressive accuracy score of 99.95% and an F1 score of 85.51%.

credit-card-fraud ensemble-learning machine-learning matplotlib pandas scikit-learn

Last synced: 09 May 2026
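
The entry above describes majority (hard) voting across five classifiers. A minimal standard-library sketch of the voting step alone: the per-model predictions below are invented for illustration and do not come from the listed repository.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one result per sample.

    predictions_per_model holds one list of predicted labels per model,
    all of the same length. With an odd number of binary classifiers a
    tie cannot occur; otherwise ties go to the label seen first.
    """
    voted = []
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical outputs of five models on four samples (1 = fraud, 0 = legitimate).
model_preds = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]

ensemble = hard_vote(model_preds)
print(ensemble)
```

In scikit-learn itself, the same idea is available as `VotingClassifier` with `voting="hard"`, which also handles fitting the underlying estimators.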

https://github.com/garcane/Income-Prediction-ML

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 24 Oct 2025

https://github.com/zazi2002/machine-learning-project

Introduction to Machine Learning project with the goal of improving the classification performance on a dataset by optimizing the number of features and weak learners.

dimentionality-reduction ensemble-learning numpy pca random-forest scikit-learn

Last synced: 02 May 2026

https://github.com/prashver/titanic-survival-prediction

This project tackles the Titanic challenge on Kaggle, predicting passenger survival based on variables like age, sex, and passenger class. The Jupyter notebook covers essential steps of a data science pipeline, including exploratory data analysis, data cleaning, feature engineering, and modeling. The dataset used is the Titanic dataset.

classification-algorithm machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn

Last synced: 02 May 2026

https://github.com/pankajarm/tabular_ml_toolkit

A helper library to jumpstart your machine learning project based on tabular or structured data.

data-science feature-engineering hyperparameter-tuning machine-learning parallelism python scikit-learn structured-data tabular xgboost

Last synced: 19 Jan 2026

https://github.com/rakibhhridoy/machinelearning-featureselection

Before training or feeding a model, the first priority is the data, not the model. The better the data is preprocessed and engineered, the more the model will learn. Feature selection is one of the methods for processing data before feeding it to the model. Various feature selection techniques are shown here.

extratreesclassifier feature-selection gridsearchcv lasso-regression logistic-regression machine-learning numpy pandas pca rfe rfecv scikit-learn selectkbest

Last synced: 02 May 2026

https://github.com/khaymanii/titanic_survival_prediction_-model

This model was built using Python and the logistic regression algorithm.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 02 May 2026

https://github.com/umar-saadat/car-price-prediction-ml

🚗 A Machine Learning project that predicts the price of used cars using Linear Regression. Built with Python, Scikit-learn, and Streamlit, this app takes inputs like car brand, year, mileage, engine size, and more to estimate the selling price in real-time

ai-project car-price-prediction data-science linear-regression machine-learning ml-project python scikit-learn streamlit

Last synced: 02 May 2026

https://github.com/gauravsingh9356/machine_learning

All my hands-on machine learning work, from data processing to deep learning.

deep-learning jupyter-notebook machine-learning machine-learning-algorithms nlp-machine-learning python scikit-learn

Last synced: 30 Apr 2026

https://github.com/alam025/customer-churn-prediction

🎯 Predict customer churn with 96%+ accuracy using Random Forest ML. Beautiful visualizations, production-ready code, and real business impact. Save revenue before customers leave! 🚀

churn-prediction classification customer-analytics customer-churn customer-retention data-science machine-learning pandas predictive-analytics python random-forest scikit-learn

Last synced: 11 Jun 2026

https://github.com/petrosdemetrakopoulos/flight-passengers-prediction

A supervised learning problem given as a project in the "Data Mining in Databases and World Wide Web" course in the Computer Science Department of AUEB in the winter semester of 2019.

classification classifier data-science machine-learning python scikit-learn sklearn university-project

Last synced: 30 Apr 2026

https://github.com/t-abishek/embedded-intent-classifier

A production-grade FastAPI application that uses sentence embeddings to classify user prompts into 4 categories. Built using Python, BGE SentenceTransformer, Scikit-learn, and FastAPI.

classifier embedded huggingface pandas scikit-learn transformer

Last synced: 10 May 2026

https://github.com/bistcuite/plainml

Painless Machine Learning Library for Python based on scikit-learn

machine-learning ml plainml python scikit-learn

Last synced: 02 May 2026

https://github.com/zachpinto/xc-rankings-predictions

Applied ML Project predicting cross-country team rankings based on individual-level performances

random-forest scikit-learn

Last synced: 29 Apr 2026

https://github.com/ayyucedemirbas/solar_power_elasticnet

ElasticNet Linear Regression on Solar Power Generation

elasticnet-regression scikit-learn skops tabular-regression

Last synced: 29 Apr 2026

https://github.com/dhavaltaunk08/gender-classification

I did this project during my internship at IIT Guwahati. It aimed to perform gender classification in video streaming.

deep-learning librosa opencv-python python scikit-learn

Last synced: 14 May 2026

https://github.com/sapsan14/water-quality-ee

Estonian water quality ML — binary classification of Terviseamet open data, Jupyter + scikit-learn.

classification estonia jupyter ml open-data scikit-learn

Last synced: 02 May 2026

https://github.com/aryansk/customer-segmentation-analysis

Advanced customer segmentation project using K-Means clustering to analyze customer behavior based on annual income, spending score, and age.

elbow-method exploratory-data-analysis machine-learning machine-learning-algorithms python scikit-learn sentiment-analysis sentiment-classification

Last synced: 29 Apr 2026

https://github.com/bestmahdi2/uni__dataminningstackoverflowproject

A university project for a data mining course, analyzing StackOverflow website data with Python.

cart csv data-mining logistic-regression matplotlib mlp naive-bayes nltk numpy pandas python scikit-learn scipy seaborn stackoverflow svc textblob tqdm xgboost

Last synced: 16 Feb 2026