scikit-learn
scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.
- GitHub: https://github.com/topics/scikit-learn
- Wikipedia: https://en.wikipedia.org/wiki/Scikit-learn
- Repo: https://github.com/scikit-learn/scikit-learn
- Created by: David Cournapeau
- Released: January 05, 2010
- Related Topics: scikit, python
- Aliases: sklearn
- Last updated: 2026-06-11 00:27:27 UTC
https://github.com/idaraabasiudoh/knn-customer-classification
Classifies telecommunications customers into groups to determine the service type each customer requires.
data-analysis jupyter-notebook machine-learning pyhton3 scikit-learn
Last synced: 07 May 2026
https://github.com/gerdm/machine_learning
A repository with a bunch of machine learning
analyses data-science machine-learning machine-learning-algorithms scikit-learn
Last synced: 30 Apr 2026
https://github.com/joseprsm/nectarine
🍑 Neural Enhanced Collaborative Tool for Automated Recommendation and INtelligent Exploration
argo-workflows recommender-systems scikit-learn tensorflow tensorflow-recommenders
Last synced: 07 May 2026
https://github.com/3rd-son/breast-cancer-prediction-app
classification-algorithm machine-learning python scikit-learn
Last synced: 26 Apr 2026
https://github.com/md-emon-hasan/6-classification-iris-ml-apps
An ML project on the classification of the Iris dataset, demonstrating data preprocessing, model training, and evaluation using Python and scikit-learn.
classification data-science iris-classification iris-dataset iris-flower-classification predictive-modeling scikit-learn
Last synced: 26 Apr 2026
https://github.com/nirmalyabag20/crop-yield-prediction-using-machine-learning
This project uses machine learning to predict crop yields based on factors like region, crop type, rainfall, temperature, and pesticide use. By analyzing a dataset of over 28,000 records, the models provide accurate yield forecasts, helping optimize farming decisions and resource management, ultimately contributing to sustainable agriculture.
jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn
Last synced: 06 Feb 2026
https://github.com/singhrahuldps/myscikitlearn
My implementation of some Machine Learning Algorithms from scratch.
classifier-model decision-trees machine-learning scikit-learn
Last synced: 27 Apr 2026
https://github.com/chirindaopensource/measuring_corruption_from_text_data
End-to-End Python implementation of Muço’s (2025) corruption measurement framework. Combines NLP pipeline (regex extraction, Porter stemming, TF-IDF), PCA-based dimensionality reduction, and fixed-effects OLS to quantify institutional quality from Brazilian audit reports. Includes supervised learning robustness checks and LOO sensitivity analysis.
audit-analysis brazilian-data corruption-measurement dictionary-based-classification dimensionality-reduction econometrics fixed-effects government-transparency institutional-quality natural-language-processing nltk political-economy portuguese-nlp principal-component-analysis research-replication scikit-learn supervised-learning text-as-data text-classification text-mining
Last synced: 27 Apr 2026
https://github.com/bala-1409/foreign-exchange-rate-time-series-data-science-project
This project uses time series analysis to forecast the exchange rate between the euro and the US dollar. It applies statistical techniques such as ARIMA to model the data and forecast the exchange rate.
data-analysis data-science data-visualization datapreprocessing eda exploratory-data-analysis forecasting machine-learning-algorithms model modelfitting predictive-modeling python3 scikit-learn statsmodels time-series time-series-analysis
Last synced: 07 May 2026
https://github.com/mrapp-ke/examplewisef1maximizer
A scikit-learn meta-estimator for multi-label classification that aims to maximize the example-wise F1 measure
machine-learning multilabel-classification scikit-learn
Last synced: 27 Apr 2026
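The example-wise F1 measure named in the entry above averages an F1 score computed separately for each example's label set. The sketch below is a plain-Python illustration of that metric, not the repository's code; the function name `example_wise_f1` and the convention of scoring an example with no true and no predicted labels as 1.0 are assumptions made here. scikit-learn exposes a per-sample average of this kind through `f1_score(..., average="samples")`.

```python
def example_wise_f1(y_true, y_pred):
    """Mean of per-example F1 scores over binary label-indicator rows."""
    scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        # Labels that are both relevant and predicted for this example.
        tp = sum(1 for t, p in zip(true_row, pred_row) if t == 1 and p == 1)
        n_true = sum(true_row)
        n_pred = sum(pred_row)
        if n_true + n_pred == 0:
            # Assumption: an example with no true and no predicted labels scores 1.0.
            scores.append(1.0)
        else:
            # F1 = 2 * TP / (|true labels| + |predicted labels|)
            scores.append(2 * tp / (n_true + n_pred))
    return sum(scores) / len(scores)


# Three examples, three possible labels each (made-up data).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
score = example_wise_f1(y_true, y_pred)
print(round(score, 4))
```

Each row contributes equally to the mean, so an example with many labels does not dominate the score.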
https://github.com/texnoforge/texnomagic
TexnoMagic library for digital Magic
gmm magic numpy python recognition scikit-learn scipy
Last synced: 03 Mar 2026
https://github.com/mehuaniket/blog-classifier
Blog classifier using a scikit-learn random forest.
bag-of-words blog-classifier python scikit-learn
Last synced: 07 May 2026
https://github.com/otuemre/realtimenids
Real-time network intrusion detection system using Zeek flow logs and machine learning (IsolationForest). Detects threats with both signature-based and anomaly-based techniques trained on the CSE-CIC-IDS2018 dataset.
anomaly-detection cybersecurity flow-analysis isolation-forest machine-learning network-intrusion-detection nids scapy scikit-learn zeek
Last synced: 07 May 2026
https://github.com/islam-hady9/smartai_customersupport
Smart Customer Support Assistant
customer-support gpt-2 natural-language-processing python pytorch scikit-learn transformers
Last synced: 17 Feb 2026
https://github.com/francescopaolol/linearregression
Predicting house sale prices in King County
jupyter-notebook kaggle linear-regression machine-learning ml pandas scikit-learn
Last synced: 18 Apr 2026
https://github.com/rickiepark/ml-ko
Korean translation repository for machine learning and deep learning
deep-learning keras machine-learning python scikit-learn tensorflow
Last synced: 17 Apr 2026
https://github.com/antonio-f/find-duplicate-questions
Find duplicate questions on StackOverflow by their embeddings. From the Natural Language Processing course - Coursera's Advanced Machine Learning specialization.
cosine-similarity discounted-cumulative-gain embeddings gensim natural-language-processing nlp nltk scikit-learn starspace text-similarity word2vec
Last synced: 27 Apr 2026
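Finding duplicates by embedding, as the entry above describes, usually comes down to ranking candidate questions by cosine similarity to the query vector. The following is a minimal plain-Python sketch of that ranking step; the vectors, the ids `q1` to `q3`, and the helper names are made up for illustration and are not taken from the repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidates):
    """Return candidate ids sorted by descending similarity to the query."""
    scored = [(cosine_similarity(query_vec, vec), cid)
              for cid, vec in candidates.items()]
    scored.sort(reverse=True)
    return [cid for _, cid in scored]


# Hypothetical 3-dimensional "embeddings" for a query and three candidates.
query = [1.0, 0.0, 1.0]
candidates = {
    "q1": [0.9, 0.1, 1.1],  # points in almost the same direction as the query
    "q2": [0.0, 1.0, 0.0],  # orthogonal to the query
    "q3": [1.0, 1.0, 0.0],  # partially overlapping
}
ranking = rank_candidates(query, candidates)
print(ranking)
```

Cosine similarity ignores vector length, so a long and a short question with the same direction in embedding space score as near-duplicates.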
https://github.com/tddschn/hack-ncsu-2024
ML and doc part of our Hack_NCState project, built in less than 1 day | Racial Bias in Criminal Justice Visualized: Code Black
bias machine-learning scikit-learn
Last synced: 08 May 2026
https://github.com/canayter/unsupervised-machine-learning
Utilizing Python and unsupervised learning to predict if cryptocurrencies are affected by 24-hour or 7-day price changes.
k-means-clustering python scikit-learn unsupervised-machine-learning
Last synced: 08 May 2026
https://github.com/dolongbien/ml2018
Machine Learning Fall 2018
decision-tree-classifier dimensionality-reduction jupyter-notebook machine-learning-algorithms naive-bayes-classifier neural-networks python scikit-learn
Last synced: 01 May 2026
https://github.com/cool-japan/sklears
A comprehensive machine learning library in Rust, inspired by scikit-learn's intuitive API and combining it with Rust's performance and safety guarantees.
ai artificial-intelligence machine-learning rust rust-lang scikit-learn scikitlearn-machine-learning
Last synced: 26 Apr 2026
https://github.com/brenofariasdasilva/scientific-research
My Scientific Research Code Repository.
ck code-metrics commons-lang jabref matplotlib numpy pandas pydriller python scientific-research scikit-learn similarity-measures statistical-analysis wem word2vec worked-example worked-example-miner
Last synced: 16 Apr 2026
https://github.com/chengetanaim/sentimentanalysisforfinancialnewsnotebook
Building a financial news sentiment classifier. Financial news headlines are classified as positive, negative, or neutral (from an investor's point of view).
logistic-regression machine-learning natural-language-processing scikit-learn tfidf-vectorizer
Last synced: 04 May 2026
https://github.com/anarya22/heart-disease-classification
Predicting heart disease using machine learning. This notebook explores various Python-based ML and data science libraries in an attempt to build a machine learning model capable of predicting whether or not someone has heart disease based on their medical attributes.
data-cleaning data-visualization machine-learning matplotlib numpy pandas scikit-learn
Last synced: 01 May 2026
https://github.com/elifftosunn/bert-bank-model
It is a Turkish BERT-based model that will analyze people's bank complaints and classify them according to one of eight categories.
countvectorizer doc2vec f1-score huggingface huggingface-transformer huggingface-transformers nlp nltk python3 scikit-learn stopwords tagged tfidf-transformer train-test-split word-tokenizer wordnetlemmatizer
Last synced: 12 May 2026
https://github.com/filipspl/optuml
Optuna-optimized ML methods with a scikit-learn-like API
hyperparameter-optimization hyperparameter-tuning machine-learning optuna python python-module scikit-learn
Last synced: 04 Apr 2026
https://github.com/iakshatgandhi/fake-news-classification-model-main
A machine learning-based project designed to classify news articles as real or fake. This system combines advanced natural language processing (NLP), robust machine learning models, and intuitive visualizations to deliver accurate and scalable predictions.
matplotlib nltk pickle python scikit-learn seaborn
Last synced: 09 Oct 2025
https://github.com/gigdevelopment10/neuralfunk
A Machine learning resource library for funky ML-Learners
algorithm keras machine-learning optimization-algorithms py-torch python scikit-learn tensorflow
Last synced: 29 Apr 2026
https://github.com/ahmetcansolak/decision-tree-classifier-scikit-learn
A simple decision tree classifier example using scikit-learn
decision-tree-classifier python scikit-learn
Last synced: 28 Apr 2026
https://github.com/thevarunsharma/extracting-dominant-colors
A web application that extracts the dominant colors from an image using K-means clustering.
flask-application k-means-clustering machine-learning python scikit-learn unsupervised-learning
Last synced: 12 May 2026
https://github.com/jbris/python_data_profiler_comparison
Comparison between several Python data profile libraries.
automl deepchecks evidently evidentlyai reporting reporting-tool scikit-learn scklearn why ydata ydata-profiling
Last synced: 09 Jun 2026
https://github.com/jesly-joji/spam-ham-classifier
Uses the Naive Bayes algorithm and NLP text preprocessing techniques
naive-bayes-classifier nlp scikit-learn streamlit text-preprocessing
Last synced: 03 May 2026
https://github.com/official-biswadeb941/clopimedi---your-healths-trusted-care
ClopiMedi is an AI-driven healthcare application that simplifies doctor appointment bookings, offering personalized recommendations based on medical conditions to enhance patient-provider connections.
adam ai flask flask-api flask-api-backend full-stack-web-development joblib machine-learning scikit-learn tensorflow
Last synced: 28 Apr 2026
https://github.com/jofaval/pima-indian-diabetes
Data Analysis and Classification of Pima Indian Women's Diabetes in 1988
data-analysis data-science deep-learning google-colab kaggle logistic-regression machine-learning pima-diabetes-data python scikit-learn xgboost
Last synced: 16 Apr 2026
https://github.com/carpentries-incubator/python-classifying-power-consumption
Clustering and Classifying Time Series Data for Engineers
carpentries-incubator classification clustering engineering english lesson power-consumption pre-alpha python scikit-learn sklearn
Last synced: 12 Feb 2026
https://github.com/anmol-g-k/smart-grid-electricity-theft-detection
Smart Grid Electricity Theft Detection Powered by Machine Learning
colab-notebook jupyter-notebook machine-learning pandas python scikit-learn scikitlearn-machine-learning
Last synced: 04 Apr 2026
https://github.com/kelvinjuliusarmandoh/loan-approval-prediction
Implementing Machine Learning for predicting loan approval
classification loan-default-prediction machine-learning matplotlib-pyplot numpy pandas scikit-learn seaborn
Last synced: 01 May 2026
https://github.com/charmee123/krishakvriddhi-final
This site is also deployed on Replit, where it can be tried out: https://replit.com/@charmee123/KrishakVriddhi?v=1
bootstrap css flask html javascript machine-learning python replit scikit-learn weather-api
Last synced: 14 Apr 2026
https://github.com/glubbdubdrib/lazygrid
Automatic, efficient and flexible implementation of complex machine learning pipeline generation and cross-validation.
artificial-intelligence cross-validation database grid-search grid-search-hyperparameters keras machine-learning memoization model-comparison model-selection neural-networks openml optimization pipeline python scikit-learn tensorflow
Last synced: 29 Jan 2026
https://github.com/nirmalyabag20/breast-cancer-prediction-using-machine-learning
This project leverages machine learning to classify breast cancer as malignant or benign based on tumor characteristics. By applying and evaluating multiple algorithms, the model achieves high accuracy, demonstrating the practical application of data-driven solutions in medical diagnostics.
logistic-regression matplotlib numpy pandas python scikit-learn seaborn
Last synced: 12 Feb 2026
https://github.com/alessiochen/setiment-analysis-ai-project
Application of Sentiment Analysis for the Artificial Intelligence class at UNIFI
ai andrew dataset movie-reviews scikit-learn sentiment-analysis
Last synced: 12 May 2026
https://github.com/nmsby/pca-machine-learning-lab
Principal Component Analysis (PCA) implementation and analysis lab for Machine Learning. Features manual PCA implementation, scikit-learn applications, data compression, and feature extraction with detailed visualizations.
data-analysis dimensionality-reduction jupyter-notebook machine-learning numpy pca python scikit-learn visualization
Last synced: 01 May 2026
https://github.com/byigitt/smartmove
Fake data generation and analysis for Ankara metro station
ankara cv2 metro numpy pandas scikit-learn
Last synced: 03 May 2026
https://github.com/sarmadahmad8/ml-and-deeplearning-projects-for-beginners
Beginner ML/DL projects spanning core libraries and problem sets.
beginner-friendly data-analysis data-science deep-learning fastai machine-learning opencv pytorch scikit-learn transformer
Last synced: 17 Apr 2026
https://github.com/kritimbist/365-days-of-github-challenge-ai-machine-learning
This repository is part of my 365 Days Challenge: AI × Machine learning, where I combine my passion for Machine Learning 🤖 to learn, build, and document projects every single day for one year.
data-science data-visualization deep-learning machine-learning matplotlib numpy python scikit-learn
Last synced: 28 Apr 2026
https://github.com/md-emon-hasan/ai-from-university
🎓 Collection of academic resources, projects, and exercises related to artificial intelligence concepts learned in university coursework.
ai artificial-intelligence linear-regression logestic-regression mahcine-learning ml scikit-learn
Last synced: 17 Apr 2026
https://github.com/francescopaolol/logisticregression
Predicting survival on the Titanic while getting familiar with ML basics
jupyter-notebook kaggle logistic-regression machine-learning ml pandas scikit-learn
Last synced: 16 Apr 2026
https://github.com/hemanthkumarsunkari27/heart-disease-prediction-using-ml
Heart Disease Prediction System using Machine Learning (Flask + Bootstrap)
bootstrap flask healthcareprediction machine-learning numpy-library pandas-library python3 scikit-learn webpage
Last synced: 08 Oct 2025
https://github.com/aliy98/navigation-sensor-data-classification
Classification of a Navigation Robot Sensor Dataset Using SVM, Random Forest and Neural Network
artificial-neural-networks keras multiclass-classification random-forest scikit-learn scitos-g5 support-vector-machines
Last synced: 13 May 2026
https://github.com/aakanksha1406/fake-news-classifier
Identifies when an article might be fake news
keras lstm lstm-neural-networks nltk python scikit-learn tensorflow
Last synced: 13 Feb 2026
https://github.com/adithaker/falafel
🤖 A from-scratch implementation of a small scaled federated learning application.
cli-app distributed-systems federated-learning logistic-regression python scikit-learn
Last synced: 28 Apr 2026
https://github.com/mdh266/kmeans
Creating A Scikit-Learn Compatible Clustering Algorithm
algorithms clustering data-science machine-learning machine-learning-algorithms scikit-learn unsupervised-learning
Last synced: 18 May 2026
https://github.com/h-fuzzy-logic/python-finding-nsf-award-themes
Using NLP to find themes and concepts in NSF Awards
nltk pandas python scikit-learn
Last synced: 03 May 2026
https://github.com/mpolinowski/manifold-learning-for-image-segmentation
Use Manifold Learning, Mapping and Discriminant Analysis to Visualize Image Datasets
fisher-discriminant-analysis image-segmentation isomap locally-linear-embedding principal-component-analysis python scikit-learn
Last synced: 03 May 2026
https://github.com/baggiponte/ta-statistics-for-big-data-2022
🎓 Introduction to Python and Machine Learning [UniMi • AY 2021/2022]
clustering data-science data-visualization machine-learning python scikit-learn
Last synced: 03 May 2026
https://github.com/siam29/credit-card-fraud-detection-in-real-time
This project delivers a fast and efficient fraud detection methodology, providing predictions in under a second, emphasizing the importance of both high performance and quick response times.
ensemble-machine-learning feature-selection genetic-algorithm machine-learning matplotlib pandas pca scikit-learn
Last synced: 03 May 2026
https://github.com/carmoreno/analisisaccidentalidadbogota
Data Analysis about traffic accidents at Bogotá, Colombia.
data-analysis data-science jupyer-notebook matplotlib numpy pandas scikit-learn
Last synced: 17 Apr 2026
https://github.com/harshitwaldia/stock-price-prediction
An AI-driven stock market analysis dashboard that predicts next-day stock prices using a deep learning LSTM model. The project features: 🔮 AI Predictions for stock movements 🌍 Global market support (US, India, China, Japan, UK) 📊 Interactive React dashboard with charts & recent searches ⚡ Flask backend powered by Tensor/Keras & Yahoo Finance
dashboard flask flask-cors keras-tensorflow lstm-neural-networks machine-learning numpy react-typescript scikit-learn stock-price-prediction
Last synced: 03 May 2026
https://github.com/ivanyu/kaggle-digit-recognizer
Kaggle's "Digit Recognizer" competition
kaggle keras machine-learning scikit-learn
Last synced: 17 Apr 2026
https://github.com/loong64/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
ai-framework deep-learning hardware-acceleration loong64 loongarch64 machine-learning neural-networks onnx pytorch scikit-learn tensorflow
Last synced: 09 May 2026
https://github.com/lakshitalearning/churninsight
Customer Churn prediction means knowing which customers are likely to leave or unsubscribe from your service.
churn-prediction data-science flask google-colab machine-learning predictive-analytics python scikit-learn user-retention web-development
Last synced: 09 May 2026
https://github.com/davidcamilo0710/hate_speech_analysis
Hate speech detection using NLP for linguistic analysis and machine learning (XGBoost) for classification with Python and SpaCy.
hate-speech-detection linguistic-analysis nlp scikit-learn spacy xgboost
Last synced: 09 May 2026
https://github.com/bhuvaneshwarguttula/student-performance-indicator
To understand and predict how the student's performance (test scores) is affected by the other variables (Gender, Ethnicity, Parental level of education, Lunch, Test preparation course).
exploratory-data-analysis machine-learning pandas python scikit-learn student-performance-analysis
Last synced: 07 Mar 2026
https://github.com/vishal-038/attendance_by_face_recogination
This project is a face recognition-based attendance system that uses Python, OpenCV, Scikit-learn, Streamlit, and various other libraries like Pandas, Numpy, Datetime, and OS for different functionalities. It enables adding faces to the database, taking attendance based on face recognition, and showing live attendance through a web interface built
Last synced: 14 Feb 2026
https://github.com/ultrasage-danz/scikit-learn-ml
Machine Learning with scikit-learn by Data School
ai data data-school machine-learning macos ml scikit-learn ultrasage-dan
Last synced: 13 May 2026
https://github.com/hq969/customer-churn-prediction-with-hyperparameter-optimization-and-model-deployment
A complete end-to-end machine learning project that predicts customer churn using the Telco dataset. It includes data preprocessing, exploratory data analysis (EDA), model training with Random Forest, hyperparameter tuning, evaluation, and deployment via a Flask API.
flask numpy pandas python scikit-learn xgboost
Last synced: 02 Apr 2026
https://github.com/mg380/ibm-applied-data-science-capstone
This Capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization. It summarises, in the form of a project, all the material learned during the specialization.
capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql
Last synced: 05 Mar 2026
https://github.com/the-developer-306/house-price-predictor
House Price Predictor: Harnessing machine learning algorithms to forecast housing prices in Boston, empowering buyers and sellers with accurate predictions based on key factors like location, crime rate, rooms, accessibility, and more.
csv ipynb-jupyter-notebook joblib matplotlib numpy pandas python scikit-learn
Last synced: 23 Feb 2026
https://github.com/rakibhhridoy/supportvectormachinein-medical
Support vector machines in medical disease detection. Both linear and non-linear data can be fitted with an SVM through its choice of kernel. In medical applications, the focus is on precision or recall rather than accuracy.
diabetes-prediction machine-learning medical precision-medicine recall-precision scikit-learn support-vector-machines svm
Last synced: 29 Apr 2026
https://github.com/akhil888binoy/intelligent-supplychain-management-system
Blockchain-powered supply chain management system with ML-driven sales prediction. Streamlines supplier-employee transactions and inventory management. Built with MERN stack, Solidity, and Flask.
blockchain decentralized-payments ethereum express flask foundry hackathon-project inventory-management machine-learning mern-stack mongodb nodejs python react sales-prediction scikit-learn smart-contracts solidity supply-chain-management wagmi
Last synced: 09 Oct 2025
https://github.com/brenofariasdasilva/dagster-portuguese-grades
Dagster Portuguese Grades Model.
dagster matplotlib numpy pandas python scikit-learn seaborn
Last synced: 15 Apr 2026
https://github.com/chitralputhran/drive-curve-machine-learning-app
🚙 Drive Curve is a web application made with the help of Flask, a microframework for Python based on Werkzeug, Jinja 2, and good intentions. On the backend, a Machine Learning model is used for predicting the price of the car. The machine learning model was trained on the Automobile Dataset from the UCI Machine Learning Repository.
flask machine-learning python scikit-learn webapp
Last synced: 03 May 2026
https://github.com/ompreetham/dcn-network-traffic-anomaly-detection
Data Communication Networks - Network Traffic Anomaly Detection
anomaly anomaly-detection communication data dcn keras learning machine machine-learning network pandas presentation project python scikit-learn tensorflow traffic
Last synced: 08 Apr 2026
https://github.com/RickContreras/StudentPerformancePredictionSaberPro
Classification model for predicting student performance on the Saber Pro exams in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.
classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance
Last synced: 24 Oct 2025
https://github.com/andresmg07/real-time-sign-language-translator
AI-driven real-time American Sign Language translator. Implemented using Support Vector Machines (SVM), the OpenCV library, and the MediaPipe hands module.
ai computer-vision machine-learning mediapipe opencv pattern-recognition scikit-learn support-vector-machines
Last synced: 16 Apr 2026
https://github.com/realamirhe/leaf-node
A leaf node for your machine learning journey, from scratch to practical applications...
algorithm auto-encoder classification cybernetics feature-extraction feedback-mechanism lda learning machine-learning machine-learning-journey numpy pca practice regression scikit-learn sklearn smlfdl
Last synced: 09 May 2026
https://github.com/jasper-koops/easy-gscv
This library allows you to quickly train machine learning classifiers by automatically splitting the data set and using both grid search and cross validation in the training process.
classification machine-learning python3 scikit-learn
Last synced: 14 Feb 2026
https://github.com/siam29/ensemble-majority-voting-hard
In this project, we implemented an ensemble learning approach using majority voting (hard voting) with five machine learning classifiers: DT, RF, XGBC, ANN, and KNN. The ensemble model achieved an impressive accuracy score of 99.95% and an F1 score of 85.51%.
credit-card-fraud ensemble-learning machine-learning matplotlib pandas scikit-learn
Last synced: 09 May 2026
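Majority (hard) voting, as named in the entry above, takes each classifier's predicted label for a sample and keeps the most frequent one. The sketch below shows the idea in plain Python on made-up predictions from five hypothetical models; it is not the repository's code. It also illustrates why accuracy and F1 can differ widely, as the figures in the entry do, when the positive class is rare (the class balance of the original dataset is an assumption here, since the listing does not state it).

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, sample by sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) yields the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_positive(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp + fp + fn == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Made-up data: 10 samples, only the last two are positive.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model_predictions = [
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
]
ensemble = hard_vote(model_predictions)
acc = accuracy(y_true, ensemble)
f1 = f1_positive(y_true, ensemble)
print(ensemble, acc, round(f1, 4))
```

Using an odd number of voters on a binary problem avoids ties. In scikit-learn, the same combination rule is available through `VotingClassifier(voting="hard")`.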
https://github.com/garcane/Income-Prediction-ML
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 24 Oct 2025
https://github.com/zazi2002/machine-learning-project
Introduction to Machine Learning project with the goal of improving the classification performance on a dataset by optimizing the number of features and weak learners.
dimentionality-reduction ensemble-learning numpy pca random-forest scikit-learn
Last synced: 02 May 2026
https://github.com/prashver/titanic-survival-prediction
This project tackles the Titanic challenge on Kaggle, predicting passenger survival based on variables like age, sex, and passenger class. The Jupyter notebook covers essential steps of a data science pipeline, including exploratory data analysis, data cleaning, feature engineering, and modeling. The dataset used is the Titanic dataset.
classification-algorithm machine-learning-algorithms matplotlib numpy pandas scikit-learn seaborn
Last synced: 02 May 2026
https://github.com/pankajarm/tabular_ml_toolkit
A helper library to jumpstart your machine learning project based on tabular or structured data.
data-science feature-engineering hyperparameter-tuning machine-learning parallelism python scikit-learn structured-data tabular xgboost
Last synced: 19 Jan 2026
https://github.com/hedriss10/knn-machine-learning
Machine learning
machine-learning python scikit-learn
Last synced: 09 May 2026
https://github.com/rakibhhridoy/machinelearning-featureselection
Before training a model, the first priority is the data, not the model. The better the data is preprocessed and engineered, the more the model will learn. Feature selection is one of the methods for processing data before feeding it to the model. Various feature selection techniques are shown here.
extratreesclassifier feature-selection gridsearchcv lasso-regression logistic-regression machine-learning numpy pandas pca rfe rfecv scikit-learn selectkbest
Last synced: 02 May 2026
https://github.com/kengz/feature_transform
Build scikit-learn ColumnTransformers by specifying configs.
auto-ml automated-feature-preprocessor columntransformer data-preprocessing feature-engineerig machine-learning scikit-learn
Last synced: 15 Feb 2026
https://github.com/hermann-web/search-engine-with-python-nlp
A Python search engine built with NLP methods for a Django project
cosine-similarity document-searching natural-language-processing nlp nltk pandas python scikit-learn search-engine semantic-similarity similarity-score similarity-search
Last synced: 02 May 2026
https://github.com/khaymanii/titanic_survival_prediction_-model
This model was built using Python and the logistic regression algorithm
matplotlib numpy pandas python scikit-learn seaborn
Last synced: 02 May 2026
https://github.com/umar-saadat/car-price-prediction-ml
🚗 A Machine Learning project that predicts the price of used cars using Linear Regression. Built with Python, Scikit-learn, and Streamlit, this app takes inputs like car brand, year, mileage, engine size, and more to estimate the selling price in real-time
ai-project car-price-prediction data-science linear-regression machine-learning ml-project python scikit-learn streamlit
Last synced: 02 May 2026
https://github.com/gauravsingh9356/machine_learning
All my practical learning work involved in MACHINE LEARNING (Data Processing to Deep Learning)
deep-learning jupyter-notebook machine-learning machine-learning-algorithms nlp-machine-learning python scikit-learn
Last synced: 30 Apr 2026
https://github.com/ibrahimsharaf/kaggle-competitions
gensim kaggle kaggle-popcorn machine-learning nltk scikit-learn
Last synced: 10 May 2026
https://github.com/alam025/customer-churn-prediction
🎯 Predict customer churn with 96%+ accuracy using Random Forest ML. Beautiful visualizations, production-ready code, and real business impact. Save revenue before customers leave! 🚀
churn-prediction classification customer-analytics customer-churn customer-retention data-science machine-learning pandas predictive-analytics python random-forest scikit-learn
Last synced: 11 Jun 2026
https://github.com/petrosdemetrakopoulos/flight-passengers-prediction
A supervised learning problem given as a project in the "Data Mining in Databases and World Wide Web" course at the Computer Science Department of AUEB in the winter semester of 2019.
classification classifier data-science machine-learning python scikit-learn sklearn university-project
Last synced: 30 Apr 2026
https://github.com/t-abishek/embedded-intent-classifier
A production-grade FastAPI application that uses sentence embeddings to classify user prompts into 4 categories: Built using Python, BGE SentenceTransformer, Scikit-learn, and FastAPI.
classifier embedded huggingface pandas scikit-learn transformer
Last synced: 10 May 2026
https://github.com/bistcuite/plainml
Painless Machine Learning Library for Python based on scikit-learn
machine-learning ml plainml python scikit-learn
Last synced: 02 May 2026
https://github.com/zachpinto/xc-rankings-predictions
Applied ML Project predicting cross-country team rankings based on individual-level performances
Last synced: 29 Apr 2026
https://github.com/ayyucedemirbas/solar_power_elasticnet
ElasticNet Linear Regression on Solar Power Generation
elasticnet-regression scikit-learn skops tabular-regression
Last synced: 29 Apr 2026
https://github.com/dhavaltaunk08/gender-classification
I did this project during my internship at IIT Guwahati. It aimed to perform gender classification in video streaming.
deep-learning librosa opencv-python python scikit-learn
Last synced: 14 May 2026
https://github.com/sapsan14/water-quality-ee
Estonian water quality ML — binary classification of Terviseamet open data, Jupyter + scikit-learn.
classification estonia jupyter ml open-data scikit-learn
Last synced: 02 May 2026
https://github.com/aryansk/customer-segmentation-analysis
Advanced customer segmentation project using K-Means clustering to analyze customer behavior based on annual income, spending score, and age.
elbow-method exploratory-data-analysis machine-learning machine-learning-algorithms python scikit-learn sentiment-analysis sentiment-classification
Last synced: 29 Apr 2026
https://github.com/bestmahdi2/uni__dataminningstackoverflowproject
A university project for a data mining course, analyzing Stack Overflow website data with Python
cart csv data-mining logistic-regression matplotlib mlp naive-bayes nltk numpy pandas python scikit-learn scipy seaborn stackoverflow svc textblob tqdm xgboost
Last synced: 16 Feb 2026