An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 05 Jan 2026

https://github.com/gaurangdave/house_price_predictions

Machine Learning Application to predict House Prices

hands-on learning-by-doing machine-learning numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/aymen016/film-recommendation-engine

A machine learning-powered movie recommender system designed to provide personalized recommendations based on user preferences and data analysis. This project includes a backend recommendation engine, a Streamlit-based interface, and a web-based frontend for an enhanced user experience.

flask numpy pandas pickle python scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/victorkiosh/fake-news-detection

Detecting fake news using NLP and machine learning (Logistic Regression, Random Forest, XGBoost)

data-science fake-news-detection machine-learning nlp scikit-learn xgboost

Last synced: 18 May 2026

https://github.com/vedanty3/heart-disease-prediction

This project aims to build a machine learning model using K-Nearest Neighbors, LogisticRegression, and RandomForestClassifier to classify whether a person has heart disease based on their medical attributes (accuracy achieved: 88.52%).

confusion-matrix correlation-matrices jupyter-notebook knn-classification logistic-regression machine-learning matplotlib numpy pandas python random-forest randomforestclassifier roccurve scikit-learn sklearn zerotomastery

Last synced: 09 Apr 2026

https://github.com/malleswarigelli/real_estate_house_price_prediction

Builds an end-to-end ML regression pipeline for predicting housing prices and deploys a Flask app to the Heroku cloud platform with Docker, using GitHub Actions for CI/CD.

ci-cd-pipeline docker heroku-deployment machine-learning mlops mongodb python scikit-learn

Last synced: 09 Apr 2026

https://github.com/elazzouzihassan/si-fraud-detection-prototype

Fraud detection system in Python (prototype).

googlecolab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/dustinmichels/bayesian-values-guesser

Uses some user input, data from the World Values Survey (www.worldvaluessurvey.org), and Bayes' rule to guess a number of beliefs the user might have. Status: in progress.

bayes-rule bayesian-values-guesser naive-bayes-classifier pandas python scikit-learn values-survey

Last synced: 09 Apr 2026

https://github.com/gaurav9364/credit-card-fraud-detection

Credit Card Fraud Detection using Machine Learning – A classification project that detects fraudulent credit card transactions using supervised learning, with data preprocessing, handling class imbalance, and model evaluation (ROC-AUC, Precision, Recall, F1-score).

googlecolab imbalanced-learn matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 08 Apr 2026

https://github.com/PFS-AI/PFS

The AI-powered desktop tool for finding, classifying, and understanding your files. Search by keyword, ask questions, and get insights from your scattered files instantly.

ai cross-platform data-science document-classification fastapi file-management file-organizer file-search huggingface-transformers knowledge-management langchain machine-learning productivity-tools rag scikit-learn search-engine semantic-search vector-search

Last synced: 30 Dec 2025

https://github.com/kingabzpro/mlops-with-jenkins

From data ingestion to deploying the model using Jenkins.

classification fastapi jenkins mlops scikit-learn

Last synced: 13 Feb 2026

https://github.com/vidhi1290/text-classification-model-with-attention-mechanism-nlp

This Python project utilizes PyTorch to perform text classification with an attention mechanism. Pre-trained GloVe embeddings are processed for word representation, and a custom attention model is trained on consumer complaint data to categorize complaints into product categories.🎯

attention-mechanism deeplearning machine-learning nlp nltk numpy pandas python pytorch scikit-learn text-classification tqdm

Last synced: 06 Apr 2026

https://github.com/aaa1928/iris-ml-classifier

PyTorch model that classifies Iris species based on characteristics about the length and width of sepals and petals.

deep-learning iris-classification iris-dataset machine-learning neural-network numpy pandas python pytorch scikit-learn

Last synced: 05 Apr 2026

https://github.com/ishutak/disease_prediction

An AI-powered disease prediction system that uses machine learning to predict diseases based on symptoms. The system employs an ensemble of models including Random Forest and Neural Networks to provide accurate predictions with confidence levels.

css3 htlm5 javascript jquery numpy pandas pytorch scikit-learn select2

Last synced: 11 Apr 2026

https://github.com/aarryasutar/logistic_regression_on_age_prediction

This code evaluates a logistic regression model for age prediction that uses various features to predict a binary target variable, and calculates metrics to measure its performance. It compares the results, identifies favorable features, and visualizes the ROC-AUC curve to determine the best model performance.

accuracy-score confusion-matrix f1-score feature-selection logistic-regression model-training numpy pandas precision recall rmse roc-auc-curve scikit-learn visualization

Last synced: 20 Jan 2026

https://github.com/amirjahantab/iris_classification

This project analyzes the famous Iris dataset using various machine learning techniques. The goal is to classify the iris flowers into three species: Setosa, Versicolor, and Virginica based on the features provided in the dataset.

classification data-science machine-learning scikit-learn

Last synced: 16 May 2026

https://github.com/pkini2002/hpe_cty

Repository to maintain the learnings of the technologies used for the CTY'23 Project Work provided by HPE

computer-networks docker docker-container linux python scikit-learn swarm-learning ubuntu

Last synced: 07 Apr 2026

https://github.com/mdalamin5/machine-learning-2.0

Machine-Learning-2.0: A comprehensive repository documenting my journey to master ML from scratch. It includes core algorithms, advanced techniques, data preprocessing, feature engineering, and real-world projects. Follow my structured approach, inspired by "100 Days of ML," featuring Python implementations, tools, and insightful resources.

data-fetching-from-api datapreprocessing end-to-end-project feature-engineering gradient-descent-optimizers machine-learning-algorithms scikit-learn webscraping-data

Last synced: 21 Apr 2026

https://github.com/uhstray-io/pystockbot

Platform & exchange agnostic Stock, Crypto, and Asset automated Machine Learning & AI Trading Bot

automation docker machine-learning python scikit-learn statistical-analysis trading-algorithms

Last synced: 13 Aug 2025

https://github.com/labrijisaad/chefclub-data-internship

Repository showcasing my Data Engineer / Scientist internship at Chefclub, contributing to data infrastructure enhancement and fostering data-driven insights.

airflow chefclub data-engineering data-science gcp scikit-learn

Last synced: 28 Apr 2025

https://github.com/offchan42/thai-thesis-classification

Classifies each document in the corpus using the Python machine learning module scikit-learn.

nlp python python2 scikit-learn segment thai thai-language thai-thesis-classification

Last synced: 13 Aug 2025

https://github.com/raju-2003/indiaai-cyberguard-ai-hackathon

An NLP-powered system to simplify cybercrime reporting by analyzing descriptions, categorizing incidents, and providing actionable insights.

matplotlib nltk numpy pandas python random-forest-classifier re scikit-learn seaborn shap spacy wordcloud

Last synced: 11 Apr 2026

https://github.com/bestmahdi2/uni__pythonsupportvectormachinesbinaryclassification

A university project implementing binary classification with support vector machines in Python.

binary-classification classification matplotlib numpy python scikit-image scikit-learn seaborn support-vector-machine svm

Last synced: 07 Apr 2026

https://github.com/tasninanika/callifornia-housing-price-prediction-svr

Support Vector Regression (SVR) is a type of Support Vector Machine used for predicting continuous values.

matplotlib numpy pandas python3 scikit-learn seaborn svm-regression

Last synced: 11 Apr 2026

https://github.com/pockerman/tech3python

Collection of Python-based algorithms for numerics, statistics, control, etc.

algorithms control estimation kalman-filter machine-learning numerical-methods particle-filter python3 scikit-learn statistics

Last synced: 18 May 2026

https://github.com/yvesemmanuel/machine_learning

Implements data problems solved with machine learning algorithms.

data-science keras keras-tensorflow linear-algebra machine-learning neural-network python scikit-learn

Last synced: 09 Apr 2026

https://github.com/selcia25/sleep-disorder-detection

💤This project aims to develop an automated method for detecting sleep disorders from heart rate signals.

cnn-classification kmeans-clustering machine-learning matplotlib scikit-learn scipy sleep-disorders tensorflow

Last synced: 05 Jan 2026

https://github.com/vicperal/ai-genai_projects

Python projects covering LLM and ML use cases. They use modules such as Pandas, NumPy, Plotly, scikit-learn, Transformers, Flask, and JSON to analyze data, make predictions, generate insights, and create text with models such as LLMs, linear regression, and assembly methods. The server and front end are built with Flask.

assembly clinical-trials flask json linear-regression llm ml numpy pandas plotly price-prediction python rag random-forest scikit-learn sentimental-analysis sql text-summarization tokens-counter transformers

Last synced: 02 Apr 2026

https://github.com/rizz1406/spam-email-detector

Spam Email Classifier using Python and Streamlit. A simple machine learning project that classifies emails as spam or ham using the Naive Bayes algorithm and TF-IDF for text feature extraction. The project includes a user-friendly web app built with Streamlit.

nlp pandas pytho3 scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/nekruzash/regression-correlation

From the CS2023 AI/DS/ML class: trains a model on different categories of data and uses linear regression to identify the feature with the greatest effect on housing prices.

jupyter-notebook python scikit-learn

Last synced: 04 May 2026

https://github.com/mhmudfzli/exploring-mental-health-data

This project demonstrates a comprehensive approach to solving a regression problem using various machine learning models. The notebook includes: Data Preprocessing, Exploratory Data Analysis (EDA), Model Training, Hyperparameter Tuning, Model Evaluation, Feature Importance

catboost lightgbm matplotlib numpy pandas scikit-learn seaborn xgboost

Last synced: 09 Apr 2026

https://github.com/sizzlins/kalkulator-ai

A Simple Command Line Input Symbolic Regression Engine and Computer Algebra System (CAS) capable of discovering the laws of the universe and solving calculus, algebra, and trigonometry problems.

calculator calculus cli computer-algebra-system curve-fitting machine-learning mathematics numpy physics python scientific-computing scikit-learn sparse-regression symbolic-regression sympy

Last synced: 13 Jan 2026

https://github.com/jswong65/machine_learning_nanodegree

Projects of Udacity Machine Learning nanodegree

machine-learning numpy pandas python scikit-learn scipy

Last synced: 09 Apr 2026

https://github.com/luceldasilva/covid_19_italia

A government entity responsible for health management in Italy faces the challenge of understanding and analyzing the spread of COVID-19 in order to make informed and effective decisions in managing the pandemic. As data scientists, our task is to present insights that address the entity's concerns.

covid-19 deepnote google-colab jupyterlab pearson-correlation python random-forest scikit-learn

Last synced: 31 Jan 2026

https://github.com/garcane/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 08 Apr 2026

https://github.com/tasninanika/mammographic-masses-analysis-dt

This project uses a Decision Tree Classifier to predict whether a detected mammographic mass is benign (0) or malignant (1) based on input features.

decision-tree-classifier numpy pandas pyhton3 scikit-learn

Last synced: 11 Apr 2026

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, asthma, etc.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 04 Mar 2025

https://github.com/tasninanika/k-means-clustering

An interactive and insightful customer segmentation project using K-Means Clustering.

matplotlib numpy pandas plotly python3 scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/grachale/predict_pass_exam

Creates an AdaBoost classifier with decision trees to predict whether a student will pass or fail an exam (classification) based on the number of study hours and their score in the previous exam.

adaboost cross-validation decision-tree jupyter-notebook matplotlib python scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/filiplangiewicz/automltunability

📈 Analyzing the impact of hyperparameter optimization

automl machine-learning scikit-learn

Last synced: 18 Feb 2026

https://github.com/lasithaamarasinghe/stock-market-price-prediction

This ML model predicts the price of the S&P500 Stock Market Index using RandomForestClassifier

jupyter-notebook machine-learning pandas python random-forest-classifier scikit-learn sp500 stock-market-price-prediction yfinance

Last synced: 10 Apr 2026

https://github.com/somenath203/titanic-survival-project-backend

Click the link below to check the swagger documentation of the website live

fastapi pandas python render scikit-learn seaborn titanic-survival-predictor

Last synced: 05 Apr 2026

https://github.com/jersongb22/datascience_ibm_stockpredictionlstm_project

In the IBM Advanced Data Science specialization, an interactive real-time web application was developed using LSTM networks in TensorFlow to predict stock market trends for global companies.

apache-spark data-science deep-learning lstm-neural-networks machine machine-learning plotly python scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/abz4375/recommendersystem

A sophisticated recommender system that leverages web mining techniques to help users find hotels that match their preferences.

cosine-similarity css html javascript pandas python scikit-learn selenium selenium-webdriver

Last synced: 13 Apr 2026

https://github.com/benman1/python-time-series

Time-Series analysis, statistical and machine learning models for forecasting, regression, and classification

darts deep-learning forecasting mlforecast nixtla scikit-learn statsforecast time-series time-series-analysis

Last synced: 22 Feb 2026

https://github.com/virajbhutada/article-recommendation-system

This project aims to redefine content discovery by delivering personalized article recommendations tailored to individual user preferences. We use advanced machine learning techniques like PCA and K-means clustering to analyze user behavior and article characteristics to provide highly accurate recommendations.

anaconda article-recommendation clustering-algorithm data-analysis data-science keras-tensorflow machine-learning machine-learning-algorithms ml-models numpy pandas plotly python scikit-learn scipy

Last synced: 06 Jan 2026

https://github.com/swimshahriar/heart-attack-prediction

Heart attack prediction from 13 features.

jupyter-notebook pandas python3 scikit-learn

Last synced: 18 Apr 2026

https://github.com/asosnovsky/analyzing-blood-vessel-aneurysm

A few simple scripts to identify aneurysm in a blood-vessel (research projects)

machine-learning meanshift medical-image-processing scikit-learn

Last synced: 20 May 2026

https://github.com/chitralputhran/tutorial-sklearn-columntransformer

ColumnTransformer has been available in scikit-learn since version 0.20. The notebook file contains a quick and easy tutorial on ColumnTransformer to get you started.

scikit-learn

Last synced: 17 May 2026

https://github.com/williyam-m/movie-recommendation-system

Developed a web app with a cosine similarity machine learning model for personalized recommendations based on user history, likes, bookmarks, and activity. Implemented user auth and CRUD operations for movies.

django machine-learning numpy pandas prediction-model python scikit-learn

Last synced: 10 Apr 2026

https://github.com/mohit1106/fraud-detection-in-financial-transactions

An anomaly detection system on 284,807 transactions, achieving an AUC of ~0.972 with CNNs and autoencoders.

autoencoders cnn-model isolation-forest keras python scikit-learn tensorflow

Last synced: 10 Apr 2026

https://github.com/khaymanii/parkinsons-disease-detection-model

This model was built with Python and the Support Vector Machine algorithm.

matplotlib numpy pandas python scikit-learn

Last synced: 19 Apr 2026

https://github.com/bahar15984/obesity-classification

Machine Learning Pipeline for Obesity Classification using Azure ML & Python

azure azure-ml classification data-science healthcare machine-learning mlops obesity pandas pipeline python scikit-learn

Last synced: 03 Nov 2025

https://github.com/bhimrazy/iris-species-prediction-using-decision-tree-algorithm-grip

Iris Species Intelligence: Classifying Iris Species with Confidence using Decision Trees | The Sparks Foundation: GRIP

decision-tree-classifier fastapi gripjan23 machine-learning python scikit-learn sparkfoundation

Last synced: 10 Apr 2026

https://github.com/theanujsinha01/rainfall-prediction-using-machine-learning

This project predicts whether it will rain or not based on weather features like pressure, humidity, dew point, cloud cover, sunshine, wind direction, and wind speed. We use a Random Forest Classifier, a popular ML algorithm, trained on historical weather data. The model learns patterns and helps us forecast rain chances.

classification data-analysis eda machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn supervised-learning

Last synced: 11 Apr 2026

https://github.com/squadron-leader/ecopredict-ai

EcoPredict AI is a powerful, AI-driven solution for predicting Greenhouse Gas (GHG) emissions based on user-input industry data. Designed for environmental sustainability initiatives, EcoPredict AI utilizes machine learning models to deliver accurate carbon emission predictions and is deployed via Streamlit for real-time access.

epa-data linear-regression python regression-model scikit-learn streamlit

Last synced: 12 Apr 2026

https://github.com/urme-b/multimodal-multisensor

Longitudinal neurophysiological study of adult psychometric testing.

keras matplotlib numpy pandas python pytorch scikit-learn seaborn tensorflow

Last synced: 13 Apr 2026

https://github.com/siam29/exploring-explainable-ai-demystifying-dt-rf-knn-xgbc

Implemented XAI techniques to enhance transparency in fraud detection models. I employed techniques such as SHAP, LIME on DT, RF, XGBC, and KNN to offer lucid explanations for transactions that were flagged.

machine-learning matplotlib pandas scikit-learn xai

Last synced: 15 Apr 2026

https://github.com/kostasereksonas/ids_test

Code for an intrusion detection system based on the "Intrusion Detection System Using Machine Learning Algorithms" tutorial on GeeksforGeeks and the Intrusion Detection on NSL KDD GitHub repository.

ids intrusion-detection intrusion-detection-system nsl-kdd-dataset numpy pandas python scikit-learn tensorflow

Last synced: 08 Apr 2026

https://github.com/davidyen1124/cowculator

COWCULATOR: AI-driven catering cost forecasting in Python. Trains order-level and daily time series models, exports an edge-ready JSON bundle, and includes a demo web UI.

cli data-science edge-ai forecasting github-actions machine-learning mypy pandas python ruff scikit-learn time-series uv

Last synced: 05 May 2026

https://github.com/paulj1989/bulgarian-constitutional-court-decisions

Developing NLP models for text and sentence classification using legal texts from the Bulgarian constitutional court.

keras neural-network nlp scikit-learn tensorflow tesseract

Last synced: 04 May 2026

https://github.com/akashshnkr/multi-disease-prediction

Developed and integrated three machine learning models for predicting diabetes, Parkinson's, and heart disease into a Streamlit-based web application. The interface allows users to input data and receive accurate health predictions, enhancing early detection and healthcare outcomes.

logistic-regression machine-learning-algorithms numpy pandas python scikit-learn streamlit-webapp svm

Last synced: 02 Jan 2026

https://github.com/bacross/datamunger

Python package for handling NaNs and outliers

data data-frame datamunger knn nan outliers python scikit-learn

Last synced: 17 May 2026

https://github.com/gregoritsch3/ml_eda_classification_loanapprovalprediction

An EDA and Machine Learning Classification exercise on the Loan Approval dataset demonstrating EDA, feature engineering, StratifiedKFold, and the use of TensorFlow NN, SVC, LinearSVC, XGBoost, Naive Bayes, Bagging, Random Forest, and Decision Tree algorithms, etc. The models are optimized using hyperparameter tuning through GridSearchCV.

eda feature-engineering machine-learning matplotlib numpy pandas scikit-learn scipy seaborn tensorflow

Last synced: 13 Apr 2026

https://github.com/vidhi1290/hr_employee_prediction

Welcome to the HR Employee Promotion Prediction project! This repository contains the code and resources for a machine learning project that focuses on predicting employee promotions. By analyzing various employee attributes, this project aims to provide valuable insights for HR decision-making and talent recognition within organizations.

data-exploration data-science data-visualization docker hr-employee-prediction hyperparameter-tuning machine-learning matplot model-building numpy pandas scikit-learn seaborn streamlit streamlit-webapp

Last synced: 13 Apr 2026