An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/artikumari28/movie-recommender-system

This project is a content-based movie recommendation system, where movies are recommended based on their similarity in content. The system analyzes various features such as genres, cast, and descriptions to suggest similar movies.

google-colab machine-learning nltk numpy pandas pickle scikit-learn streamlit

Last synced: 06 Apr 2026

https://github.com/armanjscript/fusion-rag

A powerful web-based application designed to answer questions based on the content of uploaded PDF documents. This project leverages the **Fusion-in-Decoder (FiD)** approach for **Retrieval-Augmented Generation (RAG)**, combining semantic similarity, technical term relevance, and recency to deliver accurate and contextually relevant responses.

chroma chromadb fusion-rag langchain langchain-ollama ollama pypdf qwen2-5 rag rag-chatbot scikit-learn streamlit tf-idf-score tf-idf-vectorizer vector-database

Last synced: 10 Apr 2026

https://github.com/nordszamora/predictive_lung_cancer

This lung cancer prediction ML project is used to predict cancer at low cost, based on data about smoking intake and common symptoms.

bootstrap django django-rest-framework python reactjs rest-api scikit-learn vite

Last synced: 11 Apr 2026

https://github.com/andystmc/nextflownyc

Developed a machine learning model (Bidirectional LSTM) to forecast NYC traffic volumes using 10 years of automated traffic count data. Achieved strong predictive accuracy, demonstrating the power of deep learning for urban traffic analysis.

data-analysis data-cleaning data-science data-visualization exploratory-data-analysis feature-engineering hyperparameter-tuning jupyter-notebook lstm-neural-networks machine-learning numpy pandas predictive-modeling python3 scikit-learn tensorflow-keras traffic-flow-forecasting

Last synced: 07 Apr 2026

https://github.com/jai0212/cash-app-bias-busters

A platform developed with Cash App to help ML engineers detect and visualize biases in models using Fairlearn. Features include a collaborative and interactive dashboard (React, Chart.js), a Flask backend, and a secure MySQL database for data storage and analysis.

bias-detection chartjs fairlearn flask machine-learning mysql numpy pandas pytest python react scikit-learn scipy

Last synced: 16 Feb 2026

https://github.com/guoshijiang/scikit-learn

Learn scikit-learn together with us

nlp-machine-learning scikit-learn

Last synced: 14 Sep 2025

https://github.com/f-aguzzi/ChemFuseKit

Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.

chemometrics datafusion knn lda pca plsda scikit-learn svm

Last synced: 21 Sep 2025

https://github.com/tasninanika/coded_data_prediction-knn

K-Nearest Neighbors (KNN) is a supervised machine learning algorithm

knn pandas python3 scikit-learn

Last synced: 07 Apr 2026

https://github.com/evangks/k-means-clustering-synthetic-dataset

Customer Segmentation using K-Means Clustering: A complete machine learning workflow for segmenting customers based on synthetic demographic and spending data, with visualizations, evaluation metrics, and a reproducible Jupyter notebook.

clustering customer-segmentation data-science jupyter-notebook k-means-clustering machine-learning portfolio-project python27 scikit-learn unsupervised-learning

Last synced: 10 Mar 2026

https://github.com/saro0307/pre-doctor-ai-model

Pre-Doctor is an AI-driven health advisor using scikit-learn, offering quick medical advice based on user-input symptoms, making healthcare accessible and user-friendly. Utilizing Flask and pyttsx3, it seamlessly integrates machine learning for informed well-being.

artificial-intelligence css flask generative-ai generative-model html machine-learning python reinforcement-learning scikit-learn

Last synced: 07 Apr 2026

https://github.com/victorkiosh/fake-news-detection

Detecting fake news using NLP and machine learning (Logistic Regression, Random Forest, XGBoost)

data-science fake-news-detection machine-learning nlp scikit-learn xgboost

Last synced: 18 May 2026

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 05 Jan 2026

https://github.com/gaurangdave/house_price_predictions

Machine Learning Application to predict House Prices

hands-on learning-by-doing machine-learning numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/viveksapkal2793/advertisement-response-analysis

This project analyzes advertisement responses using a Django backend and a Vite+React frontend. It includes scripts to load, clean, and transform data, which are executed within Docker containers. Data is stored in a MongoDB database, and the project can be run with or without Docker by adjusting the MongoDB connection strings.

advertisement advertisement-analysis container-image containerization django docker machine-learning mongodb react scikit-learn vite

Last synced: 23 Sep 2025

https://github.com/yessasvini23/deepfake_immunization_toolkit

🛡️ AI-powered toolkit to detect deepfakes, educate users, and verify content authenticity using federated learning and blockchain. Built for election security, media integrity, and digital literacy.

blockchain matplotlib numpy opencv python pytorch scikit-learn

Last synced: 11 Apr 2026

https://github.com/catlikeflyer/rsp-recognition

A computer vision project to recognize thumbs up

machine-learning mediapipe-hands python scikit-learn

Last synced: 16 May 2026

https://github.com/rexsimiloluwah/fastapi-ml-apps

Machine learning apps built with FastAPI

docker fastapi machine-learning python scikit-learn tensorflow

Last synced: 05 Apr 2026

https://github.com/elazzouzihassan/si-fraud-detection-prototype

Fraud detection system with Python (prototype).

googlecolab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/aarryasutar/logistic_regression_on_age_prediction

This code evaluates the performance of a logistic regression model on age prediction using various features to predict a binary target variable, calculating metrics to determine the performance. It evaluates the comparison, identifies favorable features, and visualizes the ROC-AUC curve to determine the best model performance.

accuracy-score confusion-matrix f1-score feature-selection logistic-regression model-training numpy pandas precision recall rmse roc-auc-curve scikit-learn visualization

Last synced: 20 Jan 2026

https://github.com/tlapanco/knn-project

Project for the Intelligent Systems course, making use of KNN oversampling.

jupyter-notebook knn pandas python scikit-learn smote

Last synced: 09 Apr 2026

https://github.com/amirjahantab/iris_classification

This project analyzes the famous Iris dataset using various machine learning techniques. The goal is to classify the iris flowers into three species: Setosa, Versicolor, and Virginica based on the features provided in the dataset.

classification data-science machine-learning scikit-learn

Last synced: 16 May 2026

https://github.com/gokularaman-c/ev-charging-log-anomaly-detection

EV charging log anomaly detection using Isolation Forest, engineered telemetry features, and a CLI inference pipeline.

anomaly-detection ev-charging feature-engineering isolation-forest machine-learning mlops python scikit-learn time-series

Last synced: 23 May 2026

https://github.com/docsallover/spam-detection

Building a Spam Filter with Python: Using Machine Learning to Combat Spam

datascience flask jinja2 machine-learning numpy numpy-library pandas pandas-python python python3 scikit-learn

Last synced: 09 Apr 2026

https://github.com/ishutak/disease_prediction

An AI-powered disease prediction system that uses machine learning to predict diseases based on symptoms. The system employs an ensemble of models including Random Forest and Neural Networks to provide accurate predictions with confidence levels.

css3 htlm5 javascript jquery numpy pandas pytorch scikit-learn select2

Last synced: 11 Apr 2026

https://github.com/gokulgowthams/clickstream-customer-conversion

Analyzes clickstream data from an e-commerce platform to predict customer conversions, estimate potential revenue, and segment users for personalized marketing strategies. By leveraging machine learning techniques, the project enhances decision-making for businesses seeking to optimize user engagement and sales.

data-preprocessing feature-engineering machine-learning matplotlib model-deployment numpy pandas pipeline python scikit-learn seaborn streamlit-web-application tensorflow xgboost

Last synced: 07 Apr 2026

https://github.com/jersongb22/computervision

Links to my repositories with a wide variety of Computer Vision models using CNNs, Transfer Learning, and Vision Transformer with TensorFlow, PyTorch, Hugging Face and Ultralytics.

cnn computer-vision convnextv2 efficientnetv2 hugging-face image-captioning image-classification image-segmentation lenet-5 object-detection opencv plotly python pytorch scikit-learn tensorflow ultralytics video-classification vision-transformer yolo11

Last synced: 12 Apr 2026

https://github.com/pkini2002/hpe_cty

Repository to maintain the learnings of the technologies used for the CTY'23 Project Work provided by HPE

computer-networks docker docker-container linux python scikit-learn swarm-learning ubuntu

Last synced: 07 Apr 2026

https://github.com/upul/chocolate-quality-analysis

This repository contains a Jupyter notebook that describes how to use basic machine learning tools such as scikit-learn, pandas, and NumPy for building models.

machine-learning numpy pandas predictive-analytics scikit-learn

Last synced: 04 May 2026

https://github.com/gperdrizet/ensembleset

Ensemble dataset generator for tabular data prediction and modeling projects.

classification ensemble feature-engineering machine-learning regression scikit-learn

Last synced: 07 Mar 2026

https://github.com/shreeparab1890/movie-recommender-system

This notebook builds a model that recommends movies based on a given movie and genre. It uses popularity-based, content-based, and collaborative-filtering recommendation.

bag-of-words cosine-similarity matplotlib numpy pandas python scikit-learn sklearn vectorization

Last synced: 09 Apr 2026

https://github.com/vimal0156/ruaroa-ai

🧙‍♂️ Zero-Code Machine Learning Wizard - Transform ideas into intelligent solutions without writing code. AI-powered ML pipeline automation with interactive web interface.

ai-agents ai-assistant artificial-intelligence automated-machine-learning code-generation data-analysis data-science deep-learning jupyter machine-learning machine-learning-pipeline neural-networks no-code openai python scikit-learn streamlit visualization

Last synced: 09 Apr 2026

https://github.com/mariamabidi/pinn-based-flow-prediction

This repository contains code and experiments for predicting 3D aerodynamic flow around car geometries using Physics-Informed Neural Networks (PINNs) and for analyzing flow features via autoencoder-based clustering.

computer-vision machine-learning neural-network numpy pytorch pyvista scikit-learn

Last synced: 05 Aug 2025

https://github.com/vicperal/ai-genai_projects

Python projects about LLM and ML use cases. I am using modules such as pandas, NumPy, Plotly, scikit-learn, Transformers, Flask, JSON, etc. to analyze data, predict, generate insights and create text from models such as LLMs, linear regression, assembly methods, etc. Server and front end built with Flask.

assembly clinical-trials flask json linear-regression llm ml numpy pandas plotly price-prediction python rag random-forest scikit-learn sentimental-analysis sql text-summarization tokens-counter transformers

Last synced: 02 Apr 2026

https://github.com/mdalamin5/machine-learning-2.0

Machine-Learning-2.0: A comprehensive repository documenting my journey to master ML from scratch. It includes core algorithms, advanced techniques, data preprocessing, feature engineering, and real-world projects. Follow my structured approach, inspired by "100 Days of ML," featuring Python implementations, tools, and insightful resources.

data-fetching-from-api datapreprocessing end-to-end-project feature-engineering gradient-descent-optimizers machine-learning-algorithms scikit-learn webscraping-data

Last synced: 21 Apr 2026

https://github.com/selcia25/sleep-disorder-detection

💤 This project aims to develop an automated method for detecting sleep disorders from heart rate signals.

cnn-classification kmeans-clustering machine-learning matplotlib scikit-learn scipy sleep-disorders tensorflow

Last synced: 05 Jan 2026

https://github.com/kingabzpro/mlops-with-jenkins

From data ingestion to deploying the model using Jenkins.

classification fastapi jenkins mlops scikit-learn

Last synced: 13 Feb 2026

https://github.com/veb-101/machine-learning-practice

Contains worked code from the book Hands-On Machine Learning with Scikit-Learn and TensorFlow.

deep-learning keras machine-learning python3 scikit-learn tensorflow-gpu

Last synced: 19 Apr 2026

https://github.com/raju-2003/indiaai-cyberguard-ai-hackathon

An NLP-powered system to simplify cybercrime reporting by analyzing descriptions, categorizing incidents, and providing actionable insights.

matplotlib nltk numpy pandas python random-forest-classifier re scikit-learn seaborn shap spacy wordcloud

Last synced: 11 Apr 2026

https://github.com/aymen016/film-recommendation-engine

A machine learning-powered movie recommender system designed to provide personalized recommendations based on user preferences and data analysis. This project includes a backend recommendation engine, a Streamlit-based interface, and a web-based frontend for an enhanced user experience.

flask numpy pandas pickle python scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/vedanty3/heart-disease-prediction

This project aims to build a machine learning model using K-Nearest Neighbors, LogisticRegression, and RandomForestClassifier to classify whether a person has heart disease based on their medical attributes (accuracy achieved: 88.52%).

confusion-matrix correlation-matrices jupyter-notebook knn-classification logistic-regression machine-learning matplotlib numpy pandas python random-forest randomforestclassifier roccurve scikit-learn sklearn zerotomastery

Last synced: 09 Apr 2026

https://github.com/malleswarigelli/real_estate_house_price_prediction

Builds an end-to-end ML regression pipeline for predicting housing prices and deploys a Flask app to Heroku with Docker, using GitHub Actions for CI/CD.

ci-cd-pipeline docker heroku-deployment machine-learning mlops mongodb python scikit-learn

Last synced: 09 Apr 2026

https://github.com/tasninanika/mammographic-masses-analysis-dt

This project uses a Decision Tree Classifier to predict whether a detected mammographic mass is benign (0) or malignant (1) based on input features.

decision-tree-classifier numpy pandas pyhton3 scikit-learn

Last synced: 11 Apr 2026

https://github.com/dustinmichels/bayesian-values-guesser

Uses some user input, data from the World Values Survey <www.worldvaluessurvey.org>, and Bayes' rule to guess a number of beliefs the user might have. STATUS: In progress.

bayes-rule bayesian-values-guesser naive-bayes-classifier pandas python scikit-learn values-survey

Last synced: 09 Apr 2026

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, asthma, etc.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 04 Mar 2025

https://github.com/vidhi1290/text-classification-model-with-attention-mechanism-nlp

This Python project utilizes PyTorch to perform text classification with an attention mechanism. Pre-trained GloVe embeddings are processed for word representation, and a custom attention model is trained on consumer complaint data to categorize complaints into product categories.🎯

attention-mechanism deeplearning machine-learning nlp nltk numpy pandas python pytorch scikit-learn text-classification tqdm

Last synced: 06 Apr 2026

https://github.com/gaurav9364/credit-card-fraud-detection

Credit Card Fraud Detection using Machine Learning – A classification project that detects fraudulent credit card transactions using supervised learning, with data preprocessing, handling class imbalance, and model evaluation (ROC-AUC, Precision, Recall, F1-score).

googlecolab imbalanced-learn matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 08 Apr 2026

https://github.com/PFS-AI/PFS

The AI-powered desktop tool for finding, classifying, and understanding your files. Search by keyword, ask questions, and get insights from your scattered files instantly.

ai cross-platform data-science document-classification fastapi file-management file-organizer file-search huggingface-transformers knowledge-management langchain machine-learning productivity-tools rag scikit-learn search-engine semantic-search vector-search

Last synced: 30 Dec 2025

https://github.com/labrijisaad/chefclub-data-internship

Repository showcasing my Data Engineer / Scientist internship at Chefclub, contributing to data infrastructure enhancement and fostering data-driven insights.

airflow chefclub data-engineering data-science gcp scikit-learn

Last synced: 28 Apr 2025

https://github.com/aaa1928/iris-ml-classifier

PyTorch model that classifies Iris species based on characteristics about the length and width of sepals and petals.

deep-learning iris-classification iris-dataset machine-learning neural-network numpy pandas python pytorch scikit-learn

Last synced: 05 Apr 2026

https://github.com/tasninanika/k-means-clustering

An interactive and insightful customer segmentation project using K-Means Clustering.

matplotlib numpy pandas plotly python3 scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/uhstray-io/pystockbot

Platform & exchange agnostic Stock, Crypto, and Asset automated Machine Learning & AI Trading Bot

automation docker machine-learning python scikit-learn statistical-analysis trading-algorithms

Last synced: 13 Aug 2025

https://github.com/bestmahdi2/uni__pythonsupportvectormachinesbinaryclassification

A university project implementing binary classification with support vector machines in Python.

binary-classification classification matplotlib numpy python scikit-image scikit-learn seaborn support-vector-machine svm

Last synced: 07 Apr 2026

https://github.com/swimshahriar/heart-attack-prediction

Heart attack prediction from 13 features.

jupyter-notebook pandas python3 scikit-learn

Last synced: 18 Apr 2026

https://github.com/yvesemmanuel/machine_learning

Implements data problems solved with machine learning algorithms.

data-science keras keras-tensorflow linear-algebra machine-learning neural-network python scikit-learn

Last synced: 09 Apr 2026

https://github.com/haloapping/ml-workflow

Machine learning workflow template.

mahine-learning numpy pandas python3 scikit-learn

Last synced: 11 Apr 2026

https://github.com/rizz1406/spam-email-detector

Spam Email Classifier using Python and Streamlit. A simple machine learning project that classifies emails as **spam** or **ham** using the **Naive Bayes algorithm** and **TF-IDF** for text feature extraction. The project includes a user-friendly web app built with Streamlit.

nlp pandas pytho3 scikit-learn streamlit

Last synced: 09 Apr 2026

https://github.com/nekruzash/regression-correlation

From the CS2023 AI/DS/ML class: trains a model on different categories of data and uses linear regression to identify the feature with the greatest effect on housing prices.

jupyter-notebook python scikit-learn

Last synced: 04 May 2026

https://github.com/mhmudfzli/exploring-mental-health-data

This project demonstrates a comprehensive approach to solving a regression problem using various machine learning models. The notebook includes: Data Preprocessing, Exploratory Data Analysis (EDA), Model Training, Hyperparameter Tuning, Model Evaluation, Feature Importance

catboost lightgbm matplotlib numpy pandas scikit-learn seaborn xgboost

Last synced: 09 Apr 2026

https://github.com/shanmukhsrisaivedullapalli/automatic-ticket-classification

This project processes customer complaint data using pandas for data manipulation and applies text preprocessing techniques, including lemmatization, to clean and normalize complaint text. The `tqdm` library provides progress bars for efficient tracking of text processing tasks.

matplotlib neural-networks nlp numpy pandas python3 scikit-learn seaborn tensorflow tqdm wordcloud

Last synced: 11 Apr 2026

https://github.com/jswong65/machine_learning_nanodegree

Projects of Udacity Machine Learning nanodegree

machine-learning numpy pandas python scikit-learn scipy

Last synced: 09 Apr 2026

https://github.com/garcane/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 08 Apr 2026

https://github.com/luceldasilva/covid_19_italia

A government entity responsible for health management in Italy faces the challenge of understanding and analyzing the spread of COVID-19 in order to make informed and effective decisions in managing the pandemic. As data scientists, our task is to present insights that address the entity's concerns.

covid-19 deepnote google-colab jupyterlab pearson-correlation python random-forest scikit-learn

Last synced: 31 Jan 2026

https://github.com/bhimrazy/iris-species-prediction-using-decision-tree-algorithm-grip

Iris Species Intelligence: Classifying Iris Species with Confidence using Decision Trees | The Sparks Foundation: GRIP

decision-tree-classifier fastapi gripjan23 machine-learning python scikit-learn sparkfoundation

Last synced: 10 Apr 2026

https://github.com/grachale/predict_pass_exam

Creating an AdaBoost classifier with decision trees to predict whether a student will pass or fail an exam (classification) based on the number of study hours and their score in the previous exam.

adaboost cross-validation decision-tree jupyter-notebook matplotlib python scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/mohit1106/fraud-detection-in-financial-transactions

An anomaly detection system on 284,807 transactions, achieving an AUC of ~0.972 with CNNs and autoencoders.

autoencoders cnn-model isolation-forest keras python scikit-learn tensorflow

Last synced: 10 Apr 2026

https://github.com/chitralputhran/tutorial-sklearn-columntransformer

ColumnTransformer was introduced in scikit-learn 0.20. The notebook contains a quick and easy tutorial on ColumnTransformer to get you started.

scikit-learn

Last synced: 17 May 2026

https://github.com/offchan42/thai-thesis-classification

Classifies each document in the corpus using the Python machine learning module scikit-learn.

nlp python python2 scikit-learn segment thai thai-language thai-thesis-classification

Last synced: 13 Aug 2025