An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/tharindanimnajith/deep-learning-spam-detection

Deep Learning classifiers to detect spam SMS messages - LSTM Model, DenseNet CNN Models - NLP, Python, Jupyter Notebook, Tensorflow, Keras, Numpy, Pandas, Matplotlib, Scikit-Learn

deep-learning densenet keras lstm nlp python3 scikit-learn tensorflow

Last synced: 05 Apr 2026

https://github.com/gmontamat/quora-question-pairs

Code for the Kaggle competition "Quora Question Pairs"

kaggle-competition quora-question-pairs scikit-learn spell-checker xgboost

Last synced: 02 Sep 2025

https://github.com/nikhiljsk/generic_regression_algo

A Python template to evaluate different regression models on a dataset. Includes metrics to cross-compare models on the data. Assumes the data to be numeric.

dataset generic machine-learning prediction python3 regression scikit-learn seaborn supervised-learning

Last synced: 01 May 2026

https://github.com/headless-start/cs2-endtoend-chatbot

This repository contains a simple end-to-end Counter-Strike 2 chatbot.

chatbot counter-strike-2 css flask html5 nltk python3 scikit-learn streamlit

Last synced: 11 Apr 2026

https://github.com/selcia25/sleep-disorder-detection

💤 This project aims to develop an automated method for detecting sleep disorders from heart rate signals.

cnn-classification kmeans-clustering machine-learning matplotlib scikit-learn scipy sleep-disorders tensorflow

Last synced: 05 Jan 2026

https://github.com/stitchsages/implyo

An advanced imputation library compatible with mixed type data with a focus on performance and high accuracy, with advanced imputation algorithms for numeric and categorical variables.

imputation imputation-algorithm imputation-methods knn machine-learning pandas pandas-dataframe pip python python3 random-forest scikit-learn

Last synced: 23 Jun 2026

https://github.com/squadron-leader/ecopredict-ai

EcoPredict AI is a powerful, AI-driven solution for predicting Greenhouse Gas (GHG) emissions based on user-input industry data. Designed for environmental sustainability initiatives, EcoPredict AI utilizes machine learning models to deliver accurate carbon emission predictions and is deployed via Streamlit for real-time access.

epa-data linear-regression python regression-model scikit-learn streamlit

Last synced: 12 Apr 2026

https://github.com/gdapriana/clickbait-detector-backend

This repository contains the backend logic for the “Clickbait Detector” app. Built using Python, it employs an Artificial Neural Network (ANN) to predict the likelihood of a news headline being clickbait. It provides REST API endpoints to interact with the model.

flask python scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/vedanty3/heart-disease-prediction

This project aims to build a machine learning model using K-Nearest Neighbors, LogisticRegression, and RandomForestClassifier to classify whether a person has heart disease based on their medical attributes (accuracy achieved: 88.52%).

confusion-matrix correlation-matrices jupyter-notebook knn-classification logistic-regression machine-learning matplotlib numpy pandas python random-forest randomforestclassifier roccurve scikit-learn sklearn zerotomastery

Last synced: 09 Apr 2026

https://github.com/aryansk/fake-news-detection

A sophisticated machine learning solution to detect fake news using multiple classification algorithms. Identify the credibility of news articles with advanced text analysis techniques!

fake-news-detection machine-learning machine-learning-algorithms matplotlib numpy pandas python random-forest-classifier scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/malleswarigelli/real_estate_house_price_prediction

Build an end-to-end ML regression pipeline for predicting housing prices and deploy a Flask app to the Heroku cloud platform with Docker, using GitHub Actions for CI/CD.

ci-cd-pipeline docker heroku-deployment machine-learning mlops mongodb python scikit-learn

Last synced: 09 Apr 2026

https://github.com/takkii/pylean

Data analysis ( 🐍 💎 📈 )

analayze matplotlib numpy pandas python scikit-learn

Last synced: 09 Sep 2025

https://github.com/nirmalyabag20/diabetes-prediction-using-machine-learning

This project focuses on predicting diabetes using machine learning algorithms based on health metrics like glucose levels, blood pressure, and BMI. By comparing different models, the goal is to identify the most accurate approach for early diabetes detection, showcasing the potential of machine learning in healthcare.

decision-tree-classifier jupyter-notebook kneighborsclassifier logistic-regression matplotlib numpy pandas python random-forest-classifier scikit-learn seaborn svc

Last synced: 18 Jan 2026

https://github.com/PFS-AI/PFS

The AI-powered desktop tool for finding, classifying, and understanding your files. Search by keyword, ask questions, and get insights from your scattered files instantly.

ai cross-platform data-science document-classification fastapi file-management file-organizer file-search huggingface-transformers knowledge-management langchain machine-learning productivity-tools rag scikit-learn search-engine semantic-search vector-search

Last synced: 30 Dec 2025

https://github.com/serhatderya/house-prices---advanced-regression-techniques

This machine learning model was developed for the "House Prices - Advanced Regression Techniques" competition on Kaggle using several machine learning models such as Random Forest, XGBoost, and LightGBM.

ai artificial-intelligence data-science ju jupyter-notebook lightgbm lightgbm-regressor machine-learning machinelearning prediction python random-forest random-forest-regression regression scikit-learn xgboost xgboost-regression

Last synced: 28 Apr 2026

https://github.com/andrewquijano/operating_systems_ii

Creating an Intrusion Detection System

ids kdd99 nsl-kdd-dataset scikit-learn

Last synced: 17 Jan 2026

https://github.com/somjit101/human-activity-recognition

This project builds a model that predicts human activities such as Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, or Laying, using readings from the sensors on a smartphone carried by the user.

decision-tree-classifier eda feature-engineering gradient-boosting-classifier grid-search human-activity-recognition keras logistic-regression lstm random-forest-classifier rbf-kernel scikit-learn seaborn-plots signal-processing support-vector-classifier support-vector-machine t-sne tensorflow uci-har-dataset uci-machine-learning

Last synced: 23 Feb 2026

https://github.com/asosnovsky/analyzing-blood-vessel-aneurysm

A few simple scripts to identify an aneurysm in a blood vessel (research projects)

machine-learning meanshift medical-image-processing scikit-learn

Last synced: 20 May 2026

https://github.com/gaurav9364/credit-card-fraud-detection

Credit Card Fraud Detection using Machine Learning – A classification project that detects fraudulent credit card transactions using supervised learning, with data preprocessing, handling class imbalance, and model evaluation (ROC-AUC, Precision, Recall, F1-score).

googlecolab imbalanced-learn matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 08 Apr 2026

https://github.com/grachale/predict_pass_exam

Creating an AdaBoost classifier with decision trees to predict whether a student will pass or fail an exam (classification) based on the number of study hours and their score in the previous exam.

adaboost cross-validation decision-tree jupyter-notebook matplotlib python scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/hokagem/damagedlogginganalyzer

A project analyzing statistics on damaged logging (wood) in Germany using Python.

analysis csv csv-parser k-fold-cross-validation numpy pandas pandas-dataframe pandas-python polynomial-regression scikit-learn statistics wood

Last synced: 03 May 2026

https://github.com/Zen204/airbnb-availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 02 Apr 2025

https://github.com/mayankmittal29/stockvision

Stock price predictor built on an LSTM Sequential model with dropout regularization. It can analyse any stock ticker, perform fundamental analysis using fundamental ratios and chart visualisations of the 100MA and 200MA, and predict the stock price and its trend for the next 10 days. It also shows candlestick charts for stock trading and the latest news.

keras lstm-neural-networks matplotlib-pyplot mplfinance numpy pandas python scikit-learn streamlit yfinance-api

Last synced: 07 Apr 2026

https://github.com/somenath203/titanic-survival-project-backend

Click the link below to view the live Swagger documentation of the website

fastapi pandas python render scikit-learn seaborn titanic-survival-predictor

Last synced: 05 Apr 2026

https://github.com/chrislemke/scikit-tabtrans

TabTransformer ready for scikit-learn 🧑‍🔬

deep-learning machine-learning python scikit-learn transformer

Last synced: 19 Apr 2025

https://github.com/supriya811106/healthcare-recommedation-system

A Flask-based web app that predicts diseases based on symptoms and recommends specialized doctors. It uses machine learning for accurate health predictions and location-based doctor searches.

css flask-application healthcare-application html javascript machine-learning numpy pandas recommendation-system scikit-learn

Last synced: 04 Mar 2026

https://github.com/jersongb22/datascience_ibm_stockpredictionlstm_project

In the IBM Advanced Data Science specialization, an interactive real-time web application was developed using LSTM networks in TensorFlow to predict stock market trends for global companies.

apache-spark data-science deep-learning lstm-neural-networks machine machine-learning plotly python scikit-learn streamlit tensorflow

Last synced: 13 Apr 2026

https://github.com/matsunagalab/tutorial_analyzingmddata

Google colab notebooks for typical MD trajectory analysis routines with Python

mdtraj molecular-dynamics scikit-learn tutorial

Last synced: 20 Apr 2026

https://github.com/nemeslaszlo/sale-price-of-bulldozers

The goal is to predict the sale price of bulldozers: how well can we predict the future sale price of a bulldozer, given its characteristics and previous examples of how much similar bulldozers have sold for? (Archived Kaggle competition)

matplotlib numpy pandas random-forest-regressor regression scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/ishutak/disease_prediction

An AI-powered disease prediction system that uses machine learning to predict diseases based on symptoms. The system employs an ensemble of models including Random Forest and Neural Networks to provide accurate predictions with confidence levels.

css3 htlm5 javascript jquery numpy pandas pytorch scikit-learn select2

Last synced: 11 Apr 2026

https://github.com/g-eoj/cv-tl-keras

Use the cross validation functions from scikit-learn to evaluate image classification transfer learning with Keras models.

cross-validation keras numpy scikit-learn transfer-learning

Last synced: 10 Apr 2026

https://github.com/gokulgowthams/smart-premium

An interactive premium amount predictor, built as a Streamlit application, that accurately predicts the required premium amount for a default loan from the user's answers to a series of questions.

data-preprocessing feature-engineering git github mlflow model-deployment numpy pandas python scikit-learn streamlit xgboost

Last synced: 11 Apr 2026

https://github.com/aditya172926/text_summarization

Project to generate summaries and perform Named Entity Recognition from multiple types of text bodies.

glove machine-learning nlp python scikit-learn spacy

Last synced: 05 May 2026

https://github.com/elazzouzihassan/si-fraud-detection-prototype

Fraud detection system with Python (prototype).

googlecolab matplotlib numpy pandas python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/dustinmichels/bayesian-values-guesser

Uses some user input, data from the World Values Survey <www.worldvaluessurvey.org>, and Bayes' rule to guess a number of beliefs the user might have. STATUS: In progress.

bayes-rule bayesian-values-guesser naive-bayes-classifier pandas python scikit-learn values-survey

Last synced: 09 Apr 2026

https://github.com/abdullah321umar/internee.pk-dataanalytics_internship-assignment4

🌟 Fraud Detection in Application 🌟 Through Isolation Forest and K-Means Clustering, the project detects suspicious patterns like inconsistent income, duplicate entries, and unrealistic employment data. This end-to-end workflow transforms raw data into actionable fraud insights — enhancing trust and accuracy.

anomaly-detection csv-handling data-cleaning data-exporting data-import data-normalization exploratory-data-analysis export interpretation matplotlib model-evaluation pandas pca python reporting scaling scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/vidhi1290/text-classification-model-with-attention-mechanism-nlp

This Python project utilizes PyTorch to perform text classification with an attention mechanism. Pre-trained GloVe embeddings are processed for word representation, and a custom attention model is trained on consumer complaint data to categorize complaints into product categories.🎯

attention-mechanism deeplearning machine-learning nlp nltk numpy pandas python pytorch scikit-learn text-classification tqdm

Last synced: 06 Apr 2026

https://github.com/mohit1106/fraud-detection-in-financial-transactions

An anomaly detection system trained on 284,807 transactions, achieving an AUC of ~0.972 with CNNs and autoencoders.

autoencoders cnn-model isolation-forest keras python scikit-learn tensorflow

Last synced: 10 Apr 2026

https://github.com/ayushshahh/fespn

A neural network made to predict final exam scores of students

mlp mlp-regressor multilayer-perceptron neural-network prediction-model scikit-learn

Last synced: 02 May 2026

https://github.com/veb-101/machine-learning-practice

Contains worked code from the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"

deep-learning keras machine-learning python3 scikit-learn tensorflow-gpu

Last synced: 19 Apr 2026

https://github.com/webcog-pk/recommandation-engine-in-drf-sk-learn

Full Stack Movie Recommendation System Project made in Django REST Framework and React JS

api django django-rest-framework movies reactjs recommender-system scikit-learn

Last synced: 22 Mar 2025

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 17 May 2026

https://github.com/guoshijiang/scikit-learn

Learn scikit-learn together with us

nlp-machine-learning scikit-learn

Last synced: 14 Sep 2025

https://github.com/mnitin-reddy/reducing-review-overhead-with-ml-based-application-screening

A machine learning classification project to filter out low-probability visa applications using historical data. It features an end-to-end implementation with CI/CD on AWS, achieving 93% accuracy with a KNN model optimized through Optuna, alongside integration of MLOps tools like Evidently and MLflow.

aws docker githubactions hypothesistesting machinelearning matplotlib mlflow mlops mongodb numpy optuna pandas python scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/finite-sample/stagecoachml

Build two-stage models when your features arrive in two batches at different times.

machine-learning scikit-learn two-stage-models

Last synced: 14 Jan 2026

https://github.com/skekre98/picture-compressor

A tool for compressing images using unsupervised machine learning

kmeans-clustering scikit-learn

Last synced: 17 May 2026

https://github.com/abz4375/recommendersystem

A sophisticated recommender system that leverages web mining techniques to help users find hotels that match their preferences.

cosine-similarity css html javascript pandas python scikit-learn selenium selenium-webdriver

Last synced: 13 Apr 2026

https://github.com/benman1/python-time-series

Time-Series analysis, statistical and machine learning models for forecasting, regression, and classification

darts deep-learning forecasting mlforecast nixtla scikit-learn statsforecast time-series time-series-analysis

Last synced: 22 Feb 2026

https://github.com/f-aguzzi/ChemFuseKit

Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.

chemometrics datafusion knn lda pca plsda scikit-learn svm

Last synced: 21 Sep 2025

https://github.com/gaurangdave/house_price_predictions

Machine Learning Application to predict House Prices

hands-on learning-by-doing machine-learning numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 05 Jan 2026

https://github.com/varun-khorgade/churnshield-customer-retention-predictor

Built an ML-based classification model to predict customer churn. Applied data preprocessing, feature engineering, and ensemble algorithms to improve prediction accuracy and help businesses implement retention strategies.

classification-algorithm datapreprocessing f1-score feature-engineering hyperparameter-tuning logistic-regression matplotlib model-evaluation numpy pandas python ran roc-auc scikit-learn seaborn xgboost

Last synced: 07 May 2026

https://github.com/evangks/k-means-clustering-synthetic-dataset

Customer Segmentation using K-Means Clustering: A complete machine learning workflow for segmenting customers based on synthetic demographic and spending data, with visualizations, evaluation metrics, and a reproducible Jupyter notebook.

clustering customer-segmentation data-science jupyter-notebook k-means-clustering machine-learning portfolio-project python27 scikit-learn unsupervised-learning

Last synced: 10 Mar 2026

https://github.com/labrijisaad/chefclub-data-internship

Repository showcasing my Data Engineer / Scientist internship at Chefclub, contributing to data infrastructure enhancement and fostering data-driven insights.

airflow chefclub data-engineering data-science gcp scikit-learn

Last synced: 28 Apr 2025

https://github.com/upul/chocolate-quality-analysis

This repository contains a Jupyter notebook that describes how to use basic machine learning tools such as scikit-learn, pandas, and NumPy for building models.

machine-learning numpy pandas predictive-analytics scikit-learn

Last synced: 04 May 2026

https://github.com/yancotta/anti-aging-epigenetics-ml-app

A thesis MVP for a personalized anti-aging system that analyzes genetic SNPs and lifestyle habits using ML models (Random Forest and Neural Networks) to provide risk assessments and actionable recommendations. Built with FastAPI, React, PostgreSQL, and containerized via Docker for scalability and explainability.

anti-aging bioinformatics docker explainable-ai fastapi genetics healthtech machine-learning mlops personalized-medicine pytorch reactjs scikit-learn synthetic-data thesis-project

Last synced: 16 Sep 2025

https://github.com/jersongb22/computervision

Links to my repositories with a wide variety of Computer Vision models using CNNs, Transfer Learning, and Vision Transformer with TensorFlow, PyTorch, Hugging Face and Ultralytics.

cnn computer-vision convnextv2 efficientnetv2 hugging-face image-captioning image-classification image-segmentation lenet-5 object-detection opencv plotly python pytorch scikit-learn tensorflow ultralytics video-classification vision-transformer yolo11

Last synced: 12 Apr 2026

https://github.com/khaymanii/parkinsons-disease-detection-model

This model was built with Python and the Support Vector Machine algorithm

matplotlib numpy pandas python scikit-learn

Last synced: 19 Apr 2026

https://github.com/shreeparab1890/movie-recommender-system

This notebook builds a model that recommends movies based on a given movie and genre. It uses popularity-based, content-based, and collaborative-filtering-based recommendation.

bag-of-words cosine-similarity matplotlib numpy pandas python scikit-learn sklearn vectorization

Last synced: 09 Apr 2026

https://github.com/vimal0156/ruaroa-ai

🧙‍♂️ Zero-Code Machine Learning Wizard - Transform ideas into intelligent solutions without writing code. AI-powered ML pipeline automation with interactive web interface.

ai-agents ai-assistant artificial-intelligence automated-machine-learning code-generation data-analysis data-science deep-learning jupyter machine-learning machine-learning-pipeline neural-networks no-code openai python scikit-learn streamlit visualization

Last synced: 09 Apr 2026

https://github.com/edisedis777/pyspark-ml-features

A PySpark implementation of 6 lesser-known Scikit-Learn features optimized for Azure Databricks. This project translates powerful machine learning techniques from Scikit-Learn into PySpark's distributed computing framework.

azure databricks databricks-notebooks large-scale machine-learning pyspark python scikit-learn scikitlearn-machine-learning

Last synced: 13 Apr 2026

https://github.com/yuvraj0412s/proactive-fraud-detection-using-machine-learning

An end-to-end machine learning project for detecting financial fraud using LightGBM, featuring in-depth EDA, advanced feature engineering, and a focus on actionable business insights.

class-imbalance classification-model data-analysis data-science data-visualization exploratory-data-analysis feature-engineering fintech fraud-detection jupyter-notebook lightgbm machine-learning pandas python scikit-learn smote

Last synced: 02 May 2026

https://github.com/alisonmitchell/boston-housing

Investigation of the Boston housing dataset to evaluate, train and test a regression model to predict house prices.

data-science machine-learning matplotlib numpy pandas python scikit-learn scipy seaborn

Last synced: 10 Apr 2026

https://github.com/xharshit/careerconnect-smart-campus-placement-portal

CareerConnect is an AI-powered campus placement portal that helps students prepare for jobs through smart aptitude and coding tests, mock interviews, resume analysis, and more — all monitored with face recognition-based proctoring. Designed to assist students, TPOs, and companies for seamless hiring and tracking.

aptitude artificial-intelligence css face-recognition html machine-learning mockinterview nodejs opencv python resume-builder resumescanner scikit-learn streamlit technical-test tensorflow

Last synced: 13 Apr 2026

https://github.com/srujayreddy/selling-laptops

Predicting whether users will click on a promotional email for laptops based on historical user data and browsing logs.

customer-behavior-analysis feature-engineering logistic-regression machine-learning marketing-analytics numpy pandas predictive-modeling scikit-learn

Last synced: 12 Apr 2026

https://github.com/nordszamora/predictive_lung_cancer

The lung cancer predictive ML project is used to predict cancer at low cost, based on data about smoking intake and common symptoms.

bootstrap django django-rest-framework python reactjs rest-api scikit-learn vite

Last synced: 11 Apr 2026