An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/khanovico/python-stock-analyzer

A web app implemented in Python with several data science frameworks that enables online stock trend analysis.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 02 Feb 2026

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a machine learning model, built with NumPy, pandas, scikit-learn, and Streamlit, to forecast sales based on TV, newspaper, and online advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 07 Feb 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 classes in 9 parts. An advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/s-matke/eco-forecast

Machine learning model for predicting which European country generates the most surplus green energy.

data-science green-energy machine-learning scikit-learn supervised-learning

Last synced: 05 May 2026

https://github.com/nihanthbhargav/time-series-stock-market

This project combines computer vision and NLP by segmenting pet images with a U-Net model and generating captions using CNN-RNN/LSTM. Using the Oxford-IIIT Pets dataset, it demonstrates a unified pipeline that integrates pixel-level segmentation with automatic caption generation for meaningful image understanding.

matplotlib numpy pandas plotly python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/sharkb8t/credit-risk-classification

Demonstrates my ability to use Jupyter Notebook with scikit-learn to train and evaluate a machine learning model.

jupyter-notebook numpy pandas pathlib python scikit-learn

Last synced: 15 Apr 2026

https://github.com/aerojam95/math70076-data-science-cw2

This repository presents the second coursework for the MATH70076 Data Science module at Imperial College London. The project showcases different machine learning and deep learning models for image classification.

data-science deep-learning machine-learning python3 pytorch scikit-learn

Last synced: 15 Apr 2026

https://github.com/chirindaopensource/strapsim_portfolio_similarity_metric

End-to-End Python implementation of STRAPSim: a novel portfolio similarity metric from Li et al. (2025). Combines Random Forest proximity learning with residual-aware bipartite matching to quantify economic substitutability between ETF baskets. Full replication pipeline included.

asset-management bipartite-matching corporate-bonds etf-analysis fixed-income jupyter-notebook machine-learning numba pandas portfolio-optimization portfolio-similarity proximity-matrix python quantitative-finance random-forest research-replication scikit-learn similarity-metrics statistical-analysis supervised-learning

Last synced: 28 Apr 2026

https://github.com/danicaalana/wine-dataset-decision-tree

This project was developed as part of Digital Skill Fair (DSF) 35.0 - Data Science by Dibimbing. It uses the Wine Recognition Dataset from scikit-learn, which contains the results of a chemical analysis of wines grown in the same region of Italy by three different cultivators.

data data-analysis-python data-science decision-tree-classification machine-learning python scikit-learn wine-dataset

Last synced: 18 Apr 2026

https://github.com/jobanjps089/mental_wellness

This project is a Flask web application that predicts mental wellness levels based on lifestyle factors such as screen time, sleep hours, and work-related screen exposure. It uses a Machine Learning model trained in Google Colab and deployed via Hugging Face Spaces for public access.

flask joblib puthon3 scikit-learn

Last synced: 16 May 2026

https://github.com/max00358/sign_language_detection

A sign language detector that recognizes the ASL (American Sign Language) alphabet.

mediapipe opencv scikit-learn

Last synced: 09 Feb 2026

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a machine learning model, built with NumPy, pandas, scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 26 Feb 2026

https://github.com/arjunan-k/medical_insurance

A project to analyze and forecast patients' medical insurance costs using data science frameworks.

medical-insurance scikit-learn tableau

Last synced: 12 Jun 2026

https://github.com/smusab9152/bpm_pred_songs

ML project to predict the Beats Per Minute (BPM) of a song using various audio features. This is a submission for the Kaggle Playground Series (S04E02). The notebook covers a full data science workflow, including EDA, handling skewed data with log transformations, feature scaling, and building various regression models.

data-science jupyter-notebook kaggle-competition machine-learning pandas regression scikit-learn

Last synced: 11 May 2026

https://github.com/manishrajmss13/regression_project

A predictive machine learning model to forecast the Algerian Forest Fire FWI using Python, Scikit-learn, and Statsmodels. Includes complete data cleaning and EDA.

data-cleaning-and-preprocessing data-science eda feature-engineering learning-by-doing linear-regression machine-learning python regression scikit-learn statsmodel

Last synced: 09 May 2026

https://github.com/mindlessmuse666/train-test-splitter

Analysis of Titanic passenger data and splitting it into training and test sets. A practical assignment for the course "Fundamentals of Applying Artificial Intelligence Methods in Programming".

data-analysis data-preprocessing data-visualization machine-learning pandas python scikit-learn seaborn titanic train-test-split

Last synced: 12 Apr 2026

https://github.com/abhay-rudatala/resume-analyzer

Intelligent Resume Analysis System using machine learning and NLP. Features TF-IDF + Naive Bayes/SVM classification (90-95% accuracy), spaCy NER for information extraction, and an interactive Streamlit web app with a custom UI. Built with Python and scikit-learn, and deployed on Streamlit Cloud.

classification machine-learning named-entity-recognition nlp portfolio-project python resume-analysis scikit-learn spacy streamlit

Last synced: 06 May 2026

https://github.com/brossend/automl_bank_project

Automated ML pipeline for the UCI Bank Marketing dataset: ETL, Optuna-based AutoML, model evaluation, MLflow logging, pytest tests, Docker, and CI/CD.

automl bank-marketing binary-classification ci-cd classification data-science docker docker-compose etl github-actions gitlab-ci machine-learning ml-pipeline mlflow model-monitoring optuna pytest python scikit-learn uci-dataset

Last synced: 02 Jun 2026

https://github.com/sachinh123/cognitive-customer-insights-with-watson-ai

This project analyzes customer data to provide insights for personalized services, behavior prediction, and improved support.

flask ibm-cloud ibm-watson-assistant ibm-watson-nlu nltk python scikit-learn

Last synced: 10 Feb 2026

https://github.com/ytalk/deep-learning

A repository dedicated to my journey of learning and experimentation in deep learning. It contains various pipelines and implementations on different datasets, exploring models (MLPs, LSTMs, CNNs) and techniques (regression, classification, etc.) with a focus on TensorFlow and Keras.

data-science deep-learning keras machine-learning neural-networks pandas python scikit-learn tensorflow

Last synced: 30 Dec 2025

https://github.com/0eix/ibm-ds-spacex-falcon9

Final project notebooks for the IBM Data Science Professional Certificate.

data-science data-visualization exploratory-data-analysis ibm poetry scikit-learn shap

Last synced: 11 Feb 2026

https://github.com/pranavsp108/financial-fraud-detection

A comprehensive machine learning project for detecting financial fraud using XGBoost and LightGBM, with a focus on advanced feature engineering, class imbalance handling, and hyperparameter tuning.

classification-model data-science feature-engineering fraud-detection hyperparameter-tuning lightgbm machine-learning pandas python scikit-learn xgboost

Last synced: 04 May 2026

https://github.com/ahmadbuilds/fake-news-classifier

Classifies news articles as real or fake using an NLP pipeline with TF-IDF + n-grams and machine learning models. Includes text preprocessing, feature engineering, model training, and evaluation.

fastapi logistic-regression matplotlib n-grams nextjs nltk numpy pandas python3 random-forest-classifier react scikit-learn seaborn supervised-learning tf-idf typescript xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/nurulashraf/predictive-maintenance-analysis-for-machine-failure-prevention

Predictive maintenance analysis for machine failure prevention using sensor data and ML. Built a Random Forest model and Gradio dashboard to identify high-risk machines for proactive maintenance.

data-science failure-prediction gradio industrial-iot machine-learning power-bi predictive-maintenance python scikit-learn

Last synced: 16 Apr 2026

https://github.com/pranavsp108/time-series-forcasting

A time-series forecasting project to predict hourly energy consumption using Python, Pandas, and an XGBoost regression model.

data-analysis data-science energy-consumption forecasting matplotlib numpy pandas python scikit-learn sustainability time-series xgboost

Last synced: 10 Apr 2026

https://github.com/18mahi/digital_cave

An intermediate-level deep learning project that compares Convolutional Neural Networks (CNN) and Multi-Layer Perceptrons (MLP) on the MNIST handwritten digits dataset. This project demonstrates data augmentation, learning rate scheduling, and visual comparison of model performance.

cnn confusion-matrix data-augmentation data-science deep-learning evaluation-metrics jupyter-notebook keras learning-rate-scheduler machine-learning matplotlib mlp numpy python3 scikit-learn seaborn tensorflow

Last synced: 13 Apr 2026

https://github.com/sabin74/fake_news_detection

This project implements a Fake News Detection system using Python, Natural Language Processing (NLP), and machine learning. It classifies news articles as Real or Fake based on their textual content.

fake-news-detection kaggle-dataset multinomial-naive-bayes passive-aggressive-classifier python3 regex scikit-learn

Last synced: 16 Apr 2026

https://github.com/sanjiv856/machine_learning_scikit-learn

Repository for machine learning in Python using Scikit-learn.

pipelines python scikit-learn sklearn titanic-kaggle titanic-survival-prediction

Last synced: 27 Feb 2026

https://github.com/codedby-mozz/habits_vs_academic_performance

This repository contains a Jupyter Notebook that explores the relationship between student lifestyle habits and academic performance. It demonstrates the process of data loading, exploratory data analysis (EDA), correlation analysis, and the development of a predictive model using linear regression to predict exam scores based on daily habits.

linear-regression python scikit-learn

Last synced: 16 Apr 2026

https://github.com/subratamondal1/heart-attack-prediction

Heart attack prediction for patients based on the provided data. Workflow: data ingestion, data preparation, exploratory data analysis (EDA), modelling, and evaluation.

data-analysis data-science data-visualization kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python3 scikit-learn seaborn

Last synced: 09 Apr 2026