An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pca-analysis

A curated list of projects in awesome lists tagged with pca-analysis.

https://github.com/tatevkaren/mathematics-statistics-for-data-science

Mathematical and statistical topics for performing statistical analysis and tests: Linear Regression, Probability Theory, Monte Carlo Simulation, Statistical Sampling, Bootstrapping, Dimensionality reduction techniques (PCA, FA, CCA), Imputation techniques, Statistical Tests (Kolmogorov-Smirnov), Robust Estimators (FastMCD) and more, in Python and R.

bootstrap canonical-correlation clustering data-imputation dimensionality-reduction factor-analysis-methods importance-sampling inverse-transform-method linear-regression monte-carlo-simulation pca-analysis probability-distribution python3 r regression-analysis rejection-sampling statistcal-tests

Last synced: 10 Apr 2025

https://github.com/yaricom/timeserieslearning

The project aimed to implement a deep NN / RNN based solution, developing flexible methods that can adaptively fill in, backfill, and predict time series using a large number of heterogeneous training datasets.

adagrad adam-optimizer adamax deep-learning deep-neural-networks leaky-relu pca-analysis random-forest-regressor recurrent-neural-networks rmsprop rnn time-series

Last synced: 05 Apr 2025

https://github.com/ethanhe42/an-analysis-of-single-layer-networks-in-unsupervised-feature-learning

Implementation for: An Analysis of Single-Layer Networks in Unsupervised Feature Learning

kmeans neural-network pca-analysis

Last synced: 11 Apr 2025

https://github.com/lucasrodes/kpca-denoising-python

Reproduction of the experiments presented in Kernel PCA and De-noising in Feature Spaces, as a project in the DD2434 Machine Learning, Advanced Course during Winter 2016

denoising-images kernel-methods kpca-analysis machine-learning pca-analysis

Last synced: 18 Mar 2025

https://github.com/supersonichub1/spotify-audio-embeddings

Visualizations of music semantics calculus using Spotify and deep embeddings.

audio-embedding clustering embeddings interactive-visualizations music pca-analysis spotify

Last synced: 07 May 2025

https://github.com/n-sapkota/fault-detection-wind-turbine

Wind turbine fault detection using a one-class SVM

fault-detection machine-learning pca-analysis python wind-turbines

Last synced: 28 Jul 2025

https://github.com/wildoctopus/heart-disease-detection

Heart disease detection using different classifiers and neural network with feature engineering.

decesion-trees k-nearest-neighbours kmeans-clustering knn-algorithm multi-layer-perceptron pca-analysis principle-component-analysis random-forest svm

Last synced: 07 Aug 2025

https://github.com/iamdanialkamali/word-embedding-bert

Word Sense Disambiguation Using BERT model

bert classification pca-analysis

Last synced: 16 Mar 2025

https://github.com/akurgat/botnet-anomaly-detection

Using an MLP to identify botnets in network traffic

mlp-classifier pca-analysis pcap python seaborn tensorflow

Last synced: 13 Jul 2025

https://github.com/valentingol/torch_pca

Principal Component Analysis (PCA) in PyTorch.

pca pca-analysis principal-component-analysis python python3 pytorch torch

Last synced: 28 Oct 2025

https://github.com/asifhaider/machine-learning-4-2

Machine Learning Course Assignments from Scratch (Exploratory Data Analysis, Logistic Regression, Adaboost, Feed Forward Neural Network, EM Algorithm, Gaussian Mixture Models, Singular Value Decomposition, Image Reconstruction, Principal Component Analysis)

adaboost eda em-algorithm gaussian-mixture-models logistic-regression neural-network pca-analysis preprocessing singular-value-decomposition

Last synced: 27 Mar 2025

https://github.com/reckadon/ml-harwithdts

Repository for Assignment 1 of team KAR.ai. - Human Activity Recognition (HAR) with Decision trees and LLMs

data-collection decision-trees groq-api human-activity-recognition jupyter machine-learning matplotlib pandas pca-analysis prompt-engineering sklearn tsfel

Last synced: 23 Jan 2026

https://github.com/mikel-brostrom/housing_price_prediction

California housing price prediction with NN, Random Forest and Linear Regression

california-housing-price-prediction data-cleaning feature-engineering linear-regression pca-analysis random-forest

Last synced: 28 Dec 2025

https://github.com/aman-095/principal-orthogonal-decomposition-on-schlieren-images

Using SVD and PCA to observe the most influential modes present in a shock wave generated in a supersonic wind tunnel.

canny-edge-detection pca-analysis super-resolution svd

Last synced: 15 Mar 2025

https://github.com/phanikmr/facefinder

FaceFinder is a face recognition security check app coded in MATLAB. It performs a security check in seconds by identifying whether a particular person is allowed to do a particular thing or task. It can be used as a visual attendance system, where student identification and recognition are achieved through face recognition, and in security applications; face recognition applications are widely used in corporate and educational institutions and at ticket reservation systems. The algorithm behind face tracking is Viola-Jones, and face recognition uses PCA.

computer-vision face-recognition image-processing matlab pca-analysis

Last synced: 22 Mar 2025

https://github.com/oscarlorentzon/repstruct

Python library for finding representative structures in image collections

bag-of-visual-words descriptor image-retrieval pca-analysis python

Last synced: 22 Jul 2025

https://github.com/jash271/myntra_image_search_emulation

A Mockup to imitate Myntra's Image Search 👔

cosine-similarity opencv pca-analysis pickle python

Last synced: 19 Apr 2026

https://github.com/sherif-mooo/mts-cpca

Common PCA for Multivariate time series

multivariate-time-series pca-analysis

Last synced: 15 Jun 2026

https://github.com/aryehky/ml-theory-to-implementation

📚🧠 Explores core machine learning concepts through interactive, hands-on Jupyter notebooks. From building a neural network from scratch to applying dimensionality reduction and classification with real-world libraries, this project bridges the gap between theory and practical application.

4d-database jupyter-notebook machine-learning matploblib numpy pca-analysis

Last synced: 23 Jun 2025

https://github.com/mahnoorsheikh16/credit-card-default-prediction

This project focuses on predicting whether a customer will default on their credit card payment in the upcoming month. Utilizing historical transaction data and customer demographics, the project employs various machine learning algorithms to distinguish between risky and non-risky customers for better credit risk management.

encoding hiplot imblearn json knn-imputer logistic-regression matplotlib numpy pandas pca-analysis plotly scipy seaborn sklearn smote streamlit support-vector-machines timeseries-forecasting visualization xgboost-classifier

Last synced: 06 Apr 2026

https://github.com/GokulSuseendran/Insurance-Fraud-Prediction

The objective of the project is to predict the risk of auto Insurance fraud using Logistic Regression.

logistic-regression msexcel pca-analysis r rshinyapp

Last synced: 29 Jul 2025

https://github.com/moindalvs/pca_dimensionality_reduction

Principal Component Analysis. Let's discuss PCA! Since this isn't exactly a full machine learning algorithm, but instead an unsupervised learning algorithm, we will just have a lecture on this topic, but no full machine learning project (although we will walk through the cancer set with PCA).

data-science pca pca-analysis principle-component-analysis

Last synced: 18 Apr 2026

https://github.com/yehoanatnezra/cancer_attributes_prediction

This repository implements a machine-learning pipeline that, given a patient’s clinical profile, predicts cancer risk and, if positive, the most likely subtype or metastasis pattern. It includes cleaned training data, reproducible training scripts and a concise analysis report.

gradient-boosting k-means-clustering machine-learning pca-analysis random-forest

Last synced: 25 Sep 2025

https://github.com/mahnoorsheikh16/sketchify-a-quick-draw-drawing-classifier

Implementation of a sketch‐recognition pipeline inspired by Google’s Quick, Draw!—from raw stroke data to prediction. Includes data preprocessing and feature‐engineering scripts, three Bayesian classifiers alongside Logistic Regression, SVM, K-NN and XGBoost baselines, and an RNN model.

bayesian-classifier drawing-classification feature-engineering gaussian-naive-bayes google-quick-draw kfold-cross-validation knn-classification linear-discriminant-analysis-lda logistic-regression matplotlib-pyplot multivariate-gaussian-distribution parzen-window pca-analysis rnn-pytorch seaborn-plots sequential-forward-selection spherical-gaussian-kernel svm-classifier tsne-visualization xgboost-classifier

Last synced: 13 Aug 2025

https://github.com/mmsaki/clustering-crypto

Using k-Means algorithm and a Principal Component Analysis (PCA) to cluster cryptocurrencies.

elbow-curves pca-analysis

Last synced: 02 Sep 2025

https://github.com/ihb-ibr-department/pie_toolbox

PIE Toolbox is a Python package for processing, analyzing, and classifying neuroimaging voxel data using SSM-PCA and SVM.

neuroimaging pca-analysis ssm-pca

Last synced: 10 Mar 2026

https://github.com/moindalvs/assignment_pca_wine_dataset

Case summary: perform principal component analysis, then cluster using the first 3 principal component scores (both hierarchical and k-means clustering, with a scree plot or elbow curve) to obtain the optimum number of clusters, and check whether this matches the number of clusters in the original data (the class column, which was ignored at the beginning, shows 3 clusters).

data-science feature-selection jupyter-notebook pca pca-analysis python tsne

Last synced: 17 May 2026

https://github.com/mahnoorsheikh16/Credit-Card-Default-Prediction

This project focuses on predicting whether a customer will default on their credit card payment in the upcoming month. Utilizing historical transaction data and customer demographics, the project employs various machine learning algorithms to distinguish between risky and non-risky customers for better credit risk management.

chi-square-test encoding hiplot imblearn json knn-imputer matplotlib numpy pandas pca-analysis pillow plotly robust-scalar scipy seaborn sklearn smote streamlit ttest visualization

Last synced: 01 Mar 2025

https://github.com/mumtaz4118/transfer-learning-on-covid-data

This deep learning model (CNN) uses transfer learning via feature extraction and fine-tuning to perform multiclass classification of COVID-19, pneumonia, and healthy images.

data-science deep-learning embedding-vectors feature-engineering feature-extraction machine-learning pca-analysis research-project transfer-learning

Last synced: 10 Oct 2025

https://github.com/bp0609/har-human-activity-recognizer-decision-tree-model

This is a robust and efficient ML model for recognizing various human activities such as walking, sitting, and running using accelerometer data.

decision-tree-classifier pca-analysis tsfel

Last synced: 14 Oct 2025

https://github.com/itancio/sixt33n

The objective of this lab is to use voice commands to control how the car moves. Each voice command is associated with one of these events: drive straight far, drive straight close, turn right, and turn left. The voice command signals undergo PCA projection and classification in real time. After every successful classification, Sixt33n should enter drive mode and execute the intended action.

k-means-clustering pca-analysis

Last synced: 28 Feb 2026

https://github.com/raj-pulapakura/basketball-players-analysis

This repo features an analysis on various basketball players, using unsupervised learning techniques.

basketball clustering coursera data-science exploratory-data-analysis machine-learning pca-analysis unsupervised-machine-learning

Last synced: 19 Mar 2026

https://github.com/zofiaqlt/market_research_nlp_clustering

🎯 International market analysis and optimization program for reducing food waste and improving the vegetable export industry - use of Python, JupyterLab and NLP (background research, data collection, cleaning, EDA, unsupervised machine learning, factorial method, HCA, clustering, data visualization and a 10-year projection with a Sankey diagram)

clustering database-management nlp pca-analysis python sankey-diagram

Last synced: 05 May 2026

https://github.com/vercetti322/image-reconstruction

This repository contains image reconstruction models for the MNIST (digits 0-9) image dataset.

jupyter-notebook pca-analysis python variational-autoencoder

Last synced: 08 May 2026

https://github.com/cyprianfusi/predicting-heart-disease-using-k-nearest-neighbours

Up to 90% accuracy with just 5 features using the KNN algorithm and PCA for feature engineering. The dataset contained fewer than 1000 observations. The model's accuracy could be improved with more observations, further hyperparameter optimization, and feature engineering.

cap-curve feature-engineering feature-selection heart-disease knn-classifier pca pca-analysis

Last synced: 21 Mar 2025

https://github.com/akimuddinshaikh/machine-learning-project

A comparative study of regression models (Decision Tree, Random Forest, Ridge, Lasso, SVM) for predicting real estate prices in King County, NYC, and California using PCA & Pipeline techniques.

machine-learning pca-analysis python regression-models scikit-learn statsmodels

Last synced: 16 May 2026

https://github.com/alessioborgi/MLPipelineOptimizationStudy

Exploration and optimization of a ML pipeline, delving into various techniques for enhancing different stages of ML workflows, including data preprocessing, feature engineering, model selection, and hyperparameter tuning.

catboost catboost-classifier data-imbalance gridsearchcv kernel-svm lightgbm lightgbm-classifier logistic-regression pca pca-analysis random-forest resource-complexity svm t-sne voting-ensemble xgboost xgboost-classifier

Last synced: 18 Jan 2026

https://github.com/abdellatif-laghjaj/data-analysis

PCA implementation with seaborn

dataanalysis pca pca-analysis

Last synced: 21 Apr 2026

https://github.com/noedemange/nspcaview

Shiny app: optimised matrix visualization of Non-negative Sparse PCA components.

nspca pca pca-analysis r rstats shiny shiny-apps visualization

Last synced: 12 Jun 2026

https://github.com/kingflow-23/ai-related-article-detector

Create a simple system that determines whether an article is related to AI or not using web scraping, text representation, and a classifier.

data-analysis data-engineering data-science logistic-regression pca-analysis scraping selenium umap

Last synced: 04 May 2026

https://github.com/balaka-18/dimensionality-reduction

Notebooks on PCA (Principal Component Analysis)

dimensionality-reduction mnist pca pca-analysis pca-implementation

Last synced: 03 Feb 2026

https://github.com/ziraddingulumjanly/unsupervised-learning-implementation-on-heartattackdataset

This study aims to identify distinct subgroups within a dataset of patients with heart attack-related features using unsupervised learning techniques: k-means and Hierarchical

dataset heartattack kaggle kmeans-algorithm kmeans-clustering pca-analysis tsne-algorithm unsupervised-machine-learning

Last synced: 05 May 2026

https://github.com/aman-095/principal-component-analysis-pca-implementaion-from-scratch

Implemented PCA algorithm from scratch on MNIST Dataset. Visualizing the reconstructed images made and comparing them with the original image. Visualizing the residual images by subtracting the reconstructed image from the original image (for different values of Principal components). Finding the reconstruction error (pixel-wise root-mean-square) for each sample and plot them for a different number of principal components.

machine-learning mnist-classification mnist-dataset pca-analysis principal-component-analysis

Last synced: 15 Mar 2025
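
The procedure described in the entry above (PCA from scratch, reconstruction from the first k components, per-sample root-mean-square error) can be sketched roughly as follows. This is not the repository's code: it is a minimal standard-library illustration under stated assumptions. It uses a small synthetic 4-feature dataset as a stand-in for MNIST, the helper names (`center`, `covariance`, `principal_components`, `reconstruction_rmse`) are hypothetical, and the components are approximated by power iteration with re-orthogonalization, which is only one of several ways to obtain them.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def orthonormalize(v, basis):
    """Remove the parts of v along the basis vectors, then scale to unit length."""
    for b in basis:
        proj = dot(v, b)
        v = [x - proj * y for x, y in zip(v, b)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 1e-12 else None


def principal_components(cov, iters=300, seed=0):
    """Approximate all eigenvectors of cov, leading one first, by power iteration.

    Each new vector is kept orthogonal to the ones already found, so the
    iteration converges towards the next-largest direction of variance.
    """
    rng = random.Random(seed)
    d = len(cov)
    comps = []
    for _ in range(d):
        v = orthonormalize([rng.gauss(0, 1) for _ in range(d)], comps)
        for _ in range(iters):
            w = orthonormalize([dot(row, v) for row in cov], comps)
            if w is None:  # remaining variance is numerically zero
                break
            v = w
        comps.append(v)
    return comps


def reconstruction_rmse(row, comps):
    """Feature-wise RMSE between a centered row and its projection onto comps."""
    scores = [dot(row, c) for c in comps]
    recon = [sum(s * c[j] for s, c in zip(scores, comps)) for j in range(len(row))]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(row, recon)) / len(row))


# Synthetic stand-in data (assumption): 60 samples, 4 features of unequal spread.
rng = random.Random(42)
data = [[rng.gauss(0, 3), rng.gauss(0, 2), rng.gauss(0, 1), rng.gauss(0, 0.5)]
        for _ in range(60)]
centered = center(data)
comps = principal_components(covariance(centered))

mean_rmse = []
for k in range(1, len(comps) + 1):
    errors = [reconstruction_rmse(r, comps[:k]) for r in centered]
    mean_rmse.append(sum(errors) / len(errors))
    print(k, mean_rmse[-1])
```

The mean error should shrink as k grows and become numerically zero once all components are kept, which is the trend such a plot of error against the number of components is meant to show.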

https://github.com/sabin74/boston_house_prediction

This project aims to predict the median value of owner-occupied homes in Boston suburbs using various machine learning regression models. Multiple regression techniques were applied, including Linear Regression, Decision Tree, Random Forest, Gradient Boosting and dimensionality reduction with PCA. Hyperparameter tuning was performed.

boston-housing-price-prediction hyperparameter-tuning kaggle-dataset pca-analysis python3 regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/pritamgouda11/comparison_analysis_of_embeddings

Conducted intrinsic evaluation of word embeddings through analogy and similarity tests, comparing the custom Skip-gram model’s performance against Google’s pre-trained Word2Vec model to assess the quality of the generated word representations.

natural-language-processing pca-analysis skipgram wikitext word2vec

Last synced: 15 Mar 2025

https://github.com/stefagnone/unsupervised-analysis-project

This project investigates the impact of video content on social media engagement using advanced analytics techniques like PCA, k-means clustering, and logistic regression. It provides actionable insights for optimizing social media strategies for Thai fashion and cosmetics retailers.

data-analysis data-visualization engagement-metrics facebook-live-sellers k-means-clustering logistic-regression marketing-insights pca-analysis python social-media-analytics

Last synced: 05 Apr 2025

https://github.com/andersoncrs/analisis-dispositivos-moviles-eda-pca

Mobile device data was analyzed to find patterns and simplify the information using visual tools and techniques such as PCA. This makes it easier to understand which features are most relevant and how they relate to one another, supporting decision-making in mobile technology.

data-exploration exploratory-data-analysis jupyter-notebook pca-analysis

Last synced: 01 Jul 2025

https://github.com/webyneter/components-analysis

Training application for performing Kernel/Principal Component Analysis

components-analysis kpca-analysis pca-analysis principle-component-analysis

Last synced: 06 May 2026

https://github.com/12danielll/neural_networks_project

The project focuses on analyzing neural activity data to classify neuron types (spiny and aspiny). It integrates unsupervised learning methods (PCA, Autoencoders) and supervised learning models (Logistic Regression, MLP) to build accurate classifiers that effectively analyze neurons' electrical responses.

2d-and-3d-visualizations autoencoders classifier-evaluation cortical-neurons data-compression gradient-descent high-dimensional-neural-datasets logistic-regression mlp mlp-networks neural-classification neuron neuronal-network pca-analysis perceptron roc-auc stochastic-gradient-descent supervised-learning unsupervised-learning

Last synced: 16 Mar 2025

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/prajakta1321/exoplanet-atmospheric-characterization-1

A machine learning project to classify exoplanets using light curve image data. Developed as part of the ML4SCI GSoC 2025 Test Task. Includes data processing, CNN-based model, and full report.

classification colab-notebook dbscan gsoc-2025 machine-learning-algorithms matplotlib-python ml numpy open-source pca-analysis python3 seaborn

Last synced: 07 May 2026

https://github.com/niteshchawla/movie-recommender-system

To create a Recommender System to show personalized movie recommendations based on ratings given by a user and other users similar to them in order to improve user experience.

collaborative-filtering correlation-matrix cosine-similarity exploratory-data-analysis feature-engineering knearest-neighbor-algorithm mape matrix-factorization pca-analysis pearson-correlation recommender-system rmse sparsity tsne-visualization visualization

Last synced: 08 Apr 2025

https://github.com/aryanpillai2007/credit-card-fraud-detection

The primary goal of this project is to develop a comprehensive fraud detection system that enhances the security and trustworthiness of financial transactions.

anomaly-detection classification credit-card-fraud data-preprocessing data-science data-visualization fraud-detection imbalanced-data logistic-regression machine-learning outlier-detection pca pca-analysis python roc-curve scikit-learn

Last synced: 18 May 2026

https://github.com/ieCecchetti/Python_ML_DL_examples

A variety of Machine Learning and Deep Learning scripts in Python. Some theoretical background is included in the README.

bayes-classifier bayesian-statistics deep-learning kernel machine-learning matplotlib neural-network numpy pandas pca pca-analysis python scikitlearn-machine-learning scipy shi

Last synced: 10 Mar 2025

https://github.com/tanyakuznetsova/music_mental_health

Harnessing music's power for better mental health: genre recommendations and data-driven analysis of listeners' trends

data-visualization decision-tree decision-tree-classifier exploratory-data-analysis k-means-clustering pca-analysis recommendation-system recommender-system surprise-python

Last synced: 11 Jul 2025

https://github.com/purcellcjp/cryptoclustering

This project applies K-means clustering to group crypto-currencies based on 24 hr and 7 day price changes. In addition, it investigates the impact of dimensionality reduction using Principal Component Analysis (PCA) on clustering outcomes.

clustering cryptocurrency machine-learning pca-analysis unsupervised-learning

Last synced: 15 Jul 2025

https://github.com/daniel-elston/credit-card-default-prediction-algorithm

Algorithm used to predict whether a bank customer will default on given credit cards, using a bank telemarketing dataset.

algorithms banking-applications classification data-science machine-learning pca-analysis pre-processing visualization wrangling-cleaning

Last synced: 04 Apr 2025

https://github.com/abdelhamid2c/acp

Implementation of Principal Component Analysis in Python

pca-analysis python

Last synced: 01 May 2026

https://github.com/norafrn/customer-clustering

Implemented a full K-Means clustering pipeline using Python, scikit-learn, and Pandas to segment customers in the Instacart dataset based on shopping behaviour. Automated preprocessing, feature scaling, and visualization (PCA, heatmaps).

heatmap k-means-clustering pandas pca-analysis sckit-learn

Last synced: 13 Apr 2026

https://github.com/leftcoastnerdgirl/unsupervised_learning

This project is the first step in machine learning, using K Means and Principal Component Analysis.

kmeans-clustering pca pca-analysis principal-component-analysis unsupervised-learning unsupervised-machine-learning

Last synced: 07 Sep 2025

https://github.com/shuyib/mouse_gut_otu

Vectorization and Unsupervised Learning of Mouse Operational Taxonomic Units to determine which species of bacteria form distinct groups in a dataset.

16s-rrna anaconda analysis data-visualization dataset gut-microbiome matplotlib-figures mothur numpy-arrays pandas-dataframe pca-analysis python3 scikitlearn-machine-learning sops t-sne unsupervised-learning

Last synced: 13 Apr 2026

https://github.com/mrtejas/ml-algos

A collection of fundamental Machine Learning algorithms implemented from scratch, along with their applications to various ML tasks such as clustering, thresholding, data analysis, prediction, regression and image classification.

adaboost bagging-ensemble decision-trees gmm-clustering gradient-boosting hidden-markov-models kernel-density-estimation knn-classification machine-learning mlp-classifier mlp-regressor multinomial-logistic-regression pca-analysis random-forest stacking-ensemble

Last synced: 15 May 2025

https://github.com/blleshi/crypto_clustering

Clustering Cryptocurrencies

cryptocurrencies k-means pca-analysis

Last synced: 06 Jul 2025

https://github.com/lingumd/cryptocurrencies

Unsupervised machine learning models used to group the cryptocurrencies to help prepare for a new investment.

concatenate elbow-curves get-dummies hvplot jupyterlab kmeans matplotlib-pyplot minmaxscaler pandas path pca-analysis plotly-express scikit-learn unsupervised-machine-learning

Last synced: 13 Apr 2026

https://github.com/plambert777/pca-principal-component-analysis

This repository contains an R script for performing Principal Components Analysis (PCA) on a dataset. The script includes functions for data preprocessing, such as reading in data and imputing missing values, and for conducting PCA using both spectral decomposition and singular value decomposition (SVD). Additionally, the script calculates centroid

pca-analysis r rmd

Last synced: 02 Apr 2025
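
The entry above mentions PCA computed by both spectral decomposition and SVD. The standard relationship between the two routes can be summarised as below. This is a general textbook identity, not a description of the repository's script, and the symbols ($X$, $S$, $V$, $\Lambda$, $U$, $D$, $Z$) are notation chosen here for illustration.

```latex
% X: n x p data matrix with column means removed (assumed notation)
\begin{align}
  S &= \frac{1}{n-1} X^{\top} X = V \Lambda V^{\top}
      && \text{(spectral decomposition of the sample covariance)} \\
  X &= U D V^{\top}
      && \text{(singular value decomposition)} \\
  S &= \frac{1}{n-1} V D^{2} V^{\top}
      \quad\Longrightarrow\quad
      \lambda_i = \frac{d_i^{2}}{n-1} \\
  Z &= X V = U D
      && \text{(principal component scores)}
\end{align}
```

Both routes therefore give the same loadings $V$ up to the sign of each column, and the component variances $\lambda_i$ follow from the singular values $d_i$.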

https://github.com/yoyolicoris/iml_hw2

My implementation of homework 2 for the Introduction to Machine Learning class in NCTU (course number 1181).

kd-tree knn-classification pca-analysis

Last synced: 02 Apr 2025

https://github.com/kris96tian/sc-rnaseq_analysis

Single-cell RNA data analysis using R (Seurat) , Python (Scanpy), and Julia.

annotations bioinformatics-analysis pca-analysis sc-rna-seq-analysis scanpy seurat singlr umap

Last synced: 05 May 2025

https://github.com/tinaland101/cryptoclustering

This project leverages unsupervised learning to analyze cryptocurrency data by clustering cryptocurrencies based on their price changes over 24 hours and 7 days. The goal is to predict if cryptocurrencies are impacted by recent price fluctuations using K-means clustering and Principal Component Analysis (PCA) for dimensionality reduction.

pandas-library pca-analysis python

Last synced: 08 May 2026

https://github.com/minhosong88/food-image-classification

This project is part of a lab assignment that focuses on food classification using the Food-11 image dataset.

classification machine-learning pca-analysis python

Last synced: 09 May 2026