https://github.com/theveryhim/dimensionality-reduction-and-clustering
Simple ML-like data analysis and processing.
autoencoder clustering data-analysis dimensionality-reduction pca
- Host: GitHub
- URL: https://github.com/theveryhim/dimensionality-reduction-and-clustering
- Owner: theveryhim
- License: mit
- Created: 2025-07-02T20:54:58.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-02T22:27:05.000Z (12 months ago)
- Last Synced: 2025-07-02T23:28:18.245Z (12 months ago)
- Topics: autoencoder, clustering, data-analysis, dimensionality-reduction, pca
- Language: Jupyter Notebook
- Homepage:
- Size: 1.09 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# MNIST
Tasks completed in this section, which is part of an ML assignment:
- PCA and LDA analysis on the MNIST dataset
- Implementing an autoencoder network
- Clustering with K-means
- Clustering with GMM
## PCA and LDA Analysis on MNIST Dataset
| Method | Accuracy |
|---------------------------------|------------|
| Original Data | 0.885 |
| Manual PCA | 0.865 |
| RBF Kernel PCA | 0.8175 |
| Polynomial Kernel PCA | 0.825 |
| Linear Kernel PCA | 0.865 |
| LDA Projected Data | 0.755 |
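The "Manual PCA" row refers to PCA computed without a library routine. The following is a minimal sketch of one common way to do that, via an eigendecomposition of the covariance matrix. The function name `manual_pca`, the synthetic data, and the choice of 10 components are assumptions made for illustration and are not taken from the notebook.

```python
import numpy as np

def manual_pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean                               # center each feature
    cov = np.cov(Xc, rowvar=False)              # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]              # columns are principal directions
    return Xc @ components, components, eigvals[order]

# Synthetic stand-in for flattened images (assumption: not the real MNIST data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
Z, components, variances = manual_pca(X, n_components=10)
print(Z.shape)
```

The projected features `Z` can then be passed to any classifier in place of the original pixels.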
## K-means Results
| K (Number of Clusters) | Dunn Index |
|------------------------|------------|
| 3 | 0.363 |
| 4 | 0.265 |
| 5 | 0.460 |
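The Dunn index has several variants. The sketch below assumes the most common one: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so higher values indicate compact, well-separated clusters. The function name `dunn_index` and the toy data are illustrative. The full pairwise distance matrix makes this suitable only for small samples.

```python
import numpy as np

def dunn_index(X, labels):
    """Min inter-cluster distance divided by max intra-cluster diameter."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))    # all pairwise distances
    same = labels[:, None] == labels[None, :]    # True where both points share a cluster
    max_diameter = dist[same].max()              # widest cluster
    min_separation = dist[~same].min()           # closest pair across clusters
    return min_separation / max_diameter

# Toy data: two clusters of diameter 1, separated by a gap of 5.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
labels = np.array([0, 0, 1, 1])
print(dunn_index(X, labels))  # → 5.0
```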
# Covertype dataset
Tasks completed in this section, which is part of an MDA assignment:
- Using PCA, choose the value of k such that at least 90% of the variance of the samples is retained.
- Reduce the dimensions of the samples using the obtained eigenvectors.
- Divide the data (covtype.info) into three chunks and group the data into 7 clusters using either the BFR or the CURE algorithm.
- Evaluate the clustering using the *Silhouette Score* and the *Davies-Bouldin Index*.
```
Silhouette Score: 0.42960626042271594
Davies-Bouldin Index: 4.129832832292689
```
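Two of the steps listed for this section can be sketched with NumPy alone: picking k from the cumulative explained variance, and computing the Davies-Bouldin index, for which lower values indicate better separation. The function names `choose_k` and `davies_bouldin`, the 0.90 default, and the synthetic data are assumptions for illustration. The BFR or CURE clustering step itself is not shown.

```python
import numpy as np

def choose_k(X, threshold=0.90):
    """Smallest number of components whose cumulative variance ratio reaches threshold."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)          # singular values, descending
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)  # cumulative explained variance
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative

def davies_bouldin(X, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance."""
    ids = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in ids])
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    center_dist = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
    np.fill_diagonal(center_dist, np.inf)            # a cluster is not compared with itself
    ratios = (scatter[:, None] + scatter[None, :]) / center_dist
    return ratios.max(axis=1).mean()

# Synthetic data with decaying feature scales (assumption: not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20)) * np.linspace(5.0, 0.5, 20)
k, cumulative = choose_k(X, threshold=0.90)
print(k, cumulative[k - 1])

# Toy clustering: two tight pairs of points that are far apart.
pts = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
print(davies_bouldin(pts, np.array([0, 0, 1, 1])))
```

scikit-learn provides `silhouette_score` and `davies_bouldin_score` in `sklearn.metrics` for the same purpose on real data.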
# Brain tumor: Description included in notebook
Tasks completed in this section, which is part of a deep learning assignment:
## Dimensionality Reduction
- Load data
- Flatten
- Split
- PCA
- Reconstruction
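The steps above can be sketched end to end with NumPy. This is a minimal illustration under stated assumptions: random 16x16 arrays stand in for the brain tumor images, the 80/20 split and the 20 retained components are arbitrary choices, and PCA is fitted on the training split only via SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((100, 16, 16))        # synthetic stand-in for grayscale images

# Flatten: one row of pixels per image.
X = images.reshape(len(images), -1)

# Split: shuffle indices, keep 80% for training.
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
X_train, X_test = X[idx[:split]], X[idx[split:]]

# PCA: fit on the training data only.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
components = Vt[:20]                      # rows are the top 20 principal directions

# Reconstruction: project the test data, then map back to pixel space.
Z_test = (X_test - mean) @ components.T
X_rec = Z_test @ components + mean
mse = np.mean((X_test - X_rec) ** 2)
print(X_rec.reshape(-1, 16, 16).shape, mse)
```

The reconstruction error `mse` gives a rough sense of how much image detail the retained components preserve.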
## Classifier
This section uses a Support Vector Machine (SVM) to predict tumors from the features.
The goal is to compare accuracy on the test data before and after dimensionality reduction.
```
Test Accuracy before PCA: 0.8039
Test Accuracy after PCA: 0.6470
```
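A comparison of this kind could be set up with scikit-learn roughly as follows. This is a sketch, not the notebook's code: the synthetic data from `make_classification`, the RBF kernel, the 75/25 split, and the 10 PCA components are all assumptions, so the printed numbers will not match the results above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data (assumption: stands in for image features).
X, y = make_classification(n_samples=400, n_features=50, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Baseline: SVM on the original features.
svm_raw = SVC(kernel="rbf").fit(X_train, y_train)
acc_before = svm_raw.score(X_test, y_test)

# PCA fitted on the training split only, then the same SVM on the reduced features.
pca = PCA(n_components=10).fit(X_train)
svm_pca = SVC(kernel="rbf").fit(pca.transform(X_train), y_train)
acc_after = svm_pca.score(pca.transform(X_test), y_test)

print(f"Test accuracy before PCA: {acc_before:.4f}")
print(f"Test accuracy after PCA:  {acc_after:.4f}")
```

Fitting PCA on the training split only keeps information from the test set out of the projection.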