https://github.com/purcellcjp/cryptoclustering
This project applies K-means clustering to group cryptocurrencies based on their 24-hour and 7-day price changes. It also investigates how dimensionality reduction with Principal Component Analysis (PCA) affects the clustering outcomes.
clustering cryptocurrency machine-learning pca-analysis unsupervised-learning
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/purcellcjp/cryptoclustering
- Owner: purcellcjp
- Created: 2024-11-04T16:18:12.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-07T19:50:35.000Z (about 1 year ago)
- Last Synced: 2025-02-05T17:12:55.137Z (11 months ago)
- Topics: clustering, cryptocurrency, machine-learning, pca-analysis, unsupervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 1.27 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Crypto Clustering
## Overview
In this challenge, you'll use your knowledge of Python and unsupervised learning to predict if cryptocurrencies are affected by 24-hour or 7-day price changes.
## Steps
1. Load the data into a DataFrame.
2. Prepare the data by scaling it using StandardScaler().
3. Find the best value for k using the Scaled DataFrame.
4. Cluster cryptocurrencies with K-means using the original scaled data.
5. Optimize clusters with Principal Component Analysis (PCA).
6. Find the best value for k using the PCA DataFrame.
7. Cluster cryptocurrencies with K-means using the PCA DataFrame.
8. Visualize and compare the results using hvPlot.
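
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names (`change_24h`, `change_7d`, and so on), the helper `elbow_inertia`, the choice of `k = 4`, and the three PCA components are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Step 1: synthetic stand-in for the market data (hypothetical columns and values).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(40, 4)),
    columns=["change_24h", "change_7d", "change_14d", "change_30d"],
    index=[f"coin_{i}" for i in range(40)],
)

# Step 2: scale every feature to zero mean and unit variance.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(df), columns=df.columns, index=df.index
)


def elbow_inertia(data, k_values):
    """Fit K-means for each k and return a DataFrame of k versus inertia."""
    rows = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        model.fit(data)
        rows.append({"k": k, "inertia": model.inertia_})
    return pd.DataFrame(rows)


# Step 3: inertia values for the elbow curve on the scaled data.
elbow_scaled = elbow_inertia(scaled, range(1, 11))

# Step 4: cluster the scaled data (k = 4 is assumed, not read off a real curve).
k = 4
scaled_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)

# Step 5: reduce the scaled features to three principal components (assumed count).
pca = PCA(n_components=3)
pca_df = pd.DataFrame(
    pca.fit_transform(scaled), columns=["PC1", "PC2", "PC3"], index=df.index
)

# Steps 6 and 7: repeat the elbow search and the clustering on the PCA data.
elbow_pca = elbow_inertia(pca_df, range(1, 11))
pca_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pca_df)
```

For step 8, importing `hvplot.pandas` adds an `.hvplot` accessor to DataFrames, so a call such as `elbow_scaled.hvplot.line(x="k", y="inertia")` would draw the elbow curve from the table built here.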
## Results
The project includes the following visualizations:
1. Elbow curve for the original data.

2. Scatter plot of cryptocurrency clusters based on the original data.

3. Elbow curve for the PCA data.

4. Scatter plot of cryptocurrency clusters based on the PCA data.

## Conclusion
The project analyzes how reducing the number of features with PCA affects K-means clustering. Comparing the clusters obtained from the original scaled data with those obtained from the PCA data shows the effect of dimensionality reduction on the clustering process.
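
One way to put a number on that comparison, offered here as a sketch rather than something the notebook does, is to check how much variance the principal components retain and how closely the two sets of cluster labels agree. The example below uses synthetic blobs from `make_blobs` and scikit-learn's `adjusted_rand_score`; the sample size, feature count, and cluster count are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the scaled price-change features (assumed shape).
features, _ = make_blobs(n_samples=60, centers=3, n_features=5, random_state=0)
scaled = StandardScaler().fit_transform(features)

# Cluster the full scaled feature set.
labels_full = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# Cluster again after projecting onto two principal components.
pca = PCA(n_components=2)
reduced = pca.fit_transform(scaled)
labels_pca = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reduced)

# Share of the original variance kept by the two components.
retained_variance = pca.explained_variance_ratio_.sum()

# Agreement between the two clusterings; 1.0 means identical groupings,
# regardless of how the cluster ids happen to be numbered.
agreement = adjusted_rand_score(labels_full, labels_pca)

print(f"variance retained: {retained_variance:.2f}")
print(f"adjusted Rand index: {agreement:.2f}")
```

A high adjusted Rand index alongside a high retained variance would suggest that the reduced features preserve the grouping found in the full feature set.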
## Dependencies
- Python
- pandas
- scikit-learn
- hvPlot