Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/w7negreiros/cryptoclustering
Uses K-Means clustering to group cryptocurrencies by their performance - Crypto Clustering Challenge - Unsupervised Learning - UofT Data Analytics - Bootcamp
Last synced: 10 days ago
- Host: GitHub
- URL: https://github.com/w7negreiros/cryptoclustering
- Owner: w7negreiros
- Created: 2024-07-17T14:30:09.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-07-17T15:33:48.000Z (7 months ago)
- Last Synced: 2024-12-04T02:13:31.240Z (2 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 215 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Clustering Cryptocurrencies - UofT Data Analytics - Bootcamp
In this project I apply the unsupervised learning technique of K-Means clustering to group cryptocurrencies by their performance in an effort to create profitable portfolio recommendations.
# Data Used
[crypto_market_data.csv](Resources/crypto_market_data.csv) - market data of different cryptocurrencies during different time periods
# Summary
I start by applying the elbow method to the normalized data to find the optimal k value for a K-Means model that uses all of the original features of the dataset.
![Elbow curve](Resources/Crypto%20Visualizations/elbow_curve.png)
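The elbow step above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: it assumes `StandardScaler` for the normalization, and it uses a randomly generated DataFrame (`market_df`, with made-up coin and column names) in place of `crypto_market_data.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for crypto_market_data.csv: rows are coins, columns are
# price-change features. Names and values are made up for illustration.
rng = np.random.default_rng(0)
periods = ("24h", "7d", "14d", "30d", "60d", "200d", "1y")
market_df = pd.DataFrame(
    rng.normal(size=(40, len(periods))),
    index=[f"coin_{i}" for i in range(40)],
    columns=[f"price_change_{p}" for p in periods],
)

# Normalize so every feature has zero mean and unit variance.
scaled_df = pd.DataFrame(
    StandardScaler().fit_transform(market_df),
    index=market_df.index,
    columns=market_df.columns,
)

# Fit K-Means for a range of k values and record the inertia of each fit.
k_values = list(range(1, 11))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(scaled_df)
    inertias.append(model.inertia_)

# The "elbow" is the k after which inertia stops dropping sharply.
elbow_df = pd.DataFrame({"k": k_values, "inertia": inertias})
print(elbow_df)
```

Plotting `inertia` against `k` from `elbow_df` gives the elbow curve; the optimal k is read off the plot by eye, so it remains a judgment call.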
Then, using the optimal k value, I fit the K-Means model and predict cluster labels, which yields 4 clusters of cryptocurrencies. The inertia was high enough that I considered reducing the number of features.
![image 2]()
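A sketch of the fit-and-predict step with k = 4, under the same caveats: the data below is synthetic (generated with `make_blobs`), and the coin and feature names are hypothetical placeholders rather than the notebook's own.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled market data (hypothetical values).
features, _ = make_blobs(n_samples=40, n_features=7, centers=4, random_state=1)
scaled_df = pd.DataFrame(
    StandardScaler().fit_transform(features),
    index=[f"coin_{i}" for i in range(40)],
    columns=[f"feature_{j}" for j in range(7)],
)

# Fit K-Means with k = 4 and get one cluster label per coin.
model = KMeans(n_clusters=4, n_init=10, random_state=1)
labels = model.fit_predict(scaled_df)

# Keep the scaled data untouched and attach the labels to a copy.
clustered_df = scaled_df.copy()
clustered_df["cluster"] = labels

print(clustered_df["cluster"].value_counts().sort_index())
print(f"inertia: {model.inertia_:.2f}")
```

`model.inertia_` is the sum of squared distances from each point to its assigned centroid, which is the quantity the elbow curve tracks.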
To reduce the number of features used, I applied Principal Component Analysis (PCA) to create 3 principal components.
![image 3]()
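The reduction to three components might look like the sketch below. The input is again synthetic, and the column names `PC1` to `PC3` are labels chosen here for illustration.

```python
import pandas as pd
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled market data (hypothetical values).
features, _ = make_blobs(n_samples=40, n_features=7, centers=4, random_state=1)
scaled = StandardScaler().fit_transform(features)

# Project the 7 scaled features onto the first 3 principal components.
pca = PCA(n_components=3)
components = pca.fit_transform(scaled)

pca_df = pd.DataFrame(
    components,
    index=[f"coin_{i}" for i in range(40)],
    columns=["PC1", "PC2", "PC3"],
)

# Share of the total variance captured by each component, and in total.
print(pca.explained_variance_ratio_)
print(f"total explained variance: {pca.explained_variance_ratio_.sum():.3f}")
```

The total explained variance indicates how much information survives the reduction; a low total would suggest keeping more components.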
I then used the PCA data to again calculate the optimal k value for the K-Means model.
![image 4]()
Finally, with the optimal k value for the PCA features, I plot the new clusters.
![image 5]()
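One way to check how much the PCA step changed the grouping is to cluster both representations and cross-tabulate the labels. The sketch below does this on synthetic data; the value `k = 4` for the PCA features is an assumption made here, since in practice it would come from the elbow curve of the PCA data.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled market data (hypothetical values).
features, _ = make_blobs(n_samples=40, n_features=7, centers=4, random_state=1)
scaled = StandardScaler().fit_transform(features)
coins = [f"coin_{i}" for i in range(40)]

# Reduce to three principal components.
pca_df = pd.DataFrame(
    PCA(n_components=3).fit_transform(scaled),
    index=coins,
    columns=["PC1", "PC2", "PC3"],
)

k = 4  # assumed here; in practice taken from the elbow curve of the PCA data
full_model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(scaled)
pca_model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(pca_df)

# DataFrame ready for a scatter plot colored by cluster.
plot_df = pca_df.copy()
plot_df["cluster"] = pca_model.labels_

# Rows: clusters from all features. Columns: clusters from the PCA features.
comparison = pd.crosstab(
    pd.Series(full_model.labels_, index=coins, name="all_features"),
    pd.Series(pca_model.labels_, index=coins, name="pca_features"),
)
print(comparison)
print(f"inertia, all features: {full_model.inertia_:.2f}")
print(f"inertia, PCA features: {pca_model.inertia_:.2f}")
```

With hvPlot installed, something like `plot_df.hvplot.scatter(x="PC1", y="PC2", by="cluster")` should draw the clusters. Cluster numbers are arbitrary, so in the cross-tabulation a matching grouping shows up as one dominant cell per row rather than as a diagonal.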
# Technologies
This is a Python 3.7 project run using JupyterLab in a conda dev environment.
The following dependencies are used:
- Jupyter - Running code
- Conda (4.13.0) - Dev environment
- Pandas (1.3.5) - Data analysis
- Matplotlib (3.5.1) - Data visualization
- Numpy (1.21.5) - Data calculations + Pandas support
- hvPlot (0.8.1) - Interactive Pandas plots
- scikit-learn (1.0.2) - KMeans clustering, data normalization, and PCA
# Usage
The Jupyter notebook [crypto_investments.ipynb](Crypto_Clustering.ipynb) walks through every step of the data collection, preparation, and analysis. Data visualizations are shown inline, with accompanying analysis responses.