https://github.com/gappeah/solana-ml-forecast

This project uses machine learning, specifically an XGBoost regressor, to predict the price of Solana (SOL) based on historical data and engineered features.
https://github.com/gappeah/solana-ml-forecast

cryptocurrency data-visualization machine-learning solana xgboost

Last synced: over 1 year ago
JSON representation

This project uses machine learning, specifically an XGBoost regressor, to predict the price of Solana (SOL) based on historical data and engineered features.

Host: GitHub
URL: https://github.com/gappeah/solana-ml-forecast
Owner: gappeah
Created: 2024-10-26T17:09:52.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-11-01T00:43:14.000Z (over 1 year ago)
Last Synced: 2025-01-07T19:44:27.571Z (over 1 year ago)
Topics: cryptocurrency, data-visualization, machine-learning, solana, xgboost
Language: Jupyter Notebook
Homepage:
Size: 354 MB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # Solana Price Prediction using XGBoost

This project uses machine learning, specifically an XGBoost regressor, to predict the price of Solana (SOL) based on historical data and engineered features.

## Project Overview

The primary objective of this project is to predict Solana prices using historical data. 

This repository provides a framework for predicting the price of Solana, utilising time-series data and engineered features like lagged and rolling statistics. The model is trained with the XGBoost algorithm, known for its performance in regression tasks on structured datasets using a gradient boosting framework. The dataset consists of historical price data of Solana from April 2015 to October 2024 obtained from Coincodex.com  

## What is Solana 

* Solana is a decentralized computer network that uses a blockchain database to record transactions and manage the currency. The individual unit of Solana is called a sol.

* Solana uses a proof-of-stake (PoS) mechanism and a proof-of-history (PoH) mechanism to improve on the traditional PoS blockchain. PoH uses hashed timestamps to verify when transactions occur.

* Solana can power smart contracts, decentralized finance apps, NFTs, and more. It claims to be able to process 50,000 transactions per second.

* Solana was created by Anatoly Yakovenko and Raj Gokal founded Solana Labs in 2018 and launched Solana in 2020.

## Setup Instructions

Ensure you have Python 3.7+ and the following libraries installed:

```bash

pip install numpy pandas matplotlib scikit-learn xgboost

```

### Files

- `main_xgboost copy.ipynb`: Jupyter notebook containing the main code for data preprocessing, model training, and evaluation.

- `solana_2020-04-09_2024-10-28.csv`: Historical Solana price data used to train and test the model.

## Running the Project

1. **Clone the repository** or download the necessary files.

   ```bash

   git clone https://github.com/gappeah/Solana-ML-Forecast

   ```

2. **Prepare the dataset**: Ensure the `solana_2020-04-09_2024-10-26.csv` file is in the working directory.

## Usage

### 1. Load Data

Load the Solana price data into a DataFrame and parse the `Start` column as the datetime index:

```python

import pandas as pd

# Load dataset

data = pd.read_csv('solana_2020-04-09_2024-10-28.csv')

data['Start'] = pd.to_datetime(data['Start'])

data.set_index('Start', inplace=True)

```

### 2. Feature Engineering

Create lagged and rolling statistical features for the `Close` price:

```python

target = 'Close'

lags = [1, 7, 14, 30]

# Lagged features

for lag in lags:

    data[f'{target}_lag_{lag}'] = data[target].shift(lag)

# Rolling statistics

data['rolling_mean_7'] = data[target].rolling(window=7).mean()

data['rolling_std_7'] = data[target].rolling(window=7).std()

data['rolling_mean_30'] = data[target].rolling(window=30).mean()

data['rolling_std_30'] = data[target].rolling(window=30).std()

# Drop NaN values

data.dropna(inplace=True)

```

### 3. Prepare Features and Target

Separate the features (X) and target variable (y):

```python

X = data.drop(columns=['Close'])

y = data['Close']

```

### 4. Train the XGBoost Model

Configure and train the XGBoost model with categorical support enabled:

```python

from xgboost import XGBRegressor

# Initialize XGBoost Regressor

xgb_model = XGBRegressor(

    objective='reg:squarederror',

    random_state=42,

    enable_categorical=True  # Enable categorical feature support

)

# Fit the model

xgb_model.fit(X, y)

```

### 5. Model Evaluation

Evaluate the model using Mean Absolute Error (MAE) and Mean Squared Error (MSE):

```python

from sklearn.metrics import mean_absolute_error, mean_squared_error

# Predictions

y_pred = xgb_model.predict(X)

# Evaluation metrics

mae = mean_absolute_error(y, y_pred)

mse = mean_squared_error(y, y_pred)

print("Mean Absolute Error (MAE):", mae)

print("Mean Squared Error (MSE):", mse)

```

## Example

An example run to check the dataset and engineered features:

```python

# Display the first few rows of the dataset

print(data.head())

```

## Results

The model's performance can be evaluated by comparing the predicted values against actual values. Here is an example of a quick visualization:

```python

import matplotlib.pyplot as plt

plt.figure(figsize=(10,5))

plt.plot(y.index, y, label="Actual Price")

plt.plot(y.index, y_pred, label="Predicted Price")

plt.xlabel("Date")

plt.ylabel("SOL Price")

plt.legend()

plt.show()

```

## References

- [Matplotlib documentation](https://matplotlib.org/)

- [Pandas documentation](https://pandas.pydata.org/)

- [Numpy documentation](https://numpy.org/doc/stable/)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/gappeah/solana-ml-forecast

Awesome Lists containing this project

README