https://github.com/msikorski93/stock-price-forecasting-with-exogenous-variables


# Stock-Price-Forecasting-With-Exogenous-Variables
![License: MIT](https://img.shields.io/badge/license-MIT-green?style=&logo=)
![Jupyter](https://img.shields.io/badge/-Jupyter-F37626?logo=Jupyter&logoColor=white)
![NumPy](https://img.shields.io/badge/-NumPy-013243?logo=Numpy&logoColor=white)
![TensorFlow](https://img.shields.io/badge/-TensorFlow-FF6F00?logo=TensorFlow&logoColor=white)
![Keras](https://img.shields.io/badge/-Keras-D00000?logo=Keras&logoColor=white)
![pandas](https://img.shields.io/badge/-pandas-150458?logo=pandas&logoColor=white)
![scikit-learn](https://img.shields.io/badge/-scikit--learn-F7931E?logo=scikitlearn&logoColor=white)

Time series forecasting (close prices) with different estimators. This project focuses on adding exogenous features to the predictive models. The stock price data was collected and loaded via the `yfinance` API. The idea was to enrich the dataset and improve model learning. With proper feature selection, we achieved accurate price estimates, with the following evaluation metrics:
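To illustrate what attaching an exogenous feature to a price series can look like, here is a minimal sketch in plain Python. It is not the project's actual pipeline: the function name `align_exogenous`, the sample values, and the choice to lag the external series by one trading day (so that each row only uses information available before the close being predicted) are all assumptions made for this example.

```python
def align_exogenous(closes, exog, lag=1):
    """Pair each close price with an exogenous value observed `lag` rows earlier.

    closes, exog: dicts mapping an ISO date string to a float (hypothetical inputs).
    Returns a list of (date, lagged exogenous value, close) tuples.
    """
    dates = sorted(closes)  # ISO date strings sort chronologically
    rows = []
    for i in range(lag, len(dates)):
        earlier = dates[i - lag]
        # Skip days for which the external series has no lagged observation.
        if earlier in exog:
            rows.append((dates[i], exog[earlier], closes[dates[i]]))
    return rows


# Made-up sample values, only to show the shape of the result.
closes = {"2024-01-02": 100.0, "2024-01-03": 102.0, "2024-01-04": 101.0}
exog = {"2024-01-02": 4700.0, "2024-01-03": 4710.0}
rows = align_exogenous(closes, exog)
```

Each resulting row holds one feature and one target, so the same structure extends to several exogenous columns by adding more values per tuple.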

| Model | RMSE | 6-Fold Cross-Validation | R² | MAE | MAPE [%] |
|-------------------|---------|-------------------------|--------|---------|----------|
| SARIMAX | 7.9388 | 2.6185 | 0.9993 | 5.7220 | 0.0052 |
| RNN | 81.5023 | 0.0226 | 0.9257 | 64.9433 | 0.0556 |
| LSTM | 19.4527 | 0.0071 | 0.9959 | 16.1197 | 0.0147 |
| Prophet | 8.0641 | 1.8883 | 0.9993 | 5.8148 | 0.0052 |
| Linear Regression | 7.9390 | 1.0217 | 0.9993 | 5.7226 | 0.0052 |
| SVR | 44.1079 | 0.0571 | 0.9789 | 42.5849 | 0.0419 |
| XGBoost | 20.8949 | 0.0533 | 0.9953 | 14.7921 | 0.0127 |
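
For reference, the metrics in the table follow standard definitions. The sketch below computes RMSE, MAE, MAPE, and R² from scratch; the function name `regression_metrics` and the sample numbers are made up for illustration, and this is not necessarily how the project computed its scores.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (as a fraction) and R² for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when a true value is zero; prices are assumed positive.
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Made-up values for illustration only.
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```

Note that this sketch returns MAPE as a fraction; multiply by 100 to express it in percent.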

Among the chosen regressors, linear regression was the best choice. SARIMAX had practically the same performance (RMSE 7.9388 vs. 7.9390) and is much more widely applied in the data science community, but the linear regressor is much easier to develop and understand. Its high performance suggests that our engineered variables are linearly correlated with the target feature. Adding extra variables is very beneficial: it enriches the dataset, provides more information, and therefore improves regression performance.
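
The link between linear correlation and linear regression performance can be seen in a small single-feature example. The sketch below fits an ordinary least squares line and reports the Pearson correlation; the function name `fit_line` and the data are hypothetical, and the project's models use several features rather than one.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y ≈ slope * x + intercept, plus Pearson's r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Made-up, perfectly linear data: the feature fully explains the target.
slope, intercept, r = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

For a single feature with an intercept, the R² of the fitted line equals `r ** 2`, so a feature that is strongly linearly correlated with the target yields a high R² for a linear model.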