AtsPy: Automated Time Series Models in Python (by @firmai)
https://github.com/firmai/atspy


# Automated Time Series Models in Python (AtsPy)

[![Downloads](https://pepy.tech/badge/atspy)](https://pepy.tech/project/atspy)

[![DOI](https://zenodo.org/badge/236661502.svg)](https://zenodo.org/badge/latestdoi/236661502)

---------

Finance Quant Machine Learning
------------------
- [ML-Quant.com](https://www.ml-quant.com/) - Automated Research Repository

---------

[SSRN Report](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3580631)

Easily develop state-of-the-art time series models to forecast univariate data series. Simply load your data and select which models you want to test. This is the largest repository of automated structural and machine learning time series models. Please get in contact if you want to contribute a model. This is a fledgling project; all advice is appreciated.

#### Install
```
pip install atspy
```

#### Automated Models

1. ```ARIMA``` - Automated ARIMA Modelling
1. ```Prophet``` - Modeling Multiple Seasonality With Linear or Non-linear Growth
1. ```HWAAS``` - Exponential Smoothing With Additive Trend and Additive Seasonality
1. ```HWAMS``` - Exponential Smoothing with Additive Trend and Multiplicative Seasonality
1. ```NBEATS``` - Neural basis expansion analysis (now fixed at 20 Epochs)
1. ```Gluonts``` - RNN-based Model (now fixed at 20 Epochs)
1. ```TATS``` - Seasonal and Trend no Box Cox
1. ```TBAT``` - Trend and Box Cox
1. ```TBATS1``` - Trend, Seasonal (one), and Box Cox
1. ```TBATP1``` - TBATS1 but Seasonal Inference is Hardcoded by Periodicity
1. ```TBATS2``` - TBATS1 With Two Seasonal Periods
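
To make the ```HWAAS``` entry concrete, here is a minimal textbook sketch of additive Holt-Winters smoothing (additive trend, additive seasonality). It is not AtsPy's implementation: the function name `holt_winters_additive`, the fixed smoothing parameters, and the simple initialisation are assumptions chosen for illustration.

```python
def holt_winters_additive(y, m, horizon, alpha=0.3, beta=0.1, gamma=0.2):
    """Forecast `horizon` steps with additive trend and additive seasonality.

    y: observations (at least two full seasons), m: season length.
    The smoothing parameters are fixed here; a real fit would optimise them.
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first = sum(y[:m]) / m
    second = sum(y[m:2 * m]) / m
    level = first                          # initial level: mean of the first season
    trend = (second - first) / m           # initial per-step trend
    season = [v - first for v in y[:m]]    # initial seasonal offsets

    for t in range(m, len(y)):
        s_old = season[t % m]
        prev_level = level
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s_old

    n = len(y)
    return [level + h * trend + season[(n + h - 1) % m] for h in range(1, horizon + 1)]


# Hypothetical toy series: constant level 10 with a repeating 4-step pattern.
pattern = [2.0, -1.0, -3.0, 2.0]
toy = [10.0 + pattern[t % 4] for t in range(12)]
toy_forecast = holt_winters_additive(toy, m=4, horizon=4)
```

Because the toy series has no trend, the forecast simply continues the seasonal pattern. ```HWAMS``` differs in that the seasonal component multiplies the level instead of being added to it.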

#### Why AtsPy?

1. Implements all your favourite automated time series models in a unified manner by simply running ```AutomatedModel(df)```.
1. Reduces structural model errors by 30%-50% by using LightGBM with TSFresh-infused features.
1. Automatically identifies the seasonalities in your data using singular spectrum analysis, periodograms, and peak analysis.
1. Identifies and makes accessible the best model for your time series using in-sample validation methods.
1. Combines the predictions of all these models in simple (average) and complex (GBM) ensembles for improved performance.
1. Where appropriate, models have been developed to use GPU resources to speed up the automation process.
1. Easily access all the models by using ```am.models_dict_in``` for in-sample and ```am.models_dict_out``` for out-of-sample prediction.
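
As an illustration of the periodogram idea mentioned in the list above, the sketch below finds the dominant seasonal period of a series from its FFT. It is a simplified stand-in, not the routine AtsPy uses internally; the function name `dominant_period` and the synthetic series are assumptions.

```python
import numpy as np


def dominant_period(values):
    """Return the period (in samples) of the strongest periodogram peak."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                       # remove the mean so frequency zero does not dominate
    power = np.abs(np.fft.rfft(x)) ** 2    # periodogram, up to a constant scale
    freqs = np.fft.rfftfreq(len(x))        # frequencies in cycles per sample
    peak = 1 + int(np.argmax(power[1:]))   # skip the zero-frequency bin
    return 1.0 / freqs[peak]


# Synthetic monthly-like series with a 12-step cycle (hypothetical data).
t = np.arange(120)
series = 10 + 3 * np.sin(2 * np.pi * t / 12)
period = dominant_period(series)
```

On real data the peak is rarely this clean, which is presumably why the package combines several detection methods.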

#### AtsPy Progress

1. Univariate forecasting only (single column) and only monthly and daily data have been tested for suitability.
1. More work lies ahead; all suggestions and criticisms are appreciated, so please use the issues tab.
1. **Here** is a **[Google Colab](https://colab.research.google.com/drive/1WzwxUlAKg-WiEm_SleAzBIV6rs5VY_3W)** to run the package in the cloud and **[here you can run all the models](https://colab.research.google.com/drive/14QVrnVtT434s-xYcalHFlQg-o658nekv)**.

### Documentation by Example

----------
#### Load Package
```python
from atspy import AutomatedModel
```

#### Pandas DataFrame

The data requires strict preprocessing: no periods can be skipped, and there cannot be any empty values.

```python
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/firmai/random-assets-two/master/ts/monthly-beer-australia.csv")
df.Month = pd.to_datetime(df.Month)
df = df.set_index("Month"); df
```




| Month | Megaliters |
|---|---|
| 1956-01-01 | 93.2 |
| 1956-02-01 | 96.0 |
| 1956-03-01 | 95.2 |
| 1956-04-01 | 77.1 |
| 1956-05-01 | 70.9 |
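
Since skipped periods and empty values are not allowed, it can help to check a frame before passing it to the package. The helper below is a sketch using plain pandas; `check_series` is a hypothetical name and not part of AtsPy, and the frequency string (`"MS"` for month start) is something you would set to match your own data.

```python
import pandas as pd


def check_series(df, freq):
    """Return (missing_periods, n_empty) for a DataFrame with a DatetimeIndex."""
    expected = pd.date_range(df.index.min(), df.index.max(), freq=freq)
    missing_periods = expected.difference(df.index)  # timestamps absent from the index
    n_empty = int(df.isna().sum().sum())             # NaN cells across all columns
    return missing_periods, n_empty


# Hypothetical toy frame: March is skipped and February is empty.
toy = pd.DataFrame(
    {"Megaliters": [93.2, None, 77.1]},
    index=pd.to_datetime(["1956-01-01", "1956-02-01", "1956-04-01"]),
)
missing_periods, n_empty = check_series(toy, freq="MS")
```

A frame is ready to use when `missing_periods` is empty and `n_empty` is zero.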

#### AutomatedModel

1. ```AutomatedModel``` - Returns a class instance.
1. ```forecast_insample``` - Returns an in-sample forecast dataframe and a performance dataframe.
1. ```forecast_outsample``` - Returns an out-of-sample forecast dataframe.
1. ```ensemble``` - Returns the results of three different forms of ensembles.
1. ```models_dict_in``` - Returns a dictionary of the fully trained in-sample models.
1. ```models_dict_out``` - Returns a dictionary of the fully trained out-of-sample models.

```python
from atspy import AutomatedModel
model_list = ["HWAMS","HWAAS","TBAT"]
am = AutomatedModel(df=df, model_list=model_list, forecast_len=20)
```

Other models to try, **add as many as you like**; note ```ARIMA``` is slow: ```["ARIMA","Gluonts","Prophet","NBEATS", "TATS", "TBATS1", "TBATP1", "TBATS2"]```

#### In-Sample Performance
```python
forecast_in, performance = am.forecast_insample(); forecast_in
```




| Date | Target | HWAMS | HWAAS | TBAT |
|---|---|---|---|---|
| 1985-10-01 | 181.6 | 161.962148 | 162.391653 | 148.410071 |
| 1985-11-01 | 182.0 | 174.688055 | 173.191756 | 147.999237 |
| 1985-12-01 | 190.0 | 189.728744 | 187.649575 | 147.589541 |
| 1986-01-01 | 161.2 | 155.077205 | 154.817215 | 147.180980 |
| 1986-02-01 | 155.5 | 148.054292 | 147.477692 | 146.773549 |

```python
performance
```




| | Target | HWAMS | HWAAS | TBAT |
|---|---|---|---|---|
| rmse | 0.000000 | 17.599400 | 18.993827 | 36.538009 |
| mse | 0.000000 | 309.738878 | 360.765452 | 1335.026136 |
| mean | 155.293277 | 142.399639 | 140.577496 | 126.590412 |
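
In the table, each `rmse` value is the square root of the matching `mse` value (for example, the square root of 309.738878 is about 17.5994). The sketch below shows the standard definitions of both metrics, applied to the five in-sample rows printed earlier. The function names are illustrative and not taken from AtsPy, and because only five rows are used, the results will not match the full-sample figures in the table.

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# First five in-sample rows from the forecast table (Target and HWAMS columns).
target = [181.6, 182.0, 190.0, 161.2, 155.5]
hwams = [161.962148, 174.688055, 189.728744, 155.077205, 148.054292]

hwams_mse = mse(target, hwams)
hwams_rmse = rmse(target, hwams)
```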

#### Out-of-Sample Forecast

```python
forecast_out = am.forecast_outsample(); forecast_out
```




| Date | HWAMS | HWAAS | TBAT |
|---|---|---|---|
| 1995-09-01 | 137.518755 | 137.133938 | 142.906275 |
| 1995-10-01 | 164.136220 | 165.079612 | 142.865575 |
| 1995-11-01 | 178.671684 | 180.009560 | 142.827110 |
| 1995-12-01 | 184.175954 | 185.715043 | 142.790757 |
| 1996-01-01 | 147.166448 | 147.440026 | 142.756399 |

#### Ensemble and Model Validation Performance

```python
all_ensemble_in, all_ensemble_out, all_performance = am.ensemble(forecast_in, forecast_out)
```

```python
all_performance
```




| | rmse | mse | mean |
|---|---|---|---|
| `ensemble_lgb__X__HWAMS` | 9.697588 | 94.043213 | 146.719412 |
| `ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS__X__ensemble_ts__X__HWAAS` | 9.875212 | 97.519817 | 145.250837 |
| `ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS` | 11.127326 | 123.817378 | 142.994374 |
| `ensemble_lgb` | 12.748526 | 162.524907 | 156.487208 |
| `ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS__X__ensemble_ts__X__HWAAS__X__HWAMS_HWAAS_TBAT__X__TBAT` | 14.589155 | 212.843442 | 138.615567 |
| `HWAMS` | 15.567905 | 242.359663 | 136.951615 |
| `HWAMS_HWAAS` | 16.651370 | 277.268110 | 135.544299 |
| `ensemble_ts` | 17.255107 | 297.738716 | 163.134079 |
| `HWAAS` | 17.804066 | 316.984751 | 134.136983 |
| `HWAMS_HWAAS_TBAT` | 23.358758 | 545.631579 | 128.785846 |
| `TBAT` | 39.003864 | 1521.301380 | 115.268940 |
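
The means reported for `HWAMS_HWAAS` and `HWAMS_HWAAS_TBAT` equal the averages of their members' means, which is consistent with those columns being simple row-wise averages of the named models. Assuming that reading, a simple average ensemble can be sketched as follows, using the five out-of-sample rows shown earlier. `average_ensemble` is a hypothetical helper, not an AtsPy function.

```python
def average_ensemble(forecasts, names):
    """Average the forecasts of the named models, one value per time step."""
    columns = [forecasts[name] for name in names]
    return [sum(values) / len(values) for values in zip(*columns)]


# First five out-of-sample rows from the forecast table above.
forecast_rows = {
    "HWAMS": [137.518755, 164.136220, 178.671684, 184.175954, 147.166448],
    "HWAAS": [137.133938, 165.079612, 180.009560, 185.715043, 147.440026],
    "TBAT": [142.906275, 142.865575, 142.827110, 142.790757, 142.756399],
}

hwams_hwaas = average_ensemble(forecast_rows, ["HWAMS", "HWAAS"])
all_three = average_ensemble(forecast_rows, ["HWAMS", "HWAAS", "TBAT"])
```

The `ensemble_lgb` variants are the GBM-based ensembles and are not reproduced here.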

#### Best Performing In-sample

```python
all_ensemble_in[["Target","ensemble_lgb__X__HWAMS","HWAMS","HWAAS"]].plot()
```
![png](atspy_files/insample.png)

#### Future Predictions All Models

```python
all_ensemble_out[["ensemble_lgb__X__HWAMS","HWAMS","HWAAS"]].plot()
```
![png](atspy_files/outsample.png)

#### And Finally Grab the Models

```python
am.models_dict_in
```

{'HWAAS': ,
'HWAMS': ,
'TBAT': }

```python
am.models_dict_out
```

{'HWAAS': ,
'HWAMS': ,
'TBAT': }

Follow [this link](https://colab.research.google.com/drive/1WzwxUlAKg-WiEm_SleAzBIV6rs5VY_3W) if you want to run the package in the cloud.

#### AtsPy Future Development

1. Additional in-sample validation steps to stop deep learning models from overfitting and underfitting.
1. Extra performance metrics like MAPE and MAE.
1. Improved methods to select the window length to use in training and calibrating the model.
1. Add the ability to accept dirty data and clean it up (interpolation, etc.).
1. Add a function to resample to a larger frequency for big datasets.
1. Add the ability to algorithmically select a good enough chunk of a large dataset to balance performance and time to train.
1. More internal model optimisation using AIC, BIC, and AICc.
1. Code annotations for other developers to follow and improve on the work being done.
1. Force seasonality stability between in-sample and out-of-sample training models.
1. Make AtsPy less dependency-heavy; it currently draws on TensorFlow, PyTorch, and MXNet.
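
Until MAE and MAPE are built in, they can be computed by hand from the `Target` column and any model column of the in-sample forecast. The sketch below uses the standard definitions; the function names and the toy numbers are assumptions, and nothing here is part of AtsPy.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy values: each prediction is off by 10% of the actual value.
toy_actual = [100.0, 200.0]
toy_predicted = [110.0, 180.0]

toy_mae = mae(toy_actual, toy_predicted)
toy_mape = mape(toy_actual, toy_predicted)
```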

## Citations

If you use AtsPy in your research, please consider citing it. I have also written a [small report](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3580631) that can be found on SSRN.

BibTeX entry:

```
@software{atspy,
title = {{AtsPy}: Automated Time Series Models in Python.},
author = {Snow, Derek},
url = {https://github.com/firmai/atspy/},
version = {1.15},
date = {2020-02-17},
}
```

```
@misc{atspy,
author = {Snow, Derek},
title = {{AtsPy}: Automated Time Series Models in Python (1.15).},
year = {2020},
url = {https://github.com/firmai/atspy/},
}
```