AtsPy: Automated Time Series Models in Python (by @firmai)
- Host: GitHub
- URL: https://github.com/firmai/atspy
- Owner: firmai
- Created: 2020-01-28T05:00:10.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-21T21:55:23.000Z (almost 2 years ago)
- Last Synced: 2024-04-25T01:42:06.792Z (7 months ago)
- Topics: automated, finance, forecasting, forecasting-models, python, time-series, time-series-analysis
- Language: Python
- Homepage: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3580631
- Size: 792 KB
- Stars: 510
- Watchers: 21
- Forks: 89
- Open Issues: 26
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-python-machine-learning-resources - GitHub (90% open issues · ⏱️ 18.12.2021) - (time series)
- StarryDivineSky - firmai/atspy
README
# Automated Time Series Models in Python (AtsPy)
[![Downloads](https://pepy.tech/badge/atspy)](https://pepy.tech/project/atspy)
[![DOI](https://zenodo.org/badge/236661502.svg)](https://zenodo.org/badge/latestdoi/236661502)
---------
Finance Quant Machine Learning
------------------
- [ML-Quant.com](https://www.ml-quant.com/) - Automated Research Repository

---------
[SSRN Report](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3580631)
Easily develop state-of-the-art time series models to forecast univariate data series. Simply load your data and select which models you want to test. This is the largest repository of automated structural and machine learning time series models. Please get in contact if you want to contribute a model. This is a fledgling project; all advice is appreciated.
#### Install
```
pip install atspy
```

#### Automated Models
1. ```ARIMA``` - Automated ARIMA Modelling
1. ```Prophet``` - Modeling Multiple Seasonality With Linear or Non-linear Growth
1. ```HWAAS``` - Exponential Smoothing With Additive Trend and Additive Seasonality
1. ```HWAMS``` - Exponential Smoothing with Additive Trend and Multiplicative Seasonality
1. ```NBEATS``` - Neural basis expansion analysis (now fixed at 20 Epochs)
1. ```Gluonts``` - RNN-based Model (now fixed at 20 Epochs)
1. ```TATS``` - Seasonal and Trend no Box Cox
1. ```TBAT``` - Trend and Box Cox
1. ```TBATS1``` - Trend, Seasonal (one), and Box Cox
1. ```TBATP1``` - TBATS1 but Seasonal Inference is Hardcoded by Periodicity
1. ```TBATS2``` - TBATS1 With Two Seasonal Periods

#### Why AtsPy?
1. Implements all your favourite automated time series models in a unified manner by simply running ```AutomatedModel(df)```.
1. Reduce structural model errors by 30%-50% by using LightGBM with TSFresh-infused features.
1. Automatically identify the seasonalities in your data using singular spectrum analysis, periodograms, and peak analysis.
1. Identifies and makes accessible the best model for your time series using in-sample validation methods.
1. Combines the predictions of all these models in simple (average) and complex (GBM) ensembles for improved performance.
1. Where appropriate, models have been developed to use GPU resources to speed up the automation process.
1. Easily access all the models by using ```am.models_dict_in``` for in-sample and ```am.models_dict_out``` for out-of-sample prediction.

#### AtsPy Progress
1. Univariate forecasting only (single column); only monthly and daily data have been tested for suitability.
1. More work is ahead; all suggestions and criticisms are appreciated, so please use the issues tab.
1. **Here** is a **[Google Colab](https://colab.research.google.com/drive/1WzwxUlAKg-WiEm_SleAzBIV6rs5VY_3W)** to run the package in the cloud and **[here you can run all the models](https://colab.research.google.com/drive/14QVrnVtT434s-xYcalHFlQg-o658nekv)**.

### Documentation by Example
----------
#### Load Package
```python
from atspy import AutomatedModel
```

#### Pandas DataFrame
The data requires strict preprocessing: no periods can be skipped and there cannot be any empty values.
```python
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/firmai/random-assets-two/master/ts/monthly-beer-australia.csv")
df.Month = pd.to_datetime(df.Month)
df = df.set_index("Month"); df
```
| Month | Megaliters |
|---|---|
| 1956-01-01 | 93.2 |
| 1956-02-01 | 96.0 |
| 1956-03-01 | 95.2 |
| 1956-04-01 | 77.1 |
| 1956-05-01 | 70.9 |
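If your raw series does have gaps or missing values, a minimal cleanup sketch along these lines can enforce a continuous index before modelling. This is plain pandas, not part of the AtsPy API, and the month-start frequency string ```"MS"``` is an assumption that matches the monthly beer dataset above:

```python
import pandas as pd

# Reindex to a continuous month-start frequency; skipped periods show up as NaN.
df = df.asfreq("MS")

# Fill any remaining gaps so there are no empty values (linear interpolation is one simple choice).
df["Megaliters"] = df["Megaliters"].interpolate(method="linear")

# Sanity check before handing the frame to AutomatedModel.
assert df.index.is_monotonic_increasing
assert not df["Megaliters"].isna().any()
```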
#### AutomatedModel
1. ```AutomatedModel``` - Returns a class instance.
1. ```forecast_insample``` - Returns an in-sample forecasted dataframe and performance.
1. ```forecast_outsample``` - Returns an out-of-sample forecasted dataframe.
1. ```ensemble``` - Returns the results of three different forms of ensembles.
1. ```models_dict_in``` - Returns a dictionary of the fully trained in-sample models.
1. ```models_dict_out``` - Returns a dictionary of the fully trained out-of-sample models.

```python
from atspy import AutomatedModel
model_list = ["HWAMS","HWAAS","TBAT"]
am = AutomatedModel(df=df, model_list=model_list, forecast_len=20)
```

Other models to try, **add as many as you like**; note ```ARIMA``` is slow: ```["ARIMA","Gluonts","Prophet","NBEATS", "TATS", "TBATS1", "TBATP1", "TBATS2"]```
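For instance, a broader run over more of the models listed above might look like the sketch below (same ```AutomatedModel``` call as before; expect ```ARIMA``` and the neural models to take considerably longer to train):

```python
from atspy import AutomatedModel

# A larger selection of models; all names come from the list above.
model_list = ["ARIMA", "Prophet", "HWAMS", "HWAAS", "TBAT", "NBEATS"]
am = AutomatedModel(df=df, model_list=model_list, forecast_len=20)
```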
#### In-Sample Performance
```python
forecast_in, performance = am.forecast_insample(); forecast_in
```
| Date | Target | HWAMS | HWAAS | TBAT |
|---|---|---|---|---|
| 1985-10-01 | 181.6 | 161.962148 | 162.391653 | 148.410071 |
| 1985-11-01 | 182.0 | 174.688055 | 173.191756 | 147.999237 |
| 1985-12-01 | 190.0 | 189.728744 | 187.649575 | 147.589541 |
| 1986-01-01 | 161.2 | 155.077205 | 154.817215 | 147.180980 |
| 1986-02-01 | 155.5 | 148.054292 | 147.477692 | 146.773549 |
```python
performance
```
| | Target | HWAMS | HWAAS | TBAT |
|---|---|---|---|---|
| rmse | 0.000000 | 17.599400 | 18.993827 | 36.538009 |
| mse | 0.000000 | 309.738878 | 360.765452 | 1335.026136 |
| mean | 155.293277 | 142.399639 | 140.577496 | 126.590412 |
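The error metrics can be reproduced from ```forecast_in``` with plain pandas and numpy; the snippet below is only an illustrative cross-check, not AtsPy code, and assumes the column names shown in the tables above:

```python
import numpy as np

# Recompute the per-model RMSE against the Target column of the in-sample forecasts.
for col in ["HWAMS", "HWAAS", "TBAT"]:
    rmse = np.sqrt(((forecast_in[col] - forecast_in["Target"]) ** 2).mean())
    print(f"{col}: RMSE = {rmse:.4f}")
```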
#### Out-of-Sample Forecast
```python
forecast_out = am.forecast_outsample(); forecast_out
```
| Date | HWAMS | HWAAS | TBAT |
|---|---|---|---|
| 1995-09-01 | 137.518755 | 137.133938 | 142.906275 |
| 1995-10-01 | 164.136220 | 165.079612 | 142.865575 |
| 1995-11-01 | 178.671684 | 180.009560 | 142.827110 |
| 1995-12-01 | 184.175954 | 185.715043 | 142.790757 |
| 1996-01-01 | 147.166448 | 147.440026 | 142.756399 |
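A naive equal-weight average of these columns is a handy baseline to keep in mind before looking at AtsPy's built-in ensembles in the next step. This is just a pandas sketch that adds a hypothetical ```simple_average``` column; it is not the ```ensemble``` method itself:

```python
# Equal-weight combination of the out-of-sample model forecasts as a simple baseline.
forecast_out["simple_average"] = forecast_out[["HWAMS", "HWAAS", "TBAT"]].mean(axis=1)
forecast_out.head()
```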
#### Ensemble and Model Validation Performance
```python
all_ensemble_in, all_ensemble_out, all_performance = am.ensemble(forecast_in, forecast_out)
```

```python
all_performance
```
| | rmse | mse | mean |
|---|---|---|---|
| ensemble_lgb__X__HWAMS | 9.697588 | 94.043213 | 146.719412 |
| ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS__X__ensemble_ts__X__HWAAS | 9.875212 | 97.519817 | 145.250837 |
| ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS | 11.127326 | 123.817378 | 142.994374 |
| ensemble_lgb | 12.748526 | 162.524907 | 156.487208 |
| ensemble_lgb__X__HWAMS__X__HWAMS_HWAAS__X__ensemble_ts__X__HWAAS__X__HWAMS_HWAAS_TBAT__X__TBAT | 14.589155 | 212.843442 | 138.615567 |
| HWAMS | 15.567905 | 242.359663 | 136.951615 |
| HWAMS_HWAAS | 16.651370 | 277.268110 | 135.544299 |
| ensemble_ts | 17.255107 | 297.738716 | 163.134079 |
| HWAAS | 17.804066 | 316.984751 | 134.136983 |
| HWAMS_HWAAS_TBAT | 23.358758 | 545.631579 | 128.785846 |
| TBAT | 39.003864 | 1521.301380 | 115.268940 |
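To pick out the best-performing combination programmatically, something along these lines works on the frame above (it assumes, as shown, that the model and ensemble names form the index and that ```rmse``` is a column):

```python
# Name of the row with the lowest validation RMSE.
best_name = all_performance["rmse"].idxmin()
print(best_name)  # 'ensemble_lgb__X__HWAMS' for the run shown above
```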
#### Best Performing In-sample
```python
all_ensemble_in[["Target","ensemble_lgb__X__HWAMS","HWAMS","HWAAS"]].plot()
```
![png](atspy_files/insample.png)

#### Future Predictions All Models
```python
all_ensemble_out[["ensemble_lgb__X__HWAMS","HWAMS","HWAAS"]].plot()
```
![png](atspy_files/outsample.png)

#### And Finally Grab the Models
```python
am.models_dict_in
```

```
{'HWAAS': ..., 'HWAMS': ..., 'TBAT': ...}
```

```python
am.models_dict_out
```

```
{'HWAAS': ..., 'HWAMS': ..., 'TBAT': ...}
```

Follow [this link](https://colab.research.google.com/drive/1WzwxUlAKg-WiEm_SleAzBIV6rs5VY_3W) if you want to run the package in the cloud.
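The dictionary values are the fitted model objects themselves, so they can be inspected or reused directly; a quick look (plain Python, nothing AtsPy-specific) might be:

```python
# List which models were trained out-of-sample and what kind of object each one is.
for name, model in am.models_dict_out.items():
    print(name, type(model))
```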
#### AtsPy Future Development
1. Additional in-sample validation steps to stop deep learning models from over- and underfitting.
1. Extra performance metrics like MAPE and MAE.
1. Improved methods to select the window length to use in training and calibrating the model.
1. Add the ability to accept dirty data and clean it up (interpolation, etc.).
1. Add a function to resample to a larger frequency for big datasets.
1. Add the ability to algorithmically select a good enough chunk of a large dataset to balance performance and time to train.
1. More internal model optimisation using AIC, BIC, and AICC.
1. Code annotations for other developers to follow and improve on the work being done.
1. Force seasonality stability between in-sample and out-of-sample training models.
1. Make AtsPy less dependency-heavy; currently it draws on tensorflow, pytorch and mxnet.

## Citations
If you use AtsPy in your research, please consider citing it. I have also written a [small report](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3580631) that can be found on SSRN.
BibTeX entry:
```
@software{atspy,
title = {{AtsPy}: Automated Time Series Models in Python.},
author = {Snow, Derek},
url = {https://github.com/firmai/atspy/},
version = {1.15},
date = {2020-02-17},
}
```

```
@misc{atspy,
author = {Snow, Derek},
title = {{AtsPy}: Automated Time Series Models in Python (1.15).},
year = {2020},
url = {https://github.com/firmai/atspy/},
}
```