https://github.com/philipperemy/arma-scipy-fit

Estimating coefficients of ARMA models with the Scipy package.
https://github.com/philipperemy/arma-scipy-fit


Host: GitHub
URL: https://github.com/philipperemy/arma-scipy-fit
Owner: philipperemy
Created: 2018-12-25T08:36:05.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2022-01-22T10:47:46.000Z (almost 3 years ago)
Last Synced: 2024-10-10T19:12:16.874Z (about 1 month ago)
Language: Python
Size: 574 KB
Stars: 6
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

# ARMA w. Scipy

Estimating coefficients of ARMA models with the Scipy package.

## Installation

[![Downloads](https://pepy.tech/badge/arma-scipy)](https://pepy.tech/project/arma-scipy)

[![Downloads](https://pepy.tech/badge/arma-scipy/month)](https://pepy.tech/project/arma-scipy)

```bash
pip install arma-scipy
```

Python import

```python
from arma_scipy import fit
```

## Motivation

Once p and q are chosen, ARMA models can in general be fitted by least squares regression to find the parameter values that minimize the error term. It is generally considered good practice to find the smallest values of p and q that provide an acceptable fit to the data. For a pure AR model, the Yule-Walker equations may be used to provide a fit. "Least squares" means that the overall solution minimizes the sum of the squares of the residuals made in the results of every single equation.
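To make the least-squares idea concrete, here is a minimal sketch for the pure AR case, using only NumPy. It is an illustration, not the code of this package: the helper names `simulate_ar` and `fit_ar_least_squares`, the AR(2) coefficients and the series length are all assumptions chosen for the example.

```python
import numpy as np

def simulate_ar(phi, n, rng):
    """Simulate x_t = phi_1 x_{t-1} + ... + phi_p x_{t-p} + e_t, e_t ~ N(0, 1)."""
    p = len(phi)
    x = np.zeros(n + p)
    e = rng.standard_normal(n + p)
    for t in range(p, n + p):
        # x[t - p:t][::-1] is (x_{t-1}, ..., x_{t-p})
        x[t] = np.dot(phi, x[t - p:t][::-1]) + e[t]
    return x[p:]

def fit_ar_least_squares(x, p):
    """Regress x_t on its p previous values and return the AR coefficients."""
    # Column k holds the lag-(k+1) values aligned with the targets x[p:].
    lags = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    coef, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)
    return coef

rng = np.random.default_rng(0)
true_phi = np.array([0.5, -0.3])  # hypothetical stationary AR(2)
series = simulate_ar(true_phi, 5000, rng)
estimated_phi = fit_ar_least_squares(series, p=2)
print("true:", true_phi, "estimated:", np.round(estimated_phi, 3))
```

With a moving-average part the residuals depend on the coefficients recursively, so the problem is no longer a plain linear regression, which is where a numerical optimizer comes in.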

The reasons behind this Scipy fit implementation are twofold:

- provide an alternative when the score function is not the MSE - **more important**

- provide a way to compare both fit methods (stat theory vs optimization) - less important

You can fit the coefficients of an `ARMA(4,4)` this way:

```bash
python generate_arma_process.py
python scipy_fit_data.py
```

Here is an example of such a fit:

```
################################################################################
Optimization terminated successfully.
         Current function value: 1.432208
         Iterations: 508
         Function evaluations: 788
Estimation of the coefficients with the scipy package:
[ 0.2235 -0.5872  0.3143 -0.1805  0.167  -0.0464  0.6528  0.224 ]
Estimation of the coefficients with the statsmodels.tsa (least squares) package:
[ 0.237  -0.4998  0.3467 -0.128   0.1542 -0.1467  0.6244  0.2245]
True ARMA coefficients:
[ 0.25 -0.5   0.35 -0.15  0.5  -0.4   0.78  0.32]
```

## Comparison

- It is no surprise that the `statsmodels` fit minimizes the score function: a least-squares fit targets the minimum mean squared error on the training set by construction, and it coincides with conditional maximum likelihood estimation when the errors are Gaussian.

- The Scipy `minimize` function does a reasonably good job of getting close to this minimum. However, because of the random nature of this optimization and the crucial choice of x0 (the initial values of the coefficients to optimize), several runs are needed to have a good chance of landing close to the global minimum. There is strong variability across runs, and a significant proportion of them do not even get close to the minimum. Over 200 runs, the average hovers ~10% above the expected minimum value, then starts to overfit. The best run, however, reaches 1.41807, a score extremely close to the target minimum score of 1.4179.

The conclusion of this comparison is that it makes little sense to fit the coefficients of an ARMA model with a numerical method like `scipy.optimize.minimize` when the score is the mean squared error. A traditional fit using `statsmodels` is preferable. Yet `scipy.optimize.minimize` can be useful when the score function is not the mean squared error. Finding a good x0 is challenging, and one way to get a smooth optimization is to estimate x0 with `statsmodels`. In other words, fit the coefficients with `statsmodels`, then optimize with `scipy.optimize.minimize` and your custom score function.
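The sensitivity to x0 can be explored with a sketch like the one below. It is not the repository's script: it uses an ARMA(1,1) instead of an ARMA(4,4) for brevity, and the helper names `simulate_arma11`, `one_step_predictions` and `mse`, the coefficients and the number of restarts are assumptions made for the example. The optimizer call itself is `scipy.optimize.minimize` with the gradient-free Nelder-Mead method.

```python
import numpy as np
from scipy.optimize import minimize

def simulate_arma11(phi, theta, n, rng):
    """Simulate x_t = phi * x_{t-1} + e_t + theta * e_{t-1}, e_t ~ N(0, 1)."""
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    return x

def one_step_predictions(params, x):
    """One-step-ahead predictions; residuals are rebuilt recursively from zero."""
    phi, theta = params
    pred = np.zeros(len(x))
    resid = np.zeros(len(x))
    with np.errstate(all="ignore"):  # extreme trial coefficients may overflow
        for t in range(1, len(x)):
            pred[t] = phi * x[t - 1] + theta * resid[t - 1]
            resid[t] = x[t] - pred[t]
    return pred

def mse(params, x):
    """Mean squared one-step error; non-finite values are mapped to a large penalty."""
    with np.errstate(all="ignore"):
        value = np.mean((x - one_step_predictions(params, x)) ** 2)
    return value if np.isfinite(value) else 1e12

rng = np.random.default_rng(42)
x = simulate_arma11(phi=0.6, theta=0.3, n=500, rng=rng)

# Restart the optimizer from several random x0 and keep the best run.
results = []
for _ in range(5):
    x0 = rng.uniform(-0.5, 0.5, size=2)
    results.append(minimize(mse, x0, args=(x,), method="Nelder-Mead"))

best = min(results, key=lambda r: r.fun)
for r in results:
    print("coefficients:", np.round(r.x, 3), "MSE:", round(r.fun, 4))
print("best run:", np.round(best.x, 3), round(best.fun, 4))
```

Replacing the random `x0` with coefficients estimated by a classical fit is the warm start suggested above.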

## Custom score function

Let's consider the custom score function that assigns:

- 1 if the prediction has the same sign as the true value.

- 0 otherwise.

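```python
import numpy as np

def score_function(p, t):
    # Share of predictions p whose sign differs from the true value t (lower is better).
    return 1 - np.mean(((p > 0) & (t > 0)) | ((p < 0) & (t < 0)))
```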
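As a quick sanity check of this score, here is a small worked example on hand-made arrays. The array values are hypothetical and chosen only so the result can be verified by eye.

```python
import numpy as np

def score_function(p, t):
    # Share of predictions whose sign does not match the true value.
    return 1 - np.mean(((p > 0) & (t > 0)) | ((p < 0) & (t < 0)))

predictions = np.array([1.0, -2.0, 3.0, -4.0])
truth = np.array([2.0, -1.0, -3.0, -5.0])
# Three of the four signs agree, so the score is 1 - 3/4.
print(score_function(predictions, truth))  # 0.25

# A prediction of exactly 0 matches neither branch and counts as a miss.
print(score_function(np.array([0.0]), np.array([1.0])))  # 1.0
```

Because the function returns one minus the hit rate, a lower score means more correctly predicted signs, which is what a minimizer needs.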

- The optimization of the ARMA coefficients is done first with `statsmodels`. This is our `x0`.

- The second optimization is done with `scipy` and the custom score function.

The second fit seems to add some value, as the overall score decreases from 0.2090 to 0.2084. This is a relative reduction of ~0.29% in the custom score. Concretely, after the first fit, the model correctly predicts the sign of the true value 79.10% of the time. After the second fit, this percentage goes up to 79.16%, thanks to Scipy.
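The two-stage procedure can be sketched as follows. This is an illustration under stated assumptions, not the code that produced the numbers above: the first stage uses a Scipy mean-squared-error fit as a stand-in for the `statsmodels` fit so that the example stays self-contained, the process is an ARMA(1,1), and the helper names (`simulate_arma11`, `one_step_predictions`, `mse`, `sign_score`) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def simulate_arma11(phi, theta, n, rng):
    """Simulate x_t = phi * x_{t-1} + e_t + theta * e_{t-1}, e_t ~ N(0, 1)."""
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    return x

def one_step_predictions(params, x):
    """One-step-ahead predictions; residuals are rebuilt recursively from zero."""
    phi, theta = params
    pred = np.zeros(len(x))
    resid = np.zeros(len(x))
    with np.errstate(all="ignore"):  # extreme trial coefficients may overflow
        for t in range(1, len(x)):
            pred[t] = phi * x[t - 1] + theta * resid[t - 1]
            resid[t] = x[t] - pred[t]
    return pred

def mse(params, x):
    """Mean squared one-step error; non-finite values are mapped to a large penalty."""
    with np.errstate(all="ignore"):
        value = np.mean((x - one_step_predictions(params, x)) ** 2)
    return value if np.isfinite(value) else 1e12

def sign_score(params, x):
    """Share of one-step predictions whose sign differs from the observed value."""
    p = one_step_predictions(params, x)
    return 1 - np.mean(((p > 0) & (x > 0)) | ((p < 0) & (x < 0)))

rng = np.random.default_rng(7)
x = simulate_arma11(phi=0.6, theta=0.3, n=500, rng=rng)

# Stage 1: mean-squared-error fit, used here as the starting point x0.
stage1 = minimize(mse, x0=np.array([0.1, 0.1]), args=(x,), method="Nelder-Mead")
score_before = sign_score(stage1.x, x)

# Stage 2: refine from x0 with the custom score. Nelder-Mead is gradient-free,
# which matters because the sign score is piecewise constant.
stage2 = minimize(sign_score, x0=stage1.x, args=(x,), method="Nelder-Mead")

print("score after stage 1:", round(score_before, 4))
print("score after stage 2:", round(stage2.fun, 4))
```

Since the starting point is part of the initial Nelder-Mead simplex, the second stage should never report a worse in-sample score than the first; whether it actually improves, and whether the improvement holds out of sample, depends on the data.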