https://github.com/jmoralez/window_ops
Fast window operations
- Host: GitHub
- URL: https://github.com/jmoralez/window_ops
- Owner: jmoralez
- License: apache-2.0
- Created: 2020-10-20T18:49:42.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-06-02T03:51:48.000Z (12 months ago)
- Last Synced: 2024-12-27T05:06:15.580Z (5 months ago)
- Topics: expanding, numba, numpy, online, rolling
- Language: Python
- Homepage: https://jmoralez.github.io/window_ops/
- Size: 1.42 MB
- Stars: 38
- Watchers: 2
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Window ops
================

This library is intended to be used as an alternative to
`pd.Series.rolling` and `pd.Series.expanding` to gain a speedup by using
numba optimized functions operating on numpy arrays. There are also
online classes for more efficient updates of window statistics.

## Install
### PyPI
`pip install window-ops`
### conda
`conda install -c conda-forge window-ops`
## How to use
### Transformations
For a transformation `n_samples` -\> `n_samples` (one output value per
input value) you can use
`[seasonal_](rolling|expanding)_(mean|max|min|std)` on an array.

#### Benchmarks
``` python
import pandas as pd

pd.__version__
```

    '1.3.5'
``` python
n_samples = 10_000 # array size
window_size = 8 # for rolling operations
season_length = 7 # for seasonal operations
execute_times = 10 # number of times each function will be executed
```

Average times in milliseconds.
``` python
times.applymap('{:.2f}'.format)
```

|                         | window_ops | pandas |
|:------------------------|-----------:|-------:|
| rolling_mean            | 0.03       | 0.43   |
| rolling_max             | 0.14       | 0.57   |
| rolling_min             | 0.14       | 0.58   |
| rolling_std             | 0.06       | 0.54   |
| expanding_mean          | 0.03       | 0.31   |
| expanding_max           | 0.05       | 0.76   |
| expanding_min           | 0.05       | 0.47   |
| expanding_std           | 0.09       | 0.41   |
| seasonal_rolling_mean   | 0.05       | 3.89   |
| seasonal_rolling_max    | 0.18       | 4.27   |
| seasonal_rolling_min    | 0.18       | 3.75   |
| seasonal_rolling_std    | 0.08       | 4.38   |
| seasonal_expanding_mean | 0.04       | 3.18   |
| seasonal_expanding_max  | 0.06       | 3.29   |
| seasonal_expanding_min  | 0.06       | 3.28   |
| seasonal_expanding_std  | 0.12       | 3.89   |

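
The code that fills `times` is not shown here. As a rough sketch of how such averages could be gathered, the hypothetical helper below (`average_ms` is not part of this library) runs a function `execute_times` times with the standard-library `timeit` module and reports the mean in milliseconds. It makes one untimed warm-up call first, on the assumption that numba-compiled functions spend their first call compiling.

``` python
import timeit


def average_ms(func, *args, execute_times=10):
    """Average wall-clock time of func(*args), in milliseconds."""
    func(*args)  # warm-up call so one-off JIT compilation is not timed
    total_seconds = timeit.timeit(lambda: func(*args), number=execute_times)
    return total_seconds / execute_times * 1000


# `sorted` is only a stand-in for the function being benchmarked.
elapsed = average_ms(sorted, list(range(10_000)))
print(f"{elapsed:.2f} ms")
```
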
``` python
speedups = times['pandas'] / times['window_ops']
speedups = speedups.to_frame('times faster')
speedups.applymap('{:.0f}'.format)
```

|                         | times faster |
|:------------------------|-------------:|
| rolling_mean            | 15           |
| rolling_max             | 4            |
| rolling_min             | 4            |
| rolling_std             | 9            |
| expanding_mean          | 12           |
| expanding_max           | 15           |
| expanding_min           | 9            |
| expanding_std           | 4            |
| seasonal_rolling_mean   | 77           |
| seasonal_rolling_max    | 23           |
| seasonal_rolling_min    | 21           |
| seasonal_rolling_std    | 52           |
| seasonal_expanding_mean | 78           |
| seasonal_expanding_max  | 52           |
| seasonal_expanding_min  | 51           |
| seasonal_expanding_std  | 33           |

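
To make the benchmarked names concrete, here is a pure-Python sketch of what a rolling mean and a seasonal rolling mean are assumed to compute. It is a reference for the semantics only, not the library's numba implementation, and the `*_sketch` function names are made up for this illustration. Two assumptions are built in: positions whose window is not yet full yield `nan` (as `pd.Series.rolling` does by default), and the seasonal variant builds each window from values spaced `season_length` apart.

``` python
import math


def rolling_mean_sketch(values, window_size):
    """Mean of the last `window_size` values at each position (nan until full)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(math.nan)  # window not yet full
        else:
            window = values[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out


def seasonal_rolling_mean_sketch(values, season_length, window_size):
    """Rolling mean computed separately over each seasonal position."""
    out = [math.nan] * len(values)
    for offset in range(season_length):
        # Indices that share the same position within the season.
        idx = range(offset, len(values), season_length)
        season = [values[i] for i in idx]
        for i, stat in zip(idx, rolling_mean_sketch(season, window_size)):
            out[i] = stat
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(rolling_mean_sketch(data, window_size=2))
# [nan, 1.5, 2.5, 3.5, 4.5, 5.5]
print(seasonal_rolling_mean_sketch(data, season_length=2, window_size=2))
# [nan, nan, 2.0, 3.0, 4.0, 5.0]
```

Both functions return one value per input value, which is the `n_samples` -\> `n_samples` shape described above.
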
### Online
If you have an array for which you want to compute a window statistic
and then keep updating it as more samples come in, you can use the
classes in the `window_ops.online` module. They all have a
`fit_transform` method, which takes the array and returns the
transformations defined above, and an `update` method, which takes
a single value and returns the new statistic.

#### Benchmarks
Average time in milliseconds it takes to transform the array and perform
100 updates.

``` python
times.to_frame().applymap('{:.2f}'.format)
```

|                       | average time (ms) |
|:----------------------|------------------:|
| RollingMean           | 0.12              |
| RollingMax            | 0.23              |
| RollingMin            | 0.22              |
| RollingStd            | 0.32              |
| ExpandingMean         | 0.10              |
| ExpandingMax          | 0.07              |
| ExpandingMin          | 0.07              |
| ExpandingStd          | 0.17              |
| SeasonalRollingMean   | 0.28              |
| SeasonalRollingMax    | 0.35              |
| SeasonalRollingMin    | 0.38              |
| SeasonalRollingStd    | 0.42              |
| SeasonalExpandingMean | 0.17              |
| SeasonalExpandingMax  | 0.14              |
| SeasonalExpandingMin  | 0.15              |
| SeasonalExpandingStd  | 0.23              |
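
The `fit_transform` / `update` pattern can be illustrated with a small pure-Python stand-in. `RollingMeanSketch` is a hypothetical class written for this example, not the library's `RollingMean`. It only shows why an online update can be cheap: the window and a running total are kept as state, so each new value costs constant work instead of a pass over the whole array. It assumes that incomplete windows yield `nan`.

``` python
import math
from collections import deque


class RollingMeanSketch:
    """Illustrative online rolling mean; not the library's implementation."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque(maxlen=window_size)  # evicts the oldest value when full
        self.total = 0.0

    def _push(self, value):
        if len(self.window) == self.window_size:
            self.total -= self.window[0]  # drop the value about to be evicted
        self.window.append(value)
        self.total += value
        if len(self.window) < self.window_size:
            return math.nan
        return self.total / self.window_size

    def fit_transform(self, values):
        """Reset the state and return the rolling mean for every input value."""
        self.window.clear()
        self.total = 0.0
        return [self._push(v) for v in values]

    def update(self, value):
        """Consume one new value and return the updated rolling mean."""
        return self._push(value)


rm = RollingMeanSketch(window_size=3)
print(rm.fit_transform([1.0, 2.0, 3.0, 4.0]))  # [nan, nan, 2.0, 3.0]
print(rm.update(5.0))  # 4.0
```

A running total can accumulate floating-point error over very long streams, so a production implementation may choose a more careful update rule.
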