Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/WenjieZ/TSCV
Time Series Cross-Validation -- an extension for scikit-learn
backtesting cross-validation data-science hyperparameter-optimization machine-learning model-selection time-series tuning-parameters
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/WenjieZ/TSCV
- Owner: WenjieZ
- License: bsd-3-clause
- Created: 2019-05-14T09:10:38.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-01-23T18:52:33.000Z (about 2 years ago)
- Last Synced: 2024-10-14T11:35:30.520Z (4 months ago)
- Topics: backtesting, cross-validation, data-science, hyperparameter-optimization, machine-learning, model-selection, time-series, tuning-parameters
- Language: Python
- Homepage: https://tscv.readthedocs.io
- Size: 224 KB
- Stars: 246
- Watchers: 12
- Forks: 42
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-time-series - tscv - Time Series Cross-Validation - an extension for scikit-learn. (📦 Packages / Python)
README
# TSCV: Time Series Cross-Validation
This repository is a [scikit-learn](https://scikit-learn.org) extension for time series cross-validation.
It introduces **gaps** between the training set and the test set, which mitigates the temporal dependence of time series and prevents information leakage.

## Installation
```bash
pip install tscv
```
or
```bash
conda install -c conda-forge tscv
```

## Usage
This extension defines 3 cross-validator classes and 1 function:
- `GapLeavePOut`
- `GapKFold`
- `GapRollForward`
- `gap_train_test_split`

The three classes can all be passed, as the `cv` argument, to
scikit-learn functions such as `cross_validate`, `cross_val_score`,
and `cross_val_predict`, just like the native cross-validator classes.

The one function is an alternative to the `train_test_split` function in `scikit-learn`.
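
To make the role of the gaps concrete, here is a minimal pure-Python sketch of how a gapped K-fold split could choose its indices. It is an illustration only, not the package's implementation: the helper name `gap_kfold_indices` is hypothetical, and the fold boundaries are assumed to follow the usual contiguous, unshuffled K-fold layout.

```python
def gap_kfold_indices(n_samples, n_splits, gap_before=0, gap_after=0):
    """Yield (train, test) index lists for contiguous folds with gaps.

    Hypothetical helper for illustration; it mirrors the idea of leaving
    a gap on each side of the test fold, not tscv's actual code.
    """
    # Spread any remainder over the first folds (assumed K-fold layout).
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        # Exclude the test fold and the gap on each side from training.
        lo = max(0, start - gap_before)
        hi = min(n_samples, stop + gap_after)
        train = list(range(0, lo)) + list(range(hi, n_samples))
        yield train, test
        start = stop


for train, test in gap_kfold_indices(10, 5, gap_before=1, gap_after=1):
    print("train:", train, "test:", test)
```

In each split, the samples immediately before and after the test fold belong to neither set, so the model is never trained on observations adjacent in time to the ones it is scored on.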
## Examples
The following example uses `GapKFold` instead of `KFold` as the cross-validator.
```python
import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import cross_val_score
from tscv import GapKFold

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# use GapKFold as the cross-validator
cv = GapKFold(n_splits=5, gap_before=5, gap_after=5)
scores = cross_val_score(clf, iris.data, iris.target, cv=cv)
```

The following example uses `gap_train_test_split` to split the data set into the training set and the test set.
```python
import numpy as np
from tscv import gap_train_test_split

X, y = np.arange(20).reshape((10, 2)), np.arange(10)
X_train, X_test, y_train, y_test = gap_train_test_split(X, y, test_size=2, gap_size=2)
```

## Contributing
- Report bugs in the issue tracker
- Express your use cases in the issue tracker

## Documentation
- [tscv.readthedocs.io](https://tscv.readthedocs.io)

## Acknowledgments
- I would like to thank Jeffrey Racine and Christoph Bergmeir for the helpful discussion.
## License
BSD-3-Clause

## Citation
Wenjie Zheng. (2021). Time Series Cross-Validation (TSCV): an extension for scikit-learn. Zenodo. http://doi.org/10.5281/zenodo.4707309
```bibtex
@software{zheng_2021_4707309,
title={{Time Series Cross-Validation (TSCV): an extension for scikit-learn}},
author={Zheng, Wenjie},
month={april},
year={2021},
publisher={Zenodo},
doi={10.5281/zenodo.4707309},
url={http://doi.org/10.5281/zenodo.4707309}
}
```