https://github.com/maurosilber/dask-checkpoint
Customizable caching of dask.delayed results.
- Host: GitHub
- URL: https://github.com/maurosilber/dask-checkpoint
- Owner: maurosilber
- License: mit
- Created: 2020-08-19T20:22:05.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2024-07-01T17:00:48.000Z (almost 2 years ago)
- Last Synced: 2025-03-20T01:44:43.541Z (about 1 year ago)
- Topics: cache, checkpoint, dask, fsspec
- Language: Python
- Homepage:
- Size: 86.9 KB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Dask-checkpoint
Dask-checkpoint is a Python package
that adds customizable caching capabilities to [dask](https://dask.org).
It builds on top of `dask.delayed`,
adding load and save instructions
to the dask graph.
```python
from dask_checkpoint import Storage, task

storage = Storage.from_fsspec("my_directory")


@task(save=True)
def add_one(x):
    return x + 1


x0 = add_one(1).compute()  # computed

with storage():
    x1 = add_one(1).compute()  # computed and saved to storage
    x2 = add_one(1).compute()  # loaded from storage

x3 = add_one(1).compute()  # recomputed, not loaded from storage

assert x0 == x1 == x2 == x3
```
## Installation
Dask-checkpoint can be installed from PyPI:
```
pip install dask-checkpoint
```
## Getting started
Check out the [tutorial](examples/tutorial.ipynb) to see Dask-checkpoint in action.
## Development
To set up a development environment with conda,
run the following commands:
```
git clone https://github.com/maurosilber/dask-checkpoint
cd dask-checkpoint
conda env create -f environment-dev.yml
pre-commit install
```
Run tests locally with `tox`:
```
tox
```
or, if you have `mamba` installed:
```
CONDA_EXE=mamba tox
```