https://github.com/observingclouds/slkspec
fsspec filesystem for stronglink tape archive
- Host: GitHub
- URL: https://github.com/observingclouds/slkspec
- Owner: observingClouds
- Created: 2022-11-07T11:32:01.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-09-20T05:04:55.000Z (over 1 year ago)
- Last Synced: 2023-09-20T06:45:37.020Z (over 1 year ago)
- Topics: fsspec, python, tape-archive
- Language: Python
- Homepage:
- Size: 96.7 KB
- Stars: 5
- Watchers: 2
- Forks: 2
- Open Issues: 10
Metadata Files:
- Readme: README.md
README
[![CI](https://github.com/observingClouds/slkspec/workflows/Tests/badge.svg?branch=main)](https://github.com/observingClouds/slkspec/actions?query=workflow%3ATests)
[![Linter](https://github.com/observingClouds/slkspec/workflows/Linter/badge.svg?branch=main)](https://github.com/observingClouds/slkspec/actions?query=workflow%3ALinter)

# slkspec
This is a work in progress! This repository showcases how the tape archive can be integrated into the scientific workflow.
Pull requests are welcome!
```python
import fsspec

# Open a file on the tape archive through the slk:// protocol and print its contents
with fsspec.open("slk:///arch/project/user/file", "r") as f:
    print(f.read())
```
### Loading datasets

```python
import fsspec
import xarray as xr

# slk_cache is the local directory where files retrieved from tape are stored
url = fsspec.open("slk:////arch/project/file.nc", slk_cache="/scratch/b/b12346").open()
dset = xr.open_dataset(url)
```

## Usage in combination with preffs
### Installation of additional requirements
```console
mamba env create
mamba activate slkspec
pip install .[preffs]
```

Open a Parquet-referenced Zarr store:
```python
import xarray as xr
ds = xr.open_zarr(
    "preffs::/path/to/preffs/data.preffs",
    storage_options={"preffs": {"prefix": "slk:///arch///slk/archive/prefix/"}},
)
```

Only the files needed for a requested dataset operation are retrieved from tape. Initially, only the files containing the metadata
(e.g. `.zattrs`, `.zmetadata`) and the coordinates (e.g. `time`) are requested. Once retrieved, files are saved at the path given in
`SLK_CACHE` and read directly from there on later accesses.
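
The retrieve-once behaviour described above can be illustrated with a small, self-contained sketch. This is not slkspec's actual implementation: the names `cached_path`, `fetch`, and `fake_retrieve` are hypothetical, and the cache layout (archive path mirrored below the cache directory) is an assumption made for illustration only.

```python
import tempfile
from pathlib import Path


def cached_path(remote_path: str, cache_dir: Path) -> Path:
    """Map an archive path into the cache directory (assumed layout)."""
    return cache_dir / remote_path.lstrip("/")


def fetch(remote_path: str, cache_dir: Path, retrieve) -> Path:
    """Return a local copy of remote_path, calling retrieve only on a cache miss."""
    local = cached_path(remote_path, cache_dir)
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        retrieve(remote_path, local)  # slow step: stands in for a tape recall
    return local


# Hypothetical stand-in for the tape retrieval; it records each call it receives.
calls = []


def fake_retrieve(remote_path: str, local: Path) -> None:
    calls.append(remote_path)
    local.write_text("data for " + remote_path)


cache_dir = Path(tempfile.mkdtemp())
first = fetch("/arch/project/file.nc", cache_dir, fake_retrieve)   # cache miss
second = fetch("/arch/project/file.nc", cache_dir, fake_retrieve)  # cache hit
```

In this sketch the second call returns the cached file without invoking the retrieval function again, which mirrors why repeated dataset operations do not trigger new tape recalls.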