https://github.com/pwwang/pipen-gcs
- Host: GitHub
- URL: https://github.com/pwwang/pipen-gcs
- Owner: pwwang
- License: apache-2.0
- Created: 2024-07-23T07:04:35.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2024-12-19T04:27:57.000Z (15 days ago)
- Last Synced: 2024-12-19T05:23:01.755Z (15 days ago)
- Language: Python
- Size: 68.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# pipen-gcs
A plugin for [pipen][1] to handle files in Google Cloud Storage
## Installation
```bash
pip install -U pipen-gcs

# uninstall to disable
pip uninstall pipen-gcs
```

## Usage
```python
from pipen import Proc, Pipen


class MyProc(Proc):
    input = "infile:file"
    input_data = ["gs://bucket/path/to/file"]
    output = "outfile:file:gs://bucket/path/to/output"
    script = "cat {{in.infile}} > {{out.outfile}}"


class MyPipen(Pipen):
    starts = MyProc
    # input files/directories will be downloaded to /tmp
    # output files/directories will be generated in /tmp and then uploaded
    # to the cloud storage
    plugin_opts = {"gcs_localize": "/tmp"}


if __name__ == "__main__":
    MyPipen().run()
```

You can also disable localization; in that case you have to handle the
cloud storage files yourself.

```python
from pipen import Proc, Pipen


class MyProc(Proc):
    input = "infile:file"
    input_data = ["gs://bucket/path/to/file"]
    output = "outfile:file:gs://bucket/path/to/output"
    script = "gsutil cp {{in.infile}} {{out.outfile}}"


class MyPipen(Pipen):
    starts = MyProc
    plugin_opts = {"gcs_localize": False}


if __name__ == "__main__":
    MyPipen().run()
```

## Configuration
- `gcs_localize`: The local directory into which the cloud storage files
  are localized. If set to `False`, the files will not be localized.
  Default is `False`.
- `gcs_localize_force`: If set to `True`, the files will be localized
  even if they already exist locally. Default is `False`.
- `gcs_credentials`: The path to the Google Cloud Service Account
  credentials file.

[1]: https://github.com/pwwang/pipen
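
The three options above can be collected into a single `plugin_opts` dict. The following is a minimal sketch, not part of pipen-gcs itself: `make_gcs_plugin_opts` is a hypothetical helper, the credentials path is a placeholder, and it assumes that `gcs_localize_force` and `gcs_credentials` are passed through `plugin_opts` in the same way as `gcs_localize` in the usage examples.

```python
from typing import Any, Dict, Optional, Union


def make_gcs_plugin_opts(
    localize: Union[str, bool] = False,
    force: bool = False,
    credentials: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: gather the documented pipen-gcs options.

    Assumption: all three keys are read from `plugin_opts`, as
    `gcs_localize` is in the usage examples.
    """
    opts: Dict[str, Any] = {
        # A directory path enables localization; False disables it
        "gcs_localize": localize,
        # Re-localize files even if they already exist locally
        "gcs_localize_force": force,
    }
    if credentials is not None:
        # Path to a service account credentials file (placeholder value below)
        opts["gcs_credentials"] = credentials
    return opts


opts = make_gcs_plugin_opts(
    "/tmp",
    force=True,
    credentials="/path/to/service-account.json",
)
print(opts)
```

The resulting dict could then be assigned to `plugin_opts` on the `Pipen` subclass, in place of the literal dict used in the usage examples.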