https://github.com/micmurawski/cloud-array
cloud-array is an open-source Python library for storing and streaming large NumPy arrays on local file systems and major cloud providers' CDNs.
- Host: GitHub
- URL: https://github.com/micmurawski/cloud-array
- Owner: micmurawski
- Created: 2022-03-17T14:29:43.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-10-24T13:55:02.000Z (about 1 year ago)
- Last Synced: 2025-07-07T22:12:49.443Z (5 months ago)
- Topics: aws, azure, big-data, bigarray, blob-storage, cloud, data-structures, digitalocean-spaces, gcp, gcp-cloud-storage, ibm-cloud-object-storage, numpy, s3, stream-processing, streaming, zadara
- Language: Python
- Homepage:
- Size: 43 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Cloud Array
`cloud-array` is an open-source Python library for storing and streaming large NumPy arrays on local file systems and major cloud providers' CDNs. It automatically splits a large array into chunks of a user-specified shape and uploads them to the target directory.
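```python
import numpy as np
from cloud_array import CloudArray

shape = (10000, 100, 100)
chunk_shape = (10, 10, 10)

# Disk-backed array, so the full dataset never has to fit in memory.
f = np.memmap(
    'memmapped.dat',
    dtype=np.float32,
    mode='w+',
    shape=shape
)

# Describe how the array is chunked and where the chunks are stored.
array = CloudArray(
    chunk_shape=chunk_shape,
    array=f,
    url="s3://example_bucket/dataset0"
)

# Upload the chunks to the target location.
array.save()

# Read a slice back.
print(array[:100, :100, :100])
```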
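To make the chunking idea concrete, here is a minimal sketch of how an array shape can be partitioned into chunk-sized index ranges. It uses only the standard library and is an illustration under assumptions, not `cloud-array`'s actual implementation; the helper name `iter_chunk_slices` is hypothetical.

```python
import itertools
import math


def iter_chunk_slices(shape, chunk_shape):
    """Yield (chunk_index, slices) for every chunk needed to cover `shape`.

    Hypothetical helper for illustration only; not part of cloud-array.
    """
    # Number of chunks along each axis; the last chunk on an axis may be smaller.
    counts = [math.ceil(s / c) for s, c in zip(shape, chunk_shape)]
    for index in itertools.product(*(range(n) for n in counts)):
        slices = tuple(
            slice(i * c, min((i + 1) * c, s))
            for i, c, s in zip(index, chunk_shape, shape)
        )
        yield index, slices


# Small 2-D example: a (6, 4) shape cut into (4, 3) chunks.
chunk_slices = dict(iter_chunk_slices((6, 4), (4, 3)))
for index, slices in chunk_slices.items():
    print(index, slices)
```

Each tuple of slices can be used to index a NumPy array or memmap directly (for example `f[slices]`), so a chunk can be read and written on its own without loading the whole array.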
## Links
* https://pypi.org/project/cloud-array/