https://github.com/hdfgroup/hsds
Cloud-native, service based access to HDF data
- Host: GitHub
- URL: https://github.com/hdfgroup/hsds
- Owner: HDFGroup
- License: apache-2.0
- Created: 2016-07-27T16:04:12.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2024-04-22T10:37:20.000Z (9 months ago)
- Last Synced: 2024-04-22T11:30:42.290Z (9 months ago)
- Topics: asyncio, aws, data-analysis, docker, hdf5, multi-dimensional, python, scientific-data
- Language: Python
- Homepage: https://www.hdfgroup.org/solutions/hdf-kita/
- Size: 7.4 MB
- Stars: 120
- Watchers: 20
- Forks: 52
- Open Issues: 32
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hdfgroup/hsds)
# HSDS (Highly Scalable Data Service) - REST-based service for HDF5 data
## Introduction
HSDS is a web service that provides REST-based access to HDF5 data stores.
Data can be stored either in a POSIX file system or in object-based storage such as
AWS S3, Azure Blob Storage, or [MinIO](https://min.io).
HSDS can be run on a single machine, with or without Docker, or on a cluster using Kubernetes (or AKS on Microsoft Azure).
## Quick Start
### With GitHub Codespaces
Launch a Codespaces environment by clicking the banner __["Open in GitHub Codespaces"](https://codespaces.new/HDFGroup/hsds)__. Once the codespace is ready, type
`python testall.py` in the terminal window to run the test suite.
### On your desktop/laptop
Make sure you have Python 3 and pip installed, then:
1. Run the install: `$ ./build.sh --no-lint --no-docker` from the source tree, OR install from PyPI: `$ pip install hsds`
2. Create a directory the server will use to store data, for example: `$ mkdir ~/hsds_data`
3. Start the server: `$ hsds --root_dir ~/hsds_data`
4. Run the test suite. In a separate terminal:
   - Set user_name: `$ export USER_NAME=$USER`
   - Set user_password: `$ export USER_PASSWORD=$USER`
   - Set admin name: `$ export ADMIN_USERNAME=$USER`
   - Set admin password: `$ export ADMIN_PASSWORD=$USER`
   - Run test suite: `$ python testall.py --skip_unit`
5. (Optional) Install the h5pyd package for an h5py-compatible API and tool suite: https://github.com/HDFGroup/h5pyd
6. (Optional) Post-install setup (test data, home folders, CLI tools, etc.): [docs/post_install.md](docs/post_install.md)
To shut down the server when it is not running in Docker, press Control-C.
If using Docker, run: `$ ./stopall.sh`
Note: passwords can (and should for production use) be modified by changing values in hsds/admin/config/password.txt and rebuilding the docker image. Alternatively, an external identity provider such as Azure Active Directory or KeyCloak can be used. See: [docs/azure_ad_setup.md](docs/azure_ad_setup.md) for Azure AD setup instructions or [docs/keycloak_setup.md](docs/keycloak_setup.md) for KeyCloak.
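Step 4 of the desktop quick start can also be scripted. The following is a minimal sketch, not part of the HSDS tooling: the helper names `test_env` and `run_integration_tests` are hypothetical, and `getpass.getuser()` stands in for the shell's `$USER`. It sets the same four variables as the exports above and then runs `testall.py --skip_unit` from the source tree.

```python
import getpass
import os
import subprocess
import sys


def test_env(user=None, base_env=None):
    """Return a copy of the environment with the four test-suite variables set."""
    user = user or getpass.getuser()  # stand-in for $USER in the shell steps
    env = dict(os.environ if base_env is None else base_env)
    # As in the shell exports, the name and password both default to the login name.
    env.update(
        USER_NAME=user,
        USER_PASSWORD=user,
        ADMIN_USERNAME=user,
        ADMIN_PASSWORD=user,
    )
    return env


def run_integration_tests(source_dir="."):
    """Run `python testall.py --skip_unit` in source_dir and return its exit code."""
    cmd = [sys.executable, "testall.py", "--skip_unit"]
    return subprocess.run(cmd, cwd=source_dir, env=test_env(), check=False).returncode
```

Calling `run_integration_tests()` from the root of the source tree, with the server already running in another terminal, is equivalent to the manual steps.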
## Detailed Install Instructions
### On AWS
For complete instructions to install on a single AWS EC2 instance with Docker:
- See: [docs/docker_install_aws.md](docs/docker_install_aws.md)
For complete instructions to install on Amazon Elastic Kubernetes Service (EKS):
- See: [docs/kubernetes_install_aws.md](docs/kubernetes_install_aws.md)
For complete instructions to install on AWS Lambda:
- See: [docs/aws_lambda_setup.md](docs/aws_lambda_setup.md)
### On Azure
For complete instructions to install on a single Azure VM with Docker:
- See: [docs/docker_install_azure.md](docs/docker_install_azure.md)
For complete instructions to install on Azure Kubernetes Service (AKS):
- See: [docs/kubernetes_install_azure.md](docs/kubernetes_install_azure.md)
### On Prem (POSIX-based storage)
For complete instructions to install on a desktop or local server:
- See: [docs/docker_install_posix.md](docs/docker_install_posix.md)
### On DCOS (BETA)
For complete instructions to install on DCOS:
- See: [docs/docker_install_dcos.md](docs/docker_install_dcos.md)
## General Install Topics
Setting up docker:
- See [docs/setup_docker.md](docs/setup_docker.md)
Post install setup and testing:
- See [docs/post_install.md](docs/post_install.md)
Authorization, ACLs, and Role Based Access Control (RBAC):
- See [docs/authorization.md](docs/authorization.md)
## Writing Client Applications
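A raw REST request needs little more than an HTTP library. The sketch below uses only the Python standard library and builds a `GET` request for a domain's top-level JSON. The endpoint `http://localhost:5101`, the domain path, and the credentials are assumptions for illustration. The `domain` query parameter and HTTP Basic authentication reflect our understanding of the HSDS REST API rather than anything stated in this README, and the helper names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:5101"  # assumed address of a locally running server


def domain_request(domain, username, password, endpoint=ENDPOINT):
    """Build (but do not send) a GET request for a domain's top-level JSON."""
    # The domain path is passed as a URL-encoded query parameter (assumption).
    query = urllib.parse.urlencode({"domain": domain})
    req = urllib.request.Request(f"{endpoint}/?{query}", method="GET")
    # HTTP Basic auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/json")
    return req


def fetch_json(req, timeout=10):
    """Send the request and decode the JSON body; requires a running server."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With a server running, `fetch_json(domain_request("/home/test_user1/test.h5", "test_user1", "test"))` would return the parsed response. Building the request separately from sending it keeps the URL and header logic checkable without a server.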
Because HSDS is a REST service, clients can be developed using almost any programming language. The
test programs under hsds/test/integ illustrate some of the methods for performing
different operations using Python and the HSDS REST API (using the requests package).
The related project h5pyd (https://github.com/HDFGroup/h5pyd) provides a (mostly) h5py-compatible
interface to the server for Python clients.
For C/C++ clients, the HDF REST VOL is an HDF5 library plugin that enables the HDF5 API to read and write data
using HSDS. Note: requires v1.12.0 or greater of the HDF5 library.
## Uninstalling
HSDS only modifies the storage location that it is configured to use, so to uninstall, just remove
the source files, Docker images, and S3 bucket/Azure container/directory files.
## Reporting bugs (and general feedback)
Create new issues on the project's GitHub repository (https://github.com/hdfgroup/hsds) for any problems you find.
For general questions/feedback, please use the HSDS forum.
## License
HSDS is licensed under an Apache 2.0 license. See LICENSE in this directory.
## Azure Marketplace
VM Offer for Azure Marketplace. HSDS for Azure Marketplace provides an easy way to
set up an Azure instance with HSDS.
## Websites
- Main website:
- Source code: https://github.com/hdfgroup/hsds
- Forum:
- Documentation: (For REST API)
## Other useful resources
### HDF Group Blog Posts
- Web Caching
- HSDS Streaming
- Cloud Storage Options for HDF5
- HSDS Docker Images
- HSDS Container Types
- Using Multiprocessing in Python
- Biosimulations - case study with HSDS and Vega
- HSDS for Microsoft Azure
- New Features in HSDS v0.6
- HSDS Security
- HDF for the Web: HDF Server
### External Blogs and Articles
- A RESTful Meeting Between MATLAB and HDF Server
- AWS Big Data Blog
### Slide Decks
- HSDS v0.7 New Features, EUHUG 2022
- HSDS Serverless, EUHUG 2021
- HSDS REST, HUG 2020
- HSDS with Jupyter, ESIP 2018
- HDF Data Services, SciPy17
### Videos
- HSDS Webinar
- HSDS Overview, Allotrope Connect Day
- The Use of HSDS on SlideRule, HUG 2020
- HDF Data Services, SciPy 2017
- RESTful HDF, SciPy 2015
### Papers
- restfulSE: A semantically rich interface for cloud-scale genomics with Bioconductor
- RESTful HDF5 White Paper