{"id":22832431,"url":"https://github.com/hdfgroup/hsds","last_synced_at":"2025-04-05T13:02:20.885Z","repository":{"id":37239413,"uuid":"64322709","full_name":"HDFGroup/hsds","owner":"HDFGroup","description":"Cloud-native, service based access to HDF data","archived":false,"fork":false,"pushed_at":"2024-04-22T10:37:20.000Z","size":7760,"stargazers_count":120,"open_issues_count":32,"forks_count":52,"subscribers_count":20,"default_branch":"master","last_synced_at":"2024-04-22T11:30:42.290Z","etag":null,"topics":["asyncio","aws","data-analysis","docker","hdf5","multi-dimensional","python","scientific-data"],"latest_commit_sha":null,"homepage":"https://www.hdfgroup.org/solutions/hdf-kita/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HDFGroup.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2016-07-27T16:04:12.000Z","updated_at":"2024-04-23T12:54:12.486Z","dependencies_parsed_at":"2023-12-04T08:23:45.317Z","dependency_job_id":"16f0139d-5fa7-4075-bdf6-bdb8a340c5ff","html_url":"https://github.com/HDFGroup/hsds","commit_stats":{"total_commits":1752,"total_committers":34,"mean_commits":"51.529411764705884","dds":"0.13299086757990863","last_synced_commit":"0aeaf2baabcbc128b86d8bfa62fca282bcc77b2a"},"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDFGroup%2Fhsds","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDFGroup%2Fhsds/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/HDFGroup%2Fhsds/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDFGroup%2Fhsds/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HDFGroup","download_url":"https://codeload.github.com/HDFGroup/hsds/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247339148,"owners_count":20923013,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asyncio","aws","data-analysis","docker","hdf5","multi-dimensional","python","scientific-data"],"created_at":"2024-12-12T21:07:31.951Z","updated_at":"2025-04-05T13:02:20.858Z","avatar_url":"https://github.com/HDFGroup.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hdfgroup/hsds)\n\n\n# HSDS (Highly Scalable Data Service) - REST-based service for HDF5 data\n\n## Introduction\n\nHSDS is a web service that provides a REST-based interface to HDF5 data stores.\nData can be stored either in a POSIX file system or in object-based storage such as\nAWS S3, Azure Blob Storage, or [MinIO](https://min.io).\nHSDS can be run on a single machine, with or without Docker, or on a cluster using Kubernetes (or AKS on Microsoft Azure).\n\n## Quick Start\n\n### With GitHub Codespaces\n\nLaunch a Codespaces environment by clicking the banner __[\"Open in GitHub Codespaces\"](https://codespaces.new/HDFGroup/hsds)__.  
Once the codespace is ready, type:\n`python testall.py` in the terminal window to run the test suite.\n\n### On your desktop/laptop\n\nMake sure you have Python 3 and pip installed, then:\n\n1.  Run install: `$ ./build.sh --no-lint --no-docker` from the source tree OR install from PyPI: `$ pip install hsds`\n2.  Create a directory the server will use to store data, for example: `$ mkdir ~/hsds_data`\n3.  Start server: `$ hsds --root_dir ~/hsds_data`\n4.  Run the test suite. In a separate terminal run:\n    -  Set user_name: `$ export USER_NAME=$USER`\n    -  Set user_password: `$ export USER_PASSWORD=$USER`\n    -  Set admin name: `$ export ADMIN_USERNAME=$USER`\n    -  Set admin password: `$ export ADMIN_PASSWORD=$USER`\n    -  Run test suite: `$ python testall.py --skip_unit`\n5. (Optional) Install the h5pyd package for an h5py-compatible API and tool suite: https://github.com/HDFGroup/h5pyd\n6. (Optional) Post install setup (test data, home folders, CLI tools, etc.): [docs/post_install.md](docs/post_install.md)\n\nTo shut down the server when it is not running in Docker, press Ctrl-C.\n\nIf using Docker, run: `$ ./stopall.sh`\n\nNote: passwords can (and should for production use) be modified by changing values in hsds/admin/config/password.txt and rebuilding the Docker image. Alternatively, an external identity provider such as Azure Active Directory or KeyCloak can be used. 
See: [docs/azure_ad_setup.md](docs/azure_ad_setup.md) for Azure AD setup instructions or [docs/keycloak_setup.md](docs/keycloak_setup.md) for KeyCloak.\n\n## Detailed Install Instructions\n\n### On AWS\n\nFor complete instructions to install on a single AWS EC2 instance with Docker:\n\n- See: [docs/docker_install_aws.md](docs/docker_install_aws.md)\n\nFor complete instructions to install on Amazon Elastic Kubernetes Service (EKS):\n\n- See: [docs/kubernetes_install_aws.md](docs/kubernetes_install_aws.md)\n\nFor complete instructions to install on AWS Lambda:\n\n- See: [docs/aws_lambda_setup.md](docs/aws_lambda_setup.md).\n\n### On Azure\n\nFor complete instructions to install on a single Azure VM with Docker:\n\n- See: [docs/docker_install_azure.md](docs/docker_install_azure.md)\n\nFor complete instructions to install on Azure Kubernetes Service (AKS):\n\n- See: [docs/kubernetes_install_azure.md](docs/kubernetes_install_azure.md)\n\n### On Prem (POSIX-based storage)\n\nFor complete instructions to install on a desktop or local server:\n\n- See: [docs/docker_install_posix.md](docs/docker_install_posix.md)\n\n### On DCOS (BETA)\n\nFor complete instructions to install on DCOS:\n\n- See: [docs/docker_install_dcos.md](docs/docker_install_dcos.md)\n\n## General Install Topics\n\nSetting up Docker:\n\n- See [docs/setup_docker.md](docs/setup_docker.md)\n\nPost install setup and testing:\n\n- See [docs/post_install.md](docs/post_install.md)\n\nAuthorization, ACLs, and Role Based Access Control (RBAC):\n\n- See [docs/authorization.md](docs/authorization.md)\n\n## Writing Client Applications\n\nAs a REST service, clients can be developed using almost any programming language. 
The\ntest programs under: hsds/test/integ illustrate some of the methods for performing\ndifferent operations using Python and the HSDS REST API (using the requests package).\n\nThe related project: \u003chttps://github.com/HDFGroup/h5pyd\u003e provides a (mostly) h5py-compatible\ninterface to the server for Python clients.\n\nFor C/C++ clients, the HDF REST VOL is an HDF5 library plugin that enables the HDF5 API to read and write data\nusing HSDS. See: \u003chttps://github.com/HDFGroup/vol-rest\u003e. Note: this requires v1.12.0 or later of the HDF5 library.\n\n## Uninstalling\n\nHSDS only modifies the storage location that it is configured to use, so to uninstall just remove\nsource files, Docker images, and S3 bucket/Azure Container/directory files.\n\n## Reporting bugs (and general feedback)\n\nCreate new issues at \u003chttp://github.com/HDFGroup/hsds/issues\u003e for any problems you find.\n\nFor general questions/feedback, please use the HSDS forum: \u003chttps://forum.hdfgroup.org/c/hsds\u003e.\n\n## License\n\nHSDS is licensed under the Apache 2.0 license. See LICENSE in this directory.\n\n## Azure Marketplace\n\nVM Offer for Azure Marketplace. HSDS for Azure Marketplace provides an easy way to\nset up an Azure instance with HSDS. 
See: \u003chttps://azuremarketplace.microsoft.com/en-us/marketplace/apps/thehdfgroup1616725197741.hsdsazurevm?tab=Overview\u003e for more information.\n\n## Websites\n\n- Main website: \u003chttps://www.hdfgroup.org/solutions/highly-scalable-data-service-hsds/\u003e\n- Source code: \u003chttps://github.com/HDFGroup/hsds\u003e\n- Forum: \u003chttps://forum.hdfgroup.org/c/hsds\u003e\n- Documentation: \u003chttps://support.hdfgroup.org/documentation/index.html\u003e \n- REST API: \u003chttps://github.com/HDFGroup/hdf-rest-api\u003e\n\n## Other useful resources\n\n### HDF Group Blog Posts\n\n- Web Caching: \u003chttps://www.hdfgroup.org/2022/10/improve-hdf5-performance-using-caching/\u003e\n- HSDS Streaming: \u003chttps://www.hdfgroup.org/2022/08/hsds-streaming/\u003e\n- Cloud Storage Options for HDF5: \u003chttps://www.hdfgroup.org/2022/08/cloud-storage-options-for-hdf5/\u003e\n- HSDS Docker Images: \u003chttps://www.hdfgroup.org/2022/07/hsds-docker-images/\u003e\n- HSDS Container Types: \u003chttps://www.hdfgroup.org/2022/07/deep-dive-hsds-container-types/\u003e\n- Using Multiprocessing in Python: \u003chttps://www.hdfgroup.org/2022/06/speed-up-cloud-access-using-multiprocessing/\u003e\n- Biosimulations - case study with HSDS and Vega: \u003chttps://www.hdfgroup.org/2022/02/biosimulations-a-platform-for-sharing-and-reusing-biological-simulations/\u003e\n- HSDS for Microsoft Azure: \u003chttps://www.hdfgroup.org/2021/08/hsds-for-azure/\u003e\n- New Features in HSDS v0.6: \u003chttps://www.hdfgroup.org/2020/10/new-features-in-hsds-version-0-6/\u003e\n- HSDS Security: \u003chttps://hdfgroup.org/wp/2015/12/serve-protect-web-security-hdf5\u003e\n- HDF for the Web: HDF Server: \u003chttps://www.hdfgroup.org/2015/04/hdf5-for-the-web-hdf-server/\u003e\n\n### External Blogs and Articles\n\n- A RESTful Meeting Between MATLAB and HDF Server: 
\u003chttps://www.mathworks.com/matlabcentral/fileexchange/59072-a-restful-meeting-between-matlab-and-hdf-server-web-based-hdf5-access-using-matlab\u003e\n- AWS Big Data Blog: \u003chttps://aws.amazon.com/blogs/big-data/power-from-wind-open-data-on-aws/\u003e\n\n### Slide Decks\n\n- HSDS v0.7 New Features, EUHUG 2022: \u003chttps://www.hdfgroup.org/wp-content/uploads/2022/05/HSDS_New_Feautres_7.0.pdf\u003e\n- HSDS Serverless, EUHUG 2021: \u003chttps://www.hdfgroup.org/wp-content/uploads/2021/07/ServerlessHSDS.pdf\u003e\n- HSDS REST, HUG 2020: \u003chttps://www.hdfgroup.org/wp-content/uploads/2020/10/HSDS_Rest_Service_HDF5_Readey.pdf\u003e\n- HSDS with Jupyter, ESIP 2018: \u003chttps://www.slideshare.net/HDFEOS/hdf-kita-lab-jupyterlab-hdf-service\u003e\n- HDF Data Services, SciPy17: \u003chttp://s3.amazonaws.com/hdfgroup/docs/hdf_data_services_scipy2017.pdf\u003e\n\n### Videos\n\n- HSDS Webinar: \u003chttps://www.youtube.com/watch?v=9b5TO7drqqE\u003e\n- HSDS Overview, Allotrope Connect Day: \u003chttps://www.youtube.com/watch?v=nRHXEkhlfZ0\u003e\n- The Use of HSDS on SlideRule, HUG 2020: \u003chttps://www.youtube.com/watch?v=i-KIoGqdEMg\u003e\n- HDF Data Services, SciPy 2017: \u003chttps://www.youtube.com/watch?v=EmnCz1Hg-VM\u003e\n- RESTful HDF, SciPy 2015: \u003chttps://www.youtube.com/watch?v=JSFZ3i3WcjQ\u003e\n\n### Papers\n\n- restfulSE: A semantically rich interface for cloud-scale genomics with Bioconductor: \u003chttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6392152\u003e\n- RESTful HDF5 White Paper: \u003chttps://www.hdfgroup.org/pubs/papers/RESTful_HDF5.pdf\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhdfgroup%2Fhsds","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhdfgroup%2Fhsds","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhdfgroup%2Fhsds/lists"}