https://github.com/maayanlab/archs4-api
An API for serving ARCHS4 Data from H5 over S3
- Host: GitHub
- URL: https://github.com/maayanlab/archs4-api
- Owner: MaayanLab
- License: other
- Created: 2021-07-01T18:11:39.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-09-13T15:39:02.000Z (over 2 years ago)
- Last Synced: 2025-01-22T03:15:14.794Z (3 months ago)
- Language: Python
- Size: 11 MB
- Stars: 3
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# archs4-api
An API for serving ARCHS4 Data from H5 over S3.
## Technical Details
supervisord launches and monitors the aiohttp processes, along with an nginx process that load-balances requests across them.
## Preparation
For best performance, the h5 file should be created and chunked in a specific way.
- metadata should be created before data tables, in the order that the application wants them:
  - meta/genes/gene_symbol
    - defaults might be fine, otherwise large chunks will probably speed things up
  - meta/samples/geo_accession
    - defaults might be fine, otherwise large chunks will probably speed things up
  - meta/samples/series_id
    - defaults might be fine, otherwise large chunks will probably speed things up
- data/expression
  - chunked using `(n_genes, avg_samples_per_request*2)`, assuming a shape of `(n_genes, n_samples)`
  - this allows an entire sample to appear in a single chunk. With a chunk size of `avg_samples_per_request*2`, each query will take one request to the backend on average (we fetch chunks individually).
- anything else
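
The layout above can be sketched with h5py as follows. This is a minimal illustration, not the project's own build script: the function names (`expression_chunk_shape`, `chunks_touched`, `write_archs4_h5`) and their parameters are hypothetical, and it assumes h5py 3.x with `expression` passed as a NumPy array of shape `(n_genes, n_samples)`.

```python
def expression_chunk_shape(n_genes, n_samples, avg_samples_per_request):
    """Chunk shape for data/expression, stored as (n_genes, n_samples)."""
    # Every chunk spans all genes, so one sample never straddles two chunks.
    # HDF5 rejects chunks larger than a fixed-size dataset, hence the min().
    width = max(1, min(n_samples, 2 * avg_samples_per_request))
    return (n_genes, width)


def chunks_touched(first_sample, n_requested, chunk_width):
    """Number of chunks covered by a contiguous run of sample columns."""
    first_chunk = first_sample // chunk_width
    last_chunk = (first_sample + n_requested - 1) // chunk_width
    return last_chunk - first_chunk + 1


def write_archs4_h5(path, gene_symbols, geo_accessions, series_ids,
                    expression, avg_samples_per_request):
    """Create the datasets in the order listed above (hypothetical helper)."""
    import h5py  # third-party; imported here so the helpers above have no dependencies

    n_genes, n_samples = expression.shape
    str_dtype = h5py.string_dtype()  # variable-length UTF-8 strings
    with h5py.File(path, "w") as f:
        # Metadata first, using h5py's default storage settings.
        f.create_dataset("meta/genes/gene_symbol", data=gene_symbols, dtype=str_dtype)
        f.create_dataset("meta/samples/geo_accession", data=geo_accessions, dtype=str_dtype)
        f.create_dataset("meta/samples/series_id", data=series_ids, dtype=str_dtype)
        # Expression matrix after the metadata, chunked by a window of samples.
        f.create_dataset(
            "data/expression",
            data=expression,
            chunks=expression_chunk_shape(n_genes, n_samples, avg_samples_per_request),
        )
```

As an illustration with made-up numbers, `expression_chunk_shape(35000, 500000, 50)` gives `(35000, 100)`, and a request for 50 adjacent samples starting at column 80 gives `chunks_touched(80, 50, 100) == 2`, because the run crosses one chunk boundary.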