https://github.com/aplbrain/bossphorus-python
bossphorus is a simple volumetric datastore for dense 3D data.
- Host: GitHub
- URL: https://github.com/aplbrain/bossphorus-python
- Owner: aplbrain
- License: apache-2.0
- Created: 2018-03-12T14:54:50.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2020-02-23T21:58:42.000Z (almost 6 years ago)
- Last Synced: 2025-03-15T09:22:42.406Z (9 months ago)
- Topics: 3d, bossdb, dense-3d-data, docker, jhuapl, neuroscience, volume
- Language: Python
- Homepage: https://aplbrain.github.io/bossphorus/#
- Size: 2.95 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
b o s s p h o r u s
a simple volumetric datastore for dense 3D data
> **WARNING!** *Bossphorus* is **NOT** stable and **NOT** tested. Use at your own risk, and always keep a backup copy of your data someplace safe.
## bossDB Feature Parity
For more information, see our [Features](features.md) page.
## Why use bossphorus?
*bossphorus* simplifies data-access patterns for data that do not fit into RAM. When you write a 100-gigabyte file, *bossphorus* automatically slices your dataset into bite-sized pieces.
When you request small pieces of your data for analysis, *bossphorus* intelligently serves only the parts you need, leaving the rest on disk.
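The idea can be illustrated with a minimal sketch. This is not *bossphorus*'s actual implementation; the function name `blocks_for_cutout` and the `(256, 256, 256)` block shape are assumptions made for illustration. Given a requested cutout, it works out which fixed-size blocks overlap the request, so only those blocks would need to be read from disk:

```python
from itertools import product


def blocks_for_cutout(xs, ys, zs, block_size=(256, 256, 256)):
    """Return the (i, j, k) indices of every block that overlaps a cutout.

    xs, ys, zs are half-open (start, stop) voxel ranges.
    block_size is a hypothetical block shape, not read from bossphorus.
    """
    ranges = []
    for (start, stop), size in zip((xs, ys, zs), block_size):
        if stop <= start:
            raise ValueError("each range must satisfy start < stop")
        # First and last block index touched along this axis.
        ranges.append(range(start // size, (stop - 1) // size + 1))
    return list(product(*ranges))


# A 100 x 100 x 10 cutout that straddles one block boundary along x:
touched = blocks_for_cutout((200, 300), (0, 100), (0, 10))
print(touched)  # [(0, 0, 0), (1, 0, 0)]
```

Only two blocks are touched here, no matter how large the full dataset is.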
## Usage
You can either run *bossphorus* using Python on your host machine, or use the provided Dockerfile to run *bossphorus* in a Docker container.
### Docker Method (Preferred)
#### 1. Build the Docker image
```shell
docker build -t bossphorus .
```
#### 2. Create a directory for your filesystem to live in.
```shell
mkdir ./uploads
```
#### 3. Source the provided alias file.
This exposes a simplified wrapper to run *bossphorus* in a container.
```shell
source alias
```
#### 4. Run *bossphorus*!
```shell
bossphorus $(pwd)/uploads
```
You can run *bossphorus* in demo-mode by omitting the path to your uploads directory. **Data saved to bossphorus using this method will be destroyed when you end the bossphorus process!** Use only when testing *bossphorus* out.
### Native Method
```shell
pipenv install
mkdir ./uploads
python3 ./run.py
```
#### pip Method
```shell
pip3 install -U bossphorus
```
## Configuration
You can modify the top-level variables in `bossphorus/config.py` in order to change where bossphorus stores its data by default, and what size each file is by default.
A word of warning: while larger values of `BLOCK_SIZE` reduce the number of parallel threads needed to read a small file, they also increase RAM usage per read. 256³ is probably a good default, unless you have a very good reason to change it.
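To get a feel for this trade-off, here is a rough back-of-the-envelope sketch. It assumes cubic blocks, one byte per voxel, a cutout aligned to the block grid, and that whole blocks are loaded on every read; the helper `read_cost` is hypothetical and not part of *bossphorus*.

```python
import math


def read_cost(block_edge, cutout_shape, bytes_per_voxel=1):
    """Estimate blocks touched and bytes loaded for a grid-aligned cutout.

    Assumes cubic blocks of block_edge voxels per side and that every
    touched block is loaded in full (an assumption, not measured behavior).
    """
    blocks = math.prod(math.ceil(extent / block_edge) for extent in cutout_shape)
    bytes_loaded = blocks * block_edge ** 3 * bytes_per_voxel
    return blocks, bytes_loaded


# Cost of reading a 512 x 512 x 16 cutout at three block sizes:
for edge in (64, 256, 1024):
    blocks, loaded = read_cost(edge, (512, 512, 16))
    print(f"{edge}^3 blocks: {blocks} block reads, {loaded / 2**20:.0f} MiB loaded")
```

Under these assumptions, small blocks mean many separate reads, while very large blocks pull far more data into RAM than the cutout itself contains.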
---
### Why bossphorus instead of other volumetric services?
That's a great question! *bossphorus* is certainly not the most performant, nor is it the most secure. And it's not versioned or distributed. If you're looking for a volumetric datastore, I would recommend looking below at the _Alternatives_ section for some really well-engineered systems.
The primary advantage of *bossphorus* is that it uses an identical API to that of [bossDB](https://bossdb.org). If you anticipate your data growing from a few gigabytes now to a few terabytes later, you can get used to the bossDB ecosystem ([intern](https://github.com/jhuapl-boss/intern), [ingest](https://github.com/jhuapl-boss/ingest-client), and [many more tools](https://github.com/aplbrain/)) _now_, and then invest in real bossDB architecture later on with a seamless transition.
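As a sketch of what "identical API" could mean in practice, the snippet below builds a bossDB-style cutout URL. The route layout (`/v1/cutout/<collection>/<experiment>/<channel>/<resolution>/<x>/<y>/<z>/`), both host names, and the collection, experiment, and channel names are assumptions for illustration; check the bossDB and *bossphorus* documentation for the routes your version actually serves.

```python
def cutout_url(host, collection, experiment, channel, resolution, xs, ys, zs):
    """Build a bossDB-style cutout URL (assumed route layout, see lead-in).

    xs, ys, zs are (start, stop) voxel ranges.
    """
    ranges = "/".join(f"{start}:{stop}" for start, stop in (xs, ys, zs))
    return (
        f"{host.rstrip('/')}/v1/cutout/"
        f"{collection}/{experiment}/{channel}/{resolution}/{ranges}/"
    )


# Hypothetical names; only the host differs between the two deployments.
args = ("my_col", "my_exp", "my_chan", 0, (0, 256), (0, 256), (0, 16))
local_url = cutout_url("http://localhost:5000", *args)
hosted_url = cutout_url("https://api.bossdb.io", *args)
print(local_url)
print(hosted_url)
```

If the routes match as assumed, client code written against a local instance would only need a different host to talk to a hosted bossDB.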
## Why is it called bossphorus?
*bossphorus* borrows its indexing pattern from _[bossDB](https://bossdb.org)_, a cloud-native database that can store way more data than *bossphorus* ever could. If your day-to-day routine includes multiple terabytes of volumetric data, [bossDB](https://bossdb.org) may be for you.
## Alternatives
| Project | Description | If you want... |
|---------|-------------|----------------|
| [bossDB](https://bossdb.org) | Petabyte-scale, Cloud-Native Volumetric Database | ...faster IO speed and infinite scalability |
| [DVID](https://github.com/janelia-flyem/dvid) | Distributed, Versioned, Image-oriented Dataservice | ...versioned data |
## Contributing
### Updating the Documentation
When you make any changes to outward-facing APIs or services, you must update the documentation. To do so, run the following:
```shell
cd website/ # enter the docusaurus dir
yarn # install dependencies
GIT_USER=XXXX yarn run publish-gh-pages # build and upload the documentation
```
-----
