https://github.com/markoelez/basicfs
Simple distributed key-value store for small files
distributed-storage distributed-systems
- Host: GitHub
- URL: https://github.com/markoelez/basicfs
- Owner: markoelez
- License: mit
- Created: 2020-08-20T16:27:24.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2024-02-01T00:05:25.000Z (over 2 years ago)
- Last Synced: 2024-02-02T00:33:40.011Z (over 2 years ago)
- Topics: distributed-storage, distributed-systems
- Language: Python
- Homepage:
- Size: 55.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# BasicFS
## Introduction
BasicFS is a very simple distributed key-value store optimized for small files (e.g. photos), inspired by Facebook's [Haystack](https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf) object store and [SeaweedFS](https://github.com/chrislusf/seaweedfs).
## Usage
By default, a volume server runs on port 9091. When running multiple volume servers, specify a distinct port for each one. The master server defaults to port 9090 and must be started with a comma-separated list of all volume server URLs.
### Build and Run Docker Image
```
./docker.sh
```
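Dependencies are tracked and installed using [Pipenv](https://pipenv.pypa.io/en/stable/) and a [Pipfile](https://github.com/pypa/pipfile).
Dependencies can be installed manually using the command below. On recent Pipenv releases, where `pipenv lock --requirements` has been removed, `pipenv requirements` is the equivalent command.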
```
pipenv lock --requirements > requirements.txt && pip install -r requirements.txt
```
### Start Two Volume Servers
```
PORT=9090 VOLUME=/tmp/v1 ./scripts/volume
PORT=9091 VOLUME=/tmp/v2 ./scripts/volume
```
### Start Master Server
The number of volume servers must be at least the number of replicas.
```
PORT=9092 DB=/tmp/db REPLICAS=2 ./scripts/master localhost:9090,localhost:9091
```
### Write File
To write a file, send an HTTP PUT request containing the file data to the master server.
```
curl -X PUT -d filedata localhost:9092/file_id
```
### Read File
To read a file, send an HTTP GET request for the `file_id` to the master server.
```
curl localhost:9092/file_id
```
You can also use this URL to read directly from a volume server:
```
http://localhost:9090/file_id
```
### Delete File
To delete a file, send an HTTP DELETE request for the `file_id` to the master server.
```
curl -X DELETE localhost:9092/file_id
```
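The three `curl` calls above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the master is reachable at `localhost:9092` as in the example, and the helper names (`build_request`, `put_file`, `get_file`, `delete_file`) are hypothetical, not part of BasicFS.

```python
import urllib.request
from typing import Optional

# Assumption: the master address used in the examples above.
MASTER = "http://localhost:9092"


def build_request(method: str, file_id: str, data: Optional[bytes] = None,
                  base: str = MASTER) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for the given file_id."""
    return urllib.request.Request(f"{base}/{file_id}", data=data, method=method)


def put_file(file_id: str, data: bytes) -> int:
    """PUT the file data to the master and return the HTTP status code."""
    with urllib.request.urlopen(build_request("PUT", file_id, data)) as resp:
        return resp.status


def get_file(file_id: str) -> bytes:
    """GET the stored bytes for file_id."""
    with urllib.request.urlopen(build_request("GET", file_id)) as resp:
        return resp.read()


def delete_file(file_id: str) -> int:
    """DELETE file_id and return the HTTP status code."""
    with urllib.request.urlopen(build_request("DELETE", file_id)) as resp:
        return resp.status
```

With the servers from the earlier steps running, `put_file("file_id", b"filedata")` followed by `get_file("file_id")` mirrors the write and read commands shown above.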
## Architecture
BasicFS is designed to handle small files efficiently.
Currently, the master maps each `file_id` directly to volume servers. Eventually, this should change so that the master is only aware of `volumeID`s, which are in turn mapped to their respective URLs. Since objects are written once and read often, the `file_id` to `volumeID` mapping should be cached in a local database after the initial write and reused for subsequent GET requests. Uploaded key/value pairs are replicated across the specified volume servers according to the user-specified replication protocol.
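The placement and caching idea described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the BasicFS implementation: the class name `MasterIndex` is hypothetical, a plain dictionary stands in for the local database, and volumes are chosen at random (the Todo list below mentions random volume selection).

```python
import random


class MasterIndex:
    """Toy stand-in for the master's file_id -> volume server mapping."""

    def __init__(self, volumes: list[str], replicas: int) -> None:
        self.volumes = list(volumes)
        self.replicas = replicas
        # Assumption: a dict replaces the on-disk database for illustration.
        self.index: dict[str, list[str]] = {}

    def place(self, file_id: str) -> list[str]:
        """Pick replica targets for a new file and cache the mapping."""
        # random.sample returns distinct volumes and raises ValueError
        # when replicas exceeds the number of available volumes.
        targets = random.sample(self.volumes, self.replicas)
        self.index[file_id] = targets
        return targets

    def locate(self, file_id: str) -> list[str]:
        """Return the cached volume URLs; used on every subsequent GET."""
        return self.index[file_id]

    def remove(self, file_id: str) -> list[str]:
        """Forget a file and return the volumes that held its replicas."""
        return self.index.pop(file_id)
```

A write would call `place` once and send the data to each returned volume; reads only need `locate`, which is why caching the mapping after the initial write pays off for write-once, read-often objects.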
## Todo
- Consistent hashing instead of random volume selection
- Raft consensus protocol
- RPC communication between the master and volume servers (using gRPC and Protocol Buffers)
- Allow for incorporation of additional volumes to master index (using rebuild, RPC heartbeat)
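As an illustration of the first item, consistent hashing could replace random selection roughly as follows. This is a hedged sketch of the general technique, not planned BasicFS code: the class name `HashRing`, the use of SHA-256, and the `vnodes` parameter are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring over volume server URLs (illustrative only)."""

    def __init__(self, volumes: list[str], vnodes: int = 64) -> None:
        # Each volume appears at several points ("virtual nodes") on the
        # ring, which evens out the share of keys per volume.
        self._ring = sorted(
            (_hash(f"{v}#{i}"), v) for v in volumes for i in range(vnodes)
        )
        self._points = [h for h, _ in self._ring]
        self._n_volumes = len(set(volumes))

    def lookup(self, file_id: str, replicas: int = 1) -> list[str]:
        """Return `replicas` distinct volumes, walking clockwise from the key."""
        if not 1 <= replicas <= self._n_volumes:
            raise ValueError("replicas must be between 1 and the number of volumes")
        start = bisect.bisect_right(self._points, _hash(file_id))
        chosen: list[str] = []
        for step in range(len(self._ring)):
            volume = self._ring[(start + step) % len(self._ring)][1]
            if volume not in chosen:
                chosen.append(volume)
                if len(chosen) == replicas:
                    break
        return chosen
```

Unlike random selection, the result is a pure function of `file_id` and the set of volumes, and adding a volume only moves the keys whose nearest ring point now belongs to the new volume.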
## License
All code is MIT licensed. Libraries follow their respective licenses.