https://github.com/aiven-open/astacus
Clustered database backup
- Host: GitHub
- URL: https://github.com/aiven-open/astacus
- Owner: Aiven-Open
- License: apache-2.0
- Created: 2020-04-24T10:35:58.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2024-11-06T15:04:38.000Z (3 months ago)
- Last Synced: 2024-11-06T16:19:46.714Z (3 months ago)
- Topics: backups, cassandra, cluster, database, m3db, restore
- Language: Python
- Homepage:
- Size: 2.07 MB
- Stars: 36
- Watchers: 63
- Forks: 4
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
README
# astacus
[![codecov](https://codecov.io/gh/Aiven-Open/astacus/branch/main/graph/badge.svg)](https://codecov.io/gh/Aiven-Open/astacus)
Astacus is a clustered database backup system that is meant to work with
multiple open-source cluster databases, such as
[M3](https://github.com/m3db/m3/) and
[Apache Cassandra](https://cassandra.apache.org).

_My name is Maximus Backupus Astacus, Co-ordinator of the Backups of the
Cluster, Master of the Storage Availability, loyal servant to the true
emperor, Prunus Aivenius. Father to a failed backup, husband to a corrupted
data. And I will have my restore, in this runtime or the next._

# Goals
- Support multiple clustered database products
    - Keep most of the code generic
    - Keep product-specific code behind a simple, testable API

- Keep the complexity of handling e.g. reuse of blobs with the same value
  in the shared code
    - This is needed to accomplish e.g. fast non-incremental M3 backups that
      are essentially incremental, as only commit logs change frequently

- Support a list of object storage backup site locations, to facilitate
  migration from old to new during service cloud migration

- Have most of the code covered by unit tests
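
The blob reuse goal above can be illustrated with a small content-addressed store. This is only a sketch of the general idea, not Astacus's actual implementation: `BlobStore`, `backup`, the file paths, and the choice of SHA-256 are all hypothetical and chosen for illustration.

```python
import hashlib


class BlobStore:
    """Toy in-memory store that keeps each distinct blob exactly once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}
        self.uploads = 0  # counts blobs actually written to storage

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest, skipping known content."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
            self.uploads += 1
        return digest


def backup(store: BlobStore, files: dict[str, bytes]) -> dict[str, str]:
    """Take a full snapshot: return a manifest mapping path -> digest."""
    return {path: store.put(data) for path, data in files.items()}


store = BlobStore()
# Hypothetical file names: one rarely-changing data file, one commit log.
first = backup(store, {"data/block-1": b"immutable", "commitlog/0": b"v1"})
second = backup(store, {"data/block-1": b"immutable", "commitlog/0": b"v2"})
print(store.uploads)  # 3: the unchanged data file was uploaded only once
```

Each call to `backup` describes the full file set, so every backup is non-incremental, yet only content that has not been seen before is uploaded. That is what makes such a backup behave like an incremental one.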
# See also
- [Design overview](doc/design/overview.md)
- [Implementation overview](doc/design/implementation.md)

# Installation
Please see Dockerfile.fedora and Dockerfile.ubuntu for concrete up-to-date
examples, but here are the current ones:

## Optional features
- Cassandra support can be added with the `cassandra` optional extra:
```
sudo pip3 install -e '.[cassandra]'
```

## Fedora 34
(as root or user with sudo access; for root, skip sudo prefix)
```
sudo dnf install -y make
make build-dep-fedora
sudo python3 ./setup.py install
```

## Ubuntu 20.04
(as root or user with sudo access; for root, skip sudo prefix)
```
sudo apt-get update
sudo apt-get install -y make sudo
make build-dep-ubuntu
sudo python3 ./setup.py install
```

# Configuration
Create astacus.conf, which specifies which database to back up, and where.
The configuration file format is YAML, but as YAML is a superset of JSON,
JSON is also fine.

Unfortunately, the configuration is not particularly well documented at
this time, but there are some examples of file backups to a
[local directory (JSON)](examples/astacus-files-local.json), a
[local directory (YAML)](examples/astacus-files-local.yaml),
[Amazon S3](examples/astacus-files-s3.json), or
[Google GCS](examples/astacus-files-gcs.json). There is even one example of
[backing up M3 to GCS](examples/astacus-m3-gcs.json).

# Usage
## Start the nodes
Start astacus server on all nodes to be backed up, either by hand or via
e.g. systemd:

`astacus server -c <path to configuration file>`
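
If the server is launched by hand from a script rather than by systemd, the step can be sketched in Python roughly as follows. The helper names `server_command` and `start_server` and the example configuration path are assumptions made for illustration. Only the `astacus server -c` invocation itself comes from this README.

```python
import subprocess


def server_command(config_path: str) -> list[str]:
    """Build the argument list for `astacus server -c <config>`."""
    return ["astacus", "server", "-c", config_path]


def start_server(config_path: str) -> subprocess.Popen:
    """Launch the server as a child process and return its handle.

    Assumes the `astacus` executable is installed and on PATH.
    """
    return subprocess.Popen(server_command(config_path))


# Example usage (placeholder path, adjust to your own astacus.conf):
#     process = start_server("/etc/astacus/astacus.conf")
#     process.wait()
```

A process supervisor such as systemd is usually the better choice for long-running services, since it also handles restarts and logging.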
## Perform backups
Periodically (e.g. from cron), run one of the following, ideally on only one
node (although it doesn't really matter, as only one operation can run at a
time):

- `astacus backup` or
- HTTP POST to http://server-address:5515/backup

## Restore backups
A backup can be restored with either

- `astacus restore` or
- HTTP POST to http://server-address:5515/restore

## List backups
To see the list of backups:

- `astacus list` or
- HTTP GET http://server-address:5515/list

## Clean up old backups
To clean up backups based on the configured retention policy,

- use `astacus cleanup` (from cronjob or CLI), or
- HTTP POST to http://server-address:5515/cleanup

# TODO
There is a separate [TODO](TODO.md) file that tracks what is still to be done.