Scheduled backup of Google Drive to AWS S3 bucket using rclone sync
https://github.com/freerange/google-drive-backup

# Google Drive Backup

## Requirements

* node.js & npm
* Docker CLI for building & pushing images
* AWS CLI for initial setup (optional)

## Building Docker images

If the Docker image in ECR is out of date, the `cdk deploy` command builds a new image using the Docker CLI and pushes it to ECR for use in the ECS task. This image must be built for the amd64 architecture. On an Apple Silicon Mac (M1 or later), the Docker CLI builds arm64 images by default. Because the Docker CLI commands are generated by the AWS CDK code, we can't easily pass the `--platform` option to change the architecture of the image. However, setting the `DOCKER_DEFAULT_PLATFORM` environment variable to `linux/amd64` in the shell from which you run `cdk deploy` has the desired effect.
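As a minimal sketch, assuming a POSIX-compatible shell, the variable can be exported once for the current session before deploying:

```shell
# Make Docker build linux/amd64 images for every command run from this shell session.
export DOCKER_DEFAULT_PLATFORM=linux/amd64

# Confirm the setting before deploying.
echo "Docker default platform: ${DOCKER_DEFAULT_PLATFORM}"

# Then deploy as usual from the same shell:
#   cdk deploy
```

Exporting the variable only affects the current shell session, so builds run from other terminals keep their native architecture.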

## Significant files

Schedules an ECS Fargate Task to execute a backup script within a Docker container. The backup script runs the `rclone sync` command with a Google Drive directory as the source and an AWS S3 bucket as the target.

```
├── lib
│   └── google-drive-backup-stack.ts  # contains ECS Task definition
└── local-image
    ├── Dockerfile                    # defines docker image for ECS Task
    └── home
        ├── backup.sh                 # script executed by ECS Task
        └── rclone.conf               # includes config for AWS S3
```
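The contents of `backup.sh` are not reproduced in this README. As a rough illustration only, the sketch below shows how an `rclone sync` command line of the kind described above could be assembled from the environment variables listed under Configuration. The remote names `drive` and `s3`, the bucket name, and the fallback values are all hypothetical placeholders, and the command is printed rather than executed.

```shell
# Hypothetical remote names; the real ones are whatever rclone.conf defines.
SOURCE_REMOTE="drive"
TARGET_REMOTE="s3"
TARGET_BUCKET="example-backup-bucket"   # placeholder bucket name

# Fallback example values so the sketch can be run on its own.
GOOGLE_DRIVE_IMPERSONATION_EMAIL="${GOOGLE_DRIVE_IMPERSONATION_EMAIL:-backup@example.com}"
GOOGLE_DRIVE_FOLDER="${GOOGLE_DRIVE_FOLDER:-Shared}"

# Print the rclone command line: Google Drive folder as source, S3 bucket as target.
build_sync_command() {
  printf 'rclone sync %s:%s %s:%s --drive-impersonate %s\n' \
    "$SOURCE_REMOTE" "$GOOGLE_DRIVE_FOLDER" \
    "$TARGET_REMOTE" "$TARGET_BUCKET" \
    "$GOOGLE_DRIVE_IMPERSONATION_EMAIL"
}

SYNC_COMMAND="$(build_sync_command)"
echo "$SYNC_COMMAND"
```

Printing the command first is a cheap way to check the source and target before letting `rclone sync` make the target match the source.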

## Configuration

Specify values for the following environment variables in the `.env` file:

* `HEALTHCHECKS_URL` - ping URL for healthchecks.io check
* `GOOGLE_DRIVE_IMPERSONATION_EMAIL` - email address to use with rclone `--drive-impersonate` option
* `GOOGLE_DRIVE_FOLDER` - source folder path
* `RCLONE_S3_REGION` - AWS region in which `cdk deploy` was run and thus S3 bucket was created
* `CRON_SCHEDULE` - JSON representation of JavaScript object conforming to [`CronOptions` interface](https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-applicationautoscaling.CronOptions.html), e.g. `{"weekDay":"mon","hour":"03","minute":"15"}`
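For illustration, a `.env` file covering the variables above might look like the following. Every value is a placeholder. The single quotes around `CRON_SCHEDULE` are an assumption: they keep the JSON intact if the file is ever sourced by a shell, and dotenv-style loaders generally strip them.

```shell
# Example .env values; every value below is a placeholder.
HEALTHCHECKS_URL=https://hc-ping.com/00000000-0000-0000-0000-000000000000
GOOGLE_DRIVE_IMPERSONATION_EMAIL=backup@example.com
GOOGLE_DRIVE_FOLDER=Shared
RCLONE_S3_REGION=eu-west-2
# Run every Monday at 03:15.
CRON_SCHEDULE='{"weekDay":"mon","hour":"03","minute":"15"}'
```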

### Healthchecks

* Create a [healthchecks.io](https://healthchecks.io) account, create a project, and add a check with a suitable period and grace time to verify that the task completes successfully according to the schedule defined in `CRON_SCHEDULE`.

* Add suitable integrations to the check to provide relevant notifications, e.g. email, Slack, etc.

* Set `HEALTHCHECKS_URL` to the "ping URL" for the check. This will be of the form `https://hc-ping.com/${uuid}`.
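To show how a ping URL is typically used, here is a hedged sketch of a wrapper that signals start, success, or failure around a command. It relies on the `/start` and `/fail` suffixes that healthchecks.io accepts on ping URLs. The function name `run_with_healthcheck` and the placeholder UUID are hypothetical, and whether the repository's `backup.sh` works this way is not stated here.

```shell
# Placeholder UUID for illustration only; use the ping URL of your own check.
HEALTHCHECKS_URL="${HEALTHCHECKS_URL:-https://hc-ping.com/00000000-0000-0000-0000-000000000000}"

# Derive the start and failure endpoints from the base ping URL.
HEALTHCHECKS_START_URL="${HEALTHCHECKS_URL}/start"
HEALTHCHECKS_FAIL_URL="${HEALTHCHECKS_URL}/fail"

# Wrap a command: signal start, run it, then report success or failure.
# Ping errors are ignored so a monitoring outage never fails the backup itself.
run_with_healthcheck() {
  curl -fsS -m 10 --retry 3 -o /dev/null "$HEALTHCHECKS_START_URL" || true
  if "$@"; then
    curl -fsS -m 10 --retry 3 -o /dev/null "$HEALTHCHECKS_URL" || true
  else
    curl -fsS -m 10 --retry 3 -o /dev/null "$HEALTHCHECKS_FAIL_URL" || true
    return 1
  fi
}
```

A call such as `run_with_healthcheck rclone sync ...` would then report a failed sync immediately, rather than leaving the check to time out after its grace period.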

### Google Drive

* [Create a Service Account for your organisation's G-Suite/Google Workspace account](https://rclone.org/drive/#service-account-support) and [give it access to the Google Drive API using domain-wide delegation of authority](https://developers.google.com/admin-sdk/directory/v1/guides/delegation).
  Download a JSON key file for the Service Account and save it as `google-drive-credentials.json`.

* Run `cdk deploy` to create a secret named `/google-drive-backup/RCLONE_DRIVE_SERVICE_ACCOUNT_CREDENTIALS` in AWS Secrets Manager with an automatically generated value. Then overwrite the value of that secret with the JSON credentials from the previous step using the following command:

```
$ aws secretsmanager put-secret-value \
    --secret-id /google-drive-backup/RCLONE_DRIVE_SERVICE_ACCOUNT_CREDENTIALS \
    --secret-string file://google-drive-credentials.json
```

* You can delete the temporary file, `google-drive-credentials.json`, after you've done this, but it might be worth keeping a record of the credentials somewhere secure.

* The `RCLONE_DRIVE_SERVICE_ACCOUNT_CREDENTIALS` environment variable specifies the credentials for rclone to use for Google Drive (see [this documentation](https://rclone.org/drive/#advanced-options) for details).
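Before overwriting the secret, it can be worth checking that the downloaded file really is a service account key. Google service account key files contain a `"type": "service_account"` field, so a rough textual check is enough to catch a wrong download. The helper name `looks_like_service_account_key` is hypothetical, and this sketch does not validate the JSON as a whole.

```shell
# Return success if the given file looks like a Google service account key.
# This is only a rough textual check, not a full JSON validation.
looks_like_service_account_key() {
  [ -r "$1" ] && grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"
}

KEY_FILE="google-drive-credentials.json"
if looks_like_service_account_key "$KEY_FILE"; then
  echo "$KEY_FILE looks like a service account key"
else
  echo "$KEY_FILE is missing or is not a service account key" >&2
fi
```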

### AWS S3

* This is set up automatically when running `cdk deploy` to generate the stack. The `rclone` `env_auth` config setting is set to `true` so that `rclone` uses the IAM role assigned to the ECS Task - see [this section of the documentation](https://rclone.org/s3/#authentication).

## Useful commands

Note that `cdk` commands should be run with credentials for an AWS IAM user that has wide-ranging permissions to use CloudFormation to create, update, and destroy AWS resources.

* `npm run build` - compile TypeScript to JavaScript
* `npm run watch` - watch for changes and compile
* `npm run test` - run the Jest unit tests
* `cdk deploy` - deploy this stack to your default AWS account/region
* `cdk diff` - compare the deployed stack with the current state
* `cdk synth` - emit the synthesized CloudFormation template