https://github.com/Sage-Bionetworks/s3-synapse-sync
- Host: GitHub
- URL: https://github.com/Sage-Bionetworks/s3-synapse-sync
- Owner: Sage-Bionetworks
- License: apache-2.0
- Created: 2020-07-17T18:48:14.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2024-02-21T20:17:31.000Z (over 1 year ago)
- Last Synced: 2024-04-11T01:26:00.159Z (about 1 year ago)
- Language: Python
- Size: 175 KB
- Stars: 3
- Watchers: 7
- Forks: 6
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
# s3-synapse-sync
Lambda function code to index files in S3 buckets by creating filehandles on Synapse, triggered by file changes to S3.
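
As a rough illustration of the trigger, the sketch below pulls the bucket name and object key out of an S3 event notification, which is the payload a Lambda receives when a file in the bucket changes. The helper name `iter_s3_objects` is hypothetical and not taken from this repository; only the record layout follows the standard S3 event notification format.

```python
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (event_name, bucket, key) for each record in an S3 event.

    Hypothetical helper for illustration; not part of this repository.
    """
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded (spaces become "+"), so decode them.
        key = unquote_plus(s3["object"]["key"])
        yield record["eventName"], s3["bucket"]["name"], key


if __name__ == "__main__":
    # Minimal hand-written event with only the fields the helper reads.
    sample_event = {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": "htan-dcc-bucket-a"},
                    "object": {"key": "MyFolder/test.txt"},
                },
            }
        ]
    }
    for event_name, bucket, key in iter_s3_objects(sample_event):
        print(event_name, bucket, key)
```

A real handler would go on to create or remove the matching Synapse filehandle; that part depends on the Synapse client and is left out here.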
## Getting started
Confirm [center onboarding](https://docs.google.com/document/d/1cCRkfK6or6lwMNc96f5LTp9rK8aHEDqhyP0ISgSn9sI/edit) steps are complete, and a Synapse project has been created to which the bucket will be synced.
---
## Development
### Requirements
Run `pipenv install --dev` to install both production and development
requirements, and `pipenv shell` to activate the virtual environment. For more
information see the [pipenv docs](https://pipenv.pypa.io/en/latest/).

After activating the virtual environment, run `pre-commit install` to install
the [pre-commit](https://pre-commit.com/) git hook.

#### Parameters
Create an AWS KMS key to encrypt secure strings.

Create a sceptre s3-synapse-sync-kms-key.yaml file used to deploy cloudformation
template [s3-synapse-sync-kms-key.yaml](s3-synapse-sync-kms-key.yaml):
```yaml
template_path: "s3-synapse-sync-kms-key.yaml"
stack_name: "s3-synapse-sync-kms-key"
stack_tags:
  Department: "CompOnc"
  Project: "HTAN"
  OwnerEmail: "[email protected]"
```
__Note__: You may need to add your user ARN to the policy principal in the
cloudformation template.

Deploy the stack using sceptre:
```shell script
sceptre --var "profile=my-profile" --var "region=us-east-1" launch prod/s3-synapse-sync-kms-key.yaml
```

Add two **SecureString** parameters containing Synapse credentials to SSM Parameter Store:

| Parameter Name | Value | Type |
| ------------- | ------------- | ------------- |
| `/HTAN/SynapseSync/username` | Synapse service account username | SecureString |
| `/HTAN/SynapseSync/apiKey` | Synapse service account API Key | SecureString |

```shell script
aws ssm put-parameter \
--name /HTAN/SynapseSync/username \
--value \
--type SecureString \
--key-id alias/s3-synapse-sync-kms-key/kmskey
```

### Create a local build
```shell script
$ sam build
```

### Run locally
```shell script
$ sam local invoke HelloWorldFunction --event events/event.json
```

### Run unit tests
Tests are defined in the `tests` folder in this project. Use pip to install
[pytest](https://docs.pytest.org/en/latest/) and run the unit tests.

```shell script
$ python -m pytest tests/ -v
```

## Deployment
### Deploy Docker
Containerize the Python Minerva rendering script.
1. Make sure [Docker](https://docs.docker.com/get-docker/) is installed
2. Download this repository
3. Build:
```
cd docker
docker build -t .
```
4. Tag the build and push the image to a registry
### Deploy Lambda to S3
Deployments are sent to the
[Sage cloudformation repository](https://bootstrap-awss3cloudformationbucket-19qromfd235z9.s3.amazonaws.com/index.html)
which requires permission to upload to the Sage
`bootstrap-awss3cloudformationbucket-19qromfd235z9` and
`essentials-awss3lambdaartifactsbucket-x29ftznj6pqw` buckets.

```shell script
sam package --template-file .aws-sam/build/template.yaml \
--s3-bucket essentials-awss3lambdaartifactsbucket-x29ftznj6pqw \
--output-template-file .aws-sam/build/s3-synapse-sync.yaml

aws s3 cp .aws-sam/build/s3-synapse-sync.yaml s3://bootstrap-awss3cloudformationbucket-19qromfd235z9/s3-synapse-sync/master/
```

## Publish Lambda
### Private access
Publishing the lambda makes it available in your AWS account. It will be accessible in
the [serverless application repository](https://console.aws.amazon.com/serverlessrepo).

```shell script
sam publish --template .aws-sam/build/s3-synapse-sync.yaml
```

### Public access
Making the lambda publicly accessible makes it available in the
[global AWS serverless application repository](https://serverlessrepo.aws.amazon.com/applications).

```shell script
aws serverlessrepo put-application-policy \
--application-id \
--statements Principals=*,Actions=Deploy
```

## Install Lambda into AWS
### Sceptre
Create a [sceptre](https://github.com/Sceptre/sceptre) s3-synapse-sync.yaml file used to deploy the cloudformation
template [s3-synapse-sync.yaml](template.yaml):
```yaml
template_path: "remote/s3-synapse-sync.yaml"
stack_name: "s3-synapse-sync"
stack_tags:
  Department: "CompOnc"
  Project: "HTAN"
  OwnerEmail: "[email protected]"
dependencies:
  - "prod/s3-synapse-sync-kms-key.yaml"
hooks:
  before_launch:
    - !cmd "curl https://{{stack_group_config.admincentral_cf_bucket}}.s3.amazonaws.com/s3-synapse-sync/master/s3-synapse-sync.yaml --create-dirs -o templates/remote/s3-synapse-sync.yaml"
parameters:
  BucketVariables: >-
    {
      "htan-dcc-bucket-a":{"SynapseProjectId":"syn11111"},
      "htan-dcc-bucket-b":{"SynapseProjectId":"syn22222"}
    }
  KmsDecryptPolicyArn: !stack_output_external "s3-synapse-sync-kms-key::KmsDecryptPolicyArn"
  BucketNamePrefix: "htan-dcc-*"
  DockerImage: "registry/repository:tag"
  Subnets: "subnet-1a2b3c4d,subnet-1a2b3c4d"
  VpcID: "vpc-1a2b3c4d"
```

Install the lambda using sceptre:
```shell script
sceptre --var "profile=my-profile" --var "region=us-east-1" launch prod/s3-synapse-sync.yaml
```

### AWS Console
Steps to deploy from the AWS console:

1. Log in to AWS
2. Access the
[serverless application repository](https://console.aws.amazon.com/serverlessrepo)
-> Available Applications
3. Select application to install
4. Enter Application settings
5. Click Deploy

---
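
The `BucketVariables` parameter in the sceptre configuration is a JSON object keyed by bucket name. The sketch below shows one way such a mapping could be read to find the Synapse project for a bucket. The function name `synapse_project_for` is hypothetical, and how the Lambda actually receives the value (for example through an environment variable) is an assumption not confirmed by this README.

```python
import json


def synapse_project_for(bucket, bucket_variables_json):
    """Return the SynapseProjectId configured for `bucket`, or None.

    Hypothetical helper for illustration; `bucket_variables_json` is the
    JSON text of the BucketVariables parameter.
    """
    bucket_variables = json.loads(bucket_variables_json)
    # Buckets that are not listed have no project configured.
    return bucket_variables.get(bucket, {}).get("SynapseProjectId")


if __name__ == "__main__":
    # Same example values as in the sceptre parameters.
    config = """
    {
      "htan-dcc-bucket-a": {"SynapseProjectId": "syn11111"},
      "htan-dcc-bucket-b": {"SynapseProjectId": "syn22222"}
    }
    """
    print(synapse_project_for("htan-dcc-bucket-a", config))
```

Returning `None` for an unlisted bucket lets the caller decide whether to skip the event or raise an error.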
## Create Buckets
**Note**: Buckets must be explicitly named. Bucket names must begin with the prefix specified in the lambda parameter `BucketNamePrefix` (e.g. `htan-dcc-*`) and must be globally unique across all AWS accounts.

Create a sceptre s3-synapse-sync-bucket-a.yaml file used to deploy the Jinja-templated
cloudformation template [s3-synapse-sync-bucket-a.yaml](s3-synapse-sync-bucket.j2):
```yaml
template_path: "remote/s3-synapse-sync-bucket.j2"
stack_name: "s3-synapse-sync-bucket-a"
stack_tags:
  Department: "CompOnc"
  Project: "HTAN"
  OwnerEmail: "[email protected]"
hooks:
  before_launch:
    - !cmd "curl https://{{stack_group_config.admincentral_cf_bucket}}.s3.amazonaws.com/s3-synapse-sync/master/s3-synapse-sync-bucket.j2 --create-dirs -o templates/remote/s3-synapse-sync-bucket.j2"
dependencies:
  - "prod/s3-synapse-sync.yaml"
parameters:
  BucketName: "htan-dcc-bucket-a"
  SynapseIDs:
    - "1111111"
  S3UserARNs:
    - "arn:aws:sts::213235685529:assumed-role/sandbox-developer/[email protected]"
  DenyDeleteARNs:
    - "arn:aws:sts::213235685529:assumed-role/sandbox-developer/[email protected]"
  S3AdminARNs:
    - "arn:aws:sts::213235685529:assumed-role/sandbox-developer/[email protected]"
  S3SynapseSyncFunctionArn: !stack_output_external "s3-synapse-sync::FunctionArn"
  S3SynapseSyncFunctionRoleArn: !stack_output_external "s3-synapse-sync::FunctionRoleArn"

# Due to circular dependencies, enabling bucket notification must be done after bucket creation
# https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-s3-bucket-notificationconfig.html
sceptre_user_data:
  EnableNotificationConfiguration: "false"
```

Deploy with sceptre. Notification configuration is disabled on the first deploy.
Deploy a second time with `EnableNotificationConfiguration: "true"`.

---
### To Use:
1. Place a file in a folder of the S3 bucket
   - Grant the bucket owner full control of the object by including the flag `--acl bucket-owner-full-control`

   Example `cp` and `put-object` commands:
```
aws s3 cp test.txt s3://MyBucket/MyFolder/test.txt --acl bucket-owner-full-control
```
```
aws s3api put-object --bucket MyBucket --key MyFolder/test.txt --body test.txt --acl bucket-owner-full-control
```
2. Check CloudWatch logs for the Lambda function to see if the function was triggered and completed successfully
3. Check the Synapse project to see if the filehandle was created

#### Minerva Story
The lambda will also run a [Minerva](https://gist.github.com/thejohnhoffer/f6193f079f6efa85befab97194d11984) pre-processing tool to create a JPEG image pyramid and an `exhibit.json` suitable for hosting with Minerva Story.

Add input OME-TIFF and JSON (`.story.json`) files to the `minerva` folder in the bucket. Ensure that the image name contained in the `in_file` property of the author JSON file matches that of the OME-TIFF input file. Output image tiles and exhibit files will be added to the directory in the `minerva` folder.
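
Before uploading, the `in_file` requirement can be checked locally. The sketch below is one possible check, assuming `in_file` is a top-level key of the story JSON and that only the file name (not any directory prefix) needs to match; both are assumptions, and `in_file_matches` is a hypothetical helper, not part of this repository.

```python
import json
from pathlib import PurePosixPath


def in_file_matches(story_json_text, tiff_key):
    """Check that the story's `in_file` names the given OME-TIFF object.

    Hypothetical helper; assumes `in_file` is a top-level key in the JSON.
    """
    story = json.loads(story_json_text)
    in_file = story.get("in_file")
    if not in_file:
        return False
    # Compare file names only, so a directory prefix on either side is ignored.
    return PurePosixPath(in_file).name == PurePosixPath(tiff_key).name


if __name__ == "__main__":
    story_text = '{"in_file": "sample.ome.tif"}'
    print(in_file_matches(story_text, "minerva/sample.ome.tif"))
```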