Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/JaneliaSciComp/burst-compute
Burst compute framework for AWS
- Host: GitHub
- URL: https://github.com/JaneliaSciComp/burst-compute
- Owner: JaneliaSciComp
- License: bsd-3-clause
- Created: 2020-10-06T13:16:08.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-12-09T21:24:06.000Z (about 1 year ago)
- Last Synced: 2024-04-07T01:49:26.461Z (9 months ago)
- Topics: aws-lambda, burst-parallel, compute-engine, hpc
- Language: JavaScript
- Homepage:
- Size: 1.47 MB
- Stars: 4
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-janelia-software - Burst Compute - AWS service for highly parallel Lambda processing (High Performance Computing)
README
# Burst Compute Framework
[![DOI](https://zenodo.org/badge/301732359.svg)](https://zenodo.org/badge/latestdoi/301732359)
![CI workflow](https://github.com/JaneliaSciComp/burst-compute/actions/workflows/node.js.yml/badge.svg)

Serverless burst-compute implementation for AWS, using only native AWS services.
As seen in the [AWS Architecture Blog](https://aws.amazon.com/blogs/architecture/scaling-neuroscience-research-on-aws/).
For [embarrassingly parallel](https://en.wikipedia.org/wiki/Embarrassingly_parallel) workloads, **N** items may be trivially processed by **T** threads. Given **N** items and a **batchSize** (the maximum number of items to be processed serially by a single worker), we divide the work into **N/batchSize** batches and invoke that many user-provided **worker** Lambdas. When all the worker functions are done, the results are combined by the user-provided **combiner** Lambda.
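To make the batching arithmetic concrete, here is a minimal sketch of that partitioning (the framework performs this internally; the numbers below are arbitrary):

```javascript
// Illustration only: how N items are split into batches of at most batchSize.
const numItems = 10000; // N
const batchSize = 50;   // max items one worker processes serially

const numBatches = Math.ceil(numItems / batchSize); // 200 worker invocations

// Each worker receives a contiguous [start, end] range of item indices.
const batches = [];
for (let i = 0; i < numBatches; i += 1) {
  const start = i * batchSize;
  const end = Math.min(start + batchSize, numItems) - 1;
  batches.push({ start, end });
}
console.log(`${numBatches} batches, first:`, batches[0], 'last:', batches[numBatches - 1]);
```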
In the diagram below, the code you write is indicated by the blue lambda icons.
![Architecture Diagram](docs/burst-compute-diagram.png)
Here's how it works, step-by-step:
1) You define a **worker** function and a **combiner** function
2) Launch your burst compute job by calling the **dispatch** function with a range of items to process
3) The dispatcher recursively starts copies of itself and efficiently invokes your **worker** Lambdas
4) Each **worker** is given a range of inputs, computes results for that range, and writes its results to DynamoDB (a minimal sketch follows this list)
5) The Step Function monitors all the results and calls the **combiner** function when all workers are done
6) The **combiner** function reads all of the worker output from DynamoDB and aggregates it into the final result
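As a rough illustration of step 4, a worker might look something like the sketch below. The event fields (`jobId`, `batchId`, `startIndex`, `endIndex`), the `TASKS_TABLE_NAME` environment variable, and the item layout are placeholders rather than the framework's actual contract; the real input/output specification is in the [Interfaces](docs/Interfaces.md) document.

```javascript
// Hypothetical worker Lambda: processes its assigned range of items and
// writes the batch results to DynamoDB. Field and table names here are
// illustrative assumptions, not the framework's actual interface.
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
  const { jobId, batchId, startIndex, endIndex } = event;

  // Compute a result for every item in the assigned range.
  const results = [];
  for (let i = startIndex; i <= endIndex; i += 1) {
    results.push({ itemIndex: i, value: i * i }); // replace with real work
  }

  // Persist this batch's results so the combiner can read them later.
  await docClient.put({
    TableName: process.env.TASKS_TABLE_NAME, // assumed environment variable
    Item: { jobId, batchId, results },
  }).promise();

  return { jobId, batchId, count: results.length };
};
```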
## Build

You need Node.js 12.x or later in your path, then:
```bash
npm install
```

## Deployment
Follow the build instructions above before attempting to deploy.
Deployment will create all the necessary AWS services, including Lambda functions, DynamoDB tables, and Step Functions. To deploy this framework to your AWS account, you must have the [AWS CLI configured](https://www.serverless.com/framework/docs/providers/aws/guide/credentials#sign-up-for-an-aws-account).
To deploy to the *dev* stage:
```bash
npm run sls -- deploy
```

This will create an application stack named `burst-compute-dev`.
To deploy to a different stage (e.g. "prod"), add a stage argument:
```bash
npm run sls -- deploy -s prod
```

## Usage
1. Create **worker** and **combiner** functions which follow the input/output specification defined in the [Interfaces](docs/Interfaces.md) document.
2. Invoke the **dispatch** function to start a burst job.
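For example, a job could be started by invoking the deployed **dispatch** Lambda from Node.js. The payload fields and function name below are placeholders only; the real parameter names are defined in the [Interfaces](docs/Interfaces.md) document, and the function name depends on your deployment stage.

```javascript
// Hypothetical invocation of the dispatch function (AWS SDK for JavaScript v2).
// Payload field names and the function name are assumptions; consult
// docs/Interfaces.md and your deployed stack for the real contract.
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

async function startBurstJob() {
  const payload = {
    workerFunctionName: 'my-worker',     // your worker Lambda (placeholder)
    combinerFunctionName: 'my-combiner', // your combiner Lambda (placeholder)
    startIndex: 0,
    endIndex: 9999,
    batchSize: 50,
  };
  const response = await lambda.invoke({
    FunctionName: 'burst-compute-dev-dispatch', // stage-dependent (placeholder)
    InvocationType: 'Event', // fire-and-forget; the Step Function tracks progress
    Payload: JSON.stringify(payload),
  }).promise();
  console.log('Dispatch accepted with status', response.StatusCode);
}

startBurstJob().catch(console.error);
```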