https://github.com/openaq/fetches-optimizer

Optimize fetch results for more efficient aggregations

# fetches-optimizer

This is a simple piece of code intended to run as a Lambda function whenever new data is inserted into `s3://openaq-fetches/realtime`. Currently, its purpose is to gzip new objects and save them to `s3://openaq-fetches/realtime-gzipped`. This makes aggregations (via Athena) quicker and less expensive.
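
As a rough illustration of the transformation described above, here is a minimal Python sketch. It is not the project's actual code: the function names `target_key` and `optimize` are hypothetical, and it assumes the object key is kept unchanged apart from swapping the `realtime/` prefix for `realtime-gzipped/`. Reading from and writing to S3 is left out, so the sketch only covers the key mapping and the compression step.

```python
import gzip
from typing import Tuple

SOURCE_PREFIX = "realtime/"          # prefix named in this README
TARGET_PREFIX = "realtime-gzipped/"  # prefix named in this README


def target_key(source_key: str) -> str:
    """Map a key under realtime/ to a key under realtime-gzipped/.

    Assumption: everything after the prefix is kept as-is.
    """
    if not source_key.startswith(SOURCE_PREFIX):
        raise ValueError(f"unexpected key: {source_key}")
    return TARGET_PREFIX + source_key[len(SOURCE_PREFIX):]


def optimize(source_key: str, body: bytes) -> Tuple[str, bytes]:
    """Return the destination key and the gzip-compressed body (hypothetical helper)."""
    return target_key(source_key), gzip.compress(body)
```

In a real Lambda handler, `body` would come from the object named in the triggering event, and the returned pair would be written back to the `openaq-fetches` bucket.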

## Running in batch

In addition to being triggered by SNS when new data is added to `s3://openaq-fetches/realtime`, the code can also be run in batch by creating a manifest file.

If you have a file of the form

```
realtime/2018-02-10/1518286521.ndjson.gz
realtime/2018-02-10/1518287191.ndjson.gz
realtime/2018-02-10/1518293192.ndjson.gz
...
```

you can invoke the Lambda function via the AWS CLI like

```