https://github.com/mashiike/redshift-udf-kpl-deaggregate
Lambda UDF to de-aggregate KPL for Redshift
- Host: GitHub
- URL: https://github.com/mashiike/redshift-udf-kpl-deaggregate
- Owner: mashiike
- License: mit
- Created: 2022-12-15T18:19:18.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-03-20T20:22:21.000Z (about 2 years ago)
- Last Synced: 2024-06-21T06:20:59.589Z (12 months ago)
- Language: Go
- Size: 42 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# redshift-udf-kpl-deaggregate


A Lambda UDF that de-aggregates KPL (Kinesis Producer Library) records for Amazon Redshift.

## Usage
### Deploy Lambda Function
Download the binary from [Releases](https://github.com/mashiike/redshift-udf-kpl-deaggregate/releases).
Then create a zip archive with the following layout and deploy it with the `provided.al2` runtime.

```
lambda.zip
└── bootstrap # the built binary
```

For details on custom runtimes, see [https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html](https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html).
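
The archive layout above can also be produced programmatically. The following is a minimal sketch, not part of this repository, that writes a zip containing a single executable `bootstrap` entry. The function names `buildLambdaZip` and `entryNames` and the command-line arguments are hypothetical.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"os"
)

// buildLambdaZip returns a zip archive with one entry named "bootstrap",
// the file name that the custom runtime executes.
func buildLambdaZip(binary []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)

	hdr := &zip.FileHeader{Name: "bootstrap", Method: zip.Deflate}
	hdr.SetMode(0o755) // keep the executable bit on the entry

	w, err := zw.CreateHeader(hdr)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(binary); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// entryNames lists the file names stored in a zip archive.
func entryNames(archive []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	// Hypothetical usage: mkzip <path-to-binary> <output.zip>
	if len(os.Args) != 3 {
		fmt.Println("usage: mkzip <path-to-binary> <output.zip>")
		return
	}
	binary, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	archive, err := buildLambdaZip(binary)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := os.WriteFile(os.Args[2], archive, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The standard `zip` command-line tool produces the same layout. The sketch only makes the two requirements explicit: the entry name and the executable bit.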
An example deployment is in the [lambda directory](lambda/). It uses [lambroll](https://github.com/fujiwara/lambroll).

### Create Redshift UDF
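
For context, Redshift calls a Lambda UDF with a JSON batch: `arguments` holds one inner array per row, and the function replies with `success`, `num_records`, and a `results` array of the same length. The sketch below illustrates that exchange for a single `varchar` argument. It is not this repository's implementation: the type and function names are hypothetical, and the field names follow the AWS documentation for scalar Lambda UDFs.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// udfRequest holds the request fields this sketch needs; others are ignored.
type udfRequest struct {
	NumRecords int         `json:"num_records"`
	Arguments  [][]*string `json:"arguments"` // one inner slice per row; nil is SQL NULL
}

// udfResponse is the reply shape Redshift expects from a Lambda UDF.
type udfResponse struct {
	Success    bool      `json:"success"`
	ErrorMsg   string    `json:"error_msg,omitempty"`
	NumRecords int       `json:"num_records"`
	Results    []*string `json:"results"`
}

// handleBatch applies fn to the first argument of every row and collects
// the results in row order. NULL inputs yield NULL outputs.
func handleBatch(payload []byte, fn func(string) (string, error)) udfResponse {
	var req udfRequest
	if err := json.Unmarshal(payload, &req); err != nil {
		return udfResponse{ErrorMsg: err.Error()}
	}
	if req.NumRecords != len(req.Arguments) {
		return udfResponse{ErrorMsg: "num_records does not match arguments"}
	}
	results := make([]*string, 0, len(req.Arguments))
	for i, row := range req.Arguments {
		if len(row) == 0 || row[0] == nil {
			results = append(results, nil)
			continue
		}
		out, err := fn(*row[0])
		if err != nil {
			return udfResponse{ErrorMsg: fmt.Sprintf("row %d: %v", i, err)}
		}
		results = append(results, &out)
	}
	return udfResponse{Success: true, NumRecords: len(results), Results: results}
}

func main() {
	payload := []byte(`{"num_records": 2, "arguments": [["abc"], [null]]}`)
	// strings.ToUpper is only a placeholder for the real per-row transformation.
	resp := handleBatch(payload, func(s string) (string, error) {
		return strings.ToUpper(s), nil
	})
	out, _ := json.Marshal(resp)
	fmt.Println(string(out))
}
```

Because the whole batch succeeds or fails together, the sketch returns `success: false` on the first row error instead of skipping the row.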
```sql
CREATE OR REPLACE EXTERNAL FUNCTION udf_kpl_deaggregate(varchar(max))
RETURNS varchar(max)
IMMUTABLE
LAMBDA 'redshift-udf-kpl-deaggregate'
IAM_ROLE 'arn:aws:iam::012345678910:role/lambda-udf-redshift';
```

### Use with Redshift streaming ingestion
For details, see https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html
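
The queries in this section pass `kinesis_data` to the UDF as a hex string. According to the published KPL aggregation format, an aggregated record is a 4-byte magic number (`0xF3 0x89 0x9A 0xC2`), a protobuf message, and a trailing 16-byte MD5 checksum of that message. The sketch below checks only this envelope on a hex payload. It does not decode the protobuf body, it is not taken from this repository, and the names `isKPLAggregated` and `envelopeHex` are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// kplMagic is the prefix that marks a KPL aggregated record.
var kplMagic = []byte{0xF3, 0x89, 0x9A, 0xC2}

// isKPLAggregated reports whether a hex-encoded payload has the KPL
// envelope: magic prefix, body, and an MD5 trailer matching the body.
func isKPLAggregated(hexPayload string) (bool, error) {
	raw, err := hex.DecodeString(hexPayload)
	if err != nil {
		return false, err
	}
	if len(raw) < len(kplMagic)+md5.Size || !bytes.HasPrefix(raw, kplMagic) {
		return false, nil
	}
	body := raw[len(kplMagic) : len(raw)-md5.Size]
	sum := md5.Sum(body)
	return bytes.Equal(sum[:], raw[len(raw)-md5.Size:]), nil
}

// envelopeHex wraps body in the magic prefix and MD5 trailer and returns
// the result as hex. It exists only to build sample input for the check.
func envelopeHex(body []byte) string {
	sum := md5.Sum(body)
	raw := append(append(append([]byte{}, kplMagic...), body...), sum[:]...)
	return hex.EncodeToString(raw)
}

func main() {
	aggregated := envelopeHex([]byte("stand-in for a protobuf body"))
	plain := hex.EncodeToString([]byte(`{"id": 1}`))

	for _, p := range []string{aggregated, plain} {
		ok, err := isKPLAggregated(p)
		fmt.Println(ok, err)
	}
}
```

A payload that fails this check is an ordinary, non-aggregated record, which a de-aggregating function would typically pass through as a single record.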
```sql
CREATE EXTERNAL SCHEMA kinesis
FROM KINESIS
IAM_ROLE default;
```

```sql
CREATE MATERIALIZED VIEW my_view AS
SELECT
    approximate_arrival_timestamp,
    partition_key,
    shard_id,
    sequence_number,
    JSON_PARSE(udf_kpl_deaggregate(from_varbyte(kinesis_data, 'hex'))) AS kinesis_data,
    refresh_time
FROM kinesis.my_stream_name
WHERE is_valid_json_array(udf_kpl_deaggregate(from_varbyte(kinesis_data, 'hex')));
```

```sql
REFRESH MATERIALIZED VIEW my_view;

SELECT approximate_arrival_timestamp, partition_key, shard_id, sequence_number, refresh_time, data
FROM my_view AS record, record.kinesis_data AS data
ORDER BY approximate_arrival_timestamp DESC, sequence_number DESC;
```

## LICENSE
MIT License
Copyright (c) 2022 IKEDA Masashi