https://github.com/awslabs/aws-glue-blueprint-libs
- Host: GitHub
- URL: https://github.com/awslabs/aws-glue-blueprint-libs
- Owner: awslabs
- License: apache-2.0
- Created: 2020-07-09T01:17:14.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-06-03T11:53:35.000Z (9 months ago)
- Last Synced: 2024-08-14T07:09:21.021Z (6 months ago)
- Language: Python
- Size: 17.9 MB
- Stars: 66
- Watchers: 12
- Forks: 44
- Open Issues: 8
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# aws-glue-blueprint-libs
This repository provides a Python library to build and test AWS Glue Custom Blueprints locally. It also provides sample blueprints addressing common use-cases in ETL.
## Sample Blueprints
* Crawling Amazon S3 locations: This blueprint crawls multiple Amazon S3 locations to add metadata tables to the Data Catalog.
* Importing Amazon S3 data into a DynamoDB table: This blueprint imports data from Amazon S3 into a DynamoDB table.
* Conversion: This blueprint converts input files in various standard file formats into Apache Parquet format, which is optimized for analytic workloads.
* Converting character encoding: This blueprint converts your non-UTF-encoded files into UTF-encoded files.
* Compaction: This blueprint creates a job that compacts input files into larger chunks based on desired file size.
* Partitioning: This blueprint creates a partitioning job that places output files into partitions based on specific partition keys.
* Importing an AWS Glue table into a Lake Formation governed table: This blueprint imports a Glue Catalog table into a Lake Formation governed table.
* Creating table definitions from Glue Custom Connection: This blueprint accesses data stores using Glue Custom Connectors, reads the records, and populates the table definitions in the Glue Data Catalog based on the record schema.
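
As a rough illustration of the idea behind the Compaction blueprint, the sketch below estimates how many output files a compaction run would produce for a desired file size. It is not taken from this repository: the function name `plan_output_file_count` and its parameters are hypothetical, and the actual blueprint may compute this differently.

```python
import math


def plan_output_file_count(input_sizes_bytes, desired_size_bytes):
    """Estimate the number of output files for a compaction run.

    Hypothetical helper for illustration only; not part of
    aws-glue-blueprint-libs.
    """
    if desired_size_bytes <= 0:
        raise ValueError("desired_size_bytes must be positive")
    total_bytes = sum(input_sizes_bytes)
    # Round up so no output file has to exceed the desired size,
    # and always produce at least one file.
    return max(1, math.ceil(total_bytes / desired_size_bytes))


if __name__ == "__main__":
    MB = 1024 * 1024
    # 100 small input files of 10 MB each, compacted toward 128 MB files.
    print(plan_output_file_count([10 * MB] * 100, 128 * MB))
```

In a Spark-based job, a count like this could be used as the argument to a repartition or coalesce step before writing, although that wiring is an assumption and not something this README specifies.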
## Security
See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
## License
This project is licensed under the Apache-2.0 License.