Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/shiyis/data-labs
This repo hosts data collection, wrangling, modeling, engineering practices and labs.
- Host: GitHub
- URL: https://github.com/shiyis/data-labs
- Owner: shiyis
- Created: 2023-08-11T14:34:44.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-04-12T18:26:57.000Z (8 months ago)
- Last Synced: 2024-04-13T04:14:15.335Z (8 months ago)
- Topics: aws, data-engineering, sam, serverless, serverless-functions, terraform
- Homepage:
- Size: 20.7 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
#### data-engineering-things
This repo collects the AWS data engineering practices and general data pipeline tutorials I have done. It holds only the submodule mappings to the repos that contain the actual content of these exercises.
[aws-sam-cicd](https://github.com/shiyis/aws-serverless-etl-cicd) is a simple AWS data pipeline that streams, validates, and loads tweets.
[twitter-archive](https://github.com/shiyis/twitter-archive) is a GitHub Actions workflow that retrieves tweets using a YAML configuration.
[terraform-labs](https://github.com/shiyis/terraform-labs) holds data engineering schema configuration written in Terraform HCL.
[dra-data](https://github.com/shiyis/dra-data) is an open source data collection built with the GitHub Action _flat_ and a manifest file.
[pyspark-etl-example](https://github.com/AlexIoannides/pyspark-example-project/tree/eeee0c2b9af79fdd7c5d86fe56466c147b487e26) is a PySpark ETL example that extracts, transforms, and loads dummy data.
[yelp-to-xml](https://github.com/shiyis/data-labs/tree/master/yelp-to-xml) is a small data collection app/lab for Yelp reviews, which are converted to XML, cleaned, wrangled, and managed.
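
As a rough illustration of the kind of conversion the yelp-to-xml lab describes, here is a minimal sketch that cleans review text and serializes it to XML using only the Python standard library. The field names (`business`, `stars`, `text`), the element names, and the helper functions are all hypothetical. They are assumptions made for this example and are not taken from the lab itself.

```python
import xml.etree.ElementTree as ET


def clean_text(text: str) -> str:
    """Collapse runs of whitespace (including newlines) and trim the ends."""
    return " ".join(text.split())


def reviews_to_xml(reviews: list) -> str:
    """Serialize review dicts into a single XML string.

    The record layout (business, stars, text) is an assumption for
    illustration only, not the schema used by the actual lab.
    """
    root = ET.Element("reviews")
    for review in reviews:
        node = ET.SubElement(
            root,
            "review",
            business=review["business"],
            stars=str(review["stars"]),
        )
        node.text = clean_text(review["text"])
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record with messy whitespace.
sample = [
    {"business": "Cafe A", "stars": 4, "text": "  Great   coffee,\nslow service. "},
]
xml_doc = reviews_to_xml(sample)
print(xml_doc)
```

Building the tree with `ElementTree` rather than string concatenation means special characters in review text (such as `&` or `<`) are escaped automatically on serialization.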