Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/airscholar/redditdataengineering
This project provides a data pipeline that extracts, transforms, and loads (ETL) Reddit data into an Amazon Redshift data warehouse. The pipeline uses Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/airscholar/redditdataengineering
- Owner: airscholar
- Created: 2023-10-23T12:05:13.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-23T12:13:02.000Z (about 1 year ago)
- Last Synced: 2024-04-18T02:57:15.775Z (9 months ago)
- Topics: apache-airflow, aws, celery, data-pipeline, end-to-end-data-engineering, reddit
- Language: Python
- Homepage: https://www.youtube.com/watch?v=LSlt6iVI_9Y
- Size: 118 KB
- Stars: 26
- Watchers: 3
- Forks: 18
- Open Issues: 0