https://github.com/joyceannie/reddit_data_pipeline
This project builds a data pipeline that extracts data from the Reddit API and presents it in a dashboard for analysis. The data comes from the subreddit r/Python. It is extracted daily, uploaded to S3, and then copied into Redshift. The dashboard is built with Google Data Studio.
airflow aws etl python redshift s3 terraform
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/joyceannie/reddit_data_pipeline
- Owner: joyceannie
- Created: 2022-08-16T22:28:20.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-27T16:34:00.000Z (almost 3 years ago)
- Last Synced: 2024-12-01T05:14:11.261Z (6 months ago)
- Topics: airflow, aws, etl, python, redshift, s3, terraform
- Language: Python
- Homepage:
- Size: 261 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 1
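
The daily extract, upload, and copy flow in the description can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's actual code: the column list, the key layout `reddit/python/YYYYMMDD.csv`, the table name, bucket name, and IAM role ARN are all hypothetical. The Reddit API call and the S3 upload are left out so the sketch runs without credentials; it shows only the serialization and the Redshift `COPY` statement a daily run might produce.

```python
import csv
import io
from datetime import date

# Hypothetical column set; the repository's actual schema may differ.
FIELDS = ["id", "title", "score", "num_comments", "created_utc"]


def posts_to_csv(posts):
    """Serialize a list of post dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for post in posts:
        writer.writerow(post)
    return buf.getvalue()


def s3_key_for(run_date):
    """Build a date-stamped object key so each daily run writes its own file."""
    return f"reddit/python/{run_date:%Y%m%d}.csv"


def copy_statement(table, bucket, key, iam_role):
    """Build a Redshift COPY statement that loads one CSV object from S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' CSV IGNOREHEADER 1;"
    )


if __name__ == "__main__":
    # Stand-in for one day's extract from r/Python (made-up values).
    sample = [
        {
            "id": "abc123",
            "title": "Hello, r/Python",
            "score": 10,
            "num_comments": 2,
            "created_utc": 1660000000,
        }
    ]
    key = s3_key_for(date(2022, 8, 27))
    print(posts_to_csv(sample))
    # Placeholder table, bucket, and role names for illustration only.
    print(
        copy_statement(
            "reddit_posts",
            "example-bucket",
            key,
            "arn:aws:iam::123456789012:role/example",
        )
    )
```

Keying each file by run date keeps daily runs from overwriting one another and lets a single day be reloaded into Redshift on its own. In a scheduled setup, each of these steps would typically become its own Airflow task.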