Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline
This repository contains code and documentation for the Amazon Redshift project, showcasing a data pipeline using Amazon S3. Explore how I manage and analyze large datasets in the cloud!
https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline
amazon-redshift-data-pipeline-aws-cloud-computing amazon-s3 datamanagement dataquality redshift-database sql-query vpc-creation vpc-endpoint
Last synced: about 1 month ago
JSON representation
This repository contains code and documentation for the Amazon Redshift project, showcasing a data pipeline using Amazon S3. Explore how I manage and analyze large datasets in the cloud!
- Host: GitHub
- URL: https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline
- Owner: kevinndungu-source
- License: mit
- Created: 2024-02-01T11:46:05.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-08-06T07:30:15.000Z (5 months ago)
- Last Synced: 2024-08-06T09:28:33.954Z (5 months ago)
- Topics: amazon-redshift-data-pipeline-aws-cloud-computing, amazon-s3, datamanagement, dataquality, redshift-database, sql-query, vpc-creation, vpc-endpoint
- Language: JSON
- Homepage: https://drive.google.com/file/d/1EnSiEKzNvtVooPt4Xfal0PzYQiqiJYZN/view?usp=drive_link
- Size: 5.39 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Amazon-Redshift_S3-Data_Pipeline
Welcome to the documentation repository for my Amazon Redshift Project! This repository contains detailed information about the project, including setup instructions, usage guidelines, and insights into data analysis processes.
---
![S3 Redshift](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/assets/114335263/11fafc6b-288a-4163-8cd3-369fd19bd164)
---
## Overview
The Amazon Redshift Project is aimed at demonstrating effective data management and analysis using Amazon Redshift, a fully managed data warehouse service in the cloud. The `Amazon_RedShift_Project_Documentation` pdf file is divided into four main parts:
1. Introduction: Provides an overview of the project objectives and outlines the contents of the documentation.
2. Preparation of the Environment: Details the setup process, including creating a virtual private cloud (**VPC**), setting up **subnets**, creating an Amazon **S3** bucket, and configuring **IAM** roles and **security groups**.3. Interaction with Amazon Redshift: Explains how to create a **Redshift cluster**, configure database tables, import data from S3, and run queries using the Redshift query editor.
4. Conclusion: Summarizes the project outcomes and discusses future considerations.
## Infrastructure as Code (IaC)
For those interested in replicating my cloud infrastructure setup, I provide *Infrastructure as Code* (**IaC**) files in the `Amazon-Redshift_S3-Data_Pipeline` directory. You can find both [YAML](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/blob/main/Project's_Code/Amazon_RedShift_Cloud_Formation_Stack_YAML_Code.yml) and [JSON](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/blob/main/Project's_Code/Amazon_RedShift_Cloud_Formation_Stack_JSON_Code.json) formats of my cloud formation stack templates, enabling easy deployment and scalability.
### Deployment Instructions
1. Clone this repository to your local machine.
2. Navigate to the `Amazon-Redshift_S3-Data_Pipeline` `/Project's_Code` files directory.
3. Choose the appropriate cloud formation stack template (**YAML** or **JSON**) for your deployment.
4. Deploy the stack using the **AWS CloudFormation** console or **CLI**. Refer to the [AWS documentation](https://docs.aws.amazon.com/whitepapers/latest/overview-deployment-options/aws-cloudformation.html) for detailed instructions on deploying CloudFormation stacks.---
## How to Run this project in your own environment
**Clone the Repository:-**
```sh
git clone http://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline