https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline

This repository contains code and documentation for the Amazon Redshift project, showcasing a data pipeline using Amazon S3. Explore how I manage and analyze large datasets in the cloud!
https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline

amazon-redshift-data-pipeline-aws-cloud-computing amazon-s3 datamanagement dataquality redshift-database sql-query vpc-creation vpc-endpoint

Last synced: 10 months ago
JSON representation

This repository contains code and documentation for the Amazon Redshift project, showcasing a data pipeline using Amazon S3. Explore how I manage and analyze large datasets in the cloud!

Host: GitHub
URL: https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline
Owner: kevinndungu-source
License: mit
Created: 2024-02-01T11:46:05.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-08-06T07:30:15.000Z (over 1 year ago)
Last Synced: 2025-01-16T23:43:24.638Z (11 months ago)
Topics: amazon-redshift-data-pipeline-aws-cloud-computing, amazon-s3, datamanagement, dataquality, redshift-database, sql-query, vpc-creation, vpc-endpoint
Language: JSON
Homepage: https://drive.google.com/file/d/1EnSiEKzNvtVooPt4Xfal0PzYQiqiJYZN/view?usp=drive_link
Size: 5.39 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Amazon-Redshift_S3-Data_Pipeline

Welcome to the documentation repository for my Amazon Redshift Project! This repository contains detailed information about the project, including setup instructions, usage guidelines, and insights into data analysis processes.

---

![S3 Redshift](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/assets/114335263/11fafc6b-288a-4163-8cd3-369fd19bd164)

---

## Overview

The Amazon Redshift Project is aimed at demonstrating effective data management and analysis using Amazon Redshift, a fully managed data warehouse service in the cloud. The `Amazon_RedShift_Project_Documentation` pdf file is divided into four main parts:

1. Introduction: Provides an overview of the project objectives and outlines the contents of the documentation.

2. Preparation of the Environment: Details the setup process, including creating a virtual private cloud (**VPC**), setting up **subnets**, creating an Amazon **S3** bucket, and configuring **IAM** roles and **security groups**.

3. Interaction with Amazon Redshift: Explains how to create a **Redshift cluster**, configure database tables, import data from S3, and run queries using the Redshift query editor.

4. Conclusion: Summarizes the project outcomes and discusses future considerations.

## Infrastructure as Code (IaC)

For those interested in replicating my cloud infrastructure setup, I provide *Infrastructure as Code* (**IaC**) files in the `Amazon-Redshift_S3-Data_Pipeline` directory. You can find both [YAML](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/blob/main/Project's_Code/Amazon_RedShift_Cloud_Formation_Stack_YAML_Code.yml) and [JSON](https://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline/blob/main/Project's_Code/Amazon_RedShift_Cloud_Formation_Stack_JSON_Code.json) formats of my cloud formation stack templates, enabling easy deployment and scalability.

### Deployment Instructions

1. Clone this repository to your local machine.
2. Navigate to the `Amazon-Redshift_S3-Data_Pipeline` `/Project's_Code` files directory.
3. Choose the appropriate cloud formation stack template (**YAML** or **JSON**) for your deployment.
4. Deploy the stack using the **AWS CloudFormation** console or **CLI**. Refer to the [AWS documentation](https://docs.aws.amazon.com/whitepapers/latest/overview-deployment-options/aws-cloudformation.html) for detailed instructions on deploying CloudFormation stacks.

---

## How to Run this project in your own environment

**Clone the Repository:-**
```sh
git clone http://github.com/kevinndungu-source/Amazon_Redshift_S3_Data_Pipeline

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/kevinndungu-source/amazon_redshift_s3_data_pipeline

Awesome Lists containing this project

README