https://github.com/gps31320779/insightflow-retail-economic-pipeline

A data engineering portfolio project using AWS cloud services to analyze correlations between Malaysian retail performance and fuel prices. Features Terraform IaC, ETL/ELT with AWS S3, Glue, SQL analytics via Athena coupled with data transformation via dbt, and workflow orchestration with Kestra.

aws-athena aws-batch aws-glue aws-quicksight aws-s3 dataengineering dbt-cloud docker kestra python sql terraform


# 🚀 InsightFlow: Retail Economic Pipeline

![GitHub repo size](https://img.shields.io/github/repo-size/gps31320779/insightflow-retail-economic-pipeline)
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Issues](https://img.shields.io/github/issues/gps31320779/insightflow-retail-economic-pipeline)

**InsightFlow: Retail Economic Pipeline** is a data engineering portfolio project that uses AWS cloud services to analyze the correlation between Malaysian retail performance and fuel prices.

## 📦 Table of Contents

- [Project Overview](#project-overview)
- [Key Features](#key-features)
- [Technologies Used](#technologies-used)
- [Getting Started](#getting-started)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)

## 📊 Project Overview

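InsightFlow examines how Malaysian retail performance correlates with fuel prices. Raw data is stored in AWS S3 and processed with AWS Glue, transformed into analytical models with dbt, queried with SQL in AWS Athena, and orchestrated end to end with Kestra. The analysis measures correlation; it does not by itself show that fuel prices cause changes in retail performance.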
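To make the idea of correlation concrete, here is a minimal sketch in Python of the Pearson correlation coefficient between two monthly series. It is an illustration only, not the project's actual analysis code: the function name `pearson` and the sample values are assumptions made up for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up monthly values, for illustration only (not project data).
fuel_price = [2.05, 2.05, 2.10, 2.15, 2.20, 2.18]
retail_index = [101.2, 100.8, 100.1, 99.5, 98.9, 99.3]

r = pearson(fuel_price, retail_index)
print(f"r = {r:.3f}")
```

A value of `r` near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates a weak one.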

## 🌟 Key Features

- **Infrastructure as Code (IaC)**: Use Terraform to manage cloud resources.
- **ETL/ELT Processes**: Implement data extraction, transformation, and loading with AWS S3 and Glue.
- **SQL Analytics**: Analyze data using AWS Athena.
- **Data Transformation**: Utilize dbt for effective data modeling.
- **Workflow Orchestration**: Manage workflows using Kestra.
- **Data Visualization**: Create dashboards with AWS QuickSight for easy data interpretation.

## 🛠️ Technologies Used

The project uses the following technologies:

- **AWS Services**:
  - AWS S3
  - AWS Glue
  - AWS Athena
  - AWS Batch
  - AWS QuickSight
- **Data Engineering Tools**:
  - dbt (data build tool)
  - Docker
  - Kestra
- **Programming Languages**:
  - Python
  - SQL
- **Infrastructure as Code**:
  - Terraform
- **Database**:
  - PostgreSQL
- **API**:
  - Open API

## 🚀 Getting Started

To get started with the InsightFlow project, follow these steps:

1. **Clone the Repository**:
   ```bash
   git clone https://github.com/gps31320779/insightflow-retail-economic-pipeline.git
   ```

2. **Navigate to the Project Directory**:
   ```bash
   cd insightflow-retail-economic-pipeline
   ```

3. **Start the Docker Services**:
   Ensure you have Docker and Terraform installed, then start the services defined in the Compose file:
   ```bash
   docker-compose up
   ```

4. **Set Up AWS Credentials**:
   Provide your AWS credentials through environment variables or the shared AWS credentials and config files (for example, by running `aws configure`).

5. **Run Terraform**:
   Initialize Terraform, then deploy the infrastructure. Review the plan that `terraform apply` prints before confirming, since it creates billable AWS resources:
   ```bash
   terraform init
   terraform apply
   ```
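Before running Terraform, it can help to confirm that credentials are visible to your shell. The following is a small sketch, not part of the repository: the helper name `missing_aws_env` is hypothetical, while `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard variable names read by the AWS CLI and SDKs.

```python
import os

# Standard environment variable names read by the AWS CLI and SDKs.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env(env=None):
    """Return the required AWS variables that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env()
    if missing:
        print("Not set in the environment:", ", ".join(missing))
        # Credentials may still be supplied by ~/.aws/credentials or an IAM role.
        print("This is fine if you use the shared credentials file or a role.")
    else:
        print("AWS credential environment variables are set.")
```

The check covers only the environment-variable route; an empty result is not required if you rely on the shared credentials file instead.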

## 📈 Usage

After setting up the project, you can start using it to analyze data.

1. **Load Data**:
Use AWS Glue to load your datasets into S3.

2. **Transform Data**:
Use dbt to run transformations on the data.

3. **Analyze Data**:
Use AWS Athena to run SQL queries against your datasets.

4. **Visualize Data**:
Use AWS QuickSight to create visualizations and dashboards.

5. **Orchestrate Workflows**:
Use Kestra to manage and automate your data workflows.
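
As an illustration of the Athena step, the sketch below submits a SQL query through a boto3 Athena client. It is an assumption-laden example rather than the project's own code: the helper `start_athena_query`, the table `monthly_retail_fuel`, its columns, the database name, and the S3 output location are all hypothetical placeholders. The `start_query_execution` call and its parameters are part of the boto3 Athena client.

```python
def start_athena_query(client, sql, database, output_location):
    """Submit sql to Athena and return the query execution ID.

    client is expected to be a boto3 Athena client, e.g. boto3.client("athena").
    """
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical table and column names; adjust them to the project's dbt models.
SQL = """
SELECT sales_month,
       AVG(fuel_price)   AS avg_fuel_price,
       SUM(retail_sales) AS total_retail_sales
FROM monthly_retail_fuel
GROUP BY sales_month
ORDER BY sales_month
"""

# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   query_id = start_athena_query(
#       boto3.client("athena"),
#       SQL,
#       database="insightflow",                      # hypothetical database name
#       output_location="s3://your-bucket/athena/",  # hypothetical results bucket
#   )
```

Passing the client in as an argument keeps the helper easy to test with a stand-in object and avoids creating AWS connections at import time.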

## 🤝 Contributing

We welcome contributions! If you would like to contribute to this project, please follow these steps:

1. **Fork the Repository**.
2. **Create a New Branch**:
```bash
git checkout -b feature/YourFeature
```
3. **Make Your Changes**.
4. **Commit Your Changes**:
```bash
git commit -m "Add Your Feature"
```
5. **Push to the Branch**:
```bash
git push origin feature/YourFeature
```
6. **Open a Pull Request**.

## 📜 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## 📞 Contact

For questions or suggestions, feel free to reach out:

- **GitHub**: [gps31320779](https://github.com/gps31320779)
- **Email**: your-email@example.com

## 📥 Releases

To download the latest releases, visit the [Releases](https://github.com/gps31320779/insightflow-retail-economic-pipeline/releases) section. Here you can find the necessary files to execute the project.

## 🌐 Conclusion

The InsightFlow: Retail Economic Pipeline project demonstrates an end-to-end data engineering workflow on AWS, from infrastructure provisioning with Terraform through transformation with dbt to dashboards in QuickSight. It offers a reusable framework for analyzing retail performance in relation to fuel prices.

Feel free to explore the repository and contribute to its growth. Your insights and improvements are always welcome!