https://github.com/gps31320779/insightflow-retail-economic-pipeline
A data engineering portfolio project using AWS cloud services to analyze correlations between Malaysian retail performance and fuel prices. Features Terraform IaC, ETL/ELT with AWS S3, Glue, SQL analytics via Athena coupled with data transformation via dbt, and workflow orchestration with Kestra.
- Host: GitHub
- URL: https://github.com/gps31320779/insightflow-retail-economic-pipeline
- Owner: gps31320779
- License: apache-2.0
- Created: 2025-04-13T03:27:56.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-04-14T20:00:59.000Z (7 months ago)
- Last Synced: 2025-04-14T20:23:24.247Z (7 months ago)
- Topics: aws-athena, aws-batch, aws-glue, aws-quicksight, aws-s3, dataengineering, dbt-cloud, docker, kestra, python, sql, terraform
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# 🚀 InsightFlow: Retail Economic Pipeline



Welcome to the **InsightFlow: Retail Economic Pipeline** repository! This project is a comprehensive data engineering portfolio piece that leverages AWS cloud services to analyze the relationship between Malaysian retail performance and fuel prices.
## 📦 Table of Contents
- [Project Overview](#project-overview)
- [Key Features](#key-features)
- [Technologies Used](#technologies-used)
- [Getting Started](#getting-started)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)
## 📊 Project Overview
The InsightFlow project examines how fuel prices correlate with retail performance in Malaysia. It uses AWS services to extract, transform, and analyze the underlying data.
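
As a rough illustration of the kind of relationship the project looks at, the sketch below computes a Pearson correlation coefficient between two equally long series. The function name `pearson` and the sample values are hypothetical placeholders, not data or code from this repository.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only (NOT real Malaysian data).
fuel_price = [2.05, 2.10, 2.15, 2.20, 2.30]
retail_index = [100.0, 101.5, 101.0, 103.0, 104.5]
r = pearson(fuel_price, retail_index)
print(f"Pearson r = {r:.3f}")
```

A coefficient near +1 or -1 indicates a strong linear relationship. It does not by itself show that fuel prices cause changes in retail performance.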
## 🌟 Key Features
- **Infrastructure as Code (IaC)**: Use Terraform to manage cloud resources.
- **ETL/ELT Processes**: Implement data extraction, transformation, and loading with AWS S3 and Glue.
- **SQL Analytics**: Analyze data using AWS Athena.
- **Data Transformation**: Utilize dbt for effective data modeling.
- **Workflow Orchestration**: Manage workflows using Kestra.
- **Data Visualization**: Create dashboards with AWS QuickSight for easy data interpretation.
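
To make the SQL analytics feature more concrete, here is a minimal sketch of submitting a query to Athena and waiting for it to finish. It assumes a boto3 Athena client is passed in (for example `boto3.client("athena")`). The helper name `run_athena_query`, the database name, and the S3 output location are hypothetical and do not come from this repository.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit `sql` to Athena and block until it reaches a terminal state.

    `client` is expected to behave like boto3.client("athena").
    Returns the query execution ID on success.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        # QUEUED or RUNNING: wait before polling again.
        time.sleep(poll_seconds)


# Example usage (requires boto3 and AWS credentials; all names are placeholders):
# import boto3
# run_athena_query(boto3.client("athena"), "SELECT 1",
#                  "my_database", "s3://my-bucket/athena-results/")
```

Passing the client in as an argument keeps the helper easy to test with a stub and leaves region and credential handling to the caller.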
## 🛠️ Technologies Used
This project uses the following technologies:
- **AWS Services**:
  - AWS S3
  - AWS Glue
  - AWS Athena
  - AWS Batch
  - AWS QuickSight
- **Data Engineering Tools**:
  - dbt (data build tool)
  - Docker
  - Kestra
- **Programming Languages**:
  - Python
  - SQL
- **Infrastructure as Code**:
  - Terraform
- **Database**:
  - PostgreSQL
- **API**:
  - Open API
## 🚀 Getting Started
To get started with the InsightFlow project, follow these steps:
1. **Clone the Repository**:
```bash
git clone https://github.com/gps31320779/insightflow-retail-economic-pipeline.git
```
2. **Navigate to the Project Directory**:
```bash
cd insightflow-retail-economic-pipeline
```
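3. **Start the Containers**:
Ensure you have Docker and Terraform installed, then build and start the project's containers:
```bash
# With Docker Compose v2, the equivalent command is: docker compose up
docker-compose up
```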
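4. **Set Up AWS Credentials**:
Provide your AWS credentials either through the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables (plus `AWS_DEFAULT_REGION` for the region) or in the shared AWS config files under `~/.aws/`, for example by running `aws configure`.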
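5. **Run Terraform**:
From the directory that contains the Terraform configuration (`.tf` files), deploy the infrastructure:
```bash
terraform init    # download providers and initialize the working directory
terraform plan    # preview the resources that would be created
terraform apply   # create the resources (asks for confirmation first)
```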
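
Before running the steps above, it can help to confirm that the required tools and credentials are in place. The following is a minimal sketch of such a check. The function name `missing_prerequisites` is hypothetical, and it only looks at environment variables, so credentials stored in the AWS config file would need a separate check.

```python
import os
import shutil

REQUIRED_TOOLS = ("git", "docker", "terraform")
REQUIRED_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.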
## 📈 Usage
After setting up the project, you can start using it to analyze data.
1. **Load Data**:
Use AWS Glue to load your datasets into S3.
2. **Transform Data**:
Use dbt to run transformations on the data.
3. **Analyze Data**:
Use AWS Athena to run SQL queries against your datasets.
4. **Visualize Data**:
Use AWS QuickSight to create visualizations and dashboards.
5. **Orchestrate Workflows**:
Use Kestra to manage and automate your data workflows.
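
The steps above depend on one another, which is what an orchestrator such as Kestra keeps track of. As a plain-Python illustration of that ordering (not Kestra's flow syntax, and not taken from this repository), the sketch below declares hypothetical stage names with their dependencies and resolves a valid execution order using the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage names; each stage maps to the set of stages it depends on.
PIPELINE = {
    "load_to_s3": set(),
    "dbt_transform": {"load_to_s3"},
    "athena_analysis": {"dbt_transform"},
    "quicksight_dashboard": {"athena_analysis"},
}


def execution_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


for stage in execution_order(PIPELINE):
    print("run:", stage)
```

A cycle in the dependencies raises `graphlib.CycleError`, which is a cheap way to catch a misconfigured pipeline before anything runs.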
## 🤝 Contributing
We welcome contributions! If you would like to contribute to this project, please follow these steps:
1. **Fork the Repository**.
2. **Create a New Branch**:
```bash
git checkout -b feature/YourFeature
```
3. **Make Your Changes**.
4. **Commit Your Changes**:
```bash
git commit -m "Add Your Feature"
```
5. **Push to the Branch**:
```bash
git push origin feature/YourFeature
```
6. **Open a Pull Request**.
## 📜 License
This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.
## 📞 Contact
For questions or suggestions, feel free to reach out:
- **GitHub**: [gps31320779](https://github.com/gps31320779)
- **Email**: your-email@example.com
## 📥 Releases
To download the latest releases, visit the [Releases](https://github.com/gps31320779/insightflow-retail-economic-pipeline/releases) section. Here you can find the necessary files to execute the project.
## 🌐 Conclusion
The InsightFlow: Retail Economic Pipeline project demonstrates an end-to-end data engineering workflow on AWS, from provisioning infrastructure with Terraform to building dashboards in QuickSight, applied to the relationship between retail performance and fuel prices in Malaysia.
Feel free to explore the repository and contribute improvements.