https://github.com/caogiathinh/caogiathinh


airflow database dataengineer dbt dsa linux python spark sql


# Cao Gia Thịnh
### Data Engineer


Welcome to my GitHub profile!

I'm Cao Gia Thinh, a final-year Computer Science student with a deep focus on Data Engineering. I am passionate about designing and building scalable, high-performance data systems that transform raw data into valuable insights to support business decision-making.

---

## 📊 GitHub Stats






## 🛠️ Tech Stack & Core Competencies


- Python
- SQL
- Apache Spark
- dbt
- Google Cloud
- Docker
- PostgreSQL
- Git
- Kestra

---

## 🚀 Key Projects

These are my flagship projects that showcase my skills and experience.

### 1. [urban-mobility-elt-pipeline](https://github.com/caogiathinh/urban_mobility_elt_pipeline)
*Built a complete data platform on Google Cloud to collect, process, and analyze retail data from various sources.*

- **Orchestration:** Leveraged **Kestra** (deployed on Cloud Composer) to schedule and orchestrate data ingestion pipelines from Parquet files.
- **Data Lake & Warehouse:** Stored raw data in **Google Cloud Storage (GCS)**, then cleaned, transformed, and loaded it into **Google BigQuery** using **Apache Spark**.
- **Data Modeling:** Implemented a **Star Schema** within BigQuery to optimize for analytical queries.
- **Deployment:** Containerized the entire application and its dependencies using **Docker** to ensure consistency across environments.

**Technologies:** `GCP (BigQuery, GCS, Composer)`, `Kestra`, `Apache Spark`, `Docker`, `Python`, `SQL`, `dbt`, `Google Data Studio`.
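
To make the star-schema idea above concrete, here is a minimal sketch in plain Python. It is not code from the repository: the function name `build_star_schema` and the fields `location` and `amount` are hypothetical, chosen only to show how flat records can be split into a dimension table with surrogate keys and a fact table that references it.

```python
# Illustrative only: split flat records into one dimension and one fact table.
# The field names (location, amount) are hypothetical, not taken from the project.

def build_star_schema(rows):
    """Return (dim_location, fact_rows) built from flat input records."""
    location_keys = {}  # natural key (location name) -> surrogate key
    fact_rows = []
    for row in rows:
        # Assign a new surrogate key the first time a location is seen.
        key = location_keys.setdefault(row["location"], len(location_keys) + 1)
        fact_rows.append({"location_key": key, "amount": row["amount"]})
    dim_location = [
        {"location_key": key, "location_name": name}
        for name, key in location_keys.items()
    ]
    return dim_location, fact_rows


raw_rows = [
    {"location": "Downtown", "amount": 12.5},
    {"location": "Airport", "amount": 40.0},
    {"location": "Downtown", "amount": 8.0},
]
dim_location, fact_rows = build_star_schema(raw_rows)
```

In a warehouse such as BigQuery the same split would typically be expressed as SQL or Spark transformations; the point here is only the shape of the model, with analytical queries aggregating the fact table and joining to small dimension tables.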

---

### 2. [modern-data-warehouse](https://github.com/caogiathinh/modern-data-warehouse)
*Designed and implemented a modern data warehouse to empower Sales and Marketing teams with advanced analytics.*

- **ETL & Transformation:** Used SQL to extract, transform, and load data from source systems into the destination data warehouse.
- **Data Warehouse Design:** Architected a DWH schema on **Microsoft SQL Server**.

**Technologies:** `T-SQL`, `MS SQL Server`.
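
As a rough illustration of a SQL-driven transform-and-load step like the one described above, the sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, so the SQL is SQLite dialect rather than T-SQL. The tables `staging_sales` and `fact_sales` and their columns are hypothetical and do not come from the project.

```python
import sqlite3

# In-memory SQLite database standing in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_sales (order_id INTEGER, customer TEXT, amount TEXT);
CREATE TABLE fact_sales (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO staging_sales VALUES
    (1, ' alice ', '10.50'),
    (2, 'BOB', '20'),
    (3, NULL, '5');
""")

# Transform and load in one statement: normalize customer names,
# cast amounts to numbers, and skip rows that have no customer.
conn.execute("""
INSERT INTO fact_sales (order_id, customer, amount)
SELECT order_id, UPPER(TRIM(customer)), CAST(amount AS REAL)
FROM staging_sales
WHERE customer IS NOT NULL
""")
conn.commit()

loaded = conn.execute(
    "SELECT order_id, customer, amount FROM fact_sales ORDER BY order_id"
).fetchall()
```

Keeping the cleaning rules inside a single `INSERT ... SELECT` keeps the transformation declarative and easy to rerun against a fresh staging table.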

## 📫 Let's Connect!

I'm always open to discussing new opportunities, interesting projects, or anything related to data and technology. Feel free to reach out!



- LinkedIn
- Email
