An open API service indexing awesome lists of open source software.

https://github.com/willie-conway/relational-database-administration-capstone-project

🧱 Relational Database Administration Capstone Project focuses on design, secure, optimize, and automate OLTP & Data Warehouse systems using MySQL, PostgreSQL, Apache Airflow, and shell scripting. πŸ’ΎπŸ”πŸ“Šβš™οΈ
https://github.com/willie-conway/relational-database-administration-capstone-project

airflow backup data-pipelines data-warehousing database-admin database-security encryption etl mysql oltp optimization phpmyadmin phppgadmin postgresql restore shell-scripting sql

Last synced: 3 months ago
JSON representation

🧱 Relational Database Administration Capstone Project focuses on design, secure, optimize, and automate OLTP & Data Warehouse systems using MySQL, PostgreSQL, Apache Airflow, and shell scripting. πŸ’ΎπŸ”πŸ“Šβš™οΈ

Awesome Lists containing this project

README

          

# πŸ–₯️ IBM Relational Database Administration Capstone Project




This capstone project highlights the hands-on application of core relational database administration skills by simulating a real-world enterprise environment.
I played the role of a **Database Administrator (DBA)** responsible for designing, managing, and securing both OLTP and OLAP systems using MySQL and PostgreSQL, with automation and performance tuning included.

---

## πŸš€ Project Overview

As a final project in the [IBM Relational Database Administrator Professional Certificate](https://www.coursera.org/professional-certificates/ibm-database-administrator), this capstone involved:

- Designing transactional and analytical databases
- Implementing ETL workflows using Apache Airflow
- Performing backups and automating restore procedures
- Securing data with encryption and granular access control
- Optimizing SQL queries and database memory usage

---

## 🧠 What I Learned

βœ… OLTP and Data Warehouse schema design and deployment
βœ… Data encryption, column-level access control, and secure user management
βœ… Backup, restore, and cron job automation
βœ… Performance tuning with indexing and query optimization
βœ… End-to-end ETL pipeline development using Apache Airflow

---

## 🧰 Skills & Tools Used

- 🐬 MySQL & 🐘 PostgreSQL
- βš™οΈ Apache Airflow
- πŸ” SQL Grants & Data Encryption
- πŸ’Ύ Backup & Restore Scripting (cron, `.sh`)
- πŸ“Š OLTP & OLAP Systems
- 🧱 Schema Design & Indexing
- 🐧 Linux Terminal, Shell Scripting

---

## πŸ“¦ Capstone Modules Breakdown

| Module | Description |
|--------|-------------|
| πŸ”§ **1. OLTP Database & Platform Architecture** | Created MySQL-based OLTP schema with indexes and constraints |
| πŸ—οΈ **2. Data Warehouse Design** | Built dimensional models using PostgreSQL with star schema |
| πŸ” **3. ETL & Pipelines** | Developed Apache Airflow DAGs to automate data ingestion and transformation |
| πŸ’Ύ **4. Backup & Restore Automation** | Scripted full backup and recovery pipelines using cron and shell |
| πŸ” **5. Security & Optimization** | Encrypted sensitive data, configured user access roles, and optimized queries |
| πŸ“€ **6. Final Submission** | Submitted full database system with automation and documentation |

---

## πŸ“‚ Project Assets

```

πŸ“ OLTP Schema & Sales Database (MySQL)
πŸ“ PostgreSQL Data Warehouse
πŸ“ Apache Airflow DAGs & ETL Scripts
πŸ“ Backup/Restore Scripts (cron, bash)
πŸ“ Security Scripts (GRANT, ENCRYPTION)
πŸ“ Optimization SQL (Indexes, EXPLAIN)

```

---

## πŸ§ͺ Lab Highlights

### πŸ” Access Management & Data Encryption
- Defined roles: `db_admin`, `db_analyst`, `db_reporter`, `db_external`
- Encrypted sales amount fields using `VARBINARY` and passphrases
- Column- and table-level GRANT statements

### ♻️ Backup & Restore Automation
- Created cron jobs to back up the sales database every 3 minutes
- Wrote scripts to truncate tables, zip backups, and restore via terminal
- Maintained logs and `.tar.gz` backup archives

### πŸ” ETL with Apache Airflow
- Parsed web logs and loaded structured data to databases
- Implemented DAGs for extract β†’ transform β†’ load
- Used custom Python scripts for ingestion and transformation

### 🧱 Data Warehouse & Reporting
- Created dimensional tables: `DimDate`, `DimCategory`, `DimCountry`, `FactSales`
- Performed OLAP queries using `ROLLUP`, `CUBE`, and `GROUPING SETS`
- Generated Materialized Query Tables (MQTs)

### βš™οΈ Database Optimization
- Created indexes and tested pre/post-query performance
- Analyzed memory usage using MySQL EXPLAIN
- Improved query execution time by optimizing data types and schema

---

## 🧠 Summary

This project brought together all major elements of relational database administration, allowing me to:
- Simulate enterprise-grade database systems
- Automate routine tasks using Linux tools
- Secure and manage data access at scale
- Apply performance tuning methods in real-world contexts

---

## πŸ† Outcome

βœ… **Confident in managing RDBMS systems (MySQL, PostgreSQL)**
βœ… **Experienced with automation, security, and optimization workflows**
βœ… **Ready for DBA or backend-focused roles involving data infrastructure**

---

## πŸ’Ό Ideal For

- Recruiters seeking skilled DBAs or backend engineers
- Employers looking for automated, secure data systems
- Projects requiring optimized RDBMS solutions and ETL integration

---

## πŸ”— Related Projects

- [ETL Automation using Apache Airflow](#)
- [Data Warehouse with PostgreSQL](#)
- [SQL Query Optimization & Indexing](#)

---

## 🀝 Let’s Connect

Explore more of my work and collaborations:
🌐 [My Portfolio](#) | πŸ’Ό [LinkedIn](#) | πŸ“‚ [GitHub Projects](#)