https://github.com/varsha-vraj/airport_parking_toolkit
This toolkit is designed to simulate and manage airport parking events. It provides a command-line interface (CLI) for managing vehicles, zones, and parking events. It includes full integration with PostgreSQL for data storage, SQL for advanced queries, and Apache Spark for big data batch processing of parquet logs.
- Host: GitHub
- URL: https://github.com/varsha-vraj/airport_parking_toolkit
- Owner: varsha-vraj
- License: mit
- Created: 2025-07-19T14:29:37.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-07-19T16:08:59.000Z (3 months ago)
- Last Synced: 2025-08-11T11:02:27.546Z (about 2 months ago)
- Topics: big-data, cli, dataengineering, hadoop, java, parquet-files, poetry, postgresql, pyspark, python3, spark, sqlalchemy
- Language: Python
- Homepage:
- Size: 232 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Airport Parking Management Toolkit
Author: varshaa112003@gmail.com
Project Type: Python CLI Toolkit + PostgreSQL + Apache Spark
Description
This toolkit simulates and manages airport parking events. It provides a command-line interface (CLI) for managing vehicles, zones, and parking events. It uses PostgreSQL for data storage, SQL for advanced queries, and Apache Spark for batch processing of Parquet logs.
Technologies Used

| Technology | Purpose |
| --- | --- |
| Python | Core programming |
| CLI | Command-line interface |
| PostgreSQL | Database backend |
| SQLAlchemy | Database ORM |
| Apache Spark (PySpark) | ETL on Parquet data |
| Poetry | Dependency management |
Folder Structure
airport_parking_project_demo/
│
├── airport_parking_toolkit/
│   ├── cli_tool.py
│   └── spark_jobs.py
│
├── output_parquet/
│   ├── vehicles.parquet
│   ├── parking_zones.parquet
│   └── parking_events.parquet
│
├── tests/
│   └── test_cli.py
│
├── queries/
│   ├── queries.sql
│   └── advanced_sql/
│       └── triggers.sql
│
├── logs/
│   └── parking_etl.log
│
├── load_parquet_to_db.py
├── pyproject.toml
└── README.html
How to Run
- Install dependencies:
poetry install
- Clean old logs (if needed):
rm -rf logs/parking_etl.log
- Run the CLI:
python -m airport_parking_toolkit.cli_tool
- Load Parquet data into PostgreSQL:
python load_parquet_to_db.py
- Run Apache Spark jobs:
python airport_parking_toolkit/spark_jobs.py
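The README does not list the CLI's subcommands. As a rough illustration of how a CLI covering vehicles, zones, and parking events might be organised, here is a minimal sketch using the standard-library `argparse` module. The program name `parking-cli`, the subcommands (`add-vehicle`, `add-zone`, `log-event`), and their arguments are all hypothetical and are not taken from `cli_tool.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical subcommands (not the real cli_tool.py)."""
    parser = argparse.ArgumentParser(prog="parking-cli")
    sub = parser.add_subparsers(dest="command", required=True)

    vehicle = sub.add_parser("add-vehicle", help="register a vehicle")
    vehicle.add_argument("plate")

    zone = sub.add_parser("add-zone", help="register a parking zone")
    zone.add_argument("name")
    zone.add_argument("--capacity", type=int, default=100)

    event = sub.add_parser("log-event", help="record an entry or exit")
    event.add_argument("plate")
    event.add_argument("zone")
    event.add_argument("kind", choices=["entry", "exit"])
    return parser


def describe(args: argparse.Namespace) -> str:
    """Return a human-readable summary of the parsed command."""
    if args.command == "add-vehicle":
        return f"vehicle {args.plate} registered"
    if args.command == "add-zone":
        return f"zone {args.name} registered with capacity {args.capacity}"
    return f"{args.kind} of {args.plate} recorded in zone {args.zone}"
```

In this sketch, `describe(build_parser().parse_args(["add-zone", "P2", "--capacity", "50"]))` returns a one-line summary; a real tool would write to the database at that point instead.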
Images
1. Vehicle Entry/Exit Logging
2. Validate Parking Records
3. Analyze Zone Utilization
4. Track Frequent Parkers
5. Data Quality Issues
6. Compare Parking Zone Performance
7. Schema Diagram
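The record validation and data quality checks named above are not described in detail. A minimal sketch of what such checks could look like follows, written in plain Python rather than PySpark so it runs without a Spark installation. The `ParkingEvent` fields and the two rules (missing plate, exit before entry) are assumptions for illustration, not the project's actual schema or logic.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ParkingEvent:
    # Hypothetical record layout; the real schema is not shown in the README.
    plate: str
    zone: str
    entry_time: datetime
    exit_time: Optional[datetime] = None  # None means the vehicle is still parked


def find_issues(events: list[ParkingEvent]) -> list[str]:
    """Return one message per failed sanity check, in row order."""
    issues = []
    for i, ev in enumerate(events):
        if not ev.plate.strip():
            issues.append(f"row {i}: missing plate")
        if ev.exit_time is not None and ev.exit_time < ev.entry_time:
            issues.append(f"row {i}: exit before entry")
    return issues
```

A record with no exit time is treated as valid here, on the assumption that it represents a vehicle that has not yet left.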
Advanced SQL Features
- Trigger to log insertions into parking_events
- Stored functions for computing durations
- Views for summarised parking statistics
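The three features above can be sketched end to end with Python's built-in `sqlite3` module. This is a stand-in only: the project targets PostgreSQL, whose trigger and function syntax differs, and SQLite has no stored functions, so the duration helper is registered from Python instead. Every table, column, trigger, and view name other than `parking_events` is hypothetical.

```python
import sqlite3
from datetime import datetime


def duration_minutes(entry_ts: str, exit_ts: str) -> float:
    """Stand-in for a stored function: minutes between two ISO timestamps."""
    delta = datetime.fromisoformat(exit_ts) - datetime.fromisoformat(entry_ts)
    return delta.total_seconds() / 60


conn = sqlite3.connect(":memory:")
conn.create_function("duration_minutes", 2, duration_minutes)
conn.executescript(
    """
    CREATE TABLE parking_events (
        id INTEGER PRIMARY KEY,
        plate TEXT NOT NULL,
        zone TEXT NOT NULL,
        entry_time TEXT NOT NULL,
        exit_time TEXT NOT NULL
    );
    CREATE TABLE parking_events_audit (
        event_id INTEGER,
        logged_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Trigger: log every insertion into parking_events.
    CREATE TRIGGER log_event_insert AFTER INSERT ON parking_events
    BEGIN
        INSERT INTO parking_events_audit (event_id) VALUES (NEW.id);
    END;
    -- View: summarised statistics per zone (duration computed inline).
    CREATE VIEW zone_stats AS
    SELECT zone,
           COUNT(*) AS visits,
           AVG((CAST(strftime('%s', exit_time) AS INTEGER)
                - CAST(strftime('%s', entry_time) AS INTEGER)) / 60.0) AS avg_minutes
    FROM parking_events
    GROUP BY zone;
    """
)
conn.executemany(
    "INSERT INTO parking_events (plate, zone, entry_time, exit_time) VALUES (?, ?, ?, ?)",
    [
        ("KA01AB1234", "P1", "2025-07-19 08:00:00", "2025-07-19 09:00:00"),
        ("KA02CD5678", "P1", "2025-07-19 10:00:00", "2025-07-19 10:30:00"),
    ],
)
# The trigger should have written one audit row per inserted event.
audit_rows = conn.execute("SELECT COUNT(*) FROM parking_events_audit").fetchone()[0]
zone_stats = conn.execute("SELECT zone, visits, avg_minutes FROM zone_stats").fetchall()
durations = [
    row[0]
    for row in conn.execute(
        "SELECT duration_minutes(entry_time, exit_time) FROM parking_events ORDER BY id"
    )
]
```

The view computes durations inline rather than calling `duration_minutes`, because some SQLite builds restrict application-defined functions inside views; in PostgreSQL a view could call a stored function directly.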
Logs
ETL logs are written to logs/parking_etl.log after every Spark job run.
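How the Spark jobs configure logging is not shown. One plausible setup, using only the standard-library `logging` module, is sketched below. The function name `build_etl_logger`, the logger name `parking_etl`, and the message format are assumptions; only the default path `logs/parking_etl.log` comes from the README.

```python
import logging
from pathlib import Path


def build_etl_logger(log_path: str = "logs/parking_etl.log") -> logging.Logger:
    """Return a logger that appends to log_path, creating the folder if needed."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("parking_etl")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A job would then call something like `build_etl_logger().info("parking_events batch finished")` at the end of each run. Because the file handler appends, old entries accumulate until the log file is removed.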
Thank You