{"id":28330888,"url":"https://github.com/zainea-bogdan/data_engineer_project_wowcinema","last_synced_at":"2026-05-19T07:02:39.452Z","repository":{"id":293295177,"uuid":"983529702","full_name":"zainea-bogdan/Data_Engineer_Project_WoWCinema","owner":"zainea-bogdan","description":"WoWCinema is a project based on a fictional scenario where I stepped into the role of a Data Engineer, designing and building an end-to-end Data Infrastructure. A ETL pipeline ingests data from multiple sources, transforms it, and loads it into a centralized PostgreSQL data warehouse to power analytics, KPI tracking, and reporting","archived":false,"fork":false,"pushed_at":"2025-05-31T11:01:58.000Z","size":1666,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-31T17:23:12.023Z","etag":null,"topics":["analytics","big-data","data","datawarehousing","etl-pipeline","postgres","python","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zainea-bogdan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-14T14:14:18.000Z","updated_at":"2025-05-31T11:02:02.000Z","dependencies_parsed_at":"2025-05-31T12:24:50.415Z","dependency_job_id":"f5c0e29e-f171-4785-a1b6-7a3d8b8bb595","html_url":"https://github.com/zainea-bogdan/Data_Engineer_Project_WoWCinema","commit_stats":null,"previous_names":["zainea-bogdan/data_engineer_project_wowcinema"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zainea-bogdan%2FData_Engineer_Project_WoWCinema","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zainea-bogdan%2FData_Engineer_Project_WoWCinema/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zainea-bogdan%2FData_Engineer_Project_WoWCinema/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zainea-bogdan%2FData_Engineer_Project_WoWCinema/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zainea-bogdan","download_url":"https://codeload.github.com/zainea-bogdan/Data_Engineer_Project_WoWCinema/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zainea-bogdan%2FData_Engineer_Project_WoWCinema/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259088563,"owners_count":22803657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","big-data","data","datawarehousing","etl-pipeline","postgres","python","sql"],"created_at":"2025-05-26T17:43:38.304Z","updated_at":"2025-10-27T15:05:12.230Z","avatar_url":"https://github.com/zainea-bogdan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **WoWCinema Data Project - Prototype**\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"wowcinema v1 logo.png\" alt=\"Logo\" width=\"600\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eCreated using Canva AI tools. 
I do not claim ownership of the visual elements.\u003cbr\u003e\n  If this image presents an issue, please feel free to contact me.\u003c/em\u003e\n\u003c/p\u003e\n\n![Postgres](https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white)\n![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)\n![Power Bi](https://img.shields.io/badge/power_bi-F2C811?style=for-the-badge\u0026logo=powerbi\u0026logoColor=black)\n\n---\n\nWelcome to my first end-to-end data project — **WoWCinema Repository**. This project is inspired by the [InternIT Repository](https://github.com/romanmurzac/InternIT/tree/main), and serves as a comprehensive record of my learning journey across multiple areas of the data field. It combines both theoretical concepts and practical implementations, with the primary objective of simulating a real-world business environment by designing and implementing a complete data solution, from ingestion and transformation to analytics and visualization, using industry-relevant tools and best practices.\n\n## **Project's Purpose:**\n\nAs mentioned earlier, this repository is intended to document my journey of learning and applying Big Data concepts. 
It is structured in a logical manner that reflects my personal approach to learning, emphasizing how I apply theoretical knowledge to my project-based scenario.\n\nThrough this project, _my objectives_ are to:\n\n- Build a solid understanding of core Big Data concepts and modern data workflows.\n- Simulate a real business context to define meaningful requirements and goals.\n- Improve my ability to structure work, track progress, and document the development process.\n- Gain practical experience with widely used tools and technologies across the data stack (e.g., PostgreSQL, Python, Power BI).\n\nUltimately, this project aims to help me develop both breadth and depth across the Big Data field, preparing me for possible future roles, primarily Data Engineer, Data Analyst, Database Administrator, and Data Warehouse Architect.\n\n## **Business Context**\n\nI am a **Data Engineer** at **WoWCinema**, a young, ambitious Romanian startup in the online movie streaming industry. WoWCinema's mission is to make international film content easily accessible to Romanian audiences, with a long-term vision of promoting Romanian cinema as well and expanding throughout Eastern Europe.\n\nWoWCinema's platform offers a diverse catalog of the latest global releases, and the company is actively working to integrate Romanian titles as part of its identity. Users can choose from three flexible subscription plans, tailored to various budgets. The platform also provides movie recommendations based on user behavior and preferences, along with personalized dashboards and viewer engagement analytics.\n\nTo succeed in a competitive market, WoWCinema relies on data to guide user engagement, retention, and content acquisition strategies.\n\nAs the **Data Engineer** of WoWCinema, I am responsible for designing, building, and maintaining the data infrastructure that powers the company's reporting, analytics, and internal decision-making systems. 
My objective is to ensure that data is:\n\n- **Precisely collected** from the right sources (e.g., external databases, internal systems).\n- **Securely stored** in a well-structured and scalable data warehouse.\n- **Efficiently processed** through a robust ETL pipeline.\n- **Governed and compliant** with applicable data privacy regulations, such as GDPR.\n- **Analytics-ready**, enabling data analysts and business stakeholders to measure company performance and make informed, data-driven decisions.\n\nFor more details on core business goals, required [reports](./Business_Requirements/README.md#Reports), desired [dashboards](./Business_Requirements/README.md#Dashboards), and key performance indicators ([KPIs](./Business_Requirements/README.md#KPIs)), please refer to the folder named: [Business Requirements](./Business_Requirements/README.md).\n\n## Table of Contents\n\n- [Business_Requirements](./Business_Requirements/)\n- [Dashboards_Power_BI](./Dashboards_Power_BI/)\n- [Data_Warehouse_Arhitecture](./Data_Warehouse_Arhitecture/)\n  - [src](./Data_Warehouse_Arhitecture/src/)\n    - [bronze](./Data_Warehouse_Arhitecture/src/bronze/)\n      - [creating](./Data_Warehouse_Arhitecture/src/bronze/creating/)\n      - [inserting](./Data_Warehouse_Arhitecture/src/bronze/inserting/)\n      - [selecting](./Data_Warehouse_Arhitecture/src/bronze/selecting/)\n      - [sequences](./Data_Warehouse_Arhitecture/src/bronze/sequences/)\n    - [gold](./Data_Warehouse_Arhitecture/src/gold/)\n      - [creating](./Data_Warehouse_Arhitecture/src/gold/creating/)\n    - [schemas](./Data_Warehouse_Arhitecture/src/schemas/)\n    - [silver](./Data_Warehouse_Arhitecture/src/silver/)\n      - [creating](./Data_Warehouse_Arhitecture/src/silver/creating/)\n      - [inserting](./Data_Warehouse_Arhitecture/src/silver/inserting/)\n- [ETL_pipeline](./ETL_pipeline/)\n  - [src](./ETL_pipeline/src/)\n    - [extract](./ETL_pipeline/src/extract/)\n    - [load](./ETL_pipeline/src/load/)\n    - 
[transform](./ETL_pipeline/src/transform/)\n- [IMDb](./IMDb/)\n\n## Repository tree:\n\n- The structure of the repository is outlined below. Each main folder includes:\n  - A **`README.md`** file, which provides detailed explanations of the folder's contents along with instructions for practical tasks.\n  - A **`src/`** subfolder, which contains the source code relevant to that specific component.\n\n```\n|\n+---Business_Requirements\n|       README.md\n|\n+---Dashboards_Power_BI\n|       Subscription Plan Performance Overview.pdf\n|       User Engagement \u0026 Activity Overview.pdf\n|\n+---Data_Warehouse_Arhitecture\n|   |   BSG_arhitecture.drawio\n|   |   README.md\n|   |\n|   \\---src\n|       +---bronze\n|       |   +---creating\n|       |   |       create_table_logs_system.sql\n|       |   |       create_table_name_basics.sql\n|       |   |       create_table_ratings.sql\n|       |   |       create_table_subscription_plans.sql\n|       |   |       create_table_title_basics.sql\n|       |   |       create_table_title_crew.sql\n|       |   |       create_table_title_episodes.sql\n|       |   |       create_table_users.sql\n|       |   |\n|       |   +---inserting\n|       |   |       insert_logs.sql\n|       |   |       insert_subscriptions.sql\n|       |   |       insert_title_basics.sql\n|       |   |       insert_title_directors.sql\n|       |   |       insert_title_director_names.sql\n|       |   |       insert_title_episodes.sql\n|       |   |       insert_title_ratings.sql\n|       |   |       insert_users.sql\n|       |   |\n|       |   +---selecting\n|       |   |       select_all_directors_codes_from_crew.sql\n|       |   |       select_all_users.sql\n|       |   |       select_id_users.sql\n|       |   |       select_subscription_start_date.sql\n|       |   |       select_tconst_basics.sql\n|       |   |\n|       |   \\---sequences\n|       |           logs_counter.sql\n|       |           user_counter.sql\n|       |\n|       +---gold\n|       |   
+---creating\n|       |   |       creating_gold_layer_structure_and_data.sql\n|       |   |       creating_views_for_reports.sql\n|       |   |\n|       |   +---inserting\n|       |   \\---selecting\n|       +---schemas\n|       |       create_bronze_schema.sql\n|       |       create_gold_schema.sql\n|       |       create_silver_schema.sql\n|       |\n|       \\---silver\n|           +---creating\n|           |       create_dim_reactions.sql\n|           |       create_dim_regions.sql\n|           |       create_dim_subscriptions.sql\n|           |       create_dim_titles.sql\n|           |       create_dim_users.sql\n|           |       create_table_fact_logs.sql\n|           |       judete_romania.txt\n|           |\n|           +---inserting\n|           |       insert_dim_reactions.sql\n|           |       insert_dim_regions.sql\n|           |       insert_dim_subscriptions.sql\n|           |       insert_dim_titles.sql\n|           |       insert_dim_users.sql\n|           |       insert_fact_logs.sql\n|           |\n|           \\---selecting\n+---ETL_pipeline\n|   |   README.md\n|   |\n|   \\---src\n|       +---extract\n|       |       bronze_layer_structure.py\n|       |       extract_director_name.py\n|       |       extract_logs.py\n|       |       extract_subscriptions.py\n|       |       extract_title_basics.py\n|       |       extract_title_crew.py\n|       |       extract_title_episodes.py\n|       |       extract_title_ratings.py\n|       |       extract_users.py\n|       |\n|       +---load\n|       |       loader.py\n|       |\n|       \\---transform\n|               creating_silver_layer.py\n|               
dim_region.py\n|\n\\---IMDb\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzainea-bogdan%2Fdata_engineer_project_wowcinema","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzainea-bogdan%2Fdata_engineer_project_wowcinema","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzainea-bogdan%2Fdata_engineer_project_wowcinema/lists"}