https://github.com/gabor-gabor/ii.-data-science-competition
Conducted Exploratory Data Analysis on a real-world but anonymized dataset from the MORGENS hotel management system
dashboard data-modeling dax-functions m-language pandas-python powerbi powerquery
- Host: GitHub
- URL: https://github.com/gabor-gabor/ii.-data-science-competition
- Owner: gabor-gabor
- Created: 2025-03-11T18:56:09.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-21T08:20:27.000Z (about 1 year ago)
- Last Synced: 2025-06-21T09:26:22.205Z (about 1 year ago)
- Topics: dashboard, data-modeling, dax-functions, m-language, pandas-python, powerbi, powerquery
- Language: Jupyter Notebook
- Homepage:
- Size: 6.22 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# II.-Data-Science-Competition
Exploratory Data Analysis (EDA) of a real-world, anonymized dataset from the [MORGENS](https://morgens.hu/) hotel management system.
## 1) Project Background
This project focuses on analyzing a comprehensive dataset from the hotel and accommodation industry, covering:
- **Marketing effectiveness**
- **Website traffic and user behavior**
- **Booking engine activity**
- **Reservation trends and occupancy patterns**
### Objectives
The goal is to uncover actionable insights and develop data-driven strategies to optimize and enhance accommodation bookings by:
- Enhancing marketing efficiency to drive higher returns on investment.
- Improving website functionality and booking engine efficiency to increase user engagement.
- Identifying key booking behavior trends to capitalize on high-value opportunities.
- Providing a strategic framework for boosting hotel booking system performance and profitability in a competitive hospitality market.
## 2) Project Goals and Outcomes
### **Improving the marketing and booking efficiency of a hotel-industry company to increase its profitability.**
### **Key Contributions**
- **Data preparation and cleansing**
- **In-depth Exploratory Data Analysis (EDA):**
  - Examination of search trends
  - Conversion rate analysis
  - Insights for revenue and yield management, as well as campaign optimization
  - Assessment of advertising spend and PPC performance
- **Strategic recommendations** based on insights to enhance business profitability
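
As a rough illustration of the conversion rate and PPC checks listed above, the following pandas sketch computes a few per-channel metrics. The sample rows and every column name (`channel`, `sessions`, `clicks`, `bookings`, `ad_spend`, `revenue`) are assumptions made for this example and do not reflect the actual dataset schema or the formulas used in the notebooks.

```python
import pandas as pd

# Hypothetical sample data: values and column names are illustrative only.
channels = pd.DataFrame(
    {
        "channel": ["google_ads", "facebook_ads", "organic"],
        "sessions": [1000, 500, 2000],
        "clicks": [400, 250, 0],
        "bookings": [50, 10, 80],
        "ad_spend": [200.0, 100.0, 0.0],
        "revenue": [5000.0, 800.0, 8000.0],
    }
)


def channel_kpis(df: pd.DataFrame) -> pd.DataFrame:
    """Add conversion rate, cost per click, and return on ad spend per channel."""
    out = df.copy()
    # Share of sessions that ended in a booking.
    out["conversion_rate"] = out["bookings"] / out["sessions"]
    # Mask zero denominators so unpaid channels get NaN instead of inf.
    paid_clicks = out["clicks"].where(out["clicks"] > 0)
    paid_spend = out["ad_spend"].where(out["ad_spend"] > 0)
    out["cost_per_click"] = out["ad_spend"] / paid_clicks
    out["roas"] = out["revenue"] / paid_spend
    return out


kpis = channel_kpis(channels)
print(kpis[["channel", "conversion_rate", "cost_per_click", "roas"]])
```

Masking zero spend and zero clicks keeps organic traffic in the table without producing infinite ratios, which would otherwise distort any ranking of paid channels.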
### **Datasets**
- **1 CSV file** – Website activity data
- **1 CSV file** – Marketing channel data
- **1 CSV file** – Occupancy data
- **8 CSV files** – Search and booking data
- **3 Jupyter Notebook (.ipynb) files** – Data preparation process
- **1 Power BI (.pbix) file** – Data model
- **1 PDF file** – Presentation
### **Goals**
To process and analyze marketing and booking data across multiple hotel properties, aiming to optimize booking opportunities and enhance profitability.
## 3) Appendix
### **Methodologies**
- Data Cleaning
- Data Preparation and Exploration
- Statistical Metrics Investigation
- Data Visualization
- Identifying Business Opportunities
- Advertising Cost and Medium Performance Analysis
- Presentation of Actionable Insights
### **IT Tools**
- Python
- Power BI
### **Data Preprocessing and Model Building**
- Data cleaning and preparation using Python
- Handling outliers and missing data using Power Query in Power BI
- Joining multiple data tables in Power BI
- Building a unified data model in Power BI
- Implementing a **Star Schema** by creating an independent date dimension table and linking it to fact tables
- Creating and optimizing **DAX formulas**, including complex functions in Power BI
- Developing **interactive dashboards** in Power BI
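
The star schema step above relies on an independent date dimension linked to the fact tables. In this project that table lives in Power BI; the sketch below only shows the same idea in pandas. The function name `build_date_dimension`, the integer `date_key` format, and the tiny `bookings` fact table are all hypothetical.

```python
import pandas as pd


def build_date_dimension(start: str, end: str) -> pd.DataFrame:
    """Build one row per calendar day with common reporting attributes."""
    dim = pd.DataFrame({"date": pd.date_range(start=start, end=end, freq="D")})
    # Integer surrogate key in YYYYMMDD form (an assumed convention).
    dim["date_key"] = dim["date"].dt.strftime("%Y%m%d").astype(int)
    dim["year"] = dim["date"].dt.year
    dim["month"] = dim["date"].dt.month
    dim["day_of_week"] = dim["date"].dt.dayofweek  # Monday = 0, Sunday = 6
    dim["is_weekend"] = dim["day_of_week"] >= 5
    return dim


# Hypothetical fact table with one row per booking.
bookings = pd.DataFrame(
    {
        "booking_date": pd.to_datetime(["2024-01-06", "2024-01-08"]),
        "revenue": [120.0, 95.0],
    }
)
bookings["date_key"] = bookings["booking_date"].dt.strftime("%Y%m%d").astype(int)

date_dim = build_date_dimension("2024-01-01", "2024-01-31")

# Link the fact table to the dimension through the shared key.
joined = bookings.merge(date_dim, on="date_key", how="left")
print(joined[["booking_date", "revenue", "day_of_week", "is_weekend"]])
```

Keeping the calendar attributes in one dimension table means every fact table (searches, bookings, occupancy) can be filtered by the same year, month, or weekend flag.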
### **Current Capabilities & Key Features**
The model is designed to rerun all data manipulations, analyses, and reports automatically with a single click as soon as the next month's source tables are added to the source data folder.
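
The folder-based refresh described above is handled inside Power BI in this project. As a minimal sketch of the same pattern in pandas, assuming each month arrives as a CSV file with identical columns, the hypothetical helper `load_monthly_tables` below picks up every file in a folder, so a newly added month is included on the next run. The file names and columns are invented for the demonstration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_monthly_tables(folder: Path, pattern: str = "*.csv") -> pd.DataFrame:
    """Read every matching CSV in the folder and stack them into one table."""
    files = sorted(folder.glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files matching {pattern!r} in {folder}")
    frames = []
    for path in files:
        frame = pd.read_csv(path)
        frame["source_file"] = path.name  # keep track of each row's origin
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Demonstration with two made-up monthly files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "bookings_2024_01.csv").write_text("booking_id,revenue\n1,120.0\n2,95.0\n")
    (folder / "bookings_2024_02.csv").write_text("booking_id,revenue\n3,210.0\n")
    combined = load_monthly_tables(folder)

print(combined)
```

Sorting the file list keeps the row order stable between runs, and the `source_file` column makes it easy to trace a suspicious value back to the month it came from.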
This analysis was originally developed for an internal competition organized by the [data36.com](https://data36.com/) data science club, where it earned me the **Medior Special Prize**.