https://github.com/alexondata/daan_eda-exploratory-data-analysis_ecommerce
This project presents an Exploratory Data Analysis (EDA) pipeline for an eCommerce dataset, integrating Python, SQL Server, and Power BI to transform raw transactional data into meaningful business insights. The project was developed as part of an academic assignment at Transilvania University of Brașov, Faculty of Mathematics and Computer Science.
- Host: GitHub
- URL: https://github.com/alexondata/daan_eda-exploratory-data-analysis_ecommerce
- Owner: AlexOnData
- Created: 2025-09-20T14:43:47.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-09-20T15:00:13.000Z (9 months ago)
- Last Synced: 2025-09-20T16:42:37.177Z (9 months ago)
- Topics: data-analysis, data-visualization, ecommerce, microsoft-sql-server, powerbi, python
- Language: Python
- Homepage:
- Size: 58.6 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Power BI Project - E-Commerce Data Analysis (EDA)
⚠️ **Disclaimer:** The dataset used in this project is **fictitious**.
This dashboard was created **only as a presentation model** and should not be interpreted as real operational data.
---
## Description
**Application access:** _[DaAn_EDA-Exploratory-Data-Analysis_eCommerce](https://app.powerbi.com/view?r=eyJrIjoiNGI5NzNhZGUtMDA1Yy00MDNjLWJlNTAtOTY4YWM5MjJkMmMwIiwidCI6IjU5ZTJkYTQzLWI1N2UtNDA4Ny05OGEwLWI1NDlmODczNzE0MiIsImMiOjl9)_
This project presents an **Exploratory Data Analysis (EDA)** pipeline for an **e-commerce dataset**, integrating **Python**, **SQL Server**, and **Power BI** to transform raw transactional data into meaningful business insights.
The project was developed as part of an academic assignment at *Transilvania University of Brașov*, Faculty of Mathematics and Computer Science.
---
## Project Overview
The main objective of this project is to convert a real-world dataset of online retail transactions into a structured database and create an **interactive Power BI dashboard** for analyzing sales, customers, and geographical distributions.
The dataset used is **[Online Retail II](https://archive.ics.uci.edu/dataset/502/online+retail+ii)**, containing transactions from a UK-based online store between **2009–2011**.
---
## Tech Stack
- **Python** - data cleaning, preprocessing, and ETL
  - Libraries: `pandas`, `numpy`, `pyodbc`, `sqlalchemy`, `matplotlib`, `seaborn`
- **Microsoft SQL Server 2021 Developer Edition** - relational database for structured storage
- **Power BI** - interactive dashboard, DAX measures, data visualization
---
## Project Workflow
1. **Data Extraction & Preprocessing (Python)**
   - Load Excel sheets (2009–2010, 2010–2011) into a unified DataFrame
   - Handle missing values and normalize the data
   - Prepare data for SQL insertion
2. **Data Loading (SQL Server)**
   - Create `OnlineRetailDB` database and `OnlineRetail` table
   - Insert >1,000,000 rows via Python with error handling
   - Ensure proper datatypes for efficient queries
3. **Data Modeling (Power BI)**
   - Build a **Calendar table** in DAX for time-based analysis
   - Define KPIs:
     - `TotalSales`
     - `NumberOfClients`
     - `TotalOrders`
4. **Data Visualization (Power BI)**
   - **Matrix Table** - sales aggregated by country and month
   - **Slicers** - filters for year, month, and country
   - **Area Chart** - monthly sales evolution
   - **Cards & Donut Charts** - KPIs and country sales proportions
   - **Map Visualization** - geographical distribution of sales
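
Step 1 of the workflow could look roughly like the sketch below. It is not the repository's script: the sheet names, the column names (`Invoice`, `Quantity`, `InvoiceDate`, `Price`, `Customer ID`, `Country`, `Description`), the helper names `load_workbook` and `clean_transactions`, and the choice to drop rows without a customer ID are all assumptions that should be checked against the actual workbook and the project's own cleaning rules.

```python
import pandas as pd

# Assumed sheet names of the Online Retail II workbook; verify against the file.
SHEETS = ["Year 2009-2010", "Year 2010-2011"]


def load_workbook(path: str) -> pd.DataFrame:
    """Read both yearly sheets and stack them into one DataFrame."""
    frames = pd.read_excel(path, sheet_name=SHEETS)  # dict: sheet name -> DataFrame
    return pd.concat(list(frames.values()), ignore_index=True)


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Tidy column values, drop unusable rows, and add a line total."""
    out = df.rename(columns={"Customer ID": "CustomerID"}).copy()
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"], errors="coerce")
    out["Description"] = out["Description"].fillna("").astype(str).str.strip()
    out["Country"] = out["Country"].astype(str).str.strip()
    # Assumption: rows without a customer or a valid timestamp are dropped.
    out = out.dropna(subset=["CustomerID", "InvoiceDate"])
    out["CustomerID"] = out["CustomerID"].astype("int64")
    out["Invoice"] = out["Invoice"].astype(str)  # cancellations carry a "C" prefix
    out["LineTotal"] = out["Quantity"] * out["Price"]
    return out.reset_index(drop=True)
```

Keeping the cleaning logic in a function that takes a DataFrame, separate from the Excel read, makes it easy to try on a handful of hand-written rows before running it on the full file.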
---
## Results
The final **Power BI Dashboard** provides:
- Sales trends over time (year, quarter, month, day)
- Customer behavior and purchasing patterns
- Top-performing countries by revenue
- Interactive filtering for custom insights
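
The dashboard figures can be sanity-checked outside Power BI. The sketch below recomputes the three KPIs and a country-by-month revenue table in pandas. The DAX definitions are not given in this README, so the formulas are assumptions: `TotalSales` as the sum of `Quantity * Price`, `NumberOfClients` as distinct `CustomerID` values, and `TotalOrders` as distinct `Invoice` values. The column names and the helper names `kpis` and `sales_by_country_month` are likewise illustrative.

```python
import pandas as pd


def kpis(df: pd.DataFrame) -> dict:
    """Recompute the three dashboard KPIs from transaction rows."""
    return {
        "TotalSales": float((df["Quantity"] * df["Price"]).sum()),
        "NumberOfClients": int(df["CustomerID"].nunique()),
        "TotalOrders": int(df["Invoice"].nunique()),
    }


def sales_by_country_month(df: pd.DataFrame) -> pd.DataFrame:
    """Country x month revenue table, comparable to the matrix visual."""
    month = pd.to_datetime(df["InvoiceDate"]).dt.to_period("M").astype(str)
    revenue = df["Quantity"] * df["Price"]
    # Rows are countries, columns are "YYYY-MM" strings; empty cells become 0.
    return revenue.groupby([df["Country"], month]).sum().unstack(fill_value=0.0)
```

If the pandas numbers and the Power BI cards disagree, the usual suspects are cancelled invoices, rows without a customer ID, or a slicer that is still active on the report page.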
This workflow can easily be extended to:
- Connect to live APIs or multiple data sources
- Automate updates with scheduled Python scripts
- Integrate predictive models for sales forecasting
---
## Dashboards
### Dashboard 1 - Sales Overview
- **KPI Cards** - Total Sales, Total Orders, Number of Clients.
- **Donut charts** - percentage distribution of sales and orders by country.
- **Line/Area chart** - yearly sales trend.
- **Matrix table** - aggregated sales by country and time.
- **Slicers** - filters for Year, Month, and Quarter.
➡️ Example visualization:

---
### Dashboard 2 - Geographical Analysis
- **Map visualization** - global distribution of total sales.
  - Interactive zoom and hover for country-level insights.
- **Country ranking table** - total sales per region.
- Same slicers (Year, Month, Quarter) for filtering.
➡️ Example visualization:

---
## Getting Started
1. Clone this repository:
```
git clone https://github.com/AlexOnData/DaAn_EDA-Exploratory-Data-Analysis_eCommerce.git
cd DaAn_EDA-Exploratory-Data-Analysis_eCommerce
```
2. Install dependencies (Python ≥ 3.9 recommended):
```
pip install pandas numpy pyodbc sqlalchemy matplotlib seaborn
```
3. Set up SQL Server (Developer Edition recommended).
4. Run the Python scripts to load the dataset into SQL Server.
5. Open the provided Power BI file and connect it to your SQL database.
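
The loading scripts themselves are not reproduced in this README. As a rough illustration of what step 4 involves, the sketch below appends a DataFrame to the `OnlineRetail` table in chunks and records failed chunks instead of aborting. The function name `load_in_chunks`, the chunk size, and the connection details in the comment are assumptions; an in-memory SQLite connection stands in for SQL Server so the example runs without a server.

```python
import sqlite3

import pandas as pd


def load_in_chunks(df, con, table="OnlineRetail", chunk_rows=50_000):
    """Append df to `table` chunk by chunk; collect failures instead of aborting."""
    inserted, failed = 0, []
    for start in range(0, len(df), chunk_rows):
        chunk = df.iloc[start:start + chunk_rows]
        try:
            chunk.to_sql(table, con, if_exists="append", index=False)
            inserted += len(chunk)
        except Exception as exc:  # broad on purpose: record the chunk and continue
            failed.append((start, repr(exc)))
    return inserted, failed


if __name__ == "__main__":
    # Stand-in connection so the sketch runs anywhere. For SQL Server, pass a
    # SQLAlchemy engine instead, e.g. (hypothetical server and driver names):
    #   create_engine("mssql+pyodbc://@localhost/OnlineRetailDB"
    #                 "?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes",
    #                 fast_executemany=True)
    con = sqlite3.connect(":memory:")
    demo = pd.DataFrame(
        {"Invoice": ["489434", "489435"], "Quantity": [12, 3], "Price": [6.95, 2.55]}
    )
    print(load_in_chunks(demo, con, chunk_rows=1))
```

Returning the start offsets of failed chunks makes it possible to retry only those slices after fixing the cause, rather than reloading more than a million rows.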