https://github.com/s-narasimman/zepto_inventory_sql_data_analysis
This project focuses on data cleaning, exploration, and analysis of product information from the Zepto dataset using SQL. It provides actionable insights into pricing, stock availability, discounts, and category-level performance.
aggregation categorization csv data-analysis data-cleaning kaggle postgresql sql zepto
- Host: GitHub
- URL: https://github.com/s-narasimman/zepto_inventory_sql_data_analysis
- Owner: S-Narasimman
- License: mit
- Created: 2025-10-21T08:03:34.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-10-21T08:23:05.000Z (8 months ago)
- Last Synced: 2025-10-21T10:24:18.722Z (8 months ago)
- Topics: aggregation, categorization, csv, data-analysis, data-cleaning, kaggle, postgresql, sql, zepto
- Homepage:
- Size: 83 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Zepto Data Analysis Using SQL
This project focuses on **data cleaning, exploration, and analysis** of product information from the **Zepto** dataset using SQL.
It provides actionable insights into **pricing**, **stock availability**, **discount strategies**, and **category-level performance**.
**Dataset Source:** The dataset was obtained from **Kaggle**, contributed by **Palvinder**.
---
## Key Objectives
* Explore and clean raw product data to ensure accuracy and consistency.
* Analyze discount trends, pricing strategies, and stock status.
* Derive insights on product performance, revenue, and value metrics.
---
## 🧩 SQL Operations Performed
### 1️⃣ Table Creation & Data Exploration
* Created the `zepto` table with detailed product-level fields.
* Checked for null values, duplicate rows, and anomalies.
* Checked product availability (in-stock vs out-of-stock).
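
A minimal sketch of this step follows. The column names `name`, `category`, `mrp`, `discountPercent`, `discountedSellingPrice`, and `availableQuantity` appear in the example queries in this README; `sku_id`, `weightInGms`, `outOfStock`, `quantity`, and all column types are assumptions about the dataset's layout and may need adjusting.

```sql
-- Hypothetical schema: the types and the columns sku_id, weightInGms,
-- outOfStock, and quantity are assumptions, not taken from the repository.
CREATE TABLE IF NOT EXISTS zepto (
    sku_id                 SERIAL PRIMARY KEY,
    category               VARCHAR(120),
    name                   VARCHAR(150) NOT NULL,
    mrp                    NUMERIC(10, 2),
    discountPercent        NUMERIC(5, 2),
    availableQuantity      INTEGER,
    discountedSellingPrice NUMERIC(10, 2),
    weightInGms            INTEGER,
    outOfStock             BOOLEAN,
    quantity               INTEGER
);

-- Rows with a missing value in any key column
SELECT *
FROM zepto
WHERE name IS NULL
   OR category IS NULL
   OR mrp IS NULL
   OR discountPercent IS NULL
   OR discountedSellingPrice IS NULL
   OR availableQuantity IS NULL;

-- Product names that appear more than once (possible duplicates)
SELECT name, COUNT(*) AS sku_count
FROM zepto
GROUP BY name
HAVING COUNT(*) > 1
ORDER BY sku_count DESC;

-- In-stock vs out-of-stock product counts
SELECT outOfStock, COUNT(*) AS product_count
FROM zepto
GROUP BY outOfStock;
```
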
### 2️⃣ Data Cleaning
* Removed invalid records where MRP = 0.
* Converted price data from **paise to rupees** for consistency.
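
The two cleaning steps could look like the following sketch. It assumes `mrp` and `discountedSellingPrice` are numeric columns stored in paise. The conversion should be run only once, since repeating it would divide the prices by 100 again.

```sql
-- Remove records with an invalid MRP of 0
DELETE FROM zepto
WHERE mrp = 0;

-- Convert paise to rupees (100 paise = 1 rupee)
UPDATE zepto
SET mrp = mrp / 100.0,
    discountedSellingPrice = discountedSellingPrice / 100.0;

-- Spot-check the converted values
SELECT name, mrp, discountedSellingPrice
FROM zepto
LIMIT 10;
```
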
### 3️⃣ Data Analysis & Insights
| Query | Description |
| :----- | :----------------------------------------------------------- |
| **Q1** | Top 10 best-value products based on discount percentage. |
| **Q2** | High-MRP products that are out of stock. |
| **Q3** | Estimated revenue generated by each category. |
| **Q4** | Premium products (MRP > ₹500) with minimal discounts (<10%). |
| **Q5** | Top 5 categories offering the highest average discounts. |
| **Q6** | Price-per-gram calculation to determine best-value items. |
| **Q7** | Weight-based classification: *Low*, *Medium*, *Bulk*. |
| **Q8** | Total inventory weight per category. |
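
The weight-based queries (Q6 to Q8) might be written as in the sketch below. The `weightInGms` column name and the 1 kg and 5 kg cut-offs for the *Low*, *Medium*, and *Bulk* labels are illustrative assumptions; this README does not state them.

```sql
-- Q6: Price per gram, cheapest first (weightInGms is an assumed column name)
SELECT DISTINCT name, weightInGms, discountedSellingPrice,
       ROUND(discountedSellingPrice::numeric / weightInGms, 2) AS price_per_gram
FROM zepto
WHERE weightInGms > 0              -- skip rows that would divide by zero
ORDER BY price_per_gram;

-- Q7: Weight-based classification (cut-offs are illustrative assumptions)
SELECT DISTINCT name, weightInGms,
       CASE
           WHEN weightInGms < 1000 THEN 'Low'
           WHEN weightInGms < 5000 THEN 'Medium'
           ELSE 'Bulk'
       END AS weight_category
FROM zepto
WHERE weightInGms IS NOT NULL;     -- NULL weights would otherwise fall into 'Bulk'

-- Q8: Total inventory weight per category, in grams
SELECT category,
       SUM(weightInGms::bigint * availableQuantity) AS total_weight_gms
FROM zepto
GROUP BY category
ORDER BY total_weight_gms DESC;
```
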
---
## 💡 Key Insights
* **High discounts** identify the best-value products, which are the most likely to attract customers.
* **Premium items (>₹500)** typically carry **lower discounts**, which helps maintain brand value.
* **Bulk and medium-weight** items account for most of the total inventory weight.
* **High-MRP products** that are out of stock suggest **strong customer demand**.
---
## 🧠 Tech Stack
* **Language:** SQL
* **Database:** PostgreSQL
* **Focus Areas:**
  * Data Cleaning
  * Aggregation
  * Analytical Querying
  * Business Insight Generation
---
## 🧾 Example Queries
```sql
-- Q1: Top 10 best-value products, ranked by discount percentage
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;

-- Q3: Estimated revenue by category
-- (selling price x units in stock, i.e. revenue if all available stock sells)
SELECT category,
       SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue;
```
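
The threshold-based queries from the table in the analysis section could be sketched in the same style. The Q4 thresholds (MRP > ₹500, discount < 10%) come from that table; the `outOfStock` column name and the ₹300 cut-off for "high MRP" in Q2 are assumptions, and prices are assumed to be in rupees already.

```sql
-- Q2: High-MRP products that are out of stock
-- (the outOfStock column and the 300-rupee cut-off are assumptions)
SELECT DISTINCT name, mrp
FROM zepto
WHERE outOfStock = TRUE
  AND mrp > 300
ORDER BY mrp DESC;

-- Q4: Premium products (MRP > 500) with discounts below 10%
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
WHERE mrp > 500
  AND discountPercent < 10
ORDER BY mrp DESC, discountPercent DESC;

-- Q5: Top 5 categories by average discount
SELECT category,
       ROUND(AVG(discountPercent)::numeric, 2) AS avg_discount
FROM zepto
GROUP BY category
ORDER BY avg_discount DESC
LIMIT 5;
```
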
## 🧑‍💻 Author
**Narasimman S**
Chennai | Data Analyst | SQL & Data Science Enthusiast
---