https://github.com/s-narasimman/zepto_inventory_sql_data_analysis

# 🛒 Zepto Data Analysis Using SQL

This project focuses on **data cleaning, exploration, and analysis** of product information from the **Zepto** dataset using SQL.
It provides actionable insights into **pricing**, **stock availability**, **discount strategies**, and **category-level performance**.

📂 **Dataset Source:** The dataset was obtained from **Kaggle**, contributed by **Palvinder**.

---

## 📊 Key Objectives

* Explore and clean raw product data to ensure accuracy and consistency.
* Analyze discount trends, pricing strategies, and stock status.
* Derive insights on product performance, revenue, and value metrics.

---

## 🧩 SQL Operations Performed

### 1️⃣ Table Creation & Data Exploration

* Created the `zepto` table with detailed product-level fields.
* Checked for null values, duplicates, and anomalies.
* Checked product availability (in-stock vs out-of-stock).
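
The exploration steps above could look roughly like the sketch below. The column names `name`, `category`, `mrp`, `discountPercent`, `discountedSellingPrice`, and `availableQuantity` come from the example queries in this README; `sku_id`, `weightInGms`, `outOfStock`, and all column types are assumptions made for illustration and may differ from the actual schema.

```sql
-- Hypothetical schema: sku_id, weightInGms, outOfStock and every column
-- type are assumed, not taken from the project.
CREATE TABLE zepto (
    sku_id                 SERIAL PRIMARY KEY,
    category               VARCHAR(120),
    name                   VARCHAR(150) NOT NULL,
    mrp                    NUMERIC(8, 2),
    discountPercent        NUMERIC(5, 2),
    availableQuantity      INTEGER,
    discountedSellingPrice NUMERIC(8, 2),
    weightInGms            INTEGER,
    outOfStock             BOOLEAN
);

-- Rows with a NULL in any of the analysed columns
SELECT *
FROM zepto
WHERE name IS NULL
   OR category IS NULL
   OR mrp IS NULL
   OR discountPercent IS NULL
   OR discountedSellingPrice IS NULL
   OR availableQuantity IS NULL;

-- Product names that appear more than once (possible duplicates)
SELECT name, COUNT(*) AS listings
FROM zepto
GROUP BY name
HAVING COUNT(*) > 1
ORDER BY listings DESC;

-- In-stock vs out-of-stock counts (assumes a boolean outOfStock column)
SELECT outOfStock, COUNT(*) AS products
FROM zepto
GROUP BY outOfStock;
```

Because the identifiers are unquoted, PostgreSQL folds them to lowercase, so `discountPercent` and `discountpercent` refer to the same column.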

### 2️⃣ Data Cleaning

* Removed invalid records where MRP = 0.
* Converted price data from **paise to rupees** for consistency.
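
A minimal sketch of these two cleaning steps, assuming that both `mrp` and `discountedSellingPrice` are stored in paise in the raw data and are numeric columns (the README does not state which columns were converted):

```sql
BEGIN;

-- Inspect the invalid rows before removing them
SELECT * FROM zepto WHERE mrp = 0;

-- Remove records with no valid MRP
DELETE FROM zepto WHERE mrp = 0;

-- Convert paise to rupees (100 paise = 1 rupee).
-- Dividing by 100.0 rather than 100 avoids integer division
-- if the columns happen to be integer-typed.
UPDATE zepto
SET mrp = mrp / 100.0,
    discountedSellingPrice = discountedSellingPrice / 100.0;

COMMIT;
```

The conversion is not idempotent: running the `UPDATE` a second time would divide the prices by 100 again, so it should be applied exactly once per load.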

### 3️⃣ Data Analysis & Insights

| Query | Description |
| :----- | :----------------------------------------------------------- |
| **Q1** | Top 10 best-value products based on discount percentage. |
| **Q2** | High-MRP products that are out of stock. |
| **Q3** | Estimated revenue per category (discounted selling price × available quantity). |
| **Q4** | Premium products (MRP > ₹500) with minimal discounts (<10%). |
| **Q5** | Top 5 categories offering the highest average discounts. |
| **Q6** | Price-per-gram calculation to determine best-value items. |
| **Q7** | Weight-based classification: *Low*, *Medium*, *Bulk*. |
| **Q8** | Total inventory weight per category. |
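
The weight-based queries (Q6 to Q8) might be written along these lines. This is a sketch only: the `weightInGms` column and the 1,000 g and 5,000 g thresholds are assumptions for illustration, not values taken from the project.

```sql
-- Q6: Price per gram, cheapest first
-- (weightInGms is an assumed column; rows with zero weight are skipped
--  to avoid division by zero)
SELECT DISTINCT name, weightInGms, discountedSellingPrice,
       ROUND((discountedSellingPrice / weightInGms)::numeric, 2) AS price_per_gram
FROM zepto
WHERE weightInGms > 0
ORDER BY price_per_gram;

-- Q7: Weight-based classification (thresholds are illustrative)
SELECT DISTINCT name, weightInGms,
       CASE
           WHEN weightInGms < 1000 THEN 'Low'
           WHEN weightInGms < 5000 THEN 'Medium'
           ELSE 'Bulk'
       END AS weight_category
FROM zepto;

-- Q8: Total inventory weight per category, in grams
SELECT category,
       SUM(weightInGms * availableQuantity) AS total_weight_gms
FROM zepto
GROUP BY category
ORDER BY total_weight_gms DESC;
```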

---

## 💡 Key Insights

* **High discounts** highlight best-value products that attract customers.
* **Premium items (>₹500)** typically offer **lower discounts**, maintaining brand value.
* **Bulk and medium-weight** items dominate total inventory weight.
* **High-MRP products** that are out of stock may indicate **strong customer demand**.

---

## 🧠 Tech Stack

* **Language:** SQL
* **Database:** PostgreSQL
* **Focus Areas:**
  * Data Cleaning
  * Aggregation
  * Analytical Querying
  * Business Insight Generation

---

## 🧾 Example Queries

```sql
-- Q1: Top 10 best-value products, ranked by discount percentage
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;

-- Q3: Estimated revenue by category
-- (discounted selling price x units currently available)
SELECT category,
       SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue;
```
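
For the stock and discount queries (Q2, Q4, Q5), a possible sketch follows. It assumes a boolean `outOfStock` column and uses ₹300 as an illustrative "high MRP" cutoff; neither is confirmed by this README. The ₹500 and 10% thresholds in Q4 come from the query table above.

```sql
-- Q2: High-MRP products that are out of stock
-- (outOfStock column and the 300 cutoff are assumptions)
SELECT DISTINCT name, mrp
FROM zepto
WHERE outOfStock = TRUE
  AND mrp > 300
ORDER BY mrp DESC;

-- Q4: Premium products (MRP > 500) with discounts below 10%
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
WHERE mrp > 500
  AND discountPercent < 10
ORDER BY mrp DESC, discountPercent DESC;

-- Q5: Top 5 categories by average discount
SELECT category,
       ROUND(AVG(discountPercent)::numeric, 2) AS avg_discount
FROM zepto
GROUP BY category
ORDER BY avg_discount DESC
LIMIT 5;
```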

## 🧑‍💻 Author

**Narasimman S**
📍 Chennai | Data Analyst | SQL & Data Science Enthusiast

---