https://github.com/shruti-h/sales_data_analysis

# Sales Data Analysis - Mini Project

## 📌 Overview
This repository contains a **Sales Data Analysis** project that answers business questions using **Python, Pandas, and Matplotlib**. The dataset consists of **12 months of sales data** from an electronics store, with the following fields: order ID, product, quantity ordered, price, order date, and purchase address.

The project covers data cleaning, exploratory data analysis (EDA), and visualization of the results to support data-driven business decisions.

---

## 📊 Data Cleaning & Preparation
Before diving into analysis, data cleaning was performed to ensure accuracy and consistency. Tasks included:

✔ **Dropping NaN values** from the DataFrame.
✔ **Converting data types** so that columns can be used in calculations.
✔ **Extracting useful columns** (e.g., hour from timestamp, city from address).
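
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the column names (`Quantity Ordered`, `Price Each`, `Order Date`, `Purchase Address`), the date format, the address layout, and the sample rows are all assumptions made for the example.

```python
import pandas as pd

# Tiny made-up sample; column names and values are assumptions,
# not taken from the repository's dataset.
raw = pd.DataFrame({
    "Order ID": ["1001", "1002", None, "1003"],
    "Product": ["USB-C Cable", "Monitor", None, "Headphones"],
    "Quantity Ordered": ["2", "1", None, "1"],
    "Price Each": ["11.95", "149.99", None, "99.99"],
    "Order Date": ["2019-04-19 08:46", "2019-04-07 22:30", None, "2019-12-12 14:38"],
    "Purchase Address": [
        "917 1st St, Dallas, TX 75001",
        "682 Chestnut St, Boston, MA 02215",
        None,
        "333 8th St, Los Angeles, CA 90001",
    ],
})

# 1. Drop rows that contain NaN values.
df = raw.dropna(how="any").copy()

# 2. Convert text columns to numeric and datetime types.
df["Quantity Ordered"] = pd.to_numeric(df["Quantity Ordered"])
df["Price Each"] = pd.to_numeric(df["Price Each"])
df["Order Date"] = pd.to_datetime(df["Order Date"], format="%Y-%m-%d %H:%M")

# 3. Extract useful columns: month and hour from the timestamp,
#    city from the second comma-separated part of the address.
df["Month"] = df["Order Date"].dt.month
df["Hour"] = df["Order Date"].dt.hour
df["City"] = df["Purchase Address"].str.split(",").str[1].str.strip()

# A calculated column used by the later analysis.
df["Sales"] = df["Quantity Ordered"] * df["Price Each"]

print(df[["Product", "Month", "Hour", "City", "Sales"]])
```

If city names repeat across states in the real data, the state abbreviation could be kept alongside the city to tell them apart.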

---

## 🔍 Business Questions Answered
Using **Pandas & Matplotlib**, the following **key business questions** were explored:

1️⃣ **What was the best month for sales?** 🏆 How much revenue was generated that month?
2️⃣ **Which city had the highest sales?** 📍 Understanding regional demand.
3️⃣ **What time should advertisements be displayed?** ⏰ Maximizing customer purchase likelihood.
4️⃣ **Which products are most often sold together?** 🔗 Product bundling insights.
5️⃣ **What product sold the most?** 📦 Why might it have been the top seller?

Each question was answered using **data aggregation, groupby operations, and visualizations**.
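
As a rough sketch of how these questions map onto `groupby` aggregations, the example below runs on a small made-up DataFrame. The column names and values are assumptions for illustration, and counting product pairs with `itertools.combinations` and `collections.Counter` is one possible approach rather than necessarily the one used in the project.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical, already-cleaned data (names and values are made up).
df = pd.DataFrame({
    "Order ID": [1, 1, 2, 3, 3, 4],
    "Product": ["Phone", "Charging Cable", "Monitor",
                "Phone", "Charging Cable", "Headphones"],
    "Month": [12, 12, 4, 12, 12, 4],
    "City": ["Boston", "Boston", "Dallas", "Dallas", "Dallas", "Boston"],
    "Hour": [19, 19, 11, 12, 12, 19],
    "Quantity Ordered": [1, 2, 1, 1, 1, 2],
    "Price Each": [600.0, 15.0, 150.0, 600.0, 15.0, 100.0],
})
df["Sales"] = df["Quantity Ordered"] * df["Price Each"]

# Q1: best month and the revenue generated in it.
monthly_sales = df.groupby("Month")["Sales"].sum()
best_month = monthly_sales.idxmax()

# Q2: city with the highest sales.
top_city = df.groupby("City")["Sales"].sum().idxmax()

# Q3: hour with the most distinct orders (a candidate time for ads).
peak_hour = df.groupby("Hour")["Order ID"].nunique().idxmax()

# Q4: product pairs that appear in the same order.
pair_counts = Counter()
for _, products in df.groupby("Order ID")["Product"]:
    pair_counts.update(combinations(sorted(set(products)), 2))

# Q5: product with the highest total quantity sold.
top_product = df.groupby("Product")["Quantity Ordered"].sum().idxmax()

print(best_month, monthly_sales.loc[best_month])
print(top_city, peak_hour, top_product)
print(pair_counts.most_common(1))
```

Sorting the products within each order means a pair is counted the same way regardless of the order in which its items appear.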

---

## 🔧 Methods & Techniques Used
Throughout this analysis, the following **Pandas & Matplotlib techniques** were leveraged:

✔ **Merging & Concatenating multiple CSV files** to create a unified dataset (`pd.concat`).
✔ **Adding new calculated columns** (e.g., sales, hour, city).
✔ **String parsing operations** (`.str.split()`, `.apply()` functions).
✔ **Using `groupby` for aggregate analysis**.
✔ **Visualizing insights** using vertical and horizontal bar charts and line graphs.
✔ **Labeling and formatting graphs** for better readability.
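
To illustrate the concatenation and plotting steps, here is a minimal sketch. The in-memory CSV strings stand in for the monthly files, and the file names, column names, and values are hypothetical. With real files, the list of frames would be built from file paths passed to `pd.read_csv` instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# In-memory stand-ins for two monthly CSV files (names and contents are made up).
monthly_csvs = {
    "sales_april.csv": (
        "Product,Quantity Ordered,Price Each,Order Date\n"
        "Monitor,1,150.0,2019-04-07 11:30\n"
        "Headphones,2,100.0,2019-04-19 19:05\n"
    ),
    "sales_december.csv": (
        "Product,Quantity Ordered,Price Each,Order Date\n"
        "Phone,1,600.0,2019-12-12 19:20\n"
        "Charging Cable,2,15.0,2019-12-12 19:20\n"
    ),
}

# Read each file into its own DataFrame, then stack them into one dataset.
frames = [pd.read_csv(io.StringIO(text)) for text in monthly_csvs.values()]
all_sales = pd.concat(frames, ignore_index=True)

# Calculated columns needed for the chart.
all_sales["Sales"] = all_sales["Quantity Ordered"] * all_sales["Price Each"]
all_sales["Month"] = pd.to_datetime(all_sales["Order Date"]).dt.month
monthly = all_sales.groupby("Month")["Sales"].sum()

# Vertical bar chart with axis labels and a title for readability.
fig, ax = plt.subplots()
ax.bar(list(monthly.index), list(monthly.values))
ax.set_xticks(list(monthly.index))
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
ax.set_title("Sales by month")
fig.tight_layout()
# fig.savefig("sales_by_month.png")  # or plt.show() in an interactive session
```

Swapping `ax.bar` for `ax.barh` gives the horizontal variant, and `ax.plot` gives a line graph from the same aggregated series.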

---
## 🏆 Credits
This project was inspired by **real-world business problems** and implemented using Python's powerful data analysis tools.