https://github.com/gabrielmazzotta/hypothesis-testing---supermarket-sales

Project focused on exploring business insights emphasizing statistical tests.

jupyter-notebook pingouin python scipy-stats statistical-inference statsmodels


![image](image.jpg)

# Hypothesis Testing on Supermarket Sales

## Introduction

This project is dedicated to uncovering valuable business insights through the lens of __statistical tests__ applied to supermarket sales data. With a specific focus on __hypothesis testing__, we aim to delve into the underlying patterns and relationships within the dataset.

Taking a rigorous analytical approach, the project uses a 5% significance level (α = 0.05) for every test, so a null hypothesis is rejected only when its p-value falls below 0.05. Beyond describing the dynamics of supermarket sales, the aim is to produce actionable findings that inform strategic business decisions.

## Statistical Tests and Concepts
- 2 sample non-parametric test (Mann-Whitney U test)
- Bootstrap estimates
- Normality Test
- ANOVA
- Chi-square Independence Test
- 2 Sample Proportion Test
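
As a rough illustration of the first three items, the sketch below runs a Shapiro-Wilk normality check, a Mann-Whitney U test, and a percentile bootstrap with `scipy.stats` and NumPy. The data are synthetic, and the variable names (`group_a`, `group_b`, `n_boot`) are assumptions made for this example; they are not taken from the project's notebooks.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # 5% threshold, as stated in the introduction

# Synthetic, right-skewed purchase totals for two groups (NOT the real dataset).
rng = np.random.default_rng(42)
group_a = rng.gamma(shape=2.0, scale=150.0, size=500)
group_b = rng.gamma(shape=2.0, scale=160.0, size=500)

# Normality check (Shapiro-Wilk): a small p-value suggests non-normal data,
# which is one reason to prefer a non-parametric comparison.
_, p_normal_a = stats.shapiro(group_a)
_, p_normal_b = stats.shapiro(group_b)
use_nonparametric = bool(min(p_normal_a, p_normal_b) < ALPHA)

# Two-sample non-parametric comparison (Mann-Whitney U, two-sided).
u_stat, p_mwu = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Percentile bootstrap for the difference in group means.
n_boot = 2000
diffs = np.empty(n_boot)
for i in range(n_boot):
    resample_a = rng.choice(group_a, size=group_a.size, replace=True)
    resample_b = rng.choice(group_b, size=group_b.size, replace=True)
    diffs[i] = resample_a.mean() - resample_b.mean()
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])

print(f"Shapiro p-values: {p_normal_a:.4g}, {p_normal_b:.4g}")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_mwu:.4f}")
print(f"95% bootstrap CI for the mean difference: [{ci_low:.2f}, {ci_high:.2f}]")
```

If the bootstrap interval excludes zero, that points the same way as a small Mann-Whitney p-value, although the two procedures answer slightly different questions (difference in means versus a shift in distribution).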

## Business insights
* Total purchasing by gender
* Rating by branch
* Association between customer type and product line
* Customer type by gender
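
One plausible way to pair three of these questions with the tests listed earlier is sketched below: a one-way ANOVA for rating by branch, a chi-square independence test for customer type versus product line, and a two-sample proportion z-test for membership share by gender. The pairing itself is an assumption, the data are randomly generated stand-ins rather than the real sales records, and the z-test is written out by hand here (statsmodels offers `proportions_ztest` for the same purpose).

```python
import numpy as np
from scipy import stats

ALPHA = 0.05
rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-ins for the dataset columns (NOT the real data).
branches = ["A", "B", "C"]
types = ["Member", "Normal"]
lines = ["Electronic accessories", "Fashion accessories", "Food and beverages",
         "Health and beauty", "Home and lifestyle", "Sports and travel"]
branch = rng.choice(branches, size=n)
rating = rng.uniform(4.0, 10.0, size=n)
customer_type = rng.choice(types, size=n)
gender = rng.choice(["Female", "Male"], size=n)
product_line = rng.choice(lines, size=n)

# Rating by branch: one-way ANOVA across the three branches.
f_stat, p_anova = stats.f_oneway(*(rating[branch == b] for b in branches))

# Customer type vs. product line: chi-square test of independence
# on the 2 x 6 contingency table of counts.
table = np.array([[np.sum((customer_type == t) & (product_line == p))
                   for p in lines] for t in types])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Customer type by gender: pooled two-sample proportion z-test on the
# share of members among female vs. male customers.
n_f, n_m = np.sum(gender == "Female"), np.sum(gender == "Male")
mem_f = np.sum((gender == "Female") & (customer_type == "Member"))
mem_m = np.sum((gender == "Male") & (customer_type == "Member"))
p_pool = (mem_f + mem_m) / (n_f + n_m)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_f + 1 / n_m))
z = (mem_f / n_f - mem_m / n_m) / se
p_prop = 2 * stats.norm.sf(abs(z))

for name, p in [("ANOVA", p_anova), ("Chi-square", p_chi2), ("Proportion", p_prop)]:
    verdict = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

Because the columns here are generated independently, most runs should fail to reject the null hypotheses; the real data may behave differently.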


## Language/Libraries
* Python / Jupyter notebooks
* Scipy.stats
* Pingouin
* Statsmodels.stats
* Pandas
* Numpy
* Matplotlib
* Seaborn

## Conclusions
This project illustrated how exploratory data analysis (EDA) and statistical hypothesis testing can address specific business requirements.

The insights gained from these analyses pave the way for further investigation and potential development of machine learning algorithms tailored to meet the specific needs of the business.

The results obtained here provide a foundation for informed decision-making and continuous improvement in line with business objectives.

## Dataset
Supermarkets are expanding in most populous cities, and market competition is high. This dataset contains historical sales from a supermarket company, recorded across 3 branches over 3 months.

- Invoice id: Computer generated sales slip invoice identification number
- Branch: Branch of supercenter (3 branches are available identified by A, B and C).
- City: Location of supercenters
- Customer type: Type of customer, recorded as Member for customers using a member card and Normal for those without one.
- Gender: Gender type of customer
- Product line: General item categorization groups - Electronic accessories, Fashion accessories, Food and beverages, Health and beauty, Home and lifestyle, Sports and travel
- Unit price: Price of each product in $
- Quantity: Number of products purchased by the customer
- Tax: 5% tax charged on the customer's purchase
- Total: Total price including tax
- Date: Date of purchase (Record available from January 2019 to March 2019)
- Time: Purchase time (10am to 9pm)
- Payment: Payment used by customer for purchase (3 methods are available – Cash, Credit card and Ewallet)
- COGS: Cost of goods sold
- Gross margin percentage: Gross margin percentage
- Gross income: Gross income
- Rating: Customer satisfaction rating of their overall shopping experience (on a scale of 1 to 10)
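
The monetary columns above appear to be linked by simple arithmetic. The sketch below works through one made-up invoice, assuming that COGS equals unit price times quantity, that the 5% tax is charged on COGS, and that gross income is Total minus COGS. Only the 5% rate and the tax-inclusive total come from the descriptions above; the other relationships and the function name `invoice_totals` are assumptions for illustration.

```python
def invoice_totals(unit_price, quantity, tax_rate=0.05):
    """Derive the monetary columns for one invoice under the stated assumptions."""
    cogs = unit_price * quantity          # assumed: cost of goods sold
    tax = cogs * tax_rate                 # 5% tax on the purchase
    total = cogs + tax                    # total price including tax
    gross_income = total - cogs           # assumed: equals the tax amount
    gross_margin_pct = gross_income / total * 100
    return {
        "cogs": cogs,
        "tax": tax,
        "total": total,
        "gross_income": gross_income,
        "gross_margin_pct": gross_margin_pct,
    }


# Hypothetical invoice: 5 items at $20 each.
row = invoice_totals(unit_price=20.0, quantity=5)
for column, value in row.items():
    print(f"{column}: {value:.4f}")
```

Under these assumptions the gross margin percentage is the same for every invoice (0.05 / 1.05, about 4.76%), so that column would carry no information for hypothesis testing.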