https://github.com/cintia0528/data_cleaning_and_analytics-python
Evaluate if aggressive discounting benefits Eniac long-term, considering differing views on customer acquisition and brand positioning. Focus on data cleaning for informed decision-making.
colab-notebook data data-analysis datacleaning dataquality jupyter-notebook matplotlib pandas python seaborn
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/cintia0528/data_cleaning_and_analytics-python
- Owner: Cintia0528
- Created: 2023-08-22T10:42:11.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-24T13:05:17.000Z (about 2 years ago)
- Last Synced: 2025-03-31T05:35:18.833Z (6 months ago)
- Topics: colab-notebook, data, data-analysis, datacleaning, dataquality, jupyter-notebook, matplotlib, pandas, python, seaborn
- Homepage:
- Size: 403 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data-Cleaning-and-Analysis-with-Python
## Task
To evaluate whether aggressive discounting is beneficial in the long run.

## Overview
Eniac's board noticed that the last quarter's increasing sales volume did not translate to higher revenue.

_Marketing_: Believes that discounts are beneficial in the long run for customer acquisition and retention.

_Board_: Is concerned that their positioning as a premium quality provider is compromised by aggressive discounts.

## Context
Eniac is an **online marketplace** specializing in Apple-compatible accessories. The board wants an **immediate answer** to whether the company should continue **discounting** or not. **The data is compromised**, so before any analysis we must **clean** the data, **assess its quality**, and define how it **impairs our decision-making ability**.

### Challenge:
*How can we clean the data and assure its quality without losing too much information, so that enough data remains for decision-making?*

## Approach
Evaluate the database:
1. Clean the data of unreadable entries, duplicates, and other obvious errors
2. Assess the quality of the remaining data and remove compromised orders
3. Note the constraints that the loss of data places on our ability to make decisions
4. Basis for decision-making: a comparison of the recommended prices in the product catalog with the actual sales
5. Note recommendations: how to improve data collection, and further research questions

## Deliverables
A 5-minute **PowerPoint presentation** to the Board of Directors, found [here](https://drive.google.com/file/d/1v3fMzSTz0JX0YVLydWhl2BjVN6A2yels/view?usp=sharing), that summarizes the findings and suggests a course of action.

**Python code** is found [here](https://github.com/Cintia0528/Data-Cleaning-and-Analysis-with-Python.git).

### Colab Files
1. Files starting with 2 are the data cleaning files, one file per table
2. Files starting with 3 are the data quality files
3. Files starting with 4 are the data analysis files

## Skills & Tools
1. Data Cleaning & Quality Assurance
2. Data Visualization & Storytelling
3. Colab & Jupyter Notebook
4. Python: Pandas, Seaborn, Matplotlib
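
As a rough illustration of steps 1 to 4 of the approach with pandas, the sketch below cleans two toy tables, drops order lines that cannot be matched to a readable catalog price, records how much data survives, and computes the discount against the catalog price. It is not the code from the Colab files: the table layout, the column names (`sku`, `price`, `order_id`, `unit_price`), the helper `clean_prices`, and all values are assumptions made up for this example, and treating exact duplicate rows as errors is itself an assumption.

```python
import pandas as pd

# Toy data with hypothetical column names; none of it comes from the Eniac dataset.
products = pd.DataFrame({
    "sku": ["A1", "B2", "C3"],
    # The last price has two decimal points, so it cannot be read as a number.
    "price": ["100.00", "50.00", "1.299.00"],
})
orderlines = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4],
    "sku": ["A1", "A1", "B2", "C3", "Z9"],
    "unit_price": ["80.00", "80.00", "50.00", "999.00", "10.00"],
})


def clean_prices(df, column):
    """Convert a price column to numbers and drop rows that cannot be parsed."""
    out = df.copy()
    out[column] = pd.to_numeric(out[column], errors="coerce")  # unreadable -> NaN
    return out.dropna(subset=[column])


# Step 1: remove exact duplicate rows and unreadable prices.
products_clean = clean_prices(products.drop_duplicates(), "price")
orderlines_clean = clean_prices(orderlines.drop_duplicates(), "unit_price")

# Step 2: keep only order lines whose product has a usable catalog price.
merged = orderlines_clean.merge(products_clean, on="sku", how="inner")

# Step 3: record how much of the original data survived the cleaning.
retained_share = len(merged) / len(orderlines)

# Step 4: compare the price actually charged with the catalog price.
merged["discount_pct"] = (1 - merged["unit_price"] / merged["price"]) * 100

print(f"Retained {retained_share:.0%} of order lines")
print(merged[["order_id", "sku", "price", "unit_price", "discount_pct"]])
```

Tracking `retained_share` alongside the result keeps the trade-off from the challenge visible: every cleaning rule that removes rows also narrows the basis for the discounting decision.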