https://github.com/jcsaputra/shoe-store-data-visualization-using-sas-visual-analytics
This repository provides a comprehensive analysis of Nike sales data across the United States from January 2020 to December 2021. Using data sourced from Kaggle, this project covers data cleaning, visualizations, and predictive modeling to uncover sales trends, regional performance, and product popularity.
- Host: GitHub
- URL: https://github.com/jcsaputra/shoe-store-data-visualization-using-sas-visual-analytics
- Owner: jcsaputra
- Created: 2024-11-02T15:58:40.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-02T16:03:01.000Z (about 1 year ago)
- Last Synced: 2025-10-14T13:06:53.356Z (2 months ago)
- Language: SAS
- Size: 1.84 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
This repository provides a comprehensive analysis of Nike sales data across the United States from January 2020 to December 2021. Using data sourced from Kaggle, this project covers data cleaning, visualizations, and predictive modeling to uncover sales trends, regional performance, and product popularity.
Project Overview
The analysis is divided into three main sections:
Data Preprocessing:
Cleaning the dataset by removing unnecessary columns and handling null values to prepare it for analysis.
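The preprocessing step above can be sketched in a few lines. The repository does this work in SAS Visual Analytics; the snippet below is only a standard-library Python analog, not the author's code. The sample rows, the `Notes` column, and the `clean_rows` helper are hypothetical, and dropping incomplete rows is just one possible way of handling null values (the README does not say which approach was used).

```python
import csv
import io


def clean_rows(rows, drop_columns):
    """Drop the named columns, then drop any row that still has a null value."""
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in drop_columns}
        # Treat None and blank strings as null values.
        if all(v is not None and str(v).strip() != "" for v in kept.values()):
            cleaned.append(kept)
    return cleaned


# Hypothetical sample standing in for the Kaggle export (values are made up).
SAMPLE_CSV = """Retailer,Region,Units Sold,Price per Unit,Notes
Retailer A,West,120,50,
Retailer B,South,,45,check
Retailer C,Midwest,75,60,ok
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_rows(rows, drop_columns={"Notes"})

for row in cleaned:
    print(row)
```

Columns are dropped before the null check so that a blank value in an unneeded column does not discard an otherwise complete row.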
Data Visualization:
Generating visual insights through bar charts, pie charts, and scatterplots to showcase sales distribution, top-performing products, and trends across retailers and regions.
Key findings include top-selling categories, high-performing retailers, and product popularity variations.
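A bar chart of sales by retailer or region starts from a grouped total. As a rough illustration (again in plain Python rather than SAS Visual Analytics), the sketch below sums a value column per group and ranks the groups. The records, the `Total Sales` column name, and the `totals_by` helper are assumptions made for this example, not data or names from the project.

```python
from collections import defaultdict


def totals_by(records, key, value):
    """Sum `value` per distinct `key` and return (group, total) pairs, largest first."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += float(rec[value])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up records for illustration only.
records = [
    {"Retailer": "Retailer A", "Region": "West", "Total Sales": 1200.0},
    {"Retailer": "Retailer B", "Region": "South", "Total Sales": 1500.0},
    {"Retailer": "Retailer A", "Region": "South", "Total Sales": 800.0},
    {"Retailer": "Retailer B", "Region": "West", "Total Sales": 1000.0},
    {"Retailer": "Retailer C", "Region": "West", "Total Sales": 500.0},
]

ranking = totals_by(records, "Retailer", "Total Sales")

# Crude text bar chart: one '#' per 500 units of sales.
for name, total in ranking:
    print(f"{name:<12}{'#' * int(total // 500)} {total:.0f}")
```

Passing `"Region"` instead of `"Retailer"` as the key gives the regional view from the same records.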
Predictive Modeling:
Applying machine learning models, including Gradient Boosting, Neural Network, Random Forest, and Decision Tree, to predict product performance.
Models are compared on classification accuracy, alongside feature importance and the classification strengths of each model.
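Classification accuracy is the fraction of predictions that match the true labels, which makes it a simple basis for comparing models. The sketch below shows that comparison in plain Python. The labels, the predictions, and the names `model_a` and `model_b` are invented placeholders and do not represent the repository's models or results.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical true labels (1 = high-performing product, 0 = not).
y_true = [1, 0, 1, 1, 0]

# Hypothetical predictions from two placeholder models.
predictions = {
    "model_a": [1, 0, 1, 0, 0],
    "model_b": [1, 1, 0, 1, 0],
}

scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("most accurate:", best)
```

Accuracy alone can mislead when classes are imbalanced, which is one reason to look at it together with feature importance and per-class behavior.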
Key Findings
Sales Leaders: The analysis identifies the top retailers and the most popular product categories by sales.
Model Insights: Gradient Boosting was the most accurate model, and Units Sold and Price per Unit were among the most influential features for predicting product performance.