https://github.com/abdullahkhurshid/ecommerce-marketing-analytics
Using Apache Spark for marketing analytics
- Host: GitHub
- URL: https://github.com/abdullahkhurshid/ecommerce-marketing-analytics
- Owner: AbdullahKhurshid
- Created: 2025-01-29T13:25:52.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-30T00:53:56.000Z (over 1 year ago)
- Last Synced: 2025-01-30T01:19:52.483Z (over 1 year ago)
- Topics: apache-spark, big-data-analytics, cloud-computing, marketing-analytics, r, supervised-learning, unsupervised-learning
- Homepage:
- Size: 17.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# E-Commerce Marketing Analytics
## Overview
This project explores the use of machine learning in the Apache Spark environment. Although the dataset is small, it is used to simulate cloud computing for big data analytics, so the project is conducted in the manner most appropriate for big data analytics.
## Problem Statement
In the B2C e-commerce sector, businesses face the challenge of optimizing their operations and enhancing customer understanding to drive revenue growth.
The dataset, [transactions.csv](https://www.kaggle.com/datasets/gabrielramos87/an-online-shop-business) from Kaggle, contains a one-year record of e-commerce sales transactions comprising 500,000 rows and 8 columns.
| Column Name | Description |
| --- | --- |
| CustomerNo | An identification number for each unique customer |
| TransactionNo | An identification number for each unique transaction |
| Date | The date on which the transaction was made |
| ProductNo | An (alpha)numeric code for each unique product |
| ProductName | Name of the product |
| Price | Unit price of the product |
| Quantity | Quantity purchased for a single product within the transaction |
| Country | Country where the customer is based |
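To make the schema above concrete, here is a minimal sketch in plain Python of reading rows with these eight columns into typed records and deriving a per-line revenue as `Price * Quantity`. It is not the project's own code: the `Transaction` class, the `parse_transactions` helper, the `line_revenue` property and the sample rows are all hypothetical, and the date is kept as a string because the README does not state its format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Transaction:
    """One row of transactions.csv, using the column names from the table above."""
    customer_no: str
    transaction_no: str
    date: str  # kept as text; the date format of the real file is not assumed here
    product_no: str
    product_name: str
    price: float
    quantity: int
    country: str

    @property
    def line_revenue(self) -> float:
        # Hypothetical derived value: unit price times quantity for this line.
        return self.price * self.quantity


def parse_transactions(lines):
    """Parse CSV text with a header row into a list of Transaction records."""
    reader = csv.DictReader(lines)  # looks columns up by name, so order is irrelevant
    return [
        Transaction(
            customer_no=row["CustomerNo"],
            transaction_no=row["TransactionNo"],
            date=row["Date"],
            product_no=row["ProductNo"],
            product_name=row["ProductName"],
            price=float(row["Price"]),
            quantity=int(row["Quantity"]),
            country=row["Country"],
        )
        for row in reader
    ]


# Made-up sample rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """CustomerNo,TransactionNo,Date,ProductNo,ProductName,Price,Quantity,Country
C001,T0001,2019-01-01,P001,Example Mug,10.50,2,United Kingdom
C001,T0001,2019-01-01,P002,Example Bag,4.00,3,United Kingdom
"""

transactions = parse_transactions(io.StringIO(SAMPLE_CSV))
for t in transactions:
    print(t.transaction_no, t.product_name, t.line_revenue)
```

In a Spark workflow the same typing step would normally be expressed as an explicit schema on the DataFrame reader rather than row-by-row parsing; the plain-Python version is used here only so the sketch runs without a cluster.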
## Objectives
There are two objectives to this project:
1) Understand the contributing factors to customer loyalty to gain actionable insights for nurturing loyal customer relationships for sustained revenue growth.
2) Understand customer behaviour through effective segmentation to recommend tailored customer targeting strategies.
## Methodology
For this project, we adopted the following process:
1) Data Cleaning & Preparation
2) Exploratory Data Analysis
3) Feature Engineering
4) Machine Learning Modelling
5) Model Evaluation
6) Model Deployment
Details of each step above can be found in the code documentation file, which is supported by a codebook covering all the variables in the code.
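As an illustration of what the feature engineering step might involve, the sketch below aggregates transaction lines into one feature row per customer. The README does not say which features the project actually builds, so the feature names (`n_transactions`, `total_units`, `total_spend`), the `customer_features` helper and the sample rows are assumptions made for this example only.

```python
from collections import defaultdict


def customer_features(rows):
    """Aggregate transaction lines into per-customer features.

    `rows` is an iterable of dicts with the keys CustomerNo, TransactionNo,
    Price and Quantity (the column names from the dataset description).
    The chosen features are illustrative assumptions, not the project's own.
    """
    spend = defaultdict(float)
    units = defaultdict(int)
    transaction_ids = defaultdict(set)

    for row in rows:
        customer = row["CustomerNo"]
        quantity = int(row["Quantity"])
        spend[customer] += float(row["Price"]) * quantity
        units[customer] += quantity
        # A transaction can span several lines, so count distinct IDs.
        transaction_ids[customer].add(row["TransactionNo"])

    return {
        customer: {
            "n_transactions": len(transaction_ids[customer]),
            "total_units": units[customer],
            "total_spend": round(spend[customer], 2),
        }
        for customer in spend
    }


# Made-up rows for illustration only.
sample_rows = [
    {"CustomerNo": "C1", "TransactionNo": "T1", "Price": "10.0", "Quantity": "2"},
    {"CustomerNo": "C1", "TransactionNo": "T1", "Price": "2.5", "Quantity": "4"},
    {"CustomerNo": "C1", "TransactionNo": "T2", "Price": "10.0", "Quantity": "1"},
    {"CustomerNo": "C2", "TransactionNo": "T3", "Price": "2.5", "Quantity": "2"},
]

features = customer_features(sample_rows)
for customer, values in sorted(features.items()):
    print(customer, values)
```

A per-customer table of this kind could plausibly serve as input to both objectives, as predictors for a loyalty model and as the variables a segmentation algorithm clusters on, though the project's actual variables are defined in its codebook.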