https://github.com/abdullahkhurshid/ecommerce-marketing-analytics

Using Apache Spark for marketing analytics

apache-spark big-data-analytics cloud-computing marketing-analytics r supervised-learning unsupervised-learning

# E-Commerce Marketing Analytics

## Overview
This project explores the use of machine learning in the Apache Spark environment. Although the dataset used here is small, it serves to simulate cloud computing for big data analytics. We therefore conduct the project in the manner most appropriate for big data analytics.

## Problem Statement
In the B2C e-commerce sector, businesses face the challenge of optimizing their operations and enhancing customer understanding to drive revenue growth.

The dataset, [transactions.csv](https://www.kaggle.com/datasets/gabrielramos87/an-online-shop-business) from Kaggle, contains a one-year record of e-commerce sales transactions comprising 500,000 rows and 8 columns.

| Column Name | Description |
| --- | --- |
| CustomerNo | An identification number for each unique customer |
| TransactionNo | An identification number for each unique transaction |
| Date | The date on which the transaction was made |
| ProductNo | An (alpha)numeric code for each unique product |
| ProductName | Name of the product |
| Price | Unit price of the product |
| Quantity | Quantity purchased of a single product within the transaction |
| Country | Country in which the customer is based |
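
To make the column layout concrete, here is a minimal sketch of how one row could be represented and how a per-line total (`Price` × `Quantity`) could be derived. It is written in plain Python rather than Spark so that it runs without a cluster. The `Transaction` class, the `line_total` property, the `read_transactions` helper, and the sample rows are all hypothetical and are not taken from the project code or from the Kaggle file; only the column names come from the table above.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One row of the dataset, named after the columns in the table above."""
    customer_no: str
    transaction_no: str
    date: str          # kept as raw text; the date format is not stated in this README
    product_no: str
    product_name: str
    price: float
    quantity: int
    country: str

    @property
    def line_total(self) -> float:
        # Hypothetical derived value: unit price times quantity purchased.
        return self.price * self.quantity


def read_transactions(handle):
    """Parse CSV text with a header row into Transaction records."""
    for row in csv.DictReader(handle):
        yield Transaction(
            customer_no=row["CustomerNo"],
            transaction_no=row["TransactionNo"],
            date=row["Date"],
            product_no=row["ProductNo"],
            product_name=row["ProductName"],
            price=float(row["Price"]),
            quantity=int(row["Quantity"]),
            country=row["Country"],
        )


# Made-up sample rows for illustration only (not real data).
SAMPLE = """CustomerNo,TransactionNo,Date,ProductNo,ProductName,Price,Quantity,Country
C1,T1,2019-01-05,P10,Mug,4.50,2,United Kingdom
C1,T1,2019-01-05,P11,Plate,3.00,4,United Kingdom
C2,T2,2019-01-06,P10,Mug,4.50,1,France
"""

rows = list(read_transactions(io.StringIO(SAMPLE)))
for r in rows:
    print(r.transaction_no, r.product_name, r.line_total)
```

Note that one `TransactionNo` can span several rows, one per product, so any per-transaction figure has to be aggregated across rows.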

## Objectives
This project has two objectives:

1) Understand the factors that contribute to customer loyalty, in order to gain actionable insights for nurturing loyal customer relationships and sustaining revenue growth.

2) Understand customer behaviour through effective segmentation to recommend tailored customer targeting strategies.

## Methodology

For this project, we adopted the following process:

1) Data Cleaning & Preparation
2) Exploratory Data Analysis
3) Feature Engineering
4) Machine Learning Modelling
5) Model Evaluation
6) Model Deployment

Details of each step above can be found in the code documentation file, which is supported by a codebook describing all the variables in the code.
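
As a rough illustration of what steps 1 and 3 of the process might look like, the sketch below cleans a handful of rows and aggregates them into per-customer features. It is plain Python, not the project's Spark code. The cleaning rule (drop rows with no customer number or a non-positive quantity), the three features, and every name in the block are assumptions made for this example; the actual rules and variables are the ones described in the code documentation file and codebook.

```python
from collections import defaultdict

# Made-up rows for illustration: (CustomerNo, TransactionNo, Price, Quantity)
RAW_ROWS = [
    ("C1", "T1", 4.50, 2),
    ("C1", "T1", 3.00, 4),
    ("C1", "T3", 10.00, 1),
    ("C2", "T2", 4.50, 1),
    ("", "T4", 2.00, 3),     # missing customer number
    ("C2", "T5", 4.50, -1),  # non-positive quantity
]


def clean(rows):
    """Step 1 (assumed rule): keep rows with a customer number and a positive quantity."""
    return [r for r in rows if r[0] and r[3] > 0]


def customer_features(rows):
    """Step 3 (assumed features): total spend, distinct orders and items per customer."""
    spend = defaultdict(float)
    orders = defaultdict(set)
    items = defaultdict(int)
    for customer, transaction, price, quantity in rows:
        spend[customer] += price * quantity
        orders[customer].add(transaction)  # a set counts each transaction once
        items[customer] += quantity
    return {
        c: {"total_spend": spend[c], "n_orders": len(orders[c]), "n_items": items[c]}
        for c in spend
    }


features = customer_features(clean(RAW_ROWS))
for customer, values in sorted(features.items()):
    print(customer, values)
```

Per-customer tables of this kind are a common input to both loyalty modelling and segmentation, which is why the two objectives can share a single feature engineering step.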