https://github.com/anhvu2201/customer_segmentation_by_rfm_analysis
Processes retail data step by step to segment customers with RFM analysis, then analyzes the segments by revenue contribution, customer count, and customer value in order to deliver a specific marketing plan for each group.
- Host: GitHub
- URL: https://github.com/anhvu2201/customer_segmentation_by_rfm_analysis
- Owner: anhvu2201
- Created: 2024-11-17T09:14:13.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-11-25T15:49:33.000Z (11 months ago)
- Last Synced: 2025-01-27T18:43:04.442Z (8 months ago)
- Topics: marketing-analytics, python-3, rfm-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 21.7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Customer Segmentation by RFM Analysis
# I. Introduction
## 1. Introduction to RFM analysis:
- RFM analysis is a popular customer analysis technique in marketing and customer relationship management (CRM). It evaluates customers based on three factors:
  - Recency (R): How recently did the customer make a purchase? The more recent the purchase, the more likely the customer is to engage.
  - Frequency (F): How often does the customer make purchases? Frequent buyers tend to be more loyal.
  - Monetary (M): How much has the customer spent in total? Higher spending indicates potentially valuable customers.
- By analyzing these factors, RFM helps businesses segment customers to optimize marketing strategies, personalize promotions, and improve customer retention, supporting data-driven decision-making.
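
The three factors above can be computed per customer from a transaction table. The following is a minimal sketch on toy data, not the notebook's actual code. The column names `CustomerID`, `Quantity`, and `UnitPrice` appear in this README; `InvoiceNo`, `InvoiceDate`, the derived `Amount` column, and the choice of reference date are assumptions made for illustration.

```python
import pandas as pd

# Toy transactions. 'InvoiceNo' and 'InvoiceDate' are assumed column names.
df = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3, 3],
    "InvoiceNo": ["A1", "A2", "B1", "C1", "C2", "C2"],
    "InvoiceDate": pd.to_datetime([
        "2011-01-05", "2011-03-01", "2010-12-20",
        "2011-02-10", "2011-03-09", "2011-03-09",
    ]),
    "Quantity": [2, 1, 5, 3, 1, 4],
    "UnitPrice": [10.0, 20.0, 2.0, 5.0, 8.0, 1.5],
})
df["Amount"] = df["Quantity"] * df["UnitPrice"]

# Reference date: one day after the last transaction (an assumption).
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = df.groupby("CustomerID").agg(
    # Days since the customer's most recent purchase.
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    # Number of distinct invoices.
    Frequency=("InvoiceNo", "nunique"),
    # Total amount spent.
    Monetary=("Amount", "sum"),
)
print(rfm)
```

Counting distinct invoices rather than rows keeps an order with several line items from inflating Frequency.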
## 2. Project Purpose:
- Determine RFM scores to segment customers into different groups.
- Analyze the performance of the company (Superstore) and provide recommendations for the marketing department.
- Identify which of the three metrics (R, F, and M) should be prioritized.
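
One common way to turn raw R, F, and M values into scores is quintile binning. The sketch below assumes a 1 to 5 scale and a helper named `quintile_score` (hypothetical); the README does not state which scoring scheme or which score-to-segment mapping the notebook uses.

```python
import pandas as pd

# Toy RFM table; the values are illustrative only.
rfm = pd.DataFrame(
    {
        "Recency": [5, 40, 90, 200, 365],
        "Frequency": [12, 6, 3, 2, 1],
        "Monetary": [5000.0, 1200.0, 400.0, 150.0, 30.0],
    },
    index=pd.Index([101, 102, 103, 104, 105], name="CustomerID"),
)

def quintile_score(values, higher_is_better=True):
    """Score a column from 1 to 5 by quintile (hypothetical helper)."""
    # Ranking first breaks ties so that qcut always gets unique bin edges.
    ranks = values.rank(method="first")
    labels = [1, 2, 3, 4, 5] if higher_is_better else [5, 4, 3, 2, 1]
    return pd.qcut(ranks, 5, labels=labels).astype(int)

# A low Recency (recent purchase) is good, so its scale is reversed.
rfm["R"] = quintile_score(rfm["Recency"], higher_is_better=False)
rfm["F"] = quintile_score(rfm["Frequency"])
rfm["M"] = quintile_score(rfm["Monetary"])
rfm["RFM"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
print(rfm)
```

The combined `RFM` string can then be mapped to named segments with whatever lookup table the analysis adopts.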
# II. Exploratory Data Analysis - EDA
## 1. Explore Data:

## 2. Apply Conditions On Dataset:

- Only customers from the UK are kept for analysis, as they account for 98% of the dataset.
- Data in 'Quantity' must take a positive value.
- Data in 'UnitPrice' must take a positive value.
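
The three conditions above can be applied as a single boolean mask. This is a sketch on toy data; the `Country` column name and the value `"United Kingdom"` are assumptions, while `Quantity` and `UnitPrice` are the column names used in this README.

```python
import pandas as pd

# Toy rows; 'Country' and its values are assumed, not taken from the README.
raw = pd.DataFrame({
    "Country": ["United Kingdom", "France", "United Kingdom", "United Kingdom"],
    "Quantity": [6, 3, -1, 2],
    "UnitPrice": [2.5, 1.0, 4.0, 0.0],
})

# Keep UK rows with strictly positive quantity and unit price.
mask = (
    (raw["Country"] == "United Kingdom")
    & (raw["Quantity"] > 0)
    & (raw["UnitPrice"] > 0)
)
clean = raw[mask].copy()
print(clean)
```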
## 3. Check & Handle Null Values:

- Null values are not accepted in the primary key column.
- Action: Drop null values.
## 4. Correct Data Type:

- Change the 'CustomerID' data type from float64 to int64.
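
The null handling and type correction steps are related: pandas stores an integer column that contains missing values as float64, so the nulls have to be dropped before the cast to int64. A minimal sketch on toy data, assuming `CustomerID` is the key column in question:

```python
import numpy as np
import pandas as pd

# Toy data: the NaN forces 'CustomerID' to be stored as float64.
df = pd.DataFrame({
    "CustomerID": [17850.0, np.nan, 13047.0],
    "Quantity": [6, 2, 8],
})
print(df["CustomerID"].dtype)  # float64

# Drop rows without a customer ID, then cast to a proper integer type.
df = df.dropna(subset=["CustomerID"])
df["CustomerID"] = df["CustomerID"].astype("int64")
print(df.dtypes)
```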
## 5. Check & Handle Missing Values:

## 6. Conclusion:

# III. Data Visualization Using Python
## 1. Distribution of Recency:

- The distribution of Recency is right-skewed. As Recency increases, the number of customers declines steeply.
- The histogram shows that most customers purchased recently (within the last 100 days). About 1,700 customers bought something at Superstore in the last 50 days.
- This indicates that most of Superstore's customers are active and have purchased recently.
## 2. Distribution of Frequency:

- The distribution of Frequency is highly right-skewed. As Frequency increases, the number of customers drops significantly.
- The histogram shows that most customers have fewer than 20 transactions. In particular, more than 3,500 customers have made 1 to 10 purchases, while only a few hundred have placed more than 10 orders and barely any have placed 20 or more.
- This indicates that the majority of Superstore's customers are low-frequency purchasers.
## 3. Distribution of Monetary:

- The distribution of Monetary is highly right-skewed. As Monetary increases, the number of customers drops significantly.
- The histogram shows that most customers have a Monetary value below 10,000. In particular, more than 3,500 customers have spent less than 5,000, while only a few hundred have spent more than 5,000. Virtually none have spent more than 10,000.
- This indicates that most of Superstore's customers have low spending, while a small portion of customers makes up the high Monetary value segment.
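
The "right-skewed" readings in the three distributions above can also be checked numerically, without a chart. The sketch below uses made-up Recency values and assumed bin edges; a positive sample skewness and counts concentrated in the lowest bin are what a right-skewed histogram looks like in numbers.

```python
import pandas as pd

# Illustrative Recency values in days (not the project's data).
recency = pd.Series(
    [3, 8, 12, 20, 25, 33, 41, 47, 60, 75, 95, 150, 240, 370],
    name="Recency",
)

# Count customers per bin; the bin edges are an assumption.
bins = [0, 50, 100, 200, 400]
counts = pd.cut(recency, bins=bins).value_counts(sort=False)
print(counts)

# Positive skewness means a long tail to the right.
skewness = recency.skew()
print("skewness:", round(skewness, 2))
```

In a notebook, `recency.plot.hist(bins=bins)` draws the corresponding histogram (this requires matplotlib). The same check applies to Frequency and Monetary.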
## 4. Customer Segmentation By Total Sale:

- Ranking of customer segments by total sales:
  1. Champions
  2. Loyal
  3. At Risk
  4. Need Attention
  5. Hibernating Customers
  6. Potential Loyalist
  7. Cannot Lose Them
  8. Lost Customers
  9. Promising
  10. About To Sleep
  11. New Customers
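
A ranking like the one above can be produced by aggregating Monetary per segment and sorting. This is a sketch with made-up values; the `Segment` and `Monetary` column names and the output names `TotalSales` and `Customers` are assumptions.

```python
import pandas as pd

# Toy per-customer table with an assigned segment (values are illustrative).
rfm = pd.DataFrame({
    "Segment": ["Champions", "Loyal", "Champions", "At Risk", "Loyal", "New Customers"],
    "Monetary": [900.0, 300.0, 700.0, 250.0, 200.0, 50.0],
})

summary = (
    rfm.groupby("Segment")
    .agg(TotalSales=("Monetary", "sum"), Customers=("Monetary", "size"))
    .sort_values("TotalSales", ascending=False)
)
print(summary)
```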
## 5. Customer Segmentation By Customer Value:

- Ranking of customer segments by customer value:
  1. Champions
  2. Hibernating Customers
  3. Lost Customers
  4. Loyal
  5. Potential Loyalist
  6. At Risk
  7. Need Attention
  8. About To Sleep
  9. New Customers
  10. Promising
  11. Cannot Lose Them
## 6. Distribution Of Customer Across Segments:

- Customer segments can be categorized into 3 groups:
  - High-Value Customers (HVC): Champions, Loyal, Potential Loyalist, New Customers, Promising.
  - At-Risk Customers (ARC): Need Attention, About To Sleep, At Risk, Cannot Lose Them.
  - Low-Value Customers (LVC): Hibernating Customers, Lost Customers.

- Observation:
  - The HVC category is the largest, with 1,871 customers.
  - The LVC category is second, with 1,075 customers.
  - The ARC category is the smallest but still sizable, with 974 customers.
- This suggests that the business is growing, although it still faces several problems.
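
The grouping and the counts above can be expressed as a lookup table plus a share calculation. The segment-to-group mapping and the three counts come from this section; the variable names and the sample `segments` series are illustrative.

```python
import pandas as pd

# Segment-to-group mapping as listed in this section.
SEGMENT_GROUP = {
    "Champions": "HVC", "Loyal": "HVC", "Potential Loyalist": "HVC",
    "New Customers": "HVC", "Promising": "HVC",
    "Need Attention": "ARC", "About To Sleep": "ARC",
    "At Risk": "ARC", "Cannot Lose Them": "ARC",
    "Hibernating Customers": "LVC", "Lost Customers": "LVC",
}

# Mapping a few sample segments to their groups.
segments = pd.Series(["Champions", "At Risk", "Lost Customers", "Promising"])
groups = segments.map(SEGMENT_GROUP)
print(groups.tolist())

# Customer counts reported above, and each group's share of the total.
counts = pd.Series({"HVC": 1871, "LVC": 1075, "ARC": 974})
share = (counts / counts.sum() * 100).round(1)
print(share)
```

With these counts the total is 3,920 customers, of which HVC is a little under half.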
# IV. Insights

- [Link](https://docs.google.com/spreadsheets/d/1MBt3b48lT-RzD44xsbMsgwtVO-JmIJjU/edit?usp=sharing&ouid=107825711284033293753&rtpof=true&sd=true)
# V. Recommendation



- [Link](https://docs.google.com/spreadsheets/d/1MBt3b48lT-RzD44xsbMsgwtVO-JmIJjU/edit?usp=sharing&ouid=107825711284033293753&rtpof=true&sd=true)