{"id":25602863,"url":"https://github.com/preethi2805/customer-segmentation","last_synced_at":"2026-05-18T05:34:56.702Z","repository":{"id":274571480,"uuid":"923356215","full_name":"Preethi2805/Customer-Segmentation","owner":"Preethi2805","description":"This project applies Recency, Frequency, and Monetary (RFM) Analysis along with K-Means Clustering to segment customers based on their purchasing behavior. The goal is to identify distinct customer groups and develop targeted marketing strategies.","archived":false,"fork":false,"pushed_at":"2025-02-20T05:27:31.000Z","size":23657,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-20T05:29:56.688Z","etag":null,"topics":["customer-segmentation","kmeans-clustering","python-3","rfm-analysis"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Preethi2805.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-28T04:50:43.000Z","updated_at":"2025-02-20T05:27:34.000Z","dependencies_parsed_at":"2025-01-28T05:35:41.283Z","dependency_job_id":null,"html_url":"https://github.com/Preethi2805/Customer-Segmentation","commit_stats":null,"previous_names":["preethi2805/customer-segmentation"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Preethi2805%2FCustomer-Segmentation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Preethi2805%2FCustomer-Segmentation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Preethi2805%2FCustomer-Segmentation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Preethi2805%2FCustomer-Segmentation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Preethi2805","download_url":"https://codeload.github.com/Preethi2805/Customer-Segmentation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240056173,"owners_count":19741090,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["customer-segmentation","kmeans-clustering","python-3","rfm-analysis"],"created_at":"2025-02-21T17:24:19.383Z","updated_at":"2025-11-11T05:34:44.335Z","avatar_url":"https://github.com/Preethi2805.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📊 Customer Segmentation using RFM and K-Means Clustering\n\n## 🔍 Problem Statement\nUnderstanding customer behavior is crucial for businesses. By segmenting customers based on their purchasing patterns, businesses can:\n- Identify **high-value** customers.\n- Recognize **at-risk** customers.\n- Personalize marketing strategies for **customer retention** and **acquisition**.\n\n## 📌 Dataset\nThe dataset consists of transaction records from an e-commerce platform with the following key attributes:\n- `InvoiceNo` – Unique invoice identifier.\n- `StockCode` – Product identifier.\n- `Description` – Product description.\n- `Quantity` – Number of units purchased.\n- `InvoiceDate` – Date of purchase.\n- `UnitPrice` – Price per unit.\n- `CustomerID` – Unique customer identifier.\n- `Country` – Country of purchase.\n\n## 🛠️ Methodology\n### 1️⃣ Data Preprocessing\n- Handled missing values and duplicates.\n- Converted `InvoiceDate` to a datetime format.\n- Removed outliers using **Z-score analysis**.\n\n### 2️⃣ RFM Analysis\n- **Recency (R)**: Days since the customer’s last purchase.\n- **Frequency (F)**: Number of transactions by each customer.\n- **Monetary Value (M)**: Total spend per customer.\n\n### 3️⃣ K-Means Clustering\n- Determined the optimal number of clusters using the **Elbow Method** and **Silhouette Score**.\n- Segmented customers into **4 clusters**.\n\n### 4️⃣ Insights from Clusters\n- **Cluster 3:** Loyal and high-spending customers.\n- **Cluster 2:** Less engaged customers with high recency.\n- **Cluster 1 \u0026 0:** Intermediate customers who can be nurtured.\n\n## 📈 Results \u0026 Visualizations\n- **Pair plots** to analyze feature distributions across clusters.\n- **Box plots** to compare `Recency`, `Frequency`, and `Monetary Value` across clusters.\n- **Bar charts** showing average RFM values per cluster.\n\n![Pairplot - relationships between features and their cluster distribution.](pairplot.png)\n![Boxplot](boxplot.png)\n\n## 🔧 Technologies Used\n- **Python**\n- **Pandas, NumPy** – Data Manipulation\n- **Matplotlib, Seaborn** – Data Visualization\n- **Scikit-Learn** – Machine Learning (K-Means Clustering)\n- **StandardScaler** – Feature Scaling\n\n## 📌 Future Enhancements\n- Implement **Hierarchical Clustering** for better interpretability.\n- Develop **automated customer insights** with dashboards.\n- Integrate with **real-time e-commerce data**.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpreethi2805%2Fcustomer-segmentation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpreethi2805%2Fcustomer-segmentation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpreethi2805%2Fcustomer-segmentation/lists"}