{"id":24848719,"url":"https://github.com/alhadikhan/e-commerce-customer-behavior-analysis-and-segmentation","last_synced_at":"2025-03-26T11:12:31.864Z","repository":{"id":247262856,"uuid":"825400647","full_name":"alhadikhan/E-commerce-Customer-Behavior-Analysis-and-Segmentation","owner":"alhadikhan","description":"This project utilizes machine learning to analyze and segment e-commerce customer behavior. It predicts purchases and clusters customers based on demographic data and product preferences, aiming to optimize marketing strategies and enhance customer satisfaction.","archived":false,"fork":false,"pushed_at":"2024-07-08T21:45:08.000Z","size":547,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-31T12:16:50.532Z","etag":null,"topics":["customer-behavior-analysis","customer-segmentation","e-commerce","machine-learning","marketing-strategies"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alhadikhan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-07T16:58:49.000Z","updated_at":"2024-07-08T21:51:35.000Z","dependencies_parsed_at":"2024-07-07T18:26:40.052Z","dependency_job_id":"b8e25e62-bc34-4393-a421-43e7d0917bb0","html_url":"https://github.com/alhadikhan/E-commerce-Customer-Behavior-Analysis-and-Segmentation","commit_stats":null,"previous_names":["alhadikhan/e-commerce-customer-behavior-analysis-and-segmentation"],"tags_count":0,"template":false,"template_full_name":null,"reposi
tory_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alhadikhan%2FE-commerce-Customer-Behavior-Analysis-and-Segmentation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alhadikhan%2FE-commerce-Customer-Behavior-Analysis-and-Segmentation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alhadikhan%2FE-commerce-Customer-Behavior-Analysis-and-Segmentation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alhadikhan%2FE-commerce-Customer-Behavior-Analysis-and-Segmentation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alhadikhan","download_url":"https://codeload.github.com/alhadikhan/E-commerce-Customer-Behavior-Analysis-and-Segmentation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245641441,"owners_count":20648644,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["customer-behavior-analysis","customer-segmentation","e-commerce","machine-learning","marketing-strategies"],"created_at":"2025-01-31T12:16:52.757Z","updated_at":"2025-03-26T11:12:31.845Z","avatar_url":"https://github.com/alhadikhan.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# E-commerce Customer Behavior Analysis and Segmentation\n\nThis project analyzes customer behavior and segments customers in an e-commerce context using machine learning techniques. 
The dataset includes information about customers' demographics, product preferences, and purchasing patterns.\n\n## Table of Contents\n\n- [Introduction](#introduction)\n- [Dataset Overview](#dataset-overview)\n- [Project Goals](#project-goals)\n- [Methods and Techniques](#methods-and-techniques)\n- [Results](#results)\n\n## Introduction\n\nUnderstanding customer behavior is crucial for e-commerce businesses to personalize marketing strategies, improve customer satisfaction, and optimize product offerings. This project leverages machine learning algorithms to predict customer purchases and segment customers based on their characteristics.\n\n## Dataset Overview\n\nThe dataset includes the following columns:\n- Customer ID\n- Gender\n- Age\n- Salary\n- Product ID\n- Price\n- Purchased (Target variable)\n\n## Project Goals\n\n1. **Customer Purchase Prediction**:\n   - Build and evaluate machine learning models to predict whether a customer will purchase a product.\n\n2. **Customer Segmentation**:\n   - Utilize clustering algorithms to segment customers based on demographic and behavioral attributes.\n\n3. 
**Feature Importance Analysis**:\n   - Determine which factors (e.g., Age, Salary, Product Price) influence customer purchasing decisions the most.\n\n## Methods and Techniques\n\n### Data Preprocessing\n- Encoding categorical variables (Gender, Product ID)\n- Scaling numerical features (Age, Salary, Price)\n\n### Model Evaluation\n\nTwo different splitting methods were used to evaluate model performance:\n\n#### Stratified Splitting\n\n| Model    | Accuracy | Precision | Recall   | F1 Score |\n|----------|----------|-----------|----------|----------|\n| LogReg   | 0.820    | 0.783     | 0.783    | 0.783    |\n| DecTree  | 0.780    | 0.747     | 0.711    | 0.728    |\n| RandFor  | 0.785    | 0.750     | 0.723    | 0.736    |\n| SVM      | 0.790    | 0.773     | 0.699    | 0.734    |\n| NaiveBay | 0.780    | 0.760     | 0.687    | 0.722    |\n| KNN      | 0.765    | 0.725     | 0.699    | 0.712    |\n\n#### Default Random Splitting\n\n| Model    | Accuracy | Precision | Recall   | F1 Score |\n|----------|----------|-----------|----------|----------|\n| LogReg   | 0.815    | 0.753     | 0.782    | 0.767    |\n| DecTree  | 0.750    | 0.679     | 0.679    | 0.679    |\n| RandFor  | 0.770    | 0.716     | 0.679    | 0.697    |\n| SVM      | 0.795    | 0.740     | 0.731    | 0.735    |\n| NaiveBay | 0.795    | 0.734     | 0.744    | 0.739    |\n| KNN      | 0.760    | 0.679     | 0.731    | 0.704    |\n\n## Results\n\n- Stratified splitting improves accuracy and F1 score for four of the six models (Logistic Regression, Decision Tree, Random Forest, and KNN); SVM and Naive Bayes score slightly higher with default random splitting.\n- Logistic Regression and SVM show more consistent results across both splitting methods.\n- Decision Tree and Random Forest gain the most from stratified splitting (F1 score rises from 0.679 to 0.728 and from 0.697 to 0.736, respectively), which suggests that preserving class proportions in the training and testing sets matters for these models.\n\n## Usage\n\n### Requirements\n\n- Python 3.x\n- Libraries: pandas, numpy, scikit-learn, matplotlib, 
seaborn\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falhadikhan%2Fe-commerce-customer-behavior-analysis-and-segmentation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falhadikhan%2Fe-commerce-customer-behavior-analysis-and-segmentation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falhadikhan%2Fe-commerce-customer-behavior-analysis-and-segmentation/lists"}