https://github.com/abhinav330/customer-behavior-analysis-linear-regression
This repository explores customer behavior data for an NYC clothing company with both a mobile app and website. They want to understand which platform drives higher sales.
data-analysis data-science data-visualization eda exploratory-data-analysis jupyter jupyter-notebook linear-regression machine-learning machine-learning-algorithms machinelearning-python numpy pandas python regression-analysis
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/abhinav330/customer-behavior-analysis-linear-regression
- Owner: Abhinav330
- Created: 2024-08-25T23:05:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-31T02:56:30.000Z (over 1 year ago)
- Last Synced: 2025-03-03T08:16:34.923Z (11 months ago)
- Topics: data-analysis, data-science, data-visualization, eda, exploratory-data-analysis, jupyter, jupyter-notebook, linear-regression, machine-learning, machine-learning-algorithms, machinelearning-python, numpy, pandas, python, regression-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 687 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Customer-behavior-Analysis-Linear-Regression
# Code Summary
This repository explores customer behavior data for a New York City clothing company that sells through both a mobile app and a website. The company wants to understand which platform drives higher sales.
## Data Exploration and Visualization
The code starts by importing necessary libraries, loading the 'Ecommerce Customers' dataset using pandas, and performing data exploration and visualization tasks:
- Displays the first few rows of the dataset using `customers.head()`.
- Provides summary statistics using `customers.describe()`.
- Shows dataset information using `customers.info()`.
- Creates several plots:
  - Joint plots of 'Time on Website' and 'Time on App', each against 'Yearly Amount Spent'.
  - A pair plot of the pairwise relationships between the numerical features.
  - A linear regression plot (`lmplot`) of 'Yearly Amount Spent' against 'Length of Membership'.
## Data Preprocessing
The code preprocesses the data by selecting the numerical feature columns as 'X' and 'Yearly Amount Spent' as the target 'Y'. It then splits the dataset into training and testing sets using `train_test_split()`.
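
A sketch of the selection and split might look like this. The `test_size` and `random_state` values are assumptions chosen for illustration (this README does not state the ones the notebook uses), and the data is again a synthetic stand-in.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"

# Synthetic stand-in for the 'Ecommerce Customers' data (assumption, not real data).
rng = np.random.default_rng(0)
n = 500
customers = pd.DataFrame({
    "Avg. Session Length": rng.normal(33.0, 1.0, n),
    "Time on App": rng.normal(12.0, 1.0, n),
    "Time on Website": rng.normal(37.0, 1.0, n),
    "Length of Membership": rng.normal(3.5, 1.0, n),
})
true_weights = np.array([20.0, 40.0, 1.0, 60.0])  # arbitrary illustrative weights
customers[TARGET] = customers[FEATURES].to_numpy() @ true_weights + rng.normal(0.0, 5.0, n)

# Feature matrix and target vector.
X = customers[FEATURES]
y = customers[TARGET]

# Hold out 30% of the rows for testing (assumed ratio); a fixed seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
print(X_train.shape, X_test.shape)
```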
## Linear Regression Modeling
The script proceeds to build a Linear Regression model to predict 'Yearly Amount Spent' based on the selected features ('Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership'):
- Imports `LinearRegression` from scikit-learn.
- Initializes and fits a Linear Regression model to the training data.
- Reads the fitted coefficients from `lm.coef_`.
- Makes predictions on the testing data using `lm.predict()`.
- Visualizes the predictions against actual values using a scatter plot.
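
The modeling steps could be sketched as below, again on synthetic data with arbitrary weights, so the printed coefficients say nothing about the real dataset. The split parameters are assumed values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"

# Synthetic stand-in for the 'Ecommerce Customers' data (assumption, not real data).
rng = np.random.default_rng(0)
n = 500
customers = pd.DataFrame({
    "Avg. Session Length": rng.normal(33.0, 1.0, n),
    "Time on App": rng.normal(12.0, 1.0, n),
    "Time on Website": rng.normal(37.0, 1.0, n),
    "Length of Membership": rng.normal(3.5, 1.0, n),
})
true_weights = np.array([20.0, 40.0, 1.0, 60.0])  # arbitrary illustrative weights
customers[TARGET] = customers[FEATURES].to_numpy() @ true_weights + rng.normal(0.0, 5.0, n)

X_train, X_test, y_train, y_test = train_test_split(
    customers[FEATURES], customers[TARGET], test_size=0.3, random_state=101
)

# Fit ordinary least squares on the training rows only.
lm = LinearRegression()
lm.fit(X_train, y_train)

# One coefficient per feature, in the same order as FEATURES.
print(lm.coef_)

# Predict yearly spend for the held-out rows.
predictions = lm.predict(X_test)
```

Plotting `y_test` against `predictions` as a scatter plot then shows how closely the points follow the diagonal; a tight diagonal band indicates a good fit.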
## Model Evaluation
The code evaluates the Linear Regression model on the test set by printing three regression metrics and plotting the residuals:
- Mean Absolute Error (MAE).
- Mean Squared Error (MSE).
- Root Mean Squared Error (RMSE).
- A histogram of the residuals (actual minus predicted values), drawn with `sns.distplot`. Recent seaborn releases deprecate `distplot` in favor of `histplot` and `displot`.
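
As a rough sketch of the evaluation, the three metrics can be computed with `sklearn.metrics` and NumPy. The data is synthetic and the split parameters are assumed, so the printed numbers are not the repository's results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"

# Synthetic stand-in for the 'Ecommerce Customers' data (assumption, not real data).
rng = np.random.default_rng(0)
n = 500
customers = pd.DataFrame({
    "Avg. Session Length": rng.normal(33.0, 1.0, n),
    "Time on App": rng.normal(12.0, 1.0, n),
    "Time on Website": rng.normal(37.0, 1.0, n),
    "Length of Membership": rng.normal(3.5, 1.0, n),
})
true_weights = np.array([20.0, 40.0, 1.0, 60.0])  # arbitrary illustrative weights
customers[TARGET] = customers[FEATURES].to_numpy() @ true_weights + rng.normal(0.0, 5.0, n)

X_train, X_test, y_train, y_test = train_test_split(
    customers[FEATURES], customers[TARGET], test_size=0.3, random_state=101
)
lm = LinearRegression().fit(X_train, y_train)
predictions = lm.predict(X_test)

# MAE: average absolute error, in the same units as the target.
mae = mean_absolute_error(y_test, predictions)
# MSE: average squared error; penalizes large misses more heavily.
mse = mean_squared_error(y_test, predictions)
# RMSE: square root of MSE, back in the target's units.
rmse = np.sqrt(mse)
print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)

# Residuals (actual minus predicted); a histogram of these should look roughly
# bell-shaped and centered near zero if the linear model fits well.
residuals = y_test - predictions
```

On current seaborn versions, `sns.histplot(residuals, kde=True)` draws a histogram comparable to the one the notebook produces with `distplot`.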
## Coefficients
The script creates a DataFrame (`coof`) to display the coefficients of the Linear Regression model along with their corresponding feature names.
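
One way to build such a table is sketched below. The name `coef_table` is a hypothetical choice for this example, and the data is synthetic with arbitrary weights, so the values shown do not answer the app-versus-website question for the real company.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"

# Synthetic stand-in for the 'Ecommerce Customers' data (assumption, not real data).
rng = np.random.default_rng(0)
n = 500
customers = pd.DataFrame({
    "Avg. Session Length": rng.normal(33.0, 1.0, n),
    "Time on App": rng.normal(12.0, 1.0, n),
    "Time on Website": rng.normal(37.0, 1.0, n),
    "Length of Membership": rng.normal(3.5, 1.0, n),
})
true_weights = np.array([20.0, 40.0, 1.0, 60.0])  # arbitrary illustrative weights
customers[TARGET] = customers[FEATURES].to_numpy() @ true_weights + rng.normal(0.0, 5.0, n)

X_train, X_test, y_train, y_test = train_test_split(
    customers[FEATURES], customers[TARGET], test_size=0.3, random_state=101
)
lm = LinearRegression().fit(X_train, y_train)

# Label each coefficient with its feature name (hypothetical variable name).
coef_table = pd.DataFrame(lm.coef_, index=FEATURES, columns=["Coefficient"])
print(coef_table)
```

Each coefficient estimates the change in predicted 'Yearly Amount Spent' for a one-unit increase in that feature with the others held fixed, so comparing the 'Time on App' and 'Time on Website' rows is what speaks to which platform is more strongly associated with sales.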