{"id":26907106,"url":"https://github.com/munas-git/website-traffic-analysis","last_synced_at":"2025-04-01T11:37:45.611Z","repository":{"id":283269231,"uuid":"951210999","full_name":"munas-git/website-traffic-analysis","owner":"munas-git","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-19T11:21:46.000Z","size":586,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-19T11:35:42.987Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/munas-git.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-19T10:29:02.000Z","updated_at":"2025-03-19T11:21:49.000Z","dependencies_parsed_at":"2025-03-19T11:45:45.547Z","dependency_job_id":null,"html_url":"https://github.com/munas-git/website-traffic-analysis","commit_stats":null,"previous_names":["munas-git/website-traffic-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/munas-git%2Fwebsite-traffic-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/munas-git%2Fwebsite-traffic-analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/munas-git%2Fwebsite-traffic-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/munas-git%2Fwebsite-traffic-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/munas-git","download_url":"https://codeload.github.com/munas-git/website-traffic-analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246634953,"owners_count":20809324,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-01T11:37:44.837Z","updated_at":"2025-04-01T11:37:45.597Z","avatar_url":"https://github.com/munas-git.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Heat Pump Installation Website Analytics and Conversion Prediction Model\n\n## Overview\n\nThis project focuses on analysing website traffic patterns, user behavior, and conversion rates for a heat pump installation service. It includes exploratory data analysis (EDA), data preprocessing, and the development of a machine-learning model to predict conversion rates. 
The goal is to identify key factors influencing customer engagement and optimise the conversion funnel.\n\n## Table of Contents\n\n- [Introduction](#introduction)\n- [Data Sources](#data-sources)\n- [Libraries Used](#libraries-used)\n- [Data Wrangling](#data-wrangling)\n- [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)\n- [Machine Learning Model Development](#machine-learning-model-development)\n- [Model Evaluation](#model-evaluation)\n- [Results](#results)\n- [Challenges Addressed](#challenges-addressed)\n- [Recommendations](#recommendations)\n- [Further Investigation](#further-investigation)\n- [Contact](#contact)\n\n## Introduction\n\nThe project aims to provide insights into how website visitors interact with a heat pump installation service's online platform. By analysing various metrics, such as website visits over time, time taken to book a design consultation, and conversion rates, the project seeks to identify opportunities for improving customer engagement and increasing the likelihood of conversion.\n\n## Data Sources\n\nThe project utilises the following datasets:\n\n-   `funnel_data.csv`: Contains data related to user interactions and the various stages of the conversion funnel, including first search, design consultation payments, and completion.\n-   `installer_locations.csv`: Contains location data of installers, which can be used to calculate distances and analyse regional trends.\n-   `property_estimates.csv`: Contains property estimates that can be linked to user data for further analysis.\n\n## Libraries Used\n\nThe following Python libraries were used in this project:\n\n-   **Data Wrangling**:\n    -   `pandas`: For data manipulation and analysis.\n    -   `calendar`: For date-related functionalities.\n-   **Data Preprocessing**:\n    -   `LabelEncoder`: For encoding categorical variables.\n-   **Geospatial Calculations**:\n    -   `math`: For calculating distances between geographical coordinates.\n-   **Visualisation**:\n   
 -   `seaborn`: For creating statistical visualisations.\n    -   `matplotlib`: For creating plots and charts.\n    -   `IPython.display`: For displaying images.\n-   **Machine Learning**:\n    -   `scikit-learn`: For model development, evaluation, and preprocessing.\n    -   `imblearn`: For handling imbalanced datasets using techniques like SMOTE.\n    -   `xgboost`: For training gradient boosting models.\n-   **Model Evaluation**:\n    -   `sklearn.metrics`: For evaluating model performance using metrics like accuracy, confusion matrix, and classification report.\n\n## Data Wrangling\n\nThe data wrangling process involved the following steps:\n\n1.  **Loading Data**: Reading the datasets (`funnel_data.csv`, `installer_locations.csv`, `property_estimates.csv`) using pandas.\n2.  **Data Type Conversion**: Converting date columns to datetime format using `pd.to_datetime`.\n3.  **Handling Missing Values**: Addressing missing data in relevant columns.\n4.  **Feature Engineering**: Creating new features such as time deltas between first visit and design consultation payment.\n\n## Exploratory Data Analysis (EDA)\n\nThe EDA process involved the following steps:\n\n1.  **Understanding Data**: Examining the structure, data types, and missing values in the datasets.\n2.  **Visualising Website Visits Over Time**: Creating time series plots to identify patterns and trends in website traffic.\n3.  **Analysing Time Deltas**: Calculating and visualising the time taken from the first visit to design consultation payment.\n4.  **Determining Conversion Rates**: Calculating the percentage of website visitors who booked a design consultation.\n\n## Machine Learning Model Development\n\nThe machine learning model development process involved the following steps:\n\n1.  **Feature Selection**: Selecting relevant features for the model.\n2.  **Data Preprocessing**: Encoding categorical variables using LabelEncoder and handling imbalanced datasets using SMOTE.\n3.  
**Model Training**: Training various machine learning models, including Support Vector Machines (SVM), XGBoost, and Decision Trees.\n4.  **Pipeline Creation**: Implementing a machine learning pipeline using scikit-learn and imblearn to streamline the process.\n5.  **Cross-Validation**: Employing cross-validation techniques to ensure model robustness.\n\n## Model Evaluation\n\nThe model evaluation process involved the following steps:\n\n1.  **Performance Metrics**: Evaluating model performance using metrics such as accuracy, confusion matrix, and classification report.\n2.  **Comparative Analysis**: Comparing the performance of different models to identify the most effective one.\n3.  **Visualisation**: Visualising the results using confusion matrices and other relevant plots.\n\n## Results\n\nKey findings from the analysis include:\n\n-   **Website Traffic Patterns**: Peak first visits occurred at the beginning of December 2023, followed by a progressive decrease.\n-   **Time to Book Consultation**: The average time from the first visit to design consultation payment is approximately 17.71 days.\n-   **Conversion Rate**: Approximately 0.98% of website visitors booked a design consultation.\n\n## Challenges Addressed\n\nThe project addressed the following challenges:\n\n-   **Analysing website traffic patterns over time.**\n-   **Determining the time taken for users to pay for design consultations after their first visit.**\n-   **Calculating the percentage of website visitors who booked a design consultation.**\n-   **Understanding how conversion rates vary based on eligibility for the Boiler Upgrade Scheme grant.**\n\n## Recommendations\n\nBased on the analysis, the following recommendations are made:\n\n-   **Track website visits before installation deposit**: Monitor the number of website visits before clients make an installation deposit.\n-   **Track proposal downloads**: If proposal downloads are an option, track these to understand client engagement.\n-   
**Further investigation**: Investigate the reasons for the decline in conversion rates after the initial peak in website visits.\n\n## Further Investigation\n\nFurther areas for investigation include:\n\n-   **Impact of Boiler Upgrade Scheme grant eligibility on conversion rates.**\n-   **Reasons for the decline in conversion rates after the initial peak in website visits.**\n-   **Analysis of installer locations and their impact on service delivery.**\n\n## Contact\n\nFor any questions or further information, please contact:\n\n\\[Einstein]\n\n\\[einsteinmunachiso@gmail.com]\n\n\\[https://www.linkedin.com/in/einstein-ebereonwu/]\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmunas-git%2Fwebsite-traffic-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmunas-git%2Fwebsite-traffic-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmunas-git%2Fwebsite-traffic-analysis/lists"}