{"id":22437242,"url":"https://github.com/anhvu2201/e-commerce_website_performance_analysis","last_synced_at":"2026-01-07T06:05:32.957Z","repository":{"id":263218947,"uuid":"889711927","full_name":"anhvu2201/E-commerce_Website_Performance_Analysis","owner":"anhvu2201","description":"Ultilize SQL in Big Query to calculate the key metrics. Furthermore, identify current status of the business to assist in deciding the next business plan.","archived":false,"fork":false,"pushed_at":"2024-11-25T15:40:35.000Z","size":25,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-01T13:45:07.495Z","etag":null,"topics":["problem-solving","sql","sql-query"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anhvu2201.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-17T02:48:18.000Z","updated_at":"2024-11-25T15:43:54.000Z","dependencies_parsed_at":"2024-12-06T00:12:10.537Z","dependency_job_id":"e7461deb-ab11-442d-ac15-42413e225731","html_url":"https://github.com/anhvu2201/E-commerce_Website_Performance_Analysis","commit_stats":null,"previous_names":["anhvu2201/explore-ecommerce-dataset","anhvu2201/e-commerce_website_performance_analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anhvu2201%2FE-commerce_Website_Performance_Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anhvu2201%2FE-commerce_Websit
e_Performance_Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anhvu2201%2FE-commerce_Website_Performance_Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anhvu2201%2FE-commerce_Website_Performance_Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anhvu2201","download_url":"https://codeload.github.com/anhvu2201/E-commerce_Website_Performance_Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245813232,"owners_count":20676763,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["problem-solving","sql","sql-query"],"created_at":"2024-12-06T00:12:16.872Z","updated_at":"2026-01-07T06:05:32.944Z","avatar_url":"https://github.com/anhvu2201.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"#  E-commerce Website Performance Analysis\n# I. Introduction\n\nIn this project, I will use SQL on Google [BigQuery](https://cloud.google.com/bigquery/) to explore an eCommerce business dataset, which is based on the Google Analytics public dataset.\n\n# II. 
Dataset Exploration\n\nThere are 8 queries in this project:\n\n## Query 01: Calculate total visits, pageviews and transactions of the business in January, February and March 2017, ordered by month.\n\n- SQL Code:\n\n\t\tSELECT\n\t\t    format_date('%Y%m',PARSE_DATE('%Y%m%d', date) ) month\n\t\t    ,SUM(totals.visits) visits\n\t\t    ,SUM(totals.pageviews) pageviews\n\t\t    ,SUM(totals.transactions) transactions\n\t\tFROM \n\t\t    `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`\n\t\tWHERE \n\t\t    date BETWEEN '20170101' AND '20170331'\n\t\tGROUP BY \n\t\t    month\n\t\tORDER BY \n\t\t    month;\n- Query Result:\n\n\t\t\tmonth\tvisits\tpageviews\ttransactions\n\t\t\t201701\t64694\t257708\t\t713\n\t\t\t201702\t62192\t233373\t\t733\n\t\t\t201703\t69931\t259522\t\t993\n- Link To Result: [Link](https://drive.google.com/file/d/1TMId10oA9mxwMws7YoTywQspK7LW2il4/view?usp=sharing)\n\n## Query 02: Bounce rate per traffic source in July 2017.\n\n- SQL Code:\n\n\t\tSELECT \n\t\t    trafficSource.source source\n\t\t    ,SUM(totals.visits) visits\n\t\t    ,SUM(totals.bounces) total_no_of_bounces\n\t\t    ,ROUND((SUM(totals.bounces) / SUM(totals.visits)) *100,3) bounce_rate\n\t\tFROM \n\t\t    `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`\n\t\tGROUP BY \n\t\t    trafficSource.source\n\t\tORDER BY \n\t\t    visits DESC;\n- Query Result:\n\n\t\tsource\t\tvisits\ttotal_no_of_bounces\tbounce_rate\n\t\tgoogle\t\t38400\t19798\t\t\t51.557\n\t\t(direct)\t\t19891\t8606\t\t\t43.266\n\t\tyoutube.com\t\t6351\t4238\t\t\t66.73\n\t\tanalytics.google.com\t1972\t1064\t\t\t53.955\n\t\tPartners\t\t1788\t936\t\t\t52.349\n\t\tm.facebook.com\t669\t430\t\t\t64.275\n\t\tgoogle.com\t\t368\t183\t\t\t49.728\n\t\tdfa\t\t\t302\t124\t\t\t41.06\n\t\tsites.google.com\t230\t97\t\t\t42.174\n- Link To Result: [Link](https://drive.google.com/file/d/1a_w1-Brkxmsx2encFke0t704Yj9s6BBB/view?usp=sharing)\n\n## Query 03: Revenue contributed by traffic source 
calculated by week and by month in June 2017.\n\n- SQL Code:\n\n\t\tSELECT\n\t\t    'Month' time_type\n\t\t    , format_date('%Y%m',PARSE_DATE('%Y%m%d', date) ) time\n\t\t    , trafficSource.source source\n\t\t    , ROUND (SUM ((product.productRevenue) / 1000000),4) revenue\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201706*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tWHERE product.productRevenue is not null\n\t\tGROUP BY source, time\n\t\t\n\t\tUNION ALL\n\t\t\n\t\tSELECT\n\t\t    'Week' time_type\n\t\t    , format_date('%Y%W',PARSE_DATE('%Y%m%d', date) ) time\n\t\t    , trafficSource.source source\n\t\t    , ROUND (SUM ((product.productRevenue) / 1000000),4) revenue\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201706*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tWHERE product.productRevenue is not null\n\t\tGROUP BY source, time\n\t\tOrder by revenue DESC;\n- Query Result:\n\n\t\ttime_type\ttime\tsource\t\trevenue\n\t\tMonth\t\t201706\t(direct)\t97333.6197\n\t\tWeek\t\t201724\t(direct)\t30908.9099\n\t\tWeek\t\t201725\t(direct)\t27295.3199\n\t\tMonth\t\t201706\tgoogle\t\t18757.1799\n\t\tWeek\t\t201723\t(direct)\t17325.6799\n\t\tWeek\t\t201726\t(direct)\t14914.81\n\t\tWeek\t\t201724\tgoogle\t\t9217.17\n\t\tMonth\t\t201706\tdfa\t\t8862.23\n\t\tWeek\t\t201722\t(direct)\t6888.9\n\t\tWeek\t\t201726\tgoogle\t\t5330.57\n- Link To Result: [Link](https://drive.google.com/file/d/1bIS2-TLoupKlBFz62ECcB00XLRxeOs8h/view?usp=sharing)\n\n\n## Query 04: Average number of product pageviews categorized by purchaser type (purchasers and non-purchasers) in June and July 2017.\n\n- SQL Code:\n\n\t\tWITH \n\t\tpurchaser_data AS(\n\t\t  SELECT\n\t\t      FORMAT_DATE(\"%Y%m\",PARSE_DATE(\"%Y%m%d\",date)) AS month,\n\t\t      (SUM(totals.pageviews)/COUNT(DISTINCT fullvisitorid)) AS avg_pageviews_purchase,\n\t\t  FROM 
`bigquery-public-data.google_analytics_sample.ga_sessions_2017*`\n\t\t    ,UNNEST(hits) hits\n\t\t    ,UNNEST(product) product\n\t\t  WHERE _table_suffix BETWEEN '0601' AND '0731'\n\t\t  AND totals.transactions\u003e=1\n\t\t  AND product.productRevenue IS NOT NULL\n\t\t  GROUP BY month\n\t\t),\n\t\t\n\t\tnon_purchaser_data AS(\n\t\t  SELECT\n\t\t      FORMAT_DATE(\"%Y%m\",PARSE_DATE(\"%Y%m%d\",date)) AS month,\n\t\t      SUM(totals.pageviews)/COUNT(DISTINCT fullvisitorid) AS avg_pageviews_non_purchase,\n\t\t  FROM `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`\n\t\t      ,UNNEST(hits) hits\n\t\t    ,UNNEST(product) product\n\t\t  WHERE _table_suffix BETWEEN '0601' AND '0731'\n\t\t  AND totals.transactions IS NULL\n\t\t  AND product.productRevenue IS NULL\n\t\t  GROUP BY month\n\t\t)\n\t\t\n\t\tSELECT\n\t\t    pd.*,\n\t\t    avg_pageviews_non_purchase\n\t\tFROM purchaser_data pd\n\t\tFULL JOIN non_purchaser_data USING(month)\n\t\tORDER BY pd.month;\n- Query Result:\n\n\t\tmonth\t\tavg_pageviews_purchase\tavg_pageviews_non_purchase\n\t\t201706\t94.02050113895217\t316.86558846341671\n\t\t201707\t124.23755186721992\t334.05655979568053\n- Link To Result: [Link](https://drive.google.com/file/d/1XxcJESc57hGYPZOmuQ2H1o-DVRq1KB3x/view?usp=sharing)\n\n## Query 05: Average number of transactions per user who made at least one purchase in July 2017.\n\n- SQL Code:\n\n\t\tSELECT\n\t\t    format_date('%Y%m',PARSE_DATE('%Y%m%d', date) ) Month\n\t\t    , sum(totals.transactions) / count (distinct (fullVisitorId)) Avg_total_transactions_per_user\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tWHERE totals.transactions \u003e= 1 \n\t\t  and product.productRevenue is not null\n\t\t  and _table_suffix between '01' and '31'\n\t\tGROUP BY month;\n- Query Result:\n\n\t\tMonth\t\tAvg_total_transactions_per_user\n\t\t201707\t4.16390041493776\n- Link To 
Result: [Link](https://drive.google.com/file/d/15zB2NTqVZiVx8lGYxuZdN7dbbjSryRx8/view?usp=sharing)\n\n## Query 06: Average amount of money spent per session in July 2017.\n\n- SQL Code:\n\n\t\tSELECT\n\t\t    format_date('%Y%m',PARSE_DATE('%Y%m%d', date) ) Month\n\t\t    , ROUND ((sum(product.productRevenue) / sum(totals.visits) / 1000000),2) avg_revenue_by_user_per_visit\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tWHERE totals.transactions is not null \n\t\t    and product.productRevenue is not null\n\t\t    and _table_suffix between '01' and '31'\n\t\tGROUP BY month;\n- Query Result:\n\n\t\tMonth\t\tavg_revenue_by_user_per_visit\n\t\t201707\t43.86\n- Link To Result: [Link](https://drive.google.com/file/d/1-YCR7yBo3gMGngwNUTfLfX58JoEOqeKQ/view?usp=sharing)\n\n## Query 07: Other products purchased by customers who purchased product \"YouTube Men's Vintage Henley\" in July 2017.\n\n- SQL Code:\n\n\t\tWith customers_who_purchased_henley as (\n\t\tSELECT DISTINCT fullVisitorId\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tWHERE   \n\t\t     product.v2ProductName = \"YouTube Men's Vintage Henley\"\n\t\t    and _table_suffix between '01' and '31'\n\t\t    and totals.transactions is not null\n\t\t    and product.productRevenue is not null\n\t\t)\n\t\tSELECT \n\t\t    product.v2ProductName other_purchased_products,\n\t\t    sum(product.productQuantity) quantity\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`\n\t\t    ,UNNEST (hits) hits,\n\t\t    UNNEST (hits.product) product\n\t\tINNER JOIN customers_who_purchased_henley\n\t\tUSING (fullVisitorId)\n\t\tWHERE \n\t\t    product.v2ProductName \u003c\u003e \"YouTube Men's Vintage Henley\"\n\t\t    and _table_suffix between '01' and '31'\n\t\t    and totals.transactions is not null\n\t\t   
 and product.productRevenue is not null\n\t\tGROUP BY other_purchased_products\n\t\tORDER BY quantity DESC;\n- Query Result:\n\n\t\tother_purchased_products\t\t\t\tquantity\n\t\tGoogle Sunglasses\t\t\t\t\t20\n\t\tGoogle Women's Vintage Hero Tee Black\t\t\t7\n\t\tSPF-15 Slim \u0026 Slender Lip Balm\t\t\t6\n\t\tGoogle Women's Short Sleeve Hero Tee Red Heather\t4\n\t\tYouTube Men's Fleece Hoodie Black\t\t\t3\n\t\tGoogle Men's Short Sleeve Badge Tee Charcoal\t\t3\n\t\tCrunch Noise Dog Toy\t\t\t\t\t2\n\t\tAndroid Wool Heather Cap Heather/Black\t\t2\n\t\tYouTube Twill Cap\t\t\t\t\t2\n\t\tRecycled Mouse Pad\t\t\t\t\t2\n- Link To Result: [Link](https://drive.google.com/file/d/1eWZoWVcLPy-2Uv4K1F_tjkrR7ZGE8hwi/view?usp=sharing)\n\n## Query 08: Calculate cohort map from product view to add-to-cart to purchase for January through March 2017.\n\n- SQL Code:\n\n\t\tWITH product_data AS(\n\t\tSELECT\n\t\t    FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d',date)) AS month,\n\t\t    COUNT(CASE WHEN eCommerceAction.action_type = '2' THEN product.v2ProductName END) AS num_product_view,\n\t\t    COUNT(CASE WHEN eCommerceAction.action_type = '3' THEN product.v2ProductName END) AS num_add_to_cart,\n\t\t    COUNT(CASE WHEN eCommerceAction.action_type = '6' AND product.productRevenue IS NOT NULL THEN product.v2ProductName END) AS num_purchase\n\t\tFROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`\n\t\t,UNNEST(hits) AS hits\n\t\t,UNNEST (hits.product) AS product\n\t\tWHERE _table_suffix BETWEEN '20170101' AND '20170331'\n\t\tAND eCommerceAction.action_type IN ('2','3','6')\n\t\tGROUP BY month\n\t\tORDER BY month\n\t\t)\n\t\t\n\t\tSELECT\n\t\t    *,\n\t\t    ROUND(num_add_to_cart/num_product_view * 100, 2) AS add_to_cart_rate,\n\t\t    ROUND(num_purchase/num_product_view * 100, 2) AS purchase_rate\n\t\tFROM product_data;\n- Query Result:\n\n\t\tmonth\t\tnum_product_view\tnum_add_to_cart\t  num_purchase\tadd_to_cart_rate\tpurchase_rate\n\t\t201701\t25787\t\t\t7342\t\t  
2143\t\t28.47\t\t\t8.31\n\t\t201702\t21489\t\t\t7360\t\t  2060\t\t34.25\t\t\t9.59\n\t\t201703\t23549\t\t\t8782\t\t  2977\t\t37.29\t\t\t12.64\n- Link To Result: [Link](https://drive.google.com/file/d/1MiMC9QuzYWLw_or2QZt60T7YqvYhmM1y/view?usp=sharing)\n\n# III. Conclusion\n\n- In conclusion, analyzing the eCommerce dataset using SQL on Google BigQuery has uncovered key insights into total visits, pageviews, transactions, bounce rate, and revenue by traffic source, which can drive more informed business decisions.\n- By exploring the dataset, a deeper understanding of critical metrics is achieved, setting the foundation for further analysis. The next step will involve using visualization tools like Power BI or Tableau to highlight key trends and patterns.\n- Overall, this project showcases the effectiveness of combining SQL with big data tools like Google BigQuery to derive actionable insights from extensive datasets, emphasizing the value of data-driven decision-making.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanhvu2201%2Fe-commerce_website_performance_analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanhvu2201%2Fe-commerce_website_performance_analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanhvu2201%2Fe-commerce_website_performance_analysis/lists"}