{"id":28047902,"url":"https://github.com/larisanti/transaction-ml","last_synced_at":"2025-05-11T21:55:32.716Z","repository":{"id":291998228,"uuid":"979466204","full_name":"larisanti/transaction-ml","owner":"larisanti","description":"This project demonstrates a sequence of BigQuery ML queries to build and evaluate a logistic regression model that predicts customer transactions based on website traffic data from Google Analytics.","archived":false,"fork":false,"pushed_at":"2025-05-07T19:46:23.000Z","size":251,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-11T21:55:27.760Z","etag":null,"topics":["bigquery","machine-learning"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/larisanti.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-07T14:53:31.000Z","updated_at":"2025-05-07T19:52:06.000Z","dependencies_parsed_at":"2025-05-11T21:55:27.987Z","dependency_job_id":null,"html_url":"https://github.com/larisanti/transaction-ml","commit_stats":null,"previous_names":["larisanti/transaction-forecasting-ml","larisanti/transaction-ml"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/larisanti%2Ftransaction-ml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/larisanti%2Ftransaction-ml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/larisanti%2Ftransaction-ml
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/larisanti%2Ftransaction-ml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/larisanti","download_url":"https://codeload.github.com/larisanti/transaction-ml/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253639575,"owners_count":21940446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","machine-learning"],"created_at":"2025-05-11T21:55:31.823Z","updated_at":"2025-05-11T21:55:32.708Z","avatar_url":"https://github.com/larisanti.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# BigQuery ML Transaction Forecasting Lab\n\nThis project is a lab exercise completed during the Machine Learning Engineer Learning Path course. It demonstrates a sequence of BigQuery ML queries to build and evaluate a logistic regression model that predicts customer transactions based on website traffic data from Google Analytics.\n\nThe project utilizes the `google_analytics_sample` dataset to train and evaluate the model. The model uses features such as operating system, mobile device usage, country, and pageviews to predict whether a visitor will make a transaction.\n\n## Workflow\n\nFirst, a BigQuery dataset is created, then:\n\n## 1.  
**Create a BigQuery ML model:**\n   \n```sql\nCREATE OR REPLACE MODEL `bqml_lab.sample_model`\nOPTIONS(model_type='logistic_reg') AS\nSELECT\n  IF(totals.transactions IS NULL, 0, 1) AS label,\n  IFNULL(device.operatingSystem, \"\") AS os,\n  device.isMobile AS is_mobile,\n  IFNULL(geoNetwork.country, \"\") AS country,\n  IFNULL(totals.pageviews, 0) AS pageviews\nFROM\n  `bigquery-public-data.google_analytics_sample.ga_sessions_*` -- dataset: Google Analytics sample data\nWHERE\n  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'\nLIMIT 100000; -- limit to 100,000 rows to speed up training\n```\n\n![Creating a BigQuery ML model](https://github.com/larisanti/transaction-forecasting-ml/blob/main/Screenshots/1.1.png)\n![Creating a BigQuery ML model - Evaluation](https://github.com/larisanti/transaction-forecasting-ml/blob/main/Screenshots/1.2.png)\n\n          \n---\n## 2.  **Evaluate the model:**\n\n```sql\nSELECT\n  *\nFROM\n  ml.EVALUATE(MODEL `bqml_lab.sample_model`, (\nSELECT\n  IF(totals.transactions IS NULL, 0, 1) AS label, -- label: 1 if the session included a transaction\n  IFNULL(device.operatingSystem, \"\") AS os, -- features used for prediction\n  device.isMobile AS is_mobile,\n  IFNULL(geoNetwork.country, \"\") AS country,\n  IFNULL(totals.pageviews, 0) AS pageviews\nFROM\n  `bigquery-public-data.google_analytics_sample.ga_sessions_*`\nWHERE\n  _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'));\n```\n\n![Evaluating the model](https://github.com/larisanti/transaction-forecasting-ml/blob/main/Screenshots/2.png)\n\n\n---\n## 3.  
**Predict purchases per country:**\n\n```sql\nSELECT\n  country,\n  SUM(predicted_label) as total_predicted_purchases -- total predicted purchases for the country\nFROM\n  ml.PREDICT(MODEL `bqml_lab.sample_model`, (\nSELECT\n  IFNULL(device.operatingSystem, \"\") AS os, -- features for prediction\n  device.isMobile AS is_mobile,\n  IFNULL(totals.pageviews, 0) AS pageviews,\n  IFNULL(geoNetwork.country, \"\") AS country\nFROM\n  `bigquery-public-data.google_analytics_sample.ga_sessions_*`\nWHERE\n  _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))\nGROUP BY country\nORDER BY total_predicted_purchases DESC\nLIMIT 10;\n```\n\n![Predicting transactions by country](https://github.com/larisanti/transaction-forecasting-ml/blob/main/Screenshots/3.png)\n\n---\n## 4.  **Predict purchases per user:**\n\n```sql\nSELECT\n  fullVisitorId,\n  SUM(predicted_label) as total_predicted_purchases -- total predicted purchases for each user\nFROM\n  ml.PREDICT(MODEL `bqml_lab.sample_model`, ( -- apply the trained model for prediction\nSELECT\n  IFNULL(device.operatingSystem, \"\") AS os,\n  device.isMobile AS is_mobile,\n  IFNULL(totals.pageviews, 0) AS pageviews,\n  IFNULL(geoNetwork.country, \"\") AS country,\n  fullVisitorId\nFROM\n  `bigquery-public-data.google_analytics_sample.ga_sessions_*`\nWHERE\n  _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))\nGROUP BY fullVisitorId\nORDER BY total_predicted_purchases DESC\nLIMIT 10;\n```\n\n![Predicting transactions per user](https://github.com/larisanti/transaction-forecasting-ml/blob/main/Screenshots/4.png)\n\n\n## Prerequisites\n\n* A Google Cloud Project\n* Access to BigQuery\n\n## Dataset\n\nThis project utilizes the following public dataset:\n\n* 
`bigquery-public-data.google_analytics_sample`\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flarisanti%2Ftransaction-ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flarisanti%2Ftransaction-ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flarisanti%2Ftransaction-ml/lists"}