{"id":23307410,"url":"https://github.com/tobiasmcvey/gar-reporting","last_synced_at":"2025-06-19T06:36:27.051Z","repository":{"id":167154975,"uuid":"227904363","full_name":"tobiasmcvey/gar-reporting","owner":"tobiasmcvey","description":"Exporting and analysing data from Google Analytics in R","archived":false,"fork":false,"pushed_at":"2024-06-28T13:53:53.000Z","size":19,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-13T06:26:17.302Z","etag":null,"topics":["ab-testing","google-analytics","google-analytics-api","r"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tobiasmcvey.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-12-13T19:00:40.000Z","updated_at":"2024-06-28T13:53:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"9290eb0d-cfb2-4c09-890a-0830d6f90e80","html_url":"https://github.com/tobiasmcvey/gar-reporting","commit_stats":null,"previous_names":["tobiasmcvey/gar-reporting"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasmcvey%2Fgar-reporting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasmcvey%2Fgar-reporting/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasmcvey%2Fgar-reporting/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasmcvey%2Fgar-reporting/manifests","owner_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tobiasmcvey","download_url":"https://codeload.github.com/tobiasmcvey/gar-reporting/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247574095,"owners_count":20960495,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ab-testing","google-analytics","google-analytics-api","r"],"created_at":"2024-12-20T12:35:52.548Z","updated_at":"2025-04-07T00:48:58.618Z","avatar_url":"https://github.com/tobiasmcvey.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **Google Analytics Reporting in R**\n\nThese are examples of how to use the [googleAnalyticsR](https://code.markedmondson.me/googleAnalyticsR/) package by Mark Edmondson. \n\nThe package offers support to download data from Google Analytics via the Google Analytics Reporting API. \n\nThis is valuable when you have a large dataset, f.ex more than ten pages with 5,000 rows in each, and when you want to attempt to download data without sampling.\n\nIn my case I found it useful when the dataset contains a few hundred thousand rows. If you have a lot of traffic and custom dimensions containing unique IDs such as timestamp, client ID and session ID this is likely to occur. Doing this manually takes a lot of time.\n\nRather than downloading these datasets from the user interface of Google Analytics, this package handles downloading the entire dataset for you, freeing up your time to plan your analysis and reporting instead. 
:)\n\nThis has helped me run AB tests, spend less time gathering data, and focus more on interpreting the results to draw conclusions.\n\n**Use cases**\n* Creating reports with large datasets\n* Performing AB tests and other statistical tests\n* ETL jobs for a database\n* Data mining\n\n**Contents**\n* [Authentication](https://github.com/tobmcv/gar-reporting#authentication)\n* [Creating Reports](https://github.com/tobmcv/gar-reporting#creating-reports)\n* [Reports with filters](https://github.com/tobmcv/gar-reporting#example-query-for-reporting-with-filters)\n* [Reports with segments](https://github.com/tobmcv/gar-reporting#example-query-for-reporting-with-segments)\n* [Fields in reporting queries](https://github.com/tobmcv/gar-reporting#fields-in-example-requests)\n\n## **Authentication**\n\nI recommend reading [Mark Edmondson's guide for setting up a project in Google Cloud](http://code.markedmondson.me/googleAnalyticsR/articles/setup.html#your-own-google-project).\n\nI use the googleAuthR package for authentication, but there are other options.\n```r\ngoogleAuthR::gar_set_client\n```\n\nI create two JSON files to store credentials and account-specific information. Both are excluded from version control via `.gitignore`.\n\nThe first contains project credentials from Google Cloud, and the second contains the unique table IDs for querying different properties in Google Analytics.\n\n## **Creating reports**\nI recommend starting with either reports that use filters or reports that use segments.\n\nCreating reports is simple with googleAnalyticsR. 
I recommend the following syntax to benefit from Analytics Reporting API version 4.\n\nUse the [Dimensions and Metrics Explorer](https://ga-dev-tools.appspot.com/dimensions-metrics-explorer/) to look up the API names of your metrics and dimensions.\n\nUse the [Google Analytics Account Explorer](https://ga-dev-tools.appspot.com/account-explorer/) to look up the View IDs for your account properties.\n\n\n### **Example Query for reporting with filters**\n\nHere is an example query that creates a report as a flat table:\n```r\nmypage_usage \u003c- google_analytics(ga_tableid,\n                                 date_range = c(\"2019-09-02\", \"2019-09-08\"),\n                                 metrics = c(\"uniquePageviews\", \"uniqueEvents\"),\n                                 dimensions = c(\"pagePath\"),\n                                 dim_filters = dim_filter_pagepath,\n                                 anti_sample = TRUE)\n```\n\nThis lets us download a table consisting of two metrics, Unique Pageviews and Unique Events, and the dimension Page Path. We also use a filter to home in on a specific webpage. Finally, we add the `anti_sample` argument so that googleAnalyticsR tries to retrieve the complete dataset without sampling.\n\n**Making filters**\n\nI highly recommend using filters in your report to speed up the download of your dataset and to reduce the risk of sampling. Google Analytics will often let you get more data without sampling if you filter the dataset thoroughly in advance. If you apply the filters inside segments instead, this is less likely to work.\n\nSimply assign the filter to a variable and use it in your query. 
In this example we filter on the page path, which is the URI path only, without the protocol and hostname.\n```r\ndim_filter_pagepath \u003c- filter_clause_ga4(list(dim_filter(dimension = \"pagePath\", operator = \"REGEXP\", expressions = \"^\\\\/(foldername)\\\\/(pagename)\\\\/$\")))\n```\n\nI prefer regular expressions since they let you match either a group of values, for example several URLs, or a single page URL. You can also use other filter operators, such as exact match or contains, like so:\n\n```r\ndim_filter_pagepath \u003c- filter_clause_ga4(list(dim_filter(dimension = \"pagePath\", operator = \"EXACT\", expressions = \"/foldername/pagename\")))\n```\n\n### **Example Query for reporting with segments**\nTo run a query for a report based on **segments** you can try this approach.\n\n**If you already have a segment** in Google Analytics you can retrieve the data that matches the segment by using the `segment_ga4` function.\n\nTo see the list of segments and their IDs, run `ga_segment_list()` and store the result as a table, since that is easier to read, for example `ga_segments \u003c- ga_segment_list()`. 
Your custom segments will appear with the prefix `gaid::`.\n\nFor example, retrieving two segments for an AB split test:\n```r\nab_controlgroup \u003c- segment_ga4(\"control\", segment_id = \"gaid::xxxxxxxxxxxxxx\")\nab_variantgroup \u003c- segment_ga4(\"variant\", segment_id = \"gaid::yyyyyyyyyyyyyy\")\n```\nFind your segment IDs and store each segment in a variable with an easily recognisable name.\n\nThen compose a query containing the segment, for example like this:\n```r\nab_mypage_controlgroup \u003c- google_analytics(ga_tableid,\n                                           date_range = c(\"2019-10-21\", \"2019-10-23\"),\n                                           metrics = c(\"users\", \"uniqueEvents\"),\n                                           dimensions = c(\"dimension14\", \"eventCategory\", \"eventAction\", \"eventLabel\"),\n                                           segments = ab_controlgroup,\n                                           anti_sample = TRUE)\n```\n\nThis retrieves a table with the metrics Users and Unique Events, and combines custom and standard dimensions: our 14th custom dimension, Event Category, Event Action and Event Label. We add a segment to retrieve only data for our control group in an AB split test, and add `anti_sample` to try to get unsampled data. 
The custom dimension contains a session-specific ID, so we can compare our segments by a unique ID for each visit.\n\nIf you want to **create a segment on the fly**, here is an example of a segment element:\n```r\nse \u003c- segment_element(\"eventAction\",\n                      operator = \"REGEXP\",\n                      type = \"DIMENSION\",\n                      expressions = \"optimize.*\",\n                      scope = \"HIT\")\n```\n\n### **Fields in example requests**\nThese are the fields used in the example queries.\n\n| Object | Argument | Explanation | Example Values |\n| :--------- | --------: | :----------- | :--------------|\n| `google_analytics` | `viewId` | unique ID for the View in Google Analytics | a variable `ga_tableid` or string `1234567` |\n| `google_analytics` | `date_range` | date range to query | `c(\"2019-09-02\", \"2019-09-08\")` |\n| `google_analytics` | `metrics` | list of GA metrics | `c(\"uniquePageviews\", \"uniqueEvents\")` |\n| `google_analytics` | `dimensions` | list of GA dimensions | `c(\"pagePath\")` |\n| `google_analytics` | `dim_filters` | list of GA filters | a variable `dim_filter_pagepath` or a filter clause |\n| `google_analytics` | `segments` | list of GA segments | a variable containing a segment ID or a string |\n| `google_analytics` | `anti_sample` | try to download the data without sampling | `TRUE` or `FALSE` |\n\n\n## **To Do**\nExample code for running AB tests\n\n\n## SessionInfo\n\n```\n\u003e sessionInfo()\nR version 3.6.1 (2019-07-05)\nPlatform: x86_64-apple-darwin15.6.0 (64-bit)\nRunning under: macOS Catalina 10.15.2\n\nMatrix products: default\nBLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib\nLAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib\n\nlocale:\n[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8\n\nattached base packages:\n[1] stats     graphics  grDevices utils     datasets  methods   base     
\n\nother attached packages:\n[1] googleAnalyticsR_0.6.0 jsonlite_1.6          \n\nloaded via a namespace (and not attached):\n [1] Rcpp_1.0.2        pillar_1.4.2      compiler_3.6.1    googleAuthR_1.0.0\n [5] remotes_2.1.0     prettyunits_1.0.2 tools_3.6.1       testthat_2.2.1   \n [9] pkgload_1.0.2     zeallot_0.1.0     digest_0.6.20     pkgbuild_1.0.5   \n[13] memoise_1.1.0     gargle_0.3.1      tibble_2.1.3      lifecycle_0.1.0  \n[17] pkgconfig_2.0.2   rlang_0.4.2       cli_1.1.0         rstudioapi_0.10  \n[21] curl_4.0          withr_2.1.2       dplyr_0.8.3       httr_1.4.1       \n[25] askpass_1.1       desc_1.2.0        fs_1.3.1          vctrs_0.2.1      \n[29] htmlwidgets_1.3   devtools_2.2.0    rprojroot_1.3-2   DT_0.8           \n[33] tidyselect_0.2.5  glue_1.3.1        R6_2.4.0          processx_3.4.1   \n[37] sessioninfo_1.1.1 tidyr_1.0.0       purrr_0.3.3       callr_3.3.1      \n[41] magrittr_1.5      usethis_1.5.1     backports_1.1.4   ps_1.3.0         \n[45] htmltools_0.3.6   ellipsis_0.2.0.1  assertthat_0.2.1  openssl_1.4.1    \n[49] crayon_1.3.4  \n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobiasmcvey%2Fgar-reporting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftobiasmcvey%2Fgar-reporting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobiasmcvey%2Fgar-reporting/lists"}