{"id":13710208,"url":"https://github.com/practical-data-science/gapandas4","last_synced_at":"2025-07-21T03:14:09.018Z","repository":{"id":37770157,"uuid":"506166341","full_name":"practical-data-science/gapandas4","owner":"practical-data-science","description":"GAPandas4 is a Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe. ","archived":false,"fork":false,"pushed_at":"2022-07-06T07:55:20.000Z","size":30,"stargazers_count":34,"open_issues_count":2,"forks_count":5,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-06T18:51:02.534Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/practical-data-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-06-22T08:43:23.000Z","updated_at":"2024-11-22T16:29:38.000Z","dependencies_parsed_at":"2022-08-08T22:00:49.234Z","dependency_job_id":null,"html_url":"https://github.com/practical-data-science/gapandas4","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/practical-data-science/gapandas4","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/practical-data-science%2Fgapandas4","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/practical-data-science%2Fgapandas4/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/practical-data-science%2Fgapandas4/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/practical-data-science%2Fgapandas4/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/practical-data-science","download_url":"https://codeload.github.com/practical-data-science/gapandas4/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/practical-data-science%2Fgapandas4/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266232585,"owners_count":23896632,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T23:00:53.066Z","updated_at":"2025-07-21T03:14:08.976Z","avatar_url":"https://github.com/practical-data-science.png","language":"Python","funding_links":[],"categories":["Data"],"sub_categories":[],"readme":"# GAPandas4\nGAPandas4 is a Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe. It is the successor to the [GAPandas](https://practicaldatascience.co.uk/data-science/how-to-access-google-analytics-data-in-pandas-using-gapandas) package, which did the same thing for GA3 or Universal Analytics. GAPandas4 is a wrapper around the official Google Analytics Data API package and simplifies imports and queries, requiring far less code. \n\n### Before you start\nIn order to use GAPandas4 you will first need to [create a Google Service Account](https://practicaldatascience.co.uk/data-engineering/how-to-create-a-google-service-account-client-secrets-json-key) with access to the Google Analytics Data API and export a client secrets JSON keyfile to use for authentication. 
You'll also need to add the service account email address as a user on the Google Analytics 4 property you wish to access, and you'll need to note the property ID to use in your queries.  \n\n### Installation\nYou can install GAPandas4 in two ways: from GitHub or from PyPI, using the pip package manager. Run one of the following commands. \n\n```commandline\npip3 install git+https://github.com/practical-data-science/gapandas4.git\npip3 install gapandas4\n```\n\n### Usage examples\nGAPandas4 has been written to allow you to use as little code as possible. Unlike the previous version of GAPandas for Universal Analytics, which used a payload based on a Python dictionary, GAPandas4 now uses a Protobuf (Protocol Buffer) payload as used in the API itself. \n\n#### Report\nThe `query()` function is used to send a protobuf API payload to the API. The function supports various report types \nvia the `report_type` argument. Standard reports are handled using `report_type=\"report\"`, which is also the \ndefault. Data are returned as a Pandas dataframe. \n\n```python\nimport gapandas4 as gp\n\nservice_account = 'client_secrets.json'\nproperty_id = 'xxxxxxxxx'\n\nreport_request = gp.RunReportRequest(\n    property=f\"properties/{property_id}\",\n    dimensions=[\n        gp.Dimension(name=\"country\"),\n        gp.Dimension(name=\"city\")\n    ],\n    metrics=[\n        gp.Metric(name=\"activeUsers\")\n    ],\n    date_ranges=[gp.DateRange(start_date=\"2022-06-01\", end_date=\"2022-06-01\")],\n)\n\ndf = gp.query(service_account, report_request, report_type=\"report\")\nprint(df.head())\n```\n\n#### Batch report\nIf you construct a protobuf payload using `BatchRunReportsRequest()` you can pass up to five requests at once. These \nare returned as a list of Pandas dataframes, so you will need to access them using their index. 
\n\n```python\nimport gapandas4 as gp\n\nservice_account = 'client_secrets.json'\nproperty_id = 'xxxxxxxxx'\n\n\nbatch_report_request = gp.BatchRunReportsRequest(\n    property=f\"properties/{property_id}\",\n    requests=[\n        gp.RunReportRequest(\n            dimensions=[\n                gp.Dimension(name=\"country\"),\n                gp.Dimension(name=\"city\")\n            ],\n            metrics=[\n                gp.Metric(name=\"activeUsers\")\n            ],\n            date_ranges=[gp.DateRange(start_date=\"2022-06-01\", end_date=\"2022-06-01\")]\n        ),\n        gp.RunReportRequest(\n            dimensions=[\n                gp.Dimension(name=\"country\"),\n                gp.Dimension(name=\"city\")\n            ],\n            metrics=[\n                gp.Metric(name=\"activeUsers\")\n            ],\n            date_ranges=[gp.DateRange(start_date=\"2022-06-02\", end_date=\"2022-06-02\")]\n        )\n    ]\n)\n\ndf = gp.query(service_account, batch_report_request, report_type=\"batch_report\")\nprint(df[0].head())\nprint(df[1].head())\n```\n\n#### Pivot report\nConstructing a report using `RunPivotReportRequest()` will return pivoted data in a single Pandas dataframe. 
\n\n```python\nimport gapandas4 as gp\n\nservice_account = 'client_secrets.json'\nproperty_id = 'xxxxxxxxx'\n\npivot_request = gp.RunPivotReportRequest(\n    property=f\"properties/{property_id}\",\n    dimensions=[gp.Dimension(name=\"country\"),\n                gp.Dimension(name=\"browser\")],\n    metrics=[gp.Metric(name=\"sessions\")],\n    date_ranges=[gp.DateRange(start_date=\"2022-05-30\", end_date=\"today\")],\n    pivots=[\n        gp.Pivot(\n            field_names=[\"country\"],\n            limit=5,\n            order_bys=[\n                gp.OrderBy(\n                    dimension=gp.OrderBy.DimensionOrderBy(dimension_name=\"country\")\n                )\n            ],\n        ),\n        gp.Pivot(\n            field_names=[\"browser\"],\n            offset=0,\n            limit=5,\n            order_bys=[\n                gp.OrderBy(\n                    metric=gp.OrderBy.MetricOrderBy(metric_name=\"sessions\"), desc=True\n                )\n            ],\n        ),\n    ],\n)\n\ndf = gp.query(service_account, pivot_request, report_type=\"pivot\")\nprint(df.head())\n```\n\n#### Batch pivot report\nConstructing a payload using `BatchRunPivotReportsRequest()` will allow you to run up to five pivot reports. These \nare returned as a list of Pandas dataframes. 
\n\n```python\nimport gapandas4 as gp\n\nservice_account = 'client_secrets.json'\nproperty_id = 'xxxxxxxxx'\n\nbatch_pivot_request = gp.BatchRunPivotReportsRequest(\n    property=f\"properties/{property_id}\",\n    requests=[\n        gp.RunPivotReportRequest(\n            dimensions=[gp.Dimension(name=\"country\"),\n                        gp.Dimension(name=\"browser\")],\n                metrics=[gp.Metric(name=\"sessions\")],\n                date_ranges=[gp.DateRange(start_date=\"2022-05-30\", end_date=\"today\")],\n                pivots=[\n                    gp.Pivot(\n                        field_names=[\"country\"],\n                        limit=5,\n                        order_bys=[\n                            gp.OrderBy(\n                                dimension=gp.OrderBy.DimensionOrderBy(dimension_name=\"country\")\n                            )\n                        ],\n                    ),\n                    gp.Pivot(\n                        field_names=[\"browser\"],\n                        offset=0,\n                        limit=5,\n                        order_bys=[\n                            gp.OrderBy(\n                                metric=gp.OrderBy.MetricOrderBy(metric_name=\"sessions\"), desc=True\n                            )\n                        ],\n                    ),\n                ],\n        ),\n        gp.RunPivotReportRequest(\n            dimensions=[gp.Dimension(name=\"country\"),\n                        gp.Dimension(name=\"browser\")],\n                metrics=[gp.Metric(name=\"sessions\")],\n                date_ranges=[gp.DateRange(start_date=\"2022-05-30\", end_date=\"today\")],\n                pivots=[\n                    gp.Pivot(\n                        field_names=[\"country\"],\n                        limit=5,\n                        order_bys=[\n                            gp.OrderBy(\n                                dimension=gp.OrderBy.DimensionOrderBy(dimension_name=\"country\")\n     
                       )\n                        ],\n                    ),\n                    gp.Pivot(\n                        field_names=[\"browser\"],\n                        offset=0,\n                        limit=5,\n                        order_bys=[\n                            gp.OrderBy(\n                                metric=gp.OrderBy.MetricOrderBy(metric_name=\"sessions\"), desc=True\n                            )\n                        ],\n                    ),\n                ],\n        )\n    ]\n)\n\ndf = gp.query(service_account, batch_pivot_request, report_type=\"batch_pivot\")\nprint(df[0].head())\nprint(df[1].head())\n\n```\n\n#### Metadata\nThe `get_metadata()` function will return all metadata on dimensions and metrics within the Google Analytics 4 property. \n\n```python\nmetadata = gp.get_metadata(service_account, property_id)\nprint(metadata)\n```\n\n### Current features\n- Support for all current API functionality including `RunReportRequest`, `BatchRunReportsRequest`,\n  `RunPivotReportRequest`,  `BatchRunPivotReportsRequest`, `RunRealtimeReportRequest`, and `GetMetadataRequest`. \n- Returns data in a Pandas dataframe, or a list of Pandas dataframes. ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpractical-data-science%2Fgapandas4","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpractical-data-science%2Fgapandas4","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpractical-data-science%2Fgapandas4/lists"}