https://github.com/snowplow/dbt-snowplow-web
A fully incremental model that transforms raw web event data generated by the Snowplow JavaScript tracker into a series of derived tables of varying levels of aggregation.
analytics data-model data-pipeline dbt snowplow-analytics
- Host: GitHub
- URL: https://github.com/snowplow/dbt-snowplow-web
- Owner: snowplow
- License: other
- Created: 2021-06-22T10:38:44.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-05-14T15:38:57.000Z (12 months ago)
- Last Synced: 2024-05-15T04:36:27.174Z (12 months ago)
- Topics: analytics, data-model, data-pipeline, dbt, snowplow-analytics
- Language: Shell
- Homepage:
- Size: 2.76 MB
- Stars: 50
- Watchers: 11
- Forks: 16
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
README
[![maintained]][tracker-classification] [![License][license-image]][license] [![Discourse posts][discourse-image]][discourse]

> This package is in maintenance mode. This means it will only receive bug fixes and security patches as required. Future development of the Snowplow dbt models is being done in the [Unified Digital](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/) package which you can get access to via the [Snowplow Data Models Pack](https://snowplow.io/snowplow-data-model-pack/).
# snowplow-web
This dbt package:
- Transforms and aggregates raw web event data collected from the [Snowplow JavaScript tracker][tracker-docs] into a set of derived tables: page views, sessions and users, plus an optional set of consent tables.
- Derives a mapping between user identifiers, allowing for 'session stitching' and the development of a single customer view.
- Processes **all web events incrementally**. It is not just constrained to page view events - any custom events you are tracking will also be incrementally processed.
- Is designed in a modular manner, allowing you to easily integrate your own custom dbt models into the incremental framework provided by the package.

Please refer to the [doc site](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/) for a full breakdown of the package.
### Getting Started
The easiest way to get started is to follow our [QuickStart guide](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/), or to use our [Advanced Analytics for Web Accelerator](https://docs.snowplow.io/accelerators/web/) which includes steps for setting up tracking as well as modeling, and our [Consent Tracking for Marketing Accelerator](https://docs.snowplow.io/accelerators/consent/) specifically for our Consent Management Platform models.
### Adapter Support
The latest version of the snowplow-web package supports BigQuery, Databricks, Redshift, Snowflake & Postgres. For previous versions see our [package docs](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/).
### Requirements
- A dataset of web events from the [Snowplow JavaScript tracker][tracker-docs] must be available in the database.
- The [`webPage` context][webpage-context] must be enabled.
- dbt-core version 1.6.0 or greater must be installed.
- You must be using RDB Loader v4.0.0 and above, or BigQuery Loader v1.0.0 and above, to ensure your data has the `load_tstamp` column. If you are not using these versions, or are using the Postgres loader, you will need to set `snowplow__enable_load_tstamp` to false in your `dbt_project.yml` and will not be able to use the consent models.
### Installation
Check [dbt Hub](https://hub.getdbt.com/snowplow/snowplow_web/latest/) for the latest installation instructions.
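
As a rough sketch, a package from dbt Hub is typically declared in a `packages.yml` file at the root of your dbt project and then pulled in with `dbt deps`. The version range below is an assumed placeholder, not a value taken from this README; copy the exact value shown on dbt Hub.

```yaml
# packages.yml (sketch)
packages:
  - package: snowplow/snowplow_web
    # Assumed placeholder range; replace with the version listed on dbt Hub.
    version: [">=1.0.0", "<2.0.0"]
```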
### Configuration & Operation
Please refer to the [doc site](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/) for details on how to [configure](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/web/) and [run](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/) the package.
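
For illustration, package variables are usually set under a `vars` block in `dbt_project.yml`, scoped to the package name. The sketch below assumes that layout and shows only the `snowplow__enable_load_tstamp` variable named under Requirements; see the configuration docs for the full list of variables and their defaults.

```yaml
# dbt_project.yml (sketch)
vars:
  snowplow_web:
    # Set to false if your loader does not populate the load_tstamp column
    # (for example, the Postgres loader). The consent models cannot be used
    # in that case.
    snowplow__enable_load_tstamp: false
```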
### Models
The package contains multiple staging models; the output models are as follows:
| Model | Description |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| snowplow_web_page_views | A table of page views, including engagement metrics such as scroll depth and engaged time. |
| snowplow_web_sessions | An aggregated table of session events, including conversions [Optional], grouped on `domain_sessionid`. |
| snowplow_web_users | An aggregated table of sessions to a user level, grouped on `domain_userid`. |
| snowplow_web_user_mapping | Provides a mapping between user identifiers, `domain_userid` and `user_id`. |
| snowplow_web_consent_log | [Optional] Incremental table showing the audit trail of consent and Consent Management Platform (cmp) events |
| snowplow_web_consent_users | [Optional] By user consent stats |
| snowplow_web_consent_totals | [Optional] Summary of the latest consent status as per consent version |
| snowplow_web_consent_scope_status | [Optional] Aggregate of current number of users consented to each consent scope |
| snowplow_web_consent_cmp_stats | [Optional] Used for modeling cmp_visible events and related metrics |
| snowplow_web_consent_versions | [Optional] Used to keep track of each consent version and its validity |

Please refer to the [dbt doc site](https://snowplow.github.io/dbt-snowplow-web/#!/overview/snowplow_web) for details on the model output tables.
# Join the Snowplow community
We welcome all ideas, questions and contributions!
For support requests, please use our community support [Discourse][discourse] forum.
If you find a bug, please report an issue on GitHub.
# Copyright and license
The snowplow-web package is Copyright 2020-present Snowplow Analytics Ltd.
This distribution is all licensed under the [Snowplow Community License, Version 1.0][license]. (If you are uncertain how it applies to your use case, check our answers to [frequently asked questions](https://docs.snowplow.io/docs/contributing/community-license-faq/).)
[license]: https://docs.snowplow.io/community-license-1.0/
[license-image]: http://img.shields.io/badge/license-Snowplow--Community--1-blue.svg?style=flat
[tracker-classification]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/tracker-maintenance-classification/
[maintained]: https://img.shields.io/static/v1?style=flat&label=Snowplow&message=Maintained&color=a069d7&labelColor=9ba0aa&logo=
[tracker-docs]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/
[webpage-context]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracker-setup/initialization-options/#adding-predefined-contexts
[dbt-package-docs]: https://docs.getdbt.com/docs/building-a-dbt-project/package-management
[discourse-image]: https://img.shields.io/discourse/posts?server=https%3A%2F%2Fdiscourse.snowplow.io%2F
[discourse]: http://discourse.snowplow.io/