{"id":19491806,"url":"https://github.com/rudderlabs/dbt-sessionization","last_synced_at":"2026-03-17T14:37:10.519Z","repository":{"id":49948149,"uuid":"266024002","full_name":"rudderlabs/dbt-sessionization","owner":"rudderlabs","description":"Using DBT for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.","archived":false,"fork":false,"pushed_at":"2022-11-07T19:05:59.000Z","size":20,"stargazers_count":20,"open_issues_count":1,"forks_count":7,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-25T18:49:56.082Z","etag":null,"topics":["dbt","rudderstack","sessionization"],"latest_commit_sha":null,"homepage":"https://rudderstack.com/","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rudderlabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-22T05:13:33.000Z","updated_at":"2024-12-20T06:50:14.000Z","dependencies_parsed_at":"2023-01-21T04:42:18.408Z","dependency_job_id":null,"html_url":"https://github.com/rudderlabs/dbt-sessionization","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rudderlabs%2Fdbt-sessionization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rudderlabs%2Fdbt-sessionization/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rudderlabs%2Fdbt-sessionization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rudderlabs%2Fdbt-sessionization/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/rudderlabs","download_url":"https://codeload.github.com/rudderlabs/dbt-sessionization/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240738092,"owners_count":19849546,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dbt","rudderstack","sessionization"],"created_at":"2024-11-10T21:18:23.914Z","updated_at":"2025-11-18T14:34:38.257Z","avatar_url":"https://github.com/rudderlabs.png","language":null,"readme":"# Sessionization using DBT and RudderStack\n\nThis repository contains a sample DBT project for RudderStack. It can be applied to RudderStack data residing in BigQuery. This DBT project creates \"session\" abstractions on top of RudderStack's `track` event data. Materialized DBT tables and views are used to build these abstractions.\n\n## Overview\n\nThis project builds on top of data from the `tracks` table, which is created by default in all RudderStack warehouse destinations. Developers with access to a RudderStack BigQuery destination can start using this code by simply changing the dataset and schema names in the following files:\n* `models/rudder/dbt_aliases_mapping.sql`\n* `models/rudder/dbt_mapped_tracks.sql`\n\nFor example, a statement like `... from big-query-integration-poc.RudderAutoTrack.tracks` will need to be changed \nto ` ... from MyBigQueryDataSet.MyRudderSchema.tracks`.\n\n## How to Use This Repository\n\nThis project was created on DBT Cloud (https://cloud.getdbt.com). Hence there is no `profiles.yml` file with the connection information. 
Developers who want to execute the models in Command Line Interface (CLI) mode will need to create additional configuration files by following the directions provided [here](https://docs.getdbt.com/docs/running-a-dbt-project/using-the-command-line-interface/).\n\n**Note**: While this code has been tested on Google BigQuery, it should also be usable for other RudderStack-supported data warehouses like Amazon Redshift and Snowflake. The only differences that might arise are with regard to functions related to timestamp handling and analytics. Even then, we believe the code should be usable by just replacing the BigQuery functions with their counterparts from Redshift or Snowflake as required.\n\n### Sequence of Commands\n\nThe sequence in which the DBT models should be executed for a fresh run is as follows:\n* `dbt_aliases_mapping`\n\nThis model/table has two attributes/columns - `alias` and `dbt_visitor_id`. This table captures the linkages between one or more `anonymous_id` values (`alias`) and a `user_id` (`dbt_visitor_id`).\n\n* `dbt_mapped_tracks`\n\nThis table has columns - `event_id`, `anonymous_id`, `dbt_visitor_id`, `timestamp`, `event`, `idle_time_minutes`.\n`event` represents the actual event name. `timestamp` corresponds to the instant when the event was actually generated.\n`idle_time_minutes` captures the time gap between the event and the immediately preceding one.\n\n* `dbt_session_tracks`\n\nThis table contains columns - `session_id`, `dbt_visitor_id`, `session_start_at`, `session_sequence_number`, `next_session_start_at`. \n\nThe data in the `dbt_mapped_tracks` table is partitioned first by `dbt_visitor_id`. It is then partitioned further into \ngroups of events in which the time gap between consecutive events, i.e. `idle_time_minutes`, is not more than 30. In other words - if `idle_time_minutes` for an event is more than 30, a new group is created. 
\n\nThese groups of sequential events are essentially the sessions. The value of 30 can be modified in the model definition. \nThe `session_sequence_number` represents the order of the session for a particular user.\nThe `session_id` is of the form `session_sequence_number - dbt_visitor_id`.\n\n* `dbt_track_facts`\n\nThis table has columns - `anonymous_id`, `timestamp`, `event_id`, `event`, `session_id`, `dbt_visitor_id`, \n`track_sequence_number`. \n\nIn this table, the information from `dbt_session_tracks` is tied back to the records in the `dbt_mapped_tracks` table.\nEach event is now tied to a `session_id` and, within that session, is assigned a `track_sequence_number`.\n\n* `dbt_session_track_facts`\n\nThe columns in this table are - `session_id`, `ended_at`, `num_pvs`, `cnt_viewed_product`, `cnt_signup`.\n`num_pvs` captures the number of distinct events in that session. `cnt_viewed_product` captures the total number of times 'view_product' events were triggered. This is only for illustrative purposes; developers might want to monitor statistics \nfor a different event type.\n\n* `dbt_user_session_facts`\n\nThe columns in this table are `dbt_visitor_id`, `first_date`, `last_date`, `number_of_sessions`. This table captures the \ntime period for which the user has been active and the number of sessions they have created in that time.\n\n* `dbt_session_duration`\n\nThis table captures the duration of each session of a user; the associated columns are `dbt_visitor_id`, `session_id` and \n`session_duration`.\n\n* `dbt_tracks_flow`\n\nThe columns in this table are - `event_id`, `session_id`, `track_sequence_number`, `event`, `dbt_visitor_id`, `timestamp`,\n`event_2`, `event_3`, `event_4`, `event_5`. This is essentially a table where each record holds an event and the 4 events \nthat follow it. 
\n\n**Note**: A sample analysis query is provided at `analysis/dbt_top_users_by_avg_session_duration.sql`\n\n# What is RudderStack?\n\n[RudderStack](https://rudderstack.com/) is a **customer data pipeline** tool for collecting, routing and processing data from your websites, apps, cloud tools, and data warehouse.\n\nMore information on RudderStack can be found [here](https://github.com/rudderlabs/rudder-server).\n\n## Contact Us\n\nIf you come across any issues while configuring or using this project, please feel free to start a conversation on our [Slack](https://resources.rudderstack.com/join-rudderstack-slack) channel. We will be happy to help you.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frudderlabs%2Fdbt-sessionization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frudderlabs%2Fdbt-sessionization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frudderlabs%2Fdbt-sessionization/lists"}