{"id":19914711,"url":"https://github.com/fullstorydev/dbt_fullstory","last_synced_at":"2026-01-22T06:32:09.792Z","repository":{"id":190494759,"uuid":"678887512","full_name":"fullstorydev/dbt_fullstory","owner":"fullstorydev","description":"The official FullStory dbt package","archived":false,"fork":false,"pushed_at":"2024-08-07T20:59:06.000Z","size":92,"stargazers_count":1,"open_issues_count":0,"forks_count":2,"subscribers_count":17,"default_branch":"main","last_synced_at":"2024-08-07T23:56:13.313Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fullstorydev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-15T16:03:20.000Z","updated_at":"2024-08-07T20:59:09.000Z","dependencies_parsed_at":null,"dependency_job_id":"c0f579d9-96dc-4a33-b3bb-f4f0ef627df9","html_url":"https://github.com/fullstorydev/dbt_fullstory","commit_stats":null,"previous_names":["fullstorydev/dbt_fullstory"],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fdbt_fullstory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fdbt_fullstory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fdbt_fullstory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fdbt_fullstory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fu
llstorydev","download_url":"https://codeload.github.com/fullstorydev/dbt_fullstory/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224354150,"owners_count":17297401,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T21:36:50.948Z","updated_at":"2026-01-22T06:32:09.786Z","avatar_url":"https://github.com/fullstorydev.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# The Official Fullstory dbt Package for Data Destinations\n\n\u003e ⚠️ **DEPRECATION NOTICE**: This repository is deprecated and no longer maintained. No further updates, bug fixes, or security patches will be provided.\n\u003e\n\u003e **Fullstory now provides this functionality natively through [Ready to Analyze Views](https://developer.fullstory.com/anywhere/warehouse/ready-to-analyze-views/).** If you are an existing customer syncing raw data, please contact your Fullstory representative to have it enabled. If you created your warehouse connection after 8/1/2025, your newly connected warehouse will be opted-in automatically.\n\nThis dbt package contains models, macros, seeds, and tests for [Fullstory](https://www.fullstory.com/)'s [Data Destinations](https://help.fullstory.com/hc/en-us/articles/6295300682903-Data-Destinations) add-on.\n\n## Models\n\n| model | description |\n| --- | --- |\n| anonymous_users | All users that have not been identified. |\n| devices | All events with their device information parsed. 
|\n| events | All events |\n| identified_users | All users who have been identified. |\n| identities | All identify events. |\n| sessions | Session-level aggregations, including event counts broken down by type, location and device information, duration, Fullstory session replay links, etc. |\n| users | User-level aggregations, including email addresses, location and device information, session counts, etc. |\n\n## Vars\n\n| var | description |\n| --- | --- |\n| fullstory_events_database | The database where your Fullstory events table lives. |\n| fullstory_events_schema | The schema inside of your database where your Fullstory events table lives. |\n| fullstory_events_table | The name of the table inside your schema where your Fullstory events table lives. |\n| fullstory_replay_host | The hostname to use when building links to session replay. |\n| fullstory_sessions_model_name | The name of the model for the canonical list of sessions. |\n| fullstory_anonymous_users_model_name | The customized name of the `anonymous_users` model. |\n| fullstory_devices_model_name | The customized name of the `devices` model. |\n| fullstory_events_model_name | The customized name of the `events` model. |\n| fullstory_identified_users_model_name | The customized name of the `identified_users` model. |\n| fullstory_identities_model_name | The customized name of the `identities` model. |\n| fullstory_sessions_model_name | The customized name of the `sessions` model. |\n| fullstory_skip_json_parse | Whether or not to skip JSON parsing when processing the data, default False. |\n| fullstory_users_model_name | The customized name of the `users` model. |\n| fullstory_min_event_time | All events before this date will not be considered for analysis. Use this option to limit table size. |\n| fullstory_event_types | A list of event types to auto-generate rollups for in the `users` and `sessions` model. 
|\n| **fullstory_enable_safe_json_parsing** | **NEW in v0.10.0** - Enables safe JSON parsing with error handling. When `true` (default), uses `SAFE.PARSE_JSON` in BigQuery to handle malformed JSON gracefully. Set to `false` for strict parsing. |\n| **fullstory_enable_data_tests** | **NEW in v0.10.0** - Controls whether dbt data tests are enabled. When `true`, runs comprehensive data quality tests on all models. Default is `false` for performance. |\n| **fullstory_test_store_failures** | **NEW in v0.10.0** - When data tests are enabled, controls whether test failures are stored in your warehouse for analysis. Default is `false`. Only applicable when `fullstory_enable_data_tests` is `true`. |\n\n\u003e We **highly recommend** using `fullstory_events_database`, `fullstory_events_schema` and `fullstory_events_table` to indicate the location of the Fullstory events table that is synced from Data Destinations. Using these variables allows you to keep the Fullstory events table in a different database or schema from the one your dbt package uses.\n\n### Example use of vars for BigQuery\n\n```yaml\nvars:\n  fullstory_events_database: my-gcp-project\n  fullstory_events_schema: my-big-query-dataset\n  fullstory_events_table: fullstory_events_[my-org-id]\n```\n\n### Example use of vars for Snowflake\n\n```yaml\nvars:\n  fullstory_events_database: my_database\n  fullstory_events_schema: my_schema\n  fullstory_events_table: my_table\n```\n\n### Example use of vars for Redshift or Redshift Serverless\n\n```yaml\nvars:\n  fullstory_events_database: my_database\n  fullstory_events_schema: my_schema\n  fullstory_events_table: my_table\n```\n\n## Configuration Variables (New in v0.10.0)\n\n### Data Quality and Error Handling\n\nThe dbt_fullstory package now includes enhanced configuration options for better data quality management and error handling:\n\n#### Safe JSON Parsing\n\n```yaml\nvars:\n  fullstory_enable_safe_json_parsing: true  # Default: true\n```\n\nControls how JSON data is parsed 
from Fullstory events:\n\n- **`true` (recommended)**: Uses safe JSON parsing functions (e.g., `SAFE.PARSE_JSON` in BigQuery) that handle malformed JSON gracefully\n- **`false`**: Uses strict JSON parsing that will fail on invalid JSON data\n\nThis is particularly useful when dealing with:\n\n- Malformed JSON in event or source properties\n- Edge cases in data synchronization\n- Development environments with test data\n\n#### Data Testing Configuration\n\n```yaml\nvars:\n  fullstory_enable_data_tests: false        # Default: false\n  fullstory_test_store_failures: false     # Default: false\n```\n\n**Enable Data Tests:**\n\n- **`fullstory_enable_data_tests: true`**: Activates comprehensive data quality tests across all models\n- **`fullstory_enable_data_tests: false`**: Disables tests for faster builds (default for performance)\n\n**Store Test Failures:**\n\n- **`fullstory_test_store_failures: true`**: Saves failed test results in your warehouse for analysis\n- **`fullstory_test_store_failures: false`**: Does not store test failure details (default)\n\n#### Example Complete Configuration\n\n```yaml\nvars:\n  # Connection settings\n  fullstory_events_database: my-project\n  fullstory_events_schema: fullstory_data\n  fullstory_events_table: fullstory_events_o_123_na1\n  \n  # Data quality settings (NEW in v0.10.0)\n  fullstory_enable_safe_json_parsing: true\n  fullstory_enable_data_tests: true\n  fullstory_test_store_failures: false\n  \n  # Performance settings\n  fullstory_incremental_interval_hours: 48  # Look back 2 days for incremental updates\n```\n\n\u003e **💡 Tip**: For production environments, we recommend keeping `fullstory_enable_safe_json_parsing: true` and enabling data tests during initial setup to validate data quality, then disabling them for regular runs to improve performance.\n\n## Supported Warehouses\n\n- BigQuery\n- Snowflake\n- Redshift\n- Redshift Serverless\n\n### Example Profile Configurations\n\n### BigQuery\n\n```yaml\ndbt_fullstory:\n  
target: prod\n  outputs:\n    prod:\n      type: bigquery\n      method: oauth\n      project: my-gcp-project\n      dataset: my_dataset\n      threads: 1\n```\n\n### Snowflake\n\n```yaml\ndbt_fullstory:\n  target: prod\n  outputs:\n    prod:\n      type: snowflake\n      account: xy12345.us-east-1.aws\n      user: my_admin_user\n      password: ********\n      role: my_admin_role\n      database: fullstory\n      warehouse: compute_wh\n      schema: my_schema\n      threads: 1\n      client_session_keep_alive: False\n      query_tag: [fullstory_dbt]\n```\n\n#### Redshift\n\n```yaml\ndbt_fullstory:\n  target: prod\n  outputs:\n    prod:\n      type: redshift\n      cluster_id: my-cluster-id\n      method: iam\n      host: my-cluster-id.12345678910.us-east-1.redshift.amazonaws.com\n      port: 5439\n      user: admin\n      iam_profile: my-aws-profile\n      dbname: dev\n      schema: dbt\n      region: us-east-1\n      threads: 8\n      connect_timeout: 30\n```\n\n#### Redshift Serverless\n\n```yaml\ndbt_fullstory:\n  target: prod\n  outputs:\n    prod:\n      type: redshift\n      cluster_id: my-namespace-id\n      method: iam\n      host: my-workgroup.12345678910.us-east-1.redshift-serverless.amazonaws.com\n      port: 5439\n      user: serverlessuser\n      iam_profile: my-aws-profile\n      dbname: dev\n      schema: dbt\n      region: us-east-1\n      threads: 8\n      connect_timeout: 30\n```\n\n## Installation\n\nGeneral information about dbt packages can be found [in the dbt documentation](https://docs.getdbt.com/docs/build/packages).\n\n### Requirements\n\n- dbt version \u003e= 1.6.0\n- Fullstory Data Destination events table\n  - In BigQuery, this table will be named `fullstory_events_o_123_na1` where `o-123-na1` is your org id.\n    - Your org ID can be found in the URL when logged into fullstory.\n\n    ```text\n    app.fullstory.com/ui/\u003cyour-org-id\u003e/...\n    ```\n\n  - In Snowflake, this table will be named `events`.\n  - The events table 
will be created the first time that Fullstory syncs event data to your warehouse.\n\n### Adding to an Existing Project\n\nAdd the following to your `packages.yml` file:\n\n```yaml\n  - package: fullstorydev/dbt_fullstory\n    revision: 0.11.0\n```\n\nThen, run `dbt deps` to install the package. We highly recommend pinning to a specific release. Pinning your version helps prevent unintended changes to your warehouse.\n\nTo load the seed tables, which contain reference information about common types, run:\n\n```sh\ndbt seed\n```\n\n## Customizing model materialization\n\n### Materializing models as a table\n\nYou can configure your project to materialize any model from this package as a *table*. All you need to do is add a configuration block for the `dbt_fullstory` project under the `models` key in your `dbt_project.yml`:\n\n#### Configuring Individual Model as Table\n\n```yaml\n# Configuring models\n# Full documentation: https://docs.getdbt.com/docs/configuring-models\nmodels:\n  ...\n\n  dbt_fullstory: # The package name you are customizing\n    anonymous_users: # The model name\n      +materialized: table\n      # The following optional options are BigQuery-specific optimizations. For specific configuration options for your warehouse see: https://docs.getdbt.com/reference/model-configs#warehouse-specific-configurations\n      +partition_by: # Optional Config\n        field: event_time\n        data_type: timestamp\n        granularity: day\n    # .. more models\n```\n\n### Incremental modeling\n\ndbt provides a powerful mechanism for improving the performance of your models and reducing query costs: [incremental models](https://docs.getdbt.com/docs/build/incremental-models). 
An incremental model only processes new or updated records since the last run, thereby saving significant processing power and time.\n\n\u003e If your organization generates a very large number of events or grows quickly, the run time and cost of each `dbt build` will eventually exceed what is acceptable.\n\nFor most customers, `sessions` will be the most taxing to your data warehouse, and we recommend you start incremental loading there.\n\n#### Getting started with incremental models\n\n\u003e The following models have the option of being materialized as a table incrementally: `devices`, `display_names`, `identified_users`, `sessions`.\n\nYou can configure your project to load any of the aforementioned models from this package incrementally. All you need to do is add a configuration block for the `dbt_fullstory` project under the `models` key in your `dbt_project.yml`:\n\n```yaml\n# Configuring models\n# Full documentation: https://docs.getdbt.com/docs/configuring-models\nmodels:\n  ...\n\n  dbt_fullstory: # The package name you are customizing\n    sessions: # The model name\n      +materialized: incremental\n      # The following optional options are BigQuery-specific optimizations. For specific configuration options for your warehouse see: https://docs.getdbt.com/reference/model-configs#warehouse-specific-configurations\n      +partition_by: # Optional Config\n        field: event_time\n        data_type: timestamp\n        granularity: day\n    devices: # The model name\n      +materialized: incremental\n    # .. more models\n```\n\nWhen loading data incrementally, dbt needs to know how far back to look in the current table for data to compare to the incoming data. By default, the package looks back 2 days for data to update. This interval can be configured with the variable `fullstory_incremental_interval_hours`.\n\nTwo days was chosen because we typically drop late-arriving events after 24 hours. 
To understand why an event may arrive late, please check out [this article on swan songs](https://help.fullstory.com/hc/en-us/articles/360048109714-Swan-songs-How-Fullstory-captures-sessions-that-end-unexpectedly#:~:text=If%20the%20user%20navigates%20away,Fullstory%20before%20the%20page%20closes.).\n\nThis incremental interval is important; it can limit the cost of a query by greatly reducing the amount of work that needs to be done in order to add new data. Ultimately, this setting will be specific to your needs; we recommend starting with the default and updating once you understand the trends of your data set.\n\n### Considerations\n\n- **Use incrementally-loaded models judiciously:** While incremental loading does improve performance and cut costs, it adds some complexity to managing your dbt project. Ensure you need the trade-off before implementing it.\n\n- **Aggregation challenges:** Aggregations in incrementally-loaded models can be challenging and unreliable. When performing aggregations (such as count, sum, average), best practice is to refresh the complete model to include all data in the aggregation. Incrementally updating aggregated data can yield incorrect results because of missing or partially updated data.\n\nThink about whether using date-partitioned tables, continuous rollups (using window functions), or occasionally running full-refreshes might serve your use case better.\n\nRemember, fine-tuning model performance and costs is a balancing act. Incremental models may not suit all scenarios, but when managed correctly, they can be incredibly powerful. Start with the `sessions` model, measure the benefits, and then make other models incremental as necessary. Happy modeling!\n\n### Other models\n\nAlthough we often find the incrementalization of the `sessions` model to be sufficient, you can customize the materialization method of any model in this package. 
Enabling additional incrementalization can be done in the same way as the `sessions` table, simply add a configuration block to your `dbt_project.yml`.\n\n## Running Integration Tests\n\nThe `integration_tests` directory is a dbt project itself that depends on `dbt_fullstory`. We use this package to test how our models will execute in the real world as it simulates a live environment and is used in CI to hit actual databases. If you wish, you can run these tests locally. All you need is a target configured in your `profiles.yml` that is authenticated to a supported warehouse type.\n\n\u003e Internally, we name our profiles after the type of warehouse we are connecting (e.g. `bigquery`, `snowflake`, etc.). It makes the command more clear, like: `dbt run --target bigquery`.\n\nTo create the test data in your database:\n\n```sh\ndbt seed --target my-target\n```\n\nTo run the shim for your warehouse:\n\n\u003e The shim will emulate how data is synced for your particular warehouse. As an example, data is loaded in JSON columns in Snowflake but as strings in BigQuery. You can choose from:\n\u003e\n\u003e - bigquery_events_shim\n\u003e - snowflake_events_shim\n\n```sh\ndbt run --target my-target --select \u003cmy-warehouse\u003e_events_shim\n```\n\nTo run the models:\n\n```sh\ndbt run --target my-target\n```\n\nTo run the tests:\n\n```sh\ndbt test --target my-target\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffullstorydev%2Fdbt_fullstory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffullstorydev%2Fdbt_fullstory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffullstorydev%2Fdbt_fullstory/lists"}