https://github.com/dataform-co/dataform-segment
- Host: GitHub
- URL: https://github.com/dataform-co/dataform-segment
- Owner: dataform-co
- License: mit
- Created: 2019-12-20T11:44:27.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-04-11T07:01:18.000Z (about 2 years ago)
- Last Synced: 2025-04-07T18:11:10.980Z (about 1 year ago)
- Topics: hacktoberfest
- Language: JavaScript
- Homepage:
- Size: 95.7 KB
- Stars: 4
- Watchers: 5
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Common data models for Segment data, such as `sessions` and a user rollup table built from `identifies`.
## Supported warehouses
- BigQuery
- Redshift (Beta)
- Postgres (Beta)
*If you would like us to add support for another warehouse, please get in touch via [email](mailto:team@dataform.co) or [Slack](https://dataform.co/slack)*
## Installation
Add the package to the `package.json` file in your Dataform project. You can find the most up-to-date package version on the [releases page](https://github.com/dataform-co/dataform-segment/releases).
## Configure the package
Create a new JS file in your `definitions/` folder and create the segment tables with the following example:
```js
const segment = require("dataform-segment");

segment({
  // The name of your segment schema.
  segmentSchema: "javascript",
  // The timeout for splitting sessions in milliseconds.
  sessionTimeoutMillis: 30 * 60 * 1000,
  // Default configuration applied to all produced datasets.
  defaultConfig: {
    schema: "dataform_segment",
    tags: ["segment"],
    type: "view"
  },
  // list of custom fields to extract from the pages table
  customPageFields: ["url_hash", "category"],
  // list of custom fields to extract from the identifies table
  customUserFields: ["email", "name", "company_name", "created_at"],
  // list of custom fields to extract from the tracks table
  customerTrackFields: ["browser_type"],
  // choose which of tracks, pages and screens to include in the sessionization model
  includeTracks: true,
  includePages: true,
  includeScreens: false,
});
```
For more advanced use cases, see [example.js](https://github.com/dataform-co/dataform-segment/blob/master/definitions/example.js).
## Data models
The primary outputs of this package are the following data models (configurable as tables or views).
### `segment_sessions`
Contains a combined view of tracks, pages and screens from Segment. Each session is a period of sustained activity, with a new session starting after a period of inactivity of 30 minutes or more (the `sessionTimeoutMillis` value used in the example above). Each session contains a repeated field of records which are either tracks or pages. Common fields are extracted to the top level, and type-specific fields are kept within two structs: `records.track` and `records.page`.
- To create a web-only sessions table, use `includeTracks: true, includePages: true, includeScreens: false`
- To create an app-only sessions table, use `includeTracks: true, includePages: false, includeScreens: true`
- To create a cross-platform sessions table, use `includeTracks: true, includePages: true, includeScreens: true`
_At least one of `includeTracks`, `includePages`, or `includeScreens` must be set to `true`._
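
The inactivity-based splitting described for `segment_sessions` can be illustrated with a small standalone sketch. This is not the package's implementation (the package builds its models in the warehouse); the function name `splitIntoSessions` and its input shape are hypothetical and exist only to show how a timeout turns a stream of event timestamps into sessions.

```javascript
// Illustrative only: a plain-JavaScript sketch of timeout-based sessionization.
// `splitIntoSessions` is a hypothetical helper, not part of dataform-segment.
const SESSION_TIMEOUT_MILLIS = 30 * 60 * 1000;

// Splits one user's event timestamps (epoch milliseconds) into sessions.
function splitIntoSessions(timestamps, timeoutMillis = SESSION_TIMEOUT_MILLIS) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...timestamps].sort((a, b) => a - b);
  const sessions = [];
  for (const ts of sorted) {
    const current = sessions[sessions.length - 1];
    // Start a new session on the first event, or when the gap since the
    // previous event reaches the timeout.
    if (!current || ts - current[current.length - 1] >= timeoutMillis) {
      sessions.push([ts]);
    } else {
      current.push(ts);
    }
  }
  return sessions;
}
```

For example, events at minutes 0, 10, 45 and 50 would fall into two sessions under a 30-minute timeout, because the 35-minute gap between the second and third events exceeds it.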
### `segment_users`
Aggregates all identify calls into a table with one row per `user_id`. Identify calls with only an `anonymous_id` are mapped to the matching `user_id` where possible.
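
As a rough illustration of the rollup described above, the following sketch resolves `anonymous_id` values to a `user_id` and then collapses identify calls into one row per user. It is not the package's actual logic: the function name `rollUpUsers`, the `traits` and `timestamp` field names, and the rule that the most recent non-null value wins are all assumptions made for the example.

```javascript
// Illustrative only: a hypothetical in-memory version of the user rollup.
// Assumes each identify call looks like:
//   { anonymous_id, user_id, timestamp, traits: { ... } }
function rollUpUsers(identifies) {
  // Learn anonymous_id -> user_id from calls that carry both ids.
  const anonToUser = new Map();
  for (const call of identifies) {
    if (call.anonymous_id && call.user_id) {
      anonToUser.set(call.anonymous_id, call.user_id);
    }
  }

  const users = new Map();
  const ordered = [...identifies].sort((a, b) => a.timestamp - b.timestamp);
  for (const call of ordered) {
    const userId = call.user_id || anonToUser.get(call.anonymous_id);
    if (!userId) continue; // no matching user_id, so the call is skipped
    const row = users.get(userId) || { user_id: userId };
    for (const [key, value] of Object.entries(call.traits || {})) {
      // Assumption: the latest non-null value for each trait is kept.
      if (value != null) row[key] = value;
    }
    users.set(userId, row);
  }
  return [...users.values()];
}
```

In this sketch, a call made before the user was identified is still attributed to that user as long as a later call links the same `anonymous_id` to a `user_id`.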
