{"id":19914731,"url":"https://github.com/fullstorydev/hauser","last_synced_at":"2025-05-03T05:31:47.018Z","repository":{"id":24365717,"uuid":"101294833","full_name":"fullstorydev/hauser","owner":"fullstorydev","description":"Service for moving your Fullstory export files to a data warehouse","archived":false,"fork":false,"pushed_at":"2024-04-09T17:42:07.000Z","size":48109,"stargazers_count":48,"open_issues_count":8,"forks_count":23,"subscribers_count":27,"default_branch":"master","last_synced_at":"2024-06-19T00:39:13.073Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fullstorydev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-08-24T12:55:29.000Z","updated_at":"2024-04-09T15:44:09.000Z","dependencies_parsed_at":"2023-01-14T00:50:29.139Z","dependency_job_id":"a5639c24-87db-473e-b39a-ab1f23483e1f","html_url":"https://github.com/fullstorydev/hauser","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fhauser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fhauser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fhauser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fullstorydev%2Fhauser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fullstorydev","download_url":"ht
tps://codeload.github.com/fullstorydev/hauser/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224354151,"owners_count":17297401,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T21:36:53.900Z","updated_at":"2024-11-12T21:36:54.127Z","avatar_url":"https://github.com/fullstorydev.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hauser\n\n[![CircleCI](https://circleci.com/gh/fullstorydev/hauser.svg?style=svg)](https://circleci.com/gh/fullstorydev/hauser)\n[![Go Report Card](https://goreportcard.com/badge/github.com/fullstorydev/hauser)](https://goreportcard.com/report/github.com/fullstorydev/hauser)\n\n`hauser` is a service to download Fullstory Data Export files and load them into storage.\nCurrently, Data Export files can be saved to local disk, S3, Redshift, GCS, and BigQuery.\n(Others are easy to add -- pull requests welcome.)\n\n`hauser` is designed to run continuously so that it can update your chosen data store as new data becomes available.\n VMs are a good option for running `hauser` continuously.\n\nSQL recipes for Data Export analysis are in the [Data Export Cookbook](https://github.com/fullstorydev/hauser/wiki).\n\u003cp\u003e\n  \u003cbr\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"414\" src=\"doc/img/fs-logo-2024-clr.png\" alt=\"fullstory logo\"/\u003e\n\u003c/p\u003e\n\n## Quick Start\n1. Download the latest [release binary](https://github.com/fullstorydev/hauser/releases)\n2. 
Download the included `example-config.toml` file and customize it for your environment,\n   including your Fullstory API key, warehouse host, and credentials. AWS credentials (for S3) come from your local environment.\n3. Assuming the binary and updated config are in the current directory, run:\n```bash\n./hauser -c myconfig.toml\n```\n\n### Important Configuration Fields\n\n#### `FsApiToken`\nYour Fullstory API Token. It can also be set through the `FULLSTORY_API_TOKEN` environment variable.\n\n#### `ExportDuration`\nDetermines the time range for each export, which ultimately determines the size of each exported file.\nThe size of this file will be based on the amount of traffic that your Fullstory account records within the\nspecified duration.\nThe default is 1 hour, but if a different file size is desirable, this can be modified to meet your specific needs. The\nmax duration is 24 hours.\n\n#### `ExportDelay`\nDetermines how long to wait before creating an export.\nThis delay is necessary because there is some latency between when an event is recorded and when it is available and complete.\n24 hours is the default, but is also fairly conservative. In many cases this can be reduced safely to 3 hours, but note that\nevents from \"[swan songs]\" may not be available.\n\n#### `StartTime`\nDetermines the datetime that should be used as a starting point for creating the exports.\nThis value is only used when starting with a fresh database/storage instance (i.e. `hauser` hasn't been used with the specified warehouse).\nIf you would like to export all the data that is currently within retention, set this to the date of the oldest possible\nsession start. For example, if your Fullstory account has 3 months of retention, and today is October 20th, 2020, set `StartTime`\nto `2020-07-20T00:00:00Z` to include the oldest data for your account.\n\n## How It Works\n`hauser` will use Fullstory's [segment export API] to create exports\nof the `everyone` segment. 
When the export has completed (see [Operations API](https://developer.fullstory.com/get-operation)),\n`hauser` will download the file, perform some light transformation for [custom user vars](http://help.fullstory.com/develop-js/setuservars?from_search=17717406),\nand load the data into the warehouse.\n\n`hauser` will continue this process of \"create export -\u003e download -\u003e upload\" until it reaches the most \"live\" data (i.e. \"Now - `ExportDelay` - `ExportDuration`\").\nAt this point, `hauser` will perform the process approximately every `ExportDuration`.\n\n`hauser` can safely be stopped and restarted.\nWhen using a database, it uses the `SyncTable` to keep track of which export files have been processed, and will restart from the last known sync point.\nFor a `StorageOnly` process, it will create a file called `.sync.hauser` that will be used as a checkpoint.\n\n### Amazon Web Services Notes\n_Currently, only S3 and Redshift are supported for this provider._\n\nTo use AWS, set the `Provider` config option to `aws`.\n\nEach export file is saved locally to the temp directory before it is moved to S3.\nIf not `StorageOnly`, the S3 copy is then loaded into Redshift through the `copy` command, and the S3 file is removed.\n\nDetails about Redshift configuration can be found in the [Redshift Guide](https://github.com/fullstorydev/hauser/blob/master/Redshift.md).\n\n### Google Cloud Notes\n_Currently, only GCS and BigQuery are supported for this provider._\n\nTo use Google Cloud, set the `Provider` config option to `gcp`.\n\nEach export file is saved locally to the temp directory before it is moved to GCS.\nIf not `StorageOnly`, the GCS file is then loaded into BigQuery through the gRPC client API equivalent of the `bq load` command,\nand the GCS file is removed.\n\nThe BigQuery `ExportTable` is expected to be a date-partitioned table.\nThe default values `ExportTable = \"fs_export\"` and `SyncTable = \"fs_sync\"` will work, but feel free to customize the 
`fs_sync` and `fs_export` names.\nIf the `SyncTable` and `ExportTable` do not already exist in BigQuery, they will be created.\n\n### Local Storage Notes\n\nTo only store downloaded export files locally, set the `Provider` option to `local`.\nThis will save exports to a local folder specified by `SaveDir`.\nIf `UseStartTime` is set to `true`, only exports since `StartTime` will be downloaded (as opposed to all available exports).\nExports can be saved in JSON format (by setting `SaveAsJson` to `true`) or in CSV format.\n\n## Table Schema Changes\n\nOn startup, `hauser` will ensure that the export table listed in the config contains columns for all export fields.\nIf `hauser` detects that columns for some fields don't exist, it will append columns for those fields to the export table.\nIt uses this schema information, which it acquires once on startup, to intelligently build CSV files and deal with schema alterations to the export table.\nIf schema changes are made, `hauser` will have to be restarted so it is aware of the updated export table schema.\n\nIf the export table contains columns that aren't part of the export, `hauser` will insert null values for those columns when it inserts new records.\nNote: In order for `hauser` to successfully insert records, any added columns must be nullable.\n\nIf Fullstory adds fields to the export, a new version of `hauser` will need to be downloaded to pick up the new fields.\nIf a backfill of the fields is desired, you can create a one-off export of just the new fields by using the [segment export API].\n\n## Working with Custom Vars\nFor convenience, any custom user vars in your data are stored in a JSON map in the `CustomVars` column. 
In Redshift, they can be easily accessed using the [`JSON_EXTRACT_PATH_TEXT`](http://docs.aws.amazon.com/redshift/latest/dg/JSON_EXTRACT_PATH_TEXT.html) function.\n\nFor example:\n```\nSELECT COUNT(*)\nFROM myexport\nWHERE JSON_EXTRACT_PATH_TEXT(CustomVars, 'acct_adminDisabled_bool') = 'false';\n```\n\n## Using hauser with Docker\nFor platforms that support Docker, you can download an image from [docker hub](https://hub.docker.com/r/fullstorydev/hauser) that lets you run `hauser`:\n\n```shell\n# Download image\ndocker pull fullstorydev/hauser:latest\n\n# Assumes that your config.toml is in the current directory\ndocker run --rm \\\n  -v $(pwd)/config.toml:/config.toml \\\n  fullstorydev/hauser:latest -c /config.toml\n```\n\nTo include Hauser in a custom Docker container, add the following to your Dockerfile.\nNote that the example below assumes the desired image is linux.\n\n```Dockerfile\nFROM fullstorydev/hauser:latest as builder\n\n# Custom docker config ...\nFROM alpine:latest\nCOPY --from=builder /bin/hauser /usr/bin/hauser\n\n# Entry point ...\n```\n\nOr you can download the release directly:\n```Dockerfile\nRUN curl -L \u003ehauser.tar.gz https://github.com/fullstorydev/hauser/releases/download/v${HAUSER_VERSION}/hauser_${HAUSER_VERSION}_linux_x86_64.tar.gz \\\n  \u0026\u0026 tar -xzvf hauser.tar.gz -C /usr/bin \\\n  \u0026\u0026 rm hauser.tar.gz\n```\n\nThe `${HAUSER_VERSION}` can be provided at build time with the [docker ARG command](https://docs.docker.com/engine/reference/builder/#arg).\nYou can find the latest version of Hauser on the [releases](https://github.com/fullstorydev/hauser/releases/latest) page.\nFor a more complete example of using Hauser with docker, see [this recipe](./recipes/multi-hauser/README.md).\n\n\n## Building from source\n* Make sure you have [installed](https://golang.org/doc/install) Go 1.11 or higher.\n* **OPTIONAL**: Set a custom [GOPATH](https://github.com/golang/go/wiki/SettingGOPATH).\n* Build it...\n    * To compile for 
use on your local machine: ``go get github.com/fullstorydev/hauser``\n    * To cross-compile for deployment on a VM: ``GOOS=\u003clinux\u003e GOARCH=\u003camd64\u003e go get github.com/fullstorydev/hauser``\n        - Type `go version` in the VM's command line to find its `GOOS` and `GOARCH` values.\n        - Example (Amazon EC2 Linux): `go1.11.5 linux/amd64` is `GOOS=linux GOARCH=amd64`\n        - The list of valid `GOOS` and `GOARCH` values can be found [here](https://golang.org/doc/install/source#environment).\n* Copy the included `example-config.toml` file and customize it for your environment, including your Fullstory API key, warehouse host, and credentials. AWS credentials (for S3) come from your local environment.\n* Run it...\n    * **NOTE**: `go get` downloads and installs the hauser package in your `GOPATH`, not the local directory in which you call the command.\n    * If you did _NOT_ set a custom `GOPATH`...\n        - Linux \u0026 Mac: `$HOME/go/bin/hauser -c \u003cyour updated config file\u003e`\n        - Windows: `%USERPROFILE%\\go\\bin\\hauser -c \u003cyour updated config file\u003e`\n    * If you _DID_ set a custom `GOPATH`...\n        - Linux \u0026 Mac: `$GOPATH/bin/hauser -c \u003cyour updated config file\u003e`\n        - Windows: `%GOPATH%\\bin\\hauser -c \u003cyour updated config file\u003e`\n\n## Developing\nEasily format your commits by adding a git pre-commit hook:\n```bash\nln -s ../../pre-commit.sh .git/hooks/pre-commit\n```\n\n[segment export API]: http://developer.fullstory.com/create-segment-export\n[swan songs]: https://help.fullstory.com/hc/en-us/articles/360048109714-Swan-songs-How-Fullstory-records-sessions-that-end-unexpectedly\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffullstorydev%2Fhauser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffullstorydev%2Fhauser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffullstorydev%2Fhauser/lists"}