https://github.com/snowflake-labs/openlineage-accesshistory-setup
Guideline to extract table lineage info in OpenLineage format from access history view
- Host: GitHub
- URL: https://github.com/snowflake-labs/openlineage-accesshistory-setup
- Owner: Snowflake-Labs
- License: apache-2.0
- Created: 2022-01-21T00:09:43.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-05-11T08:01:14.000Z (about 3 years ago)
- Last Synced: 2024-05-01T13:33:10.461Z (about 2 years ago)
- Size: 924 KB
- Stars: 10
- Watchers: 8
- Forks: 5
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# OpenLineage Adapter
## Overview
A guide to extracting lineage information in [OpenLineage](https://github.com/OpenLineage/OpenLineage) format from the Snowflake [ACCESS_HISTORY](https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html) view.
## Code Deployment
### OPENLINEAGE_ACCESS_HISTORY View
#### View Definition
[open_lineage_access_history.sql](https://github.com/Snowflake-Labs/OpenLineage-AccessHistory-Setup/blob/main/open_lineage_access_history.sql) is the script that creates the view from [ACCESS_HISTORY](https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html) and [QUERY_HISTORY](https://docs.snowflake.com/en/sql-reference/account-usage/query_history.html).
The view outputs each query that accesses tables in the account, formatted according to the OpenLineage [JsonSchema](https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.json) specification.
* The view shows only queries that have a non-empty value in the `query_tag` column of the [query_history](https://docs.snowflake.com/en/sql-reference/account-usage/query_history.html) view.
* The `namespace` of each record is in the format of `snowflake://-`
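
Because untagged queries are filtered out, a session needs a query tag set before it runs the statements whose lineage should appear in the view. The following is a minimal sketch using Snowflake's `QUERY_TAG` session parameter; the tag value and the table names `raw_orders` and `orders_clean` are hypothetical and not part of this repository.

```sql
-- Tag every subsequent query in this session (the tag value is a made-up example).
ALTER SESSION SET QUERY_TAG = 'nightly_orders_load';

-- Hypothetical tables: this statement reads from raw_orders and writes to orders_clean,
-- so it is the kind of query the view is meant to report.
INSERT INTO orders_clean
SELECT * FROM raw_orders;

-- Clear the tag so later queries in the session are not attributed to this job.
ALTER SESSION UNSET QUERY_TAG;
```

Any non-empty tag should satisfy the filter described above; a tag that identifies the job or pipeline simply makes the resulting records easier to group.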
#### Prerequisite
Set the session variable `current_organization` to your account's organization name before creating the view and before running any query against it.
###### Example
`set current_organization='my_org';`
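
Session variables last only for the current session, so the `set` statement has to be repeated in each new session that queries the view. As a rough sketch of what a session might look like after the variable is set, assuming the view was created under the name `OPENLINEAGE_ACCESS_HISTORY` in the current database and schema (adjust the qualified name if it lives elsewhere):

```sql
-- Confirm the variable is visible in this session.
SELECT $current_organization;

-- Inspect a few records from the view.
-- Assumption: the view is named OPENLINEAGE_ACCESS_HISTORY and is in the current schema.
SELECT *
FROM OPENLINEAGE_ACCESS_HISTORY
LIMIT 10;
```

`SELECT *` is used here because the exact output columns are defined by the creation script rather than by this section.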