Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/slidoapp/dbt-superset-lineage
Make dbt docs and Apache Superset talk to one another
- Host: GitHub
- URL: https://github.com/slidoapp/dbt-superset-lineage
- Owner: slidoapp
- License: mit
- Created: 2021-12-03T08:55:32.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-18T08:31:02.000Z (about 2 months ago)
- Last Synced: 2024-04-22T00:14:06.414Z (about 2 months ago)
- Topics: cli, data-lineage, dbt, lineage, superset, tool
- Language: Python
- Homepage:
- Size: 1.84 MB
- Stars: 128
- Watchers: 4
- Forks: 14
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Lists
- awesome-dbt - dbt-superset-lineage - Push and pull metadata between dbt and Superset. (Packages)
README
# dbt-superset-lineage
![GitHub last commit](https://img.shields.io/github/last-commit/slidoapp/dbt-superset-lineage)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/dbt-superset-lineage)
![PyPI - Format](https://img.shields.io/pypi/format/dbt-superset-lineage)

![dbt-superset-lineage](assets/lineage_white.png)
_Make [dbt](https://github.com/dbt-labs/dbt) docs and [Apache Superset](https://github.com/apache/superset) talk to one another_
## Why do I need something like this?
Odds are rather high that you use dbt together with a visualisation tool. If so, these questions might have popped
into your head from time to time:
- "Could I get rid of this model? Is it used in any dashboards? If so, which ones?"
- "It would be so handy to see all these well-maintained model and column descriptions when exploring and creating charts."

If your visualisation tool of choice is Superset, you are in luck!
Using `dbt-superset-lineage`, you can:
- Add dependencies of Superset dashboards to your dbt sources and models
- Sync model and column descriptions from dbt docs to Superset

This will help you:
- Avoid broken dashboards because of deprecated or changed models
- Choose the right attributes without navigating back and forth between chart and documentation

## Demo
The package was presented during [Coalesce](https://coalesce.getdbt.com/), the annual dbt conference, as a part of the talk
[_From 100 spreadsheets to 100 data analysts: the story of dbt at Slido_](https://www.getdbt.com/coalesce-2021/from-spreadsheets-to-data-analysts-the-story-of-dbt-at-slido/).
Watch a demo in the video below.

[![Demo video](assets/demo.png)](https://youtu.be/YA0yqYSs9BQ?t=1240)
## Installation
```
pip install dbt-superset-lineage
```

## Usage
`dbt-superset-lineage` comes with two basic commands: `pull-dashboards` and `push-descriptions`.
The documentation for the individual commands can be shown by using the `--help` option.

It includes a wrapper for the [Superset API](https://superset.apache.org/docs/rest-api); you only need to provide
`SUPERSET_ACCESS_TOKEN`/`SUPERSET_REFRESH_TOKEN` (obtained via `/security/login`)
as environment variables or through the `--superset-access-token`/`--superset-refresh-token` options.

**N.B.**
- Make sure to run `dbt compile` (or `dbt run`) against the production profile, not your development profile
- If multiple databases are used within dbt and/or Superset and there are duplicate names (`schema + table`) across
them, specify the database through the `--dbt-db-name` and/or `--superset-db-id` options
- Currently, `PUT` requests are only supported if CSRF tokens are disabled in Superset (`WTF_CSRF_ENABLED=False`).
- Tested on dbt v1.4.5 and Apache Superset v2.0.1. Other versions might fail because of differences in the underlying code and API.

### Pull dashboards
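Both commands need a Superset token obtained via `/security/login`. As a rough sketch of that step, the snippet below builds and sends the login request with the Python standard library. It is not the package's own wrapper: the helper names are hypothetical, and the `/api/v1` prefix and the `provider: "db"` value are assumptions based on a default Superset setup with database authentication.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Build (but do not send) a POST request to Superset's login endpoint."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",  # assumption: database authentication is used
        "refresh": True,   # also ask for a refresh token
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_tokens(base_url, username, password):
    """Send the login request and return (access_token, refresh_token)."""
    request = build_login_request(base_url, username, password)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["access_token"], body.get("refresh_token")
```

The returned access token is what would be exported as `SUPERSET_ACCESS_TOKEN` before running either command.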
Pull dashboards from Superset and add them as
[exposures](https://docs.getdbt.com/docs/building-a-dbt-project/exposures/) to dbt docs with
references to dbt sources and models, making them visible both separately and as dependencies.

**N.B.**
- Only published dashboards are extracted.

```console
$ cd jaffle_shop
$ dbt compile # Compile project to create manifest.json
$ export SUPERSET_ACCESS_TOKEN=
$ dbt-superset-lineage pull-dashboards https://mysuperset.mycompany.com # Pull dashboards from Superset to /models/exposures/superset_dashboards.yml
$ dbt docs generate # Generate dbt docs
$ dbt docs serve # Serve dbt docs
```

![Separate exposure in dbt docs](assets/exposures_1.png)
![Referenced exposure in dbt docs](assets/exposures_2.png)
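To give a feel for what a dashboard exposure looks like, here is a minimal sketch that assembles one entry following the dbt exposure schema (`name`, `type`, `url`, `depends_on`, `owner`). The helper `build_exposure`, the dashboard name, its URL path, and the model and source names are all hypothetical, and the file the tool actually writes may contain different or additional fields. The result is printed as JSON only for inspection; dbt itself expects YAML.

```python
import json


def build_exposure(name, url, owner_name, models=(), sources=()):
    """Return one dbt exposure entry describing a dashboard."""
    # References are plain strings in the form dbt uses inside exposures.
    depends_on = [f"ref('{model}')" for model in models]
    depends_on += [f"source('{src}', '{table}')" for src, table in sources]
    return {
        "name": name,
        "type": "dashboard",
        "url": url,
        "depends_on": depends_on,
        "owner": {"name": owner_name},
    }


exposures_file = {
    "version": 2,
    "exposures": [
        build_exposure(
            name="customer_overview",  # hypothetical dashboard
            url="https://mysuperset.mycompany.com/superset/dashboard/1/",
            owner_name="Data team",
            models=["customers", "orders"],
            sources=[("jaffle_shop", "raw_payments")],
        )
    ],
}

print(json.dumps(exposures_file, indent=2))
```

Because each dashboard lists its upstream models in `depends_on`, dbt docs can show it both as a separate exposure and as a downstream dependency of those models.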
### Push descriptions
Push model and column descriptions from your dbt docs to Superset as plain text so that they can be viewed
in Superset when creating charts.

**N.B.**
- Run carefully as this rewrites your datasets using merged metadata from Superset and dbt docs.
- Running with `--superset-refresh-columns` overrides `columns.filterable` and `columns.groupby` to `true`,
because of [this issue](https://github.com/apache/superset/issues/24136).
- Descriptions are rendered as plain text, so Markdown syntax, including links, will not be rendered.
- Avoid special characters and strings in your dbt docs, e.g. `→` or ``.

```console
$ cd jaffle_shop
$ dbt compile # Compile project to create manifest.json
$ export SUPERSET_ACCESS_TOKEN=
$ dbt-superset-lineage push-descriptions https://mysuperset.mycompany.com # Push descriptions from dbt docs to Superset
```
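The descriptions being pushed originate in the `manifest.json` that `dbt compile` creates. The sketch below shows one way model and column descriptions could be collected from such a manifest and keyed by `schema.table`. It is an illustration rather than the package's implementation: `collect_descriptions` is a hypothetical helper, the sample manifest is made up, and using `alias` or `identifier` as the table name is an assumption about how tables are matched.

```python
def collect_descriptions(manifest):
    """Map 'schema.table' to model and column descriptions from a manifest dict."""
    result = {}
    for section in ("nodes", "sources"):
        for node in manifest.get(section, {}).values():
            # Skip tests, seeds, snapshots and other resource types.
            if node.get("resource_type") not in ("model", "source"):
                continue
            # Assumption: models may set an alias, sources an identifier.
            table = node.get("alias") or node.get("identifier") or node["name"]
            result[f"{node['schema']}.{table}"] = {
                "description": node.get("description", ""),
                "columns": {
                    name: column.get("description", "")
                    for name, column in node.get("columns", {}).items()
                },
            }
    return result


# Made-up stand-in for the parsed contents of target/manifest.json.
sample_manifest = {
    "nodes": {
        "model.jaffle_shop.customers": {
            "resource_type": "model",
            "schema": "analytics",
            "name": "customers",
            "description": "One row per customer.",
            "columns": {
                "customer_id": {"name": "customer_id", "description": "Primary key."},
            },
        },
        "test.jaffle_shop.not_null_customer_id": {
            "resource_type": "test",
            "schema": "analytics",
            "name": "not_null_customer_id",
        },
    },
    "sources": {},
}

descriptions = collect_descriptions(sample_manifest)
print(descriptions)
```

In a real project the dictionary would come from `json.load` on the compiled manifest, which is why `dbt compile` should run against the production profile first.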
![Column descriptions in Superset](assets/descriptions.png)

## License
Licensed under the MIT license (see [LICENSE.md](LICENSE.md) file for more details).