https://github.com/iobruno/data-catalog-labs
DataCatalog to explore end-to-end lineage with Airbyte, Airflow, dbt, BigQuery
airbyte airflow datahub dbt metabase openlineage spark
- Host: GitHub
- URL: https://github.com/iobruno/data-catalog-labs
- Owner: iobruno
- Created: 2026-01-26T09:48:29.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2026-02-19T23:32:28.000Z (4 months ago)
- Last Synced: 2026-02-20T03:20:44.232Z (4 months ago)
- Topics: airbyte, airflow, datahub, dbt, metabase, openlineage, spark
- Language: Jupyter Notebook
- Homepage:
- Size: 595 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Catalog with DataHub
This project aims to provision end-to-end pipeline lineage across Airbyte, Airflow, dbt, and BigQuery, with DataHub as the data catalog and lineage platform. It also ensures that sibling entities are not duplicated (e.g., the Airbyte destination table for a given source resolves to the same entity as the corresponding dbt source table).
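To make the sibling idea concrete: DataHub identifies a dataset by a URN that combines a platform, a name, and an environment. The sketch below builds the URNs for a warehouse table and its dbt counterpart. The helper `dataset_urn` and the table name `my-project.raw.customers` are hypothetical, and the sketch assumes the Airbyte destination table is registered under the `bigquery` platform.

```shell
# Build a DataHub dataset URN of the form:
#   urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
# The environment defaults to PROD when omitted.
dataset_urn() {
  platform="$1"
  name="$2"
  env="${3:-PROD}"
  printf 'urn:li:dataset:(urn:li:dataPlatform:%s,%s,%s)\n' "$platform" "$name" "$env"
}

# Hypothetical fully qualified table name, for illustration only.
table="my-project.raw.customers"

dataset_urn bigquery "$table"   # warehouse entity (assumed Airbyte destination)
dataset_urn dbt "$table"        # dbt source entity for the same table
```

Both URNs share the same name and environment and differ only in platform, which is the pair DataHub can link as siblings instead of showing two unrelated tables.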
## Quick Start:
1. Spin up DataHub
```shell
docker compose -f datahub/compose.yaml up -d
```
2. Spin up Airflow
```shell
docker compose -f airflow/compose.yaml up --build --force-recreate -d
```
3. Spin up Airbyte with abctl
```shell
brew tap airbytehq/tap
brew install abctl
abctl local install
```
4. Fetch Airbyte credentials
```shell
abctl local credentials
```
5. Build the dbt-bigquery Docker Image
```shell
docker build -t dbt-bigquery:latest dbt/ --no-cache
```
6. Build the datahub-ingest Docker Image
```shell
docker build -t datahub-ingest:latest datahub/ --no-cache
```
7. Provision infrastructure with Terraform
Follow the instructions in [terraform](./terraform/) to run and apply the configuration
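Before running the steps above, it may help to confirm that the command-line tools they rely on are installed. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and the tool list assumes Homebrew is used to install `abctl` and that the `terraform` CLI is needed for step 7.

```shell
# Print each tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools used by the Quick Start steps (terraform is an assumption for step 7).
missing="$(missing_tools docker brew abctl terraform)"

if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on a single line.
  echo "Missing tools:" $missing
else
  echo "All Quick Start tools found"
fi
```

Note that `abctl` will be reported as missing until step 3 has been completed.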
## Reference Docs
Refer to each component's README for instructions on starting it individually:
- [DataHub](datahub/README.md)
- [Airflow](airflow/README.md)
- [Airbyte](airbyte/README.md)
- [dbt-bigquery](dbt/README.md)
- [Terraform](terraform/REA)