{"id":46060098,"url":"https://github.com/iobruno/data-catalog-labs","last_synced_at":"2026-03-01T11:13:22.416Z","repository":{"id":336588790,"uuid":"1142353164","full_name":"iobruno/data-catalog-labs","owner":"iobruno","description":"DataCatalog to explore end-to-end lineage with Airbyte, Airflow, dbt, BigQuery","archived":false,"fork":false,"pushed_at":"2026-02-19T23:32:28.000Z","size":609,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-02-20T03:20:44.232Z","etag":null,"topics":["airbyte","airflow","datahub","dbt","metabase","openlineage","spark"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iobruno.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-26T09:48:29.000Z","updated_at":"2026-02-19T23:32:32.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/iobruno/data-catalog-labs","commit_stats":null,"previous_names":["iobruno/data-catalog-labs"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/iobruno/data-catalog-labs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iobruno%2Fdata-catalog-labs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iobruno%2Fdata-catalog-labs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iobruno%2Fdata-ca
talog-labs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iobruno%2Fdata-catalog-labs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iobruno","download_url":"https://codeload.github.com/iobruno/data-catalog-labs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iobruno%2Fdata-catalog-labs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29967991,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T10:55:55.490Z","status":"ssl_error","status_checked_at":"2026-03-01T10:55:55.175Z","response_time":124,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airbyte","airflow","datahub","dbt","metabase","openlineage","spark"],"created_at":"2026-03-01T11:13:21.903Z","updated_at":"2026-03-01T11:13:22.410Z","avatar_url":"https://github.com/iobruno.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Catalog with 
DataHub\n\n[![Airbyte](https://img.shields.io/badge/Airbyte-2.0.19-007CEE?style=flat\u0026logo=airbyte\u0026logoColor=5F5DFF\u0026labelColor=14193A)](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/taskflow.html)\n[![Airflow](https://img.shields.io/badge/Airflow-2.10-007CEE?style=flat\u0026logo=apacheairflow\u0026logoColor=white\u0026labelColor=14193A)](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/taskflow.html)\n[![dbt](https://img.shields.io/badge/dbt-1.11-262A38?style=flat\u0026logo=dbt\u0026logoColor=FF6849\u0026labelColor=262A38)](https://docs.getdbt.com/reference/warehouse-setups/bigquery-setup)\n[![BigQuery](https://img.shields.io/badge/BigQuery-3772FF?style=flat\u0026logo=googlebigquery\u0026logoColor=white\u0026labelColor=3772FF)](https://console.cloud.google.com/bigquery)\n\nThis project aims to provision end-to-end pipeline lineage across Airbyte, Airflow, dbt, and BigQuery, with DataHub as the data catalog and lineage platform. It also aims to ensure that sibling relationships are not duplicated (e.g., the Airbyte destination table for a given source resolves to the same entity as the dbt source table).\n\n\n## Quick Start:\n\n1. Spin up DataHub\n```shell\ndocker compose -f datahub/compose.yaml up -d\n```\n\n2. Spin up Airflow\n```shell\ndocker compose -f airflow/compose.yaml up --build --force-recreate -d\n```\n\n3. Spin up Airbyte with abctl\n```shell\nbrew tap airbytehq/tap\nbrew install abctl\n\nabctl local install\n```\n\n4. Fetch Airbyte credentials\n```shell\nabctl local credentials\n```\n\n5. Build the dbt-bigquery Docker image\n```shell\ndocker build -t dbt-bigquery:latest dbt/ --no-cache\n```\n\n6. Build the datahub-ingest Docker image\n```shell\ndocker build -t datahub-ingest:latest datahub/ --no-cache\n```\n\n7. 
Terraform\n\nFollow the instructions in [terraform](./terraform/) to run and apply Terraform.\n\n\n## Reference Docs\nRefer to each project folder for instructions on starting that component individually.\n\n- [DataHub](datahub/README.md)\n- [Airflow](airflow/README.md)\n- [Airbyte](airbyte/README.md)\n- [dbt-bigquery](dbt/README.md)\n- [Terraform](terraform/REA)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiobruno%2Fdata-catalog-labs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiobruno%2Fdata-catalog-labs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiobruno%2Fdata-catalog-labs/lists"}