https://github.com/ekoepplin/dbt-bigquery-core
How to get data to BigQuery and setup dbt tests for SODA cloud monitoring
https://github.com/ekoepplin/dbt-bigquery-core
bigquery data data-quality dbt dlt gcp soda
Last synced: 2 months ago
JSON representation
How to get data to BigQuery and setup dbt tests for SODA cloud monitoring
- Host: GitHub
- URL: https://github.com/ekoepplin/dbt-bigquery-core
- Owner: ekoepplin
- Created: 2025-03-11T13:23:24.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2025-03-18T14:24:10.000Z (2 months ago)
- Last Synced: 2025-03-18T15:37:29.459Z (2 months ago)
- Topics: bigquery, data, data-quality, dbt, dlt, gcp, soda
- Language: Python
- Homepage:
- Size: 300 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## Project Overview
This repository demonstrates a modern data quality engineering workflow:
1. **Data Ingestion**:
- Uses `dlt` (data load tool) to load NewsAPI data into BigQuery
- Serves as a simple example of data ingestion2. **Main Focus: Data Quality Engineering**:
- **dbt Transformations**:
- Structured data modeling with staging, intermediate, and mart layers
- Demonstrates testing and documentation best practices
- Shows how to implement data contracts and quality checks
- **Soda Integration**:
- Automated data quality monitoring
- Integration with dbt metadata
- Real-time quality checks and alerting
- Data freshness and volume monitoringThe primary goal is to showcase how to implement robust data quality practices using dbt and Soda in a BigQuery environment.
## Quick Start
For detailed setup and usage instructions, please see our [GETTING_STARTED.md](GETTING_STARTED.md) guide, which includes:
- Development environment setup (Dev Container recommended for Windows users)
- Prerequisites and account requirements
- Step-by-step configuration
- Testing and data quality monitoring## Credential Setup
1. Create a `credentials` directory if it doesn't exist
2. Copy `credentials/soda-credentials.env.template` to `credentials/soda-credentials.env`
3. Add your service account JSON file as `credentials/service-account.json`
4. Update the credentials files with your actual credentials## Important Notes
- **Development Environment**: We recommend using VS Code with Dev Containers, especially for Windows users
- **Required Accounts**:
- Google Cloud Platform with BigQuery access
- Soda Cloud (45-day free trial available)
- **Learning Resources**:
- [dbt Fundamentals Course](https://learn.getdbt.com/courses/dbt-fundamentals) (Recommended)
- Detailed documentation in GETTING_STARTED.mdFor detailed setup instructions and best practices, please refer to our comprehensive [Getting Started Guide](GETTING_STARTED.md).