{"id":26477465,"url":"https://github.com/ekoepplin/dbt-bigquery-core","last_synced_at":"2026-05-06T18:35:09.276Z","repository":{"id":283097820,"uuid":"946652451","full_name":"ekoepplin/dbt-bigquery-core","owner":"ekoepplin","description":"How to get data to BigQuery (or duckDB) and setup dbt tests for SODA cloud monitoring","archived":false,"fork":false,"pushed_at":"2025-05-19T07:19:34.000Z","size":302,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-19T08:30:27.721Z","etag":null,"topics":["bigquery","data","data-quality","dbt","dlt","duckdb","gcp","soda"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ekoepplin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-11T13:23:24.000Z","updated_at":"2025-05-19T07:19:38.000Z","dependencies_parsed_at":"2025-05-19T08:37:14.145Z","dependency_job_id":null,"html_url":"https://github.com/ekoepplin/dbt-bigquery-core","commit_stats":null,"previous_names":["ekoepplin/dbt-bigquery-core"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ekoepplin/dbt-bigquery-core","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekoepplin%2Fdbt-bigquery-core","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekoepplin%2Fdbt-bigquery-core/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekoepplin%2Fdbt-bigquery-core/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekoepplin%2Fdbt-bigquery-core/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ekoepplin","download_url":"https://codeload.github.com/ekoepplin/dbt-bigquery-core/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekoepplin%2Fdbt-bigquery-core/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270982194,"owners_count":24679447,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-18T02:00:08.743Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","data","data-quality","dbt","dlt","duckdb","gcp","soda"],"created_at":"2025-03-20T00:47:01.248Z","updated_at":"2026-05-06T18:35:04.255Z","avatar_url":"https://github.com/ekoepplin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Project Overview\n\nThis repository demonstrates a modern data quality engineering workflow:\n\n1. **Data Ingestion**:\n   - Uses `dlt` (data load tool) to load NewsAPI data into BigQuery\n   - Serves as a simple example of data ingestion\n   - Located in the `dlt-data-dumper` directory\n   - Includes NewsAPI integration for article data collection\n\n2. 
**Main Focus: Data Quality Engineering**:\n   - **dbt Transformations**:\n     - Structured data modeling with staging, intermediate, and mart layers\n     - Demonstrates testing and documentation best practices\n     - Shows how to implement data contracts and quality checks\n   \n   - **Soda Integration**:\n     - Automated data quality monitoring\n     - Integration with dbt metadata\n     - Real-time quality checks and alerting\n     - Data freshness and volume monitoring\n\nThe primary goal is to showcase how to implement robust data quality practices using dbt and Soda in a BigQuery environment.\n\n## Quick Start\n\nFor detailed setup and usage instructions, please see our [GETTING_STARTED.md](GETTING_STARTED.md) guide, which includes:\n- Development environment setup (Dev Container recommended for Windows users)\n- Prerequisites and account requirements\n- Step-by-step configuration\n- Testing and data quality monitoring\n\nFor comprehensive testing documentation, including all test types, configurations, and best practices, see our [GETTING_STARTED_TESTING.md](GETTING_STARTED_TESTING.md) guide.\n\n## Credential Setup\n1. Create a `credentials` directory if it doesn't exist\n2. Copy `credentials/soda-credentials.env.template` to `credentials/soda-credentials.env`\n3. Add your service account JSON file as `credentials/service-account.json`\n4. For dlt-data-dumper:\n   - Create `credentials/dlt-secrets.toml` with the following structure:\n     ```toml\n     [destination.bigquery]\n     location = \"EU\"\n\n     [destination.bigquery.credentials]\n     project_id = \"your-project-id\"\n     private_key = \"your-private-key\"\n     client_email = \"your-service-account-email\"\n\n     [sources.newsapi_pipeline]\n     api_key = \"your-newsapi-key\"\n\n     [newsapi_pipeline.destination]\n     schema_name = \"ingest_newsapi_v1\"\n     ```\n   - Replace the placeholder values with your actual credentials\n5. 
Make sure every file in `credentials/` contains your actual values rather than placeholders\n\n## Important Notes\n\n- **Development Environment**: We recommend using VS Code with Dev Containers, especially for Windows users\n- **Required Accounts**:\n  - Google Cloud Platform with BigQuery access\n  - Soda Cloud (45-day free trial available)\n  - NewsAPI account (for data ingestion)\n- **Learning Resources**:\n  - [dbt Fundamentals Course](https://learn.getdbt.com/courses/dbt-fundamentals) (Recommended)\n  - Detailed documentation in GETTING_STARTED.md\n  - Comprehensive testing guide in GETTING_STARTED_TESTING.md\n\nFor detailed setup instructions and best practices, please refer to our comprehensive [Getting Started Guide](GETTING_STARTED.md).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fekoepplin%2Fdbt-bigquery-core","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fekoepplin%2Fdbt-bigquery-core","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fekoepplin%2Fdbt-bigquery-core/lists"}