https://github.com/meltanolabs/tap-dbt
Singer Tap for dbt API v2 built with the Meltano SDK
- Host: GitHub
- URL: https://github.com/meltanolabs/tap-dbt
- Owner: MeltanoLabs
- License: apache-2.0
- Created: 2021-04-07T07:41:30.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T19:09:16.000Z (about 1 year ago)
- Last Synced: 2025-03-31T20:23:50.777Z (about 1 year ago)
- Topics: dbt, dbt-cloud, elt, extract-data, meltano-sdk, singer-io, singer-tap
- Language: Python
- Homepage: https://pypi.org/p/tap-dbt
- Size: 1000 KB
- Stars: 12
- Watchers: 5
- Forks: 7
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# tap-dbt
`tap-dbt` is a Singer tap for the [dbt Cloud API v2][dbtcloud].
Built with the [Singer SDK][sdk].
- [Installation](#installation)
- [Features](#features)
- [Configuration](#configuration)
  - [Inputs](#inputs)
  - [JSON example](#json-example)
  - [Environment variables example](#environment-variables-example)
  - [Meltano Example](#meltano-example)
- [Usage](#usage)
  - [Executing the Tap Directly](#executing-the-tap-directly)
  - [With Meltano](#with-meltano)
## Installation
```shell
pip install tap-dbt
```
## Features
### Streams
Full stream metadata is available in the dbt Labs repository: [openapi_schema]
#### Selected by default
The following will be extracted by default if no catalog is used:
- [x] Stream: accounts
- [x] Stream: jobs
- [x] Stream: projects
- [x] Stream: runs
#### Configurable
Can be enabled by setting `selected` in the catalog:
- [x] Stream: connections
- [x] Stream: environments
- [x] Stream: repositories
- [x] Stream: users
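As a rough illustration of enabling one of the configurable streams, the sketch below sets `selected` on the stream-level metadata entry of a Singer catalog. The helper name `select_streams` and the tiny hand-written catalog are hypothetical; a real catalog would come from `tap-dbt --discover`, and its exact contents are not shown here.

```python
import json


def select_streams(catalog: dict, stream_names: set[str]) -> dict:
    """Mark the named streams as selected in a Singer catalog dict (hypothetical helper)."""
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") not in stream_names:
            continue
        metadata = stream.setdefault("metadata", [])
        # The stream-level metadata entry is the one with an empty breadcrumb.
        root = next((m for m in metadata if m.get("breadcrumb") == []), None)
        if root is None:
            root = {"breadcrumb": [], "metadata": {}}
            metadata.append(root)
        root.setdefault("metadata", {})["selected"] = True
    return catalog


# Hand-written catalog for illustration only.
catalog = {
    "streams": [
        {
            "tap_stream_id": "users",
            "metadata": [{"breadcrumb": [], "metadata": {"selected": False}}],
        },
        {"tap_stream_id": "runs", "metadata": []},
    ]
}

updated = select_streams(catalog, {"users"})
print(json.dumps(updated["streams"][0]["metadata"]))
```

The updated catalog can then be written to a file and passed to the tap with `--catalog`.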
### Incremental Run Stream
Querying the Runs endpoint ordered by `-finished_at` (descending run finish time) returns runs with no `finished_at` first, followed by finished runs from newest to oldest:
|id|finished_at|updated_at|created_at|
|---|---|---|---|
|314516|None|2023-05-27 21:05:16.109548+00:00|2023-05-27 21:05:05.664170+00:00|
|314514|None|2023-05-27 21:00:16.847296+00:00|2023-05-27 21:00:05.458908+00:00|
|314513|None|2023-05-27 21:00:16.355680+00:00|2023-05-27 21:00:05.427258+00:00|
|314517|None|2023-05-27 21:05:17.094309+00:00|2023-05-27 21:05:05.696222+00:00|
|314515|2023-05-27 21:01:28.568431+00:00|2023-05-27 21:01:29.269048+00:00|2023-05-27 21:00:05.488543+00:00|
|314512|2023-05-27 20:48:59.342035+00:00|2023-05-27 20:48:59.844412+00:00|2023-05-27 20:45:04.509746+00:00|
|314511|2023-05-27 20:48:46.571106+00:00|2023-05-27 20:48:47.079130+00:00|2023-05-27 20:40:04.257950+00:00|
|314505|2023-05-27 20:41:35.591976+00:00|2023-05-27 20:41:36.305364+00:00|2023-05-27 20:15:02.808079+00:00|
|314510|2023-05-27 20:39:27.162437+00:00|2023-05-27 20:39:28.628257+00:00|2023-05-27 20:35:03.939439+00:00|
|314509|2023-05-27 20:37:39.965974+00:00|2023-05-27 20:37:40.496212+00:00|2023-05-27 20:30:03.802620+00:00|
The incremental sync uses `replication_key = "finished_at"`. When an INCREMENTAL sync runs:
- If a bookmark is set, the stream is queried in descending `finished_at` order.
- If a record's `finished_at` is not set, the run is assumed to still be in progress, so the record is included. Given the sort order, records with a populated `finished_at` appear later in the stream. *Repeated syncs yield the same records while the dbt job run is still underway, which is consistent with the SDK's ['at least once' delivery promise](https://sdk.meltano.com/en/latest/implementation/at_least_once.html).*
- Once the sync reaches records with a populated `finished_at`, each value is compared with the bookmark. The stream stops syncing as soon as a `finished_at` value is less than the bookmark.
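The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tap's actual code: the function name `runs_since` is hypothetical, and records are assumed to arrive already sorted by descending `finished_at` with `finished_at` parsed into `datetime` objects.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, Optional


def runs_since(records: Iterable[dict], bookmark: Optional[datetime]) -> Iterator[dict]:
    """Yield runs until a finished_at older than the bookmark is seen (hypothetical helper)."""
    for record in records:
        finished_at = record.get("finished_at")
        if finished_at is None:
            # Run is assumed to still be in progress: always emit it.
            yield record
            continue
        if bookmark is not None and finished_at < bookmark:
            # Everything after this point is older than the bookmark.
            break
        yield record


def utc(hour: int, minute: int, second: int) -> datetime:
    return datetime(2023, 5, 27, hour, minute, second, tzinfo=timezone.utc)


# A few rows modelled on the table above, in the order the API returns them.
records = [
    {"id": 314516, "finished_at": None},
    {"id": 314515, "finished_at": utc(21, 1, 28)},
    {"id": 314512, "finished_at": utc(20, 48, 59)},
    {"id": 314511, "finished_at": utc(20, 48, 46)},
]

synced_ids = [r["id"] for r in runs_since(records, bookmark=utc(20, 48, 59))]
print(synced_ids)  # prints [314516, 314515, 314512]
```

A run whose `finished_at` equals the bookmark is emitted again, which matches the 'at least once' behaviour described above.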
## Configuration
Visit the [API docs][apidocs] for instructions on how to get your API key.
You can pass configuration using environment variables with the `TAP_DBT_` prefix followed by the uppercased field name:
```shell
tap-dbt --config=ENV
```
or using a JSON file:
```shell
tap-dbt --config=config.json
```
### Inputs
| Field | Description | Type | Required | Default |
|---------------|-----------------------------------------------------------------|----------------|----------|--------------------------------------------------|
| `api_key` | API key for the dbt Cloud API | `string` | yes | |
| `account_ids` | dbt Cloud account IDs | `list(string)` | yes | |
| `user_agent` | User-Agent to make requests with | `string` | no | `tap-dbt/0.1.0 Singer Tap for the dbt Cloud API` |
| `base_url` | Base URL for the dbt Cloud API | `string` | no | `https://cloud.getdbt.com/api/v2` |
| `page_size` | Number of records per API call, sets the `limit=` url parameter | `integer` | no | 5000 |
A full list of supported settings and capabilities for this tap is available by running:
```shell
tap-dbt --about --format json
```
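To make the effect of `page_size` concrete, the sketch below builds a request URL in which `page_size` becomes the `limit=` query parameter. The helper `page_url`, the `offset` parameter, and the example endpoint path are assumptions made for illustration; they are not taken from the tap's source.

```python
from urllib.parse import urlencode


def page_url(base_url: str, path: str, page_size: int, offset: int = 0) -> str:
    """Build a paginated request URL (hypothetical helper; `offset` is an assumption)."""
    params = {"limit": page_size, "offset": offset}
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{urlencode(params)}"


# The path below is an illustrative endpoint, not one confirmed by this README.
url = page_url("https://cloud.getdbt.com/api/v2", "accounts/51341/runs/", 5000)
print(url)
```

A smaller `page_size` means more requests per sync, each returning fewer records.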
### JSON example
```json
{
  "api_key": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
  "account_ids": ["51341"],
  "user_agent": "tap-dbt/0.1.0 Singer Tap for the dbt Cloud API",
  "base_url": "https://my-dbt-cloud-api.com",
  "page_size": 5000
}
```
### Environment variables example
```dotenv
TAP_DBT_API_KEY=da39a3ee5e6b4b0d3255bfef95601890afd80709
TAP_DBT_ACCOUNT_IDS=51341
TAP_DBT_USER_AGENT='tap-dbt/0.1.0 Singer Tap for the dbt Cloud API'
TAP_DBT_BASE_URL=https://my-dbt-cloud-api.com
TAP_DBT_PAGE_SIZE=5000
```
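The naming convention (prefix `TAP_DBT_` plus the uppercased field name) can be illustrated with a short sketch. This only demonstrates the name mapping; `config_from_env` is a hypothetical helper, values stay as raw strings, and the actual parsing and type conversion done by the tap when run with `--config=ENV` is not reproduced here.

```python
PREFIX = "TAP_DBT_"


def config_from_env(environ: dict) -> dict:
    """Map TAP_DBT_* variables to lowercase field names (hypothetical helper)."""
    config = {}
    for name, value in environ.items():
        if name.startswith(PREFIX):
            # Strip the prefix and lowercase the rest to recover the field name.
            config[name[len(PREFIX):].lower()] = value
    return config


# A plain dict stands in for os.environ so the example is reproducible.
env = {"TAP_DBT_API_KEY": "abc123", "TAP_DBT_PAGE_SIZE": "5000", "HOME": "/root"}
config = config_from_env(env)
print(config)  # prints {'api_key': 'abc123', 'page_size': '5000'}
```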
### Meltano Example
```yaml
plugins:
  extractors:
    - name: tap-dbt
      logo_url: https://hub.meltano.com/assets/logos/taps/dbt.png
      label: dbt Cloud
      docs: https://hub.meltano.com/taps/dbt
      repo: https://github.com/edgarrmondragon/tap-dbt
      namespace: dbt
      pip_url: tap-dbt
      executable: tap-dbt
      capabilities:
        - catalog
        - discover
      settings:
        - name: base_url
          label: dbt Cloud URL
          placeholder: "https://cloud.getdbt.com/api/v2"
        - name: api_key
          kind: password
          label: API Key
          docs: "https://docs.getdbt.com/dbt-cloud/api#section/Authentication"
        - name: account_ids
          kind: array
          label: Account IDs
        - name: user_agent
          label: User-Agent
          placeholder: "tap-dbt/0.1.0 Singer Tap for the dbt Cloud API"
        - name: page_size
          kind: integer
          label: Page Size
```
## Usage
You can run `tap-dbt` directly from the CLI or through [Meltano][meltano].
### Executing the Tap Directly
```shell
tap-dbt --version
tap-dbt --help
tap-dbt --config .secrets/example.json --discover > ./catalog.json
```
### With Meltano
```shell
meltano elt tap-dbt target-snowflake --job_id dbt_snowflake
```
[dbtcloud]: https://cloud.getdbt.com
[sdk]: https://gitlab.com/meltano/singer-sdk
[apidocs]: https://docs.getdbt.com/dbt-cloud/api#section/Authentication
[meltano]: https://www.meltano.com
[openapi_schema]: https://github.com/dbt-labs/dbt-cloud-openapi-spec/blob/master/openapi-v3.yaml