Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gnilrets/dbt-container-skeleton
All the basics to get a nice containerized dbt development environment
https://github.com/gnilrets/dbt-container-skeleton
Last synced: 3 months ago
JSON representation
All the basics to get a nice containerized dbt development environment
- Host: GitHub
- URL: https://github.com/gnilrets/dbt-container-skeleton
- Owner: gnilrets
- License: mit
- Created: 2021-03-26T23:24:07.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-06-02T12:55:07.000Z (over 2 years ago)
- Last Synced: 2024-05-14T09:33:09.673Z (6 months ago)
- Language: Python
- Size: 73.2 KB
- Stars: 56
- Watchers: 3
- Forks: 13
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-dbt - dbt-container-skeleton - All the basics to get a nice containerized dbt development environment. (Utilities)
README
# dbt container skeleton
This project is intended to be used as a way to bootstrap a
containerized dbt full development environment.## Features
This respository has several features to help get a new dbt project started, or
enhance an existing one, including:* A containerized development environment -- Simply run `inv dbt-shell` and run all your
dbt commands, knowing that everyone else on your team is running the same version
of dbt and dependencies.
* [dtspec](https://github.com/inside-track/dtspec) testing -- The dbt shell environment
comes installed with the dtspec testing package and the CLI tools needed to make
it work easily with dbt.
* [SQLfluff](https://docs.sqlfluff.com/en/stable/) -- The dbt shell also comes with
SQLfluff, which will lint your dbt SQL code. This ensures that all of your team is
following the same code style best practices. No more internal wars over SQL formatting!
* [GitHub Actions](https://docs.github.com/en/actions) -- This project comes with
a basic github action workflow (in [.github/workflows](.github/workflows)). This workflow
will use GitHub Actions to lint and test your code every time it is changed
and pushed to GitHub. Modify as you see fit to build out your own CI/CD processes.
* [Invoke](http://www.pyinvoke.org/) task execution for automating various
[project tasks](tasks.py) using Python.## Setup
First, you'll need to download and install [docker community
edition](https://www.docker.com/). It is strongly recommended that you
install a Python environment manager like
[miniconda](https://conda.io/miniconda.html). After installing miniconda,
create a minconda environment for this project viaconda create --name dbt-container-skeleton python=3.8
Then activate the environment (this will need to be done every time you open a new shell):
conda activate dbt-container-skeleton
Next, install the Python packages needed to manage your dbt container via
pip install -r host_requirements.txt
Build the dbt container via
inv build
The example dbt project included assumes there is a Postgres database running on your
host machine. To configure the credentials needed to connect to this database runinv config
This will ask you for your Postgres username and password and allows you to customize
the host and database name as wellPOSTGRES_HOST=host.docker.internal # Use this for a local database running on the host
POSTGRES_USER=
POSTGRES_PASSWORD=
POSTGRES_DBNAME=dbt_container_skeleton # You will have to create this Postgres databaseLastly, open up a shell in the dbt container and try running the example dbt project
inv dbt-shell
dbt seed
dbt runThe container that runs mounts your local repository into the
container, so whenever you modify code on your host machine, it will
be immediately available in the container.### dtspec
This project also supports running [dtspec](https://github.com/inside-track/dtspec) tests.
To run those, first create a test database on your host's postgres server named
`_dtspec` and initialize the database via (in the `dbt-shell`)dtspec db --init-test-db
Then run the dtspec tests with:
dtspec dtspec-test-dbt
### SQLFluff
In the dbt shell continer, you can also use [SQLfluff](https://docs.sqlfluff.com/en/stable/)
to lint your dbt SQL code. Simply runsqlfluff lint models
The SQLFluff configuration file is found in [dbt/.sqlfluff](dbt/.sqlfluff).
## Use
### Next steps for your project
After going through the setup exercises, you'll want to start
customizing this project for your dbt needs. If you already have a
dbt project, you should be able to just copy your project into the
`dbt` directory. One exception to this involves your dbt
`profiles.yml` file. That file also needs to be placed in the `dbt`
directory. However, you shouldn't commit this file to version control
if it contains any secrets. Instead, all of the secrets in your
existing `profiles.yml` file should be replaced by environment
variables that get set via the `inv config` command mentioned above.
Examine the `dbt/profiles.yml` file in this project to how it is used.If you need to add or modify the environment variables that get set in
the container, you only need to edit the `config_template.env` file.
This file also should not contain any secrets, as it merely lists out
the customizable environment variables with some non-secret defaults. After
modifying this file, rerun `inv config` to set any secrets.### A note about secrets
When you run `inv config` for the first time, a secret key is
generated and stored on your host machine at
`~/.dbt-container-skeleton`. This is a key used to encrypt and
decrypt your secrets, which get stored in `.config.env` (also excluded
from version control). If you lose the secret key, you will be unable
to decrypt this file and will have to rerun `inv config`. This
ensures that secrets are not stored in plaintext on your host machine.If you need to view the secrets in plaintext, you may do show with
inv config-show
### Invoke
All of the `inv` commands are run via
[invoke](http://www.pyinvoke.org/) tasks defined in `tasks.py`. To
see a list of available commads, runinv -l
### Upgrading/adding python packages
You will want to make sure that the python packages used by this project
are kept up to date. There is a handy invoke task configured to upgrade
these packages (using [pip-tools](https://github.com/jazzband/pip-tools)) by running:inv requirements-upgrade
This will upgrade all of the packages listed in `requirements.in` and their
dependencies.If you ever need to add a Python package to this project, add it to the `requirements.in` file
and runinv requirements-compile
This will add the new package to the project, but **will not** upgrade the other packages
listed in `requirements.in`.### When to build?
Your host dbt folder is mounted within the container, so you do not
need to rebuild the container every time you change a file. However,
whenever you need to modify a Python or dbt package, you'll need to run
`inv build` afterwards.For example, if you needed to downgrade dbt to version `0.18.1`, You would
modify the `requirements.in` file and change the dbt line to `dbt==0.18.1`. Then
compile the requirements to resolve package dependencies viainv requirements-compile
If all is successful (no package conflicts), rebuild the container with `inv build`.