dbt basics workshop
https://github.com/Victoriapm/dbt-basics-workshop

Welcome to the [DBT-in-Practice Workshop](https://www.eventbrite.com/e/dbt-in-practice-workshop-with-victoria-perez-mola-tickets-170094173251) with Victoria Perez Mola by [Dat0s](https://www.linkedin.com/company/dat0s-org)!
![image](https://user-images.githubusercontent.com/4315804/133504036-037630b8-9c9c-4169-99c8-a3c13e200c65.png)

### How to run this project
#### Prerequisites
We will build a project using dbt and a postgres database, but any other database supported by dbt could be used instead.
The requirements are:
- [docker](https://www.docker.com/) (optional): used to run postgres, and can be used to run dbt as well.
- [postgres](https://www.postgresql.org/) (optional, but a database is needed). Follow [this guide](https://www.postgresqltutorial.com/install-postgresql-macos/) to install postgres on macOS and create a database. I used one database, 'production', with a schema 'dbt_victoria_mola' for local development and another schema 'master' for production deployment.
- [dbt](https://docs.getdbt.com/dbt-cli/installation) (local installation): follow the link for instructions, or use a docker image from the official [dbt repo](https://github.com/dbt-labs/dbt/).
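
If you take the docker route for postgres, the setup above can be sketched as the script below. It is a minimal sketch and not part of the workshop materials: the container name `dbt-workshop-postgres`, the `postgres-init` folder and the `POSTGRES_PASSWORD` variable are assumptions chosen for illustration, while the database and schema names are the ones mentioned above. It relies on the official `postgres` image running any `.sql` files found in `/docker-entrypoint-initdb.d` the first time the container starts. The script only writes the init file and prints the `docker run` command, so nothing is started until you run that command yourself.

```shell
#!/bin/sh
set -eu

# Assumed, illustrative names (not defined by the workshop repo).
CONTAINER_NAME="${CONTAINER_NAME:-dbt-workshop-postgres}"
INIT_DIR="${INIT_DIR:-$PWD/postgres-init}"

mkdir -p "$INIT_DIR"

# SQL executed by the postgres image the first time the container starts.
# Schema names follow the ones used in this README.
cat > "$INIT_DIR/01_schemas.sql" <<'SQL'
CREATE SCHEMA IF NOT EXISTS dbt_victoria_mola;
CREATE SCHEMA IF NOT EXISTS master;
SQL

# Print the command instead of running it, so it can be reviewed first.
cat <<EOF
docker run -d --name $CONTAINER_NAME \\
  -e POSTGRES_DB=production \\
  -e POSTGRES_PASSWORD="\$POSTGRES_PASSWORD" \\
  -p 5432:5432 \\
  -v "$INIT_DIR":/docker-entrypoint-initdb.d:ro \\
  postgres
EOF
```

Export `POSTGRES_PASSWORD` in your shell before running the printed command. dbt normally creates its target schema on the first run, so the init file is optional and mainly makes the two schemas explicit.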

### About the data
This project is based on open datasets from Kaggle.com:
- [Netflix audience behaviour - UK movies](https://www.kaggle.com/vodclickstream/netflix-audience-behaviour-uk-movies) (the data from 2019 has been removed)
- [Movies and TV Shows listings on Netflix](https://www.kaggle.com/shivamb/netflix-shows)

### About the project
This project is based on the [dbt starter project](https://github.com/dbt-labs/dbt-starter-project) (generated by running `dbt init`).
Try running the following commands:
- `dbt run`
- `dbt test`

The project includes the following files:
- `dbt_project.yml`: the file used to configure the dbt project. Make sure the profile named here matches the one set up in `~/.dbt/profiles.yml` during local installation.
- CSV files in the `data` folder: the datasets described above, which will be our sources.
- Files inside the `models` folder: the SQL files contain the scripts that build our models, covering staging, core and datamart models. In the end, these models will follow this structure:

![image](https://user-images.githubusercontent.com/4315804/134244783-e324a928-114c-4ff7-8975-7919a774bc9a.png)
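
The `dbt_project.yml` entry above refers to a profile in `~/.dbt/profiles.yml`. As a rough illustration of what a matching postgres profile could look like, the script below writes one into a separate folder so that an existing `~/.dbt/profiles.yml` is not overwritten. Several names are assumptions rather than values taken from the repo: `workshop_profile` stands in for whatever `profile:` is set to in `dbt_project.yml`, `DBT_PASSWORD` is a hypothetical environment variable, and `localhost`, port `5432` and user `postgres` assume a default local postgres. The database and schema names are the ones listed in the prerequisites.

```shell
#!/bin/sh
set -eu

# Assumed, illustrative location; dbt reads ~/.dbt/profiles.yml by default.
PROFILES_DIR="${PROFILES_DIR:-$PWD/dbt-profiles}"
mkdir -p "$PROFILES_DIR"

# 'workshop_profile' is a placeholder: replace it with the value of
# 'profile:' in dbt_project.yml. The quoted 'YAML' delimiter writes the
# content verbatim, so the {{ env_var(...) }} expression is left for dbt.
cat > "$PROFILES_DIR/profiles.yml" <<'YAML'
workshop_profile:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: postgres
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: production
      schema: dbt_victoria_mola
      threads: 4
    prod:
      type: postgres
      host: localhost
      port: 5432
      user: postgres
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: production
      schema: master
      threads: 4
YAML

echo "Wrote $PROFILES_DIR/profiles.yml"
```

To check the connection with this file, export `DBT_PASSWORD` and run `dbt debug --profiles-dir "$PROFILES_DIR"` from the project directory. Once it works, the content can be merged into `~/.dbt/profiles.yml`.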

#### Workflow
![image](https://user-images.githubusercontent.com/4315804/134247720-c3ed8c50-b50b-47b3-89be-e4c9984da09b.png)

#### Execution
After installing the required tools and cloning this repo, execute the following commands:

1. Change into the project's directory from the command line: `$ cd dbt-basics-workshop/complete-project`
2. Load the CSVs into the database. This materializes the CSVs as tables in your target schema: `$ dbt seed`
3. Run the models: `$ dbt run`
4. Generate documentation for the project: `$ dbt docs generate`
5. View the documentation for the project: `$ dbt docs serve`. This step should open the documentation page in your browser through a local web server; the page can also be accessed at http://localhost:8080
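
The five steps above can also be chained in a small script. This is only a sketch under stated assumptions: the `DRY_RUN` switch and the `run_step` and `run_workshop` helpers are names invented here, and the script expects to be started from the folder that contains the cloned repo. By default it prints the commands without running them.

```shell
#!/bin/sh
set -eu

# Path from step 1; override if the repo was cloned elsewhere.
PROJECT_DIR="${PROJECT_DIR:-dbt-basics-workshop/complete-project}"
# Assumed convention for this sketch: 1 = only print, 0 = execute.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, outside dry-run mode, execute it.
run_step() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# The ( ... ) body runs in a subshell, so the cd does not affect the caller.
run_workshop() (
  run_step cd "$PROJECT_DIR"
  run_step dbt seed
  run_step dbt run
  run_step dbt docs generate
  run_step dbt docs serve
)

run_workshop
```

Run it with `DRY_RUN=0` to actually execute the steps; because of `set -e`, the script stops at the first command that fails. `dbt docs serve` keeps running until it is interrupted, which is why it comes last.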

### Resources:
- Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction)
- Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers
- Join the [chat](http://slack.getdbt.com/) on Slack for live discussions and support
- Find [dbt events](https://events.getdbt.com) near you
- Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices