Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/tufin/espresso

A framework for writing testable BigQuery queries
https://github.com/tufin/espresso

bigquery sql testing

Last synced: about 2 months ago
JSON representation

A framework for writing testable BigQuery queries

Awesome Lists containing this project

README

        

[![CI](https://github.com/Tufin/espresso/workflows/go/badge.svg)](https://github.com/Tufin/espresso/actions)
[![codecov](https://codecov.io/gh/tufin/espresso/branch/main/graph/badge.svg?token=4neEgts50n)](https://codecov.io/gh/tufin/espresso)
[![Go Report Card](https://goreportcard.com/badge/github.com/tufin/espresso)](https://goreportcard.com/report/github.com/tufin/espresso)
[![GoDoc](https://godoc.org/github.com/tufin/espresso?status.svg)](https://godoc.org/github.com/tufin/espresso)
[![Docker Image Version](https://img.shields.io/docker/v/tufin/espresso?sort=semver)](https://hub.docker.com/r/tufin/espresso/tags)

# espresso - a framework for testing BigQuery queries

## Goals
- Componentization: compose complex queries from smaller, reusable components
- Test driven development: write tests for each query and run them like unit tests (except for the fact that they make calls to BigQuery)
- Data as code: input and required output for tests can be defined as part of the code (as well as in real database tables)
- No new languages to learn
- Write tests in your own development stack: programming language, IDE and CI/CD pipeline

## Writing Your Own SQL Tests
1. Write an SQL query using Go Text Template notation, for example:
```sql
{{ define "fruit" }}

WITH base AS (
{{ .Base }}
)
SELECT
fruit
FROM base

{{ end }}
```

The query may contain parameters, like {{ .Base }}
2. Add additional SQL queries to pass as arguments to the main query, for example:
```sql
{{ define "base" }}

SELECT
"orange" AS fruit
UNION ALL
SELECT
"apple"

{{ end }}
```

3. Write your expected result query, for example:
```sql
{{ define "fruit_result" }}

SELECT
"orange" AS fruit
UNION ALL
SELECT
"apple"

{{ end }}
```
4. Create a query definition file describing your query and one or more tests, for example:
```yaml
Name: fruit
Requires:
- Base
Tests:
Test1:
Args:
- Name: Base
Source: base
Result:
Source: fruit_result
```
5. Put all files together in a directory

## Query Definitions
The query definition file specifies how to construct an SQL query from the SQL templates.
A definition file contains Tests, each specifying how to construct an SQL query and an optional expected result.
To create a query from its components, espresso looks for an SQL template with the same name as the definition file itself, and then executes it with the given Args.
Each Arg must have a Name field that corresponds to a required argument in the SQL template and one of the following fields:
1. Source - Another SQL template which will be executed and injected into the containing query.
2. Table - a table name which will be combined with the Google project and BigQuery dataset (defined in Shot) and injected into the containing query.
3. Const - a string that will be injected into the containing query.

Source may contain its own Args. If it doesn't, espresso will look for a corresponding template file with a test of the same name and parse the args from there.

Result is an optional field specifiying the test's expected result.
It should have a Source field specifying an SQL template that will be executed to generate the expected result query.
Source may contain Args as described above.

## Access To BigQuery
The tests require access to BigQuery API.
Please set the following environment variables to grant espresso access to BigQuery:
- `export GCLOUD_PROJECT_ID=`
- `export BIGQUERY_KEY=`

## Running Tests From The Command-line
```bash
go build
./espresso -dir="./shot/queries/fruit/" -query="fruit" -test="Test1"
```

## Running Tests From Docker
```bash
docker run --rm -t -e GCLOUD_PROJECT_ID=$GCLOUD_PROJECT_ID -e BIGQUERY_KEY=$BIGQUERY_KEY -v $(pwd)/shot:/shot:ro tufin/espresso -dir="/shot" -query="fruit" -test="Test1"
```

## Running Tests From Golang
1. Embed your tests directory
2. Create an "Espresso Shot" and run it
3. Use standard Go assertions to check the expected result against the actual output
```go
func TestEspressoShot_Filesystem(t *testing.T) {
queryValues, resultValues, err := shot.NewShotWithClient(env.GetGCPProjectID(), "", os.DirFS("./queries/fruit")).RunTest("fruit", "Test1", []bigquery.QueryParameter{}, &map[string]bigquery.Value{})
require.NoError(t, err)
require.ElementsMatch(t, queryValues, resultValues)
}
```

You can also embed the SQL templates directory into the code:
```go
//go:embed queries/fruit
var templates embed.FS

func TestEspressoShot_Embed(t *testing.T) {
queryValues, resultValues, err := shot.NewShotWithClient(env.GetGCPProjectID(), "", templates).RunTest("fruit", "Test1", []bigquery.QueryParameter{}, &map[string]bigquery.Value{})
require.NoError(t, err)
require.ElementsMatch(t, queryValues, resultValues)
}
```

## Running Tests In Other Programming Languages
Currently only Go is supported.
If you'd like to contribute additional language support, please start a dicssussion.

## Current Status
- This is an initial proof-of-concept and request-for-comments
- Please submit your feedback as pull requests, issues or discussions.