https://github.com/fivetran/benchmark
Benchmark data warehouses under Fivetran-like conditions
- Host: GitHub
- URL: https://github.com/fivetran/benchmark
- Owner: fivetran
- Created: 2017-06-27T02:17:33.000Z
- Default Branch: master
- Last Pushed: 2022-12-24T21:21:42.000Z
- Last Synced: 2024-12-12T03:20:45.957Z
- Language: Shell
- Size: 408 KB
- Stars: 166
- Watchers: 55
- Forks: 42
- Open Issues: 7
Metadata Files:
- Readme: README.md
README
# Results
https://fivetran.com/blog/warehouse-benchmark
# Design
This benchmark is based on TPC-DS, a standard data warehouse benchmark that makes heavy use of joins, aggregations and subqueries.
The TPC-DS queries have been modified somewhat to improve portability across implementations and to eliminate obscure SQL features such as `GROUPING SETS`.
We generated 1 TB of data; the largest fact table contains about 4 billion rows.
We used the following warehouse configurations:
| Warehouse | Configuration | Cost / Hour |
|-----------|---------------------|-------------|
| Redshift | 5x ra3.4xlarge | $16.30 |
| Snowflake | Large | $16.00 |
| Presto | 4x n2-highmem-32 | $8.02 |
| BigQuery | Flat-rate 500 slots | $13.70 |
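The hourly prices above can be turned into a rough per-query cost by multiplying by the query's runtime. The sketch below shows only that arithmetic. It assumes each warehouse is billed for the wall-clock time of the query at the listed hourly rate, and the `query_cost` helper and the 10-second runtime are hypothetical illustrations, not benchmark results.

```shell
# query_cost HOURLY_PRICE RUNTIME_SECONDS
# Prints the cost in USD of running a warehouse for RUNTIME_SECONDS,
# assuming linear billing at HOURLY_PRICE dollars per hour.
query_cost() {
  LC_ALL=C awk -v price="$1" -v secs="$2" \
    'BEGIN { printf "%.4f\n", price * secs / 3600 }'
}

# Hourly prices are copied from the table above.
# The runtime is a made-up value used only to illustrate the calculation.
runtime_seconds=10
while read -r warehouse price; do
  printf '%-10s $%s\n' "$warehouse" "$(query_cost "$price" "$runtime_seconds")"
done <<'EOF'
Redshift 16.30
Snowflake 16.00
Presto 8.02
BigQuery 13.70
EOF
```

Replace `runtime_seconds` with a measured runtime to compare warehouses on cost rather than on speed alone.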
# Usage
These scripts are intended to be manually copy-pasted into various terminals.
You can skip steps 1-4 because `gs://fivetran-benchmark` and `s3://fivetran-benchmark` are already populated.
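Before skipping the data-generation steps, it may be worth confirming that the buckets are readable from your environment. The following is a minimal sketch, assuming the `gsutil` and `aws` command-line tools are installed and that your credentials are allowed to list these buckets; the `check_bucket` helper is hypothetical and not part of this repository.

```shell
# check_bucket URL COMMAND [ARGS...]
# Runs COMMAND ARGS... URL if COMMAND is installed, otherwise reports a skip.
check_bucket() {
  url=$1
  shift
  if command -v "$1" >/dev/null 2>&1; then
    "$@" "$url" || echo "could not list $url (check credentials and permissions)"
  else
    echo "skipping $url: $1 is not installed"
  fi
}

# List the top level of each pre-populated bucket.
check_bucket gs://fivetran-benchmark/ gsutil ls
check_bucket s3://fivetran-benchmark/ aws s3 ls
```

If a listing fails, the buckets may no longer be accessible to you, in which case the data has to be generated with steps 1-4 after all.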