{"id":37748224,"url":"https://github.com/aetperf/sql_olap_bench","last_synced_at":"2026-01-16T14:19:11.631Z","repository":{"id":256716455,"uuid":"626105373","full_name":"aetperf/sql_olap_bench","owner":"aetperf","description":"SQL OLAP benchmark","archived":false,"fork":false,"pushed_at":"2024-09-12T12:40:16.000Z","size":39,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-24T22:55:34.779Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aetperf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-10T20:07:22.000Z","updated_at":"2024-09-12T12:40:21.000Z","dependencies_parsed_at":"2024-09-12T17:35:22.306Z","dependency_job_id":"d2648b5b-aa8d-446f-b319-39ae2853b896","html_url":"https://github.com/aetperf/sql_olap_bench","commit_stats":null,"previous_names":["aetperf/sql_olap_bench"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aetperf/sql_olap_bench","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aetperf%2Fsql_olap_bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aetperf%2Fsql_olap_bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aetperf%2Fsql_olap_bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aetperf%2Fsql_olap_bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aetperf",
"download_url":"https://codeload.github.com/aetperf/sql_olap_bench/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aetperf%2Fsql_olap_bench/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28479374,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-16T14:19:11.577Z","updated_at":"2026-01-16T14:19:11.626Z","avatar_url":"https://github.com/aetperf.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SQL OLAP bench\n\n## Generate data\n\n### Generate TPC-H data (Parquet, DuckDB and Hyper files)\n\n```bash\n$ python generate_tpch_data.py -h\nusage: generate_tpch_data.py [-h] [-sf INT] [-d TXT] [-n INT] [-s]\n\nCommand line interface to generate_TPC-H_data\n\noptions:\n  -h, --help            show this help message and exit\n  -sf NUM, --scale_factor NUM\n                        TPC-H scale factor\n  -d TXT, --data_dir TXT\n                        Data dir path\n  -n INT, --n_steps INT\n                        number of files\n  -s, --suite           Benchmark suite with scale factors [1, 3, 10, 30, 100]\n```\n\n- Examples:\n\n```bash\n$ python generate_tpch_data.py -sf 1 -d 
/home/francois/Workspace/pydbbench/data\n```\nCreates the folder `/home/francois/Workspace/pydbbench/data/tpch_1` with the following files:\n\n\tcustomer.parquet  lineitem.parquet  part.parquet      supplier.parquet\n\tdata.duckdb       nation.parquet    partsupp.parquet  tmp.hyper\n\tdata.hyper        orders.parquet    region.parquet\n\n```bash\n$ python generate_tpch_data.py -sf 1 -d /home/francois/Workspace/pydbbench/data -n 2\n```\nCreates the folder `/home/francois/Workspace/pydbbench/data/tpch_1` with the following files:\n\n\tcustomer_000.parquet  nation_000.parquet  partsupp_000.parquet\n\tcustomer_001.parquet  nation_001.parquet  partsupp_001.parquet\n\tdata.duckdb           orders_000.parquet  region_000.parquet\n\tdata.hyper            orders_001.parquet  region_001.parquet\n\tlineitem_000.parquet  part_000.parquet    supplier_000.parquet\n\tlineitem_001.parquet  part_001.parquet    supplier_001.parquet\n\n```bash\n$ python generate_tpch_data.py -sf 10 -d /home/francois/Workspace/pydbbench/data\n```\nCreates the folder `/home/francois/Workspace/pydbbench/data/tpch_10` with the same files as in the first example.\n\n```bash\n$ python generate_tpch_data.py -s -d /home/francois/Workspace/pydbbench/data \n```\n\nCreates the benchmark suite with the following folders:  \n\n\t/home/francois/Workspace/pydbbench/data/tpch_1  \n\t/home/francois/Workspace/pydbbench/data/tpch_3  \n\t/home/francois/Workspace/pydbbench/data/tpch_10  \n\t/home/francois/Workspace/pydbbench/data/tpch_30  \n\t/home/francois/Workspace/pydbbench/data/tpch_100  \n\n\n### Generate TPC-DS data (Parquet, DuckDB and Hyper files)\n\n```bash\n$ python generate_tpcds_data.py -h\nusage: generate_tpcds_data.py [-h] [-sf INT] [-d TXT] [-s]\n\nCommand line interface to generate_TPC-H_data\n\noptions:\n  -h, --help            show this help message and exit\n  -sf NUM, --scale_factor NUM\n                        TPC-DS scale factor\n  -d TXT, --data_dir TXT\n                        Data dir path\n  -s, --suite           Benchmark suite with 
scale factors [1, 3, 10, 30, 100]\n```\n\n- Examples:\n\n```bash\n$ python generate_tpcds_data.py -sf 1 -d /home/francois/Workspace/pydbbench/data\n```\nCreates the folder `/home/francois/Workspace/pydbbench/data/tpcds_1` with the following files:\n\n\tcall_center.parquet             item.parquet\n\tcatalog_page.parquet            promotion.parquet\n\tcatalog_returns.parquet         reason.parquet\n\tcatalog_sales.parquet           ship_mode.parquet\n\tcustomer_address.parquet        store.parquet\n\tcustomer_demographics.parquet   store_returns.parquet\n\tcustomer.parquet                store_sales.parquet\n\tdata.duckdb                     time_dim.parquet\n\tdata.hyper                      warehouse.parquet\n\tdate_dim.parquet                web_page.parquet\n\thousehold_demographics.parquet  web_returns.parquet\n\tincome_band.parquet             web_sales.parquet\n\tinventory.parquet               web_site.parquet\n\n\n```bash\n$ python generate_tpcds_data.py -sf 10 -d /home/francois/Workspace/pydbbench/data\n```\nCreates the folder `/home/francois/Workspace/pydbbench/data/tpcds_10` with the same files as above.\n\n```bash\n$ python generate_tpcds_data.py -s -d /home/francois/Workspace/pydbbench/data \n```\n\nCreates the benchmark suite with the following folders:  \n\n\t/home/francois/Workspace/pydbbench/data/tpcds_1  \n\t/home/francois/Workspace/pydbbench/data/tpcds_3  \n\t/home/francois/Workspace/pydbbench/data/tpcds_10  \n\t/home/francois/Workspace/pydbbench/data/tpcds_30  \n\t/home/francois/Workspace/pydbbench/data/tpcds_100  \n\n## TPC-H benchmark\n\n```bash\n$ python tpch_bench.py -d /home/francois/Workspace/pydbbench/data\n```\n\nLoops over all the `tpch_*` subfolders of the data directory, runs the queries, and generates the CSV file: 
`timings_TPCH.csv`","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faetperf%2Fsql_olap_bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faetperf%2Fsql_olap_bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faetperf%2Fsql_olap_bench/lists"}