{"id":20102520,"url":"https://github.com/clickhouse/1trc","last_synced_at":"2026-03-09T11:32:44.777Z","repository":{"id":226018699,"uuid":"764106564","full_name":"ClickHouse/1trc","owner":"ClickHouse","description":"1 trillion rows","archived":false,"fork":false,"pushed_at":"2024-07-06T01:37:38.000Z","size":46,"stargazers_count":8,"open_issues_count":8,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-05-06T08:47:27.373Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ClickHouse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-27T13:47:39.000Z","updated_at":"2025-02-14T15:43:27.000Z","dependencies_parsed_at":"2024-12-28T09:46:33.546Z","dependency_job_id":null,"html_url":"https://github.com/ClickHouse/1trc","commit_stats":null,"previous_names":["clickhouse/1trc"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ClickHouse/1trc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClickHouse%2F1trc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClickHouse%2F1trc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClickHouse%2F1trc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClickHouse%2F1trc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ClickHouse","download_url":"https://codeload.github.com/ClickHouse/1trc/tar.gz/refs/heads
/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClickHouse%2F1trc/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30292439,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T11:12:22.024Z","status":"ssl_error","status_checked_at":"2026-03-09T11:10:54.577Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T17:31:32.321Z","updated_at":"2026-03-09T11:32:44.759Z","avatar_url":"https://github.com/ClickHouse.png","language":"Python","readme":"# ClickHouse - 1 trillion row challenges\n\nQuery trillion row datasets in Object Storage for a few cents using ClickHouse.\n\n## Objective\n\nWe aim to test the cost efficiency and performance of ClickHouse in querying files in object storage.\n\nTo this end, this repository contains Pulumi code to deploy a ClickHouse cluster in Cloud providers of a specified instance type, run a configured query against object storage, and shut the cluster down. The objective is to ensure this cost is as low as possible. In most cases (assuming pricing is linear), this should also mean faster queries.\n\nFor each cloud provider, the approach can differ, e.g., for AWS, we use spot instances. 
\n\n## Cloud providers\n\n- [AWS](./aws-starter/) - AWS using configurable spot instances.\n\n## Query\n\nQueries should not require data to be loaded into ClickHouse, i.e., they should read data in object storage directly via functions such as the [s3Cluster function](https://clickhouse.com/docs/en/sql-reference/table-functions/s3Cluster). The query is configurable per provider.\n\n## Datasets\n\n### 1 trillion weather measurements\n\nAvailable at `s3://coiled-datasets-rp/1trc`. The bucket is requester pays, so the requester is billed for access. It can be queried as shown below:\n\n```sql\nSELECT * FROM s3Cluster('default','https://coiled-datasets-rp.s3.us-east-1.amazonaws.com/1trc/measurements-*.parquet', '\u003cAWS_ACCESS_KEY_ID\u003e', '\u003cAWS_SECRET_ACCESS_KEY\u003e', headers('x-amz-request-payer' = 'requester'))\n```\n\nTo avoid data transfer costs, ensure you query from `us-east-1`.\n\n## Examples\n\nFor an example, see [ClickHouse and The One Trillion Row Challenge](https://clickhouse.com/blog/clickhouse-1-trillion-row-challenge). This queries 1 trillion rows for $0.56 in S3.\n\n## Credit\n\nThe original work was inspired by https://github.com/coiled/1trc, which in turn was inspired by Gunnar Morling's [one billion row challenge](https://github.com/gunnarmorling/1brc).\n\n## Contributing\n\nContributions are welcome to improve the code for a provider. This can include making providers more flexible or ensuring resources are deployed and destroyed faster.\n\nFor simplicity, we ask that all orchestration code be written in Pulumi.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclickhouse%2F1trc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclickhouse%2F1trc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclickhouse%2F1trc/lists"}