{"id":31897226,"url":"https://github.com/aleklukanen/chapterhousedb-v1","last_synced_at":"2026-02-27T23:16:27.486Z","repository":{"id":260656619,"uuid":"805808374","full_name":"alekLukanen/ChapterhouseDB-v1","owner":"alekLukanen","description":"Allows you to create simple data streaming warehouses written in Golang using Apache Parquet and Arrow.","archived":false,"fork":false,"pushed_at":"2025-05-10T15:28:12.000Z","size":194,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-13T11:18:18.408Z","etag":null,"topics":["arrow","data","database","event","golang","ingestion","parquet","pipeline","processing","stream"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alekLukanen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-05-25T14:15:33.000Z","updated_at":"2025-07-02T03:47:18.000Z","dependencies_parsed_at":"2024-11-01T16:52:05.563Z","dependency_job_id":"5f97a8d3-7e48-41a6-80f0-9e7f1a8273f2","html_url":"https://github.com/alekLukanen/ChapterhouseDB-v1","commit_stats":null,"previous_names":["aleklukanen/chapterhousedb","aleklukanen/chapterhousedb-v1"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/alekLukanen/ChapterhouseDB-v1","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alekLukanen%2FChapterhouseDB-v1","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alekLukanen%2FChapterhouseDB-v1/tags","releas
es_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alekLukanen%2FChapterhouseDB-v1/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alekLukanen%2FChapterhouseDB-v1/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alekLukanen","download_url":"https://codeload.github.com/alekLukanen/ChapterhouseDB-v1/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alekLukanen%2FChapterhouseDB-v1/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279014766,"owners_count":26085593,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arrow","data","database","event","golang","ingestion","parquet","pipeline","processing","stream"],"created_at":"2025-10-13T11:18:16.338Z","updated_at":"2025-10-13T11:18:18.879Z","avatar_url":"https://github.com/alekLukanen.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ChapterhouseDB\nAn extensible data ingestion warehouse built on Apache Arrow and Apache Parquet.\nInstead of relying on external data ingestion systems you implement the data warehouse that\nmeets the needs of your specific use case. 
This package provides you with a set\nof primitive patterns that allow you to build a data warehouse using Golang. The focus of this package\nis on the implementation of the data ingestion process. You define tables with a set \nof columns and how those tables are partitioned. Once you insert data into your warehouse,\nthis package manages the distribution of data across a set of workers connected to a \ncommon Redis-compatible key-value database and an S3-compatible object store. Data is \nprocessed in parallel across your workers, allowing you to scale up workers as needed. \nOnly one worker can process a table partition at any given time, but if you increase\nthe number of partitions to a sufficiently large number then it is unlikely you \nwill ever limit your ability to process data in parallel. Tables should be partitioned using \na unique identifier since the partition key also ensures uniqueness of rows in the table. \nCurrently integer range and string hash partitioning are available. \nYou can have at most 2^32-1 partitions.\n\nAn example application can be found [here](https://github.com/alekLukanen/ChapterhouseDB-example-app).  \n\n## View KeyDB\n\nUse the redis-commander tool:\n```\nredis-commander\n```\n\n## Querying Local Files With DuckDB\n\n```sql\ncreate secret locals3mock3 (\n  TYPE S3,\n  KEY_ID \"key\",\n  SECRET \"secret\",\n  ENDPOINT \"localhost:9090\",\n  URL_STYLE \"path\",\n  USE_SSL false\n);\n\nselect * from 's3://default/chdb/table-state/part-data/table1/0/d_2_0.parquet';\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faleklukanen%2Fchapterhousedb-v1","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faleklukanen%2Fchapterhousedb-v1","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faleklukanen%2Fchapterhousedb-v1/lists"}