Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/elgca/spark-clickhouse-connector
- Host: GitHub
- URL: https://github.com/elgca/spark-clickhouse-connector
- Owner: elgca
- Created: 2021-04-03T14:15:03.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-04-06T13:37:02.000Z (over 3 years ago)
- Last Synced: 2024-10-15T02:05:34.586Z (2 months ago)
- Size: 2.93 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# spark-clickhouse-connector
Read from and write to a ClickHouse cluster in parallel.
![image](https://user-images.githubusercontent.com/22080060/113481081-27627280-94ca-11eb-8855-ec66fbc4bef1.png)
# Read
![image](https://user-images.githubusercontent.com/22080060/113481254-09494200-94cb-11eb-8c3f-304d5c9e9624.png)
`Actual number of partitions = number of scan partitions * number of shards`
Every scan partition is assigned to each shard, and each shard reads its data from ClickHouse using the filter conditions of that scan partition.
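
The partition count above can be illustrated with a small sketch. This is not the connector's code: the function name `plan_read_partitions`, the shard names, and the representation of a scan partition as a `WHERE` condition are all assumptions made for illustration.

```python
from itertools import product


def plan_read_partitions(scan_predicates, shards):
    """Pair every scan partition with every shard (a cross product).

    Each resulting read partition queries one shard, restricted by the
    filter condition of one scan partition.
    """
    return [
        {"index": i, "shard": shard, "predicate": predicate}
        for i, (predicate, shard) in enumerate(product(scan_predicates, shards))
    ]


# Hypothetical inputs: two scan partitions (as WHERE conditions), three shards.
scan_predicates = ["id % 2 = 0", "id % 2 = 1"]
shards = ["shard-1", "shard-2", "shard-3"]

partitions = plan_read_partitions(scan_predicates, shards)
print(len(partitions))  # 2 scan partitions * 3 shards
```

With these inputs the plan contains six read partitions, one for each (scan partition, shard) pair.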
# Write
![image](https://user-images.githubusercontent.com/22080060/113481211-cbe4b480-94ca-11eb-83f5-c869c273d069.png)
Each write partition is allocated to one shard, and its data is written to a ClickHouse instance that is reachable within that shard.
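
A minimal sketch of that allocation follows. The README does not say how a partition is mapped to a shard, so round-robin by partition index is an assumption here, as are the function name `pick_write_target`, the host names, and the `is_reachable` callback.

```python
def pick_write_target(partition_id, shards, is_reachable):
    """Choose the shard and ClickHouse instance for one write partition.

    shards maps a shard name to the list of its instances.
    Assumption: shards are chosen round-robin by partition index.
    """
    shard_names = sorted(shards)
    shard = shard_names[partition_id % len(shard_names)]
    # Write to the first instance of the shard that can be reached.
    for instance in shards[shard]:
        if is_reachable(instance):
            return shard, instance
    raise RuntimeError(f"no reachable ClickHouse instance in {shard}")


# Hypothetical cluster: two shards with two instances each, one instance down.
shards = {
    "shard-1": ["ch-1a:8123", "ch-1b:8123"],
    "shard-2": ["ch-2a:8123", "ch-2b:8123"],
}
down = {"ch-2a:8123"}

targets = [
    pick_write_target(p, shards, lambda instance: instance not in down)
    for p in range(4)
]
```

Here partitions 0 and 2 go to `shard-1`, partitions 1 and 3 go to `shard-2`, and writes to `shard-2` skip the unreachable instance.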