Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Deploy a simple Multi-Node Clickhouse Cluster with docker-compose in minutes.
- Host: GitHub
- URL: https://github.com/n0vad3v/simple-multinode-clickhouse-cluster
- Owner: n0vad3v
- License: gpl-3.0
- Created: 2022-02-11T04:17:00.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2022-02-11T04:57:51.000Z (almost 3 years ago)
- Last Synced: 2024-11-11T09:43:45.202Z (5 days ago)
- Topics: clickhouse, clickhouse-cluster, clickhouse-pool, clickhouse-server, docker-compose
- Language: Python
- Homepage:
- Size: 140 KB
- Stars: 17
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Simple Multi Node Clickhouse Cluster
![](./topo.png)
I hate those single-node ClickHouse "clusters" and manual installation. I mean, why should we:

* Run multiple ClickHouse instances inside one docker-compose file (e.g. [tetafro/clickhouse-cluster](https://github.com/tetafro/clickhouse-cluster))?
* Manually install a JRE, download zookeeper.tar.gz, and hand-edit those annoying config files like hell, as some Chinese blogs/books do (e.g. [ClickHouse cluster multi-instance deployment](https://blog.csdn.net/ashic/article/details/105901792), [2021 big-data ClickHouse tutorial for beginners](https://www.bilibili.com/video/BV1yf4y1M7gw?p=8))? This is just weird!

So this repo tries to solve these problems.
## Note
* This is a simplified model of a multi-node ClickHouse cluster; it lacks load-balancer configuration, automated failover, and multi-shard config generation.
* All ClickHouse data is persisted under `event-data`. If you need to move ClickHouse to some other directory, you'll just need to move the directory (the one that contains `docker-compose.yml`) and run `docker-compose up -d` to fire it up again.
* `Host` network mode is used to simplify the whole deploy procedure, so you might need to create additional firewall rules if you are running this on a publicly accessible machine, for example:
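A minimal sketch with `ufw` on Ubuntu, assuming the default incoming policy is deny and your cluster lives on 192.168.33.0/24 (the ports below are the ClickHouse and ZooKeeper defaults; adjust the subnet and ports to your setup):

```bash
# 9000 = ClickHouse native protocol, 8123 = HTTP interface,
# 9009 = interserver replication, 2181/2888/3888 = ZooKeeper client/peer/election.
for port in 9000 8123 9009 2181 2888 3888; do
  ufw allow from 192.168.33.0/24 to any port "$port" proto tcp
done
```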
## Prerequisites

To use this, you need `docker` and `docker-compose` installed; the recommended OS is `ubuntu`, and it's also recommended to install `clickhouse-client` on each machine. On a typical Ubuntu server, the following should be sufficient:
```bash
apt update
curl -fsSL https://get.docker.com -o get-docker.sh && sh get-docker.sh && rm -f get-docker.sh
apt install docker-compose clickhouse-client -y
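
# Quick sanity check that everything is in place (versions will vary):
docker --version
docker-compose --version
clickhouse-client --version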
```

## Usage
1. Clone this repo
2. Edit the necessary server info in `topo.yml`
3. Run `python3 generate.py`
4. Your cluster info should be in the `cluster` directory now
5. Sync those files to related nodes and run `docker-compose up -d` on them
6. Your cluster is ready to go.

If you still cannot understand what I'm saying above, see the example below.
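Condensed into shell commands, the whole flow looks roughly like this (host IPs and paths are placeholders taken from the example below):

```bash
git clone https://github.com/n0vad3v/simple-multinode-clickhouse-cluster.git
cd simple-multinode-clickhouse-cluster
# edit topo.yml with your ZooKeeper/ClickHouse hosts, then:
python3 generate.py    # writes per-host configs into ./cluster/
rsync -aP ./cluster/192.168.33.101/ [email protected]:/root/ch/   # repeat per node
# then on each node: cd /root/ch && docker-compose up -d
```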
## Example Usage
### Edit information
I've cloned the repo and would like to set up a 3-master ClickHouse cluster with the following specs:
* 3 replicas (one replica on each node)
* 1 shard only

![](./demo-cluster.png)
So I need to edit the `topo.yml` as follows:
```yml
global:
  clickhouse_image: "yandex/clickhouse-server:21.3.2.5"
  zookeeper_image: "bitnami/zookeeper:3.6.1"

zookeeper_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_topology:
  - clusters:
      - name: "novakwok_cluster"
        shards:
          - name: "novakwok_shard"
            servers:
              - host: 192.168.33.101
              - host: 192.168.33.102
              - host: 192.168.33.103
```

### Generate Config
After running `python3 generate.py`, a structure is generated under the `cluster` directory that looks like this:
```
➜ simple-multinode-clickhouse-cluster git:(master) ✗ python3 generate.py
Write clickhouse-config.xml to cluster/192.168.33.101/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.102/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.103/clickhouse-config.xml

➜  simple-multinode-clickhouse-cluster git:(master) ✗ tree cluster
cluster
├── 192.168.33.101
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
├── 192.168.33.102
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
└── 192.168.33.103
    ├── clickhouse-config.xml
    ├── clickhouse-user-config.xml
    └── docker-compose.yml

3 directories, 9 files
```

### Sync Config
Now we need to sync those files to the related hosts (of course, you can use ansible here):
```
rsync -aP ./cluster/192.168.33.101/ [email protected]:/root/ch/
rsync -aP ./cluster/192.168.33.102/ [email protected]:/root/ch/
rsync -aP ./cluster/192.168.33.103/ [email protected]:/root/ch/
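
# Or loop over the hosts (same commands, just less typing):
for ip in 192.168.33.101 192.168.33.102 192.168.33.103; do
  rsync -aP "./cluster/${ip}/" "root@${ip}:/root/ch/"
done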
```

### Start Cluster
Now run `docker-compose up -d` in the `/root/ch/` directory on every host.
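A minimal session on one node might look like this (assuming the files were synced to `/root/ch/` as above; in this example each node runs both a ClickHouse and a ZooKeeper container):

```bash
cd /root/ch
docker-compose up -d
docker-compose ps   # the clickhouse and zookeeper containers should show "Up"
```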
### Validation
On 192.168.33.101, use `clickhouse-client` to connect to the local instance and check whether the cluster is there.
```
root@192-168-33-101:~/ch# clickhouse-client
ClickHouse client version 18.16.1.
Connecting to localhost:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-101 :) SELECT * FROM system.clusters;
SELECT *
FROM system.clusters

┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name──────┬─host_address───┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ novakwok_cluster │ 1 │ 1 │ 1 │ 192.168.33.101 │ 192.168.33.101 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ novakwok_cluster │ 1 │ 1 │ 2 │ 192.168.33.102 │ 192.168.33.102 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ novakwok_cluster │ 1 │ 1 │ 3 │ 192.168.33.103 │ 192.168.33.103 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │
│ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴────────────────┴────────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
↘ Progress: 13.00 rows, 1.58 KB (4.39 thousand rows/s., 532.47 KB/s.)
13 rows in set. Elapsed: 0.003 sec.
```

Let's create a database on the cluster:
```
192-168-33-101 :) create database novakwok_test on cluster novakwok_cluster;

CREATE DATABASE novakwok_test ON CLUSTER novakwok_cluster
┌─host───────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 192.168.33.103 │ 9000 │ 0 │ │ 2 │ 0 │
│ 192.168.33.101 │ 9000 │ 0 │ │ 1 │ 0 │
│ 192.168.33.102 │ 9000 │ 0 │ │ 0 │ 0 │
└────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
← Progress: 3.00 rows, 174.00 B (16.07 rows/s., 931.99 B/s.) 99%
3 rows in set. Elapsed: 0.187 sec.

192-168-33-101 :) show databases;
SHOW DATABASES
┌─name──────────┐
│ default │
│ novakwok_test │
│ system │
└───────────────┘
↑ Progress: 3.00 rows, 479.00 B (855.61 rows/s., 136.61 KB/s.)
3 rows in set. Elapsed: 0.004 sec.
```
Connect to another host to see if it's really working.
```
root@192-168-33-101:~/ch# clickhouse-client -h 192.168.33.102
ClickHouse client version 18.16.1.
Connecting to 192.168.33.102:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-102 :) show databases;
SHOW DATABASES
┌─name──────────┐
│ default │
│ novakwok_test │
│ system │
└───────────────┘
↘ Progress: 3.00 rows, 479.00 B (623.17 rows/s., 99.50 KB/s.)
3 rows in set. Elapsed: 0.005 sec.
```
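To check all three nodes at once, a quick loop from any machine with `clickhouse-client` installed works too (hosts as in the example above):

```bash
# Each node should list novakwok_test once the ON CLUSTER DDL has propagated.
for ip in 192.168.33.101 192.168.33.102 192.168.33.103; do
  echo "== ${ip} =="
  clickhouse-client -h "${ip}" --query "SHOW DATABASES"
done
```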
## License
GPL-3.0