https://github.com/threecommaio/k8s-zookeeper
Fork of kubernetes contrib repo for zookeeper stateful set
- Host: GitHub
- URL: https://github.com/threecommaio/k8s-zookeeper
- Owner: threecommaio
- License: other
- Created: 2019-01-17T18:37:24.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-01-17T19:04:46.000Z (over 7 years ago)
- Last Synced: 2024-12-28T07:26:35.229Z (over 1 year ago)
- Language: Shell
- Size: 8.79 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Kubernetes ZooKeeper K8SZK
This project contains a Docker image meant to facilitate the deployment of
[Apache ZooKeeper](https://zookeeper.apache.org/) on [Kubernetes](http://kubernetes.io/) using
[StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/).
## Limitations
1. Scaling is not currently supported. An ensemble's membership cannot be updated safely
in ZooKeeper 3.4.10 (the current stable release).
2. Observers are currently not supported. Contributions are welcome.
3. Persistent Volumes must be used. emptyDirs will likely result in a loss of data.
## Docker Image
The Docker image contained in this repository is based on Ubuntu 16.04 and uses the latest
release of the OpenJDK JRE based on the 1.8 JVM (JDK 8u111) and the latest stable release of
ZooKeeper, 3.4.10. Ubuntu is a much larger image than BusyBox or Alpine, but those images use
musl or uClibc, which requires a custom version of OpenJDK built against a libc runtime other
than glibc. No vendor of the ZooKeeper software supplies or verifies the software against such a
JVM, and, while Alpine or BusyBox would provide smaller images, we have prioritized a well-known
environment.
The image is built so that the ZooKeeper process runs as a non-root user. By default,
this user is zookeeper. The ZooKeeper package is installed into the /opt/zookeeper directory, all
configuration is symlinked into /usr/etc/zookeeper, and all executables are symlinked into
/usr/bin. The ZooKeeper data directories are contained in /var/lib/zookeeper. This layout is identical to
the RPM distribution that users should be familiar with.
## Configuration
### Headless Service
The ZooKeeper Stateful Set requires a Headless Service to control the network domain for the
ZooKeeper processes. An example configuration is provided below.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: zk-svc
  labels:
    app: zk-svc
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk-svc
```
Note that the Service contains two ports. The server port is used by followers to tail the leader's
event log, and the leader-election port is used by the ensemble to perform leader election.
### Stateful Set
The Stateful Set configuration must match the Headless Service, and it must provide the number of
replicas. In the example below we request a ZooKeeper ensemble of size 3.
**As weighted quorums are not supported, it is imperative that an odd number of replicas be chosen.
Moreover, the number of replicas should be either 1, 3, 5, or 7. Ensembles may be scaled to larger
membership for read fan out, but, as this will adversely impact write performance, careful thought
should be given to selecting a larger value.**
```yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-svc
  replicas: 3
```
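The preference for odd ensemble sizes follows from majority quorum arithmetic: an ensemble of n servers needs n/2 + 1 of them (integer division) to agree, so it tolerates (n - 1)/2 failures. The sketch below prints both numbers for a few sizes. The function names `zk_quorum` and `zk_tolerated_failures` are illustrative and are not part of this image.

```bash
# Illustrative helpers; not shipped with the k8szk image.

# Smallest number of servers that forms a majority of an ensemble of size $1.
zk_quorum() (
  echo $(( $1 / 2 + 1 ))
)

# Number of servers that can fail while a majority remains available.
zk_tolerated_failures() (
  echo $(( ($1 - 1) / 2 ))
)

for n in 3 4 5; do
  echo "replicas=$n quorum=$(zk_quorum "$n") tolerated_failures=$(zk_tolerated_failures "$n")"
done
```

Going from 3 to 4 replicas raises the quorum from 2 to 3 but still tolerates only one failure, which is why an even size adds cost without adding fault tolerance.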
### Container Configuration
The zkGenConfig.sh script generates the ZooKeeper configuration (zoo.cfg), Log4J configuration
(log4j.properties), and JVM configuration (jvm.env). These files are written to the
/opt/zookeeper/conf directory with correct read permissions for the zookeeper user. They are
generated from environment variables that are injected into the container, as in the minimal
example configuration below.
```yaml
containers:
- name: k8szk
  imagePullPolicy: Always
  image: gcr.io/google_samples/k8szk:v3
  ports:
  - containerPort: 2181
    name: client
  - containerPort: 2888
    name: server
  - containerPort: 3888
    name: leader-election
  env:
  - name: ZK_ENSEMBLE
    value: "zk-0;zk-1;zk-2"
  - name: ZK_CLIENT_PORT
    value: "2181"
  - name: ZK_SERVER_PORT
    value: "2888"
  - name: ZK_ELECTION_PORT
    value: "3888"
```
#### Membership Configuration
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:---------|
|ZK_ENSEMBLE|string|N/A|A semicolon-separated list of servers in the ensemble.|
This is a mandatory configuration variable that is used to configure the membership of the
ZooKeeper ensemble. It is also used to prevent data loss during accidental scale operations. The
set can be computed as follows. For each integer in the range [0,replicas), prepend the name of
the Stateful Set followed by a dash to the integer. For the Stateful Set above, the name is zk and we have
3 replicas. For the set {0,1,2} we prepend zk-, giving us zk-0;zk-1;zk-2.
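The rule above can be sketched as a small shell function. The name `zk_ensemble` and its two arguments are hypothetical; the image does not provide such a helper, and the value still has to be set by hand in the container configuration.

```bash
# Illustrative helper; not shipped with the k8szk image.
# Usage: zk_ensemble <stateful-set-name> <replicas>
zk_ensemble() (
  name=$1
  replicas=$2
  ensemble=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Join members with ';', omitting the separator before the first one.
    ensemble="${ensemble:+$ensemble;}$name-$i"
    i=$((i + 1))
  done
  echo "$ensemble"
)

zk_ensemble zk 3   # prints zk-0;zk-1;zk-2
```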
#### Network Configuration
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:--------|
|ZK_CLIENT_PORT|integer|2181|The port on which the server will accept client requests.|
|ZK_SERVER_PORT|integer|2888|The port on which the leader will send events to followers.|
|ZK_ELECTION_PORT|integer|3888|The port on which the ensemble performs leader election.|
|ZK_MAX_CLIENT_CNXNS|integer|60|The maximum number of concurrent client connections that a server in the ensemble will accept.|
The ZK_CLIENT_PORT, ZK_SERVER_PORT, and ZK_ELECTION_PORT must be set to the containerPorts
specified in the container configuration, and the ZK_SERVER_PORT and ZK_ELECTION_PORT
must match the Headless Service configuration. However, if the default values of
the environment variables are used for both the containerPorts and the Headless Service, the
environment variables may be omitted from the configuration.
#### ZooKeeper Time Configuration
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:--------|
|ZK_TICK_TIME|integer|2000|The number of wall clock ms that corresponds to a Tick for the ensemble's internal time.|
|ZK_INIT_LIMIT|integer|5|The number of Ticks that an ensemble member is allowed for performing leader election.|
|ZK_SYNC_LIMIT|integer|10|The number of Ticks by which a follower may lag behind the ensemble's leader.|
#### ZooKeeper Session Configuration
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:--------|
|ZK_MIN_SESSION_TIMEOUT|integer|2 * ZK_TICK_TIME|The minimum session timeout that the ensemble will allow a client to request.|
|ZK_MAX_SESSION_TIMEOUT|integer|20 * ZK_TICK_TIME|The maximum session timeout that the ensemble will allow a client to request.|
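Because the limits and session timeouts are expressed relative to ZK_TICK_TIME, it can help to convert them to wall clock milliseconds. The sketch below does that arithmetic using the documented defaults as fallbacks; the `ticks_to_ms` function is an illustrative name, not something the image provides.

```bash
# Illustrative arithmetic only; the fallback values are the documented defaults.
ZK_TICK_TIME=${ZK_TICK_TIME:-2000}
ZK_INIT_LIMIT=${ZK_INIT_LIMIT:-5}
ZK_SYNC_LIMIT=${ZK_SYNC_LIMIT:-10}

# Convert a number of Ticks ($1) into milliseconds.
ticks_to_ms() (
  echo $(( $1 * ZK_TICK_TIME ))
)

echo "init limit:          $(ticks_to_ms "$ZK_INIT_LIMIT") ms"
echo "sync limit:          $(ticks_to_ms "$ZK_SYNC_LIMIT") ms"
echo "min session timeout: $(ticks_to_ms 2) ms"
echo "max session timeout: $(ticks_to_ms 20) ms"
```

With the defaults, the session timeout a client may request is bounded between 4000 ms and 40000 ms.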
#### Data Retention Configuration
**ZooKeeper does not, by default, purge old transaction logs or snapshots. This can cause
the disk to become full.** If you have backup procedures and retention policies that rely on
external systems, the snapshots can be retrieved manually from the /var/lib/zookeeper/data directory,
and the logs can be retrieved manually from the /var/lib/zookeeper/log directory.
These will be stored on the persistent volume. The zkCleanup.sh script can be used to manually purge
outdated logs and snapshots.
If you do not have an existing retention policy and backup procedure, and if you are comfortable with
an automatic procedure, you can use the environment variables below to enable and configure
automatic data purge policies.
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:---------|
|ZK_SNAP_RETAIN_COUNT|integer|3|The number of snapshots that the ZooKeeper process will retain if ZK_PURGE_INTERVAL is set to a value greater than 0.|
|ZK_PURGE_INTERVAL|integer|0|The delay, in hours, between ZooKeeper log and snapshot cleanups.|
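ZooKeeper itself exposes these settings in zoo.cfg as `autopurge.snapRetainCount` and `autopurge.purgeInterval`. The sketch below shows one way the environment variables could map onto those keys. The mapping and the function name `print_autopurge_config` are assumptions made for illustration, not a copy of zkGenConfig.sh.

```bash
# Assumed mapping from the environment variables to ZooKeeper's autopurge
# settings; the real zkGenConfig.sh may differ.
print_autopurge_config() (
  echo "autopurge.snapRetainCount=${ZK_SNAP_RETAIN_COUNT:-3}"
  echo "autopurge.purgeInterval=${ZK_PURGE_INTERVAL:-0}"
)

# Example: keep 5 snapshots and purge every 12 hours.
# The subshell keeps the example values from leaking into the caller.
(
  ZK_SNAP_RETAIN_COUNT=5
  ZK_PURGE_INTERVAL=12
  print_autopurge_config
)
```

A purge interval of 0, the default, leaves automatic purging disabled.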
#### JVM Configuration
Currently the only supported JVM configuration is the JVM heap size. Be sure that the heap size you
request does not cause the process to swap out.
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:--------|
|ZK_HEAP_SIZE|integer|2|The JVM heap size in Gibibytes.|
#### Log Level Configuration
|Variable|Type|Default|Description|
|:------:|:---:|:-----:|:--------|
|ZK_LOG_LEVEL|enum(TRACE,DEBUG,INFO,WARN,ERROR,FATAL)|INFO|The log level for the ZooKeeper process's logger.|
#### Liveness and Readiness
The zkOk.sh script can be used to check the liveness and readiness of the ZooKeeper process. The example
below demonstrates how to configure liveness and readiness probes for the Pods in the Stateful Set.
```yaml
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - "zkOk.sh"
  initialDelaySeconds: 15
  timeoutSeconds: 5
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - "zkOk.sh"
  initialDelaySeconds: 15
  timeoutSeconds: 5
```
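A health check of this kind can be built on ZooKeeper's `ruok` four-letter command, to which a healthy server replies `imok`. The sketch below is an assumption about how such a check can work, not the source of zkOk.sh. The function name `zk_reply_is_ok` is hypothetical, and the commented `nc` invocation assumes that `nc` is available in the container.

```bash
# Sketch of a health check built on ZooKeeper's "ruok" four-letter command.
# Not the source of zkOk.sh; names here are illustrative.

# Succeeds only if the reply ($1) is exactly "imok".
zk_reply_is_ok() (
  [ "$1" = "imok" ]
)

# Inside the container the reply would come from the client port, e.g.:
#   reply=$(echo ruok | nc 127.0.0.1 "${ZK_CLIENT_PORT:-2181}")
reply="imok"   # stand-in value so the sketch runs without a server
if zk_reply_is_ok "$reply"; then
  echo "healthy"
else
  echo "unhealthy"
fi
```

A probe script would exit with status 0 in the healthy branch and non-zero otherwise, since Kubernetes exec probes treat a zero exit status as success.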
#### Volume Mounts
volumeMounts for the container should be defined as below.
```yaml
volumeMounts:
- name: datadir
  mountPath: /var/lib/zookeeper
```
### Storage Configuration
Currently, the use of Persistent Volumes to provide durable, network-attached storage is mandatory.
**If you use the provided image with emptyDirs, you will likely suffer data loss.** The example
below demonstrates how to request a dynamically provisioned persistent volume of 20 GiB.
```yaml
volumeClaimTemplates:
- metadata:
    name: datadir
    annotations:
      volume.alpha.kubernetes.io/storage-class: anything
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 20Gi
```
### Logging
The log level may be modified via the ZK_LOG_LEVEL environment variable as described
above. However, the location of the log output is not modifiable. The ZooKeeper process must be
run in the foreground, and the log output is written to stdout. This is considered
a best practice for containerized applications, and it allows users to make use of the
log rotation and retention infrastructure that already exists for K8s.
### Metrics
The zkMetrics.sh script can be used to retrieve metrics from the ZooKeeper process and print them to
stdout. A recurring Kubernetes job can be used to gather these metrics and forward them to a
collector.
```bash
bash$ kubectl exec zk-0 zkMetrics.sh
zk_version 3.4.9-1757313, built on 08/23/2016 06:50 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 21
zk_packets_sent 20
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 39
zk_max_file_descriptor_count 1048576
```
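Since the output is one metric per line, with the name followed by whitespace and the value, a single metric can be extracted with `awk`. The helper below is a minimal sketch; the name `zk_metric` is hypothetical and is not part of the image.

```bash
# Illustrative helper; not shipped with the k8szk image.
# Reads "name value" lines on stdin and prints the value of metric $1.
zk_metric() (
  awk -v key="$1" '$1 == key { sub(/^[^ \t]+[ \t]+/, ""); print; exit }'
)

# Sample lines copied from the output above; in a cluster the input would be
#   kubectl exec zk-0 -- zkMetrics.sh | zk_metric zk_server_state
printf 'zk_server_state follower\nzk_znode_count 4\n' | zk_metric zk_server_state
```

Stripping only the first field keeps values that contain spaces, such as `zk_version`, intact.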