Tutorial on ZooKeeper – Part 3: Setup ZooKeeper Cluster

In the last tutorial, I introduced how to install and configure ZooKeeper in standalone mode and replicated mode. Now I will give you a concrete example of how to set up replicated mode (also known as a ZooKeeper Cluster) starting from scratch.

Pre-requisites

Before the deployment, please see System Requirements in the Admin guide and install the required software, especially Java.

ZooKeeper Deployment Topology

A few things have to be clear before we start deploying ZooKeeper.

  1. The number of ZooKeeper servers that you want to form your ZooKeeper Cluster.

    From the first tutorial, we know that an odd number of machines is best for a cluster. In my example, I use 5 servers to form the cluster, which tolerates 2 failures.

  2. About the boxes/physical machines that you want to deploy on.

    Of course, you can install all your ZooKeeper servers on a single box. But for better high availability (HA), you should not put all the ZooKeeper servers into one single basket: if that box goes down, the whole cluster goes down with it.

    To achieve the highest probability of tolerating a failure you should try to make machine failures independent. For example, if most of the machines share the same switch, failure of that switch could cause a correlated failure and bring down the service. The same holds true of shared power circuits, cooling systems, etc.

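    In my example, I deploy the cluster on 2 boxes with IPs 192.168.0.100 and 192.168.0.101. Keep in mind that this layout is for demonstration only: with 3 of the 5 servers on 192.168.0.100, losing that box leaves only 2 servers, which is fewer than the majority of 3 needed for a quorum.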

  3. Ports on every server, including client port (default 2181), quorum port (default 2888) and leader election port (default 3888).

    In order to deploy several servers on a single box, all three ports have to be different for every server on that box.

    Below is my configuration, which will be set in zoo.cfg on every server.

    | Server ID | Host          | Client Port | Quorum Port | Leader Election Port |
    | 1         | 192.168.0.100 | 2181        | 2888        | 3888                 |
    | 2         | 192.168.0.100 | 2182        | 2889        | 3889                 |
    | 3         | 192.168.0.100 | 2183        | 2890        | 3890                 |
    | 4         | 192.168.0.101 | 2181        | 2888        | 3888                 |
    | 5         | 192.168.0.101 | 2182        | 2889        | 3889                 |

  4. Where to put data (attribute dataDir in zoo.cfg) for ZooKeeper Server?

    dataDir is the location where ZooKeeper stores the in-memory database snapshots (persistent copies of the znodes) and, unless specified otherwise, the transaction log of updates to the database.

    As changes are made to the znodes, these changes are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem. This snapshot supersedes all previous logs.

    In my example, I use /var/lib/zookeeper/zookeeper-[id]/data as the data home.

  5. Any performance improvement settings?

    As the Admin guide says:

    If ZooKeeper has to contend with other applications for access to resources like storage media, CPU, network, or memory, its performance will suffer markedly. ZooKeeper has strong durability guarantees, which means it uses storage media to log changes before the operation responsible for the change is allowed to complete. You should be aware of this dependency then, and take great care if you want to ensure that ZooKeeper operations aren’t held up by your media. Here are some things you can do to minimize that sort of degradation:

    • ZooKeeper’s transaction log must be on a dedicated device. (A dedicated partition is not enough.) ZooKeeper writes the log sequentially, without seeking. Sharing your log device with other processes can cause seeks and contention, which in turn can cause multi-second delays. For this setting, please refer to the attribute dataLogDir in zoo.cfg, which allows a dedicated log device to be used and helps avoid competition between logging and snapshots. You can find more here.

    • Do not put ZooKeeper in a situation that can cause a swap. In order for ZooKeeper to function with any sort of timeliness, it simply cannot be allowed to swap. Therefore, make certain that the maximum heap size given to ZooKeeper is not bigger than the amount of real memory available to ZooKeeper. For more on this, see Things to Avoid.

    Having a dedicated log device has a large impact on throughput and on the stability of latencies. It is highly recommended to dedicate a log device, set dataLogDir to point to a directory on that device, and then make sure that dataDir points to a directory that does not reside on that device.
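
To make the sizing rule from item 1 more concrete, here is a small shell sketch of the majority arithmetic. The function names quorum_size and tolerated_failures are my own and are not part of ZooKeeper; the sketch only assumes that a quorum is a strict majority of the configured servers.

```shell
#!/bin/sh
# Majority quorum needed for an ensemble of N servers: floor(N/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of server failures the ensemble can survive: N minus the quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes.
for n in 3 4 5 6; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated=$(tolerated_failures "$n")"
done
```

The loop shows that 4 servers tolerate no more failures than 3, and 6 no more than 5, which is why an even number of servers buys nothing extra.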

ZooKeeper Cluster (Multi-Server) Setup

Let’s begin installation and configuration of ZooKeeper.

Step 1: Create Directory Structure

  • on host 192.168.0.100

    ip-192-168-0-100:~$ mkdir -p /var/lib/zookeeper/zookeeper-1 /var/lib/zookeeper/zookeeper-2 /var/lib/zookeeper/zookeeper-3
    ip-192-168-0-100:~$ mkdir -p /var/lib/zookeeper/zookeeper-1/data /var/lib/zookeeper/zookeeper-2/data /var/lib/zookeeper/zookeeper-3/data
    ip-192-168-0-100:~$ mkdir -p /var/lib/zookeeper/zookeeper-1/log /var/lib/zookeeper/zookeeper-2/log /var/lib/zookeeper/zookeeper-3/log
  • on host 192.168.0.101

    ip-192-168-0-101:~$ mkdir -p /var/lib/zookeeper/zookeeper-4 /var/lib/zookeeper/zookeeper-5
    ip-192-168-0-101:~$ mkdir -p /var/lib/zookeeper/zookeeper-4/data /var/lib/zookeeper/zookeeper-5/data
    ip-192-168-0-101:~$ mkdir -p /var/lib/zookeeper/zookeeper-4/log /var/lib/zookeeper/zookeeper-5/log
  • Let’s take a look at the directory structure created above

    ip-192-168-0-100:~$ tree /var/lib/zookeeper

    /var/lib/zookeeper
    ├── zookeeper-1
    │   ├── data
    │   └── log
    ├── zookeeper-2
    │   ├── data
    │   └── log
    └── zookeeper-3
        ├── data
        └── log

    ip-192-168-0-101:~$ tree /var/lib/zookeeper

    /var/lib/zookeeper
    ├── zookeeper-4
    │   ├── data
    │   └── log
    └── zookeeper-5
        ├── data
        └── log
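
If you prefer not to type every path by hand, the same layout can be produced with a small loop. This is only a sketch: make_zk_dirs is a hypothetical helper of my own, and it assumes the base directory is writable by the current user.

```shell
#!/bin/sh
# Create data/ and log/ for each server ID under a base directory.
# make_zk_dirs is a hypothetical helper, not part of ZooKeeper.
# Usage: make_zk_dirs <base dir> <server id>...
make_zk_dirs() {
    base=$1
    shift
    for id in "$@"; do
        # -p also creates the parent zookeeper-<id> directory.
        mkdir -p "$base/zookeeper-$id/data" "$base/zookeeper-$id/log"
    done
}

# On 192.168.0.100: make_zk_dirs /var/lib/zookeeper 1 2 3
# On 192.168.0.101: make_zk_dirs /var/lib/zookeeper 4 5
```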

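Step 2: Create the myid File (the ZooKeeper Server ID)

The myid file resides in the ZooKeeper data directory (dataDir). It contains a single line with the server’s ID, which must match the N of that server’s server.N entry in zoo.cfg.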

  • on host 192.168.0.100

    ip-192-168-0-100:~$ echo 1 > /var/lib/zookeeper/zookeeper-1/data/myid
    ip-192-168-0-100:~$ echo 2 > /var/lib/zookeeper/zookeeper-2/data/myid
    ip-192-168-0-100:~$ echo 3 > /var/lib/zookeeper/zookeeper-3/data/myid
  • on host 192.168.0.101

    ip-192-168-0-101:~$ echo 4 > /var/lib/zookeeper/zookeeper-4/data/myid
    ip-192-168-0-101:~$ echo 5 > /var/lib/zookeeper/zookeeper-5/data/myid
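
A wrong or missing myid is an easy mistake to make, so it can be worth double-checking the files. The following is a minimal sketch; check_myid is a hypothetical helper of my own, and it assumes the directory layout from Step 1.

```shell
#!/bin/sh
# Succeed only if <base>/zookeeper-<id>/data/myid exists and contains <id>.
# check_myid is a hypothetical helper, not part of ZooKeeper.
check_myid() {
    base=$1
    id=$2
    file="$base/zookeeper-$id/data/myid"
    [ -f "$file" ] && [ "$(cat "$file")" = "$id" ]
}

# Example on 192.168.0.100:
# for id in 1 2 3; do
#     check_myid /var/lib/zookeeper "$id" || echo "myid problem for server $id"
# done
```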

Step 3: Download a Stable ZooKeeper Release and Copy the Files

I use version 3.4.6 of ZooKeeper as an example.

  • on host 192.168.0.100

    ip-192-168-0-100:~$ cd /opt
    ip-192-168-0-100:/opt$ wget http://apache.arvixe.com/zookeeper/stable/zookeeper-3.4.6.tar.gz
    ip-192-168-0-100:/opt$ tar -zxf zookeeper-3.4.6.tar.gz
    ip-192-168-0-100:/opt$ cp -r ./zookeeper-3.4.6/* /var/lib/zookeeper/zookeeper-1/
    ip-192-168-0-100:/opt$ cp -r ./zookeeper-3.4.6/* /var/lib/zookeeper/zookeeper-2/
    ip-192-168-0-100:/opt$ cp -r ./zookeeper-3.4.6/* /var/lib/zookeeper/zookeeper-3/
  • on host 192.168.0.101

    ip-192-168-0-101:~$ cd /opt
    ip-192-168-0-101:/opt$ wget http://apache.arvixe.com/zookeeper/stable/zookeeper-3.4.6.tar.gz
    ip-192-168-0-101:/opt$ tar -zxf zookeeper-3.4.6.tar.gz
    ip-192-168-0-101:/opt$ cp -r ./zookeeper-3.4.6/* /var/lib/zookeeper/zookeeper-4/
    ip-192-168-0-101:/opt$ cp -r ./zookeeper-3.4.6/* /var/lib/zookeeper/zookeeper-5/

Step 4: Configure Every ZooKeeper Server

The configuration file zoo.cfg is located at /var/lib/zookeeper/zookeeper-[id]/conf/zoo.cfg.

Here I will show you how to configure ZooKeeper server 1; the same procedure works for the other 4 servers.

ip-192-168-0-100:~$ cp /var/lib/zookeeper/zookeeper-1/conf/zoo_sample.cfg /var/lib/zookeeper/zookeeper-1/conf/zoo.cfg
ip-192-168-0-100:~$ vim /var/lib/zookeeper/zookeeper-1/conf/zoo.cfg

Edit the file so that it looks like the configuration below (the values to change are clientPort, dataDir and dataLogDir). For the five server.N entries at the end, please refer to replicated mode in tutorial 2 for more explanation, and to the port settings in #3 of ZooKeeper Deployment Topology.

# The number of milliseconds of each tick
tickTime=2000

# The number of ticks that the initial synchronization phase can take
initLimit=10

# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5

# the directory where the snapshot is stored.
# Choose appropriately for your environment
dataDir=/var/lib/zookeeper/zookeeper-1/data/

# the port at which the clients will connect
clientPort=2181

# the directory where transaction log is stored.
# this parameter provides dedicated log device for ZooKeeper
dataLogDir=/var/lib/zookeeper/zookeeper-1/log/

# ZooKeeper server and its port no.
# ZooKeeper ensemble should know about every other machine in the ensemble
# specify server id by creating 'myid' file in the dataDir
# use hostname instead of IP address for convenient maintenance
server.1=192.168.0.100:2888:3888
server.2=192.168.0.100:2889:3889
server.3=192.168.0.100:2890:3890
server.4=192.168.0.101:2888:3888
server.5=192.168.0.101:2889:3889

Next, please perform the same steps with the appropriate values (clientPort, dataDir, dataLogDir) for the remaining 4 ZooKeeper servers.
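
Since only three values differ between the five files, the per-server zoo.cfg can also be generated. Below is a minimal sketch under the assumptions of this tutorial (paths, IPs and quorum/election ports as above); gen_zoo_cfg is a hypothetical helper of my own, and the client port you pass in must be unique among the servers on the same box.

```shell
#!/bin/sh
# Print a zoo.cfg for one server to stdout.
# gen_zoo_cfg is a hypothetical helper, not part of ZooKeeper.
# Usage: gen_zoo_cfg <server id> <client port>
gen_zoo_cfg() {
    id=$1
    client_port=$2
    cat <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/zookeeper-$id/data/
clientPort=$client_port
dataLogDir=/var/lib/zookeeper/zookeeper-$id/log/
server.1=192.168.0.100:2888:3888
server.2=192.168.0.100:2889:3889
server.3=192.168.0.100:2890:3890
server.4=192.168.0.101:2888:3888
server.5=192.168.0.101:2889:3889
EOF
}

# Example on 192.168.0.100 (server 2 with client port 2182):
# gen_zoo_cfg 2 2182 > /var/lib/zookeeper/zookeeper-2/conf/zoo.cfg
```

The server.N block is identical in every file; only dataDir, dataLogDir and clientPort change from server to server.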

Step 5: Configure the ZooKeeper Logger for Deployment

The following are the default values in conf/log4j.properties. They are geared toward development, so update them to suit your environment and needs.

zookeeper.root.logger=INFO, CONSOLE
zookeeper.console.threshold=INFO
zookeeper.log.dir=.
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
zookeeper.tracelog.file=zookeeper_trace.log
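
As far as I know, the zkServer.sh script of the 3.4.x line passes its own root logger and log directory to the JVM, taken from the environment variables ZOO_LOG4J_PROP and ZOO_LOG_DIR, so setting these before the start is one way to move away from console logging. Treat the following as a sketch: the log directory is a made-up example path, and ROLLINGFILE refers to the appender that I assume is still defined in your log4j.properties.

```shell
# Assumption: /var/log/zookeeper/zookeeper-1 is a hypothetical log directory.
export ZOO_LOG_DIR=/var/log/zookeeper/zookeeper-1

# Log at INFO level to the ROLLINGFILE appender instead of the console.
export ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

# Then start the server from the same shell:
# ./bin/zkServer.sh start
```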

Step 6: Start the ZooKeeper Cluster

Once the zoo.cfg file has been created for all 5 servers, we can start the ZooKeeper Cluster by starting every ZooKeeper server.

First, let’s start server 1:

ip-192-168-0-100:/opt$ cd /var/lib/zookeeper/zookeeper-1/
ip-192-168-0-100:zookeeper-1$ ./bin/zkServer.sh start
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-1/conf/zoo.cfg
Starting zookeeper ... STARTED

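Now, go ahead and start the remaining 4 ZooKeeper servers. To see more information, tail the zookeeper.out file, which by default is written to the directory you started the server from (here /var/lib/zookeeper/zookeeper-[id]/). Until a majority of the servers (3 of 5) is up, the log will show connection errors for the peers that have not been started yet; this is expected.

After all 5 servers have started, we use the status command to check the status of every ZooKeeper server: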

ip-192-168-0-100:zookeeper-1$ ./bin/zkServer.sh status
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-1/conf/zoo.cfg
Mode: follower
ip-192-168-0-100:zookeeper-1$ cd ../zookeeper-2
ip-192-168-0-100:zookeeper-2$ ./bin/zkServer.sh status
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-2/conf/zoo.cfg
Mode: follower
ip-192-168-0-100:zookeeper-2$ cd ../zookeeper-3
ip-192-168-0-100:zookeeper-3$ ./bin/zkServer.sh status
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-3/conf/zoo.cfg
Mode: leader
ip-192-168-0-101:zookeeper-4$ ./bin/zkServer.sh status
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-4/conf/zoo.cfg
Mode: follower
ip-192-168-0-101:zookeeper-4$ cd ../zookeeper-5
ip-192-168-0-101:zookeeper-5$ ./bin/zkServer.sh status
JMX enabled by default
Using config: /var/lib/zookeeper/zookeeper-5/conf/zoo.cfg
Mode: follower
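
The same check can be done remotely over the client port with the "srvr" four-letter command, whose output contains a Mode: line. The sketch below assumes that nc is installed and that the four-letter commands are enabled (newer ZooKeeper versions may require whitelisting them); zk_mode and zk_query are hypothetical helpers of my own, and some nc variants need an extra timeout option to return.

```shell
#!/bin/sh
# Extract the value of the "Mode:" line from text read on stdin.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Ask one server for its mode via the "srvr" four-letter command.
# zk_query is a hypothetical helper. Usage: zk_query <host> <client port>
zk_query() {
    echo srvr | nc "$1" "$2" | zk_mode
}

# Example for the three servers on 192.168.0.100:
# for port in 2181 2182 2183; do
#     echo "192.168.0.100:$port -> $(zk_query 192.168.0.100 "$port")"
# done
```

A healthy cluster should report exactly one leader, with all other servers as followers.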

Reference