The last tutorial introduced some concepts and terminology. To investigate and use ZooKeeper further, we move to the next step: installing and configuring it.
ZooKeeper is very simple to install.
See System Requirements in the Admin guide.
Download the Source Code and Install
To get a ZooKeeper distribution, download a recent stable release from one of the Apache Download Mirrors.
$ cd /opt
On Ubuntu, you can also install ZooKeeper directly from Debian packages.
$ apt-cache search zookeeper
ZooKeeper can run in two modes: standalone mode and replicated mode.
Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration.
$ cd /opt/zookeeper-3.4.6/conf
Then open the file zoo.cfg. You will see:
# The number of milliseconds of each tick
Change the value of dataDir to specify an existing directory (empty to start with).
For standalone mode, only three fields are needed and meaningful: tickTime, dataDir, and clientPort.
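As a minimal sketch, a standalone configuration with just these three fields could be written as follows. The scratch directory and the dataDir path are assumptions for illustration, and 2000 and 2181 are the values found in ZooKeeper's sample configuration; adjust all of them to your own setup.

```shell
# Write a minimal standalone zoo.cfg into a scratch directory (assumed path).
demo_dir="${TMPDIR:-/tmp}/zk-standalone-demo"
mkdir -p "$demo_dir/data"

cat > "$demo_dir/zoo.cfg" <<EOF
# The number of milliseconds of each tick
tickTime=2000
# Directory where snapshots are stored (must already exist)
dataDir=$demo_dir/data
# Port on which clients connect
clientPort=2181
EOF

cat "$demo_dir/zoo.cfg"
```

In a real installation the file would live in the conf directory of the ZooKeeper distribution rather than in a scratch folder.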
Now that you have created the configuration file, you can start ZooKeeper with bin/zkServer.sh start.
Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing.
You can find the meanings of these and other configuration settings in the section Configuration Parameters. A word though about a few here:
Every machine that is part of the ZooKeeper ensemble should know about every other machine in the ensemble. You accomplish this with a series of lines of the form server.id=host:port:port. The parameters host and port are straightforward. You attribute the server id to each machine by creating a file named myid, one for each server, which resides in that server's data directory, as specified by the configuration file parameter dataDir.
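The myid step can be sketched as follows for a three-server ensemble. The scratch directory layout (one data directory per server under a demo folder) is an assumption for illustration; on real machines each server would write only its own id into its own dataDir.

```shell
# Create one data directory per server and write its id into a myid file.
# The base directory is a hypothetical scratch location, not a ZooKeeper default.
base_dir="${TMPDIR:-/tmp}/zk-myid-demo"

for id in 1 2 3; do
  data_dir="$base_dir/server$id/data"
  mkdir -p "$data_dir"
  # myid holds only the server id, as plain ASCII text
  echo "$id" > "$data_dir/myid"
done

# Show what each server would read from its data directory
for id in 1 2 3; do
  printf 'server%s myid: %s\n' "$id" "$(cat "$base_dir/server$id/data/myid")"
done
```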
In production, however, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file. The file is similar to the one used in standalone mode, but with a few differences. Here is an example:
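A minimal sketch of such a file, assuming three hypothetical hosts named zoo1, zoo2, and zoo3 and an assumed data directory, might look like this. The initLimit and syncLimit values are illustrative, not recommendations, and the output path is a scratch location.

```shell
# Write an example replicated-mode zoo.cfg to a scratch location (assumed path).
cfg="${TMPDIR:-/tmp}/zk-replicated-demo/zoo.cfg"
mkdir -p "$(dirname "$cfg")"

cat > "$cfg" <<'EOF'
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
# The number of ticks that the initial synchronization phase can take
initLimit=5
# The number of ticks that can pass between sending a request and getting an acknowledgement
syncLimit=2
# server.X=host:port:port (host names are hypothetical)
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
EOF

cat "$cfg"
```

The same file would be copied unchanged to every server; only the myid file differs between machines.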
For the meanings of the new entries, initLimit and syncLimit, please refer to the comments in the zoo.cfg file shown under Standalone Mode.
The entries of the form server.X list the servers that make up the ZooKeeper service. When a server starts up, it knows which server it is by looking for the file myid in the data directory. This file, whose usage I will show in the next tutorial, is quite IMPORTANT and INDISPENSABLE. It contains the server number in ASCII, which is a cluster-unique ZooKeeper instance id (1-255), and it should match the X in server.X on the left-hand side of this setting.
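One way to sanity-check this match is a small script that reads myid and looks for the corresponding server.X line. This is a sketch, not a ZooKeeper tool: the function name check_myid, the scratch paths, and the host names are all made up for illustration.

```shell
# check_myid CFG: succeed if the myid in CFG's dataDir has a matching server.X entry.
check_myid() {
  cfg_file="$1"
  # Read dataDir from the config, then the id stored in its myid file
  data_dir="$(sed -n 's/^dataDir=//p' "$cfg_file")"
  my_id="$(cat "$data_dir/myid")"
  if grep -q "^server\.$my_id=" "$cfg_file"; then
    echo "myid $my_id matches: $(grep "^server\.$my_id=" "$cfg_file")"
  else
    echo "myid $my_id has no server.$my_id entry in $cfg_file" >&2
    return 1
  fi
}

# Demo with a scratch config and data directory (hypothetical values)
demo="${TMPDIR:-/tmp}/zk-check-demo"
mkdir -p "$demo/data"
echo 2 > "$demo/data/myid"
printf 'dataDir=%s\nserver.1=zoo1:2888:3888\nserver.2=zoo2:2888:3888\n' \
  "$demo/data" > "$demo/zoo.cfg"

check_myid "$demo/zoo.cfg"
```

Running a check like this before starting each server catches the common mistake of copying a data directory, myid included, from one machine to another.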
The list of ZooKeeper servers used by the clients must match the list of ZooKeeper servers that each ZooKeeper server has.
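To keep the client-side list in step with the servers, the client connection string can be derived from the same server.X entries. The sketch below assumes every server listens on the same client port and uses hypothetical host names; the result is a comma-separated host:port list, the form in which ZooKeeper clients take their server list.

```shell
# Build a client connection string (host:port,host:port,...) from server.X lines.
cfg="${TMPDIR:-/tmp}/zk-connect-demo.cfg"
cat > "$cfg" <<'EOF'
clientPort=2181
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
EOF

client_port="$(sed -n 's/^clientPort=//p' "$cfg")"

# Take the host part of each server.X entry, append the client port, join with commas
connect_string="$(sed -n 's/^server\.[0-9]*=\([^:]*\):.*/\1/p' "$cfg" \
  | sed "s/\$/:$client_port/" \
  | paste -sd, -)"

echo "$connect_string"
```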
Finally, note the two port numbers after each server name: 2888 and 3888. Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.
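To make the roles of the two ports concrete, this sketch splits one server entry into its parts. The entry itself is a hypothetical example, and the variable names are illustrative only.

```shell
# Split a server.X entry into id, host, peer port, and leader-election port.
entry="server.1=zoo1:2888:3888"

server_id="${entry%%=*}"          # text before '=': "server.1"
server_id="${server_id#server.}"  # strip the "server." prefix: "1"
address="${entry#*=}"             # text after '=': "zoo1:2888:3888"

# Split the address on ':' into its three fields
IFS=: read -r host peer_port election_port <<EOF
$address
EOF

echo "id=$server_id host=$host"
echo "peer port (followers connect to the leader): $peer_port"
echo "election port (leader election): $election_port"
```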
In the next tutorial, I will give an explicit example of how to set up replicated mode, that is, a cluster of ZooKeeper servers (also known as an ensemble), starting from scratch.