# Tutorial on ZooKeeper – Part 2: Installation and Configuration

The last tutorial introduced some concepts and terminology. To investigate and use ZooKeeper further, we move on to the next step: installing and configuring ZooKeeper.

## Installation

ZooKeeper is very simple to install.

### Pre-requisites

See System Requirements in the Admin guide.

### Download the Source Code and Install

To get a ZooKeeper distribution, download a recent stable release from one of the Apache Download Mirrors.

On Ubuntu, you can also install ZooKeeper directly from Debian packages.

## Configuration

ZooKeeper can run in one of two modes: standalone mode or replicated mode.

### Standalone Mode

Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration.

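To create it, copy the template conf/zoo_sample.cfg that ships with the distribution to conf/zoo.cfg, then open zoo.cfg to review its settings.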
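As a rough guide to what the file contains, the sketch below writes a configuration modeled on the shipped template into a scratch directory. The ZK_CONF_DIR variable and the scratch location are assumptions for illustration, and the comments and values in your release may differ.

```shell
# ZK_CONF_DIR is a stand-in for the conf/ directory of your install (assumption).
ZK_CONF_DIR="${ZK_CONF_DIR:-$(mktemp -d)}"

# Write a configuration modeled on the template shipped with ZooKeeper.
cat > "$ZK_CONF_DIR/zoo.cfg" <<'EOF'
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between sending a request
# and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored
dataDir=/tmp/zookeeper
# The port at which the clients will connect
clientPort=2181
EOF

# Show the file that was written.
cat "$ZK_CONF_DIR/zoo.cfg"
```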

Change the value of dataDir to specify an existing (empty to start with) directory.
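One way to make that change from the command line is sketched below. It works on a throwaway copy of the file in a scratch directory; WORK_DIR, CFG, and NEW_DATA_DIR are hypothetical names, so substitute your real conf/zoo.cfg and data directory.

```shell
# Scratch area standing in for a real install (assumption).
WORK_DIR="$(mktemp -d)"
CFG="$WORK_DIR/zoo.cfg"
NEW_DATA_DIR="$WORK_DIR/zookeeper-data"   # hypothetical data directory

# A minimal config to edit; in practice this is your existing zoo.cfg.
printf 'tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n' > "$CFG"

# The directory must exist (and start out empty) before the server runs.
mkdir -p "$NEW_DATA_DIR"

# Rewrite only the dataDir line, going through a temp file for portability.
sed "s|^dataDir=.*|dataDir=$NEW_DATA_DIR|" "$CFG" > "$CFG.tmp" && mv "$CFG.tmp" "$CFG"

# Confirm the new value.
grep '^dataDir=' "$CFG"
```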

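For standalone mode, only the following three fields are needed and meaningful:

• tickTime: the basic time unit, in milliseconds, that ZooKeeper uses for heartbeats and timeouts
• dataDir: the directory where ZooKeeper stores its snapshots and, unless configured otherwise, its transaction log
• clientPort: the port on which the server listens for client connections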

Now that you have created the configuration file, you can start ZooKeeper with the bin/zkServer.sh script that comes with the distribution.
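A minimal sketch of starting the server follows. ZK_HOME and its default path are assumptions; point it at the directory where you unpacked ZooKeeper. The guard simply avoids an error when the script is not where the sketch expects it.

```shell
# ZK_HOME is hypothetical: set it to your ZooKeeper install directory.
ZK_HOME="${ZK_HOME:-/opt/zookeeper}"

if [ -x "$ZK_HOME/bin/zkServer.sh" ]; then
    # Start the server in the background; it reads conf/zoo.cfg by default.
    "$ZK_HOME/bin/zkServer.sh" start
    START_RESULT="started"
else
    echo "zkServer.sh not found under $ZK_HOME; set ZK_HOME first" >&2
    START_RESULT="skipped"
fi
```

The same script also accepts stop and status, which is handy for checking that the server actually came up.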

### Replicated Mode

Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing, but in production you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode all servers in the quorum have copies of the same configuration file.

Every machine that is part of the ZooKeeper ensemble should know about every other machine in the ensemble. You accomplish this with a series of lines of the form server.id=host:port:port. The parameters host and port are straightforward. You assign the server id to each machine by creating a file named myid, one for each server, which resides in that server's data directory, as specified by the configuration file parameter dataDir.

The configuration file is similar to the one used in standalone mode, but with a few differences: it adds the entries initLimit and syncLimit, plus one server.id line per machine, for example server.1=zoo1:2888:3888 (zoo1 is a placeholder hostname). You can find the meanings of these and other configuration settings in the section Configuration Parameters of the Admin guide.
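A replicated configuration for a three-server ensemble might look like the sketch below, which writes the file into a scratch directory. The hostnames zoo1, zoo2, and zoo3, the dataDir path, and the CONF_DIR variable are placeholders rather than values from a real deployment.

```shell
# CONF_DIR stands in for the conf/ directory on each server (assumption).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# The same file is copied to every server in the ensemble.
cat > "$CONF_DIR/zoo.cfg" <<'EOF'
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
EOF

# List the ensemble members declared in the file.
grep '^server\.' "$CONF_DIR/zoo.cfg"
```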

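For the meanings of the new entries, initLimit and syncLimit, please refer to the comments in the zoo.cfg file from the Standalone Mode section. In short, both are timeouts measured in ticks of tickTime: initLimit bounds how long a server may take to connect to the leader, and syncLimit bounds how far out of date a server may be from the leader. For example, with tickTime=2000 and initLimit=5, the connection timeout is 5 × 2000 ms = 10 seconds.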

The entries of the form server.X list the servers that make up the ZooKeeper service. When a server starts up, it knows which server it is by looking for the file myid in the data directory. This file, whose usage I will show in the next tutorial, is important and indispensable. It contains the server number, a cluster-unique ZooKeeper instance id (1-255) written in ASCII, and it must match the X in server.X on the left-hand side of this setting.
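Creating the myid file takes a single command per server. The sketch below uses a scratch directory in place of the real dataDir; DATA_DIR and SERVER_ID are illustrative names, and on a real machine SERVER_ID must equal that machine's X in server.X.

```shell
# DATA_DIR stands in for the dataDir value from zoo.cfg (assumption).
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"
SERVER_ID=1   # must match the X of this machine's server.X entry

mkdir -p "$DATA_DIR"

# The file holds nothing but the id, as plain ASCII text.
echo "$SERVER_ID" > "$DATA_DIR/myid"

cat "$DATA_DIR/myid"
```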

The list of ZooKeeper servers that clients use must match the list of servers that each ZooKeeper server has in its own configuration.

Finally, note the two port numbers after each server name: 2888 and 3888. Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.
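To make the layout of a server entry concrete, the following sketch splits one into its parts. The entry itself (hostname zoo1) and the variable names are illustrative only.

```shell
# An example entry; zoo1 is a placeholder hostname.
entry="server.1=zoo1:2888:3888"

key="${entry%%=*}"           # text before "=": server.1
value="${entry#*=}"          # text after "=": zoo1:2888:3888
server_id="${key#server.}"   # the id that must also appear in myid

# Split host:port:port on ":" (a here-document keeps this POSIX sh compatible).
IFS=: read -r host peer_port election_port <<EOF
$value
EOF

echo "id=$server_id host=$host"
echo "follower-to-leader port: $peer_port"
echo "leader election port:    $election_port"
```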

In the next tutorial, I will give a concrete example of how to set up replicated mode, that is, a cluster of ZooKeeper servers (also known as an ensemble), starting from scratch.