:: krowemoh

2025-10-23
Notes on Zookeeper

solr, search

Zookeeper is a distributed tool to manage configuration for distributed services. I'm not sure what its actual use case is, but I need it for Solr, so these are my notes on getting it set up.

I'm using a single machine to run 3 zookeeper instances. I could possibly run this in standalone mode, but that seems to be largely a development thing. Ideally I would run this on multiple machines, but 3 instances on a single machine is a good enough approximation.

Installation

Like solr, zookeeper just needs java 11.

sudo yum install java-11-openjdk-devel

Download zookeeper. I'm doing this in /opt, which is where the paths later in these notes point:

wget https://dlcdn.apache.org/zookeeper/zookeeper-3.9.4/apache-zookeeper-3.9.4-bin.tar.gz

Untar the file:

tar xvf apache-zookeeper-3.9.4-bin.tar.gz

Rename the folder and copy it twice. This way we will have 3 copies of zookeeper.

mv apache-zookeeper-3.9.4-bin zookeeper_1

cp -R zookeeper_1 zookeeper_2
cp -R zookeeper_1 zookeeper_3

Now we have 3 copies of zookeeper. Each folder is its own instance.

Setup

Now we need to create the data directories and do some housekeeping.

mkdir /var/lib/zookeeper_1/
mkdir /var/lib/zookeeper_2/
mkdir /var/lib/zookeeper_3/

We also need to add a file called myid in each data directory that has the id of the server. The id has to match the number in that server's server.N line in the config.

echo "1" > /var/lib/zookeeper_1/myid
echo "2" > /var/lib/zookeeper_2/myid
echo "3" > /var/lib/zookeeper_3/myid
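
The two steps above can be folded into one loop. This is only a sketch: setup_data_dirs is a name I made up, not something zookeeper ships, and it takes the parent directory as an argument so it can be tried against a scratch directory before touching /var/lib.

```shell
#!/bin/sh
# setup_data_dirs is a hypothetical helper, not part of zookeeper.
# It creates the data directory and myid file for instances 1..3.
#   $1 = parent directory (/var/lib in these notes)
setup_data_dirs() {
    root="$1"
    for id in 1 2 3; do
        mkdir -p "$root/zookeeper_$id" || return 1
        # myid holds only the server id, matching server.<id> in zoo.cfg
        echo "$id" > "$root/zookeeper_$id/myid" || return 1
    done
}
```

Running it against the real location should give the same result as the commands above:

setup_data_dirs /var/lib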

Configuring Zookeeper

Now we can update the zookeeper configuration. In each instance/directory, we will need to create a zoo.cfg file.

The first file will be /opt/zookeeper_1/conf/zoo.cfg:

tickTime=2000
dataDir=/var/lib/zookeeper_1
clientPort=2181
initLimit=5
syncLimit=2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

The second file will be /opt/zookeeper_2/conf/zoo.cfg:

tickTime=2000
dataDir=/var/lib/zookeeper_2
clientPort=2182
initLimit=5
syncLimit=2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

The third file will be /opt/zookeeper_3/conf/zoo.cfg:

tickTime=2000
dataDir=/var/lib/zookeeper_3
clientPort=2183
initLimit=5
syncLimit=2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

The dataDir and clientPort change in each instance, but otherwise the configs are the same. Because all three instances share localhost, each server line also needs its own pair of ports.
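
Since only two values differ between the three files, they could be generated instead of written by hand. The following is a rough sketch under my own assumptions: write_zoo_cfg is a hypothetical helper, and the clientPort is derived as 2180 plus the id only because that happens to match the ports used above.

```shell
#!/bin/sh
# write_zoo_cfg is a hypothetical helper: it writes zoo.cfg for one instance.
#   $1 = instance id (1, 2 or 3)
#   $2 = install parent (/opt in these notes)
#   $3 = data parent (/var/lib in these notes)
write_zoo_cfg() {
    id="$1"; install_root="$2"; data_root="$3"
    conf_dir="$install_root/zookeeper_$id/conf"
    mkdir -p "$conf_dir" || return 1
    # In server.N=host:A:B, A is the port followers use to reach the
    # leader and B is the port used for leader election.
    cat > "$conf_dir/zoo.cfg" <<EOF
tickTime=2000
dataDir=$data_root/zookeeper_$id
clientPort=$((2180 + $id))
initLimit=5
syncLimit=2
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
EOF
}
```

Something like this should then produce the three files shown above:

for id in 1 2 3; do write_zoo_cfg "$id" /opt /var/lib; done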

Starting Zookeeper

Now with the config files ready to go and everything set up, we can start zookeeper.

In each directory of zookeeper, we run the start command:

cd /opt/zookeeper_1
bin/zkServer.sh start

We should then see:

/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper_1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

We then repeat this for each directory.
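
Repeating the command by hand gets old quickly, so here is a small sketch of a loop that does it. run_all is a made-up name, and the function assumes the three instance directories sit next to each other under one parent, as they do in these notes.

```shell
#!/bin/sh
# run_all is a hypothetical helper: it runs one zkServer.sh command
# (start, stop, status) in every instance directory.
#   $1 = install parent (/opt in these notes)
#   $2 = command to pass to zkServer.sh
run_all() {
    install_root="$1"; cmd="$2"
    for id in 1 2 3; do
        # Run from inside the instance directory, as in the manual steps.
        (cd "$install_root/zookeeper_$id" && bin/zkServer.sh "$cmd") || return 1
    done
}
```

Starting everything is then one line, and passing status instead of start should report whether each instance ended up as a leader or a follower:

run_all /opt start
run_all /opt status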

If you want to stop zookeeper, this can be done by running the stop command:

cd /opt/zookeeper_1
bin/zkServer.sh stop

Once all the instances are running we can take a look at the listeners:

netstat -tulpn | grep java
tcp6       0      0 127.0.0.1:2887          :::*                    LISTEN      29307/java
tcp6       0      0 127.0.0.1:3887          :::*                    LISTEN      29307/java
tcp6       0      0 127.0.0.1:3889          :::*                    LISTEN      29475/java
tcp6       0      0 127.0.0.1:3888          :::*                    LISTEN      29381/java
tcp6       0      0 :::41097                :::*                    LISTEN      29475/java
tcp6       0      0 :::45543                :::*                    LISTEN      29381/java
tcp6       0      0 :::2183                 :::*                    LISTEN      29475/java
tcp6       0      0 :::2182                 :::*                    LISTEN      29381/java
tcp6       0      0 :::2181                 :::*                    LISTEN      29307/java
tcp6       0      0 :::8080                 :::*                    LISTEN      29307/java
tcp6       0      0 :::38913                :::*                    LISTEN      29307/java

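There are a number of things listening. Some are obviously tied to the configuration, but there are some extra ports that I don't recognize. 8080 is the jetty web server, which zookeeper uses for its AdminServer. Only the first instance has it, presumably because the other two couldn't bind the same port. The other high ports are probably being used for some communication, possibly JMX given the "JMX enabled by default" message at startup.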
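
To make it easier to see which instance owns which port, the netstat output can be boiled down to pid and port pairs. This is just a sketch: ports_by_pid is a name I invented, and it assumes the column layout shown above, with the local address in the 4th column and pid/program in the 7th.

```shell
#!/bin/sh
# ports_by_pid is a hypothetical helper: it reads netstat -tulpn style
# lines on stdin and prints "pid port" pairs, sorted by pid then port.
ports_by_pid() {
    awk '/LISTEN/ {
        n = split($4, addr, ":")   # the port is the last colon-separated piece
        split($7, proc, "/")       # "29307/java" becomes "29307"
        print proc[1], addr[n]
    }' | sort -k1,1n -k2,2n
}
```

It would be used at the end of the same pipeline:

netstat -tulpn | grep java | ports_by_pid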

Errors

If you run into errors, there are logs located under the logs directory in each zookeeper directory.

This was handy, as I did run into errors with my configuration being invalid. I was missing the initLimit field and also didn't have the myid file set up. The error for the missing myid is not very helpful, as it doesn't make clear that the file was missing or even required.
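
Both of those mistakes can be caught before starting anything. Below is a minimal pre-flight sketch: check_zoo_cfg is a hypothetical helper of my own and covers only initLimit, syncLimit and the myid file, so it is nowhere near a full validation of the config.

```shell
#!/bin/sh
# check_zoo_cfg is a hypothetical pre-flight check for the mistakes above.
#   $1 = path to a zoo.cfg
# Returns 0 when nothing is missing, 1 otherwise.
check_zoo_cfg() {
    cfg="$1"
    status=0
    for key in initLimit syncLimit; do
        if ! grep -q "^$key=" "$cfg"; then
            echo "$cfg: missing $key"
            status=1
        fi
    done
    # Pull the dataDir value out of the config, then look for myid in it.
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg")
    if [ ! -f "$data_dir/myid" ]; then
        echo "$cfg: no myid file in '$data_dir'"
        status=1
    fi
    return "$status"
}
```

Run against one of the configs from earlier, it should print nothing when all is well:

check_zoo_cfg /opt/zookeeper_1/conf/zoo.cfg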

Conclusion

Similar to solr, zookeeper is a java application that I can simply download and run. This was great. However, configuring it and setting it up isn't always as clear as it could be. I also don't understand what zookeeper is and why solr requires it, but I'm guessing it's tied to solr being spread across multiple servers.

I wonder what the solution is for people who only really have one server. It seems a bit much to expect this much setup for a single server.