Running SolrCloud with ZooKeeper

SolrCloud and ZooKeeper

You may know that in order to run SolrCloud (the distributed Solr installation) you need to have Apache ZooKeeper installed. ZooKeeper is a centralized service for maintaining configuration information, naming, and providing distributed synchronization. SolrCloud uses ZooKeeper to synchronize configuration and cluster state (such as elected shard leaders), which is why it is crucial to have a highly available and fault-tolerant ZooKeeper installation. If you have a single ZooKeeper instance and it fails, your SolrCloud cluster will go down with it. So this recipe will show you how to install ZooKeeper so that it’s not a single point of failure in your cluster configuration.
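To put a number on "fault tolerant": a ZooKeeper ensemble keeps working only while a majority of its servers are up and can reach each other. The small sketch below works out that arithmetic for a few ensemble sizes. The function names are my own and purely illustrative; they are not part of ZooKeeper.

```shell
#!/bin/sh
# Majority arithmetic for a ZooKeeper ensemble of N servers.
# quorum_size and tolerated_failures are illustrative helper names.

# Smallest number of servers that forms a majority of N.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still alive.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three servers survive the loss of one, while two servers survive the loss of none, which is why ensembles are normally built with an odd number of machines.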

Getting started

The installation instructions in this article cover ZooKeeper version 3.4.3, but they should be usable for any minor release of Apache ZooKeeper. To download ZooKeeper, please go to http://zookeeper.apache.org/releases.html. This article will show you how to install ZooKeeper in a Linux-based environment. You also need Java installed, as we have done in previous articles.
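If you prefer the command line to the download page, something along these lines may work. This is only a sketch: the archive URL pattern is my assumption about where old releases live, so check it against the releases page, and zk_archive_url and zk_unpack are hypothetical helper names. The unpack helper drops the top-level directory inside the tarball so that bin and conf end up directly under the target directory.

```shell
#!/bin/sh
# Sketch: fetch and unpack a ZooKeeper release tarball.
# ASSUMPTION: old releases are kept under archive.apache.org/dist/zookeeper/
# and the tarball is named zookeeper-<version>.tar.gz. Verify before use.

# Print the assumed download URL for a given version, e.g. 3.4.3.
zk_archive_url() {
    echo "https://archive.apache.org/dist/zookeeper/zookeeper-$1/zookeeper-$1.tar.gz"
}

# Unpack <tarball> into <dest>, stripping the top-level
# zookeeper-<version>/ directory so files land in <dest>/bin, <dest>/conf, ...
zk_unpack() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2" --strip-components=1
}

# Example (not run here; needs network access and write permission):
#   curl -fLO "$(zk_archive_url 3.4.3)"
#   zk_unpack zookeeper-3.4.3.tar.gz /path/to/install/dir
#   java -version    # confirm that Java is available
```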

How it’s done

The procedure is fairly simple. I’m assuming you have multiple servers, each with its own IP address. It doesn’t matter whether these are virtual or bare-metal machines; the principle stays the same.
Let’s assume that we decided to install ZooKeeper in the

/usr/share/zookeeper

directory of our server and we want to have three servers (with IP addresses 192.168.1.1, 192.168.1.2, and 192.168.1.3) hosting the distributed ZooKeeper installation. After downloading the ZooKeeper installation, we create the necessary directory:

    1. sudo mkdir /usr/share/zookeeper
    2. Then we unpack the downloaded archive into the newly created directory. We repeat both steps on all three servers.
    3. Next we need to change our ZooKeeper configuration and specify the servers that will form the ZooKeeper quorum, so we edit the
      /usr/share/zookeeper/conf/zoo.cfg

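      file and add the following entries:

      clientPort=2181
      dataDir=/usr/share/zookeeper/data
      tickTime=2000
      initLimit=10
      syncLimit=5
      server.1=192.168.1.1:2888:3888
      server.2=192.168.1.2:2888:3888
      server.3=192.168.1.3:2888:3888

      As you can see, every instance uses the same two ports: 2888, which followers use to connect to the leader, and 3888, which is used for leader election. This is essential for the servers to be able to find their companions. Each server also needs a file named myid in its dataDir that contains only its own number (1, 2, or 3), matching its server.N entry above.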
      
    4. Now we can start ZooKeeper by running the following command on each of the three servers:
      /usr/share/zookeeper/bin/zkServer.sh start

      If everything went well you should see something like the following:

      JMX enabled by default
      Using config: /usr/share/zookeeper/bin/../conf/zoo.cfg
      Starting zookeeper ... STARTED
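Since the configuration is identical on every machine and only the server number differs, the configuration step can be scripted. Here is a minimal sketch, assuming the same ports and timing values as above. write_zk_config is a hypothetical helper of my own, not something that ships with ZooKeeper. It writes conf/zoo.cfg and also the data/myid file, which holds the number of this machine's own server.N entry so that ZooKeeper knows which entry it is.

```shell
#!/bin/sh
# Sketch: write zoo.cfg and the per-server myid file for one ensemble member.
# write_zk_config is a hypothetical helper, not part of ZooKeeper.
# Usage: write_zk_config <zk_home> <my_id> <ip_of_server_1> <ip_of_server_2> ...

write_zk_config() (
    # The function body runs in a subshell, so these variables stay local.
    zk_home=$1
    my_id=$2
    shift 2

    mkdir -p "$zk_home/conf" "$zk_home/data"

    {
        echo "clientPort=2181"
        echo "dataDir=$zk_home/data"
        echo "tickTime=2000"
        echo "initLimit=10"
        echo "syncLimit=5"
        id=1
        for ip in "$@"; do
            # 2888: followers connect to the leader; 3888: leader election
            echo "server.$id=$ip:2888:3888"
            id=$((id + 1))
        done
    } > "$zk_home/conf/zoo.cfg"

    # myid must contain the N of this machine's own server.N line.
    echo "$my_id" > "$zk_home/data/myid"
)

# Example for the second server (run as a user who can write to the directory):
#   write_zk_config /usr/share/zookeeper 2 192.168.1.1 192.168.1.2 192.168.1.3
```

Once all three servers are started, running /usr/share/zookeeper/bin/zkServer.sh status on each one should report whether that server is currently the leader or a follower.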

Congratulations! You have successfully installed and configured the ZooKeeper ensemble for your SolrCloud service! In the next article we’ll start crawling the web (or a part of it)!
