We use a software load balancer to route traffic between many of our services. This lets us completely isolate backend changes from our REST clients. The same load balancer infrastructure fronts both Elasticsearch and Apache Hadoop, so REST clients can interact with both without needing multiple connection points.
An example of the load balancing infrastructure is as follows:
The load balancer connects to Elasticsearch client or data nodes, depending on the size of the cluster. This prevents a single point of failure when connecting to our Elasticsearch infrastructure.
Elasticsearch and the new Java High Level REST Client
One of our REST clients uses the new Java High Level REST Client to connect via our load balancing infrastructure. The high level client works much better than the low level client for simple operations. Furthermore, we have reduced the number of firewall ports that need to be exposed to users compared to the older transport client. There have been very few issues moving from the transport client to the Java High Level REST Client. The Java High Level REST Client documentation and supported APIs keep improving with each release.
Elasticsearch Java High Level REST Client Scroll API and Load Balancing
The Elasticsearch Scroll API lets a client retrieve a large result set in batches. The client issues an initial search and then repeated scroll requests, one per batch of results, until no more results are returned. Because a scroll involves multiple requests, the Java High Level REST Client must be configured correctly or only the first request will work.
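The request pattern can be sketched without a cluster. The following is a minimal, self-contained simulation that uses no Elasticsearch classes; FakeScrollServer, Page, and the "scroll-1" ID are hypothetical stand-ins. It only illustrates why a scroll is several round trips: one initial search that returns a scroll ID and the first batch, then repeated scroll requests carrying that ID until an empty batch comes back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a cluster; not part of any Elasticsearch library.
class ScrollSketch {

    // One response: the scroll ID to send with the next request, plus a batch of hits.
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    // Simulated server that remembers how far the scroll has advanced.
    static final class FakeScrollServer {
        private final List<String> docs;
        private final int batchSize;
        private int offset = 0;
        int requestCount = 0;

        FakeScrollServer(List<String> docs, int batchSize) {
            this.docs = docs;
            this.batchSize = batchSize;
        }

        // First round trip: opens the scroll and returns the first batch.
        Page search() {
            offset = 0;
            return nextPage();
        }

        // Follow-up round trips: the client sends back the scroll ID it was given.
        Page scroll(String scrollId) {
            if (!"scroll-1".equals(scrollId)) {
                throw new IllegalArgumentException("unknown scroll ID");
            }
            return nextPage();
        }

        private Page nextPage() {
            requestCount++;
            int end = Math.min(offset + batchSize, docs.size());
            List<String> hits = new ArrayList<>(docs.subList(offset, end));
            offset = end;
            return new Page("scroll-1", hits);
        }
    }

    // Drains a scroll: keep requesting until a batch comes back empty.
    static List<String> fetchAll(FakeScrollServer server) {
        List<String> all = new ArrayList<>();
        Page page = server.search();
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            page = server.scroll(page.scrollId);
        }
        return all;
    }

    public static void main(String[] args) {
        FakeScrollServer server = new FakeScrollServer(List.of("a", "b", "c", "d", "e"), 2);
        System.out.println(fetchAll(server) + " in " + server.requestCount + " requests");
    }
}
```

Every one of those round trips has to reach the cluster through the same route, which is why a client setting that affects request routing matters more for scrolls than for single-request operations.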
For our load balancer infrastructure, we rely on URL path prefixes to handle all requests behind the same domain. This saves us from requesting a new domain for each new load balancer endpoint. The URL path prefix must be configured when the Java High Level REST Client is initialized; otherwise the requests will fail.
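To make the routing concrete, assume a hypothetical load balancer at lb.example.com that forwards everything under /es to Elasticsearch, and a hypothetical index named logs. Every request path then needs the prefix prepended, including the follow-up scroll requests. This standard-library-only sketch shows the resulting URLs; the withPrefix helper is illustrative and is not a client API.

```java
import java.net.URI;

// Illustrative only: shows how a path prefix changes the URLs a client must call.
class PathPrefixSketch {

    // Joins prefix and endpoint with exactly one slash between them.
    static String withPrefix(String prefix, String endpoint) {
        String p = prefix.replaceAll("^/+|/+$", ""); // strip leading and trailing slashes
        String e = endpoint.replaceAll("^/+", "");   // strip leading slashes
        return p.isEmpty() ? "/" + e : "/" + p + "/" + e;
    }

    // Builds the full URL for an endpoint behind the load balancer.
    static URI resolve(String base, String prefix, String endpoint) {
        return URI.create(base + withPrefix(prefix, endpoint));
    }

    public static void main(String[] args) {
        String base = "https://lb.example.com"; // hypothetical load balancer domain
        String prefix = "/es";                  // hypothetical path prefix

        // Initial search that opens the scroll.
        System.out.println(resolve(base, prefix, "/logs/_search?scroll=1m"));
        // Follow-up scroll requests use a different endpoint and need the prefix too.
        System.out.println(resolve(base, prefix, "/_search/scroll"));
    }
}
```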
The Java High Level REST Client initialization documentation states that you must first build a Java Low Level REST Client. The Java Low Level REST Client documentation does not explain how to pass the load balancer path prefix, and without the prefix Scroll API requests fail.
The trick is to configure the Java Low Level REST Client with the Elastic RestClientBuilder class. This class takes care of building the actual RestClient. The method setPathPrefix allows setting the prefix that the load balancer requires. With this in place, the Scroll API requests now work correctly.
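A minimal sketch of that configuration follows. The host lb.example.com, port 443, and prefix /es are placeholders rather than our real values, and PrefixedClientFactory is a hypothetical wrapper class. The code needs the Elasticsearch REST client library and Apache HttpComponents on the classpath.

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;

// Hypothetical wrapper class; only RestClient, RestClientBuilder and HttpHost are library types.
class PrefixedClientFactory {

    // Returns a builder whose clients prepend pathPrefix to every request path.
    static RestClientBuilder prefixedBuilder(String host, int port, String scheme, String pathPrefix) {
        return RestClient.builder(new HttpHost(host, port, scheme))
                .setPathPrefix(pathPrefix);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder endpoint values; substitute the load balancer's own.
        RestClientBuilder builder = prefixedBuilder("lb.example.com", 443, "https", "/es");

        // The low level client is Closeable, so try-with-resources releases its connections.
        try (RestClient lowLevelClient = builder.build()) {
            // Hand lowLevelClient (or the builder, depending on the client version)
            // to the RestHighLevelClient constructor here.
        }
    }
}
```

How the high level client is then constructed depends on the client version: as far as we can tell, early releases accept the built RestClient while later ones accept the RestClientBuilder itself, so check the Javadoc for the release you are using.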
Using the Elastic Java High Level REST Client sometimes requires understanding how the Java Low Level REST Client works. The Elastic Java REST client Javadoc (low level and high level) can be very helpful in determining what features are available where official documentation examples are lacking.