I already understand that Elastic Search is supposed to be deployed in a distributed topology, in that you can have multiple nodes for a cluster of ES instances.
I like the API, and it looks promising. But I wonder how Elastic Search solves some standard problems in distributed computing. My “architectural” questions are motivated out of trying to understand whether I ought to commit to using the product.
Consider membership problem: If I add a node to the cluster, how does Elastic Search cluster know to add that node wants to join? Or what if a node goes down (or the network to the data center it is in) gets too flaky. Is it using some sort of gossip protocol like Amazon Dynamo does?
Consider the consensus problem: if I POST a document to one node to the cluster, how does the change get propagated to the other nodes? Of course, I understand that the document has to be indexed, so I cannot expect a search query to find it immediately. Does ES use a Paxos
algorithm? Raft
algorithm?
UPDATE: A nice discussion of the split-brain problem in Elastic Search.
5
There are also some tweaks you can apply on server configuration, in order to reduce split-brain problems in large(ish) clusters, namely the so-called discovery.zen.*
parameters.
There is not that much information about the inner workings and algorithms of ES, but what I found during my evaluation was the reference documentation about the server setup.
In particular, a nice thing about ES is that almost all the server configuration is done through a single file, elasticsearch.yml
, under the ./config
dir of your installation.
So, I would start with the setup configuration and I would go on opening the actual .yml on your installation. It’s commented really well, and goes into explaining a lot of things (including the “zen” parameters used to solve split-brain problems) and cross-links to web pages.
There is not an incredible amount of detail in the online Elasticsearch: The Definitive Guide, but suffice to say, with respect to:
- Cluster membership: the oldest node becomes the “master” and it appears some sort of gossip protocol is used when nodes join and leave the cluster.
- A leadership election does occur when the master leaves the cluster (which I tried for myself). Elastic Search does not state how they pull this off, and they want to leave this encapsulated for their users.