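This is a template for scaling an Elasticsearch cluster down to a single node without errors and with minimal downtime. Feel free to post recommendations in the comment section below.
First, run this command to exclude the node(s) you want to remove. The cluster.routing.allocation.exclude._ip setting tells Elasticsearch not to allocate shards to nodes with the listed IP addresses and to move existing shards off them, so the nodes are drained before you shut them down. It accepts a comma-separated list of IP addresses: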
$ curl -XPUT X.X.X.X:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"persistent" :{
"cluster.routing.allocation.exclude._ip" : "Y.Y.Y.Y,Z.Z.Z.Z"
}
}';echo
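If several nodes have to be drained, the request body can be generated from the addresses instead of being typed by hand. This is only a minimal sketch, assuming bash; build_exclude_payload is a hypothetical helper name, not something Elasticsearch provides:

```shell
# Hypothetical helper: build the JSON body for the exclude request
# from the IP addresses passed as arguments.
build_exclude_payload() {
  local ips
  # Join all arguments with commas, e.g. "Y.Y.Y.Y,Z.Z.Z.Z"
  ips=$(IFS=,; echo "$*")
  printf '{"persistent":{"cluster.routing.allocation.exclude._ip":"%s"}}\n' "$ips"
}

# Usage (X.X.X.X is the node that stays, as in the command above):
# curl -XPUT X.X.X.X:9200/_cluster/settings \
#   -H 'Content-Type: application/json' \
#   -d "$(build_exclude_payload Y.Y.Y.Y Z.Z.Z.Z)"
```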
Check the cluster settings to confirm that the nodes are excluded correctly:
$ curl -XGET 'X.X.X.X:9200/_cluster/settings?pretty'
It should look similar to this:
{
"persistent" : {
"cluster" : {
"routing" : {
"allocation" : {
"exclude" : {
"_ip" : "Y.Y.Y.Y,Z.Z.Z.Z"
}
}
}
}
},
"transient" : { }
}
Wait until the following command reports "status" : "green" and "relocating_shards" : 0. This may take some time, depending on the amount of data and the replica count of the indices.
$ curl -XGET 'X.X.X.X:9200/_cluster/health?pretty';
{
"cluster_name" : "cluster-name",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 60,
"active_shards" : 120,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
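Instead of re-running the health command by hand, the check can be put in a loop. The following is a rough sketch, assuming bash with grep available; health_is_settled is a hypothetical helper name, and the 10 second polling interval in the usage comment is an arbitrary choice:

```shell
# Hypothetical helper: reads the _cluster/health JSON on stdin and succeeds
# only when the status is green and no shards are relocating.
health_is_settled() {
  local body
  body=$(cat)
  echo "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"green"' &&
  echo "$body" | grep -Eq '"relocating_shards"[[:space:]]*:[[:space:]]*0([^0-9]|$)'
}

# Usage: poll until the cluster has settled.
# until curl -s 'X.X.X.X:9200/_cluster/health?pretty' | health_is_settled; do
#   sleep 10
# done
```

The pattern matching is deliberately simple and works on both the pretty and the compact form of the response; a JSON-aware tool would be more robust if one is installed.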
After that, edit the elasticsearch.yml file on the last server standing so that no other Elasticsearch node is left in the list. In my case the relevant lines are:
discovery.zen.ping.unicast.hosts: ["X.X.X.X", "Y.Y.Y.Y", "Z.Z.Z.Z"]
discovery.zen.minimum_master_nodes: 2
I've commented out the minimum_master_nodes line and left only one IP in the ping.unicast.hosts list:
discovery.zen.ping.unicast.hosts: ["X.X.X.X"]
#discovery.zen.minimum_master_nodes: 2
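The same edit can be scripted. This is a sketch only, assuming bash and sed, and assuming both settings start at the beginning of a line as in the example above; shrink_discovery_config is a hypothetical helper name, and the path in the usage comment is the usual location for package installs, which may differ on your system:

```shell
# Hypothetical helper: rewrite the two discovery lines in a config file.
# $1 = path to elasticsearch.yml, $2 = IP of the node that stays.
shrink_discovery_config() {
  local file=$1 keep_ip=$2
  # Keep a backup so the original lines can be restored later.
  cp "$file" "$file.bak"
  # Leave only one host in the list and comment out minimum_master_nodes.
  sed -e "s|^discovery\.zen\.ping\.unicast\.hosts:.*|discovery.zen.ping.unicast.hosts: [\"$keep_ip\"]|" \
      -e 's|^discovery\.zen\.minimum_master_nodes:|#&|' \
      "$file.bak" > "$file"
}

# Usage (path is an assumption):
# shrink_discovery_config /etc/elasticsearch/elasticsearch.yml X.X.X.X
```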
Then restart the elasticsearch service on this node. You can now safely shut down the nodes that you excluded from the cluster.
You might want to check the status of the shards to make sure everything is correct. If you see an UNASSIGNED shard in the list, it is most likely a replica: Elasticsearch never places a replica on the same node as its primary, and there is no other node left to hold it.
$ curl -XGET 'http://X.X.X.X:9200/_cat/shards?v';
index shard prirep state docs store ip node
index-name 3 r STARTED 42 74.4kb X.X.X.X localhost.localdomain
index-name 3 r UNASSIGNED
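On a cluster with many indices the shard list gets long, so it can help to count the unassigned shards rather than read the whole output. A small sketch, assuming awk is available and that the state is the fourth column as in the output above; count_unassigned is a hypothetical helper name:

```shell
# Hypothetical helper: reads _cat/shards output on stdin and prints
# the number of rows whose state column (4th field) is UNASSIGNED.
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n++ } END { print n + 0 }'
}

# Usage:
# curl -s -XGET 'http://X.X.X.X:9200/_cat/shards' | count_unassigned
```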
An unassigned replica makes the cluster health appear yellow. It is simple to fix by setting the number of replicas to 0 for all indices:
$ curl -XPUT X.X.X.X:9200/_settings -H 'Content-Type: application/json' -d '{
"index" :{
"number_of_replicas": 0
}
}';echo
After this step, all shards should be listed as STARTED. All data is now on the server with the X.X.X.X address, and that node no longer tries to communicate with other nodes. If you follow the steps in reverse, you can restore the cluster.
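One part of the reverse path is worth spelling out: the exclusion set in the first step is persistent, so it stays in place until it is removed. Setting a cluster setting to null resets it. A minimal sketch, assuming bash and curl; clear_exclusion is a hypothetical helper name:

```shell
# Hypothetical helper: remove the IP exclusion so that shards can be
# allocated to the previously excluded nodes again.
# $1 = address of a node in the cluster.
clear_exclusion() {
  local host=$1
  curl -XPUT "$host:9200/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d '{"persistent":{"cluster.routing.allocation.exclude._ip":null}}'
}

# Usage:
# clear_exclusion X.X.X.X
```

The other steps (restoring the discovery lines in elasticsearch.yml and raising number_of_replicas again) mirror the commands shown earlier.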