DevOps
Moving Elasticsearch indexes with elasticdump
Introduction
Recently, I had a requirement to migrate data between Elasticsearch clusters while building a new ELK stack.
There are a few tools that can get the job done, including the Logstash elasticsearch input plugin, the Elasticsearch reindex API and elasticdump.
While evaluating these tools, I experimented with elasticdump, and this post details how it can move data from one Elasticsearch node into another.
First things first. What is elasticdump?
Elasticdump is an open-source tool which, according to its official description, has the goal of moving and saving Elasticsearch indexes. It works by requesting data from an input and redirecting it into an output. Either the input or the output may be an Elasticsearch URL or a file.
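As a quick illustration of this input/output model, you can stream an index straight to stdout; the sketch below assumes the $ output target and gzip pipe shown in the project's README, and the index name is just a placeholder:
# Stream all documents from a local index to stdout and compress them
elasticdump \
--input=http://localhost:9200/my-index \
--output=$ \
| gzip > /tmp/my-index.json.gz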
Elasticsearch scenario
For this article's scenario, the requirement is to move data from one Elasticsearch cluster into another.
Based on elasticdump's features, two options can achieve this goal:
Option 1) Using an Elasticsearch URL for both input and output.
This approach is the most straightforward option, requiring a single command to move the data across the two Elasticsearch clusters.
--input=cluster_one
--output=cluster_two
Option 2) Using an Elasticsearch URL for input and File as output, followed by a File input and an Elasticsearch URL output.
This approach requires at least two commands: one to save the data from a cluster into a file, followed by another that uses the generated file as input to the other cluster.
This approach may be useful if you want to perform a backup of the indexes before taking further actions.
--input=cluster_one
--output=data.json
--input=data.json
--output=cluster_two
Installing Elasticdump
Installation of elasticdump can be performed via npm. Npm, short for Node Package Manager, is the package manager for Node.js, backed by an online registry hosting open-source Node.js packages.
You can install npm through:
# Ubuntu
sudo apt-get install npm
# CentOS
sudo yum install npm
With npm installed, install elasticdump with:
npm install elasticdump -g
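To confirm the installation succeeded, check that the binary landed on your PATH:
# Prints the location of the installed binary, e.g. /usr/local/bin/elasticdump
which elasticdump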
Using elasticdump
Using elasticdump is as simple as running the following command:
elasticdump \
--input={{INPUT}} \
--output={{OUTPUT}} \
--type={{TYPE}}
Where {{INPUT}} or {{OUTPUT}} can be an Elasticsearch URL such as {protocol}://{host}:{port}/{index} or a file such as /tmp/dump.json, while {{TYPE}} must be analyzer, mapping or data.
Export Elasticsearch Data – Scenario 1
For this scenario, the aim is to export data from an Elasticsearch index called docker-daemon and inject it into a remote Elasticsearch node, keeping the same index name.
You can achieve the objective with the following command:
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=http://user:password@new_node:9200/docker-daemon \
--type=data
The following is a sample of the expected output:
Thu, 21 Sep 2017 14:40:29 GMT | starting dump
Thu, 21 Sep 2017 14:40:31 GMT | from source elasticsearch (offset: 0)
Thu, 21 Sep 2017 14:40:33 GMT | to destination elasticsearch, wrote 53
Thu, 21 Sep 2017 14:40:33 GMT | from source elasticsearch (offset: 53)
Thu, 21 Sep 2017 14:40:33 GMT | Total Writes: 53
Thu, 21 Sep 2017 14:40:33 GMT | dump complete
To confirm that the data transfer occurred successfully, run the following command on the target Elasticsearch node:
$ curl -u user:password localhost:9200/_cat/indices?v | grep docker-daemon
If the elasticdump action executed successfully, the index should appear as available.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2394 100 2394 0 0 245k 0 --:--:-- --:--:-- --:--:-- 259k
green open logstash-docker-daemon eilJdiZvSGixTNIfMwP-kw 5 2 41 0 292.3kb 292.3kb
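You can also compare document counts between the old and new nodes with the Elasticsearch _count API; both should report the same count value once the transfer is complete:
$ curl -u user:password old_node:9200/docker-daemon/_count
$ curl -u user:password new_node:9200/docker-daemon/_count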
Export Elasticsearch Data – Scenario 2
For this scenario, two distinct steps are necessary:
- Export data from an Elasticsearch index into a file.
- Import data from a file into an Elasticsearch index.
The following commands achieve the desired goal:
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=/data/docker-daemon.json \
--type=data
elasticdump \
--input=/data/docker-daemon.json \
--output=http://user:password@new_node:9200/docker-daemon \
--type=data
Observe that we first export the data from the index into the file /data/docker-daemon.json. We then use this newly generated file as input data for another Elasticsearch node.
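The intermediate file also gives you a chance to inspect what was exported. Elasticdump writes documents as line-delimited JSON, so, assuming that format, standard shell tools work well here:
# One document per line, so the line count approximates the number of exported documents
wc -l /data/docker-daemon.json
# Pretty-print the first exported document
head -n 1 /data/docker-daemon.json | python -m json.tool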
Analyzers and Mappings
What this article has shown so far is the most basic method of moving an index from one node to another. In a more realistic scenario, you will want to move the index along with its appropriate analyzers and field mappings.
In that case, analyzers and mappings should be moved prior to transferring the index data. We can achieve this by chaining the three commands shown below:
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=http://user:password@new_node:9200/docker-daemon \
--type=analyzer
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=http://user:password@new_node:9200/docker-daemon \
--type=mapping
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=http://user:password@new_node:9200/docker-daemon \
--type=data
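Since the three commands differ only in the --type flag, and the order matters (analyzer, then mapping, then data), a small shell loop keeps the sequence tidy:
for type in analyzer mapping data; do
  elasticdump \
  --input=http://user:password@old_node:9200/docker-daemon \
  --output=http://user:password@new_node:9200/docker-daemon \
  --type=$type
done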
Extra Options
Even though this article presented the basic parameters of elasticdump, a series of other parameters is available. Some commonly used parameters include (a combined example follows this list):
- --searchBody: useful when you do not want to export an entire index. Example: --searchBody '{"query":{"term":{"containerName": "nginx"}}}'.
- --limit: how many objects to move per batch operation. Defaults to 100.
- --delete: delete documents from the input source as they are moved.
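These options compose with the flags used earlier. The sketch below copies only the documents matching a query, 500 per batch; the destination index name is hypothetical:
elasticdump \
--input=http://user:password@old_node:9200/docker-daemon \
--output=http://user:password@new_node:9200/docker-daemon-nginx \
--type=data \
--limit=500 \
--searchBody '{"query":{"term":{"containerName": "nginx"}}}'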
You can find the full list of parameters on the official elasticdump project page.
Final Considerations
Moving Elasticsearch indexes across nodes and clusters should not be a burden, and elasticdump proves that. Elasticdump is easy to use and has good documentation, so if you need to move Elasticsearch indexes around, look no further.
As mentioned earlier, there are other alternatives, such as the Logstash elasticsearch input plugin or the Elasticsearch reindex API, but these will be covered in future articles. Stay tuned!