## Elasticsearch Cross-Cluster Data Migration

### 0. Migration background

To support requirement testing, part of the production data needs to be migrated to machines in the data-center test environment.

### 1. Common approaches (this migration uses elasticdump)

- elasticdump: suited to small data volumes and a small number of indices
- logstash: suited to large data volumes
- snapshot: takes a snapshot of the source ES cluster into a repository, then restores it on the target ES cluster

### 2. Setup

#### (1) Install the Node.js environment:

```shell
$ wget https://npm.taobao.org/mirrors/node/v14.10.0/node-v14.10.0-linux-x64.tar.xz
$ tar xf node-v14.10.0-linux-x64.tar.xz
$ mv node-v14.10.0-linux-x64 /usr/local/
$ ln -s /usr/local/node-v14.10.0-linux-x64/bin/npm /usr/local/bin/npm
$ ln -s /usr/local/node-v14.10.0-linux-x64/bin/node /usr/local/bin/node
$ npm version
{
  npm: '6.14.8',
  ares: '1.16.0',
  brotli: '1.0.9',
  cldr: '37.0',
  icu: '67.1',
  llhttp: '2.0.4',
  modules: '83',
  napi: '6',
  nghttp2: '1.41.0',
  node: '14.10.0',
  openssl: '1.1.1g',
  tz: '2020a',
  unicode: '13.0',
  uv: '1.39.0',
  v8: '8.4.371.19-node.16',
  zlib: '1.2.11'
}
```

#### (2) Install elasticdump:

```shell
$ npm install elasticdump -g
npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated har-validator@5.1.5: this library is no longer supported
npm WARN deprecated s3signed@0.1.0: This module is no longer maintained. It is provided as is.
/usr/local/node-v14.10.0-linux-x64/bin/elasticdump -> /usr/local/node-v14.10.0-linux-x64/lib/node_modules/elasticdump/bin/elasticdump
/usr/local/node-v14.10.0-linux-x64/bin/multielasticdump -> /usr/local/node-v14.10.0-linux-x64/lib/node_modules/elasticdump/bin/multielasticdump
+ elasticdump@6.33.3
added 97 packages from 147 contributors in 55.235s
```

#### (3) Common options

```shell
--input:        Source location. Can be an ES cluster URL, a file, or stdin. An index may be specified.
                Format: {protocol}://[{username}:{password}@]{host}:{port}/{index}
--input-index:  Index in the source ES cluster
--output:       Destination location. Can be an ES cluster URL, a file, or stdout. An index may be specified.
                Format: {protocol}://[{username}:{password}@]{host}:{port}/{index}
--output-index: Index in the destination ES cluster
--type:         What to migrate. Defaults to data (documents only). Options: settings, analyzer, data, mapping, alias
--limit:        Number of documents written to the destination ES cluster per batch. Do not set it too high, or the bulk queue may fill up
```

### 3. Synchronize data

#### (1) Migrate the analyzer:

```shell
# cd /usr/local/node-v14.10.0-linux-x64/
# ./bin/elasticdump --input=http://114.115.xxx.xxx:9200/rn_20200828 --output=http://172.16.0.177:9200/rn_20200828 --type=analyzer
Mon, 14 Sep 2020 08:35:15 GMT | starting dump
Mon, 14 Sep 2020 08:35:16 GMT | got 1 objects from source elasticsearch (offset: 0)
Mon, 14 Sep 2020 08:35:46 GMT | sent 1 objects to destination elasticsearch, wrote 1
Mon, 14 Sep 2020 08:35:46 GMT | got 0 objects from source elasticsearch (offset: 1)
Mon, 14 Sep 2020 08:35:46 GMT | Total Writes: 1
Mon, 14 Sep 2020 08:35:46 GMT | dump complete
```

#### (2) Migrate the mapping:

```shell
# cd /usr/local/node-v14.10.0-linux-x64/
# ./bin/elasticdump --input=http://114.115.xxx.xxx:9200/rn_20200902 --output=http://172.16.0.177:9200/rn_20200902 --type=mapping
Mon, 14 Sep 2020 08:35:54 GMT | starting dump
Mon, 14 Sep 2020 08:35:54 GMT | got 1 objects from source elasticsearch (offset: 0)
Mon, 14 Sep 2020 08:35:54 GMT | sent 1 objects to destination elasticsearch, wrote 1
Mon, 14 Sep 2020 08:35:54 GMT | got 0 objects from source elasticsearch (offset: 1)
Mon, 14 Sep 2020 08:35:54 GMT | Total Writes: 1
```

#### (3) Migrate the data:

```shell
# cd /usr/local/node-v14.10.0-linux-x64/
# ./bin/elasticdump --input=http://114.115.xxx.xxx:9200/rn_20200828 --output=http://172.16.0.177:9200/rn_20200828 --type=data --limit=10000
Mon, 14 Sep 2020 10:00:53 GMT | starting dump
Mon, 14 Sep 2020 10:00:54 GMT | got 10000 objects from source elasticsearch (offset: 0)
Mon, 14 Sep 2020 10:00:57 GMT | sent 10000 objects to destination elasticsearch, wrote 10000
Mon, 14 Sep 2020 10:00:57 GMT | got 10000 objects from source elasticsearch (offset: 10000)
Mon, 14 Sep 2020 10:01:00 GMT | sent 10000 objects to destination elasticsearch, wrote 10000
Mon, 14 Sep 2020 10:01:01 GMT | got 10000 objects from source elasticsearch (offset: 20000)
Mon, 14 Sep 2020 10:01:03 GMT | sent 10000 objects to destination elasticsearch, wrote 10000
Mon, 14 Sep 2020 10:01:04 GMT | got 10000 objects from source elasticsearch (offset: 30000)
```

Last modification: September 23rd, 2020 at 04:34 pm

© Reposting permitted with proper attribution
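#### Appendix: wrapping the three passes in one helper

When several indices need migrating, the analyzer, mapping, and data passes from section 3 could be wrapped in a small helper. The following is a minimal sketch rather than part of the original procedure: the function name `migrate_index` and the `DRY_RUN` variable are hypothetical, it assumes `elasticdump` is on `PATH`, and it uses only the `--input`, `--output`, `--type`, and `--limit` flags shown above.

```shell
#!/bin/sh
# Sketch only: migrate_index and DRY_RUN are hypothetical names, not part of elasticdump.
# Assumes elasticdump is reachable on PATH (for example via a symlink in /usr/local/bin).
set -eu

# Usage: migrate_index SRC_URL DST_URL INDEX [LIMIT]
# Runs the analyzer, mapping, and data passes in that order for one index.
# Note: plain sh has no local variables, so src/dst/index/limit are globals.
migrate_index() {
  src="$1"; dst="$2"; index="$3"; limit="${4:-10000}"
  for type in analyzer mapping data; do
    # Rebuild the positional parameters as the command line for this pass.
    set -- elasticdump "--input=${src}/${index}" "--output=${dst}/${index}" "--type=${type}"
    # A batch size is only passed for the data pass, as in section 3.
    if [ "$type" = "data" ]; then
      set -- "$@" "--limit=${limit}"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$*"    # print the command instead of running it
    else
      "$@"
    fi
  done
}

# Example (prints the three commands without touching either cluster):
#   DRY_RUN=1 migrate_index http://114.115.xxx.xxx:9200 http://172.16.0.177:9200 rn_20200828
```

Running it with `DRY_RUN=1` first is a cheap way to check the URLs and index name before any data is written. Because of `set -e`, a failed pass stops the script, so the data pass does not run against an index whose mapping was never created.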