The Elasticsearch version is 5.5.0. The steps below cover only the main flow and leave out error handling and compensation.
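Where the problem came from
The test-environment ES cluster has only two machines. Last night one of them was restarted by accident, and for reasons still unknown the second machine then crashed as well. The cause is still being investigated; my guess is that the data volume was too large for a single machine to carry, so it hung.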
GET /_cat/health
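Checking the cluster health showed the worst case: the status had turned RED. RED means only part of the shards are available, and at least one primary shard is unassigned.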
1501728250 10:44:10 elasticsearch red 4 4 22 20 0 0 12 0
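The health row above is easier to read once each field has a name. The following is a minimal sketch that maps the row onto column names; the column list is an assumption based on the 1.x `_cat/health` header and should be confirmed on your own cluster with `GET /_cat/health?v`. The function name `parse_cat_health` is hypothetical.

```python
# Column names are an assumption (1.x `_cat/health` layout);
# verify them with `GET /_cat/health?v` before relying on this.
HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status",
    "node.total", "node.data", "shards", "pri",
    "relo", "init", "unassign", "pending_tasks",
]


def parse_cat_health(line):
    """Map one whitespace-separated `_cat/health` row onto column names."""
    fields = line.split()
    if len(fields) != len(HEALTH_COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(HEALTH_COLUMNS), len(fields))
        )
    return dict(zip(HEALTH_COLUMNS, fields))


if __name__ == "__main__":
    row = parse_cat_health(
        "1501728250 10:44:10 elasticsearch red 4 4 22 20 0 0 12 0"
    )
    # Print the fields that matter for this incident.
    print(row["status"], row["shards"], row["pri"], row["unassign"])
```

Read this way, the row reports 22 active shards, 20 of them primaries, and 12 unassigned shards, which matches the shard listing that follows.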
GET /_cat/shards
This shows a large number of UNASSIGNED shards:
.marvel-2017.08.02     0 r STARTED     53054  97.5mb 123.57.68.250 123.57.68.250
.marvel-2017.08.02     0 p STARTED     53054  96.9mb 123.57.68.250 123.57.68.250
.marvel-2017.08.03     0 p STARTED     14729  34.9mb 10.251.1.123  123.56.125.43
.marvel-2017.08.03     0 r STARTED     14729  35.6mb 123.57.68.250 123.57.68.250
user_d                 0 p STARTED       938 904.2kb 10.251.1.123  123.56.125.43
user_d                 3 p UNASSIGNED
user_d                 1 p STARTED      1086   872kb 10.251.1.123  123.56.125.43
user_d                 2 p UNASSIGNED
index_v2               0 p STARTED    109226  23.5mb 10.251.1.123  123.56.125.43
index_v2               3 p STARTED    108133  23.3mb 10.251.1.123  123.56.125.43
index_v2               1 p STARTED    109711  23.6mb 123.57.68.250 123.57.68.250
index_v2               2 p STARTED    106542  22.9mb 123.57.68.250 123.57.68.250
contest                0 p STARTED        16  37.8kb 10.251.1.123  123.56.125.43
contest                3 p UNASSIGNED
contest                1 p STARTED        38 126.5kb 123.57.68.250 123.57.68.250
contest                2 p STARTED        50 113.8kb 10.251.1.123  123.56.125.43
.marvel-2017.01.10     0 p UNASSIGNED
.marvel-2017.01.10     0 r UNASSIGNED
contest_d              0 p UNASSIGNED
contest_d              3 p UNASSIGNED
contest_d              1 p STARTED        43  72.3kb 123.57.68.250 123.57.68.250
contest_d              2 p STARTED        44  79.5kb 123.57.68.250 123.57.68.250
user                   0 p STARTED    124228   120mb 123.57.68.250 123.57.68.250
user                   3 p STARTED    126029 120.9mb 123.57.68.250 123.57.68.250
user                   1 p UNASSIGNED
user                   2 p UNASSIGNED
index_d                0 p UNASSIGNED
index_d                3 p UNASSIGNED
index_d                1 p STARTED      3603   4.7mb 10.251.1.123  123.56.125.43
index_d                2 p STARTED      3586   4.3mb 10.251.1.123  123.56.125.43
500px.tribe-2017-03-27 0 p STARTED         5  69.9kb 123.57.68.250 123.57.68.250
500px.tribe-2017-03-27 3 p UNASSIGNED
500px.tribe-2017-03-27 1 p STARTED        10  63.8kb 10.251.1.123  123.56.125.43
500px.tribe-2017-03-27 2 p STARTED         4  52.3kb 10.251.1.123  123.56.125.43
curl -s "http://localhost:9200/_cat/shards" | grep UNASSIGNED
This filters the listing down to the UNASSIGNED shards only:
index_d                0 p UNASSIGNED
index_d                3 p UNASSIGNED
contest                3 p UNASSIGNED
500px.tribe-2017-03-27 3 p UNASSIGNED
contest_d              0 p UNASSIGNED
contest_d              3 p UNASSIGNED
user_d                 3 p UNASSIGNED
user_d                 2 p UNASSIGNED
user                   1 p UNASSIGNED
user                   2 p UNASSIGNED
.marvel-2017.01.10     0 p UNASSIGNED
.marvel-2017.01.10     0 r UNASSIGNED
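The same filtering can be done in a script when the result has to feed a later step. This is a minimal sketch, assuming the `_cat/shards` text has the column order shown above (index, shard number, primary/replica flag, state); the function name `unassigned_shards` is hypothetical.

```python
def unassigned_shards(cat_shards_text):
    """Return (index, shard, prirep) for every UNASSIGNED row."""
    result = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned rows carry only index, shard, prirep and state,
        # so the state is always the fourth field.
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            result.append((fields[0], int(fields[1]), fields[2]))
    return result


if __name__ == "__main__":
    sample = """\
user_d 0 p STARTED 938 904.2kb 10.251.1.123 123.56.125.43
user_d 3 p UNASSIGNED
user_d 2 p UNASSIGNED
.marvel-2017.01.10 0 r UNASSIGNED
"""
    for index, shard, prirep in unassigned_shards(sample):
        print(index, shard, prirep)
```

Keeping the primary/replica flag in the result makes it possible to treat lost primaries separately from replicas, which normally recover on their own once their primary is back.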
curl 'localhost:9200/_nodes/process?pretty'
This returns the unique ID of the master node. The ID is the key under "nodes" (RVBi3XH_Qd2Mq1KAg5ELxg here).
"cluster_name": "elasticsearch",
"nodes": {
  "RVBi3XH_Qd2Mq1KAg5ELxg": {
    "name": "123.57.68.250",
    "transport_address": "inet[/10.171.100.171:9300]",
    "host": "visualchina-dev-01",
    "ip": "123.57.68.250",
    "version": "1.5.1",
    "build": "5e38401",
    "http_address": "inet[/10.171.100.171:9200]",
    "attributes": {
      "master": "true"
    },
    "process": {
      "refresh_interval_in_millis": 1000,
      "id": 26432,
      "max_file_descriptors": 65535,
      "mlockall": false
    }
  }
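Picking the node ID out of this response by eye is error-prone when there are several nodes. The following is a minimal sketch, assuming the response has the shape shown above, where a node configured as master-eligible carries `"attributes": {"master": "true"}`; the function name `master_eligible_node_ids` is hypothetical. The attribute marks eligibility rather than the currently elected master, which `GET /_cat/master` reports.

```python
import json


def master_eligible_node_ids(nodes_response):
    """Return the IDs of nodes whose attributes mark them master-eligible."""
    ids = []
    for node_id, info in nodes_response.get("nodes", {}).items():
        if info.get("attributes", {}).get("master") == "true":
            ids.append(node_id)
    return ids


if __name__ == "__main__":
    # Trimmed copy of the response shown in the article.
    response = json.loads("""
    {
      "cluster_name": "elasticsearch",
      "nodes": {
        "RVBi3XH_Qd2Mq1KAg5ELxg": {
          "name": "123.57.68.250",
          "attributes": {"master": "true"}
        }
      }
    }
    """)
    print(master_eligible_node_ids(response))
```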
/_cluster/reroute
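Reassign the UNASSIGNED shards, one reroute command per shard. Setting allow_primary to true forces a primary to be allocated on the named node and can lose data if that node holds no copy of the shard. The allocate command shown here is the pre-5.0 form, which matches the version 1.5.1 reported in the node output above; from 5.0 on it was replaced by allocate_replica, allocate_stale_primary and allocate_empty_primary.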
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "user_d",
        "shard": 3,
        "node": "RVBi3XH_Qd2Mq1KAg5ELxg",
        "allow_primary": true
      }
    }
  ]
}
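With a dozen unassigned shards, writing each reroute request by hand gets tedious. The following is a minimal sketch that builds one request body covering several shards, using the same pre-5.0 `allocate` command as the request above. The function name `build_reroute_body` is hypothetical, and the sketch only prints the body instead of sending it. Because `allow_primary` can create an empty primary and lose data, review the generated commands before posting them to `/_cluster/reroute`.

```python
import json


def build_reroute_body(shards, node_id):
    """Build a reroute body with one `allocate` command per (index, shard)."""
    commands = []
    for index, shard in shards:
        commands.append({
            "allocate": {
                "index": index,
                "shard": shard,
                "node": node_id,
                # Forces primary allocation; may lose data (pre-5.0 flag).
                "allow_primary": True,
            }
        })
    return {"commands": commands}


if __name__ == "__main__":
    # Two of the unassigned primaries from the listing in the article.
    body = build_reroute_body(
        [("user_d", 3), ("user_d", 2)],
        "RVBi3XH_Qd2Mq1KAg5ELxg",
    )
    print(json.dumps(body, indent=2))
```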