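Elasticsearch version: 5.5.0. What follows is only the main workflow; error handling and compensation steps are left out.
Where the problem came from
Our test-environment ES cluster has only two machines. Last night one of them was restarted by accident, and for reasons I don't yet understand the second machine then crashed as well. The cause is still being investigated; my guess is that the data volume was too much for a single machine to carry, so it hung.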
GET /_cat/health
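Checking the cluster health shows bad news: the status has turned RED. RED means that at least one primary shard is not allocated, so part of the data is unavailable while the remaining shards can still serve requests.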
1501728250 10:44:10 elasticsearch red 4 4 22 20 0 0 12 0
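That line is easier to read with column names attached. Below is a minimal sketch that maps it onto names, assuming the column order is epoch, timestamp, cluster, status, node.total, node.data, shards, pri, relo, init, unassign, pending_tasks (running GET /_cat/health?v prints the actual header for your version). The function name parse_cat_health is only illustrative.

```python
# Assumed column order for a header-less _cat/health line on this cluster;
# verify it against the header printed by GET /_cat/health?v.
HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status",
    "node.total", "node.data", "shards", "pri",
    "relo", "init", "unassign", "pending_tasks",
]


def parse_cat_health(line):
    """Map one whitespace-separated _cat/health line onto column names."""
    fields = line.split()
    if len(fields) != len(HEALTH_COLUMNS):
        raise ValueError("unexpected number of columns: %d" % len(fields))
    return dict(zip(HEALTH_COLUMNS, fields))


health = parse_cat_health(
    "1501728250 10:44:10 elasticsearch red 4 4 22 20 0 0 12 0"
)
print(health["status"], health["unassign"])  # prints: red 12
```

Read this way, the line reports 22 active shards, 20 of them primaries, and 12 unassigned shards.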
GET /_cat/shards
This shows quite a few UNASSIGNED shards:
.marvel-2017.08.02 0 r STARTED 53054 97.5mb 123.57.68.250 123.57.68.250
.marvel-2017.08.02 0 p STARTED 53054 96.9mb 123.57.68.250 123.57.68.250
.marvel-2017.08.03 0 p STARTED 14729 34.9mb 10.251.1.123 123.56.125.43
.marvel-2017.08.03 0 r STARTED 14729 35.6mb 123.57.68.250 123.57.68.250
user_d 0 p STARTED 938 904.2kb 10.251.1.123 123.56.125.43
user_d 3 p UNASSIGNED
user_d 1 p STARTED 1086 872kb 10.251.1.123 123.56.125.43
user_d 2 p UNASSIGNED
index_v2 0 p STARTED 109226 23.5mb 10.251.1.123 123.56.125.43
index_v2 3 p STARTED 108133 23.3mb 10.251.1.123 123.56.125.43
index_v2 1 p STARTED 109711 23.6mb 123.57.68.250 123.57.68.250
index_v2 2 p STARTED 106542 22.9mb 123.57.68.250 123.57.68.250
contest 0 p STARTED 16 37.8kb 10.251.1.123 123.56.125.43
contest 3 p UNASSIGNED
contest 1 p STARTED 38 126.5kb 123.57.68.250 123.57.68.250
contest 2 p STARTED 50 113.8kb 10.251.1.123 123.56.125.43
.marvel-2017.01.10 0 p UNASSIGNED
.marvel-2017.01.10 0 r UNASSIGNED
contest_d 0 p UNASSIGNED
contest_d 3 p UNASSIGNED
contest_d 1 p STARTED 43 72.3kb 123.57.68.250 123.57.68.250
contest_d 2 p STARTED 44 79.5kb 123.57.68.250 123.57.68.250
user 0 p STARTED 124228 120mb 123.57.68.250 123.57.68.250
user 3 p STARTED 126029 120.9mb 123.57.68.250 123.57.68.250
user 1 p UNASSIGNED
user 2 p UNASSIGNED
index_d 0 p UNASSIGNED
index_d 3 p UNASSIGNED
index_d 1 p STARTED 3603 4.7mb 10.251.1.123 123.56.125.43
index_d 2 p STARTED 3586 4.3mb 10.251.1.123 123.56.125.43
500px.tribe-2017-03-27 0 p STARTED 5 69.9kb 123.57.68.250 123.57.68.250
500px.tribe-2017-03-27 3 p UNASSIGNED
500px.tribe-2017-03-27 1 p STARTED 10 63.8kb 10.251.1.123 123.56.125.43
500px.tribe-2017-03-27 2 p STARTED 4 52.3kb 10.251.1.123 123.56.125.43
curl -s "http://localhost:9200/_cat/shards" | grep UNASSIGNED
This lists only the UNASSIGNED shards:
index_d 0 p UNASSIGNED
index_d 3 p UNASSIGNED
contest 3 p UNASSIGNED
500px.tribe-2017-03-27 3 p UNASSIGNED
contest_d 0 p UNASSIGNED
contest_d 3 p UNASSIGNED
user_d 3 p UNASSIGNED
user_d 2 p UNASSIGNED
user 1 p UNASSIGNED
user 2 p UNASSIGNED
.marvel-2017.01.10 0 p UNASSIGNED
.marvel-2017.01.10 0 r UNASSIGNED
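The same filtering can be done in a script when the result has to feed a later step. The following is a rough sketch, assuming the _cat/shards text has already been fetched and that its rows follow the layout shown above (index, shard, prirep, state, then optional columns). The function name unassigned_shards and the shortened sample are illustrative only.

```python
from collections import defaultdict


def unassigned_shards(cat_shards_text):
    """Return (index, shard, prirep) tuples for every UNASSIGNED row."""
    result = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Rows look like: index shard prirep state [docs store ip node]
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            result.append((fields[0], int(fields[1]), fields[2]))
    return result


# A shortened sample taken from the output above.
sample = """\
user_d 0 p STARTED 938 904.2kb 10.251.1.123 123.56.125.43
user_d 3 p UNASSIGNED
user_d 2 p UNASSIGNED
.marvel-2017.01.10 0 r UNASSIGNED
"""

# Group the unassigned shards by index for easier reading.
by_index = defaultdict(list)
for index, shard, prirep in unassigned_shards(sample):
    by_index[index].append((shard, prirep))
print(dict(by_index))
```

Keeping the primary/replica flag matters later, because forcing a primary and allocating a replica are different operations.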
curl 'localhost:9200/_nodes/process?pretty'
This returns the unique identifier of the master node. The ID is the key of the node's entry under "nodes", here RVBi3XH_Qd2Mq1KAg5ELxg:
"cluster_name": "elasticsearch",
"nodes": {
  "RVBi3XH_Qd2Mq1KAg5ELxg": {
    "name": "123.57.68.250",
    "transport_address": "inet[/10.171.100.171:9300]",
    "host": "visualchina-dev-01",
    "ip": "123.57.68.250",
    "version": "1.5.1",
    "build": "5e38401",
    "http_address": "inet[/10.171.100.171:9200]",
    "attributes": {
      "master": "true"
    },
    "process": {
      "refresh_interval_in_millis": 1000,
      "id": 26432,
      "max_file_descriptors": 65535,
      "mlockall": false
    }
  }
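Picking the ID out by eye works for one node, but a small helper is handier with several. This is a sketch only, assuming the response has the shape shown above and that the flag appears as attributes.master set to the string "true". Note that this flag marks a node as master-eligible; GET /_cat/master reports which node is currently elected. The function name and the second node in the sample are hypothetical.

```python
import json


def master_eligible_node_ids(nodes_response):
    """Return the IDs of nodes whose attributes carry master == "true"."""
    ids = []
    for node_id, info in nodes_response.get("nodes", {}).items():
        # Assumption: the flag is a string, as in the output shown above.
        if info.get("attributes", {}).get("master") == "true":
            ids.append(node_id)
    return ids


# Trimmed response; "hypotheticalDataNodeId" is made up for illustration.
sample = json.loads("""
{
  "cluster_name": "elasticsearch",
  "nodes": {
    "RVBi3XH_Qd2Mq1KAg5ELxg": {
      "name": "123.57.68.250",
      "attributes": {"master": "true"}
    },
    "hypotheticalDataNodeId": {
      "name": "example-data-node"
    }
  }
}
""")

print(master_eligible_node_ids(sample))  # prints: ['RVBi3XH_Qd2Mq1KAg5ELxg']
```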
/_cluster/reroute
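Reassign the UNASSIGNED shards with the reroute API. Be aware that allow_primary forces a primary shard to be allocated on the given node even if no copy of its data is found there, in which case the shard comes up empty, so use it only when that data loss is acceptable. The allocate command with allow_primary shown here is the pre-5.0 syntax, which matches the version 1.5.1 reported in the node output above; Elasticsearch 5.0 and later replace it with allocate_replica, allocate_stale_primary, and allocate_empty_primary.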
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "user_d",
        "shard": 3,
        "node": "RVBi3XH_Qd2Mq1KAg5ELxg",
        "allow_primary": true
      }
    }
  ]
}
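The request above handles one shard, and there are several unassigned primaries. A minimal sketch of building one request body that covers a list of them follows, assuming the same allocate/allow_primary command form as above and that every shard goes to the same node. The function name build_reroute_body is hypothetical, and the sketch only builds and prints the JSON; sending it is left to whatever HTTP client you use.

```python
import json


def build_reroute_body(shards, node_id):
    """Build a /_cluster/reroute body with one allocate command per shard."""
    commands = []
    for index, shard in shards:
        commands.append({
            "allocate": {
                "index": index,
                "shard": shard,
                "node": node_id,
                # Forces a primary; may bring up an empty shard (data loss).
                "allow_primary": True,
            }
        })
    return {"commands": commands}


# Two of the unassigned primaries from the listing above.
body = build_reroute_body(
    [("user_d", 3), ("user_d", 2)],
    "RVBi3XH_Qd2Mq1KAg5ELxg",
)
print(json.dumps(body, indent=2))
```

It is worth reviewing the printed body before posting it, since each command with allow_primary may bring a shard up empty.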