ZooKeeper monitoring
Git repository: https://github.com/jiankunking/zookeeper_exporter
Exporter download: https://github.com/carlpett/zookeeper_exporter/releases/download/v1.0.2/zookeeper_exporter
Note: the exporter works with ZooKeeper 3.4+.
① Download zookeeper_exporter
wget -O /usr/local/bin/zookeeper_exporter https://github.com/carlpett/zookeeper_exporter/releases/download/v1.0.2/zookeeper_exporter
chmod +x /usr/local/bin/zookeeper_exporter
② Start zookeeper_exporter
nohup /usr/local/bin/zookeeper_exporter >>/dev/null 2>&1 &
③ Check that the exporter is running correctly
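The original gives no command for this step. A minimal sketch, assuming the exporter listens on localhost:9141 and serves /metrics (the port is an assumption, so adjust it if you started the exporter with a different bind address). The names zk_exporter_healthy and check_zk_exporter are hypothetical helpers, not part of the exporter:

```shell
# Assumed exporter endpoint; the default port here is an assumption.
EXPORTER_URL="${EXPORTER_URL:-http://localhost:9141/metrics}"

# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# succeeds only if a zk_up sample with value 1 is present.
zk_exporter_healthy() {
  grep -Eq '^zk_up(\{[^}]*\})?[[:space:]]+1(\.0+)?$'
}

# Hypothetical helper: fetches the metrics page and reports the result.
check_zk_exporter() {
  if curl -fsS "$EXPORTER_URL" | zk_exporter_healthy; then
    echo "exporter OK: zk_up is 1"
  else
    echo "exporter not healthy: no response, or zk_up is not 1" >&2
    return 1
  fi
}
```

Run check_zk_exporter after starting the exporter. A failure means either the exporter is not reachable at EXPORTER_URL or it cannot talk to ZooKeeper.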
④ Add the exporter to the Prometheus server as a scrape target.
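Registering the exporter usually means adding a scrape job to prometheus.yml. A sketch that prints such a job, assuming the job name "zookeeper" and the target localhost:9141 (both are placeholders, and print_zk_scrape_job is a hypothetical helper):

```shell
# Hypothetical helper: prints a scrape job for the exporter.
# $1 is the exporter's host:port; the default is an assumed address.
print_zk_scrape_job() {
  target="${1:-localhost:9141}"
  cat <<EOF
  - job_name: "zookeeper"
    static_configs:
      - targets: ["${target}"]
EOF
}
```

For example, print_zk_scrape_job zk-host-1:9141 prints a job for that host. Paste the output under the existing scrape_configs: key in prometheus.yml, validate the file with promtool check config, and reload Prometheus.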
⑤ Log in to Grafana and import a dashboard: search for "Zookeeper Exporter Overview" or enter dashboard ID 9236.
Reference ZooKeeper alert rules:
groups:
  - name: zookeeperStatsAlert
    rules:
      - alert: Too many outstanding requests
        expr: avg(zk_outstanding_requests) by (instance) > 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "Too many outstanding requests"
      - alert: Too many pending syncs
        expr: avg(zk_pending_syncs) by (instance) > 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "Too many pending syncs"
      - alert: Average response latency too high
        expr: avg(zk_avg_latency) by (instance) > 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "Average response latency too high"
      - alert: Open file descriptors above 85% of the limit
        expr: zk_open_file_descriptor_count > zk_max_file_descriptor_count * 0.85
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "Open file descriptors above 85% of the limit"
      - alert: ZooKeeper server down
        expr: zk_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "ZooKeeper server down"
      - alert: ZooKeeper leader missing
        # absent() returns 1 only when no series matches, so compare with == 1;
        # the original "!= 1" could never fire.
        expr: absent(zk_server_state{state="leader"}) == 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }}"
          description: "ZooKeeper leader missing"
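Metrics that need thresholds
zk_outstanding_requests  number of outstanding (queued) requests
zk_pending_syncs  number of pending sync operations
zk_avg_latency  average response latency
zk_open_file_descriptor_count  number of open file descriptors
zk_max_file_descriptor_count  maximum number of file descriptors
zk_up  1 when ZooKeeper is up
zk_server_state  leader/follower state
zk_num_alive_connections  number of active connections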