Deploying prometheus operator

This post was last updated 610 days ago; some of the information in it may be out of date.

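prometheus-operator creates its own CRDs under monitoring.coreos.com; the kubectl get crd output further down shows five of them in this deployment.

Deployment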

git clone https://github.com/coreos/kube-prometheus.git

Edit kube-prometheus/manifests/grafana-service.yaml and prometheus-service.yaml, adding type: NodePort under spec to expose the two services outside the cluster.
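The manual edit above could also be scripted. The following is a minimal sketch, assuming each manifest has a single top-level spec: line and two-space indentation; the helper name add_nodeport is hypothetical and not part of kube-prometheus:

```shell
# Hypothetical helper (not part of kube-prometheus): adds "type: NodePort"
# directly under the top-level "spec:" line of a Service manifest.
add_nodeport() {
  manifest=$1
  # Skip files that already declare a Service type, to avoid a duplicate key.
  if grep -q '^  type:' "$manifest"; then
    return 0
  fi
  # Print every line; after the top-level "spec:" line, emit the extra key.
  awk '{ print } /^spec:$/ { print "  type: NodePort" }' "$manifest" > "$manifest.tmp" &&
    mv "$manifest.tmp" "$manifest"
}

# Example usage (paths taken from the step above):
# add_nodeport kube-prometheus/manifests/grafana-service.yaml
# add_nodeport kube-prometheus/manifests/prometheus-service.yaml
```

Running the helper twice on the same file leaves it unchanged the second time, because of the grep check.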

kubectl apply -f kube-prometheus/manifests

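The first run reports errors; wait a moment and run kubectl apply again. The likely cause is ordering: some resources were submitted before the resources they depend on had finished being created, so a second run a little later succeeds.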
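The "run it again after a while" step could be automated with a small retry wrapper. This is only a sketch, assuming a POSIX shell; the helper name retry, the attempt count, and the delay are illustrative choices, not something kube-prometheus provides:

```shell
# Hypothetical helper: run a command until it succeeds,
# up to a maximum number of attempts, sleeping between attempts.
retry() {
  attempts=$1   # maximum number of attempts
  delay=$2      # seconds to wait between attempts
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1  # give up: every attempt failed
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage (up to 5 attempts, 10 seconds apart):
# retry 5 10 kubectl apply -f kube-prometheus/manifests
```

Because kubectl apply is declarative, repeating it should be harmless: resources that already exist are left as they are or updated in place.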

[root@k8s03 kube-prometheus]# kubectl get crd |grep coreos
alertmanagers.monitoring.coreos.com           2019-07-31T05:50:57Z
podmonitors.monitoring.coreos.com             2019-07-31T05:50:57Z
prometheuses.monitoring.coreos.com            2019-07-31T05:50:58Z
prometheusrules.monitoring.coreos.com         2019-07-31T05:50:59Z
servicemonitors.monitoring.coreos.com         2019-07-31T05:51:00Z

[root@k8s03 kube-prometheus]# kubectl get pods -n monitoring
NAME                                   READY   STATUS             RESTARTS   AGE
alertmanager-main-0                    2/2     Running            0          10m
alertmanager-main-1                    2/2     Running            0          7m38s
alertmanager-main-2                    2/2     Running            0          4m1s
grafana-7dc5f8f9f6-ghtk5               1/1     Running            0          19m
kube-state-metrics-58b66579dc-cm7g8    3/4     ImagePullBackOff   0          19m
node-exporter-kb2j9                    2/2     Running            0          19m
node-exporter-lqs5d                    2/2     Running            0          19m
node-exporter-tf6f6                    2/2     Running            0          19m
prometheus-adapter-668748ddbd-z9lbv    1/1     Running            0          19m
prometheus-k8s-0                       3/3     Running            1          10m
prometheus-k8s-1                       3/3     Running            1          10m
prometheus-operator-7447bf4dcb-7d4t4   1/1     Running            0          19m

[root@k8s03 manifests]# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
alertmanager-main       ClusterIP   10.100.176.121   <none>        9093/TCP            32m
alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   23m
grafana                 NodePort    10.100.48.63     <none>        3000:31493/TCP      32m
kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   32m
node-exporter           ClusterIP   None             <none>        9100/TCP            32m
prometheus-adapter      ClusterIP   10.102.205.39    <none>        443/TCP             32m
prometheus-k8s          NodePort    10.102.176.248   <none>        9090:31782/TCP      32m
prometheus-operated     ClusterIP   None             <none>        9090/TCP            23m
prometheus-operator     ClusterIP   None             <none>        8080/TCP            32m

Open the Prometheus targets page: two components are not being scraped.

monitoring/kube-controller-manager/0 (0/0 up)
monitoring/kube-scheduler/0 (0/0 up)

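Looking at prometheus-serviceMonitorKubeScheduler.yaml and prometheus-serviceMonitorKubeControllerManager.yaml shows that the ServiceMonitors select Services by the labels k8s-app: kube-scheduler and k8s-app: kube-controller-manager.
Running kubectl get svc -n kube-system shows that neither Service exists, so both have to be created by hand. Note that the labels on each Service must match the selector in the corresponding ServiceMonitor, while the Service's own selector must match the labels on the component's pods.
In the Kubernetes release used here, 10251 is the port on which kube-scheduler serves its metrics, and 10252 is the port on which kube-controller-manager serves its metrics.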
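Before creating the Services, it may be worth confirming which labels the control-plane pods actually carry. The sketch below assumes the pods are labelled component=<name>, which is the selector the manifests in this post rely on; the function name show_component_pods is hypothetical:

```shell
# Hypothetical helper: list the kube-system pods that a Service with
# selector "component: <name>" would pick up, together with their labels.
show_component_pods() {
  component=$1   # e.g. kube-scheduler or kube-controller-manager
  kubectl get pods -n kube-system -l "component=$component" --show-labels
}

# Example usage:
# show_component_pods kube-scheduler
# show_component_pods kube-controller-manager
```

If a call lists no pods, the selector in the corresponding Service would need to be adjusted to whatever labels the pods do have.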

cat prometheus-kubeSchedulerService.yaml

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP

cat prometheus-kubeControllerManagerService.yaml

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP

Apply both Service manifests

kubectl apply -f prometheus-kubeSchedulerService.yaml
kubectl apply -f prometheus-kubeControllerManagerService.yaml
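Once the two Services exist, one way to check that their selectors matched something is to look at the Endpoints objects of the same names. A minimal sketch, with the hypothetical function name show_metrics_endpoints:

```shell
# Hypothetical helper: show the Endpoints behind the two Services created
# above. An empty ENDPOINTS column would suggest that the Service selector
# does not match any pod.
show_metrics_endpoints() {
  kubectl get endpoints -n kube-system kube-scheduler kube-controller-manager
}

# Example usage:
# show_metrics_endpoints
```

If addresses are listed, the two targets on the Prometheus targets page should move away from 0/0 after a scrape interval or so, provided the components are reachable on those ports from the Prometheus pods.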

Check the prometheus-operator status

kubectl get pod -n monitoring -o wide | grep prometheus-operator
kubectl get service -n monitoring | grep prometheus-operator
kubectl get ServiceMonitor -n monitoring | grep prometheus-operator
kubectl api-versions | grep monitoring
kubectl get --raw "/apis/monitoring.coreos.com/v1" | jq .