SkyWalking 8.4 Deployment and UI Usage Guide

This post was last updated 1239 days ago, so some of the information may be out of date.

Apache SkyWalking

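Introduction:

SkyWalking is an open source distributed tracing and application performance monitoring (APM) system hosted by the Apache Software Foundation.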

SkyWalking is an open source APM system, including monitoring, tracing, diagnosing capabilities for distributed systems in Cloud Native architectures. The core features are as follows.

  • Service, service instance, endpoint metrics analysis
  • Root cause analysis. Profile the code on the runtime
  • Service topology map analysis
  • Service, service instance and endpoint dependency analysis
  • Slow services and endpoints detected
  • Performance optimization
  • Distributed tracing and context propagation
  • Database access metrics. Detect slow database access statements(including SQL statements)
  • Alarm
  • Browser performance monitoring
  • Infrastructure(VM, network, disk etc.) monitoring
  • Collaboration across metrics, traces, and logs

GitHub project page

Architecture:

swinfra.jpg

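1. Java agent configuration

First download the apache-skywalking-apm-8.4.0.tar.gz package and extract it. Make sure the extracted apache-skywalking-apm-bin folder sits in the same directory as the Dockerfile you create; the Dockerfile content is shown below. Because no official agent Docker image is provided, you have to define the agent deployment yourself.

There are roughly two ways to deploy the agent:

  • Sidecar. Use initContainers to copy the agent into a directory shared within the Pod, mount that directory into the main container, and point the JVM at the agent jar when starting the Java service. For example:
java -javaagent:/usr/skywalking/agent/skywalking-agent.jar -jar app.jar --spring.profiles.active=test
  • Bake the agent into the Java base image. When building the JDK base image, package the SkyWalking agent into it, then either reference the agent jar directly when the Spring Boot service starts or pass it into the container through a parameter. For example:
export JAVA_OPTS=-javaagent:/root/skywalking/agent/skywalking-agent.jar
  • Dockerfile for the agent sidecar image:
# Before building the image, download the SkyWalking package into the current directory:
# wget https://mirrors.tuna.tsinghua.edu.cn/apache/skywalking/8.4.0/apache-skywalking-apm-8.4.0.tar.gz && tar -zxvf apache-skywalking-apm-8.4.0.tar.gz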
FROM busybox:latest 

ENV LANG=C.UTF-8

RUN set -eux && mkdir -p /usr/skywalking/agent/

COPY apache-skywalking-apm-bin/agent/ /usr/skywalking/agent/

WORKDIR /

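Plugin configuration

The bundled default plugins live in the plugins directory and the optional ones in the optional-plugins directory. Choose according to your use case: to enable an optional plugin, copy its jar into the plugins directory, as shown in the figure below: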

swagent1.png
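
The copy step described above can be scripted. The following is a minimal Python sketch, assuming the agent directory layout from the extracted package (plugins and optional-plugins side by side); the function name and the jar name used in the usage line are hypothetical.

```python
import shutil
from pathlib import Path


def enable_optional_plugin(agent_home: str, jar_name: str) -> Path:
    """Copy one jar from optional-plugins/ into plugins/ and return the new path."""
    agent = Path(agent_home)
    source = agent / "optional-plugins" / jar_name
    if not source.is_file():
        raise FileNotFoundError(f"optional plugin not found: {source}")
    target_dir = agent / "plugins"
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps; the original stays in optional-plugins/
    return Path(shutil.copy2(source, target_dir / jar_name))
```

For example, `enable_optional_plugin("apache-skywalking-apm-bin/agent", "some-optional-plugin.jar")` would activate that plugin the next time the image is built.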

Request flow with the agent injected

swoapui.png

2. SkyWalking service deployment templates

Injecting the agent into a test service

---
# Deployment with the SkyWalking agent injected as a sidecar. Version: 8.4.0-es6
# Gateway service for Spring Cloud Alibaba
# By John Wang 2021-03-25 PM 11:30
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepsight-gateway
  namespace: deepsight-test
  labels:
    app: deepsight-gateway
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepsight-gateway
  template:
    metadata:
      labels:
        app: deepsight-gateway
    spec:
      imagePullSecrets:
      - name: registry-pull-secret
      initContainers:
        - image: hub.deepsight.cloud/skywalking/skywalking-agent-sidecar:8.4.0
          name: sw-agent-sidecar
          imagePullPolicy: IfNotPresent
          command: ["sh"]
          args:
            [
            "-c",
            "mkdir -p /skywalking/agent && cp -r /usr/skywalking/agent/* /skywalking/agent",
            ]
          volumeMounts:
            - mountPath: /skywalking/agent
              name: sw-agent
      containers:
      - name: ds-gateway
        image:  $IMAGE_NAME
        imagePullPolicy: IfNotPresent
        command: ["java"]
        args:
          [
           "-javaagent:/usr/skywalking/agent/skywalking-agent.jar", "-jar", "app.jar","--spring.profiles.active=test",
          ]
        env:
          - name: SW_AGENT_NAME # service name shown in the SkyWalking UI
            value: deepsight-gateway
          - name: SW_AGENT_COLLECTOR_BACKEND_SERVICES # OAP server address (gRPC)
            value: skywalking-oap.skywalking:11800
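          - name: SERVER_PORT # port the Java service listens on, matching the probes and containerPort below; comment this out if the port is already set elsewhere
            value: "80"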
        resources:
          limits:
            memory: "700Mi"
            cpu: "700m"
          requests:
            memory: "512Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /actuator/health
            port: 80
          initialDelaySeconds: 30 # seconds to wait after the container starts before the first health check
          periodSeconds: 10 # interval between checks
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
        ports:
        - containerPort: 80
          name: httpservice
          protocol: TCP
        volumeMounts:
        - name: host-time
          mountPath: /etc/localtime
        - name: sw-agent
          mountPath: /usr/skywalking/agent
      volumes:
        - name: sw-agent
          emptyDir: {}
        - name: host-time
          hostPath:
            path: /etc/localtime
---
# Service for Deepsight-Gateway
apiVersion: v1
kind: Service
metadata:
  name: deepsight-gateway
  namespace: deepsight-test
  labels:
    app: deepsight-gateway
spec:
  ports:
  - name: web
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: deepsight-gateway

In the Deployment above, the agent is injected into the Java service as a sidecar. The service image itself needs no changes, which makes this approach compatible and flexible.
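
The same wiring can be applied to an existing manifest programmatically. Below is a minimal Python sketch that patches a Deployment loaded as a plain dict. It assumes the container is started with `command: ["java"]` so the agent flag can be prepended to `args`, and it reuses the volume name and paths from the template above; the function name and its parameters are hypothetical.

```python
import copy

AGENT_DIR = "/usr/skywalking/agent"
AGENT_JAR = f"{AGENT_DIR}/skywalking-agent.jar"


def inject_agent(deployment: dict, sidecar_image: str, service_name: str, backend: str) -> dict:
    """Return a copy of a Deployment manifest with the agent sidecar wired in."""
    patched = copy.deepcopy(deployment)
    pod = patched["spec"]["template"]["spec"]
    # Shared volume that carries the agent from the init container to the app.
    pod.setdefault("volumes", []).append({"name": "sw-agent", "emptyDir": {}})
    pod.setdefault("initContainers", []).append({
        "name": "sw-agent-sidecar",
        "image": sidecar_image,
        "command": ["sh", "-c", f"cp -r {AGENT_DIR}/* /skywalking/agent"],
        "volumeMounts": [{"name": "sw-agent", "mountPath": "/skywalking/agent"}],
    })
    for container in pod["containers"]:
        container.setdefault("volumeMounts", []).append(
            {"name": "sw-agent", "mountPath": AGENT_DIR})
        container.setdefault("env", []).extend([
            {"name": "SW_AGENT_NAME", "value": service_name},
            {"name": "SW_AGENT_COLLECTOR_BACKEND_SERVICES", "value": backend},
        ])
        # Assumes command is ["java"]: the agent flag must precede "-jar".
        container["args"] = [f"-javaagent:{AGENT_JAR}", *container.get("args", [])]
    return patched
```

The input dict is left untouched, so the patched manifest can be diffed against the original before it is applied.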

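* Configuring the Elasticsearch cluster

Elasticsearch is deployed in Kubernetes with emptyDir volumes. No separate index cleanup policy is needed, because the TTL settings in the OAP configuration file already define how long data is kept (recordDataTTL and metricsDataTTL in 8.x; earlier releases used recordDataTTL, otherMetricsDataTTL, and monthMetricsDataTTL).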

es-sts-template.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: skywalking
spec:
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  serviceName: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      imagePullSecrets:
      - name: registry-pull-secret
      containers:
      - env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: discovery.zen.ping.unicast.hosts
          value: elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch
        - name: discovery.zen.minimum_master_nodes
          value: "2"
        - name: ES_JAVA_OPTS
          value: -Xms512m -Xmx512m
        image: hub.deepsight.cloud/skywalking/elasticsearch:6.4.3
        imagePullPolicy: Always
        name: elasticsearch
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        resources:
          limits:
            cpu: "1"
          requests:
            cpu: 100m
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
      initContainers:
      - command:
        - sh
        - -c
        - chown -R 1000:1000 /usr/share/elasticsearch/data
        image: busybox
        imagePullPolicy: Always
        name: fix-permissions
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
      - command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        image: busybox
        imagePullPolicy: Always
        name: increase-vm-max-map
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - command:
        - sh
        - -c
        - ulimit -n 65536
        image: busybox
        imagePullPolicy: Always
        name: increase-fd-ulimit
        resources: {}
        securityContext:
          privileged: true
      volumes:
      - emptyDir: {}
        name: data

---
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: skywalking
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node

---
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-logging
  namespace: skywalking
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  ports:
    - port: 9200
      name: external

* Deploying the SkyWalking OAP server and UI

OAP-deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skywalking-oap
  namespace: skywalking
spec:
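  replicas: 2 # SW_CLUSTER is standalone below, so the replicas do not coordinate; use 1 replica or switch to a cluster mode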
  selector:
    matchLabels:
      app: skywalking-oap
  template:
    metadata:
      labels:
        app: skywalking-oap
    spec:
      serviceAccountName: skywalking-oap-sa
      containers:
      - name: oap
        image: hub.deepsight.cloud/skywalking/skywalking-oap-server:8.4.0-es6
        imagePullPolicy: Always
        livenessProbe:
          tcpSocket:
            port: 12800
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          tcpSocket:
            port: 12800
          initialDelaySeconds: 15
          periodSeconds: 20
        ports:
        - containerPort: 11800
          name: grpc
        - containerPort: 12800
          name: rest
        resources:
          requests:
            memory: 1Gi
          limits:
            memory: 2Gi
        env:
        - name: JAVA_OPTS # keep the heap below the 2Gi container memory limit
          value: "-Xmx1g -Xms1g"
        - name: SW_CLUSTER
          value: standalone
        - name: SKYWALKING_COLLECTOR_UID
          valueFrom:
            fieldRef:
              fieldPath: metadata.uid
        - name: SW_STORAGE
          value: elasticsearch
        - name: SW_STORAGE_ES_CLUSTER_NODES
          value: elasticsearch-logging:9200
        - name: SW_NAMESPACE
          value: skywalking
      imagePullSecrets:
      - name: registry-pull-secret
---
apiVersion: v1
kind: Service
metadata:
  name: skywalking-oap
  namespace: skywalking
  labels:
    app: skywalking-oap
spec:
  ports:
  - port: 12800
    name: rest
  - port: 11800
    name: grpc
  selector:
    app: skywalking-oap
---

ui-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ui-deployment
  namespace: skywalking
  labels:
    app: ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ui
  template:
    metadata:
      labels:
        app: ui
    spec:
      imagePullSecrets:
      - name: registry-pull-secret
      containers:
      - name: ui
        image: hub.deepsight.cloud/skywalking/skywalking-ui:8.4.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: page
        resources:
          requests:
            memory: 1Gi
          limits:
            memory: 2Gi
        env:
        - name: SW_OAP_ADDRESS
          value: skywalking-oap.skywalking:12800
---
apiVersion: v1
kind: Service
metadata:
  name: ui
  namespace: skywalking
  labels:
    service: ui
spec:
  ports:
  - port: 8080
    name: page
  selector:
    app: ui
  type: NodePort

SkyWalking ServiceAccount (optional, depending on your setup)

# Adjust for your version as needed
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: skywalking-oap-sa
  namespace: skywalking
---


kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: skywalking-clusterrolebinding
subjects:
- kind: Group
  name: system:serviceaccounts:skywalking
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: skywalking-clusterrole
  apiGroup: rbac.authorization.k8s.io
---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: skywalking-clusterrole
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---

3. SkyWalking UI usage guide

UI main page

swui2.png

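  • The top bar is the function area, used to switch between SkyWalking (SW) features;
  • Below it are the monitored objects. SW monitors three kinds of objects: services, endpoints, and instances;
  • The time range selector sits in the bottom right and sets the time window the metrics cover. Click the Auto button in the top right to turn on auto-refresh;
  • The remaining space is the dashboard area where metrics are displayed;
  • Service: a set or group of workloads that provide the same behavior for incoming requests.

    Here we can see the application's service "deepsight-gateway", which is defined by the agent environment variable SW_AGENT_NAME.

  • Endpoint: a request path received by a particular service, such as an HTTP URI path or a gRPC service class name plus method signature.

    Here we can see one endpoint of the Spring Boot application, the API /deepsight-express/express/confrmReceipt.

  • Service Instance: each individual workload in the group above is called an instance. Like a Pod in Kubernetes, a service instance is not necessarily a single operating system process. When you use an agent, however, a service instance is in fact one real operating system process.

    Here we can see the Spring Boot application's instance, named {process UUID}@{hostname} and generated automatically by the agent.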

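Service metrics

Click Dashboard, choose the application to inspect (for example deepsight-oauth), then switch the dashboard to service mode to see that service's metrics.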

swui3.png

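Service Slow Endpoints

The service dashboard lists the five endpoints of the current service with the longest response times. If an endpoint's response time is too high, look at its metrics more closely (click an endpoint name to copy it).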

swui4.png

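Endpoint metrics

If an endpoint's response time is too high, you can drill into that endpoint's metrics. Like service metrics, endpoint metrics include throughput, SLA, and response time.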

swui6.png

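Service instance metrics

Select an instance of the service and switch the dashboard to view the metrics of that instance. Besides the usual throughput, SLA, and response time, the instance view also shows JVM information such as heap usage and GC time and count.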

swui5.png

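DB metrics

Beyond the service's own metrics, SW also monitors the databases the service depends on. Switch to the DB dashboard and pick a DB instance to see its throughput, SLA, and response time as observed from the service (client) side.

In addition, slow SQL statements executed against that DB are listed automatically and can be copied out directly, which helps pinpoint where the time goes.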

swui7.png

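Topology

  • Whereas a dashboard shows the metrics of a single service, the topology map shows the dependencies between services.
  • You can query a single service or group several services and query them together.
  • Clicking a service icon displays that service's current metrics;
  • SW uses request data to detect dependent services, databases, middleware and so on automatically.
  • Clicking the dot on a dependency line shows details of the calls between the two services, such as throughput per minute, average latency, and the detection side (client/server).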

swui8.png
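
To make the per-edge figures concrete, here is a minimal Python sketch of how call records could be rolled up into dependency edges with calls per minute and average latency. It is an illustration only, not SkyWalking's implementation; the record format and the function name are hypothetical.

```python
from collections import defaultdict


def build_topology(calls, minutes):
    """Aggregate (source, target, latency_ms) call records into edge metrics."""
    totals = defaultdict(lambda: [0, 0])  # edge -> [call count, latency sum]
    for source, target, latency_ms in calls:
        edge = totals[(source, target)]
        edge[0] += 1
        edge[1] += latency_ms
    return {
        edge: {"cpm": count / minutes, "avg_latency_ms": latency_sum / count}
        for edge, (count, latency_sum) in totals.items()
    }
```

Each key of the result is one line in the topology map, and the values correspond to what the dot on that line displays.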

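Tracing

When the SLA of a service drops, or the response time of a particular endpoint rises noticeably, you can use the trace feature to look up individual request records.

  • The search area is at the top. You can filter by service, instance, or endpoint, or by whether the request succeeded or failed; you can also look up a trace precisely by its TraceID.
  • The duration and result of every span in the call chain are listed (as a list by default, with tree and table views also available);
  • A step that failed is marked in red.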

swui9.png

  • Clicking a span shows its details; if an exception occurred, its type, message, and stack trace are captured automatically;

swui10.png

  • If the span is a database operation, the SQL that was executed is recorded as well.

swui11.png
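
The tree view of a trace follows directly from the parent links between spans. The sketch below, in Python, renders a list of spans as indented lines and flags failed ones. The span dictionaries and their field names are hypothetical stand-ins for illustration, with -1 assumed as the parent of the root span.

```python
def render_trace(spans):
    """Return indented lines for spans given as dicts with
    span_id, parent_id (-1 for the root), name, duration_ms and is_error."""
    children = {}
    for span in spans:
        children.setdefault(span["parent_id"], []).append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit siblings in span_id order, then descend into each one.
        for span in sorted(children.get(parent_id, []), key=lambda s: s["span_id"]):
            flag = " [ERROR]" if span["is_error"] else ""
            lines.append(f"{'  ' * depth}{span['name']} {span['duration_ms']}ms{flag}")
            walk(span["span_id"], depth + 1)

    walk(-1, 0)
    return lines
```

Printing the returned lines gives a text version of the tree layout, with the `[ERROR]` marker playing the role of the red highlight.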

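Profiling

The spans shown by tracing are at service-call granularity. To see the application's live thread stacks, use the profiling feature.

  • Create a new profiling task;
  • Choose the service and endpoint to profile;
  • Set the sampling frequency and the number of samples;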

swui12.png

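After the task is created, SW starts collecting the application's live stack information. Once sampling finishes, click Analyze to view the stacks in detail.

  1. Click "View" to the right of a span to see the details of the call chain;
  2. Below the span list are the thread stacks SW collected, with the time spent in each.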

swui13.png
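
Sampling-based profiling estimates time from how often a frame shows up in the periodic stack snapshots. The following Python sketch shows that idea in its simplest form. It is an assumption-laden illustration rather than SkyWalking's analysis algorithm, and the snapshot format and function name are hypothetical.

```python
from collections import Counter


def estimate_frame_time(snapshots, interval_ms):
    """snapshots: list of stacks, each a list of frame names (outermost first).
    Returns the approximate milliseconds each frame spent on the stack."""
    seen = Counter()
    for stack in snapshots:
        for frame in set(stack):  # count a frame once per snapshot, even if recursive
            seen[frame] += 1
    return {frame: count * interval_ms for frame, count in seen.items()}
```

A frame that appears in most snapshots dominated the request, which is why a shorter sampling interval gives finer estimates at the cost of more overhead.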

Alarm-setting: alert configuration

Rule description

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Sample alarm rules.
rules:
  # Rule unique name, must be ended with `_rule`.
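  service_resp_time_rule: # service response time
    metrics-name: service_resp_time # metric name
    op: ">"                         # comparison operator
    threshold: 1000                 # threshold in ms: response time above 1 s
    period: 10                      # evaluation window: the last 10 minutes
    count: 3                        # matches within the window needed to trigger the alarm, here 3
    silence-period: 5               # after firing, suppress the same alarm for this long, here 5 minutes
    message: Response time of service {name} is more than 1000ms in 3 minutes of last 10 minutes. # alarm message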
  service_sla_rule: # service SLA
    # Metrics value need to be long, double or int
    metrics-name: service_sla
    op: "<"
    threshold: 8000
    # The length of time to evaluate the metrics
    period: 10
    # How many times after the metrics match the condition, will trigger alarm
    count: 2
    # How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
    silence-period: 3
    message: Successful rate of service {name} is lower than 80% in 2 minutes of last 10 minutes
  service_resp_time_percentile_rule:
    # Metrics value need to be long, double or int
    metrics-name: service_percentile
    op: ">"
    threshold: 1000,1000,1000,1000,1000
    period: 10
    count: 3
    silence-period: 5
    message: Percentile response time of service {name} alarm in 3 minutes of last 10 minutes, due to more than one condition of p50 > 1000, p75 > 1000, p90 > 1000, p95 > 1000, p99 > 1000
  service_instance_resp_time_rule: # service instance response time
    metrics-name: service_instance_resp_time
    op: ">"
    threshold: 1000
    period: 10
    count: 2
    silence-period: 5
    message: Response time of service instance {name} is more than 1000ms in 2 minutes of last 10 minutes
  database_access_resp_time_rule:
    metrics-name: database_access_resp_time
    threshold: 1000
    op: ">"
    period: 10
    count: 2
    message: Response time of database access {name} is more than 1000ms in 2 minutes of last 10 minutes
  endpoint_relation_resp_time_rule: # endpoint relation response time
    metrics-name: endpoint_relation_resp_time
    threshold: 1000
    op: ">"
    period: 10
    count: 2
    message: Response time of endpoint relation {name} is more than 1000ms in 2 minutes of last 10 minutes
#  Active endpoint related metrics alarm will cost more memory than service and service instance metrics alarm.
#  Because the number of endpoint is much more than service and instance.
#
#  endpoint_avg_rule:
#    metrics-name: endpoint_avg
#    op: ">"
#    threshold: 1000
#    period: 10
#    count: 2
#    silence-period: 5
#    message: Response time of endpoint {name} is more than 1000ms in 2 minutes of last 10 minutes

webhooks: # callback URLs invoked when an alarm fires; integration with DingTalk, WeChat, or email is implemented in the webhook service
  - http://skywalking-webhook-service:8080/skywalking/alarm

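Alarm rule configuration items:

  • **Rule name:** the rule's name, which is also the unique name shown in alarm messages. It must end with _rule; the prefix is up to you
  • **Metrics name:** the metric name, taken from the metric names in the OAL script. Only long, double, and int types are currently supported. See the Official OAL script
  • **Include names:** the entity names the rule applies to, such as service names or endpoint names (optional, defaults to all)
  • **Exclude names:** the entity names the rule does not apply to, such as service names or endpoint names (optional, defaults to none)
  • **Threshold:** the threshold value
  • **OP:** the operator; >, <, and = are currently supported
  • **Period:** how far back the rule looks each time it is checked. This is a time window, aligned with the clock of the backend deployment
  • **Count:** within one Period window, an alarm is sent once the number of values that cross the Threshold (according to OP) reaches Count
  • **Silence period:** after an alarm fires at time TN, no further alarm is sent during TN -> TN + period. By default it equals Period, meaning the same alarm (same Id under the same Metrics name) fires at most once per Period
  • **message:** the alarm message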
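
How Period, Count, and Silence period interact is easiest to see in code. The Python sketch below is a simplified model that assumes one metric value arrives per minute. It illustrates the semantics described above and is not the OAP server's implementation; the class and method names are hypothetical.

```python
import operator
from collections import deque

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


class AlarmRule:
    def __init__(self, op, threshold, period, count, silence_period=None):
        self.matches = OPS[op]
        self.threshold = threshold
        self.count = count
        # Silence period defaults to the same length as the period.
        self.silence_period = period if silence_period is None else silence_period
        self.window = deque(maxlen=period)  # one metric value per minute
        self.silence_left = 0

    def add(self, value):
        """Feed one per-minute value; return True when an alarm should be sent."""
        self.window.append(value)
        if self.silence_left > 0:
            self.silence_left -= 1
            return False
        hits = sum(1 for v in self.window if self.matches(v, self.threshold))
        if hits >= self.count:
            self.silence_left = self.silence_period
            return True
        return False
```

With `AlarmRule(">", 1000, period=10, count=3, silence_period=5)`, the alarm fires on the third value above 1000 inside the 10-minute window and then stays quiet for the next five values, even if they are all above the threshold.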

SkyWalking sends alarm messages as HTTP requests, using the POST method with Content-Type application/json. The JSON body is serialized from List<org.apache.skywalking.oap.server.core.alarm.AlarmMessage>. Sample JSON:

[{
    "scopeId": 1,
    "scope": "SERVICE",
    "name": "serviceA",
    "id0": 12,
    "id1": 0,
    "ruleName": "service_resp_time_rule",
    "alarmMessage": "alarmMessage xxxx",
    "startTime": 1560524171000
}, {
    "scopeId": 1,
    "scope": "SERVICE",
    "name": "serviceB",
    "id0": 23,
    "id1": 0,
    "ruleName": "service_resp_time_rule",
    "alarmMessage": "alarmMessage yyy",
    "startTime": 1560524171000
}]

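Field descriptions:

  • **scopeId, scope:** all available scopes are defined in org.apache.skywalking.oap.server.core.source.DefaultScopeDefine
  • **name:** the entity name of the target scope
  • **id0:** the ID of the scope entity
  • **id1:** reserved, currently unused
  • **ruleName:** the alarm rule name
  • **alarmMessage:** the alarm message text
  • **startTime:** the time of the alarm, as a timestamp in milliseconds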
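
A webhook receiver only has to accept that POST and read these fields. The following is a minimal Python sketch using only the standard library, assuming the payload shape of the sample JSON above. The names format_alarms, AlarmHandler, and serve are hypothetical, and the port is an arbitrary choice.

```python
import json
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer


def format_alarms(body: bytes) -> list:
    """Turn the POSTed JSON array into one readable line per alarm."""
    lines = []
    for alarm in json.loads(body):
        # startTime is in milliseconds since the epoch.
        when = datetime.fromtimestamp(alarm["startTime"] / 1000, tz=timezone.utc)
        lines.append(f"[{when:%Y-%m-%d %H:%M:%S} UTC] {alarm['scope']} "
                     f"{alarm['name']} ({alarm['ruleName']}): {alarm['alarmMessage']}")
    return lines


class AlarmHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for line in format_alarms(self.rfile.read(length)):
            print(line)  # a real receiver would forward this to a chat tool
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block and handle alarm callbacks on the given port."""
    HTTPServer(("0.0.0.0", port), AlarmHandler).serve_forever()
```

Calling `serve()` starts the receiver; its address is then what goes under `webhooks` in the alarm settings.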

Screenshot of a DingTalk alarm message

swalarm1.png

4. Deploying the webhook server

The webhook server is a Java project taken from GitHub.

Building the Docker image

FROM openjdk:8u92-alpine

ENV SKYWALKING_WORK_SPACE=/skywalking \
    APP_NAME=skywalking-webhook-dingding-talk.jar \
    SKYWALKING_WEBHOOK_CONFIG_DIR=/skywalking/config
RUN mkdir -p ${SKYWALKING_WORK_SPACE}/webhook && \
    mkdir ${SKYWALKING_WEBHOOK_CONFIG_DIR} 
COPY ${APP_NAME} ${SKYWALKING_WORK_SPACE}/webhook
COPY start.sh ${SKYWALKING_WORK_SPACE}
RUN chmod 775 ${SKYWALKING_WORK_SPACE}/start.sh && \
    chmod 775 ${SKYWALKING_WORK_SPACE}/webhook/${APP_NAME}

WORKDIR ${SKYWALKING_WORK_SPACE}
EXPOSE 8080
CMD ["/skywalking/start.sh"]

Deploying the webhook on Kubernetes

apiVersion: v1
kind: ConfigMap
metadata:
  name: dingtalk-configmap
  namespace: skywalking
data:
  application.properties: |-
    server.port=8080
    dingtalk.webhook=https://oapi.dingtalk.com/robot/send?access_token=d22a21469b4acd900xxxxxx
    dingtalk.secret=SEC8d8ccc523755feef0xxxxxxxxx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: skywalking-webhook
  name: skywalking-webhook-dingdingtalk
  namespace: skywalking
spec:
  replicas: 1
  selector:
    matchLabels:
      app: skywalking-webhook
  template:
    metadata:
      labels:
        app: skywalking-webhook
    spec:
      containers:
      - name: skywalking-webhook
        image: hub.deepsight.cloud/skywalking/skywalking-webhook-dingtalk:v0.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        volumeMounts:
        - mountPath: /skywalking/config
          name: dingtalk-volume
      volumes:
      - name: dingtalk-volume
        configMap:
          name: dingtalk-configmap

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: skywalking-webhook
  name: skywalking-webhook-service
  namespace: skywalking
spec:
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: skywalking-webhook
  type: ClusterIP
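
The dingtalk.secret value in the ConfigMap is what a DingTalk custom robot with signing enabled expects the sender to use when signing each request. As an illustration of what the Java webhook has to do with it, here is a Python sketch of the signing step as the author of this sketch understands DingTalk's scheme (HMAC-SHA256 over the timestamp and secret, Base64-encoded, then URL-encoded). Treat the details as an assumption to verify against DingTalk's documentation; the function name is hypothetical.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def signed_webhook_url(webhook: str, secret: str, timestamp_ms=None) -> str:
    """Append timestamp and sign query parameters to a DingTalk robot webhook URL."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # The string to sign is "<timestamp>\n<secret>", keyed with the secret itself.
    string_to_sign = f"{timestamp_ms}\n{secret}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(digest))
    return f"{webhook}&timestamp={timestamp_ms}&sign={sign}"
```

Because the secret is used verbatim as the HMAC key, any stray whitespace around it in application.properties would produce a signature the robot rejects.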

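5. UI metrics explained

CPM: calls per minute, the number of requests handled each minute

SLA: service level agreement. In the SkyWalking UI it is shown as the percentage of requests that succeed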
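
Both figures are simple ratios. The Python sketch below computes them from a list of request outcomes. It assumes, based on the sample alarm rule where a threshold of 8000 corresponds to 80%, that SLA values are expressed as percent times 100; the function name is hypothetical.

```python
def cpm_and_sla(outcomes, minutes):
    """outcomes: non-empty list of booleans, True for a successful call.
    Returns (calls per minute, SLA as percent * 100)."""
    total = len(outcomes)
    cpm = total / minutes
    # Integer arithmetic avoids float rounding: 8 of 10 successes -> 8000.
    sla = sum(outcomes) * 10000 // total
    return cpm, sla
```

For instance, 10 calls over 2 minutes with 8 successes gives a CPM of 5.0 and an SLA value of 8000, that is 80%.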

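Percentiles: the P50, P90, and P95 figures in SkyWalking are percentiles.

Definition: in a data set, if a given sample value has p percent of the data below it, that value is the p-th percentile.

Example: the whole company takes a test and 80 percent of people score below 60; for the company's set of test scores, the 80th percentile is then 60.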
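
The definition can be tried out on a handful of response times. The Python sketch below uses the nearest-rank method on raw samples, which is a simple textbook approach chosen for illustration and not necessarily how SkyWalking computes its percentile metrics; the function name and the sample latencies are made up.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = [20, 30, 40, 50, 60, 60, 60, 550, 550, 900]
p50 = percentile(latencies_ms, 50)  # fifth smallest of ten samples
p90 = percentile(latencies_ms, 90)  # ninth smallest of ten samples
```

A large gap between P50 and P90, as in this made-up data, means most requests are fast while a minority are much slower, which an average alone would hide.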

Reading the chart: in the figure below, the statistics reported by the agents at 14:56 on July 22 show that 50% of requests completed in under 60 ms, 75% in under 60 ms, 90% in under 550 ms, 95% in under 550 ms, and 99% in under 550 ms.

swuip.png
