
Prometheus + Grafana (Part 5): Monitoring Kubernetes

July 22, 2021

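Monitoring metrics
Kubernetes itself
• Node resource utilization
• Number of Nodes
• Number of Pods (per Node)
• Resource object status
Pod monitoring
• Number of Pods (per project)
• Container resource utilization
• Application metrics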

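Metric                        Implementation        Example
Node resource utilization     node-exporter         Node CPU and memory utilization
Pod resource utilization      cAdvisor              Container CPU and memory utilization
K8s resource object status    kube-state-metrics    Pod/Deployment/Service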

Overall architecture diagram: (image omitted)

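Kubernetes-based service discovery types:

Role         Description
node         Discovers the cluster's nodes; the target address defaults to the kubelet's HTTP port
service      Discovers every Service and its ports as targets
pod          Discovers all Pods as targets
endpoints    Discovers Pods as targets from the Endpoints listed for each Service
ingress      Discovers a target for each Ingress path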

Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
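
To make the role types concrete, here is a minimal sketch of a scrape job that picks one role. The job name and the namespace filter are made up for illustration. Because api_server is omitted, Prometheus would assume it is running inside the cluster; the jobs later in this post set api_server explicitly because Prometheus runs outside it.

```yaml
scrape_configs:
  - job_name: example-sd-role            # hypothetical job name
    kubernetes_sd_configs:
      - role: endpoints                  # one of: node, service, pod, endpoints, ingress
        namespaces:
          names:
            - default                    # hypothetical filter: only discover targets in this namespace
```

Switching the role value changes which objects become targets and which __meta_kubernetes_* labels are available for relabeling.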

Steps for monitoring Pods in a K8s cluster:

1. K8s RBAC authorization

kubectl apply -f rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  - "networking.k8s.io"
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system

2. Get the token and save it to a file

kubectl get serviceaccount -n kube-system | grep prometheus
kubectl get sa prometheus -n kube-system -o yaml

kubectl describe secret prometheus-token-xxx -n kube-system

Remember to copy the token value into a new file (token.k8s in the next step).
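
On Kubernetes 1.24 and later, token Secrets are no longer created automatically for a ServiceAccount, so the describe command above may find nothing. A sketch of one way around that is to create the token Secret explicitly. The Secret name prometheus-token is my own choice here, not something from the original setup.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-token                 # hypothetical name
  namespace: kube-system
  annotations:
    # Binds this Secret to the ServiceAccount created in step 1
    kubernetes.io/service-account.name: prometheus
type: kubernetes.io/service-account-token
```

After applying it, the control plane fills in the token field, and kubectl describe secret on that name shows the value to copy.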

3. Copy the token to the Prometheus node

scp token.k8s root@192.168.0.25:/opt/monitor/prometheus

4. Create the jobs and kubernetes_sd_configs

vim /opt/monitor/prometheus/prometheus.yml
......

  - job_name: kubernetes-nodes-cadvisor
    metrics_path: /metrics
    scheme: https
    kubernetes_sd_configs:
    - role: node
      api_server: https://192.168.2.233:8443
      bearer_token_file: /opt/monitor/prometheus/token.k8s
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: /opt/monitor/prometheus/token.k8s
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    # Copy each node label (the part matched by (.*)) to a label of the same name, keeping its value
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.*)
    # Rewrite the target address from NodeIP:10250 to the API server address (192.168.2.233:8443 here)
    - action: replace
      regex: (.*)
      source_labels: ["__address__"]
      target_label: __address__
      replacement: 192.168.2.233:8443
    # The real metrics endpoint is https://NodeIP:10250/metrics/cadvisor, which in this setup only the
    # API server can reach, so rewrite the path to go through the API server's node proxy
    - action: replace
      source_labels: [__meta_kubernetes_node_name]
      target_label: __metrics_path__
      regex: (.*)
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
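
The same API server proxy trick should also work for the kubelet's own metrics, since the ClusterRole above grants nodes/proxy. The following is only a sketch under that assumption: the job name is made up, and the only real difference from the cAdvisor job is the proxied path (/proxy/metrics instead of /proxy/metrics/cadvisor).

```yaml
  - job_name: kubernetes-nodes-kubelet   # hypothetical job name
    scheme: https
    kubernetes_sd_configs:
    - role: node
      api_server: https://192.168.2.233:8443
      bearer_token_file: /opt/monitor/prometheus/token.k8s
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: /opt/monitor/prometheus/token.k8s
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    # Send every scrape to the API server instead of NodeIP:10250
    - action: replace
      target_label: __address__
      replacement: 192.168.2.233:8443
    # Proxy to the kubelet's /metrics endpoint on the discovered node
    - action: replace
      source_labels: [__meta_kubernetes_node_name]
      target_label: __metrics_path__
      regex: (.+)
      replacement: /api/v1/nodes/${1}/proxy/metrics
```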

  - job_name: kubernetes-service-endpoints
    kubernetes_sd_configs:
    - role: endpoints
      api_server: https://192.168.2.233:8443
      bearer_token_file: /opt/monitor/prometheus/token.k8s
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: /opt/monitor/prometheus/token.k8s
    tls_config:
      insecure_skip_verify: true
    # Skip Services that are not annotated with prometheus.io/scrape: "true"
    relabel_configs:
    - action: keep
      regex: true
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_scrape
    # Set the scrape scheme (http or https) from the prometheus.io/scheme annotation
    - action: replace
      regex: (https?)
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_scheme
      target_label: __scheme__
    # Set the metrics path from the prometheus.io/path annotation
    - action: replace
      regex: (.+)
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_path
      target_label: __metrics_path__
    # Set the target address to <host>:<port>, taking the port from the prometheus.io/port annotation
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      source_labels:
      - __address__
      - __meta_kubernetes_service_annotation_prometheus_io_port
      target_label: __address__
    # Copy each K8s Service label (the part matched by (.+)) to a label of the same name, keeping its value
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    # Add a namespace label
    - action: replace
      source_labels:
      - __meta_kubernetes_namespace
      target_label: kubernetes_namespace
    # Add a Service name label
    - action: replace
      source_labels:
      - __meta_kubernetes_service_name
      target_label: kubernetes_service_name
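
For a Service to be picked up by the kubernetes-service-endpoints job, it needs the annotations that the relabel rules read. A minimal sketch follows; the Service name, selector, and port 8080 are invented for illustration, and only the four prometheus.io/* annotation keys come from the job above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-app                         # hypothetical Service
  namespace: default
  annotations:
    prometheus.io/scrape: "true"         # matched by the keep rule
    prometheus.io/scheme: "http"         # copied into __scheme__
    prometheus.io/path: "/metrics"       # copied into __metrics_path__
    prometheus.io/port: "8080"           # combined with the endpoint IP into __address__
spec:
  selector:
    app: demo-app
  ports:
  - name: http
    port: 8080
    targetPort: 8080
```

Annotation values must be quoted strings, otherwise the API server rejects "true" and "8080" as non-string values.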

  - job_name: kubernetes-pods
    kubernetes_sd_configs:
    - role: pod
      api_server: https://192.168.2.233:8443
      bearer_token_file: /opt/monitor/prometheus/token.k8s
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: /opt/monitor/prometheus/token.k8s
    tls_config:
      insecure_skip_verify: true
    # Skip Pods that are not annotated with prometheus.io/scrape: "true"
    relabel_configs:
    - action: keep
      regex: true
      source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
    # Set the metrics path from the prometheus.io/path annotation
    - action: replace
      regex: (.+)
      source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
      target_label: __metrics_path__
    # Set the target address to <host>:<port>, taking the port from the prometheus.io/port annotation
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      source_labels:
      - __address__
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      target_label: __address__
    # Copy each K8s Pod label (the part matched by (.+)) to a label of the same name, keeping its value
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    # Add a namespace label
    - action: replace
      source_labels:
      - __meta_kubernetes_namespace
      target_label: kubernetes_namespace
    # Add a Pod name label
    - action: replace
      source_labels:
      - __meta_kubernetes_pod_name
      target_label: kubernetes_pod_name
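
The kubernetes-pods job reads its annotations from the Pod itself, so for a Deployment they belong on the Pod template rather than on the Deployment's own metadata. A minimal sketch, with a made-up application name, image, and port:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                         # hypothetical workload
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:                       # Pod-level annotations read by the kubernetes-pods job
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
    spec:
      containers:
      - name: demo-app
        image: example.com/demo-app:latest   # placeholder image
        ports:
        - containerPort: 8080
```

Note that the pods job above has no rule for a scheme annotation, so these targets are scraped over the default scheme (http).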

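Restart the Prometheus server

$ systemctl restart prometheus
Or hot-reload the configuration (requires Prometheus to be started with --web.enable-lifecycle):
$ curl -X POST localhost:9090/-/reload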

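5. Import dashboards into Grafana

A few dashboard templates I recommend (Grafana dashboard IDs):

  • Cluster resource monitoring: 3119 (focuses on the concrete metric values of resources)
  • Resource status monitoring: 6417 (focuses on resource counts)
  • Node monitoring: 9276
  • Etcd monitoring: 9733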
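
Imported dashboards need a Prometheus data source to query. If one is not configured yet, Grafana's provisioning mechanism can define it from a file. This is a sketch under assumptions: the URL combines the Prometheus host from the scp step with port 9090 from the reload command, and the file would typically live under /etc/grafana/provisioning/datasources/ on a package install.

```yaml
apiVersion: 1
datasources:
  - name: Prometheus                     # the name dashboards will offer in the data source picker
    type: prometheus
    access: proxy                        # Grafana's backend issues the queries
    url: http://192.168.0.25:9090        # assumed Prometheus address
    isDefault: true
```

Grafana reads provisioning files at startup, so restart it after adding the file.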

Steps for monitoring K8s resource object status:


1. Deploy kube-state-metrics

kubectl apply -f kube-state-metrics.yaml

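Check that metrics are being exposed:

curl 10.244.2.212:8080/metrics

The address here is a Pod IP, so the Prometheus host needs some way to reach the Pod network. As a workaround, I add a static route: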

ip route add 10.244.0.0/16  via 192.168.0.48 dev ens160
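
Once the Pod network is reachable, Prometheus still needs a job that scrapes kube-state-metrics. If the Service in kube-state-metrics.yaml carries the prometheus.io/scrape annotation, the kubernetes-service-endpoints job should discover it on its own. I have not shown that manifest here, so as a fallback sketch, a static job against the Pod IP from the curl check would look like this (the job name is made up):

```yaml
  - job_name: kube-state-metrics         # hypothetical job name
    static_configs:
    - targets:
      - 10.244.2.212:8080                # Pod IP from the curl check; it changes when the Pod is rescheduled
```

Because Pod IPs are not stable, the annotation-based discovery is the more robust option for anything beyond a quick test.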

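2. Import dashboards into Grafana

  • Cluster resource monitoring: 3119 (focuses on the concrete metric values of resources)
  • Resource status monitoring: 6417 (focuses on resource counts)
  • Node monitoring: 9276
  • Etcd monitoring: 9733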

- - - The END - - -
  • Author: Mr. Tan (谭先生)
  • Copyright: when reposting this article, please credit the source.
  • Last Modified: July 27, 2021