Automatic Pod Scaling in k8s

vlambda
2021-05-04


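About HPA

The Horizontal Pod Autoscaler (HPA) controller automatically scales the number of Pods based on CPU utilization. At the interval set by the --horizontal-pod-autoscaler-sync-period startup flag of the kube-controller-manager service on the Master (default 15s), it checks the resource metrics of the target Pods, compares them with the scaling conditions in the HPA resource object, and adjusts the Pod replica count when the conditions are met.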

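How HPA Works

A Metrics Server in the Kubernetes cluster continuously collects metrics from all Pod replicas. The HPA controller fetches this data through the Metrics Server API (the Heapster API or the aggregated API), applies the user-defined scaling rules, and computes the target number of Pod replicas. When the target differs from the current replica count, the HPA controller issues a scale operation to the Pod's replica controller (Deployment, RC, or ReplicaSet), which adjusts the number of replicas and completes the scaling. The original post illustrates this flow with a figure.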
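
The replica calculation described above can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: it assumes the scaling rule documented by Kubernetes, desired = ceil(current * currentMetric / targetMetric), with the default tolerance of 0.1, and it ignores details such as pods that are not ready or have no metrics. The function name desired_replicas is made up for this example.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Simplified HPA rule: scale the replica count by the metric ratio.

    If the ratio current/target is within the tolerance around 1.0,
    the replica count is left unchanged.
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * current_value / target_value)


if __name__ == "__main__":
    # 1 replica at 20% CPU with a 10% target: scale out
    print(desired_replicas(1, 20, 10))
    # 5 replicas at 8% CPU with a 10% target: scale in
    print(desired_replicas(5, 8, 10))
```

The result still has to be limited to the minReplicas and maxReplicas set in the HPA object before it is applied.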

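Metric Types

By default, every 15 seconds the controller manager queries resource utilization for the metrics defined in the HPA, through one of the following APIs:

resource metrics API (per-pod resource metrics)
custom metrics API (other metrics)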
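
To make the two APIs more concrete, the sketch below builds the request paths under which they are typically exposed through the aggregation layer. It assumes the v1beta1 versions of the metrics.k8s.io and custom.metrics.k8s.io API groups; a given cluster may serve different versions, and the custom metrics API exists only if an adapter is installed. The helper names and the metric name http_requests are hypothetical.

```python
def resource_metrics_path(namespace):
    """Path for per-pod CPU/memory metrics (assumes metrics.k8s.io/v1beta1)."""
    return f"/apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods"


def custom_metric_path(namespace, metric_name):
    """Path for a custom metric on all pods in a namespace
    (assumes custom.metrics.k8s.io/v1beta1 and an installed adapter)."""
    return (
        f"/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace}"
        f"/pods/*/{metric_name}"
    )


if __name__ == "__main__":
    print(resource_metrics_path("default"))
    # "http_requests" is a hypothetical metric name used for illustration
    print(custom_metric_path("default", "http_requests"))
```

A path like these can be passed to kubectl get --raw to inspect what the HPA controller would see.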

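Horizontal Pod Autoscaling

The Horizontal Pod Autoscaler feature automatically scales the number of pods in a replication controller, deployment, or replica set based on CPU utilization. Besides CPU utilization, it can also scale on custom metrics provided by applications. Pod autoscaling does not apply to objects that cannot be scaled, such as DaemonSets.

The feature is implemented as a Kubernetes API resource and a controller. The resource determines the controller's behavior. The controller periodically obtains the average CPU utilization, compares it with the target value, and adjusts the number of replicas in the replication controller or deployment accordingly.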
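
The "average CPU utilization" here is measured relative to the CPU request of the pods, not to node capacity. As a rough sketch, assuming usage and requests are both given in millicores and that utilization is total usage divided by total requests across the pods, it could be computed as follows. The function name is invented for this example.

```python
def average_utilization_percent(usage_millicores, request_millicores):
    """Utilization as a percentage of the requested CPU, across all pods.

    Both arguments are lists with one entry per pod, in millicores.
    """
    total_request = sum(request_millicores)
    if total_request == 0:
        raise ValueError("pods must declare a CPU request")
    return sum(usage_millicores) * 100 // total_request


if __name__ == "__main__":
    # One pod using 8m of CPU against a 100m request
    print(average_utilization_percent([8], [100]))
```

This is why the containers in the example deployment declare resources.requests.cpu: without a request there is nothing to compute a percentage against.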

Examples

CPU-Based HPA

First, create a deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mty-production-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mty-production-api
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mty-production-api
    spec:
      containers:
      - image: harbor.ysmty.com:19999/onair/mty-production-api:202007151447-3.5.2-b9a7f09
        imagePullPolicy: IfNotPresent
        name: mty-production-api
        resources:
          limits:
            cpu: 4
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 128Mi
        volumeMounts:
        - mountPath: /usr/local/mty-production-api/logs
          name: log-pv
          subPath: mty-production-api
      imagePullSecrets:
      - name: mima
      restartPolicy: Always
      volumes:
      - name: log-pv
        persistentVolumeClaim:
          claimName: log-pv

Apply this YAML file and the deployment's pod starts up. At this point only one pod should be running. Next, create an HPA that scales on CPU:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
  namespace: default
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mty-production-api
  targetCPUUtilizationPercentage: 10
status:
  currentReplicas: 1
  desiredReplicas: 0

Then apply this YAML file and check the hpa resource:

# kubectl get hpa
NAME       REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/mty-production-api   8%/10%    1         5         5          28m

Run a quick load test with a simple benchmarking tool:

ab -n 10000 -c 10 http://172.17.58.255:8080/api/healthy/check

Then check the number of pods again:

# kubectl get pod | grep mty-production-api
mty-production-api-596dfc85c4-599xj               1/1     Running       0          28m
mty-production-api-596dfc85c4-922p4               1/1     Running       0          27m
mty-production-api-596dfc85c4-b6zcx               1/1     Running       0          27m
mty-production-api-596dfc85c4-cqdz2               1/1     Running       0          12d
mty-production-api-596dfc85c4-fmk5w               1/1     Running       0          27m

Five pods are now running, which shows that the hpa has taken effect. Check the hpa details:

# kubectl describe hpa hpa-demo
Name:                                                  hpa-demo
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp:                                     Mon, 03 Aug 2020 23:20:50 +0800
Reference:                                             Deployment/mty-production-api
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  8% (8m) / 10%
Min replicas:                                          1
Max replicas:                                          5
Deployment pods:                                       5 current / 5 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  29m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target

Stop the load test. After a while, the number of pods should drop back to one.
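
The delay before scaling in comes from scale-down stabilization, which is what the ScaleDownStabilized condition in the describe output refers to: the controller remembers recent recommendations and applies the highest one. A minimal sketch of that idea, assuming the default window of 5 minutes (the --horizontal-pod-autoscaler-downscale-stabilization flag of kube-controller-manager), might look like this. The class name is hypothetical and the real controller is more involved.

```python
from collections import deque


class ScaleDownStabilizer:
    """Keeps recent replica recommendations and returns the highest one."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommendation) pairs

    def stabilize(self, now, recommendation):
        self._history.append((now, recommendation))
        # Drop recommendations that are older than the window
        while now - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        return max(value for _, value in self._history)


if __name__ == "__main__":
    s = ScaleDownStabilizer()
    print(s.stabilize(0, 5))    # under load: 5 replicas recommended
    print(s.stabilize(60, 1))   # load stopped, but the 5 is still in the window
    print(s.stabilize(301, 1))  # the 5 has aged out of the window
```

With these inputs, the pod count only drops once the last high recommendation is more than five minutes old.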

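Memory-Based HPA

The current stable version, autoscaling/v1, only supports scaling on CPU. autoscaling/v2beta2 supports scaling on memory and custom metrics, so we use that API version for this test.

To make testing easier, prepare a script that consumes memory and mount it into the container through a configmap: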

# cat increase-mem.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: increase-mem-config
data:
  increase-mem.sh: |
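    #!/bin/bash
    # Mount a 40M tmpfs; pages written to it count as memory usage
    mkdir /tmp/memory
    mount -t tmpfs -o size=40M tmpfs /tmp/memory
    # Fill the tmpfs with zeros; dd stops once the 40M are used up
    dd if=/dev/zero of=/tmp/memory/block
    # Hold the memory for 60 seconds, then release it and clean up
    sleep 60
    rm /tmp/memory/block
    umount /tmp/memory
    rmdir /tmp/memory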

Modify the earlier deployment:

# cat app-dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mty-production-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mty-production-api
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mty-production-api
    spec:
      containers:
      - image: harbor.ysmty.com:19999/onair/mty-production-api:202007151447-3.5.2-b9a7f09
        imagePullPolicy: IfNotPresent
        name: mty-production-api
        resources:
          limits:
            cpu: 4
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 128Mi
        volumeMounts:
        - mountPath: /usr/local/mty-production-api/logs
          name: log-pv
          subPath: mty-production-api
        - name: increase-mem-script
          mountPath: /etc/script
        securityContext:
          privileged: true
      imagePullSecrets:
      - name: mima
      restartPolicy: Always
      volumes:
      - name: log-pv
        persistentVolumeClaim:
          claimName: log-pv
      - name: increase-mem-script
        configMap:
          name: increase-mem-config

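Note that the script is mounted into the container, and that the container has to run in privileged mode because the script calls mount. Next, write a memory-based HPA: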

# cat hpa-mem.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mty-production-api
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60

Check the HPA and the pods:

# kubectl get hpa
NAME        REFERENCE                       TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa   Deployment/mty-production-api   463%/60%   1         5         5          6m44s

The pods have scaled out accordingly:

# kubectl get pod -o wide | grep mty
mty-production-api-66957fdcd6-dwzhf               1/1     Running            0          7m15s   172.17.135.143   k8s-node03   <none>           <none>
mty-production-api-66957fdcd6-mvftq               1/1     Running            0          7m15s   172.17.58.222    k8s-node02   <none>           <none>
mty-production-api-66957fdcd6-p455s               1/1     Running            0          7m15s   172.17.85.194    k8s-node01   <none>           <none>
mty-production-api-66957fdcd6-vcj4d               1/1     Running            0          9m23s   172.17.85.202    k8s-node01   <none>           <none>
mty-production-api-66957fdcd6-xktk4               1/1     Running            0          6m59s   172.17.135.129   k8s-node03   <none>           <none>
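
The numbers in the kubectl get hpa output above make a useful worked example of why the replica count stops at 5 even though memory utilization is far above the target. The sketch below assumes the documented rule desired = ceil(current * currentMetric / targetMetric) followed by limiting the result to minReplicas and maxReplicas. The function name bounded_replicas is invented for this example.

```python
import math


def bounded_replicas(current_replicas, current_percent, target_percent,
                     min_replicas, max_replicas):
    """Raw recommendation from the metric ratio, limited to [min, max]."""
    raw = math.ceil(current_replicas * current_percent / target_percent)
    return max(min_replicas, min(raw, max_replicas))


if __name__ == "__main__":
    # 5 replicas at 463% memory utilization, target 60%, bounds 1..5
    print(bounded_replicas(5, 463, 60, min_replicas=1, max_replicas=5))
```

The raw recommendation here is 39 replicas, so the HPA stays pinned at maxReplicas until utilization falls.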
