Kubernetes Application Update Strategies: How to Implement Gray (Canary) Releases
How do you do a blue-green deployment in production, and what is blue-green deployment?

In a blue-green deployment there are two complete systems: the one currently serving traffic, marked "green", and the one about to be released, marked "blue". Both are fully functional, running systems; they differ only in version and in whether they face users. The old system serving users is the green one; the newly deployed system is the blue one. The blue system does not serve users — so what is it for? Pre-release testing. Any problem found during testing can be fixed directly on the blue system without disturbing the system users are on. (Note: only when the two systems share no coupling can zero interference be fully guaranteed.) After repeated testing, fixing, and verification, once the blue system meets the release bar, users are switched over to it. For a while after the switch the two systems still coexist, but users are now hitting the blue one. During this window you watch how the blue (new) system behaves; if problems appear, you switch straight back to green. The old green system can then be destroyed, freeing its resources for the next blue deployment.

Advantages and disadvantages of blue-green deployment

Advantages:
1. The update needs no downtime, so the risk is low.
2. Rollback is easy: just change the routing or switch the DNS record, which is fast.

Disadvantages:
1. Cost is high, since two full environments must be deployed; and if a base service in the new version breaks, or the new version itself has a bug, all users are affected at once.
2. Two sets of machines mean a large expense.
3. Operating on non-isolated machines (Docker, VMs) risks destroying the blue or green environment.
4. If the load balancer / reverse proxy / routing / DNS is handled badly, traffic may fail to actually switch over.

Lab 1: blue-green deployment of an online service with k8s

The image tarballs needed below ship with the courseware ("lan" is 蓝, blue; "lv" is 绿, green). Upload them to every k8s worker node and load them with docker load -i:

docker load -i myapp-lan.tar.gz
docker load -i myapp-lv.tar.gz

Kubernetes has no built-in blue-green support. The best approach today is to create a new deployment and then update the application's service to point at the new deployment.

1. Create the blue environment (the new release, which will replace green). Run the following steps on the k8s control node:

kubectl create ns blue-green
cat lan.yaml

Then create the deployment with kubectl:

kubectl apply -f lan.yaml
kubectl get pods -n blue-green

The output shows three pods, all 1/1 Running with 0 restarts (age 53s).

2. Create the green environment (the existing deployment):

cat lv.yaml
kubectl apply -f lv.yaml

Create the front-end service:

cat service_lanlv.yaml
kubectl apply -f service_lanlv.yaml

Browse to the k8s-master node IP on port 30062 to see the current version.

To cut over, edit the service's configuration file and change its label selector so it matches the blue app (the upgraded version):

cat service_lanlv.yaml
kubectl apply -f service_lanlv.yaml

Browse to the k8s-master node IP on port 30062 again to see the new version.

When the experiment is done, delete the resources so they do not interfere with later labs:

kubectl delete -f lan.yaml
kubectl delete -f lv.yaml
kubectl delete -f service_lanlv.yaml

Rolling updates with k8s: flow and strategy

What a rolling update is, and how to do one in k8s. First look at what a Deployment resource object consists of:

kubectl explain deployment
kubectl explain deployment.spec

KIND:     Deployment
VERSION:  apps/v1
RESOURCE: spec <Object>
DESCRIPTION:
     Specification of the desired behavior of the Deployment.
     DeploymentSpec is the specification of the desired behavior of the
     Deployment.
FIELDS:
   minReadySeconds      <integer>
     Minimum number of seconds for which a newly created pod should be ready
     without any of its container crashing, for it to be considered available.
     Defaults to 0 (pod will be
     considered available as soon as it is ready)

   paused       <boolean>
     Indicates that the deployment is paused.
     # Paused: pod creation pauses during an update instead of proceeding immediately

   progressDeadlineSeconds      <integer>
     The maximum time in seconds for a deployment to make progress before it is
     considered to be failed. The deployment controller will continue to process
     failed deployments and a condition with a ProgressDeadlineExceeded reason
     will be surfaced in the deployment status. Note that progress will not be
     estimated during the time a deployment is paused. Defaults to 600s.

   replicas     <integer>
     Number of desired pods. This is a pointer to distinguish between explicit
     zero and not specified. Defaults to 1.

   revisionHistoryLimit <integer>
     # number of old revisions kept for rollback; the default is 10
     The number of old ReplicaSets to retain to allow rollback. This is a
     pointer to distinguish between explicit zero and not specified. Defaults to
     10.

   selector     <Object> -required-
     Label selector for pods. Existing ReplicaSets whose pods are selected by
     this will be the ones affected by this deployment. It must match the pod
     labels.

   strategy     <Object>
     # the update strategy; rolling update is supported
     The deployment strategy to use to replace existing pods with new ones.

   template     <Object> -required-
     Template describes the pods that will be created.

kubectl explain deploy.spec.strategy

KIND:     Deployment
VERSION:  apps/v1
RESOURCE: strategy <Object>
DESCRIPTION:
     The deployment strategy to use to replace existing pods with new ones.
     DeploymentStrategy describes how to replace existing pods with new ones.
FIELDS:
   rollingUpdate        <Object>
     Rolling update config params. Present only if DeploymentStrategyType =
     RollingUpdate.

   type <string>
     Type of deployment. Can be "Recreate" or "RollingUpdate". Default is
     RollingUpdate.
     # Two update types are supported, Recreate and RollingUpdate.
     # Recreate is a recreate-style update: all existing pods are deleted before new ones are created.
     # RollingUpdate is a rolling update; it defines how the rollout proceeds — how many
     # pods above or below the desired count are allowed — i.e. the granularity of the update.

kubectl explain deploy.spec.strategy.rollingUpdate

KIND:     Deployment
VERSION:  apps/v1
RESOURCE: rollingUpdate <Object>
DESCRIPTION:
     Rolling update config params.
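Before drilling into the rollingUpdate parameters, it helps to see where these fields sit in a manifest. The sketch below is illustrative only — the name `demo` and its labels are assumptions, not taken from the lab files:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo                     # illustrative name, not from the lab manifests
spec:
  replicas: 3
  revisionHistoryLimit: 10       # keep 10 old ReplicaSets for rollback
  selector:
    matchLabels:
      app: demo
  strategy:
    type: RollingUpdate          # or Recreate
    rollingUpdate:
      maxSurge: 1                # at most 1 pod above the desired count
      maxUnavailable: 0          # never drop below the desired count
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: janakiramm/myapp:v1
```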
     Present only if DeploymentStrategyType = RollingUpdate. Spec to control
     the desired behavior of rolling update.
FIELDS:
   maxSurge     <string>
     The maximum number of pods that can be scheduled above the desired number
     of pods. Value can be an absolute number (ex: 5) or a percentage of desired
     pods (ex: 10%). This can not be 0 if MaxUnavailable is 0. Absolute number
     is calculated from percentage by rounding up. Defaults to 25%. Example:
     when this is set to 30%, the new ReplicaSet can be scaled up immediately
     when the rolling update starts, such that the total number of old and new
     pods do not exceed 130% of desired pods. Once old pods have been killed,
     new ReplicaSet can be scaled up further, ensuring that total number of pods
     running at any time during the update is at most 130% of desired pods.
     # How many pods above the desired replica count the update may run at once.
     # Two forms: an absolute number, or a percentage of the desired count — with
     # 5 replicas, 20% allows one extra pod and 40% allows two extra.

   maxUnavailable       <string>
     The maximum number of pods that can be unavailable during the update. Value
     can be an absolute number (ex: 5) or a percentage of desired pods (ex:
     10%). Absolute number is calculated from percentage by rounding down. This
     can not be 0 if MaxSurge is 0. Defaults to 25%. Example: when this is set
     to 30%, the old ReplicaSet can be scaled down to 70% of desired pods
     immediately when the rolling update starts.
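The rounding described here (maxSurge rounds up, maxUnavailable rounds down) can be checked with a little shell integer arithmetic — a sketch using the 30% example with 5 replicas:

```shell
#!/bin/sh
replicas=5
pct=30   # 30%, as in the example above

# maxSurge: percentage of desired pods, rounded UP (ceiling)
max_surge=$(( (replicas * pct + 99) / 100 ))

# maxUnavailable: percentage of desired pods, rounded DOWN (floor)
max_unavailable=$(( replicas * pct / 100 ))

echo "maxSurge=${max_surge} maxUnavailable=${max_unavailable}"
# prints: maxSurge=2 maxUnavailable=1
```

So with 5 replicas and 30%/30%, up to 7 pods may run at once during the update (130%), and at least 4 must stay available (70%).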
     Once new pods are ready, old
     ReplicaSet can be scaled down further, followed by scaling up the new
     ReplicaSet, ensuring that the total number of pods available at all times
     during the update is at least 70% of desired pods.
     # How many pods may be unavailable at once. With 5 replicas and a
     # maxUnavailable of 1, at least 4 pods stay available.

A Deployment is a three-level structure: the Deployment controls a ReplicaSet, and the ReplicaSet controls the pods.

cat deploy-demo.yaml

Apply the resource manifest:

kubectl apply -f deploy-demo.yaml

Check the deploy's status:

kubectl get deploy -n blue-green

NAME       READY   UP-TO-DATE   AVAILABLE   AGE
myapp-v1   2/2     2            2           60s

kubectl get rs -n blue-green

NAME                  DESIRED   CURRENT   READY   AGE
myapp-v1-67fd9fc9c8   2         2         2       2m35s

The random string in the ReplicaSet name is a hash of the pod template referenced by the Deployment.

kubectl get pods -n blue-green shows two pods, both 1/1 Running with 0 restarts (age 3m23s).

cat deploy-demo.yaml — edit the replicas count directly, changing it to 3:

spec:
  replicas: 3

Save, exit, and run:

kubectl apply -f deploy-demo.yaml

Note: apply is not the same as create. apply can be run any number of times; create runs once, and running it again fails because the resource already exists.

kubectl get pods -n blue-green now shows three Running pods (two aged 8m18s, one aged 18s).

# View the myapp-v1 controller in detail
kubectl describe deploy myapp-v1 -n blue-green

# Output:
Name:                   myapp-v1
Namespace:              blue-green
CreationTimestamp:      Sun, 21 Mar 2021 18:46:52 +0800
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=myapp,version=v1
Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType:           RollingUpdate        # the default update strategy, rollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
# at most 25% extra pods; 25% of 3 is less than one pod, so it rounds up to one
Pod Template:
  Labels:  app=myapp
           version=v1
  Containers:
   myapp:
    Image:        janakiramm/myapp:v1
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   myapp-v1-67fd9fc9c8 (3/3 replicas created)
Events:
  Type    Reason             Age                 From                   Message
  ----    ------             ----                ----                   -------
  Normal  ScalingReplicaSet  3m26s               deployment-controller  Scaled down replica set myapp-v1-67fd9fc9c8 to 2
  Normal  ScalingReplicaSet  2m1s (x2 over 10m)  deployment-controller
  Scaled up replica set myapp-v1-67fd9fc9c8 to 3

Example: testing a rolling update

In one terminal, run:

kubectl get pods -l app=myapp -n blue-green -w

Open a new terminal window and change the image version:

vim deploy-demo.yaml
# change image: janakiramm/myapp:v1 to image: janakiramm/myapp:v2

kubectl apply -f deploy-demo.yaml

Back in the watching terminal, the pods cycle through Pending, ContainerCreating, and Running, while old pods go Terminating: a new pod comes up; once it is Running, an old pod is terminated; and so on until every pod has completed the rolling upgrade.

In the other window run:

kubectl get rs -n blue-green

There are now two ReplicaSets: the new one with 3/3 ready (age 2m7s) and the old one scaled to 0 (age 25m).

kubectl rollout history deployment myapp-v1 -n blue-green

This shows the rollout history of the myapp-v1 controller:

REVISION  CHANGE-CAUSE
1         <none>
2         <none>

(kubectl rollout undo rolls back to an earlier revision.)

Customizing the rolling update strategy

maxSurge and maxUnavailable control the rollout policy.

As absolute values:
maxUnavailable: [0, replicas]
maxSurge: [0, replicas]
Note: the two must not both be 0.

As percentages:
maxUnavailable: [0%, 100%], rounded down — with 10 replicas, 5% is 0.5 pods, counted as 0;
maxSurge: [0%, 100%], rounded up — with 10 replicas, 5% is 0.5 pods, counted as 1.
Note: again, the two must not both be 0.

Recommended configuration:

maxUnavailable == 0
maxSurge == 1

This is the default configuration we give users in production: "one up, one down; up first, then down" — the smoothest principle, destroying an old pod only after the new one is ready (in combination with readiness probes). It suits smooth, stable updates; its drawback is that it is slow.

In summary:
maxUnavailable: relative to the desired replica count, the maximum proportion (or number) of unavailable pods — the smaller it is, the more stable the service and the smoother the update;
maxSurge: relative to the desired replica count, the maximum proportion (or number) of extra pods — the larger it is, the faster the rollout.

Custom strategy, maxUnavailable=1 and maxSurge=1:

kubectl patch deployment myapp-v1 -p '{"spec":{"strategy":{"rollingUpdate": {"maxSurge":1,"maxUnavailable":1}}}}' -n blue-green

Check the controller's details:

kubectl describe deployment myapp-v1 -n blue-green

The output now includes:

RollingUpdateStrategy:  1 max unavailable, 1 max surge

This is exactly how the rolling update policy is set: through the RollingUpdateStrategy field.

Canary releases for online services with k8s

What a canary release is. British mine workers discovered that canaries are extremely sensitive to firedamp: with even a trace of the gas in the air, a canary stops singing, and once the concentration passes a threshold, the canary has long since died of poisoning while dull-sensed humans notice nothing. With the crude mining equipment of the time, workers took a canary down the pit on every shift as a gas indicator, so they could evacuate in time when danger appeared. This practice gave its name to canary (Canary) testing
(commonly called gray testing in China). If the canary test passes, the remaining v1 instances are all upgraded to v2; if the canary test fails, the canary is rolled back directly and the release has failed.

Advantages: flexible, with user-defined strategies; the gray release can be split by traffic or by specific content (different accounts, different parameters); a problem does not affect all users.
Disadvantages: because not all users are covered, the problems that do appear are harder to track down.

Implementing a canary release in k8s

In terminal tab 1, watch the pods:

kubectl get pods -l app=myapp -n blue-green -w

In tab 2, update the image and immediately pause the rollout:

kubectl set image deployment myapp-v1 myapp=janakiramm/myapp:v2 -n blue-green && kubectl rollout pause deployment myapp-v1 -n blue-green

Back in tab 1, one new pod goes Pending, then ContainerCreating, then Running, while the two old pods keep Running. After the image update, exactly one new pod is created and the rollout immediately pauses — this is the canary release. If, say, a few hours pass without problems, cancel the pause and the remaining steps run in turn, upgrading all the pods.

Cancel the pause: keep watching tab 1, and in tab 2 run:

kubectl rollout resume deployment myapp-v1 -n blue-green

In tab 1 you can watch the containers in the remaining pods being updated to the new version one by one (each new pod goes Running, then an old pod goes Terminating) until all are rolled.

kubectl get rs -n blue-green

There are now two ReplicaSet controllers: the old one at 0 (age 13m) and the new one with 2/2 ready (age 7m28s).

Rolling back:

kubectl rollout history deployment myapp-v1 -n blue-green

REVISION  CHANGE-CAUSE
1         <none>
2         <none>

There are two revisions in total. By default a rollback goes to the previous revision; a specific revision can also be given:

kubectl rollout undo deployment myapp-v1 -n blue-green --to-revision=1
# roll back to revision 1

kubectl rollout history deployment myapp-v1 -n blue-green

REVISION  CHANGE-CAUSE
2         <none>
3         <none>

Revision 1 is gone: it has been restored as revision 3, whose predecessor is revision 2.

kubectl get rs -n blue-green -o wide

NAME                  DESIRED  CURRENT  READY  AGE  CONTAINERS  IMAGES               SELECTOR
myapp-v1-67fd9fc9c8   2        2        2      18m  myapp       janakiramm/myapp:v1  app=myapp,pod-template-hash=67fd9fc9c8,version=v1
myapp-v1-75fb478d6c   0        0        0      12m  myapp       janakiramm/myapp:v2  app=myapp,pod-template-hash=75fb478d6c,version=v1

You can see that the ReplicaSet with the first (v1) image is in use again — this is the ReplicaSet after the rollback.

Lab 3: gray release with Ingress-nginx (installation and configuration of the Ingress-nginx controller are assumed and not repeated here)

Ingress-nginx supports configuring Ingress annotations to implement gray releases and tests for different scenarios. The Nginx annotations
support the following canary rules. Suppose two versions of a service are deployed: the old (stable) version and a canary version.

nginx.ingress.kubernetes.io/canary-by-header: traffic splitting on a request header, suited to gray releases and A/B tests. When the request header is set to always, requests are always sent to the canary version; when it is set to never, requests are never sent to the canary entry point.

nginx.ingress.kubernetes.io/canary-by-header-value: the request-header value that tells the Ingress to route requests to the service named in the canary Ingress. When the request header is set to this value, the request is routed to the canary entry point.

nginx.ingress.kubernetes.io/canary-weight: routes the given percentage (0-100) of requests to the service named in the canary Ingress. A weight of 0 means the canary rule sends no requests to the canary service; 60 means 60% of the traffic goes to the canary; 100 means every request is sent to the canary entry point.

nginx.ingress.kubernetes.io/canary-by-cookie: traffic splitting on a cookie, suited to gray releases and A/B tests — the cookie that tells the Ingress to route requests to the service named in the canary Ingress. When the cookie's value is always, the request is routed to the canary entry point; when it is never, it is not.

Deploy the services. The deployments are not shown; the service configuration is as follows:

# stable version
apiVersion: v1
kind: Service
metadata:
  name: hello-service
  labels:
    app: hello-service
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: hello-service

# canary version
apiVersion: v1
kind: Service
metadata:
  name: canary-hello-service
  labels:
    app: canary-hello-service
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: canary-hello-service

Routing by weight. The ingress configuration:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: canary
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "30"
spec:
  rules:
  - host: canary-service.abc.com
    http:
      paths:
      - backend:
          serviceName: canary-hello-service
          servicePort: 80

Test results:

for i in $(seq 1 10); do curl http://canary-service.abc.com; echo '\n'; done
hello world-version1
hello world-version1
hello world-version2
hello world-version2
hello world-version1
hello world-version1
hello world-version1
hello world-version1
hello world-version1
hello world-version1

Routing by request header. The annotations are configured as follows (the rest of the ingress is unchanged and omitted):

annotations:
  kubernetes.io/ingress.class: nginx
  nginx.ingress.kubernetes.io/canary: "true"
  nginx.ingress.kubernetes.io/canary-by-header: "test"

Test results (a header value of always routes to the canary; any other value goes to the stable version):

for i in $(seq 1 5); do curl -H 'test:always' http://canary-service.abc.com; echo '\n'; done
hello world-version2
hello world-version2
hello world-version2
hello world-version2
hello world-version2

for i in $(seq 1 5); do curl -H 'test:abc' http://canary-service.abc.com; echo '\n'; done
hello world-version1
hello world-version1
hello world-version1
hello world-version1
hello world-version1

Routing by cookie: similar to headers — if, say, a user's request cookie carries a particular tag, we can forward that slice of users' requests to a specific service. The annotations are configured as follows:

annotations:
  kubernetes.io/ingress.class: nginx
  nginx.ingress.kubernetes.io/canary: "true"
  nginx.ingress.kubernetes.io/canary-by-cookie: "like_music"
Test results:

for i in $(seq 1 5); do curl -b 'like_music=1' http://canary-service.abc.com; echo '\n'; done
hello world-version1
hello world-version1
hello world-version1
hello world-version1
hello world-version1

for i in $(seq 1 5); do curl -b 'like_music=always' http://canary-service.abc.com; echo '\n'; done
hello world-version2
hello world-version2
hello world-version2
hello world-version2
hello world-version2

The three kinds of annotation are matched in this order of precedence:

canary-by-header > canary-by-cookie > canary-weight
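To make the precedence concrete, all three rules can be attached to a single canary Ingress; ingress-nginx then evaluates the header first, the cookie next, and the weight last. The sketch below reuses the host and service names from the examples above:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: canary
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    # evaluated in order: header first, then cookie, then weight
    nginx.ingress.kubernetes.io/canary-by-header: "test"
    nginx.ingress.kubernetes.io/canary-by-cookie: "like_music"
    nginx.ingress.kubernetes.io/canary-weight: "30"
spec:
  rules:
  - host: canary-service.abc.com
    http:
      paths:
      - backend:
          serviceName: canary-hello-service
          servicePort: 80
```

A request carrying the header test: always goes to the canary regardless of cookies or weight; with no matching header, the like_music cookie is consulted; failing both, 30% of the remaining traffic is split to the canary by weight.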
