Ranting at k8s: this is what it takes to correctly find a StatefulSet's pods
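1. Preface
Today I was troubleshooting a production middleware cluster that is deployed to a k8s cluster via helm. One of its StatefulSets always had a pod that never became ready, so I wanted to find out why. Since k8s relates objects to one another through labels, I naturally used the StatefulSet's label selector to look for its pods, and ended up finding pods scheduled by other, similar StatefulSets. I asked around. Some said the selector always finds them; as everyone knows, many k8s manifests link objects together through labels, so why didn't it work this time? Others said to simply match on the name, which didn't feel very reliable. So how exactly should the pods be found?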
2. Reproducing the problem
My first StatefulSet manifest, nginx1.yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx1
  labels:
    app: nginx
spec:
  ports:
  - port: 80
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx1
spec:
  serviceName: "nginx1"
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
The second StatefulSet manifest, nginx2.yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx2
  labels:
    app: nginx
spec:
  ports:
  - port: 80
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx2
spec:
  serviceName: "nginx2"
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
Note that both StatefulSets use the same selector, app: nginx.
Running kubectl get sts:
NAME     READY   AGE
nginx1   1/1     5m16s
nginx2   1/1     2m19s
Running kubectl get pods:
NAME       READY   STATUS    RESTARTS   AGE
nginx1-0   1/1     Running   0          6m24s
nginx2-0   1/1     Running   0          3m27s
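Both StatefulSets have come up normally.
At this point, however, the label selector can no longer isolate the pods managed by a single StatefulSet. Running kubectl get pods -l app=nginx returns the pods of both: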
NAME       READY   STATUS    RESTARTS   AGE
nginx1-0   1/1     Running   0          9m30s
nginx2-0   1/1     Running   0          6m33s
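So how do we correctly get the pods that belong to one StatefulSet?
Source code analysis
The analysis is based on the Kubernetes source at v1.20.2.
One can guess that the relevant code lives in the StatefulSet controller under pkg. Navigating to pkg/controller/statefulset/stateful_set.go and searching a little turns up the first piece of code: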
// getPodsForStatefulSet returns the Pods that a given StatefulSet should manage.
// It also reconciles ControllerRef by adopting/orphaning.
//
// NOTE: Returned Pods are pointers to objects from the cache.
// If you need to modify one, you need to copy it first.
func (ssc *StatefulSetController) getPodsForStatefulSet(set *apps.StatefulSet, selector labels.Selector) ([]*v1.Pod, error) {
	// List all pods to include the pods that don't match the selector anymore but
	// has a ControllerRef pointing to this StatefulSet.
	pods, err := ssc.podLister.Pods(set.Namespace).List(labels.Everything())
	if err != nil {
		return nil, err
	}

	filter := func(pod *v1.Pod) bool {
		// Only claim if it matches our StatefulSet name. Otherwise release/ignore.
		return isMemberOf(set, pod)
	}

	cm := controller.NewPodControllerRefManager(ssc.podControl, set, selector, controllerKind, ssc.canAdoptFunc(set))
	return cm.ClaimPods(pods, filter)
}
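This first spot concerns finding which StatefulSets manage a given pod. Both of my StatefulSets use the selector app: nginx, so going by labels alone, either nginx1-0 or nginx2-0 leads back to both StatefulSets, nginx1 and nginx2.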
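To make the ambiguity of a label-only lookup concrete, here is a minimal, self-contained sketch. The statefulSet type and the selectorMatches and setsForPod helpers are hypothetical simplifications for illustration, not the Kubernetes API: a selector is modeled as a plain map that matches when every key/value pair is present in the pod's labels.

```go
package main

import "fmt"

// statefulSet is a hypothetical, simplified stand-in for apps.StatefulSet:
// only the name and the matchLabels part of the selector are modeled.
type statefulSet struct {
	name        string
	matchLabels map[string]string
}

// selectorMatches reports whether every key/value pair in matchLabels
// is present in podLabels (equality-based matching only).
func selectorMatches(matchLabels, podLabels map[string]string) bool {
	for k, want := range matchLabels {
		got, ok := podLabels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// setsForPod returns the names of all sets whose selector matches the pod's labels.
func setsForPod(sets []statefulSet, podLabels map[string]string) []string {
	var names []string
	for _, s := range sets {
		if selectorMatches(s.matchLabels, podLabels) {
			names = append(names, s.name)
		}
	}
	return names
}

func main() {
	sets := []statefulSet{
		{name: "nginx1", matchLabels: map[string]string{"app": "nginx"}},
		{name: "nginx2", matchLabels: map[string]string{"app": "nginx"}},
	}
	// The pod template label shared by nginx1-0 and nginx2-0.
	podLabels := map[string]string{"app": "nginx"}
	fmt.Println(setsForPod(sets, podLabels)) // prints [nginx1 nginx2]
}
```

Because both selectors are identical, labels alone cannot tell the two StatefulSets apart for either pod.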
The second piece of code goes the other way and finds the pods managed by a given StatefulSet:
// getPodsForStatefulSet returns the Pods that a given StatefulSet should manage.
// It also reconciles ControllerRef by adopting/orphaning.
//
// NOTE: Returned Pods are pointers to objects from the cache.
// If you need to modify one, you need to copy it first.
func (ssc *StatefulSetController) getPodsForStatefulSet(set *apps.StatefulSet, selector labels.Selector) ([]*v1.Pod, error) {
	// List all pods to include the pods that don't match the selector anymore but
	// has a ControllerRef pointing to this StatefulSet.
	pods, err := ssc.podLister.Pods(set.Namespace).List(labels.Everything())
	if err != nil {
		return nil, err
	}

	filter := func(pod *v1.Pod) bool {
		// Only claim if it matches our StatefulSet name. Otherwise release/ignore.
		return isMemberOf(set, pod)
	}

	cm := controller.NewPodControllerRefManager(ssc.podControl, set, selector, controllerKind, ssc.canAdoptFunc(set))
	return cm.ClaimPods(pods, filter)
}
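Two things in this code deserve attention: the filter and the ClaimPods method. Start with the filter, whose logic is isMemberOf; it narrows the list down to the pods the StatefulSet manages.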
// isMemberOf tests if pod is a member of set.
func isMemberOf(set *apps.StatefulSet, pod *v1.Pod) bool {
	return getParentName(pod) == set.Name
}
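Clearly this code matches on the StatefulSet's name. The parent name is extracted from the pod name by the following regular expression and helper: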
// statefulPodRegex is a regular expression that extracts the parent StatefulSet and ordinal from the Name of a Pod
var statefulPodRegex = regexp.MustCompile("(.*)-([0-9]+)$")

// getParentNameAndOrdinal gets the name of pod's parent StatefulSet and pod's ordinal as extracted from its Name. If
// the Pod was not created by a StatefulSet, its parent is considered to be empty string, and its ordinal is considered
// to be -1.
func getParentNameAndOrdinal(pod *v1.Pod) (string, int) {
	parent := ""
	ordinal := -1
	subMatches := statefulPodRegex.FindStringSubmatch(pod.Name)
	if len(subMatches) < 3 {
		return parent, ordinal
	}
	parent = subMatches[1]
	if i, err := strconv.ParseInt(subMatches[2], 10, 32); err == nil {
		ordinal = int(i)
	}
	return parent, ordinal
}
The regular expression plainly derives the StatefulSet name from the pod name: here, nginx1-0 resolves to the StatefulSet nginx1.
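The behavior of that pattern can be tried outside the cluster. The following is a small sketch that reuses the same regular expression on plain strings; parentAndOrdinal and podNameRegex are hypothetical names for this illustration, and the pod names web-1-12 and standalone are made-up inputs.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// podNameRegex uses the same pattern as statefulPodRegex in the quoted source.
var podNameRegex = regexp.MustCompile("(.*)-([0-9]+)$")

// parentAndOrdinal mirrors the quoted helper but takes a plain pod name.
// It returns ("", -1) when the name does not end in "-<digits>".
func parentAndOrdinal(podName string) (string, int) {
	m := podNameRegex.FindStringSubmatch(podName)
	if len(m) < 3 {
		return "", -1
	}
	ordinal := -1
	if i, err := strconv.ParseInt(m[2], 10, 32); err == nil {
		ordinal = int(i)
	}
	return m[1], ordinal
}

func main() {
	for _, name := range []string{"nginx1-0", "nginx2-0", "web-1-12", "standalone"} {
		parent, ordinal := parentAndOrdinal(name)
		fmt.Printf("%q -> parent=%q ordinal=%d\n", name, parent, ordinal)
	}
}
```

Because the first group is greedy, only the last "-<digits>" suffix is treated as the ordinal, so web-1-12 yields the parent web-1 and ordinal 12, while standalone yields an empty parent and -1.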
Now back to the ClaimPods method:
// ClaimPods tries to take ownership of a list of Pods.
//
// It will reconcile the following:
//   * Adopt orphans if the selector matches.
//   * Release owned objects if the selector no longer matches.
//
// Optional: If one or more filters are specified, a Pod will only be claimed if
// all filters return true.
//
// A non-nil error is returned if some form of reconciliation was attempted and
// failed. Usually, controllers should try again later in case reconciliation
// is still needed.
//
// If the error is nil, either the reconciliation succeeded, or no
// reconciliation was necessary. The list of Pods that you now own is returned.
func (m *PodControllerRefManager) ClaimPods(pods []*v1.Pod, filters ...func(*v1.Pod) bool) ([]*v1.Pod, error) {
	var claimed []*v1.Pod
	var errlist []error

	match := func(obj metav1.Object) bool {
		pod := obj.(*v1.Pod)
		// Check selector first so filters only run on potentially matching Pods.
		if !m.Selector.Matches(labels.Set(pod.Labels)) {
			return false
		}
		for _, filter := range filters {
			if !filter(pod) {
				return false
			}
		}
		return true
	}
	...

	for _, pod := range pods {
		ok, err := m.ClaimObject(pod, match, adopt, release)
		...
	}
	return claimed, utilerrors.NewAggregate(errlist)
}
The match function reveals the whole truth: it first matches by selector, and then applies the name match inside the filter.
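The two-stage check can be sketched without any Kubernetes dependencies. This is only an illustration under simplifying assumptions: the pod type and the labelsMatch, parentName and podsForSet helpers are hypothetical, the selector is a plain map, and the ControllerRef adoption and release that ClaimPods also performs is left out entirely.

```go
package main

import (
	"fmt"
	"regexp"
)

// pod is a hypothetical, simplified stand-in for v1.Pod.
type pod struct {
	name   string
	labels map[string]string
}

// Same pattern as statefulPodRegex in the quoted source.
var podNameRegex = regexp.MustCompile("(.*)-([0-9]+)$")

// parentName extracts the StatefulSet name from a pod name, or "" if none.
func parentName(podName string) string {
	m := podNameRegex.FindStringSubmatch(podName)
	if len(m) < 3 {
		return ""
	}
	return m[1]
}

// labelsMatch reports whether every selector pair is present in the pod labels.
func labelsMatch(selector, podLabels map[string]string) bool {
	for k, want := range selector {
		got, ok := podLabels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// podsForSet applies the selector first, then the name filter,
// in the same order as the match closure in ClaimPods.
func podsForSet(setName string, selector map[string]string, pods []pod) []string {
	var claimed []string
	for _, p := range pods {
		if !labelsMatch(selector, p.labels) {
			continue // fails the selector stage
		}
		if parentName(p.name) != setName {
			continue // fails the name filter stage
		}
		claimed = append(claimed, p.name)
	}
	return claimed
}

func main() {
	nginx := map[string]string{"app": "nginx"}
	pods := []pod{
		{name: "nginx1-0", labels: nginx},
		{name: "nginx2-0", labels: nginx},
	}
	fmt.Println(podsForSet("nginx1", nginx, pods)) // prints [nginx1-0]
}
```

With only the selector stage, both pods would be returned; the name filter is what separates nginx1-0 from nginx2-0.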
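3. Summary
A StatefulSet uses both label matching and name matching, where the StatefulSet name is extracted from the pod name. So for my example above, the correct way to get the pods of nginx1 is: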
kubectl get pods -l app=nginx | grep -v NAME | awk '{print $1}' | grep -E --color '^(nginx1)-([0-9]+)$'
The result:
nginx1-0
