Fixing a k8s pod that restarts endlessly after being deleted

This was my first encounter with K8S, and I was thoroughly confused.
After setting up the k8s cluster, I deployed the first add-on, Kubernetes Dashboard, but it stayed stuck in CrashLoopBackOff. Below is a record of how I resolved the problem. (The process is rather roundabout since I'm a newcomer; a veteran would probably spot the issue at a glance.)

First, check the node's logs at /var/log/messages:

Jun  1 11:32:34 apm-slave03 dockerd-current: time="2018-06-01T11:32:34.830329738+08:00" level=error msg="Handler for GET /containers/b532d65bd2ff380035560a33e435414b66ccfbfbbf6f3c9d51cb2f0add57b2d2/json returned error: No such container: b532d65bd2ff380035560a33e435414b66ccfbfbbf6f3c9d51cb2f0add57b2d2"
Jun 1 11:32:44 apm-slave03 kubelet: I0601 11:32:44.160859 20744 docker_manager.go:2495] checking backoff for container "kubernetes-dashboard" in pod "kubernetes-dashboard-latest-4167338039-95kb9"
Jun 1 11:32:44 apm-slave03 kubelet: I0601 11:32:44.161188 20744 docker_manager.go:2509] Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)
Jun 1 11:32:44 apm-slave03 kubelet: E0601 11:32:44.161302 20744 pod_workers.go:184] Error syncing pod c2097a18-654a-11e8-8d29-005056bc2ad1, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"
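
The kubelet log above only shows the back-off, not why the container exits. A quicker way to see the crash reason is to read the container's own output (a sketch that needs a live cluster; the pod name is copied from the log above):

```shell
# Read the logs of the crashed container. For a CrashLoopBackOff pod the
# current instance may already be gone, so --previous reads the output of
# the last terminated instance instead.
kubectl logs kubernetes-dashboard-latest-4167338039-95kb9 -n kube-system --previous
```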

Nothing obviously wrong here, so check the pod next:

kubectl get pods -n kube-system 
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-latest-4167338039-bbwp4 0/1 ContainerCreating 0 47s

Delete it:

kubectl delete pod kubernetes-dashboard-latest-4167338039-bbwp4 -n kube-system

Then check again:

kubectl get pods -n kube-system | grep -v Running                             
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-latest-4167338039-95kb9 0/1 Error 0 4s

To my surprise, another pod had started in its place. I deleted it five times in a row, and each time a new one came back, which was baffling.
Let's look at one pod's description:

kubectl describe pod kubernetes-dashboard-latest-4167338039-95kb9 -n kube-system
Name: kubernetes-dashboard-latest-4167338039-95kb9
Namespace: kube-system
Node: apm-slave03/10.10.202.159
Start Time: Fri, 01 Jun 2018 11:20:39 +0800
Labels: k8s-app=kubernetes-dashboard
kubernetes.io/cluster-service=true
pod-template-hash=4167338039
version=latest
Status: Running
IP: 10.0.48.2
Controllers: ReplicaSet/kubernetes-dashboard-latest-4167338039
Containers:
kubernetes-dashboard:
Container ID: docker://fe222c62c496d1348b9da4d17da474721d941279c7bd476596a0e041353ccd55
Image: registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64:v1.4.2
Image ID: docker-pullable://registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64@sha256:8c9cafe41e0846c589a28ee337270d4e97d486058c17982314354556492f2c69
Port: 9090/TCP
Args:
--apiserver-host=http://apm-slave02:8080
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 100m
memory: 50Mi
State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 01 Jun 2018 11:21:04 +0800
Finished: Fri, 01 Jun 2018 11:21:06 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 01 Jun 2018 11:20:43 +0800
Finished: Fri, 01 Jun 2018 11:20:45 +0800
Ready: False
Restart Count: 2
Liveness: http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
No volumes.
QoS Class: Guaranteed
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
28s 28s 1 {default-scheduler } Normal Scheduled Successfully assigned kubernetes-dashboard-latest-4167338039-95kb9 to apm-slave03
28s 28s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Created Created container with docker id f880c9e76de7; Security:[seccomp=unconfined]
27s 27s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Started Started container with docker id f880c9e76de7
24s 24s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Created Created container with docker id 16f258977612; Security:[seccomp=unconfined]
24s 24s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Started Started container with docker id 16f258977612
22s 18s 2 {kubelet apm-slave03} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 10s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"

28s 3s 4 {kubelet apm-slave03} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
28s 3s 3 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Pulled Container image "registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64:v1.4.2" already present on machine
3s 3s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Created Created container with docker id fe222c62c496; Security:[seccomp=unconfined]
3s 3s 1 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Normal Started Started container with docker id fe222c62c496
22s 0s 3 {kubelet apm-slave03} spec.containers{kubernetes-dashboard} Warning BackOff Back-off restarting failed docker container
0s 0s 1 {kubelet apm-slave03} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 20s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"

The describe output didn't show what I was looking for either, so I turned to Google. The conclusion: the pod is recreated by its controller, and you have to delete the deployment to get rid of it for good. First, let's see what it looks like:
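
This also explains why deleting the pod never helped: the describe output above lists `Controllers: ReplicaSet/kubernetes-dashboard-latest-4167338039`, and a ReplicaSet immediately spawns a replacement pod to keep its replica count at the desired value. One way to confirm the ownership chain (a sketch; the exact output format may vary by kubectl version):

```shell
# List the ReplicaSets in the namespace; the pod's name prefix
# (kubernetes-dashboard-latest-4167338039) matches one of them.
kubectl get rs -n kube-system

# Print the pod's owner reference -- it points at the ReplicaSet,
# which in turn belongs to the Deployment.
kubectl get pod kubernetes-dashboard-latest-4167338039-95kb9 -n kube-system \
  -o jsonpath='{.metadata.ownerReferences[*].kind}/{.metadata.ownerReferences[*].name}'
```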

kubectl get deployments --all-namespaces
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-system kubernetes-dashboard-latest 1 1 1 0 2d

The NAME turned out to be the deployment from my very first attempt, not the one I had just created, so (gritting my teeth) I deleted it:

kubectl delete deployments kubernetes-dashboard-latest -n kube-system
deployment "kubernetes-dashboard-latest" deleted

kubectl get deployments --all-namespaces
No resources found.

kubectl get pods -n kube-system | grep -v Running
No resources found.

The endlessly restarting pod problem is finally solved!

Recreate the deployment and check again:
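
The recreation can be sketched with a minimal inline manifest like the one below. This is only an assumption pieced together from the describe output above (image, port, and --apiserver-host flag); I have not verified every field against the official dashboard YAML, and the `extensions/v1beta1` apiVersion matches clusters of this era, not current ones:

```shell
# Recreate the dashboard deployment from an inline manifest (sketch).
kubectl apply -n kube-system -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64:v1.4.2
        ports:
        - containerPort: 9090
        args:
        - --apiserver-host=http://apm-slave02:8080
EOF
```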

kubectl get deployments --all-namespaces                             
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-system kubernetes-dashboard 1 1 1 1 5s

It is finally AVAILABLE, and the dashboard is Running at last! ✿✿ヽ(°▽°)ノ✿
