Flannel Troubleshooting Cases
1. Case 1: Flannel plugin not installed
[root@master231 pods]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
xixi 0/1 ContainerCreating 0 169m <none> worker233 <none> <none>
[root@master231 pods]#
[root@master231 pods]# kubectl describe pod xixi
Name: xixi
Namespace: default
Priority: 0
Node: worker233/10.0.0.233
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 9m43s (x3722 over 169m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "ae48b3c943557dafdc5f8a3b06897da299233021ed2fd907818cc5acf86c16eb" network for pod "xixi": networkPlugin cni failed to set up pod "xixi_default" network: loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 4m43s (x3850 over 169m) kubelet Pod sandbox changed, it will be killed and re-created.
[root@master231 pods]#

Root cause: the file "/run/flannel/subnet.env" is missing, which means the Flannel plugin has never been deployed. Installing the plugin resolves the issue.
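A quick way to confirm and fix this (a sketch; the manifest URL is the upstream default, and your environment may use a local copy instead):

# On the affected node: flanneld writes this file at startup, so it should be absent here
ls -l /run/flannel/subnet.env

# From the control plane: deploy Flannel (pin a version/URL appropriate for your cluster)
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Once the kube-flannel pods are Running, /run/flannel/subnet.env appears on every node and the stuck Pod's sandbox can be created.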
2. Case 2: Missing flannel CNI binary
[root@master231 flannel]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ds-xiuxian-dcjsg 0/1 ContainerCreating 0 10s <none> worker232 <none> <none>
ds-xiuxian-vjbnw 0/1 ContainerCreating 0 10s <none> worker233 <none> <none>
[root@master231 flannel]#
[root@master231 flannel]# kubectl describe pod ds-xiuxian-dcjsg
Name: ds-xiuxian-dcjsg
Namespace: default
Priority: 0
Node: worker232/10.0.0.232
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15s default-scheduler Successfully assigned default/ds-xiuxian-dcjsg to worker232
Warning FailedCreatePodSandBox 15s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "1fcadf406345fda4f800feb9ec42ddce495da0ecef8ba4f5cb5ebbdad795e4fe" network for pod "ds-xiuxian-dcjsg": networkPlugin cni failed to set up pod "ds-xiuxian-dcjsg_default" network: failed to find plugin "flannel" in path [/opt/cni/bin], failed to clean up sandbox container "1fcadf406345fda4f800feb9ec42ddce495da0ecef8ba4f5cb5ebbdad795e4fe" network for pod "ds-xiuxian-dcjsg": networkPlugin cni failed to teardown pod "ds-xiuxian-dcjsg_default" network: failed to find plugin "flannel" in path [/opt/cni/bin]]
Normal SandboxChanged 4s (x2 over 14s) kubelet Pod sandbox changed, it will be killed and re-created.
[root@master231 flannel]#

Root cause: no binary named "flannel" can be found under "/opt/cni/bin".
The official Flannel DaemonSet ships an init container that installs the flannel CNI binary onto the host, so simply deleting the Flannel pods lets the DaemonSet recreate them and regenerate the missing file.
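To confirm this on a cluster before deleting anything (a sketch; the DaemonSet and init-container names follow the official manifest and may differ in yours):

# List the init containers of the Flannel DaemonSet; in recent manifests,
# "install-cni-plugin" copies the flannel binary into /opt/cni/bin on the host
kubectl -n kube-flannel get ds kube-flannel-ds -o jsonpath='{.spec.template.spec.initContainers[*].name}{"\n"}'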
[root@master231 flannel]# kubectl get pods -o wide -n kube-flannel
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-b5wb4 1/1 Running 0 24m 10.0.0.231 master231 <none> <none>
kube-flannel-ds-jrj8q 1/1 Running 0 24m 10.0.0.233 worker233 <none> <none>
kube-flannel-ds-vsg8t 1/1 Running 0 24m 10.0.0.232 worker232 <none> <none>
[root@master231 flannel]#
[root@master231 flannel]# kubectl -n kube-flannel delete pod -l k8s-app=flannel
pod "kube-flannel-ds-b5wb4" deleted
pod "kube-flannel-ds-jrj8q" deleted
pod "kube-flannel-ds-vsg8t" deleted
[root@master231 flannel]#
[root@master231 flannel]# kubectl get pods -o wide -n kube-flannel
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-4p825 1/1 Running 0 4s 10.0.0.233 worker233 <none> <none>
kube-flannel-ds-9jwwx 1/1 Running 0 4s 10.0.0.231 master231 <none> <none>
kube-flannel-ds-hx7pd 1/1 Running 0 4s 10.0.0.232 worker232 <none> <none>
[root@master231 flannel]#

3. Case 3: Missing CNI plugin files
3.1 Reproducing the fault
[root@worker232 ~]# ll /opt/cni/bin/
total 68944
drwxrwxr-x 2 root root 4096 Sep 26 11:17 ./
drwxr-xr-x 3 root root 4096 Sep 19 10:12 ../
-rwxr-xr-x 1 root root 3859475 Jan 17 2023 bandwidth*
-rwxr-xr-x 1 root root 4299004 Jan 17 2023 bridge*
-rwxr-xr-x 1 root root 10167415 Jan 17 2023 dhcp*
-rwxr-xr-x 1 root root 3986082 Jan 17 2023 dummy*
-rwxr-xr-x 1 root root 4385098 Jan 17 2023 firewall*
-rwxr-xr-x 1 root root 3870731 Jan 17 2023 host-device*
-rwxr-xr-x 1 root root 3287319 Jan 17 2023 host-local*
-rwxr-xr-x 1 root root 3999593 Jan 17 2023 ipvlan*
-rwxr-xr-x 1 root root 3353028 Jan 17 2023 loopback*
-rwxr-xr-x 1 root root 4029261 Jan 17 2023 macvlan*
-rwxr-xr-x 1 root root 3746163 Jan 17 2023 portmap*
-rwxr-xr-x 1 root root 4161070 Jan 17 2023 ptp*
-rwxr-xr-x 1 root root 3550152 Jan 17 2023 sbr*
-rwxr-xr-x 1 root root 2845685 Jan 17 2023 static*
-rwxr-xr-x 1 root root 3437180 Jan 17 2023 tuning*
-rwxr-xr-x 1 root root 3993252 Jan 17 2023 vlan*
-rwxr-xr-x 1 root root 3586502 Jan 17 2023 vrf*
[root@worker232 ~]#
[root@worker232 ~]# mount -t tmpfs -o size=90M tmpfs /opt/
[root@worker232 ~]#
[root@worker232 ~]# df -h | grep opt
tmpfs 90M 0 90M 0% /opt
[root@worker232 ~]#
[root@worker232 ~]# ll /opt/
total 4
drwxrwxrwt 2 root root 40 Sep 26 11:23 ./
drwxr-xr-x 21 root root 4096 Sep 19 10:03 ../
[root@worker232 ~]#

3.2 Create a test Pod
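The test workload is a DaemonSet, so one Pod is scheduled onto each worker. The original ds-xiuxian manifest is not shown in this section; a minimal equivalent (name, labels, and image are assumptions) would be:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds-xiuxian
spec:
  selector:
    matchLabels:
      app: xiuxian
  template:
    metadata:
      labels:
        app: xiuxian
    spec:
      containers:
      - name: c1
        image: nginx:alpine    # placeholder; any small image reproduces the sandbox failure
EOF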
[root@master231 flannel]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ds-xiuxian-hhvn2 0/1 ContainerCreating 0 5s <none> worker233 <none> <none>
ds-xiuxian-thchl 0/1 ContainerCreating 0 5s <none> worker232 <none> <none>
[root@master231 flannel]#
[root@master231 flannel]# kubectl describe pod ds-xiuxian-hhvn2
Name: ds-xiuxian-hhvn2
Namespace: default
Priority: 0
Node: worker233/10.0.0.233
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11s default-scheduler Successfully assigned default/ds-xiuxian-hhvn2 to worker233
Warning FailedCreatePodSandBox 11s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "52b723c545c56d611e265b25bbd930fd66c9f1b651c65550fd515ea9702cbec6" network for pod "ds-xiuxian-hhvn2": networkPlugin cni failed to set up pod "ds-xiuxian-hhvn2_default" network: failed to find plugin "loopback" in path [/opt/cni/bin], failed to clean up sandbox container "52b723c545c56d611e265b25bbd930fd66c9f1b651c65550fd515ea9702cbec6" network for pod "ds-xiuxian-hhvn2": networkPlugin cni failed to teardown pod "ds-xiuxian-hhvn2_default" network: failed to find plugin "portmap" in path [/opt/cni/bin]]
Normal SandboxChanged 11s kubelet Pod sandbox changed, it will be killed and re-created.
[root@master231 flannel]#

Root cause: the "portmap" plugin, which handles port mapping, is missing from "/opt/cni/bin" (the event above shows "loopback" is missing as well).
Check whether the plugin binaries are actually present in that directory; an accidental mount over the path, as reproduced above, can also hide them, so check for conflicting mount points too.
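A quick way to spot and undo the mount-shadowing reproduced above (a sketch):

# Show what is mounted at /opt; a tmpfs here hides the real /opt/cni/bin underneath
findmnt /opt

# Remove the offending mount to expose the original directory again
umount /opt

# Verify the plugin binaries are visible once more
ls /opt/cni/bin/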
4. Case 4: Recovering an accidentally deleted CNI plugin package
[root@worker232 ~]# wget https://github.com/containernetworking/plugins/releases/download/v1.8.0/cni-plugins-linux-amd64-v1.8.0.tgz
SVIP (internal mirror):
[root@worker232 ~]# wget http://192.168.16.253/Resources/Kubernetes/K8S%20Cluster/CNI/flannel/softwares/cni-plugins-linux-amd64-v1.8.0.tgz
[root@worker232 ~]# tar xf cni-plugins-linux-amd64-v1.8.0.tgz -C /opt/cni/bin/
[root@worker232 ~]#
[root@worker232 ~]# ll /opt/cni/bin/
total 98940
drwxr-xr-x 2 root root 4096 Sep 1 23:29 ./
drwxr-xr-x 3 root root 4096 Sep 19 10:12 ../
-rwxr-xr-x 1 root root 5042186 Sep 1 23:29 bandwidth*
-rwxr-xr-x 1 root root 5694189 Sep 1 23:29 bridge*
-rwxr-xr-x 1 root root 13719696 Sep 1 23:29 dhcp*
-rwxr-xr-x 1 root root 5251247 Sep 1 23:29 dummy*
-rwxr-xr-x 1 root root 5701763 Sep 1 23:29 firewall*
-rwxr-xr-x 1 root root 2907995 Sep 26 11:33 flannel*
-rwxr-xr-x 1 root root 5159307 Sep 1 23:29 host-device*
-rwxr-xr-x 1 root root 4350430 Sep 1 23:29 host-local*
-rwxr-xr-x 1 root root 5273398 Sep 1 23:29 ipvlan*
-rw-r--r-- 1 root root 11357 Sep 1 23:29 LICENSE
-rwxr-xr-x 1 root root 4301450 Sep 1 23:29 loopback*
-rwxr-xr-x 1 root root 5306499 Sep 1 23:29 macvlan*
-rwxr-xr-x 1 root root 5107586 Sep 1 23:29 portmap*
-rwxr-xr-x 1 root root 5474778 Sep 1 23:29 ptp*
-rw-r--r-- 1 root root 2343 Sep 1 23:29 README.md
-rwxr-xr-x 1 root root 4521078 Sep 1 23:29 sbr*
-rwxr-xr-x 1 root root 3772408 Sep 1 23:29 static*
-rwxr-xr-x 1 root root 5330851 Sep 1 23:29 tap*
-rwxr-xr-x 1 root root 4384728 Sep 1 23:29 tuning*
-rwxr-xr-x 1 root root 5266939 Sep 1 23:29 vlan*
-rwxr-xr-x 1 root root 4684912 Sep 1 23:29 vrf*
[root@worker232 ~]#

Reference: https://github.com/containernetworking/plugins
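Before extracting a freshly downloaded archive, it is worth checking it against the published checksum (a sketch; the ".sha256" asset name follows the upstream release convention and is an assumption):

# Fetch the checksum file for the same release asset, then verify the tarball
wget https://github.com/containernetworking/plugins/releases/download/v1.8.0/cni-plugins-linux-amd64-v1.8.0.tgz.sha256
sha256sum -c cni-plugins-linux-amd64-v1.8.0.tgz.sha256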
5. Case 5: Missing Flannel routes
[root@master231 deployments]# route -n    # Before the fix: the local routing table has no routes toward the Pod network 10.100.0.0/16
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.254 0.0.0.0 UG 0 0 0 eth0
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
[root@master231 deployments]#
[root@master231 deployments]# kubectl get pods -n kube-flannel -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-c9mp9 1/1 Running 0 39m 10.0.0.232 worker232 <none> <none>
kube-flannel-ds-pphgj 1/1 Running 0 39m 10.0.0.233 worker233 <none> <none>
kube-flannel-ds-vqr86 1/1 Running 0 39m 10.0.0.231 master231 <none> <none>
[root@master231 deployments]#
[root@master231 deployments]#
[root@master231 deployments]# kubectl -n kube-flannel delete pods --all
pod "kube-flannel-ds-c9mp9" deleted
pod "kube-flannel-ds-pphgj" deleted
pod "kube-flannel-ds-vqr86" deleted
[root@master231 deployments]#
[root@master231 deployments]# kubectl get pods -n kube-flannel -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-b7vpr 0/1 PodInitializing 0 2s 10.0.0.231 master231 <none> <none>
kube-flannel-ds-lhb85 0/1 PodInitializing 0 2s 10.0.0.233 worker233 <none> <none>
kube-flannel-ds-lns8p 0/1 PodInitializing 0 3s 10.0.0.232 worker232 <none> <none>
[root@master231 deployments]#
[root@master231 deployments]# kubectl get pods -n kube-flannel -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-b7vpr 1/1 Running 0 4s 10.0.0.231 master231 <none> <none>
kube-flannel-ds-lhb85 1/1 Running 0 4s 10.0.0.233 worker233 <none> <none>
kube-flannel-ds-lns8p 1/1 Running 0 5s 10.0.0.232 worker232 <none> <none>
[root@master231 deployments]#
[root@master231 deployments]#
[root@master231 deployments]# route -n    # After deleting the Flannel pods: routes for 10.100.0.0/16 are back, so Pods on other nodes are reachable again
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.254 0.0.0.0 UG 0 0 0 eth0
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
10.100.1.0 10.0.0.232 255.255.255.0 UG 0 0 0 eth0
10.100.2.0 10.0.0.233 255.255.255.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
[root@master231 deployments]#
[root@master231 deployments]#
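If the routes still do not reappear after the Flannel pods are recreated, the pod logs usually explain why (a sketch, reusing the label selector shown earlier):

# Tail the flannel logs across all nodes; look for subnet-lease and route messages
kubectl -n kube-flannel logs -l k8s-app=flannel --tail=20

# Cross-check on each node: every peer node's Pod CIDR should have a route entry
ip route | grep 10.100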