I have a Kubernetes cluster with five nodes: four Google Compute Engine VMs (one controller and three worker nodes) plus one bare-metal local machine at home (a kube worker node). The cluster is up and all nodes are in the Ready state.
- Self-managed cluster set up following: https://docs.projectcalico.org/getting-started/kubernetes/self-managed-public-cloud/gce
- Firewall rules are added for Ingress and Egress for all IPs (0.0.0.0/0) and any port.
- The kube master node is advertised with the **--control-plane-endpoint IP:PORT** flag set to the master node's public IP, and the worker nodes join based on that.
Problem: I run into trouble when deploying applications. Containers on the GCE VM workers deploy fine, but every pod on the local worker node stays stuck in ContainerCreating status. Does anyone know what is wrong with this setup and how to fix it?
- This is the events output from kubectl describe pod for one of my pods:
Events:
Normal Scheduled default-scheduler Successfully assigned social-network/home-timeline-redis-6f4c5d55fc-tql2l to volatile
Warning FailedCreatePodSandBox 3m14s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "32b64e6efcaff6401b7b0a6936f005a00a53c19a2061b0a14906b8bc3a81bf20" network for pod "home-timeline-redis-6f4c5d55fc-tql2l": networkPlugin cni failed to set up pod "home-timeline-redis-6f4c5d55fc-tql2l_social-network" network: unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get "http:///var/run/cilium/cilium.sock/v1/config": dial unix /var/run/cilium/cilium.sock: connect: no such file or directory
Is the agent running?
Warning FailedCreatePodSandBox 102s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "1e95fa10d49abf5edc8693345256b91e88c31d1b6414761de80e6038cd7696a4" network for pod "home-timeline-redis-6f4c5d55fc-tql2l": networkPlugin cni failed to set up pod "home-timeline-redis-6f4c5d55fc-tql2l_social-network" network: unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get "http:///var/run/cilium/cilium.sock/v1/config": dial unix /var/run/cilium/cilium.sock: connect: no such file or directory
Is the agent running?
Normal SandboxChanged 11s (x3 over 3m14s) kubelet Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 11s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8f5959966e4c25f94bd49b82e1fa6da33a114b1680eae8898ba6685f22e7d37f" network for pod "home-timeline-redis-6f4c5d55fc-tql2l": networkPlugin cni failed to set up pod "home-timeline-redis-6f4c5d55fc-tql2l_social-network" network: unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get "http:///var/run/cilium/cilium.sock/v1/config": dial unix /var/run/cilium/cilium.sock: connect: no such file or directory
Is the agent running?
Update
I reset kubeadm on all nodes, removed cilium, and recreated the cluster with the Calico CNI. I also changed the pod network CIDR with
sudo kubeadm init --pod-network-cidr=20.96.0.0/12 --control-plane-endpoint "34.89.7.120:6443"
which seems to have resolved the conflict with the host CIDR. However, the pods on volatile (the local machine) are still stuck in ContainerCreating.
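For reference, a rough sketch of the reset/re-init sequence described above (the join token and discovery hash on the workers are whatever kubeadm init printed, elided here):

```bash
# on every node: tear down the old cluster state and remove leftover CNI configs
sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d/*

# on the controller only: re-initialise with a pod CIDR that does not overlap the host networks
sudo kubeadm init --pod-network-cidr=20.96.0.0/12 --control-plane-endpoint "34.89.7.120:6443"

# on each worker: rejoin the cluster with the token/hash printed by kubeadm init
# sudo kubeadm join 34.89.7.120:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
```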
> root@controller:~# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-744cfdf676-bh2nc 1/1 Running 0 12m 20.109.133.129 worker-2 <none> <none>
calico-node-frv5r 1/1 Running 0 12m 10.240.0.11 controller <none> <none>
calico-node-lplx6 1/1 Running 0 12m 10.240.0.20 worker-0 <none> <none>
calico-node-lwrdr 1/1 Running 0 12m 10.240.0.21 worker-1 <none> <none>
calico-node-ppczn 0/1 CrashLoopBackOff 7 12m 130.239.41.206 volatile <none> <none>
calico-node-zplwx 1/1 Running 0 12m 10.240.0.22 worker-2 <none> <none>
coredns-74ff55c5b-69mn2 1/1 Running 0 14m 20.105.55.194 controller <none> <none>
coredns-74ff55c5b-djczf 1/1 Running 0 14m 20.105.55.193 controller <none> <none>
etcd-controller 1/1 Running 0 14m 10.240.0.11 controller <none> <none>
kube-apiserver-controller 1/1 Running 0 14m 10.240.0.11 controller <none> <none>
kube-controller-manager-controller 1/1 Running 0 14m 10.240.0.11 controller <none> <none>
kube-proxy-5vzdf 1/1 Running 0 13m 10.240.0.20 worker-0 <none> <none>
kube-proxy-d22q4 1/1 Running 0 13m 10.240.0.22 worker-2 <none> <none>
kube-proxy-hml5c 1/1 Running 0 14m 10.240.0.11 controller <none> <none>
kube-proxy-hw8kl 1/1 Running 0 13m 10.240.0.21 worker-1 <none> <none>
kube-proxy-zb6t7 1/1 Running 0 13m 130.239.41.206 volatile <none> <none>
kube-scheduler-controller 1/1 Running 0 14m 10.240.0.11 controller <none> <none>
> root@controller:~# kubectl describe pod calico-node-ppczn -n kube-system
Name: calico-node-ppczn
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: volatile/130.239.41.206
Start Time: Mon, 04 Jan 2021 13:01:36 +0000
Labels: controller-revision-hash=89c447898
k8s-app=calico-node
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 130.239.41.206
IPs:
IP: 130.239.41.206
Controlled By: DaemonSet/calico-node
Init Containers:
upgrade-ipam:
Container ID: docker://27f988847a484c5f74e000c4b8f473895b71ed49f27e0bf4fab4b425940951dc
Image: docker.io/calico/cni:v3.17.1
Image ID: docker-pullable://calico/cni@sha256:3dc2506632843491864ce73a6e73d5bba7d0dc25ec0df00c1baa91d17549b068
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/calico-ipam
-upgrade
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 04 Jan 2021 13:01:37 +0000
Finished: Mon, 04 Jan 2021 13:01:38 +0000
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
/var/lib/cni/networks from host-local-net-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-8r94c (ro)
install-cni:
Container ID: docker://5629f6984cfe545864d187112a0c1f65e7bdb7dbfae9b4971579f420ab55b77b
Image: docker.io/calico/cni:v3.17.1
Image ID: docker-pullable://calico/cni@sha256:3dc2506632843491864ce73a6e73d5bba7d0dc25ec0df00c1baa91d17549b068
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/install
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 04 Jan 2021 13:01:39 +0000
Finished: Mon, 04 Jan 2021 13:01:41 +0000
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
CNI_CONF_NAME: 10-calico.conflist
CNI_NETWORK_CONFIG: <set to the key 'cni_network_config' of config map 'calico-config'> Optional: false
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CNI_MTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
SLEEP: false
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-8r94c (ro)
flexvol-driver:
Container ID: docker://3a4bf307a347926893aeb956717d84049af601fd4cc4aa7add6e182c85dc4e7c
Image: docker.io/calico/pod2daemon-flexvol:v3.17.1
Image ID: docker-pullable://calico/pod2daemon-flexvol@sha256:48f277d41c35dae051d7dd6f0ec8f64ac7ee6650e27102a41b0203a0c2ce6c6b
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 04 Jan 2021 13:01:43 +0000
Finished: Mon, 04 Jan 2021 13:01:43 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/host/driver from flexvol-driver-host (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-8r94c (ro)
Containers:
calico-node:
Container ID: docker://2576b2426c2a3fc4b6a972839a94872160c7ac5efa5b1159817be8d4ad4ddf60
Image: docker.io/calico/node:v3.17.1
Image ID: docker-pullable://calico/node@sha256:25e0b0495c0df3a7a06b6f9e92203c53e5b56c143ac1c885885ee84bf86285ff
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 04 Jan 2021 13:18:48 +0000
Finished: Mon, 04 Jan 2021 13:19:57 +0000
Ready: False
Restart Count: 9
Requests:
cpu: 250m
Liveness: exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=1s period=10s #success=1 #failure=6
Readiness: exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
DATASTORE_TYPE: kubernetes
WAIT_FOR_DATASTORE: true
NODENAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
CLUSTER_TYPE: k8s,bgp
IP: autodetect
CALICO_IPV4POOL_IPIP: Always
CALICO_IPV4POOL_VXLAN: Never
FELIX_IPINIPMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_VXLANMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_WIREGUARDMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
CALICO_DISABLE_FILE_LOGGING: true
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
FELIX_IPV6SUPPORT: false
FELIX_LOGSEVERITYSCREEN: info
FELIX_HEALTHENABLED: true
Mounts:
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/sys/fs/ from sysfs (rw)
/var/lib/calico from var-lib-calico (rw)
/var/log/calico/cni from cni-log-dir (ro)
/var/run/calico from var-run-calico (rw)
/var/run/nodeagent from policysync (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-8r94c (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
var-run-calico:
Type: HostPath (bare host directory volume)
Path: /var/run/calico
HostPathType:
var-lib-calico:
Type: HostPath (bare host directory volume)
Path: /var/lib/calico
HostPathType:
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
sysfs:
Type: HostPath (bare host directory volume)
Path: /sys/fs/
HostPathType: DirectoryOrCreate
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-net-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
cni-log-dir:
Type: HostPath (bare host directory volume)
Path: /var/log/calico/cni
HostPathType:
host-local-net-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/cni/networks
HostPathType:
policysync:
Type: HostPath (bare host directory volume)
Path: /var/run/nodeagent
HostPathType: DirectoryOrCreate
flexvol-driver-host:
Type: HostPath (bare host directory volume)
Path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
HostPathType: DirectoryOrCreate
calico-node-token-8r94c:
Type: Secret (a volume populated by a Secret)
SecretName: calico-node-token-8r94c
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
CriticalAddonsOnly op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22m default-scheduler Successfully assigned kube-system/calico-node-ppczn to volatile
Normal Pulled 22m kubelet Container image "docker.io/calico/cni:v3.17.1" already present on machine
Normal Created 22m kubelet Created container upgrade-ipam
Normal Started 22m kubelet Started container upgrade-ipam
Normal Pulled 21m kubelet Container image "docker.io/calico/cni:v3.17.1" already present on machine
Normal Started 21m kubelet Started container install-cni
Normal Created 21m kubelet Created container install-cni
Normal Pulled 21m kubelet Container image "docker.io/calico/pod2daemon-flexvol:v3.17.1" already present on machine
Normal Created 21m kubelet Created container flexvol-driver
Normal Started 21m kubelet Started container flexvol-driver
Normal Pulled 21m kubelet Container image "docker.io/calico/node:v3.17.1" already present on machine
Normal Created 21m kubelet Created container calico-node
Normal Started 21m kubelet Started container calico-node
Warning Unhealthy 21m (x2 over 21m) kubelet Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp 127.0.0.1:9099: connect: connection refused
Warning Unhealthy 11m (x51 over 21m) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Failed to stat() nodename file: stat /var/lib/calico/nodename: no such file or directory
Warning DNSConfigForming 115s (x78 over 22m) kubelet Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 130.239.40.2 130.239.40.3 2001:6b0:e:4040::2
calico-node-ppczn logs:
> root@controller:~# kubectl logs calico-node-ppczn -n kube-system
2021-01-04 13:17:38.010 [INFO][8] startup/startup.go 379: Early log level set to info
2021-01-04 13:17:38.010 [INFO][8] startup/startup.go 395: Using NODENAME environment for node name
2021-01-04 13:17:38.010 [INFO][8] startup/startup.go 407: Determined node name: volatile
2021-01-04 13:17:38.011 [INFO][8] startup/startup.go 439: Checking datastore connection
2021-01-04 13:18:08.011 [INFO][8] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
On the local machine:
> root@volatile:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
39efaf54f558 k8s.gcr.io/pause:3.2 "/pause" 19 minutes ago Up 19 minutes k8s_POD_calico-node-ppczn_kube-system_7e98eb90-f581-4dbc-b877-da25bc2868f9_0
05bd9fa182e5 e3f6fcd87756 "/usr/local/bin/kube…" 20 minutes ago Up 20 minutes k8s_kube-proxy_kube-proxy-zb6t7_kube-system_90529aeb-d226-4061-a87f-d5b303207a2f_0
ae11c77897b0 k8s.gcr.io/pause:3.2 "/pause" 20 minutes ago Up 20 minutes k8s_POD_kube-proxy-zb6t7_kube-system_90529aeb-d226-4061-a87f-d5b303207a2f_0
> root@volatile:~# docker logs 39efaf54f558
> root@volatile:~# docker logs 05bd9fa182e5
I0104 13:00:51.131737 1 node.go:172] Successfully retrieved node IP: 130.239.41.206
I0104 13:00:51.132027 1 server_others.go:142] kube-proxy node IP is an IPv4 address (130.239.41.206), assume IPv4 operation
W0104 13:00:51.162536 1 server_others.go:578] Unknown proxy mode "", assuming iptables proxy
I0104 13:00:51.162615 1 server_others.go:185] Using iptables Proxier.
I0104 13:00:51.162797 1 server.go:650] Version: v1.20.1
I0104 13:00:51.163080 1 conntrack.go:52] Setting nf_conntrack_max to 262144
I0104 13:00:51.163289 1 config.go:315] Starting service config controller
I0104 13:00:51.163300 1 config.go:224] Starting endpoint slice config controller
I0104 13:00:51.163304 1 shared_informer.go:240] Waiting for caches to sync for service config
I0104 13:00:51.163311 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0104 13:00:51.263469 1 shared_informer.go:247] Caches are synced for endpoint slice config
I0104 13:00:51.263487 1 shared_informer.go:247] Caches are synced for service config
> root@volatile:~# docker logs ae11c77897b0
root@volatile:~# ls /etc/cni/net.d/
10-calico.conflist calico-kubeconfig
root@volatile:~# ls /var/lib/calico/
root@volatile:~#
Answer 1
On the host volatile you appear to still have cilium configured in /etc/cni/net.d/*.conf. Cilium is a network plugin, one of the many available for Kubernetes. One of those files probably contains something like:
{
"name": "cilium",
"type": "cilium-cni"
}
If those files are there by accident, remove them. You already appear to be running a competing network plugin, Project Calico, and that should be sufficient. Then recreate the calico-kube-controllers pod in the kube-system namespace, make sure it comes up successfully, and after that recreate the other pods.
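A minimal sketch of that cleanup, assuming the leftover file on volatile is something like /etc/cni/net.d/05-cilium.conf (check the actual filename with ls first):

```bash
# on volatile: remove the leftover Cilium CNI config so kubelet only sees the Calico one
ls /etc/cni/net.d/
sudo rm /etc/cni/net.d/*cilium*        # hypothetical filename pattern; adjust to what ls shows
sudo systemctl restart kubelet

# from the controller: delete the affected pods so their owners (Deployment/DaemonSet) recreate them
kubectl -n kube-system delete pod calico-kube-controllers-744cfdf676-bh2nc
kubectl -n kube-system delete pod calico-node-ppczn
kubectl -n kube-system get pods -o wide -w   # watch until they come back Running
```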
If you do want to use Cilium on that host, go through the Cilium installation guide again and you should see /var/run/cilium/cilium.sock being created.
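If you go that route, a quick way to confirm the agent is actually up on that host (socket path taken from the error message above; the k8s-app=cilium label is the one used by the standard Cilium manifests):

```bash
# on volatile: the CNI plugin talks to the agent over this unix socket
ls -l /var/run/cilium/cilium.sock

# from the controller: check the Cilium DaemonSet pod scheduled on volatile
kubectl -n kube-system get pods -l k8s-app=cilium -o wide --field-selector spec.nodeName=volatile
```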