kubeadm 1.25 init fails on Debian 11 with containerd -> connection refused

I am trying to initialize a Kubernetes master node on a Debian GNU/Linux 11 (bullseye) system using kubeadm version 1.25.4-00.

I followed the official instructions on kubernetes.io: I installed containerd and set SystemdCgroup = true in /etc/containerd/config.toml:

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
      runtime_type = "io.containerd.runc.v2"
      runtime_engine = ""
      runtime_root = ""
      privileged_without_host_devices = false
      base_runtime_spec = ""
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true
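
To double-check that the setting above is really present in the file containerd reads, a quick text match can help. This is only a sketch: check_systemd_cgroup is a hypothetical helper name, the default path is the one mentioned in the question, and the function pattern-matches the file rather than parsing TOML, so treat its answer as a hint.

```sh
#!/bin/sh
# Hypothetical helper (not part of containerd or kubeadm): report whether a
# containerd config file contains an uncommented "SystemdCgroup = true" line.
# It only greps the text; it does not check which TOML table the key is in.
check_systemd_cgroup() {
    # Default to the path used in the question; pass another file to override.
    config_file="${1:-/etc/containerd/config.toml}"
    if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$config_file"; then
        echo "SystemdCgroup enabled"
    else
        echo "SystemdCgroup not enabled"
        return 1
    fi
}
```

Calling check_systemd_cgroup with no argument inspects /etc/containerd/config.toml. containerd reads the file at startup, so an edit only takes effect after a restart (sudo systemctl restart containerd); containerd config dump should then show the configuration the daemon actually loaded.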

containerd seems to be fine:

$ sudo systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-11-21 08:12:35 UTC; 1min 7s ago
       Docs: https://containerd.io
   Main PID: 7897 (containerd)
      Tasks: 8
     Memory: 10.5M
        CPU: 470ms
     CGroup: /system.slice/containerd.service
             └─7897 /usr/bin/containerd

Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900148031Z" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900245191Z" level=info msg=serving... address=/run/containerd/containerd.sock
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900338622Z" level=info msg="containerd successfully booted in 0.046780s"
Nov 21 08:12:35 master-1 systemd[1]: Started containerd container runtime.
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.909836633Z" level=info msg="Start subscribing containerd event"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.909931756Z" level=info msg="Start recovering state"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910044670Z" level=info msg="Start event monitor"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910056885Z" level=info msg="Start snapshots syncer"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910069145Z" level=info msg="Start cni network conf syncer"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910079607Z" level=info msg="Start streaming server"
....

When I run kubeadm init, it hangs and times out after 4 minutes:

$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 -v=9

There does not seem to be a firewall problem, and kubeadm appears to detect containerd and the cgroup driver correctly:

I1121 08:16:46.935270    8096 initconfiguration.go:117] detected and using CRI socket: unix:///var/run/containerd/containerd.sock
I1121 08:16:46.935936    8096 interface.go:432] Looking for default routes with IPv4 addresses
I1121 08:16:46.936037    8096 interface.go:437] Default route transits interface "eth0"
I1121 08:16:46.936268    8096 interface.go:209] Interface eth0 is up
I1121 08:16:46.936427    8096 interface.go:257] Interface "eth0" has 3 addresses :[x.x.y.y/32 .........::1/64 ......../64].
I1121 08:16:46.936525    8096 interface.go:224] Checking addr  x.x.y.y/32.
I1121 08:16:46.936596    8096 interface.go:231] IP found x.x.y.y
I1121 08:16:46.936616    8096 interface.go:263] Found valid IPv4 address x.x.y.y for interface "eth0".
I1121 08:16:46.936710    8096 interface.go:443] Found active IP x.x.y.y 
I1121 08:16:46.936803    8096 kubelet.go:218] the value of KubeletConfiguration.cgroupDriver is empty; setting it to "systemd"
I1121 08:16:46.948350    8096 version.go:186] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I1121 08:16:47.327247    8096 version.go:255] remote version is much newer: v1.25.4; falling back to: stable-1.24
I1121 08:16:47.327368    8096 version.go:186] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.25.txt
[init] Using Kubernetes version: v1.25.4
[preflight] Running pre-flight checks
I1121 08:16:47.716620    8096 checks.go:570] validating Kubernetes and kubeadm version
I1121 08:16:47.716770    8096 checks.go:170] validating if the firewall is enabled and active
I1121 08:16:47.731470    8096 checks.go:205] validating availability of port 6443
I1121 08:16:47.732017    8096 checks.go:205] validating availability of port 10259
....

While waiting for the kubelet to boot, the following messages appear. They repeat until the timeout after 4 minutes:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I1121 08:17:12.320743    8096 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.25.4 (linux/amd64) kubernetes/fdc7750" 'https://x.x.y.y:6443/healthz?timeout=10s'
I1121 08:17:12.321047    8096 round_trippers.go:508] HTTP Trace: Dial to tcp:x.x.y.y:6443 failed: dial tcp x.x.y.y:6443: connect: connection refused
I1121 08:17:12.321112    8096 round_trippers.go:553] GET https://x.x.y.y:6443/healthz?timeout=10s  in 0 milliseconds
I1121 08:17:12.321157    8096 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 0 ms TLSHandshake 0 ms Duration 0 ms
I1121 08:17:12.321209    8096 round_trippers.go:577] Response Headers:
I1121 08:17:12.821526    8096 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.25.4 (linux/amd64) kubernetes/fdc7750" 'https://x.x.y.y:6443/healthz?timeout=10s'
I1121 08:17:12.821882    8096 round_trippers.go:508] HTTP Trace: Dial to tcp:x.x.y.y:6443 failed: dial tcp x.x.y.y:6443: connect: connection refused
.....

Checking the kubelet status shows the following:

$ sudo systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Mon 2022-11-21 08:17:12 UTC; 4min 30s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 8228 (kubelet)
      Tasks: 14 (limit: 4556)
     Memory: 52.0M
        CPU: 6.246s
     CGroup: /system.slice/kubelet.service
             └─8228 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-ru>

Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.526642    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.626872    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.727919    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.829055    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.930002    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.959961    8228 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"master->
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.029432    8228 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady >
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.030749    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.130874    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.231537    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"

Checking journalctl shows the following:

$ sudo journalctl -xeu kubelet
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.585238    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.685464    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.786279    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.887211    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.987526    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:38 master-1 kubelet[8228]: E1121 08:22:38.045350    8228 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady >
Nov 21 08:22:38 master-1 kubelet[8228]: E1121 08:22:38.088201    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
....
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.500610    8228 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://x.x.y.y:6443/apis/coordin>
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.512026    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.613041    8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:40 master-1 kubelet[8228]: I1121 08:22:40.700243    8228 kubelet_node_status.go:70] "Attempting to register node" node="master-1"
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.701021    8228 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://x.x.y.y:6443/api/v1/node>
...
.....
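
Because the journal above is dominated by one repeated message, it can help to collapse duplicates so that rarer errors stand out. The following is a rough sketch, not an official tool: summarize_kubelet_errors is a hypothetical helper name, and it assumes the klog line layout seen above (a severity letter plus date such as E1121, then a file.go:line] marker before the message).

```sh
#!/bin/sh
# Hypothetical helper: collapse a kubelet journal into distinct error messages.
# Reads journal lines on stdin, keeps klog error records (severity "E"),
# strips everything up to and including the "file.go:NNN] " marker,
# and prints each distinct message with its count, most frequent first.
summarize_kubelet_errors() {
    grep -E ' E[0-9]{4} ' \
        | sed -E 's/^.*\.go:[0-9]+\] //' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

A possible invocation is sudo journalctl -u kubelet --no-pager | summarize_kubelet_errors. Using --no-pager also avoids the lines cut off with ">" in the output above, which hide the tail of the longer error messages.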

How can I find the cause of this problem? The log files did not give me any really useful hint.

Note: if I install CRI-O instead of containerd, kubeadm works like a charm.

Answer 1

I have the same problem with kubeadm v1.25.4 and containerd v1.4.13.

containerd also seems fine and the kubelet service is active, but the API server stays down along with all the control plane pods.

kubectl get pods --all-namespaces
The connection to the server localhost:8080 was refused - did you specify the right host or port?
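
The localhost:8080 message above typically appears when kubectl finds no kubeconfig and falls back to its default address, so on its own it says little about the cluster. A minimal sketch for checking which kubeconfig would be picked up follows; kubeconfig_status is a hypothetical helper name, and it treats KUBECONFIG as a single path even though kubectl accepts a colon-separated list.

```sh
#!/bin/sh
# Hypothetical helper: report whether the kubeconfig kubectl would normally
# use exists and is readable. Simplification: KUBECONFIG is treated as one
# path, not as a colon-separated list.
kubeconfig_status() {
    candidate="${KUBECONFIG:-$HOME/.kube/config}"
    if [ -r "$candidate" ]; then
        echo "kubeconfig found: $candidate"
    else
        echo "no readable kubeconfig at $candidate"
    fi
}
```

kubeadm writes the admin kubeconfig to /etc/kubernetes/admin.conf, so once that file exists, sudo KUBECONFIG=/etc/kubernetes/admin.conf kubectl get pods --all-namespaces targets the real API server address. While the API server is down, that call would presumably still be refused, only on port 6443 instead of 8080.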

My syslog file also contains some additional log entries:

Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.517592    2809 kubelet.go:2448] "Error getting node" err="node \"master-1\" not found"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.618103    2809 kubelet.go:2448] "Error getting node" err="node \"master-1\" not found"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.718895    2809 kubelet.go:2448] "Error getting node" err="node \"master-1\" not found"
Nov 25 09:39:08 master-1 containerd[450]: time="2022-11-25T09:39:08.774397538Z" level=info msg="RunPodsandbox for &PodSandboxMetadata{Name:kube-scheduler-master-1,Uid:c8fdb264532b280b4098380e628d113d,Namespace:kube-system,Attempt:0,}"
Nov 25 09:39:08 master-1 containerd[450]: time="2022-11-25T09:39:08.774397563Z" level=info msg="RunPodsandbox for &PodSandboxMetadata{Name:kube-apiserver-master-1,Uid:e8e76556f3e67024151f36c60b85b622,Namespace:kube-system,Attempt:0,}"
Nov 25 09:39:08 master-1 containerd[450]: time="2022-11-25T09:39:08.800116714Z" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-scheduler-master-1,Uid:c8fdb264532b280b4098380e628d113d,Namespace:kube-system,Attempt:0,} failed, error" error="rpc error: code = InvalidArgument desc = failed to create containerd container: create container failed validation: container.Runtime.Name must be set: invalid argument"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.800548    2809 remote_runtime.go:233] "RunPodSandbox from runtime service failed" err="rpc error: code = InvalidArgument desc = failed to create containerd container: create container failed validation: container.Runtime.Name must be set: invalid argument"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.800620    2809 kuberuntime_sandbox.go:71] "Failed to create sandbox for pod" err="rpc error: code = InvalidArgument desc = failed to create containerd container: create container failed validation: container.Runtime.Name must be set: invalid argument" pod="kube-system/kube-scheduler-master-1"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.800653    2809 kuberuntime_manager.go:772] "CreatePodSandbox for pod failed" err="rpc error: code = InvalidArgument desc = failed to create containerd container: create container failed validation: container.Runtime.Name must be set: invalid argument" pod="kube-system/kube-scheduler-master-1"
Nov 25 09:39:08 master-1 kubelet[2809]: E1125 09:39:08.800729    2809 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-master-1_kube-system(c8fdb264532b280b4098380e628d113d)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-scheduler-master-1_kube-system(c8fdb264532b280b4098380e628d113d)\\\": rpc error: code = InvalidArgument desc = failed to create containerd container: create container failed validation: container.Runtime.Name must be set: invalid argument\"" pod="kube-system/kube-scheduler-master-1" podUID=c8fdb264532b280b4098380e628d113d
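
The syslog excerpt above repeats one runtime error several times, wrapped in different kubelet messages. As a rough sketch, the distinct reason can be pulled out as shown here; sandbox_failure_reasons is a hypothetical helper name, and the pattern assumes the gRPC-style "desc = ..." text seen in these lines. It relies on grep -o, which GNU and BSD grep provide but POSIX does not require.

```sh
#!/bin/sh
# Hypothetical helper: extract the distinct failure reasons for pod sandbox
# creation from syslog/journal lines read on stdin.
# 1. keep lines that mention RunPodSandbox or CreatePodSandbox
# 2. print only the "desc = ..." part, stopping at the first quote or backslash
# 3. de-duplicate the reasons
sandbox_failure_reasons() {
    grep -E 'RunPodSandbox|CreatePodSandbox' \
        | grep -oE 'desc = [^"\\]+' \
        | sort -u
}
```

It can be run as sandbox_failure_reasons < /var/log/syslog. For the lines above, every match reduces to the same reason, "container.Runtime.Name must be set", which suggests looking at the containerd runtime configuration rather than at the kubelet.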

If anyone has a solution or a clue, I am following this thread.
