Kubernetes pods running on the same node cannot access each other through a service

I am trying to make calls from one pod to another through a Kubernetes Service when both pods are on the same node.

To explain the setup:

  • Node1 - service1a (pod1A), service1b (pod1B)
  • Node2 - service2a (pod2A)

Test results (the commands are sketched right after this list):

  • PING pod1A -> pod1B: OK
  • PING pod1A -> pod2A: OK
  • CURL pod1A -> service2a: OK
  • CURL pod1A -> service1b: TIMEOUT
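
A minimal sketch of how these tests can be reproduced from inside pod1A, assuming the container image ships ping and curl (pod name and namespace dev are taken from the listings below; service1b serves PostgreSQL on 5432, so there we only care whether the TCP connection is established or times out):

# Pod name copied from the pod listing below.
POD1A=pod1A-database-596f76d8b5-6bdqv

# Pod-to-pod ICMP, same node and across nodes -- both succeed.
kubectl -n dev exec "$POD1A" -- ping -c 3 10.244.2.63     # pod1B, same node
kubectl -n dev exec "$POD1A" -- ping -c 3 10.244.3.104    # pod2A, other node

# Pod-to-service through the ClusterIP -- only the same-node target times out.
kubectl -n dev exec "$POD1A" -- curl -sv --max-time 5 http://service2a.dev.svc.cluster.local:8080
kubectl -n dev exec "$POD1A" -- curl -sv --max-time 5 http://service1b.dev.svc.cluster.local:5432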

I spent several days changing various parts of the configuration and searching the internet for the same problem, but with no luck.

The only matching issue I found was this one: https://stackoverflow.com/questions/64073696/pods-running-on-the-same-node-cant-access-to-each-other-through-service, but there it was solved by switching from IPVS to iptables. I am already using iptables.

PODS (names have been changed):

NAMESPACE              NAME                                            READY   STATUS    RESTARTS       AGE    IP              NODE          NOMINATED NODE   READINESS GATES
dev                    pod/pod1B-5d595bf69-8fgsm                       1/1     Running   0              16h    10.244.2.63     kube-node2    <none>           <none>
dev                    pod/pod1A-database-596f76d8b5-6bdqv             1/1     Running   0              16h    10.244.2.65     kube-node2    <none>           <none>
dev                    pod/pod2A-dbb8fd4d-xv54n                        1/1     Running   1              15h    10.244.3.104    kube-node3    <none>           <none>
kube-system            pod/coredns-6d4b75cb6d-6b2cn                    1/1     Running   4 (50d ago)    292d   10.244.0.10     kube-master   <none>           <none>
kube-system            pod/coredns-6d4b75cb6d-6m7q2                    1/1     Running   4 (50d ago)    292d   10.244.0.11     kube-master   <none>           <none>
kube-system            pod/etcd-kube-master                            1/1     Running   2 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-apiserver-kube-master                  1/1     Running   2 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-controller-manager-kube-master         1/1     Running   1 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-flannel-ds-bwkjg                       1/1     Running   0              62s    172.31.45.210   kube-node3    <none>           <none>
kube-system            pod/kube-flannel-ds-g9v9m                       1/1     Running   0              66s    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-flannel-ds-hljj5                       1/1     Running   0              30s    172.31.42.77    kube-node2    <none>           <none>
kube-system            pod/kube-flannel-ds-k4zfw                       1/1     Running   0              65s    172.31.43.77    kube-node1    <none>           <none>
kube-system            pod/kube-proxy-68k5n                            1/1     Running   0              35m    172.31.45.210   kube-node3    <none>           <none>
kube-system            pod/kube-proxy-lb6s9                            1/1     Running   0              35m    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-proxy-vggwk                            1/1     Running   0              35m    172.31.43.77    kube-node1    <none>           <none>
kube-system            pod/kube-proxy-wxwd7                            1/1     Running   0              34m    172.31.42.77    kube-node2    <none>           <none>
kube-system            pod/kube-scheduler-kube-master                  1/1     Running   1 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/metrics-server-55d58b59c9-569p5             1/1     Running   0              15h    10.244.2.69     kube-node2    <none>           <none>
kubernetes-dashboard   pod/dashboard-metrics-scraper-8c47d4b5d-2vxfj   1/1     Running   0              16h    10.244.2.64     kube-node2    <none>           <none>

SERVICES (names have been changed):

NAMESPACE              NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                                           AGE    SELECTOR
default                service/kubernetes                  ClusterIP      10.96.0.1        <none>         443/TCP                                           292d   <none>
dev                    service/pod1B(service1b)            ClusterIP      10.102.52.69     <none>         5432/TCP                                          42h    app=database
dev                    service/pod2A(service2a)            ClusterIP      10.105.208.135   <none>         8080/TCP                                          42h    app=keycloak
dev                    service/pod1A(service1a)            ClusterIP      10.111.140.245   <none>         5432/TCP                                          42h    app=keycloak-database
kube-system            service/kube-dns                    ClusterIP      10.96.0.10       <none>         53/UDP,53/TCP,9153/TCP                            292d   k8s-app=kube-dns
kube-system            service/metrics-server              ClusterIP      10.111.227.187   <none>         443/TCP                                           285d   k8s-app=metrics-server
kubernetes-dashboard   service/dashboard-metrics-scraper   ClusterIP      10.110.143.2     <none>         8000/TCP                                          247d   k8s-app=dashboard-metrics-scraper

I use kube-proxy in iptables mode, and the services are of type ClusterIP (I also use about 2-3 NodePort services, but I don't think that is the problem).
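
To double-check which mode kube-proxy is actually running in, one option (assuming the default metricsBindAddress of 127.0.0.1:10249, as in the ConfigMap below) is to query its /proxyMode endpoint from a node, or to grep the kube-proxy logs, which should contain something like "Using iptables Proxier":

# On any node: kube-proxy reports its active mode on the metrics port.
curl -s http://127.0.0.1:10249/proxyMode     # prints "iptables" or "ipvs"

# Alternatively, check the kube-proxy pod logs (pod name from the listing above).
kubectl -n kube-system logs kube-proxy-wxwd7 | grep -i proxier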

kube-proxy DaemonSet configuration (ConfigMap):

apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    bindAddressHardFail: false
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.244.10.0/16
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    detectLocal:
      bridgeInterface: ""
      interfaceNamePrefix: ""
    detectLocalMode: ""
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
      tcpFinTimeout: 0s
      tcpTimeout: 0s
      udpTimeout: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    showHiddenMetricsForVersion: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      forwardHealthCheckVip: false
      networkName: ""
      rootHnsEndpointName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://172.31.42.90:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  annotations:
    kubeadm.kubernetes.io/component-config.hash: sha256:ebcfa3923c1228031a5b824f2edca518edc4bd49fd07cedeffa371084cba342b
  creationTimestamp: "2022-07-03T19:28:14Z"
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "40174591"
  uid: cfadfa22-ed43-4a3f-9897-25f605ebb8b9
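
For reference, the ConfigMap above can be dumped and compared against what the running kube-proxy pods actually load (the mount path /var/lib/kube-proxy matches the kubeconfig path in the config; the pod name is taken from the listing above):

# Dump the kube-proxy ConfigMap as applied in the cluster.
kubectl -n kube-system get configmap kube-proxy -o yaml

# Confirm the file the running kube-proxy on kube-node2 was actually started with.
kubectl -n kube-system exec kube-proxy-wxwd7 -- cat /var/lib/kube-proxy/config.conf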

Flannel DaemonSet configuration (ConfigMap):

apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"cni-conf.json":"{\n  \"name\": \"cbr0\",\n  \"cniVersion\": \"0.3.1\",\n  \"plugins\": [\n    {\n      \"type\": \"flannel\",\n      \"delegate\": {\n        \"hairpinMode\": true,\n        \"isDe$  creationTimestamp: "2022-07-03T20:34:32Z"
  labels:
    app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-system
  resourceVersion: "40178136"
  uid: ccb81719-7013-4f0b-8c20-ab9d5c3d5f8e
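
To sanity-check the flannel overlay and bridge setup on each node, these are generic checks (not from the original post, and not necessarily the cause here) that one could run; same-node pod-to-service traffic crosses the cni0 bridge, so bridged traffic must be handed to iptables for the kube-proxy rules to apply:

# Flannel writes this node's pod subnet here.
cat /run/flannel/subnet.env

# The vxlan interface and the cni0 bridge should both exist with 10.244.x.x addresses.
ip -d link show flannel.1
ip addr show cni0

# Bridged traffic must pass through iptables; expected value: 1.
sysctl net.bridge.bridge-nf-call-iptables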

iptables NAT rules (kube-system):

iptables -L -t nat | grep kube-system
KUBE-MARK-MASQ  all  --  ip-10-244-2-69.us-east-2.compute.internal  anywhere             /* kube-system/metrics-server:https */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/metrics-server:https */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns-tcp */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:metrics */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:metrics */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns-tcp */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/kube-dns:dns */ udp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/kube-dns:dns */ udp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:metrics */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:metrics */ tcp DNAT [unsupported revision]
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-SVC-Z4ANX4WAEWEBLCTM  tcp  --  anywhere             ip-10-111-227-187.us-east-2.compute.internal  /* kube-system/metrics-server:https cluster IP */ tcp dpt:https
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SEP-OP4AXEAS4OXHBEQX  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp -> 10.244.0.10:53 */ statistic mode random probability 0.50000000000
KUBE-SEP-A7YQ4MY4TZII3JTK  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp -> 10.244.0.11:53 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-SEP-HJ7EWOW62IX6GL6R  all  --  anywhere             anywhere             /* kube-system/kube-dns:metrics -> 10.244.0.10:9153 */ statistic mode random probability 0.50000000000
KUBE-SEP-ZJHOSXJEKQGYJUBC  all  --  anywhere             anywhere             /* kube-system/kube-dns:metrics -> 10.244.0.11:9153 */
KUBE-MARK-MASQ  udp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SEP-R7EMXN5TTQQVP4UW  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns -> 10.244.0.10:53 */ statistic mode random probability 0.50000000000
KUBE-SEP-VR6VIIG2A6524KLY  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns -> 10.244.0.11:53 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-111-227-187.us-east-2.compute.internal  /* kube-system/metrics-server:https cluster IP */ tcp dpt:https
KUBE-SEP-6BOUBB2FEQTN2GDB  all  --  anywhere             anywhere             /* kube-system/metrics-server:https -> 10.244.2.69:4443 */
iptables -L -t nat | grep dev
KUBE-MARK-MASQ  all  --  ip-10-244-3-104.us-east-2.compute.internal  anywhere             /* dev/pod2A:pod2A */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod2A:pod2A */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-2-63.us-east-2.compute.internal  anywhere             /* dev/pod1B:pod1B */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod1B:pod1B */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-2-65.us-east-2.compute.internal  anywhere             /* dev/pod1A:pod1A */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod1A:pod1A */ tcp DNAT [unsupported revision]
KUBE-SVC-MI7BJVF4L3EWWCLA  tcp  --  anywhere             ip-10-105-208-135.us-east-2.compute.internal  /* dev/pod2A:pod2A cluster IP */ tcp dpt:http-alt
KUBE-SVC-S2FASJERAWCYNV26  tcp  --  anywhere             ip-10-111-140-245.us-east-2.compute.internal  /* dev/pod1A:pod1A cluster IP */ tcp dpt:postgresql
KUBE-SVC-5JHIIG3NJGZTIC4I  tcp  --  anywhere             ip-10-102-52-69.us-east-2.compute.internal  /* dev/pod1B:pod1B cluster IP */ tcp dpt:postgresql
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-102-52-69.us-east-2.compute.internal  /* dev/pod1B:pod1B cluster IP */ tcp dpt:postgresql
KUBE-SEP-FOQDGOYPAUSJGXYE  all  --  anywhere             anywhere             /* dev/pod1B:pod1B -> 10.244.2.63:5432 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-105-208-135.us-east-2.compute.internal  /* dev/pod2A:pod2A cluster IP */ tcp dpt:http-alt
KUBE-SEP-AWG5CHYLOHV7OCEH  all  --  anywhere             anywhere             /* dev/pod2A:pod2A -> 10.244.3.104:8080 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-111-140-245.us-east-2.compute.internal  /* dev/pod1A:pod1A cluster IP */ tcp dpt:postgresql
KUBE-SEP-YDMKERDDJRZDWWVM  all  --  anywhere             anywhere             /* dev/pod1A:pod1A -> 10.244.2.65:5432 */
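
To look more closely at why the same-node call to service1b (ClusterIP 10.102.52.69:5432) times out, the per-service and per-endpoint chains can be listed directly; the chain names below are copied from the output above, and -n avoids the reverse-DNS names that make that output hard to read:

# Service chain for dev/pod1B (service1b, ClusterIP 10.102.52.69:5432).
iptables -t nat -L KUBE-SVC-5JHIIG3NJGZTIC4I -n -v

# Endpoint chain that should DNAT to 10.244.2.63:5432.
iptables -t nat -L KUBE-SEP-FOQDGOYPAUSJGXYE -n -v

# Packet/byte counters show whether traffic reaches the service rules at all.
iptables -t nat -L KUBE-SERVICES -n -v | grep 10.102.52.69

# If the conntrack tool is installed, entries for the ClusterIP can also be inspected.
conntrack -L -d 10.102.52.69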

Can anyone help me figure out why I cannot reach a service from a pod on the same node?

Answer 1

Solved; maybe someone will find the solution useful here. The issue was actually several problems at the same time.

  1. We had different server, client, and kubelet versions across the workers (1.24.3, 1.24.5, ... even 1.26.x); see the version check sketched after this list.
  2. The masqueradeAll setting in the kube-proxy configuration was not correct for the 1.26 versions.
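
A quick way to confirm this kind of version skew (generic commands, not from the original post):

# Kubelet version per node is shown in the VERSION column.
kubectl get nodes -o wide

# Client and API server versions.
kubectl version

# Package versions installed on each Ubuntu node.
dpkg -l kubeadm kubelet kubectl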

After upgrading all control planes and nodes to the same version, in our case with kubeadm going 1.24.x -> 1.25.9 -> 1.26.4, plus upgrading the Ubuntu OS (apt upgrade),

the cluster became stable again and all nodes joined correctly (kubectl get nodes).
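
A rough sketch of the upgrade steps, one minor version at a time, assuming kubeadm-managed Ubuntu nodes with the Kubernetes apt packages (the 1.25.9 pin is just the intermediate version mentioned above; exact package names and repositories depend on the setup):

# On the control-plane node:
sudo apt-get update && sudo apt-get install -y kubeadm=1.25.9-00
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.25.9

# On each worker node:
sudo apt-get install -y kubeadm=1.25.9-00
sudo kubeadm upgrade node

# Then on every node, upgrade kubelet/kubectl and restart the kubelet:
sudo apt-get install -y kubelet=1.25.9-00 kubectl=1.25.9-00
sudo systemctl daemon-reload && sudo systemctl restart kubelet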

The final change was needed because of the 1.25 -> 1.26 upgrade and a GitHub discussion (answer by mweissdigchg): "... faced the same problems after upgrading from v1.25 to v1.26."

"... it turned out that the kube-proxy iptables setting masqueradeAll: true made our services unreachable from pods on other nodes. It seems the default configuration changed from masqueradeAll: false to masqueradeAll: true ..."
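
Based on that, a sketch of the corresponding fix, assuming the cure is to set masqueradeAll back to false in the kubeadm-managed kube-proxy ConfigMap and roll the DaemonSet (the original post does not show the exact commands):

# Edit the iptables section of the kube-proxy config:
kubectl -n kube-system edit configmap kube-proxy
#   config.conf -> iptables:
#     masqueradeAll: false

# Restart kube-proxy so every node picks up the new config.
kubectl -n kube-system rollout restart daemonset kube-proxy
kubectl -n kube-system rollout status daemonset kube-proxy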
