Kubernetes pods running on the same node cannot reach each other through a Service

I am stuck trying to call from one pod to another through a Service when both pods run on the same node.

To explain:

  • Node1 - service1a (pod1A), service1b (pod1B)
  • Node2 - service2a (pod2A)

What I observe (the checks are sketched as commands right after this list):

  • PING pod1A -> pod1B: OK
  • PING pod1A -> pod2A: OK
  • CURL pod1A -> service2a: OK
  • CURL pod1A -> service1b: TIMEOUT
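
A minimal sketch of those checks, assuming a shell is available inside pod1A and using the (renamed) pod and Service IPs from the listings further down:

# exec into pod1A, e.g.:
kubectl -n dev exec -it pod/pod1A-database-596f76d8b5-6bdqv -- sh

# inside the pod:
ping -c 3 10.244.2.63                 # pod1B pod IP, same node      -> replies
ping -c 3 10.244.3.104                # pod2A pod IP, other node     -> replies
curl -v http://10.105.208.135:8080/   # service2a ClusterIP (backend on the other node) -> responds
curl -v telnet://10.102.52.69:5432    # service1b ClusterIP (backend on the same node)  -> times out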

I have spent several days changing parts of the configuration and searching the internet for the same problem, but with no luck.

The same problem was described here: https://stackoverflow.com/questions/64073696/pods-running-on-the-same-node-cant-access-to-each-other-through-service, but there it was solved by switching from IPVS to iptables. I am already using iptables.

PODS (names changed):

NAMESPACE              NAME                                            READY   STATUS    RESTARTS       AGE    IP              NODE          NOMINATED NODE   READINESS GATES
dev                    pod/pod1B-5d595bf69-8fgsm                       1/1     Running   0              16h    10.244.2.63     kube-node2    <none>           <none>
dev                    pod/pod1A-database-596f76d8b5-6bdqv             1/1     Running   0              16h    10.244.2.65     kube-node2    <none>           <none>
dev                    pod/pod2A-dbb8fd4d-xv54n                        1/1     Running   1              15h    10.244.3.104    kube-node3    <none>           <none>
kube-system            pod/coredns-6d4b75cb6d-6b2cn                    1/1     Running   4 (50d ago)    292d   10.244.0.10     kube-master   <none>           <none>
kube-system            pod/coredns-6d4b75cb6d-6m7q2                    1/1     Running   4 (50d ago)    292d   10.244.0.11     kube-master   <none>           <none>
kube-system            pod/etcd-kube-master                            1/1     Running   2 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-apiserver-kube-master                  1/1     Running   2 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-controller-manager-kube-master         1/1     Running   1 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-flannel-ds-bwkjg                       1/1     Running   0              62s    172.31.45.210   kube-node3    <none>           <none>
kube-system            pod/kube-flannel-ds-g9v9m                       1/1     Running   0              66s    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-flannel-ds-hljj5                       1/1     Running   0              30s    172.31.42.77    kube-node2    <none>           <none>
kube-system            pod/kube-flannel-ds-k4zfw                       1/1     Running   0              65s    172.31.43.77    kube-node1    <none>           <none>
kube-system            pod/kube-proxy-68k5n                            1/1     Running   0              35m    172.31.45.210   kube-node3    <none>           <none>
kube-system            pod/kube-proxy-lb6s9                            1/1     Running   0              35m    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/kube-proxy-vggwk                            1/1     Running   0              35m    172.31.43.77    kube-node1    <none>           <none>
kube-system            pod/kube-proxy-wxwd7                            1/1     Running   0              34m    172.31.42.77    kube-node2    <none>           <none>
kube-system            pod/kube-scheduler-kube-master                  1/1     Running   1 (50d ago)    50d    172.31.42.90    kube-master   <none>           <none>
kube-system            pod/metrics-server-55d58b59c9-569p5             1/1     Running   0              15h    10.244.2.69     kube-node2    <none>           <none>
kubernetes-dashboard   pod/dashboard-metrics-scraper-8c47d4b5d-2vxfj   1/1     Running   0              16h    10.244.2.64     kube-node2    <none>           <none>

SERVICES (names changed):

NAMESPACE              NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                                           AGE    SELECTOR
default                service/kubernetes                  ClusterIP      10.96.0.1        <none>         443/TCP                                           292d   <none>
dev                    service/pod1B(service1b)            ClusterIP      10.102.52.69     <none>         5432/TCP                                          42h    app=database
dev                    service/pod2A(service2a)            ClusterIP      10.105.208.135   <none>         8080/TCP                                          42h    app=keycloak
dev                    service/pod1A(service1a)            ClusterIP      10.111.140.245   <none>         5432/TCP                                          42h    app=keycloak-database
kube-system            service/kube-dns                    ClusterIP      10.96.0.10       <none>         53/UDP,53/TCP,9153/TCP                            292d   k8s-app=kube-dns
kube-system            service/metrics-server              ClusterIP      10.111.227.187   <none>         443/TCP                                           285d   k8s-app=metrics-server
kubernetes-dashboard   service/dashboard-metrics-scraper   ClusterIP      10.110.143.2     <none>         8000/TCP                                          247d   k8s-app=dashboard-metrics-scraper

I use kube-proxy in iptables mode and the services are ClusterIP (I also have around 2-3 NodePort services, but I don't think that is the problem).
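
For completeness, a quick way to confirm which mode kube-proxy actually runs in (mode: "" in the ConfigMap below falls back to iptables); the pod name is taken from the listing above, and /proxyMode is served on the metrics address (127.0.0.1:10249) configured further down:

# mode field as stored in the ConfigMap ("" = default = iptables)
kubectl -n kube-system get configmap kube-proxy -o yaml | grep "mode:"

# on a node, ask the running kube-proxy directly
curl -s http://127.0.0.1:10249/proxyMode

# or look at which proxier kube-proxy reports in its logs
kubectl -n kube-system logs kube-proxy-wxwd7 | grep -i proxier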

kube-proxy configuration (ConfigMap):

apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    bindAddressHardFail: false
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.244.10.0/16
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    detectLocal:
      bridgeInterface: ""
      interfaceNamePrefix: ""
    detectLocalMode: ""
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
      tcpFinTimeout: 0s
      tcpTimeout: 0s
      udpTimeout: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    showHiddenMetricsForVersion: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      forwardHealthCheckVip: false
      networkName: ""
      rootHnsEndpointName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://172.31.42.90:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  annotations:
    kubeadm.kubernetes.io/component-config.hash: sha256:ebcfa3923c1228031a5b824f2edca518edc4bd49fd07cedeffa371084cba342b
  creationTimestamp: "2022-07-03T19:28:14Z"
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "40174591"
  uid: cfadfa22-ed43-4a3f-9897-25f605ebb8b9

flannel configuration (ConfigMap):

apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"cni-conf.json":"{\n  \"name\": \"cbr0\",\n  \"cniVersion\": \"0.3.1\",\n  \"plugins\": [\n    {\n      \"type\": \"flannel\",\n      \"delegate\": {\n        \"hairpinMode\": true,\n        \"isDe$  creationTimestamp: "2022-07-03T20:34:32Z"
  labels:
    app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-system
  resourceVersion: "40178136"
  uid: ccb81719-7013-4f0b-8c20-ab9d5c3d5f8e

IPTABLES (kube-system and dev):

iptables -L -t nat | grep kube-system
KUBE-MARK-MASQ  all  --  ip-10-244-2-69.us-east-2.compute.internal  anywhere             /* kube-system/metrics-server:https */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/metrics-server:https */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns-tcp */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:metrics */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:metrics */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns-tcp */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-10.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/kube-dns:dns */ udp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/kube-dns:dns */ udp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-0-11.us-east-2.compute.internal  anywhere             /* kube-system/kube-dns:metrics */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:metrics */ tcp DNAT [unsupported revision]
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-SVC-Z4ANX4WAEWEBLCTM  tcp  --  anywhere             ip-10-111-227-187.us-east-2.compute.internal  /* kube-system/metrics-server:https cluster IP */ tcp dpt:https
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  anywhere             ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SEP-OP4AXEAS4OXHBEQX  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp -> 10.244.0.10:53 */ statistic mode random probability 0.50000000000
KUBE-SEP-A7YQ4MY4TZII3JTK  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp -> 10.244.0.11:53 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-SEP-HJ7EWOW62IX6GL6R  all  --  anywhere             anywhere             /* kube-system/kube-dns:metrics -> 10.244.0.10:9153 */ statistic mode random probability 0.50000000000
KUBE-SEP-ZJHOSXJEKQGYJUBC  all  --  anywhere             anywhere             /* kube-system/kube-dns:metrics -> 10.244.0.11:9153 */
KUBE-MARK-MASQ  udp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-96-0-10.us-east-2.compute.internal  /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SEP-R7EMXN5TTQQVP4UW  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns -> 10.244.0.10:53 */ statistic mode random probability 0.50000000000
KUBE-SEP-VR6VIIG2A6524KLY  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns -> 10.244.0.11:53 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-111-227-187.us-east-2.compute.internal  /* kube-system/metrics-server:https cluster IP */ tcp dpt:https
KUBE-SEP-6BOUBB2FEQTN2GDB  all  --  anywhere             anywhere             /* kube-system/metrics-server:https -> 10.244.2.69:4443 */
iptables -L -t nat | grep dev
KUBE-MARK-MASQ  all  --  ip-10-244-3-104.us-east-2.compute.internal  anywhere             /* dev/pod2A:pod2A */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod2A:pod2A */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-2-63.us-east-2.compute.internal  anywhere             /* dev/pod1B:pod1B */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod1B:pod1B */ tcp DNAT [unsupported revision]
KUBE-MARK-MASQ  all  --  ip-10-244-2-65.us-east-2.compute.internal  anywhere             /* dev/pod1A:pod1A */
DNAT       tcp  --  anywhere             anywhere             /* dev/pod1A:pod1A */ tcp DNAT [unsupported revision]
KUBE-SVC-MI7BJVF4L3EWWCLA  tcp  --  anywhere             ip-10-105-208-135.us-east-2.compute.internal  /* dev/pod2A:pod2A cluster IP */ tcp dpt:http-alt
KUBE-SVC-S2FASJERAWCYNV26  tcp  --  anywhere             ip-10-111-140-245.us-east-2.compute.internal  /* dev/pod1A:pod1A cluster IP */ tcp dpt:postgresql
KUBE-SVC-5JHIIG3NJGZTIC4I  tcp  --  anywhere             ip-10-102-52-69.us-east-2.compute.internal  /* dev/pod1B:pod1B cluster IP */ tcp dpt:postgresql
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-102-52-69.us-east-2.compute.internal  /* dev/pod1B:pod1B cluster IP */ tcp dpt:postgresql
KUBE-SEP-FOQDGOYPAUSJGXYE  all  --  anywhere             anywhere             /* dev/pod1B:pod1B -> 10.244.2.63:5432 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-105-208-135.us-east-2.compute.internal  /* dev/pod2A:pod2A cluster IP */ tcp dpt:http-alt
KUBE-SEP-AWG5CHYLOHV7OCEH  all  --  anywhere             anywhere             /* dev/pod2A:pod2A -> 10.244.3.104:8080 */
KUBE-MARK-MASQ  tcp  -- !ip-10-244-0-0.us-east-2.compute.internal/16  ip-10-111-140-245.us-east-2.compute.internal  /* dev/pod1A:pod1A cluster IP */ tcp dpt:postgresql
KUBE-SEP-YDMKERDDJRZDWWVM  all  --  anywhere             anywhere             /* dev/pod1A:pod1A -> 10.244.2.65:5432 */
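
As a side note, iptables -L hides the actual DNAT targets here ("[unsupported revision]" usually means the host's iptables binary is older than the one kube-proxy writes the rules with). iptables-save prints the complete rules, so the service1b path can be traced end to end, for example on kube-node2:

# full NAT rules for the service1b ClusterIP and its endpoint chain
iptables-save -t nat | grep "dev/pod1B"
iptables-save -t nat | grep -E "KUBE-SVC-5JHIIG3NJGZTIC4I|KUBE-SEP-FOQDGOYPAUSJGXYE"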

Can anyone help me figure out why I cannot connect from a pod to a Service whose backing pod is on the same node?

Answer 1

Solved, but maybe someone else will find the solution here useful. The issue turned out to be several problems at once.

  1. We had different versions of the server, the client, and the kubelets on the workers (1.24.3, 1.24.5, ... even 1.26.x); see the commands after this list.
  2. The kube-proxy masqueradeAll setting was not correct for the 1.26 versions.
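
The version skew shows up directly with standard kubectl commands, e.g.:

kubectl version            # client and API server versions
kubectl get nodes -o wide  # the VERSION column is the kubelet version on each node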

We then upgraded the control plane and all nodes to the same version, in our case with kubeadm from 1.24.x -> 1.25.9 -> 1.26.4, plus an Ubuntu OS update (apt upgrade).

After that the cluster became stable again and all nodes showed up correctly in kubectl get nodes.
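
For reference, a sketch of the standard kubeadm upgrade loop for one minor hop (repeat per hop, 1.24.x -> 1.25.9 -> 1.26.4; the apt version pins are illustrative and depend on your package repository):

# control-plane node
apt-get update && apt-get install -y kubeadm=1.25.9-00
kubeadm upgrade plan
kubeadm upgrade apply v1.25.9
apt-get install -y kubelet=1.25.9-00 kubectl=1.25.9-00
systemctl daemon-reload && systemctl restart kubelet

# each worker node (drain it before, uncordon it after)
apt-get update && apt-get install -y kubeadm=1.25.9-00
kubeadm upgrade node
apt-get install -y kubelet=1.25.9-00
systemctl daemon-reload && systemctl restart kubelet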

The final change came from the 1.25 -> 1.26 upgrade and a discussion on GitHub; the answer by mweissdigchg describes the same thing: "...experienced the same problems after upgrading from v1.25 to v1.26."

"... it turned out that the kube-proxy iptables setting masqueradeAll: true made our services unreachable from pods on other nodes. It seems the default changed from masqueradeAll: false to masqueradeAll: true..."
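
If that is the cause, one way to revert it is to set the flag back to false in the kube-proxy ConfigMap and restart the DaemonSet (a sketch; double-check the key path in your own config.conf):

kubectl -n kube-system edit configmap kube-proxy
#   in config.conf set:
#     iptables:
#       masqueradeAll: false

# restart the kube-proxy pods so they pick up the new ConfigMap
kubectl -n kube-system rollout restart daemonset kube-proxy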
