Kubernetes FailedAttachVolume and FailedMount

I have a Kubernetes cluster on OVH Cloud. Today Nginx suddenly started answering requests to the website with a 503 error. I checked the cluster with kubectl get pods and saw that all pods that use one particular volume were no longer ready. The events of every affected pod show FailedAttachVolume and FailedMount errors. As an example, here is the event log of one pod:

      Warning  FailedAttachVolume  15m                  attachdetach-controller  AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-b1820e9f-935b-442e-b68e-efe7de0feb35)"}}
      Warning  FailedAttachVolume  12m (x2 over 14m)    attachdetach-controller  AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-c6e51d31-8646-44a2-ba75-7069e3ed87fa)"}}
      Warning  FailedAttachVolume  10m                  attachdetach-controller  AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-a5c9c89e-5578-4b6c-8722-acf583dea1a8)"}}
      Warning  FailedMount         3m18s (x4 over 12m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[files-volume], unattached volumes=[kube-api-access-q9pjc files-volume]: timed out waiting for the condition
      Warning  FailedMount         63s (x3 over 14m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[files-volume], unattached volumes=[files-volume kube-api-access-q9pjc]: timed out waiting for the condition
      Warning  FailedAttachVolume  11s (x5 over 8m21s)  attachdetach-controller  (combined from similar events): AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-39554882-3f5c-40e9-aad2-482ff427c632)"}}
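The attach events above are long, and the relevant facts (which volume, which server, which status) are buried in the middle. As a rough aid for reading them, here is a minimal Python sketch that pulls those three values out of one event message. The function name summarize_attach_error and the shortened EXAMPLE string are my own illustration, and the regular expressions only assume the wording seen in the events above; other Cinder versions may phrase the error differently.

```python
import re

# Matches the Cinder error text quoted in the events above.
STATUS_RE = re.compile(
    r"Volume (?P<volume>[0-9a-f-]+) status must be (?P<expected>.+?) to reserve, "
    r"but the current status is (?P<current>[\w-]+)"
)
# Matches the "failed to attach <volume> volume to <server> compute" part.
ATTACH_RE = re.compile(
    r"failed to attach (?P<volume>\S+) volume to (?P<server>\S+) compute"
)


def summarize_attach_error(message: str) -> dict:
    """Return volume ID, target server ID and reported volume status, if present."""
    summary = {}
    attach = ATTACH_RE.search(message)
    if attach:
        summary["volume"] = attach.group("volume")
        summary["server"] = attach.group("server")
    status = STATUS_RE.search(message)
    if status:
        summary["volume"] = status.group("volume")
        summary["expected_status"] = status.group("expected")
        summary["current_status"] = status.group("current")
    return summary


# Shortened copy of one event message from above ("..." marks omitted text).
EXAMPLE = (
    'AttachVolume.Attach failed for volume "..." : rpc error: ... '
    "failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to "
    "5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: ... "
    "Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be "
    "available or downloading to reserve, but the current status is in-use. (HTTP 400)"
)

if __name__ == "__main__":
    print(summarize_attach_error(EXAMPLE))
```

Read this way, every FailedAttachVolume event says the same thing: the attach request for volume 0655d400-3333-2222-1111-6fbcc2b62f94 to server 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 is rejected because the volume is reported as in-use rather than available.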

The PVC is mounted in all Deployments as follows:

spec:
  ...
  template:
    spec:
      ...
      containers:
      - name: ...
        ...
        volumeMounts:
        - name: files-volume
          mountPath: /files
        ...
      volumes:
        - name: files-volume
          persistentVolumeClaim:
            claimName: pv-files-claim

The PVC looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-files-claim
spec:
  storageClassName: csi-cinder-high-speed
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

What can cause this error, and how do I fix it? And how can I prevent it in the future?

In the meantime, all pods except one have reattached to the volume on their own. For that one pod, however, it still does not seem to work.
