I have a Kubernetes cluster on OVH Cloud. Today Nginx suddenly started responding with a 503 error when the website was requested. I checked the cluster with kubectl get pods and saw that all pods using one particular volume were no longer ready. The events of every affected pod show FailedAttachVolume and FailedMount errors. As an example, here is the event log of one pod:
Warning FailedAttachVolume 15m attachdetach-controller AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-b1820e9f-935b-442e-b68e-efe7de0feb35)"}}
Warning FailedAttachVolume 12m (x2 over 14m) attachdetach-controller AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-c6e51d31-8646-44a2-ba75-7069e3ed87fa)"}}
Warning FailedAttachVolume 10m attachdetach-controller AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-a5c9c89e-5578-4b6c-8722-acf583dea1a8)"}}
Warning FailedMount 3m18s (x4 over 12m) kubelet Unable to attach or mount volumes: unmounted volumes=[files-volume], unattached volumes=[kube-api-access-q9pjc files-volume]: timed out waiting for the condition
Warning FailedMount 63s (x3 over 14m) kubelet Unable to attach or mount volumes: unmounted volumes=[files-volume], unattached volumes=[files-volume kube-api-access-q9pjc]: timed out waiting for the condition
Warning FailedAttachVolume 11s (x5 over 8m21s) attachdetach-controller (combined from similar events): AttachVolume.Attach failed for volume "example-managed-kubernetes-mrx2n8-pvc-ca435065-1111-aaaa-0123-543465516bb2" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach 0655d400-3333-2222-1111-6fbcc2b62f94 volume to 5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [POST https://compute.de1.cloud.example.net/v2.1/e2b2680af21e4q9n3e8hfoc39rpgowpd/servers/5d51a18b-abcd-w2re-wfe2-a94d2b4ca988/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or downloading to reserve, but the current status is in-use. (HTTP 400) (Request-ID: req-39554882-3f5c-40e9-aad2-482ff427c632)"}}
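Since the events are long and repetitive, here is a minimal sketch of how the relevant parts can be pulled out of one message: the Cinder volume ID, the compute instance ID, and the volume status reported by the API. The function name summarize_attach_error and the regular expression are my own illustration, not part of any Kubernetes or OpenStack tooling, and the sample string is a shortened copy of the event text above.

```python
import re

# Hypothetical pattern, written only for the message format shown in the events above.
ATTACH_ERROR = re.compile(
    r"failed to attach (?P<volume>\S+) volume to (?P<server>\S+) compute"
    r".*?current status is (?P<status>[\w-]+)",
    re.DOTALL,
)


def summarize_attach_error(message):
    """Return volume ID, server ID and reported volume status, or None if no match."""
    match = ATTACH_ERROR.search(message)
    if match is None:
        return None
    return match.groupdict()


# Shortened copy of one FailedAttachVolume message from the event log.
event = (
    "Attach Volume failed with error failed to attach "
    "0655d400-3333-2222-1111-6fbcc2b62f94 volume to "
    "5d51a18b-abcd-w2re-wfe2-a94d2b4ca988 compute: Bad request with: [...] "
    "Volume 0655d400-3333-2222-1111-6fbcc2b62f94 status must be available or "
    "downloading to reserve, but the current status is in-use. (HTTP 400)"
)

print(summarize_attach_error(event))
```

For the events above, this shows that every attempt targets the same volume and the same instance, and that the volume is reported as in-use each time.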
The PVC is referenced in all deployments as follows:
spec:
  ...
  template:
    spec:
      ...
      containers:
        - name: ...
          ...
          volumeMounts:
            - name: files-volume
              mountPath: /files
          ...
      volumes:
        - name: files-volume
          persistentVolumeClaim:
            claimName: pv-files-claim
The PVC looks like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-files-claim
spec:
  storageClassName: csi-cinder-high-speed
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
How can this error occur, and how do I fix it? And how can I prevent it in the future?
In the meantime, all pods except one have reattached to the volume on their own. For that one pod, however, the attach still does not seem to work.