How can I determine the in-progress/preparing state of a Kubernetes deployment before it is "Ready"?

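As part of my CI test logic, I have a script that creates a Kubernetes deployment file for each of a set of dedicated nodes, deletes any previous deployments from them, and then starts the new ones. (The config is attached at the bottom, since it probably isn't important.) Once I have run tests against them, they are shut down, ready for the next test run. The nodes run only my deployments, so I don't bother declaring how much CPU/memory/whatever they need, and I don't have any readiness scripts, because the containers work that out between themselves and I only need to talk to the status monitoring service once it has an IP address.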

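Normally they are ready and working within a minute or so (my script monitors the output of the following command until nothing reports "False"), but sometimes they don't start within the time I allow. I don't want to wait indefinitely if something has gone wrong, because I need to collect the addresses and feed them to a downstream process that sets up my tests with the completed deployments. But if Kubernetes can't show me meaningful progress or a diagnosis of why things are slow, I can't do much beyond aborting the incomplete deployments.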

kubectl get pods -l pod-agent=$AGENT_NAME \
      -o 'jsonpath={range .items[*]}{..status.conditions[?(@.type=="Ready")].status}:{.status.podIP}:{.status.phase}:{.metadata.name} '

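I speculated that one of the containers might not have been used on that host before, and that copying it to each host might take so long that the overall deployment exceeded my script's timeout, so I added this (ignore the "| cat |"; it is a workaround for an IntelliJ terminal bug)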

kubectl describe pod $REPLY | cat | sed -n '/Events:/,$p; /emulator.*:/,/Ready:/p'

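to give me an idea of what each pod is doing every time the first command returns "False". But the results look inconsistent: while the "Events" section claims the containers have been pulled and started, the structured output from the same command shows the containers as "ContainerCreating":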

     1  False::Pending:kubulator-mysh-automation11-dlan-666b96d788-6gfl7
  emulator-5554:
    Container ID:   
    Image:          dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
  emulator-5556:
    Container ID:   
    Image:          dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False

...more of the same, then

Events:
  Type    Reason     Age   From                        Message
  ----    ------     ----  ----                        -------
  Normal  Scheduled  23s   default-scheduler           Successfully assigned auto/kubulator-mysh-automation11-dlan-666b96d788-6gfl7 to automation11.dlan
  Normal  Pulling    16s   kubelet, automation11.dlan  Pulling image "dockerio.dlan/auto/ticket-machine"
  Normal  Pulled     16s   kubelet, automation11.dlan  Successfully pulled image "dockerio.dlan/auto/ticket-machine"
  Normal  Created    16s   kubelet, automation11.dlan  Created container ticket-machine
  Normal  Started    16s   kubelet, automation11.dlan  Started container ticket-machine
  Normal  Pulling    16s   kubelet, automation11.dlan  Pulling image "dockerio.dlan/qa/cgi-bin-remote"
  Normal  Created    15s   kubelet, automation11.dlan  Created container cgi-adb-remote
  Normal  Pulled     15s   kubelet, automation11.dlan  Successfully pulled image "dockerio.dlan/qa/cgi-bin-remote"
  Normal  Started    15s   kubelet, automation11.dlan  Started container cgi-adb-remote
  Normal  Pulling    15s   kubelet, automation11.dlan  Pulling image "dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240"
  Normal  Pulled     15s   kubelet, automation11.dlan  Successfully pulled image "dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240"
  Normal  Created    15s   kubelet, automation11.dlan  Created container emulator-5554
  Normal  Started    15s   kubelet, automation11.dlan  Started container emulator-5554
  Normal  Pulled     15s   kubelet, automation11.dlan  Successfully pulled image "dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240"
  Normal  Pulling    15s   kubelet, automation11.dlan  Pulling image "dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240"
  Normal  Created    14s   kubelet, automation11.dlan  Created container emulator-5556
  Normal  Started    14s   kubelet, automation11.dlan  Started container emulator-5556
  Normal  Pulling    14s   kubelet, automation11.dlan  Pulling image "dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240"

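So the events claim that the containers have started, but the structured data contradicts this. I would treat the events as authoritative, but they are rather oddly truncated at 26 leading(!) entries, even though the server has no event rate-limit configuration set.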

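At the end I have included the full description of one of the containers that the events claim is "Started", but I see no clue in the full output.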

Once the deployment comes up, i.e. the first line shows "True", all the containers suddenly show as "Running".

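So my basic question is: given that describe pod is apparently unreliable and/or incomplete, how can I determine the real state of the deployment (apparently represented by the "Events"), so that I can understand why and where it is stuck when it fails?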

Is there anything beyond "kubectl get pods" that I can use to find the real state of play? (Preferably not something as crude as ssh-ing into the server and sniffing its raw logs.)

Thanks.

kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", ..., GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:...", ..., Platform:"linux/amd64"}


My deployment file:

apiVersion: v1
kind: Service
metadata:
  name: kubulator-mysh-automation11-dlan
  labels:
    run: kubulator-mysh-automation11-dlan
    pod-agent: mysh
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: http
      protocol: TCP
      port: 8088
      targetPort: 8088
    - name: adb-remote
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: adb
      protocol: TCP
      port: 9100
      targetPort: 9100
  selector:
    run: kubulator-mysh-automation11-dlan
    kubernetes.io/hostname: automation11.dlan
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubulator-mysh-automation11-dlan
  labels:
    pod-agent: mysh
spec:
  selector:
    matchLabels:
      run: kubulator-mysh-automation11-dlan
      pod-agent: mysh
  replicas: 1
  template:
    metadata:
      labels:
        run: kubulator-mysh-automation11-dlan
        pod-agent: mysh
    spec:
      nodeSelector:
        kubernetes.io/hostname: automation11.dlan
      volumes:
        - name: dev-kvm
          hostPath:
            path: /dev/kvm
            type: CharDevice
        - name: logs
          emptyDir: {}
      containers:
        - name: ticket-machine
          image: dockerio.dlan/auto/ticket-machine
          args: ['--', '--count', '20']  # --adb /local/adb-....
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /logs
              name: logs
          ports:
            - containerPort: 8088
          env:
            - name: ANDROID_ADB_SERVER_PORT
              value: "9100"
            - name: ANDROID_ADB_SERVER
              value: host
        - name: cgi-adb-remote
          image: dockerio.dlan/qa/cgi-bin-remote
          args: ['/root/git/CgiAdbRemote/CgiAdbRemote.pl', '-foreground', '-port=8080', "-adb=/root/adb-8aug-usbbus-maxemu-v39"]
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
          env:
            - name: ADB_SERVER_SOCKET
              value: "tcp:localhost:9100"
            - name: ANDROID_ADB_SERVER
              value: host
        - name: emulator-5554
          image: dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240
          imagePullPolicy: Always
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /logs
              name: logs
            - mountPath: /dev/kvm
              name: dev-kvm
          env:
            - name: ANDROID_ADB_VERSION
              value: v39
            - name: ANDROID_ADB_SERVER_PORT
              value: '9100'
            - name: EMULATOR_PORT
              value: '5554'
            - name: EMULATOR_MAX_SECS
              value: '2400'
            - name: ANDROID_ADB_SERVER
              value: host
            - name: EMU_WINDOW
              value: '2'
        - name: emulator-5556
          image: dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240
... etc - several more of these emulator containers.

And the full "describe" output for a container that the events claim is "Started":

  emulator-5554:
    Container ID:   
    Image:          dockerio.dlan/auto/android-avd-10a29v8-emu29_0_11_kuber-snapshot-skin_name-540x1060-hw_lcd_density-240
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      ANDROID_ADB_VERSION:      v39
      ANDROID_ADB_SERVER_PORT:  9100
      EMULATOR_PORT:            5554
      EMULATOR_MAX_SECS:        2400
      ANDROID_ADB_SERVER:       host
      EMU_WINDOW:               2
    Mounts:
      /dev/kvm from dev-kvm (rw)
      /logs from logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-2jrv5 (ro)

Answer 1

You can use kubectl wait to pause test execution until the pods are in the Ready state.
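As a rough sketch of how that could replace the polling loop: the function below blocks on kubectl wait and, if the timeout expires, prints the namespace's events sorted by time so the script has something to diagnose with. The function name wait_for_agent_pods and the 120s default are illustrative assumptions; the pod-agent label comes from the question, and the current namespace is assumed.

```shell
# Sketch: wait for all pods with a given pod-agent label to become Ready.
# wait_for_agent_pods and the 120s default timeout are hypothetical choices.
wait_for_agent_pods() {
  local agent="$1"
  local timeout="${2:-120s}"

  # kubectl wait exits non-zero if the condition is not met within --timeout.
  if kubectl wait --for=condition=Ready pod -l "pod-agent=${agent}" --timeout="${timeout}"; then
    return 0
  fi

  echo "Pods for ${agent} not Ready after ${timeout}; recent events:" >&2
  # Events are separate API objects; sorting puts the newest ones last.
  kubectl get events --sort-by=.lastTimestamp >&2
  return 1
}
```

It could be called as wait_for_agent_pods "$AGENT_NAME" 90s, with the script aborting the deployment only when the function returns non-zero.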

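Keep in mind that if your application doesn't use readiness probes, a pod being in the Ready state does not mean the application is actually ready to receive traffic, which can make your tests flaky.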
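For illustration only, one way to give a container a readiness probe is a strategic merge patch on the Deployment. This is a sketch under assumptions: the helper name add_tcp_readiness_probe, the choice of a TCP probe, and the 5-second period are not from the question. The same readinessProbe stanza could instead be written into the generated deployment file.

```shell
# Sketch: add a TCP readiness probe to one container of an existing Deployment.
# The probe type, port and period are assumptions, not taken from the question.
add_tcp_readiness_probe() {
  local deployment="$1"
  local container="$2"
  local port="$3"
  local patch

  # Strategic merge patch: containers are matched by name, so only the
  # named container gains a readinessProbe; the others are left untouched.
  patch=$(printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","readinessProbe":{"tcpSocket":{"port":%s},"periodSeconds":5}}]}}}}' "$container" "$port")

  kubectl patch deployment "$deployment" --patch "$patch"
}
```

For example, add_tcp_readiness_probe kubulator-mysh-automation11-dlan ticket-machine 8088 would mark that container Ready only once something accepts connections on port 8088. Note that patching the pod template triggers a new rollout.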
