Grafana Loki, AlertManager - unable to read rule dir, open /tmp/loki/rules/fake: no such file or directory

2024-6-28

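I have deployed promtail, Grafana, Loki, and AlertManager with Helm charts on a local k3d cluster. I want to define some rules in Loki so that AlertManager is notified when certain events occur. For now I have only tried a simple rule, just to check that the setup works.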

My Loki version: {"version":"2.6.1","revision":"6bd05c9a4","branch":"HEAD","buildUser":"root@ea1e89b8da02","buildDate":"2022-07-18T08:49:07Z","goVersion":""}

My Grafana version:

And the Loki config:

loki:
  # should loki be deployed on cluster?
  enabled: true

  image:
    repository: grafana/loki
    pullPolicy: Always
    pullSecrets:
      - registry
  priorityClassName: normal
  resources:
    limits:
      memory: 3Gi
      cpu: 0
    requests:
      memory: 0
      cpu: 0
  config:
    chunk_store_config:
      max_look_back_period: 30d
    table_manager:
      retention_deletes_enabled: true
      retention_period: 30d
    query_range:
      split_queries_by_interval: 0
      parallelise_shardable_queries: false
    querier:
      max_concurrent: 2048
    frontend:
      max_outstanding_per_tenant: 4096
      compress_responses: true
    ingester:
      wal:
        enabled: true
        dir: /tmp/wal
    schema_config:
      configs:
        - from: 2022-12-05
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /tmp/loki/boltdb-shipper-active
        cache_location: /tmp/loki/boltdb-shipper-cache
        cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space 
        shared_store: filesystem
      filesystem:
        directory: /tmp/loki/chunks
    compactor:
      working_directory: /tmp/loki/boltdb-shipper-compactor
      shared_store: filesystem
    ruler:
      storage:
        type: local
        local:
          directory: /tmp/loki/rules/
      ring:
        kvstore:
          store: inmemory
      rule_path: /tmp/loki/rules-temp
      alertmanager_url: http://onprem-kube-prometheus-alertmanager.svc.mylocal-monitoring:9093
      enable_api: true
      enable_alertmanager_v2: true
  write:
    extraVolumeMounts:
      - name: rules-config
        mountPath: /tmp/loki/rules/fake/
    extraVolumes:
      - name: rules-config
        configMap:
          name: rules-cfgmap
          items:
            - key: "rules.yaml"
              path: "rules.yaml"
  read:
    extraVolumeMounts:
      - name: rules-config
        mountPath: /tmp/loki/rules/fake/
    extraVolumes:
      - name: rules-config
        configMap:
          name: rules-cfgmap
          items:
            - key: "rules.yaml"
              path: "rules.yaml"

promtail:
  image:
    registry: docker
    pullPolicy: Always
  imagePullSecrets:
    - name: registry
  priorityClassName: normal
  resources:
    limits:
      memory: 256Mi
      cpu: 0
    requests:
      memory: 0
      cpu: 0
  livenessProbe:
    failureThreshold: 5
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  config:
    snippets:
      pipelineStages:
        - cri: {}
      common:
        - action: replace
          source_labels:
            - __meta_kubernetes_pod_node_name
          target_label: node_name
        - action: replace
          source_labels:
            - __meta_kubernetes_namespace
          target_label: namespace
        - action: replace
          source_labels:
            - __meta_kubernetes_pod_container_name
          target_label: container
        - action: replace
          replacement: /var/log/pods/*$1/*.log
          separator: /
          source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
          target_label: __path__
        - action: replace
          replacement: /var/log/pods/*$1/*.log
          regex: true/(.*)
          separator: /
          source_labels:
            - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
            - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
            - __meta_kubernetes_pod_container_name
          target_label: __path__

monitoring:
  enabled: false

networkPolicies:
  enabled: false

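The problem is that when I try to check the rules with curl -X GET localhost:3100/loki/api/v1/rules, I get: unable to read rule dir /tmp/loki/rules/fake: open /tmp/loki/rules/fake: no such file or directory.

So it looks like Loki cannot find the rules file.

I also tried changing the configuration like this: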

  write:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
    extraVolumes:
      - name: rules-conf
  read:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
    extraVolumes:
      - name: rules-conf
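
As written, this second attempt is incomplete: the rules-conf volume has no source, and mounting a ConfigMap at a file path normally needs subPath so that a single key is projected as a file instead of a directory. A minimal sketch of a complete version follows. It assumes that the chart passes extraVolumes and extraVolumeMounts through to the pod spec unchanged, and it reuses the rules-cfgmap ConfigMap name from this question.

```yaml
# Sketch only: assumes the chart copies these values into the pod spec as-is.
loki:
  write:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
        subPath: rules.yaml      # project only this key, as a file
        readOnly: true
    extraVolumes:
      - name: rules-conf
        configMap:
          name: rules-cfgmap     # the volume needs a source
  read:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
        subPath: rules.yaml
        readOnly: true
    extraVolumes:
      - name: rules-conf
        configMap:
          name: rules-cfgmap
```

One trade-off: a ConfigMap mounted through subPath does not receive updates when the ConfigMap changes, so the directory-style mount from the first attempt is usually easier to live with if rules are edited often.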

And my ConfigMap:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rules-cfgmap
  namespace: mylocal-monitoring
data:
  rules.yaml: |
    groups:
      - name: PrometheusAlertsGroup
        rules:
          - alert: test1
            expr: |
              1 > 0
            for: 0m
            labels:
              severity: critical
            annotations:
              summary: "TEST: testing test"
              description: test

And the rules file:

groups:
  - name: PrometheusAlertsGroup
    rules:
      - alert: test1
        expr: |
          1 > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "TEST: testing test"
          description: test

But the problem is the same. Any ideas?

It finally worked when I created /tmp/loki/rules/fake/rules.yaml manually, but creating it by hand defeats the purpose.
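
Kubernetes creates a volume's mountPath inside the container when the volume is mounted, so a missing /tmp/loki/rules/fake directory suggests that the rules-config volume is not mounted in the container answering the ruler API. One way to check is to dump the running pod (for example with kubectl get pod <pod-name> -n mylocal-monitoring -o yaml) and look for a fragment like the one below. This is a sketch of the expected shape only: the container name loki is an assumption, and the remaining values mirror the config in this question.

```yaml
# Expected fragment of the rendered Pod spec if the mount took effect.
# Assumption: the container is named "loki"; the real name depends on the chart.
spec:
  containers:
    - name: loki
      volumeMounts:
        - name: rules-config
          mountPath: /tmp/loki/rules/fake/
  volumes:
    - name: rules-config
      configMap:
        name: rules-cfgmap
        items:
          - key: rules.yaml
            path: rules.yaml
```

If these entries are absent from the pod, the extraVolumes and extraVolumeMounts values are probably not reaching the pod template, for example because they sit at a nesting level the chart does not read.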
