Answer 1

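Custom instance health checks (bottom of the page) are one option.

You would run a separate piece of code on the machine (or really on any machine) that monitors health and makes the API call that sets the instance to Unhealthy.

I have another half-formed idea, but I'm not quite sure how to implement it. I was the architect of an on-premises system where the load balancer called into a separate web server on the instance; in our case it was a small custom Java web server of about 50 lines of code. It returned an HTTP status code: 200 (OK) if the instance was healthy, 500 (error) if it needed to be terminated. I suspect something similar could be integrated with Auto Scaling, but I haven't done this for a while and I'm not sure how to wire it up.

Here is the command for the first idea above: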

aws autoscaling set-instance-health --instance-id i-123abc45d --health-status Unhealthy
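
The second idea (a small status endpoint that something polls) could perhaps be combined with the command above along these lines. This is only a sketch under assumptions, not something from the original answer: the function names `status_to_health` and `report_health` and the URL used in the usage note are hypothetical, and treating every non-200 response (including a failed connection) as Unhealthy is a choice you may want to soften.

```shell
#!/bin/sh
# Map an HTTP status code from the local health endpoint to an
# Auto Scaling health status: 200 means healthy, anything else does not.
status_to_health() {
    if [ "$1" = "200" ]; then
        echo "Healthy"
    else
        echo "Unhealthy"
    fi
}

# Poll a (hypothetical) health URL and report the result to Auto Scaling.
# $1 = instance ID, $2 = health URL
report_health() {
    # curl prints 000 when the connection fails, which maps to Unhealthy.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$2")
    aws autoscaling set-instance-health \
        --instance-id "$1" \
        --health-status "$(status_to_health "$code")"
}
```

A cron job or a small loop could then call something like report_health i-123abc45d http://localhost:8080/health, where the port and path are placeholders for whatever the status server actually exposes.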

Answer 2

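For anyone who runs into this question:

Although I believe AWS should include a feature like this in CloudWatch, unfortunately I couldn't find anything indicating that it is available. So I created a bash script that queries the CloudWatch API for resource-consumption metrics and then sets the instance health accordingly, as suggested by Tim: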

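Preparation

Install the AWS Command Line Interface if you haven't already. It is also available via yum or apt.
Configure the AWS CLI by running aws configure and filling in your API keys and other settings. Important: if, like me, you plan to run the script below as root, you must run this configuration command as root as well. Otherwise the script will fail.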

/root/my-health-check.sh

#!/bin/bash
# retrieve metrics starting from 20 minutes ago (3 data points)
# Note: CloudWatch sometimes fails to gather data for a given period, so
# fewer data points than expected may be returned.
# Also, right after the instance starts there are no data points yet.
start_time=$(date -d "-20 minutes" -u +"%Y-%m-%dT%H:%M:%SZ")
# retrieve metrics up to now
end_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
# get current instance ID [1]
instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
# get current region [2]
# This is only needed if you have multiple regions to manage, otherwise just
# specify a region via `aws configure`.
region=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/\(.*\)[a-z]/\1/')
# save data retrieved for processing [3]
# Here I used an example of retrieving "NetworkIn" from the "AWS/EC2" namespace,
# with the period set to 300 seconds (5 minutes).
# For a list of available metrics, run `aws cloudwatch list-metrics`
datapoints=$(aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name NetworkIn --dimensions "Name=InstanceId,Value=$instance_id" --statistics Average --start-time "$start_time" --end-time "$end_time" --period 300 --region "$region" --output text | awk '{ print $2 }')
# custom handler
# In this example, the health check will fail if all data points fall below
# my threshold. The health check will not fail if there is no data.
healthy=0
hasdata=0
THRESHOLD=300000
for i in $datapoints; do
    # The NetworkIn average is not an integer, so compare it with awk.
    if (( $(echo "$i $THRESHOLD" | awk '{print ($1 > $2)}') )); then
        healthy=1
    fi
    hasdata=1
done
if [ "$hasdata" -eq 1 ] && [ "$healthy" -eq 0 ]; then
    aws autoscaling set-instance-health --instance-id "$instance_id" --health-status Unhealthy --region "$region"
fi
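
The decision rule in the script (mark the instance Unhealthy only when there is data and no data point exceeds the threshold) can be pulled out into a small function, which makes it easy to try with made-up numbers before wiring it to CloudWatch. This is a sketch only; the name `decide_health` and the sample values in the usage note are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Decide the health status from a threshold and a list of data points.
# $1 = threshold, remaining arguments = data points (may be none).
# Prints "NoData" when there are no points, "Healthy" when at least one
# point exceeds the threshold, and "Unhealthy" otherwise.
decide_health() {
    threshold=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "NoData"
        return
    fi
    # awk does the floating-point comparison; "+ 0" forces numeric values.
    printf '%s\n' "$@" | awk -v t="$threshold" '
        ($1 + 0) > (t + 0) { ok = 1 }
        END { print (ok ? "Healthy" : "Unhealthy") }'
}
```

For example, decide_health 300000 1200.5 450000.75 prints Healthy because one point is above the threshold, while decide_health 300000 with no data points prints NoData, matching the script's choice not to fail the check when CloudWatch returns nothing.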

The rest

Make the script run periodically:

$ chmod +x /root/my-health-check.sh
# run the script at 0, 5, 10, 15 ... 55 of every hour
$ echo "*/5 * * * * root /root/my-health-check.sh 2>&1 | /usr/bin/logger -t ec2_health_check" >> /etc/crontab
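
Note that appending with >> adds another copy of the entry every time the command is rerun, for example while rebuilding the AMI. One way to guard against that, sketched here with a hypothetical helper named `add_cron_line`, is to append only when the exact line is not already in the file.

```shell
#!/bin/sh
# Append a line to a crontab-style file only if that exact line is missing.
# $1 = file, $2 = line
add_cron_line() {
    # -F: treat the line as a fixed string, -x: match whole lines only.
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

Run as root, it would be called as add_cron_line /etc/crontab '*/5 * * * * root /root/my-health-check.sh 2>&1 | /usr/bin/logger -t ec2_health_check', and calling it a second time leaves the file unchanged.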

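Power off the instance and create an AMI. Once that is done, create a new Auto Scaling group using the AMI. Now, if the metric does not meet the health condition, the instance should be terminated and a new one launched. Voilà!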

References:

[1]: EC2 Instance Metadata

[2]: Getting the current region in AWS - Stack Overflow

[3]: CloudWatch - Get Metric Statistics
