Answer 1

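Custom instance health checks (described at the bottom of the page) are one option.

You run a separate piece of code on the machine (or really on any machine) that monitors health and makes an API call to mark the instance as unhealthy.

I also have another, half-formed idea, but I'm not sure how you would implement it. I used to be the architect of an on-premises system in which the load balancer called a separate web server on each instance. In our case that was a small custom Java web server, about 50 lines of code. It returned an HTTP status code: 200 (OK) if the instance was working properly, 500 (ERROR) if it needed to be terminated. I suspect something like this could be integrated with Auto Scaling, but I haven't done this in a while, so I'm not sure how you would wire it up.

The command for the first idea above is: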

aws autoscaling set-instance-health --instance-id i-123abc45d --health-status Unhealthy
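
One way the two ideas could be combined is to poll the instance's own health endpoint and issue the command above whenever it stops answering 200. The following is only a minimal sketch under assumptions: the URL http://localhost:8080/health and the function names status_for_code and check_once are made up for illustration, and the script assumes the AWS CLI is already configured on the instance.

```shell
#!/bin/bash
# Hypothetical endpoint served by the small health web server on the instance.
HEALTH_URL="http://localhost:8080/health"

# Map an HTTP status code to an Auto Scaling health status.
# Only 200 counts as healthy; 500, 000 (no response) and anything else do not.
status_for_code() {
    if [ "$1" = "200" ]; then
        echo "Healthy"
    else
        echo "Unhealthy"
    fi
}

# Poll the endpoint once and report the instance as unhealthy if needed.
# $1 is the EC2 instance ID, e.g. i-123abc45d.
check_once() {
    local instance_id=$1 code
    # -o /dev/null discards the body, -w prints only the status code,
    # -m 5 gives up after 5 seconds; curl prints 000 if the request fails.
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$HEALTH_URL")
    if [ "$(status_for_code "$code")" = "Unhealthy" ]; then
        aws autoscaling set-instance-health \
            --instance-id "$instance_id" --health-status Unhealthy
    fi
}
```

Calling check_once i-123abc45d from cron or a loop would perform one check per run. As written, a single failed request is enough to mark the instance unhealthy, so in practice you would probably want to require several consecutive failures first.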

Answer 2

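For anyone who comes across this question:

I think AWS should have built a feature like this into CloudWatch, but unfortunately I could not find anything suggesting that it is available. So I wrote a bash script that queries the CloudWatch API for resource consumption metrics and sets the instance health accordingly.

Preparation

If you haven't already, install the AWS Command Line Interface. It is also available through yum or apt.
Configure the AWS CLI by running aws configure and entering your API keys and other settings. Important: if you run the script below as root, as I do, you must run this configuration command as root too. Otherwise the script will fail.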

/root/my-health-check.sh

#!/bin/bash
# retrieve metrics starting from 20 minutes ago (3 data points)
# Note: Sometimes CloudWatch failed to gather data for a specific period,
# then the number of data points returned could be less than what we expect.
# Also, when the instance just started, there will be no data point.
start_time=$(date -d "-20 minutes" -u +"%Y-%m-%dT%H:%M:%SZ")
# retrieve metrics up to now
end_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
# get current instance ID [1]
instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
# get current region [2]
# This is only needed if you have multiple regions to manage, otherwise just
# specify a region via `aws configure`.
region=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/\(.*\)[a-z]/\1/')
# save data retrieved for processing [3]
# Here I used an example of retrieving "NetworkIn" of "AWS/EC2" namespace,
# with metric resolution set to 300 (5 minutes).
# For a list of available metrics, run `aws cloudwatch list-metrics`
datapoints=$(aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name NetworkIn --dimensions Name=InstanceId,Value=$instance_id --statistics Average --start-time $start_time --end-time $end_time --period 300 --region $region --output text | awk '{ print $2 }')
# custom handler
# In this example, the health check will fail if all data points fall below
# my threshold. The health check will not fail if there is no data.
healthy=0
hasdata=0
THRESHOLD=300000
for i in $datapoints; do
    # In this case, the metric(NetworkIn) is not integer.
    if (( $(echo "$i $THRESHOLD" | awk '{print ($1 > $2)}') )); then
        healthy=1
    fi
    hasdata=1
done
if [ $hasdata -eq 1 ]; then
    if [ $healthy -eq 0 ]; then
        aws autoscaling set-instance-health --instance-id $instance_id --health-status Unhealthy --region $region
    fi
fi
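
The decision rule in the custom handler can also be isolated into a small function, which makes it easy to try with sample values before wiring it to CloudWatch. The sketch below is one possible restatement of the same rule, not a replacement for the script above; the function name decide_health is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper restating the rule of the custom handler:
# print "Unhealthy" only when at least one data point exists and none of
# them exceeds the threshold; print "Healthy" in every other case.
# Usage: decide_health THRESHOLD [DATAPOINT...]
decide_health() {
    local threshold=$1
    shift
    if [ "$#" -eq 0 ]; then
        # No data (e.g. the instance just started): do not fail the check.
        echo "Healthy"
        return
    fi
    # awk handles the comparison because the data points are not integers.
    printf '%s\n' "$@" | awk -v t="$threshold" '
        $1 + 0 > t + 0 { above = 1 }
        END { print (above ? "Healthy" : "Unhealthy") }'
}
```

For example, decide_health 300000 $datapoints would print Unhealthy only if every returned NetworkIn average is at or below 300000.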

The rest

Run the script periodically:

$ chmod +x /root/my-health-check.sh
# run the script at 0, 5, 10, 15 ... 55 of every hour
$ echo "*/5 * * * * root /root/my-health-check.sh 2>&1 | /usr/bin/logger -t ec2_health_check" >> /etc/crontab

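Power off the instance and create an AMI from it. Once that is done, create a new Auto Scaling group using the AMI. An instance should now be terminated automatically, and a new one launched in its place, whenever the metric does not meet the healthy condition. That's it.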
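
The AMI and Auto Scaling group steps could be scripted with the AWS CLI roughly as follows. This is a sketch under assumptions, not the exact procedure used here: the resource names (web-health-ami, web-health-lt, web-health-asg), the instance type t3.micro, the group sizes, and the use of a launch template are all placeholders chosen for illustration. The run helper only prints each command unless RUN=1 is set, so the commands can be reviewed before anything is created.

```shell
#!/bin/bash
# All resource names, sizes and the instance type below are placeholders.

# Print a command instead of executing it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Step 1: create an AMI from the prepared (stopped) instance.
# $1 is the instance ID.
create_ami() {
    run aws ec2 create-image --instance-id "$1" --name web-health-ami
}

# Step 2: create a launch template and an Auto Scaling group from the AMI.
# $1 is the AMI ID returned by step 1, $2 is a subnet ID for the group.
create_group() {
    local ami_id=$1 subnet_id=$2
    run aws ec2 create-launch-template \
        --launch-template-name web-health-lt \
        --launch-template-data "{\"ImageId\":\"$ami_id\",\"InstanceType\":\"t3.micro\"}"
    run aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name web-health-asg \
        --launch-template 'LaunchTemplateName=web-health-lt,Version=$Latest' \
        --min-size 1 --max-size 2 \
        --vpc-zone-identifier "$subnet_id"
}
```

Running create_ami i-123abc45d without RUN=1 just prints the create-image command. The printed form drops shell quoting, so treat it as a preview rather than something to paste back into a terminal.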

References:

[1]: EC2 Instance Metadata

[2]: Getting the current region in AWS - StackOverflow

[3]: CloudWatch - Get Metric Statistics
