I have the following two scripts to simulate some work. start.sh simply launches 2 (MPI) processes that run the script mpiproc.sh.

start.sh
#!/bin/bash
function trap_with_arg() {
    func="$1" ; shift
    for sig ; do
        trap "$func $sig" "$sig"
    done
}
function handleSignal() {
    echo "Received signal (sleep for 10 sec)"
    for i in {1..2}
    do
        echo "start.sh: sleeping $i"
        sleep 1s
    done
    exit 0
}
# Setup the Trap
trap_with_arg handleSignal SIGINT SIGTERM SIGUSR1 SIGUSR2
mpirun -n 2 mpiproc.sh

mpiproc.sh
#!/bin/bash
function trap_with_arg() {
    func="$1" ; shift
    for sig ; do
        trap "$func $sig" "$sig"
    done
}
function handleSignal() {
    echo "Rank: ${OMPI_COMM_WORLD_RANK} : Received signal (sleep for 10 sec)"
    for i in {1..10}
    do
        echo "Rank: ${OMPI_COMM_WORLD_RANK} sleeping $i"
        sleep 1s
    done
    exit 0
}
# Setup the Trap
trap_with_arg handleSignal SIGINT SIGTERM SIGUSR1 SIGUSR2
echo "MPI Proc Rank: ${OMPI_COMM_WORLD_RANK} start."
sleep 30s

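The cluster on which I run start.sh sends a SIGUSR2 signal to start.sh (at least, that is my understanding). The problem is that handleSignal in mpiproc.sh does not finish, because start.sh has already run its own handleSignal and called exit 0. How can I make the handleSignal calls move up the process tree? That is, mpiproc.sh should handle the signal first (with start.sh somehow waiting for it?), and only then should start.sh do its cleanup and exit.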
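
To make the ordering I am after concrete, here is a minimal sketch of it. It rests on assumptions: the signal is delivered only to the parent, and the names child, parent, on_signal and demo_output are hypothetical. The child function is just a stand-in for `mpirun -n 2 mpiproc.sh`, so the sketch runs without MPI. The idea is that the parent starts the child in the background, blocks in `wait`, and in its handler forwards the signal and waits for the child to exit before doing its own cleanup.

```shell
#!/bin/bash
# Hypothetical stand-in for `mpirun -n 2 mpiproc.sh`: a child whose
# handler needs some time to finish.
child() {
    trap 'echo "child: handling signal"; kill "$sleep_pid" 2>/dev/null; sleep 1; echo "child: done"; exit 0' USR2
    echo "child: start"
    # Sleep in the background and wait on it, so that a trapped signal
    # interrupts the wait and the handler runs promptly.
    sleep 30 &
    sleep_pid=$!
    wait "$sleep_pid"
}

# Handler of the parent: forward the signal, wait until the child has
# exited, and only then do the parent's own cleanup.
on_signal() {
    echo "parent: forwarding signal"
    kill -s USR2 "$child_pid" 2>/dev/null
    # Repeat the wait in case it is interrupted by another trapped signal.
    while kill -0 "$child_pid" 2>/dev/null; do
        wait "$child_pid"
    done
    echo "parent: cleanup"
    exit 0
}

# Plays the role of start.sh.
parent() {
    child &
    child_pid=$!
    trap 'on_signal' USR2
    # wait returns as soon as a trapped signal arrives; the trap runs next.
    wait "$child_pid"
    echo "parent: child exited on its own"
}

# Demo: start the parent, give both processes a moment to install their
# traps, then send SIGUSR2 to the parent only.
demo_output=$(
    parent &
    parent_pid=$!
    sleep 1
    kill -s USR2 "$parent_pid"
    wait "$parent_pid"
)
printf '%s\n' "$demo_output"
```

In start.sh this would presumably correspond to running `mpirun -n 2 mpiproc.sh &` followed by `wait`, instead of running mpirun in the foreground. Whether mpirun passes a given signal on to the ranks depends on the MPI implementation and the signal, so that part would need to be checked separately.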
Thanks!