
I came up with a setup that I thought would do this, but it doesn't work:
#!/bin/bash
echo "Launching a background process that may take hours to finish.."
myprog &
pid=$!
retval=
##At this time pid should hold the process id of myprog
echo "pid=${pid}"
{
##check if the process is still running
psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
killit=
while [[ ! -z ${psl} ]]
do
##if a file named "kill_flag" is detected, kill the process
if [[ -e "kill_flag" ]]
then
killit=YES
break
fi
#check every 3 seconds
sleep 3
psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
done
##killit not set, normal exit, read from fd5
if [[ -z ${killit} ]]
then
read <&5 retval
else
##kill here, the wait will return and the sub process ends
kill ${pid}
fi
} 5< <( wait ${pid} > /dev/null 2>&1; echo $? )
echo "retval=$retval"
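On the first run everything seemed fine: I could terminate the process with touch kill_flag, and otherwise the script waited until myprog finished normally. But then I noticed that I always get -1 in retval, even though myprog returns 0, as a normal run confirms. Further investigation showed that the "echo $?" part is executed immediately after the script starts, not after the wait command exits. I would like to know what is going on here. I am fairly new to bash.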
Answer 1
wait can only act on child processes of the current shell process. The subshell that interprets the code inside <(...) cannot wait for a sibling process.
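To see this behaviour in isolation, here is a minimal sketch in bash. It assumes sleep 1 as a stand-in for myprog; the variable names sub_status and parent_status are made up for the illustration. The wait inside <(...) should return at once with a failure status (bash documents 127 for a pid that is not a child of the shell), while the wait in the launching shell blocks and reports the real status.

```bash
#!/bin/bash
sleep 1 &          # stand-in for myprog (assumption for this demo)
pid=$!

# wait runs in the <(...) subshell, which is not the parent of $pid,
# so it returns immediately with a failure status instead of blocking.
read -r sub_status < <( wait "$pid" 2>/dev/null; echo $? )
echo "status seen inside <(...): $sub_status"

# The same wait in the shell that launched the job blocks until it ends.
wait "$pid"
parent_status=$?
echo "status seen by the parent shell: $parent_status"
```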
The wait has to be done by the same shell process that started the pid. With zsh instead of bash (assuming here that no other background jobs are running):
cmd & pid=$!
while (($#jobstates)) {
[[ -e killfile ]] && kill $pid
sleep 3
}
wait $pid; echo $?
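The same idea can probably be kept in bash, as long as the launch, the polling loop and the wait all happen in one shell. The following is only a sketch under assumptions: the function name run_with_kill_flag and its arguments are hypothetical, kill -0 is used in place of parsing ps output, and sh -c 'sleep 1; exit 7' stands in for myprog.

```bash
#!/bin/bash
# Hypothetical helper: run a command in the background, poll for a flag
# file, and collect the exit status with wait in the SAME shell.
# Usage: run_with_kill_flag FLAG_FILE INTERVAL_SECONDS COMMAND [ARGS...]
run_with_kill_flag() {
    local flag=$1 interval=$2
    shift 2
    "$@" &
    local pid=$!
    # kill -0 sends no signal; it only tests whether the pid still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [[ -e $flag ]]; then
            kill "$pid"
            break
        fi
        sleep "$interval"
    done
    # This works because $pid is a child of this very shell; the status
    # of wait becomes the return status of the function.
    wait "$pid"
}

# Demo with a stand-in for myprog that exits with status 7.
run_with_kill_flag kill_flag 1 sh -c 'sleep 1; exit 7'
retval=$?
echo "retval=$retval"
```

If the flag file shows up and the job is killed with the default SIGTERM, the status returned by wait is 128 plus the signal number (143), so the caller can tell a forced stop from a normal exit.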
Answer 2
I worked out a version that does the job:
#!/bin/bash
export retval=
##At this time pid should hold the process id of myprog
{
##This is the subshell that launched and is monitoring myprog
subsh=$!
##Since myprog is probably the only child process of this subsh, it should be pretty safe
pid=$(ps -f --ppid ${subsh} | grep -E "\bmyprog\b" | gawk '{print $2}' )
##check if the process is still running
psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
killit=
while [[ ! -z ${psl} ]]
do
##if a file named "kill_flag" is detected, kill the process
if [[ -e "kill_flag" ]]
then
killit=YES
break
fi
#check every 3 seconds
sleep 3
psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
done
##killit not set, normal exit, read from fd5
if [[ -z ${killit} ]]
then
read <&5 retval
else
##kill here, the wait will return and the sub process ends
kill ${pid}
fi
} 5< <( myprog >>logfile 2>&1; echo $? )
echo "retval=$retval"
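The only annoyance is that when I kill myprog with the semaphore (the kill_flag file), an error appears because the process substitution has died, but it is easy to catch.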