A reliable way to get the exit code from a background process while monitoring it and killing it if necessary

I came up with a setup that I thought would do this, but it does not work:

#!/bin/bash

echo "Launching a background process that may take hours to finish.."
myprog &
pid=$!
retval=
##At this time pid should hold the process id of myprog
echo "pid=${pid}"

{
    ##check if the process is still running
    psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
    killit=
    while [[ ! -z ${psl} ]]
    do
        ##if a file named "kill_flag" is detected, kill the process
        if [[ -e "kill_flag" ]]
        then
            killit=YES
            break
        fi
        #check every 3 seconds
        sleep 3
        psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
    done

    ##killit not set, normal exit, read from fd5
    if [[ -z ${killit} ]]
    then
        read <&5 retval
    else
        ##kill here, the wait will return and the sub process ends
        kill ${pid}
    fi

} 5< <( wait ${pid} > /dev/null 2>&1; echo $? )

echo "retval=$retval"

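On the first run everything seemed fine: I can kill the process with touch kill_flag, and otherwise the script waits until myprog finishes normally. But then I noticed that I always get -1 in retval, even though myprog returns 0, as a normal run confirms. Further investigation showed that the "echo $?" part is executed immediately after the script starts, not after the wait command exits. I am wondering what is going on here. I am pretty new to bash.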

Answer 1

wait only works on child processes of the current shell process. The subshell that interprets the code inside <(...) cannot wait for a sibling process.
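A quick way to see this, as a minimal sketch: sleep 1 stands in for myprog, and a command substitution plays the role of the subshell (the <(...) in the question also runs its commands in a subshell). The variable names are illustrative only.

```bash
#!/bin/bash
# sleep stands in for myprog (an assumption for this demo).
sleep 1 &
pid=$!

# $(...) runs in a subshell. $pid is not a child of that subshell,
# so wait fails immediately instead of blocking.
sub_status=$(wait "$pid" 2>/dev/null; echo $?)

# The launching shell is the real parent: this blocks until sleep
# exits and yields its exit status.
wait "$pid"
parent_status=$?

echo "subshell wait: $sub_status, parent wait: $parent_status"
```

The subshell's wait should report a non-zero status without blocking (bash documents 127 for a pid it does not know), which would explain why "echo $?" ran right away in the question.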

The wait has to be done by the same shell process that started the pid. With zsh instead of bash (assuming here that no other background jobs are running):

cmd & pid=$!
while (($#jobstates)) {
  [[ -e killfile ]] && kill $pid
  sleep 3
}
wait $pid; echo $?
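The same idea can probably be kept in bash by polling with kill -0 and calling wait from the shell that launched the job. This is a sketch, not a tested drop-in: run_monitored is a hypothetical helper name, and sh -c 'sleep 2; exit 7' is a stand-in for myprog so the example runs on its own.

```bash
#!/bin/bash
# Hypothetical helper: run a command in the background, poll for a flag
# file, and hand back the command's exit status as the return code.
run_monitored() {
    local flag=$1 interval=$2
    shift 2
    "$@" &
    local pid=$!
    # kill -0 sends no signal; it only tests whether the pid still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ -e "$flag" ]; then
            kill "$pid"
            break
        fi
        sleep "$interval"
    done
    # Same shell that launched the job, so wait can collect the status.
    wait "$pid"
}

# Stand-in for myprog; replace with the real command.
run_monitored kill_flag 1 sh -c 'sleep 2; exit 7'
retval=$?
echo "retval=$retval"
```

If the flag file triggers the kill, retval reflects the terminating signal rather than a normal exit code, so the caller can tell the two cases apart. Note that kill -0 checks only that some process with that pid exists, so pid reuse is a theoretical caveat of this approach.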

Answer 2

I worked out a version that works:

#!/bin/bash
export retval=
##At this time pid should hold the process id of myprog
{
    ##This is the subshell that launched and is monitoring myprog
    subsh=$!

    ##Since myprog is probably the only child process of this subsh, it should be pretty safe
    pid=$(ps -f --ppid ${subsh} | grep -E "\bmyprog\b" | gawk '{print $2}' )
    ##check if the process is still running
    psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
    killit=
    while [[ ! -z ${psl} ]]
    do
        ##if a file named "kill_flag" is detected, kill the process
        if [[ -e "kill_flag" ]]
        then
            killit=YES
            break
        fi
        #check every 3 seconds
        sleep 3
        psl=$(ps -f -p ${pid} | grep -E "\bmyprog\b")
    done

    ##killit not set, normal exit, read from fd5
    if [[ -z ${killit} ]]
    then
        read <&5 retval
    else
        ##kill here, the wait will return and the sub process ends
        kill ${pid}
    fi

} 5< <( myprog >>logfile 2>&1; echo $? )

echo "retval=$retval"

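The only annoying thing is that when I kill myprog via the kill_flag semaphore, an error is printed because the process substitution has died, but that is easy to trap.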
