Killing subprocesses and traps

2024-5-21

I have the following two scripts that simulate some work:

start.sh simply launches 2 (MPI) processes that run the script mpiproc.sh.

start.sh

#!/bin/bash

function trap_with_arg() {
    func="$1" ; shift
    for sig ; do
        trap "$func $sig" "$sig"
    done
}

function handleSignal() {
    echo "Received signal (sleep for 2 sec)"
    for i in {1..2}
    do
      echo "start.sh: sleeping $i"
      sleep 1s
    done
    exit 0
}

# Setup the Trap
trap_with_arg handleSignal SIGINT SIGTERM SIGUSR1 SIGUSR2

mpirun -n 2 mpiproc.sh

mpiproc.sh

#!/bin/bash

function trap_with_arg() {
    func="$1" ; shift
    for sig ; do
        trap "$func $sig" "$sig"
    done
}


function handleSignal() {
    echo "Rank: ${OMPI_COMM_WORLD_RANK} : Received signal (sleep for 10 sec)"
    for i in {1..10}
    do
      echo "Rank: ${OMPI_COMM_WORLD_RANK} sleeping $i"
      sleep 1s
    done
    exit 0
}

# Setup the Trap
trap_with_arg handleSignal SIGINT SIGTERM SIGUSR1 SIGUSR2

echo "MPI Proc  Rank: ${OMPI_COMM_WORLD_RANK} start."
sleep 30s

The cluster on which I am running start.sh sends a SIGUSR2 signal to start.sh (at least that is what I think happens). The problem is that handleSignal in mpiproc.sh does not finish, because start.sh has already run its own handleSignal and called exit 0. How can I make the handleSignal calls propagate up the process tree? That is, mpiproc.sh should handle the signal first (with start.sh somehow waiting for that?), and only then should start.sh do its cleanup and exit.

Thanks!
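
One way to get that ordering, as a minimal sketch rather than a tested fix for this cluster: bash runs a trap handler only after the current foreground command has returned, while the wait builtin returns as soon as a trapped signal arrives. So start.sh could launch its child in the background, wait on it, forward the signal, and wait again before cleaning up. In the sketch below, the child function is a hypothetical stand-in for "mpirun -n 2 mpiproc.sh" so that it runs without MPI, and the subshell that sends SIGUSR2 after one second only simulates the cluster. The names got_signal, child_pid, work_pid and child_status are illustrative and not part of the original scripts.

```shell
#!/bin/bash
# Sketch: the parent forwards a signal to its child, waits for the child's
# handler to finish, and only then runs its own cleanup.

# Hypothetical stand-in for `mpirun -n 2 mpiproc.sh` (keeps the demo MPI-free).
child() {
    on_signal() {
        echo "child: received signal, finishing up"
        kill "$work_pid" 2>/dev/null   # stop the simulated work
        sleep 1                        # simulated shutdown work
        echo "child: handler done"
        exit 0
    }
    trap on_signal USR2
    # Run the work in the background and wait on it: `wait` is interrupted
    # by a trapped signal, whereas a foreground `sleep` would delay the handler.
    sleep 30 &
    work_pid=$!
    wait "$work_pid"
}

got_signal=""
trap 'got_signal=USR2' USR2   # parent handler only records the signal

child &
child_pid=$!

# Demo only: simulate the cluster sending SIGUSR2 to this script.
( sleep 1; kill -s USR2 $$ ) &

# Returns early (status > 128) when the trapped signal arrives.
wait "$child_pid" || true

child_status=0
if [ -n "$got_signal" ]; then
    echo "parent: forwarding $got_signal to child"
    kill -s "$got_signal" "$child_pid"
    # Block until the child's handler has run to completion.
    wait "$child_pid" || child_status=$?
    echo "parent: child exited with status $child_status, cleaning up"
fi
```

With real MPI, the same pattern would rely on mpirun itself passing the signal on to the ranks, and mpiproc.sh would have to wait on a background sleep in the same way as the stand-in does. Both points are assumptions to check against the mpirun man page on the cluster.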
