[Bug 12236] ash hangs after wait returns EINTR

bugzilla at busybox.net bugzilla at busybox.net
Tue Oct 8 13:56:42 UTC 2019


https://bugs.busybox.net/show_bug.cgi?id=12236

--- Comment #1 from Denys Vlasenko <vda.linux at googlemail.com> ---
When ./pid runs with PID 512, wait4 returns EINTR, and wait is then called
again and again, each time returning ECHILD. ash does not recover from this.

Other shells like bash or zsh seem to handle this differently. They don't hang
at PID 512.

ioctl(0, SNDCTL_TMR_START or TCSETS, {B134 -opost isig icanon echo ...}) = 0
rt_sigaction(SIGWINCH, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 8) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=NULL) = 512
setpgid(512, 512)                       = 0
wait4(-1, My process ID : 512
My parent's ID: 143
[{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED, NULL) = 512
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=512, si_uid=0, si_status=0, si_utime=0, si_stime=7} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
wait4(-1, 0x7fc08b98, WSTOPPED, NULL)   = -1 ECHILD (No child processes)
wait4(-1, 0x7fc08b98, WSTOPPED, NULL)   = -1 ECHILD (No child processes)

The trace above reads as follows. The wait4 syscall returns 512:
the "WSTOPPED, NULL) = 512" part clearly shows that the syscall does NOT return
EINTR, ERESTARTSYS or the like. It returns 512.

However, SIGCHLD is delivered at the same time, so on return from wait4 the
kernel constructs the signal frame on the stack and runs the signal handler
(in ash's case, the signal_handler() function).

When the handler returns, it returns (because of how the signal frame on the
stack is constructed) to a stub in the C library which executes the
rt_sigreturn syscall, which *should* restore registers and stack to the state
they were in before the signal was delivered. However, we see "= -1 EINTR"
there. If the signal handler was invoked immediately on exit from the wait4
syscall, this means we see a bug: rt_sigreturn did not correctly restore the
syscall return register.

(This may not be so, as there is no way to see exactly when the signal was
delivered; it could have been delivered a microsecond later. However, as your
other experiment shows, SIGCHLD is _usually_ delivered immediately at wait4.)

The bug may be caused by the fact that the ERESTARTSYS constant *is* equal to
512. So the rt_sigreturn code may be buggy and might be misinterpreting the
value of 512 somewhere as ERESTARTSYS.

Actually, this seems to be what happens in linux/arch/nios2/kernel/signal.c:

static int do_signal(struct pt_regs *regs)
{
        unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
        int restart = 0;
        struct ksignal ksig;

        current->thread.kregs = regs;

        /*
         * If we were from a system call, check for system call restarting...
         */
        if (regs->orig_r2 >= 0) {
                continue_addr = regs->ea;
                restart_addr = continue_addr - 4;
                retval = regs->r2;

                /*
                 * Prepare for system call restart. We do this here so that a
                 * debugger will see the already changed PC.
                 */
                switch (retval) {
                case ERESTART_RESTARTBLOCK:
                        restart = -2;
                case ERESTARTNOHAND:
                case ERESTARTSYS:
                case ERESTARTNOINTR:
                        restart++;
                        regs->r2 = regs->orig_r2;
                        regs->r7 = regs->orig_r7;
                        regs->ea = restart_addr;
                        break;
                }
        }

        if (get_signal(&ksig)) {
                /* handler */
                if (unlikely(restart && regs->ea == restart_addr)) {
                        if (retval == ERESTARTNOHAND ||
                            retval == ERESTART_RESTARTBLOCK ||
                             (retval == ERESTARTSYS
                                && !(ksig.ka.sa.sa_flags & SA_RESTART))) {
                                regs->r2 = EINTR;
                                regs->r7 = 1;
                                regs->ea = continue_addr;
                        }
                }
                handle_signal(&ksig, regs);
                return 0;
        }
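
The collision can be modelled in user space. The following is a simplified
sketch, not kernel code: struct sketch_regs, fixup_unchecked() and
fixup_checked() are hypothetical names, the restart path (orig_r2, ea) is left
out, and the meaning of r7 as an error flag is an assumption inferred from the
quoted code setting regs->r7 = 1 together with regs->r2 = EINTR. The checked
variant is just one possible way to tell the two cases apart, not necessarily
how the kernel should be fixed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Kernel-internal restart code; userspace <errno.h> does not export it. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical, trimmed-down stand-in for the nios2 struct pt_regs. */
struct sketch_regs {
    unsigned int r2;    /* syscall return value, or a positive errno */
    unsigned int r7;    /* assumed: nonzero means r2 holds an errno */
};

/*
 * Mirrors the quoted logic: only the value in r2 is examined, so a
 * successful return of 512 is indistinguishable from ERESTARTSYS.
 */
void fixup_unchecked(struct sketch_regs *regs, int sa_restart)
{
    assert(regs != NULL);
    if (regs->r2 == SKETCH_ERESTARTSYS && !sa_restart) {
        regs->r2 = EINTR;
        regs->r7 = 1;
    }
}

/*
 * Variant that treats r2 as a restart code only when the error flag is
 * set, so a plain return value of 512 is left untouched.
 */
void fixup_checked(struct sketch_regs *regs, int sa_restart)
{
    assert(regs != NULL);
    if (regs->r7 != 0 && regs->r2 == SKETCH_ERESTARTSYS && !sa_restart) {
        regs->r2 = EINTR;
        regs->r7 = 1;
    }
}
```

With r2 = 512 and r7 = 0 (wait4 successfully returning PID 512) and a handler
installed without SA_RESTART, fixup_unchecked() rewrites the registers to
report EINTR, which matches the trace, while fixup_checked() leaves them alone.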
