[PATCH] fsck: Fix incorrect handling of child exit

Niklas Hambüchen mail at nh2.me
Thu May 24 19:51:17 UTC 2018


Thanks!

That will probably also do, though it isn't as legit as it could be:

  https://www.gnu.org/software/libc/manual/html_node/Process-Completion-Status.html

says the return value of WEXITSTATUS is only well-defined "If WIFEXITED is true of status".
So it would make sense to use WEXITSTATUS only in a branch conditioned on WIFEXITED.

The way you do it now, using WEXITSTATUS in all cases where it's not WIFSIGNALED, makes the assumption that WIFSIGNALED is the precise inverse of WIFEXITED.

But I haven't found the glibc (API) to make that guarantee anywhere.

Best,
Niklas

On 24/05/2018 15.29, Denys Vlasenko wrote:
> Fixed in a bit different form. Thanks!
> 
> On Tue, May 8, 2018 at 4:47 PM, Niklas Hambüchen <mail at nh2.me> wrote:
>> In commit
>>
>>   c4fb8c6a - fsck: do not use statics
>>
>> not only statics were changed but also a couple of
>> statics-unrelated changes were made.
>>
>> This included the handling of the child termination status
>> as follows:
>>
>>     - if (WIFEXITED(status))
>>     -   status = WEXITSTATUS(status);
>>     - else if (WIFSIGNALED(status)) {
>>     + status = WEXITSTATUS(status);
>>     + if (WIFSIGNALED(status)) {
>>
>> Change to unconditionally call WEXITSTATUS() was not semantics-preserving,
>> since you can only reasonably call WEXITSTATUS() on status if
>> you checked WIFEXITED() before; see `man 2 waitpid`:
>>
>>     WEXITSTATUS(status)
>>       [...]
>>       This macro should be employed only if WIFEXITED
>>       returned true.
>>
>> `status = WEXITSTATUS(status)` unconditionally masks away the
>> parts of status that indicate whether a signal was raised,
>> so that afterwards WIFSIGNALED() is true even if no signal
>> was raised.
>>
>> As a result, busybox's `fsck` wrapper set `status = EXIT_ERROR = 8`,
>> thus not forwarding the exit code of the filesystem-specific
>> fsck utility (such as fsck.ext4) to the caller and returning
>> exit code 8 instead.
>>
>> The exit codes of fsck have well-specified meanings (see `man fsck`)
>> that operating systems use in order to decide whether they should
>> prompt the user due to unrecoverable errors, or continue booting
>> after errors were successfully fixed automatically.
>>
>> Consequently, this regression in busybox's fsck stopped my server
>> from booting though, and manual intervention via a keyboard was
>> needed.
>>
>> Remark: Tracking down this issue would have been significantly
>> less effort if unrelated code changes were not snuck into a
>> commit labelled "fsck: do not use statics".
>>
>> This issue was found as part of the NixOS project
>> (https://github.com/NixOS/nixpkgs/issues/40174#issuecomment-387405554)
>> and this fix has been tested on it.
>>
>> Signed-off-by: Niklas Hambüchen <mail at nh2.me>
>> ---
>>  e2fsprogs/fsck.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/e2fsprogs/fsck.c b/e2fsprogs/fsck.c
>> index 1c285bb..1a6f7d0 100644
>> --- a/e2fsprogs/fsck.c
>> +++ b/e2fsprogs/fsck.c
>> @@ -448,8 +448,9 @@ static int wait_one(int flags)
>>         }
>>   child_died:
>>
>> -       status = WEXITSTATUS(status);
>> -       if (WIFSIGNALED(status)) {
>> +       if (WIFEXITED(status))
>> +               status = WEXITSTATUS(status);
>> +       else if (WIFSIGNALED(status)) {
>>                 sig = WTERMSIG(status);
>>                 status = EXIT_UNCORRECTED;
>>                 if (sig != SIGINT) {
>> --
>> 2.7.4
>> _______________________________________________
>> busybox mailing list
>> busybox at busybox.net
>> http://lists.busybox.net/mailman/listinfo/busybox


More information about the busybox mailing list