dc hitting a compiler bug, or undefined behavior

Ralf Friedl Ralf.Friedl at online.de
Sun Mar 30 21:37:35 UTC 2014


Lauri Kasanen wrote:
> On Sun, Mar 30, 2014, at 18:26, Ralf Friedl wrote:
>>> What's even worse is that adding any output to push(), even a puts("hi")
>>> that does not print the argument or any of the stack vars, fixes it. So
>>> something magic is going on inside the GCC optimization, I'm afraid this
>>> is above my pay grade.
>> Could you send the file miscutils/dc.o that is created with and without
>> this puts("hi") in push()?
> Attached.
Are you using some special compiler options, especially regarding 
parameter passing in registers and stack alignment?

The relevant part of fail-dc.o is this:
00000000 <push>:
    0:   dd 07                   fldl   (%edi)
The function expects the value to push at the address pointed to by 
%edi. But the functions that call push pass the value at the top of the 
CPU stack (not to be confused with the stack dc implements).

The relevant part of success-dc.o is this:
00000000 <push>:
    0:   57                      push   %edi
    1:   8d 7c 24 08             lea    0x8(%esp),%edi
    5:   83 e4 f0                and    $0xfffffff0,%esp
    8:   ff 77 fc                pushl  -0x4(%edi)
    b:   55                      push   %ebp
    c:   89 e5                   mov    %esp,%ebp
    e:   57                      push   %edi
    f:   83 ec 14                sub    $0x14,%esp
   12:   dd 07                   fldl   (%edi)
These lines set up an aligned stack.
0: save %edi
1: put address of top of stack at the time the function was called in 
%edi. This is the address of the parameter.
5: align stack to 0x10 boundary
8: push return address of the function
b, c: normal frame setup
e: save %edi for later use
f: make space for a double and align to 0x10
12: load parameter, %edi still points to the address of the parameter.
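
The address arithmetic in the steps above can be followed with a small C 
model. This is only a sketch under assumptions: the names align_down, 
prologue_state and model_prologue are made up for illustration, addresses 
are treated as 32-bit integers, and nothing here is taken from the busybox 
sources or from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align (a power of two).
 * For align == 16 this is what "and $0xfffffff0,%esp" does. */
static uint32_t align_down(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Hypothetical model of the two registers the prologue changes. */
struct prologue_state {
    uint32_t esp;
    uint32_t edi;
};

/* Replays the prologue of push() from success-dc.o.
 * esp_at_entry is %esp on entry, where the return address is stored;
 * the double parameter sits 4 bytes above it. */
static struct prologue_state model_prologue(uint32_t esp_at_entry)
{
    struct prologue_state s;

    s.esp = esp_at_entry - 4;       /*  0: push %edi                      */
    s.edi = s.esp + 8;              /*  1: lea 0x8(%esp),%edi             */
    s.esp = align_down(s.esp, 16);  /*  5: and $0xfffffff0,%esp           */
    s.esp -= 4;                     /*  8: pushl -0x4(%edi), return addr  */
    s.esp -= 4;                     /*  b: push %ebp                      */
                                    /*  c: mov %esp,%ebp, %esp unchanged  */
    s.esp -= 4;                     /*  e: push %edi                      */
    s.esp -= 0x14;                  /*  f: sub $0x14,%esp                 */
    return s;
}
```

With an entry %esp of 0xbffff12c, for example, the model ends with %edi at 
0xbffff130 (entry %esp + 4, the address of the parameter) and %esp at 
0xbffff100. The three pushes and the sub after the and add up to 0x20 
bytes, so %esp is again a multiple of 0x10.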

The instruction at 12 loads the double from address %edi after %edi has 
been set to point to the parameter area. The instruction at 0 in the 
failed case is exactly the same, except that %edi has not been set up 
beforehand. So I would consider this a compiler bug.

I wrote that the instruction at f makes space for an aligned double. 
This in itself is strange, because the double that is loaded from %edi 
is first saved on the CPU stack, then loaded from the CPU stack again 
and saved in the dc stack, which is unnecessary. Also, the double is 
always loaded to the FPU stack and then removed if bb_error_msg_and_die 
is called, instead of being loaded only after it is clear that it will 
be used. So there is also room for further optimization in the compiler.
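
To illustrate the last point, here is a stand-alone C approximation of the 
kind of function being discussed. It is not copied from miscutils/dc.c: 
the names dc_stack, dc_pointer, DC_STACK_SIZE, push_value and die are 
assumptions, and die is only a stand-in for bb_error_msg_and_die.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DC_STACK_SIZE 8                 /* assumed size, for illustration */

static double dc_stack[DC_STACK_SIZE];  /* the stack dc implements */
static unsigned dc_pointer;             /* index of the next free slot */

/* Stand-in for bb_error_msg_and_die: print a message and exit. */
static void die(const char *msg)
{
    fprintf(stderr, "dc: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* The parameter a is not needed until the bounds check has passed,
 * so the load from the parameter area could be done after the check. */
static void push_value(double a)
{
    if (dc_pointer >= DC_STACK_SIZE)
        die("stack overflow");
    assert(dc_pointer < DC_STACK_SIZE);
    dc_stack[dc_pointer++] = a;         /* the only use of a */
}
```

In the source the only use of the parameter is the final store, which is 
why loading it to the FPU stack before the check, and spilling it to the 
CPU stack in between, looks avoidable.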

This stack alignment makes your code bigger, and the additional 
instructions also have to be executed, which also takes time. I'm not 
sure whether the aligned stack saves enough time to offset this.

