Patch: use BB_GLOBAL_CONST where applicable

Rasmus Villemoes rasmus.villemoes at prevas.dk
Fri Jun 21 07:57:48 UTC 2019


On 06/06/2019 20.19, Denys Vlasenko wrote:
> On Wed, Jun 5, 2019 at 5:51 PM Luís Marques <luismarques at lowrisc.org> wrote:
>> Hello,
>>
>> This patch mainly intends to add Clang/LLVM support, which currently is broken.
>>
>> Problem: the const pointer trick (used by the struct globals, etc.) is
>> technically undefined behavior. In practice, it causes problems with
>> an LLVM-based toolchain, since LLVM optimizes away the store to the
>> const pointer (despite the memory barrier), and therefore the applets
>> crash when they dereference the null pointers.
> 
> Can you experiment with LLVM and find a definition of SET_PTR_TO_GLOBALS()
> which works for it?

I think something like this should work without invoking UB - from the C
_compiler_'s perspective, we never cast a const qualifier away, so all
accesses to the objects go through proper lvalues. It's just that the
assembler and linker have colluded to make the two objects occupy the
same memory location (in .data, or perhaps .bss).

$ grep . g.c h.c
g.c:static int __some_global;
g.c:extern const int some_global __attribute__((__alias__("__some_global")));
g.c:void init(void)
g.c:{
g.c:    __some_global = 123;
g.c:}
h.c:#include <stdio.h>
h.c:extern void init(void);
h.c:extern const int some_global;
h.c:int main(int argc, char *argv[])
h.c:{
h.c:    init();
h.c:    printf("%d\n", some_global);
h.c:    printf("%d\n", some_global);
h.c:    return 0;
h.c:}

$ gcc -O2 -U_FORTIFY_SOURCE -o g.o -c g.c
$ gcc -O2 -U_FORTIFY_SOURCE -o h.o -c h.c
$ nm g.o
0000000000000000 T init
0000000000000000 b __some_global
0000000000000000 B some_global

and objdump on h.o shows that some_global gets loaded just once, into a
callee-saved register (%ebx), which is then reused for both printf() calls:
0000000000000000 <main>:
   0:   53                      push   %rbx
   1:   e8 00 00 00 00          callq  6 <main+0x6>
                        2: R_X86_64_PLT32       init-0x4
   6:   8b 1d 00 00 00 00       mov    0x0(%rip),%ebx        # c <main+0xc>
                        8: R_X86_64_PC32        some_global-0x4
   c:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 13 <main+0x13>
                        f: R_X86_64_PC32        .LC0-0x4
  13:   31 c0                   xor    %eax,%eax
  15:   89 de                   mov    %ebx,%esi
  17:   e8 00 00 00 00          callq  1c <main+0x1c>
                        18: R_X86_64_PLT32      printf-0x4
  1c:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 23 <main+0x23>
                        1f: R_X86_64_PC32       .LC0-0x4
  23:   89 de                   mov    %ebx,%esi
  25:   31 c0                   xor    %eax,%eax
  27:   e8 00 00 00 00          callq  2c <main+0x2c>
                        28: R_X86_64_PLT32      printf-0x4
  2c:   31 c0                   xor    %eax,%eax
  2e:   5b                      pop    %rbx
  2f:   c3                      retq

So this of course requires that the code that initializes __some_global
is guaranteed to run before anything can look at some_global. In fact,
in the example above, the compiler _could_ have decided to hoist the
load of some_global above the call to init(), but the same hazard seems
to exist with the current approach as well.

Rasmus

