wget code shrink (recent change)

Xabier Oneca -- xOneca xoneca at gmail.com
Tue Nov 20 08:37:53 UTC 2018


Hi Raffaello & Denys,

First, thank you both very much for your quick response.

>> I can't see why that change would generate less asm. Out of curiosity,
>> anybody cares to explain?
>
> Not hard to guess… option_mask32 will already be in a register right after
> the if, while it might need to be reloaded after the bb_error_msg() call. Or
> if the architecture supports indirect operands (like x86-64), the compiler
> might still generate shorter opcodes by replacing two indirect instructions
> with a load, two register instructions, and a store.

That makes sense. I hadn't thought of that... :/
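To make the two orderings concrete, here is a minimal C sketch of what I understand is being compared. The names option_mask, OPT_NOTED, report() and note_once_*() are stand-ins I made up for illustration (they are not the identifiers in wget.c); the only thing taken from the listings is the bit itself, 1 << 13 (8192), which is what "testb $32, %ah" checks.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: bit 13 (0x2000), matching "testb $32, %ah" / "orl $8192". */
#define OPT_NOTED (1u << 13)

static uint32_t option_mask;   /* hypothetical stand-in for option_mask32 */
static int messages_printed;   /* counts calls, only for this example */

/* Hypothetical stand-in for bb_error_msg(): an opaque call. */
static void report(const char *msg)
{
	messages_printed++;
	fprintf(stderr, "%s\n", msg);
}

/* Old order: test, call, set. The global is touched on both sides
 * of the call, so nothing can be kept in a register across it. */
static void note_once_old(void)
{
	if (!(option_mask & OPT_NOTED)) {
		report("note printed once");
		option_mask |= OPT_NOTED;
	}
}

/* New order: test, set, call. The value loaded for the test can be
 * reused for the OR and stored back before the call. */
static void note_once_new(void)
{
	if (!(option_mask & OPT_NOTED)) {
		option_mask |= OPT_NOTED;
		report("note printed once");
	}
}
```

Both functions behave the same from the caller's point of view (the message is printed on the first call only); the difference is just where the store lands relative to the call, which is what changes the generated code.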

> > I can't see why that change would generate less asm. Out of curiosity,
> > anybody cares to explain?
>
> make networking/wget.s
>
>         movl    option_mask32, %eax     # option_mask32, option_mask32.23
>         testb   $32, %ah        #, option_mask32.23
>         jne     .L13    #,
>         orb     $32, %ah        #, option_mask32.23
>         movl    %eax, option_mask32     # option_mask32.23, option_mask32
>         pushl   $.LC3   #
>         call    bb_error_msg    #
>         popl    %esi    #
> .L13:

The new code seems longer to me (note: I don't usually work with asm).

--- networking/wget.s.old    2018-11-20 09:07:14.894126056 +0100
+++ networking/wget.s.new    2018-11-20 09:07:14.902126010 +0100
@@ -136,11 +136,14 @@
     movq    %fs:40, %rax
     movq    %rax, 8(%rsp)
     xorl    %eax, %eax
-    testb    $32, option_mask32+1(%rip)
+    movl    option_mask32(%rip), %eax
+    testb    $32, %ah
     jne    .L8
+    orb    $32, %ah
     movl    $.LC4, %edi
+    movl    %eax, option_mask32(%rip)
+    xorl    %eax, %eax
     call    bb_error_msg
-    orl    $8192, option_mask32(%rip)
 .L8:
     movq    %rbx, %rdi
     call    xstrdup

In fact, I re-ran bloatcheck, and now it reports +3 bytes. (I don't
know what I did last time I checked... :S )

function                                             old     new   delta
spawn_ssl_client                                     282     285      +3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/0 up/down: 3/0)                 Total: 3 bytes
   text       data        bss        dec        hex    filename
  87172       1406        488      89066      15bea    busybox_old
  87175       1406        488      89069      15bed    busybox_unstripped

> Now, if there's a way to code bit-level test_and_set idiom in C?
> x86 has BTS insn which does it efficiently, but it's not used here
> by the compiler. It could be:
>
>         btsl    $13, option_mask32
>         jc      .L13    #,
>         pushl   $.LC3   #
>         call    bb_error_msg    #
>         popl    %esi    #
> .L13:

Oh! Interesting instruction. Recognizing that pattern could be a job for
the compiler's optimizer...
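For what it's worth, the idiom can at least be written down in plain C as a small helper. This is only a sketch: test_and_set_bit() is a name I made up, not an existing busybox function, and nothing guarantees that a given compiler will turn it into a single BTS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: set bit number `bit` in *word and return
 * whether that bit was already set before the call.
 * Not atomic; it is an ordinary read-modify-write. */
static inline bool test_and_set_bit(uint32_t *word, unsigned bit)
{
	assert(bit < 32);

	uint32_t mask = (uint32_t)1 << bit;
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}
```

The call site would then read something like "if (!test_and_set_bit(&option_mask32, 13)) bb_error_msg(...);", assuming option_mask32 is a 32-bit unsigned. Since BTS copies the old bit value into the carry flag, the branch that skips the message is the one taken when carry is set.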

Thanks!

Xabier Oneca_,,_