wget code shrink (recent change)
Xabier Oneca -- xOneca
xoneca at gmail.com
Tue Nov 20 08:37:53 UTC 2018
Hi Raffaello & Denys,
First, thank you both very much for your quick response.
>> I can't see why that change would generate less asm. Out of curiosity,
>> anybody cares to explain?
>
> Not hard to guess… option_mask32 will already be in a register right after
> the if, while it might need to be reloaded after the bb_error_msg() call. Or if
> the architecture supports indirect operands (like x86-64), the compiler might
> still generate shorter opcodes by replacing two indirect instructions with a
> load, two register instructions, and a store.
That makes sense. I hadn't thought of that... :/
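For readers without the patch at hand, the two orderings being compared can be sketched in C roughly as follows. This is a minimal sketch, not the actual wget.c code: option_mask, OPT_WARNED, warn() and both function names are made-up stand-ins for option_mask32, the real option bit and bb_error_msg(). Only the bit value 0x2000 is taken from the asm in this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag; 1 << 13 == 0x2000 == 8192, the bit seen in the asm. */
#define OPT_WARNED ((uint32_t)1 << 13)

static uint32_t option_mask;   /* stand-in for busybox's option_mask32 */
static int warn_count;         /* counts calls, for demonstration only */

/* Stand-in for bb_error_msg(). In busybox that is an external function,
 * so the compiler cannot assume the global is unchanged across the call. */
static void warn(const char *msg)
{
    warn_count++;
    fprintf(stderr, "%s\n", msg);
}

/* Variant A: call first, set the flag afterwards.
 * The flag word is accessed again after the call. */
void warn_once_set_after(void)
{
    if (!(option_mask & OPT_WARNED)) {
        warn("note: shown only once");
        option_mask |= OPT_WARNED;
    }
}

/* Variant B: set the flag before the call.
 * The value loaded for the test can be reused for the OR and the store. */
void warn_once_set_before(void)
{
    if (!(option_mask & OPT_WARNED)) {
        option_mask |= OPT_WARNED;
        warn("note: shown only once");
    }
}
```

Both variants behave the same from the caller's point of view. Which one compiles to fewer bytes depends on the compiler and the target, so it is worth measuring rather than guessing.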
> > I can't see why that change would generate less asm. Out of curiosity,
> > anybody cares to explain?
>
> make networking/wget.s
>
> movl option_mask32, %eax # option_mask32, option_mask32.23
> testb $32, %ah #, option_mask32.23
> jne .L13 #,
> orb $32, %ah #, option_mask32.23
> movl %eax, option_mask32 # option_mask32.23, option_mask32
> pushl $.LC3 #
> call bb_error_msg #
> popl %esi #
> .L13:
The new code seems longer to me (note: I don't usually work with asm).
--- networking/wget.s.old 2018-11-20 09:07:14.894126056 +0100
+++ networking/wget.s.new 2018-11-20 09:07:14.902126010 +0100
@@ -136,11 +136,14 @@
movq %fs:40, %rax
movq %rax, 8(%rsp)
xorl %eax, %eax
- testb $32, option_mask32+1(%rip)
+ movl option_mask32(%rip), %eax
+ testb $32, %ah
jne .L8
+ orb $32, %ah
movl $.LC4, %edi
+ movl %eax, option_mask32(%rip)
+ xorl %eax, %eax
call bb_error_msg
- orl $8192, option_mask32(%rip)
.L8:
movq %rbx, %rdi
call xstrdup
In fact, I re-checked the bloatcheck, and now it gives +3 bytes. (I don't
know what I did the last time I checked... :S )
function old new delta
spawn_ssl_client 282 285 +3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/0 up/down: 3/0)               Total: 3 bytes
text data bss dec hex filename
87172 1406 488 89066 15bea busybox_old
87175 1406 488 89069 15bed busybox_unstripped
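The +3 bytes can be accounted for by hand from the x86-64 diff above. This is a sketch that assumes the usual x86-64 encodings with RIP-relative 32-bit displacements. The byte counts are worked out from the instruction formats, not read from the object file, so verify them with objdump -d if it matters.

```
removed:
  testb $32, option_mask32+1(%rip)     7  (opcode, modrm, disp32, imm8)
  orl   $8192, option_mask32(%rip)    10  (opcode, modrm, disp32, imm32)
                                      --
                                      17
added:
  movl  option_mask32(%rip), %eax      6  (opcode, modrm, disp32)
  testb $32, %ah                       3  (opcode, modrm, imm8)
  orb   $32, %ah                       3  (opcode, modrm, imm8)
  movl  %eax, option_mask32(%rip)      6  (opcode, modrm, disp32)
  xorl  %eax, %eax                     2  (opcode, modrm)
                                      --
                                      20

delta: 20 - 17 = +3
```

The extra xorl is presumably needed because bb_error_msg() is variadic, and the x86-64 SysV calling convention expects %al to hold the number of vector registers used. The old code could rely on the earlier xorl %eax, %eax, while the new code overwrites %eax with the load and has to zero it again.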
> Now, if there's a way to code bit-level test_and_set idiom in C?
> x86 has BTS insn which does it efficiently, but it's not used here
> by the compiler. It could be:
>
> btsl $13, option_mask32
> jnc .L13 #,
> pushl $.LC3 #
> call bb_error_msg #
> popl %esi #
> .L13:
Oh! Interesting instruction. Emitting it automatically could be optimization
work for the compiler...
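In plain C, the test-and-set idiom from the quoted question can be written as a small helper like the one below. This is only a sketch: test_and_set_bit() is a hypothetical name, not an existing busybox function, it is not atomic, and nothing guarantees that a given compiler will turn it into a BTS instruction.

```c
#include <stdint.h>

/* Hypothetical helper: set bit number `bit` (0..31) in *word and
 * return 1 if it was already set, 0 otherwise. Not atomic. */
static inline int test_and_set_bit(uint32_t *word, unsigned bit)
{
    uint32_t mask = (uint32_t)1 << bit;
    int was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}
```

A caller would then write something like `if (!test_and_set_bit(&option_mask32, 13)) bb_error_msg(...);`, which keeps the test and the update in one place and leaves the instruction choice to the compiler.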
Thanks!
Xabier Oneca_,,_