[PATCH 0/8] ARC updates to uClibc
Vineet Gupta
Vineet.Gupta1 at synopsys.com
Wed Feb 18 05:51:17 UTC 2015
On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>> While at it I also did some arch specific adjustment in sigaction path
>> - inlining the rt_sigaction syscall stub detour to reduce branch return
>> stack mispredicts etc - which is what 6/8 does !
> This sounds suspicious.
> IIRC we already had that argument, last time around _dl_do_reloc and _dl_do_lazy_reloc.
> Could it be that your port has a bug here ( missed optimisation ) around ifunc handling? Sounds like back then on ARM https://gcc.gnu.org/PR40887#c6
>
> What am I missing?
I don't think my use-case is close to the ARM issue you pointed to above, as there
is no ifunc or function pointer involved.
With the original code, we get 2 function calls on ARC:
0000b504 <__libc_sigaction>:
b504: push_s blink
b506: sub_s sp,sp,12
b508: bl.d 36b20 <__st_r13_to_r15>
...
b540: bl.d b750 <__syscall_rt_sigaction> <--- DIRECT CALL
b544: mov_s r3,8
b546: add_s sp,sp,20
b548: mov_s r12,12
b54a: b 36b88 <__ld_r13_to_r15_ret>
b54e: nop_s
0000b750 <__syscall_rt_sigaction>:
b750: mov r8,134
b754: swi <---- SYSCALL TRAP INTO KERNEL
b758: cmp r0,0xfffffc00
b75c: bls_s b76a
b75e: st.a blink,[sp,-4]
b762: bl b550 <__syscall_error>
b766: ld.ab blink,[sp,4]
b76a: j_s [blink]
Such a small function call is not necessarily good micro-architecturally: the
return can mispredict because the number of call return stack entries is
limited. That cost is amortized if the function is largish.
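To make the two call shapes concrete, here is a rough C sketch. It is only an
illustration, not the uClibc code: fake_trap() is a hypothetical stand-in for
the swi instruction, all function names are made up, and the noinline /
always_inline attributes assume GCC or Clang. The value 134 is the syscall
number seen in the listing.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel trap (swi); it just echoes its
 * argument so the sketch can run in user space. */
static long fake_trap(long nr, long arg)
{
    (void)nr;
    return arg;
}

/* Shape 1: the trap lives in its own tiny out-of-line stub, so the caller
 * pays one extra call/return pair around it. */
__attribute__((noinline))
static long stub_rt_sigaction(long arg)
{
    return fake_trap(134, arg);
}

static long sigaction_via_stub(long arg)
{
    return stub_rt_sigaction(arg);
}

/* Shape 2: the same stub body, forced inline into the caller, so no
 * separate call/return happens for the stub. */
static inline __attribute__((always_inline))
long inline_rt_sigaction(long arg)
{
    return fake_trap(134, arg);
}

static long sigaction_inlined(long arg)
{
    return inline_rt_sigaction(arg);
}

/* Both shapes must compute the same result; only the call depth differs. */
static void call_shape_self_check(void)
{
    assert(sigaction_via_stub(7) == sigaction_inlined(7));
}
```

Both variants return the same value; the difference is only whether the
compiler emits a separate call and return for the stub.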
I do understand that these small syscall wrappers are a common uClibc design
pattern and exist all over the place, but given that this was all arch code I took
the liberty of removing the one hop, and the code now looks as below:
0000b4d8 <__libc_sigaction>:
b4d8: st.a gp,[sp,-4]
b4dc: sub_s sp,sp,20
b4de: add gp,pcl,0x00065284
b4e6: breq_s r1,0,b516
b4e8: ld_s r3,[r1,4]
...
b516: mov r8,134
b51a: mov_s r3,8
b51c: swi
b520: cmp r0,0xfffffc00
b524: bls_s b532
b526: st.a blink,[sp,-4]
b52a: bl b53c <__syscall_error>
b52e: ld.ab blink,[sp,4]
b532: ld.a gp,[sp,20]
b536: j_s.d [blink]
b538: add_s sp,sp,4
b53a: nop_s
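For readers not fluent in ARC assembly, the tail that both listings share
(cmp r0,0xfffffc00 / bls_s / bl __syscall_error) can be modelled in C roughly
as below. This is a sketch under assumptions: finish_syscall() is a made-up
name, the 32-bit types mirror ARC registers, and the error path is assumed to
store the negated return value in errno and return -1.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the post-trap check. r0 is the raw value left in
 * the return register by the kernel. */
static int32_t finish_syscall(uint32_t r0)
{
    /* bls_s: unsigned "lower or same" than 0xfffffc00 means success,
     * so the raw value is returned unchanged. */
    if (r0 <= 0xfffffc00u)
        return (int32_t)r0;

    /* Otherwise r0 holds a small negative error code (-1 .. -1023).
     * Assumed behaviour of the error helper: errno = -r0, return -1. */
    errno = (int)(0u - r0);
    return -1;
}

/* Small self-check: a zero return passes through, -22 becomes errno 22. */
static void finish_syscall_self_check(void)
{
    errno = 0;
    assert(finish_syscall(0u) == 0);
    assert(errno == 0);
    assert(finish_syscall((uint32_t)-22) == -1);
    assert(errno == 22);
}
```

The check is identical before and after the change; only the place it lives
in moves from the stub into __libc_sigaction.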
-Vineet