[PATCH 0/8] ARC updates to uClibc
Vineet Gupta
Vineet.Gupta1 at synopsys.com
Wed Feb 18 08:11:27 UTC 2015
On Wednesday 18 February 2015 01:33 PM, Bernhard Reutner-Fischer wrote:
> On February 18, 2015 6:51:17 AM GMT+01:00, Vineet Gupta <Vineet.Gupta1 at synopsys.com> wrote:
>> On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>>>> While at it I also did some arch-specific adjustments in the sigaction path
>>>> - inlining the rt_sigaction syscall stub detour to reduce branch return
>>>> stack mispredicts etc - which is what 6/8 does!
>>> This sounds suspicious.
>>> IIRC we already had that argument, last time around _dl_do_reloc and _dl_do_lazy_reloc.
>>> Could it be that your port has a bug here (missed optimisation) around ifunc handling? Sounds like back then on ARM
>>> https://gcc.gnu.org/PR40887#c6
>>> What am I missing?
>>
>> I don't think my use-case is close to the ARM issue you pointed to above,
>> as there is no ifunc or function pointer involved.
> I was thinking more about the reloc functors.
> Does GCC 5 produce identical code for ARC master's approach of explicit function calls compared to using a function pointer, as suggested and used in all other ports?
> If not then I'd consider this a bug.
>
>> With orig code, we get 2 function calls on ARC:
>>
>> 0000b504 <__libc_sigaction>:
>> b504: push_s blink
>> b506: sub_s sp,sp,12
>> b508: bl.d 36b20 <__st_r13_to_r15>
>> ...
>>
>> b540: bl.d b750 <__syscall_rt_sigaction> <--- DIRECT CALL
>> b544: mov_s r3,8
>> b546: add_s sp,sp,20
>> b548: mov_s r12,12
>> b54a: b 36b88 <__ld_r13_to_r15_ret>
>> b54e: nop_s
>>
>> 0000b750 <__syscall_rt_sigaction>:
>> b750: mov r8,134
>> b754: swi <---- SYSCALL TRAP INTO KERNEL
>> b758: cmp r0,0xfffffc00
>> b75c: bls_s b76a
>> b75e: st.a blink,[sp,-4]
>> b762: bl b550 <__syscall_error>
>> b766: ld.ab blink,[sp,4]
>> b76a: j_s [blink]
>>
>> The small function call is not necessarily good micro-architecturally
>> when returning, due to the limited number of call return stack entries.
>> That cost is amortized if the function is largish.
>>
>> I do understand that these small syscall wrappers are a common uClibc
>> design pattern and exist all over the place, but given that this was all
>> arch code I took the liberty of removing the one hop, and the code now
>> looks as below:
>>
>> 0000b4d8 <__libc_sigaction>:
>> b4d8: st.a gp,[sp,-4]
>> b4dc: sub_s sp,sp,20
>> b4de: add gp,pcl,0x00065284
>> b4e6: breq_s r1,0,b516
>> b4e8: ld_s r3,[r1,4]
>> ...
>> b516: mov r8,134
>> b51a: mov_s r3,8
>> b51c: swi
>> b520: cmp r0,0xfffffc00
>> b524: bls_s b532
>> b526: st.a blink,[sp,-4]
>> b52a: bl b53c <__syscall_error>
>> b52e: ld.ab blink,[sp,4]
>> b532: ld.a gp,[sp,20]
>> b536: j_s.d [blink]
>> b538: add_s sp,sp,4
>> b53a: nop_s
> I would have assumed / hoped that GCC 5 should generate this 2nd variant for extern inline __syscall_rt_sigaction.
>
> Doesn't it do that?
The ARC gcc upgrade to 5.0 is still in progress, so I can't comment. CCing our gcc gurus!
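
For anyone following along, here is a rough C sketch of the two shapes being compared: an out-of-line syscall stub versus the same body inlined into its caller. It is only an illustration under assumptions. It uses getpid through the generic Linux syscall(2) entry point as a stand-in for rt_sigaction, and every function name in it is hypothetical, not uClibc or ARC port code.

```c
/* Illustration only: getpid stands in for rt_sigaction, and all names
 * here are hypothetical. Note that syscall() is itself a libc call, so
 * this shows only the extra wrapper hop, not a raw trap instruction. */
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Variant 1: out-of-line stub, analogous to __syscall_rt_sigaction.
 * noinline forces a real call/return pair at the call site. */
__attribute__((noinline))
static long stub_getpid(void)
{
    return syscall(SYS_getpid);
}

long caller_via_stub(void)
{
    return stub_getpid();       /* one extra call + return */
}

/* Variant 2: same body, forced into the caller, so the hop disappears. */
static inline __attribute__((always_inline))
long inline_getpid(void)
{
    return syscall(SYS_getpid);
}

long caller_inlined(void)
{
    return inline_getpid();     /* stub body is expanded here */
}

/* Both variants must return the same value; only the call structure differs. */
void check_variants_agree(void)
{
    assert(caller_via_stub() == caller_inlined());
}
```

Building this with optimisation and comparing the objdump output of caller_via_stub and caller_inlined should show the same kind of difference as the two listings quoted above, though the exact code depends on the compiler and target.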
-Vineet