[PATCH 0/8] ARC updates to uClibc

Vineet Gupta Vineet.Gupta1 at synopsys.com
Wed Feb 18 08:11:27 UTC 2015


On Wednesday 18 February 2015 01:33 PM, Bernhard Reutner-Fischer wrote:
> On February 18, 2015 6:51:17 AM GMT+01:00, Vineet Gupta <Vineet.Gupta1 at synopsys.com> wrote:
>> On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>>>> While at it I also did some arch specific adjustment in sigaction path
>>>> - inlining the rt_sigaction syscall stub detour to reduce branch return
>>>> stack mispredicts etc - which is what 6/8 does!
>>> This sounds suspicious.
>>> IIRC we already had that argument, last time around _dl_do_reloc and _dl_do_lazy_reloc.
>>> Could it be that your port has a bug here (missed optimisation) around ifunc handling? Sounds like back then on ARM
>>> https://gcc.gnu.org/PR40887#c6
>>> What am I missing?
>>
>> I don't think my use-case is close to the ARM issue you pointed to above,
>> as there is no ifunc or function pointer involved.
> I was more thinking about the reloc functors.
> Does GCC 5 produce identical code for the ARC master way of making explicit function calls, compared to using a function pointer as suggested and used in all other ports?
> If not then I'd consider this a bug.
>
>> With orig code, we get 2 function calls on ARC:
>>
>> 0000b504 <__libc_sigaction>:
>>    b504:	push_s     blink
>>    b506:	sub_s      sp,sp,12
>>    b508:	bl.d       36b20 <__st_r13_to_r15>
>> ...
>>
>>    b540:	bl.d       b750 <__syscall_rt_sigaction>   <--- DIRECT CALL
>>    b544:	mov_s      r3,8
>>    b546:	add_s      sp,sp,20
>>    b548:	mov_s      r12,12
>>    b54a:	b          36b88 <__ld_r13_to_r15_ret>
>>    b54e:	nop_s
>>
>> 0000b750 <__syscall_rt_sigaction>:
>>    b750:	mov        r8,134
>>    b754:	swi                                <---- SYSCALL TRAP INTO KERNEL
>>    b758:	cmp        r0,0xfffffc00
>>    b75c:	bls_s      b76a
>>    b75e:	st.a       blink,[sp,-4]
>>    b762:	bl         b550 <__syscall_error>
>>    b766:	ld.ab      blink,[sp,4]
>>    b76a:	j_s        [blink]
>>
>> The small function call is not necessarily good micro-architecturally when
>> returning, due to the limited number of call return stack entries. That cost
>> is amortized if the function is largish.
>>
>> I do understand that these small syscall wrappers are a common uClibc design
>> pattern and exist all over the place, but given that this was all arch code I
>> took the liberty of removing the one hop and the code now looks as below:
>>
>> 0000b4d8 <__libc_sigaction>:
>>    b4d8:	st.a       gp,[sp,-4]
>>    b4dc:	sub_s      sp,sp,20
>>    b4de:	add        gp,pcl,0x00065284
>>    b4e6:	breq_s     r1,0,b516
>>    b4e8:	ld_s       r3,[r1,4]
>> ...
>>    b516:	mov        r8,134
>>    b51a:	mov_s      r3,8
>>    b51c:	swi
>>    b520:	cmp        r0,0xfffffc00
>>    b524:	bls_s      b532
>>    b526:	st.a       blink,[sp,-4]
>>    b52a:	bl         b53c <__syscall_error>
>>    b52e:	ld.ab      blink,[sp,4]
>>    b532:	ld.a       gp,[sp,20]
>>    b536:	j_s.d      [blink]
>>    b538:	add_s      sp,sp,4
>>    b53a:	nop_s
> I would have assumed / hoped that GCC 5 should generate this 2nd variant for extern inline __syscall_rt_sigaction.
>
> Doesn't it do that?

The ARC gcc upgrade to 5.0 is still in progress, so I can't comment yet. CCing our gcc gurus!
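
For readers following the thread, the two code shapes being compared can be roughly sketched in C as below. This is only an illustration under stated assumptions, not the uClibc or ARC source: the function names are hypothetical, the generic Linux syscall() wrapper stands in for the arch-specific inline trap (so a libc call still remains here), and the kernel sigset size is assumed to be 8 bytes as in the "mov_s r3,8" of the disassembly.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: kernel sigset_t is 8 bytes (true on most Linux ports, not all). */
#define KERNEL_SIGSET_SIZE ((size_t)8)

/* Variant 1: separate out-of-line stub, mirroring the extra
 * __syscall_rt_sigaction hop. noinline forces the call/return pair. */
__attribute__((noinline))
static long stub_rt_sigaction(int sig, const void *act, void *oact)
{
    return syscall(SYS_rt_sigaction, sig, act, oact, KERNEL_SIGSET_SIZE);
}

/* Caller that goes through the stub (hypothetical name). */
static long sigaction_via_stub(int sig)
{
    return stub_rt_sigaction(sig, NULL, NULL);
}

/* Variant 2: the same stub forced inline, so the caller contains the
 * syscall sequence directly and the small-function return disappears. */
static inline __attribute__((always_inline))
long inline_rt_sigaction(int sig, const void *act, void *oact)
{
    return syscall(SYS_rt_sigaction, sig, act, oact, KERNEL_SIGSET_SIZE);
}

/* Caller with the stub folded in (hypothetical name). */
static long sigaction_inlined(int sig)
{
    return inline_rt_sigaction(sig, NULL, NULL);
}

/* Both variants must behave identically; only the call structure differs. */
static void check_variants_agree(int sig)
{
    assert(sigaction_via_stub(sig) == sigaction_inlined(sig));
}
```

Passing NULL for both the new and old action only validates the signal number, so the sketch has no side effects; whether a given compiler produces the second shape on its own for an inline stub is exactly the open question above.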

-Vineet

