pthreads stress test hangs on read() returning EINTR

Chase Douglas cndougla at us.ibm.com
Tue Sep 23 15:54:42 UTC 2008


On Sep 22, 2008, at 8:23 AM, Chase N Douglas wrote:
> On Sep 22, 2008, at 3:39 AM, Peter Kjellerstedt wrote:
>>> -----Original Message-----
>>> From: uclibc-bounces at uclibc.org [mailto:uclibc-bounces at uclibc.org]
>>> On Behalf Of Chase Douglas
>>> Sent: den 19 september 2008 22:21
>>> To: Chase N Douglas
>>> Cc: uclibc at uclibc.org
>>> Subject: Re: pthreads stress test hangs on read() returning EINTR
>>>
>>> On Sep 19, 2008, at 4:05 PM, Chase N Douglas wrote:
>>>> I believe I have found the issue. Suppose there are threads A and B.
>>>> Thread A is running and is in the critical section of malloc() where
>>>> locks are held to manage the heap and such. A context switch then
>>>> occurs before thread A can relinquish the malloc locks. Thread B now
>>>> attempts to fork. After forking, the parent process (still thread B)
>>>> continues as normal, but the new child process immediately attempts
>>>> to free the manager thread resources as it's no longer needed. At
>>>> this point, the child process attempts to grab the malloc locks to
>>>> free the memory, but it sees that someone is holding the lock (what
>>>> used to be thread A). We now have a deadlock since even if thread A
>>>> relinquishes the lock, it's in a separate process now with a
>>>> separate virtual memory space. The lock state in the child process
>>>> will never be relinquished. For this reason, such locks should be
>>>> held before forking, the parent process after the fork should
>>>> relinquish them, and the child process after the fork should
>>>> reinitialize them.
>>>>
>>>> However, there are many locks in uClibc and not a single one is
>>>> managed in this way before and after forking. Shouldn't every single
>>>> lock be dealt with before and after forking, or am I missing
>>>> something here?
>>>
>>> After a second glance I see that the only locks this applies to are
>>> pthread_mutex_t locks and other types derived from pthread_mutex_t
>>> locks. Inside uClibc itself this seems to be only malloc locks and
>>> regex locks. I will be trying to put together a patch that will take
>>> care of both of these types of locks when forking.
>>
>> Please take a look at
>>
>> http://uclibc.org/lists/uclibc/2008-March/019187.html
>>
>> which reported this problem half a year ago...
>
> Thanks for pointing that out. However, like the submitter notes,  
> it's not a full solution as it only encompasses the malloc-standard  
> mutex. I need to be sure that a hang is not possible from any mutex  
> so my thought is to take the mutex initialization macros and turn  
> them into functions that will register the internal mutexes so that  
> when fork() is called we can make sure to grab each mutex before we  
> proceed.
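
To make the registration idea above concrete, here is a minimal sketch
in C. It is not code from uClibc or from any patch: every name in it
(demo_registry, demo_lock_init_registered, demo_lock_all and so on) is
hypothetical, the fixed capacity is arbitrary, and it assumes locks are
registered during single-threaded startup. Locks are always taken in
registration order so that two forking threads cannot deadlock against
each other.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_LOCKS 16  /* arbitrary capacity chosen for this sketch */

/* Hypothetical registry of every internal mutex fork() must manage. */
static pthread_mutex_t *demo_registry[DEMO_MAX_LOCKS];
static size_t demo_count;

/* Initialize a mutex and remember it in the registry.
 * Returns 0 on success, -1 if the registry is full or init fails.
 * Assumed to be called before any additional threads exist. */
int demo_lock_init_registered(pthread_mutex_t *m)
{
    if (demo_count == DEMO_MAX_LOCKS)
        return -1;
    if (pthread_mutex_init(m, NULL) != 0)
        return -1;
    demo_registry[demo_count++] = m;
    return 0;
}

/* Before fork: acquire every registered mutex in registration order. */
void demo_lock_all(void)
{
    for (size_t i = 0; i < demo_count; i++)
        pthread_mutex_lock(demo_registry[i]);
}

/* Parent after fork: release the mutexes in reverse order. */
void demo_unlock_all(void)
{
    for (size_t i = demo_count; i > 0; i--)
        pthread_mutex_unlock(demo_registry[i - 1]);
}

/* Child after fork: reset every registered mutex to the unlocked state. */
void demo_reinit_all(void)
{
    for (size_t i = 0; i < demo_count; i++)
        pthread_mutex_init(demo_registry[i], NULL);
}

/* Number of mutexes currently registered. */
size_t demo_registered_count(void)
{
    return demo_count;
}
```

A fork wrapper built on this sketch would call demo_lock_all() before
the fork syscall, demo_unlock_all() in the parent, and
demo_reinit_all() in the child.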

According to the POSIX standard for fork (man 3p fork):

"If a multi-threaded process calls fork(), the new process shall  
contain a replica of the calling thread and its entire address space,  
possibly including the states of mutexes and other resources.   
Consequently, to avoid errors, the child process may only execute  
async-signal-safe operations until such time as one of the exec  
functions is called."
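
As an illustration of what that restriction means for application
code, the following is a rough sketch of a child that calls only
async-signal-safe functions (execv, write, _exit) between fork() and
exec. The function name demo_spawn is hypothetical and exists only for
this example. Note that the child avoids malloc(), stdio and anything
else that might take a lock inherited in a held state.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, then exec `path` with `argv` in the child.
 * Returns the child's exit status, or -1 on error or abnormal exit.
 * The child calls only execv(), write() and _exit(). */
int demo_spawn(const char *path, char *const argv[])
{
    static const char msg[] = "exec failed\n";
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no malloc(), no printf(), no locking of any kind. */
        execv(path, argv);
        /* Only reached if execv() failed. */
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
            /* Nothing safe to do about a failed write here. */
        }
        _exit(127);
    }

    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exit code 127 for a failed exec follows the usual shell
convention; it is a choice made for this sketch, not something the
standard requires.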

Here's a listing of all the POSIX-stipulated async-signal-safe
functions:
http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/MTP/p40.html

I have audited the uClibc source code to try to find all the mutexes  
used by these functions so that a proper fork call will not hang as  
long as the child process only uses these async-signal-safe functions.  
My audit (hopefully accurate) showed that the only mutexes used were  
malloc mutexes and two fork mutexes (pthread_atfork mutex and  
pthread_once_mutex).

The attached patch locks all of these mutexes before making the real
fork syscall. Afterwards, the parent thread unlocks the mutexes and the
child process reinitializes them. All three malloc libraries are
supported and compile cleanly, but only malloc-standard was tested. In
the "malloc" library (i.e. neither malloc-standard nor malloc-simple),
the two heap locks were moved out of the heap structure so they could
be made available through an extern variable. Also, a new
__libc_lock_init_adaptive macro was added to the libc-lock.h header,
and the __libc_lock_init_recursive macro was modified because its form
did not match the current recursive lock type.
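
For readers who want to see the lock / unlock / reinitialize pattern
in isolation, here is a small sketch. It is not an excerpt from the
patch: the patch works inside uClibc's own fork path, whereas this
sketch uses the public pthread_atfork() interface, and demo_lock and
all the demo_* functions are hypothetical stand-ins. Reinitializing a
mutex that was locked at fork time goes beyond what POSIX strictly
guarantees, so treat that step as an assumption about the
implementation.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one of the library's internal locks. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Before fork(): take the lock so no other thread can be inside the
 * critical section while the address space is copied. */
static void demo_prepare(void)
{
    pthread_mutex_lock(&demo_lock);
}

/* Parent after fork(): release what demo_prepare() acquired. */
static void demo_parent(void)
{
    pthread_mutex_unlock(&demo_lock);
}

/* Child after fork(): reset the copied lock to the unlocked state. */
static void demo_child(void)
{
    pthread_mutex_init(&demo_lock, NULL);
}

/* Register the three handlers. Returns 0 on success. */
int demo_install_fork_handlers(void)
{
    return pthread_atfork(demo_prepare, demo_parent, demo_child);
}

/* Fork and have the child try to take the lock.
 * Returns 0 if the child found the lock usable, -1 otherwise. */
int demo_fork_and_use_lock(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* trylock never blocks, so a stale lock shows up as a failure
         * rather than a hang. */
        int ok = (pthread_mutex_trylock(&demo_lock) == 0);
        if (ok)
            pthread_mutex_unlock(&demo_lock);
        _exit(ok ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Call demo_install_fork_handlers() once at startup; afterwards every
fork() in the process runs the three handlers automatically.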

Thanks,
Chase Douglas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: uclibc-ptfork-mutexes.patch
Type: application/octet-stream
Size: 22583 bytes
Desc: not available
Url : http://lists.busybox.net/pipermail/uclibc/attachments/20080923/8947dd8d/attachment-0002.obj 