[uClibc 0001013]: pthread_cancel/pthread_join sequence hangs when using select in another thread

bugs at busybox.net
Wed Sep 5 04:25:24 UTC 2007


A NOTE has been added to this issue. 
====================================================================== 
http://busybox.net/bugs/view.php?id=1013 
====================================================================== 
Reported By:                jalaber
Assigned To:                uClibc
====================================================================== 
Project:                    uClibc
Issue ID:                   1013
Category:                   Posix Threads
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             08-31-2006 09:04 PDT
Last Modified:              09-04-2007 21:25 PDT
====================================================================== 
Summary:                    pthread_cancel/pthread_join sequence hangs when
using select in another thread
Description: 
Hello,

I have found a very strange bug in uClibc using
pthread_cancel/pthread_join. 

My test program launches one thread, which essentially calls select() with
a struct timeval set to 600 ms. The main thread then calls pthread_cancel
and pthread_join, followed by a printf. The program hangs.
However, if I remove the printf call, the program terminates normally.
I have tried replacing the select call with a sem_wait call, and
everything works fine with or without the printf. So the problem seems to
happen only with select.

I use buildroot with kernel 2.4.28 and uClibc 0.9.28. I have attached the
program to reproduce the problem. If you comment out the
printf("join OK\n"), it works for me.
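
The attached pthread_join_test.c is not reproduced in this report. A
minimal sketch of the scenario described above might look like the
following. The names select_worker and run_cancel_join_test, and the
one-second delay before cancelling, are assumptions made for illustration
only. On a libc where select() is a cancellation point, the function is
expected to return; on an affected system it would hang at or after
pthread_join.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Worker thread: sleeps in select() with a 600 ms timeout, forever. */
static void *select_worker(void *arg)
{
    (void)arg;
    for (;;) {
        struct timeval tv = { 0, 600000 };   /* 600 ms */
        select(0, NULL, NULL, NULL, &tv);
    }
    return NULL;
}

/* Returns 0 if cancel, join and printf all complete, -1 on API error.
 * If the bug is present, this function does not return. */
int run_cancel_join_test(void)
{
    pthread_t th;

    if (pthread_create(&th, NULL, select_worker, NULL) != 0)
        return -1;
    sleep(1);                     /* give the worker time to enter select() */
    if (pthread_cancel(th) != 0)
        return -1;
    if (pthread_join(th, NULL) != 0)
        return -1;
    printf("join OK\n");          /* the call reported to trigger the hang */
    return 0;
}
```

Calling run_cancel_join_test() from main() and linking with -lpthread
should be enough to try it.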

Thank you for your time and help,
Philippe.

====================================================================== 

---------------------------------------------------------------------- 
 dwagner - 10-23-06 11:43  
---------------------------------------------------------------------- 
I think this issue is the reason the LIRC driver of directfb RC1 does not
terminate. The driver uses select() and
pthread_cancel()/pthread_join().

Please fix that. 

---------------------------------------------------------------------- 
 vapier - 11-16-06 23:07  
---------------------------------------------------------------------- 
here's a tip ... saying things like "Please fix that." makes people think
"Fix it your goddamn self."

the hang may be because of the IO mutex being held by the canceled thread
... if you turn on PDEBUG in libpthread/linuxthreads/debug.h, that may
give you helpful output 
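
To illustrate the hypothesis above, here is a rough sketch of how a lock
can stay held after its owner is cancelled. It uses an ordinary mutex
named demo_lock as a stand-in for the stdio lock; it does not touch the
real IO mutex, and all names are assumptions for illustration. It also
assumes a libc where select() acts as a cancellation point, so that the
cancel and join actually complete.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

/* Stand-in for an internal lock such as the stdio lock (assumption). */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker: takes the lock, then sleeps in select() without releasing it. */
static void *locking_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    for (;;) {
        struct timeval tv = { 0, 100000 };   /* 100 ms */
        select(0, NULL, NULL, NULL, &tv);
    }
    return NULL;
}

/* Returns 1 if the lock is still held after the worker was cancelled,
 * 0 if it was free, -1 on API error. trylock is used so that the check
 * itself cannot hang. */
int lock_left_held_after_cancel(void)
{
    pthread_t th;
    int rc;

    if (pthread_create(&th, NULL, locking_worker, NULL) != 0)
        return -1;
    sleep(1);                     /* let the worker take the lock */
    if (pthread_cancel(th) != 0 || pthread_join(th, NULL) != 0)
        return -1;

    rc = pthread_mutex_trylock(&demo_lock);
    if (rc == EBUSY)
        return 1;                 /* a blocking lock here would hang */
    if (rc == 0)
        pthread_mutex_unlock(&demo_lock);
    return 0;
}
```

If the main thread used a blocking pthread_mutex_lock() instead of
trylock, it would wait forever, which is the kind of hang a later printf
could run into if the cancelled thread held the IO mutex.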

---------------------------------------------------------------------- 
 chombourger - 06-27-07 12:58  
---------------------------------------------------------------------- 
I tried to reproduce this issue on uClibc 0.9.29 running on a PC with a 2.6
Linux kernel, and your program is still running as I am typing these lines!
Is that what you are getting? Running the same program on the host
(compiled and linked against glibc) worked. I will now try to enable the
debug traces to see if that helps.

---------------------------------------------------------------------- 
 chombourger - 06-27-07 13:15  
---------------------------------------------------------------------- 
traces with debug enabled in linuxthreads.old:

26294 : __pthread_initialize_manager: manager stack: size=8160, bos=0x804a150, tos=0x804c130
26294 : __pthread_initialize_manager: send REQ_DEBUG to manager thread
26294 : pthread_create: write REQ_CREATE to manager thread
26294 : pthread_create: before suspend(self)
26295 : __pthread_manager: before poll
26295 : __pthread_manager: after poll
26295 : __pthread_manager: before __libc_read
26295 : __pthread_manager: after __libc_read, n=148
26295 : __pthread_manager: got REQ_CREATE
26295 : pthread_handle_create: cloning new_thread = 0xbf1ffe20
26295 : pthread_handle_create: new thread pid = 26296
26295 : __pthread_manager: restarting -1208466944
26294 : pthread_create: after suspend(self)
26295 : __pthread_manager: before poll
26296 : pthread_start_thread:
step 0
step 1
step 2
26295 : __pthread_manager: after poll
26295 : __pthread_manager: before poll
step 3
cancel th...
26294 : pthread_cancel: sending cancel signal to 26296
26294 : pthread_cancel: kill returned 0 

---------------------------------------------------------------------- 
 chombourger - 06-30-07 14:22  
---------------------------------------------------------------------- 
It seems that the created thread has no jmpbuf when
pthread_handle_sigcancel() is called in it, so the signal handler simply
returns and the thread is not rerouted.

select() does not behave like a cancellation point, although it should.
Could it be because select() is simply a syscall5 and we therefore never
reach the sigwait() function of the pthread library?

If I modify the select() call as follows, the thread is indeed canceled:

r = select(0, NULL, NULL, NULL, &tv);
if ((r == -1) && (errno == EINTR)) pthread_testcancel();
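
For reference, the same workaround written out as a self-contained helper
might look like this. The function name sleep_with_cancel_check and its
parameters are hypothetical; the idea is only that a select() interrupted
by the cancel signal returns -1 with errno set to EINTR, and
pthread_testcancel() then acts on the pending cancellation request.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: wait in select() for the given time and, if the
 * call was interrupted by a signal, check for a pending cancellation. */
int sleep_with_cancel_check(long sec, long usec)
{
    struct timeval tv;
    int r;

    tv.tv_sec = sec;
    tv.tv_usec = usec;

    r = select(0, NULL, NULL, NULL, &tv);
    if (r == -1 && errno == EINTR)
        pthread_testcancel();   /* does not return if a cancel is pending */
    return r;                   /* 0 on timeout, -1 on error */
}
```

This only helps in code that can be modified to call the helper; it does
not make select() itself a cancellation point.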

I eventually found where linuxthreads.old defines cancellable system
calls (wrapsyscall.c) and added an entry for select(2).

Note: select() was previously listed as a cancellation point, but it was
removed by Ulrich Drepper and I don't know why:

CVSROOT:	/cvs/glibc
Module name:	libc
Changes by:	drepper at sourceware.org	2002-12-15 13:43:25

Modified files:
	linuxthreads   : wrapsyscall.c 

Log message:
	Remove creat, poll, pselect, readv, select, sigpause, sigsuspend,
	sigwaitinfo, waitid, and writev wrappers.

I have attached to this report a patch re-introducing select(2).
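
The patch itself is not shown here. As a sketch of the general wrapper
technique only, and not the contents of the attached patch, a cancellable
wrapper can switch the calling thread to asynchronous cancellation for the
duration of the system call and restore the previous type afterwards. The
name cancellable_select is hypothetical; a real wrapper inside the library
would call the internal libc entry point rather than the public select().

```c
#include <pthread.h>
#include <sys/select.h>

/* Hypothetical wrapper: make a blocking select() cancellable by enabling
 * asynchronous cancellation around the call. */
int cancellable_select(int nfds, fd_set *readfds, fd_set *writefds,
                       fd_set *exceptfds, struct timeval *timeout)
{
    int oldtype;
    int ignored;
    int result;

    /* While asynchronous cancellation is on, a cancel request takes
     * effect even if the thread is blocked inside the system call. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);
    result = select(nfds, readfds, writefds, exceptfds, timeout);
    /* Restore whatever cancellation type the caller had before. */
    pthread_setcanceltype(oldtype, &ignored);
    return result;
}
```

The window of asynchronous cancellation is kept as small as possible,
since a thread cancelled asynchronously elsewhere could be interrupted
while holding locks.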

 

---------------------------------------------------------------------- 
 hmoffatt - 09-04-07 21:25  
---------------------------------------------------------------------- 
I have an application which is hanging with uClibc 0.9.29. The main process
is regularly calling fork() and exec(). There is also a thread which is
doing a select() with a 1ms timeout in an endless loop. The application
sometimes hangs just after the fork/exec; the exec has happened (there is
a zombie process left around).

I tried the patch in this bug report; now the program segfaults instead of
hanging. So I don't think the patch is the correct solution. 

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
08-31-06 09:04  jalaber        New Issue                                    
08-31-06 09:04  jalaber        Status                   new => assigned     
08-31-06 09:04  jalaber        Assigned To               => uClibc          
08-31-06 09:04  jalaber        File Added: pthread_join_test.c                  
 
10-23-06 11:41  dwagner        Issue Monitored: dwagner                     
10-23-06 11:43  dwagner        Note Added: 0001715                          
11-16-06 23:07  vapier         Note Added: 0001744                          
06-27-07 12:58  chombourger    Note Added: 0002526                          
06-27-07 13:15  chombourger    Note Added: 0002527                          
06-27-07 13:26  chombourger    Note Added: 0002528                          
06-27-07 13:52  chombourger    Note Edited: 0002528                         
06-30-07 12:49  chombourger    Note Edited: 0002528                         
06-30-07 14:18  chombourger    File Added: uClibc-select-cancellation-point.patch
06-30-07 14:22  chombourger    Note Edited: 0002528                         
07-01-07 22:26  chombourger    Issue Monitored: chombourger                    
09-04-07 21:25  hmoffatt       Note Added: 0002712                          
======================================================================



