segfault in dlclose() when lots of shared objects are dlopened

Anthony G. Basile basile at opensource.dyc.edu
Mon Dec 22 15:07:15 UTC 2014


Hi all,

I hit a stubborn segfault when building librsvg on uClibc.
Specifically, the problem occurs when running gdk-pixbuf-query-loaders,
which reads a libtool .la file and generates a cache of loadable shared
objects to be consumed by gdk-pixbuf.  In my particular case, this meant
dlopening and dlclosing some 47 shared objects, as reported by:

  LD_DEBUG=1 /usr/bin/gdk-pixbuf-query-loaders ./libpixbufloader-svg.la 2>&1 | grep ^do_dlopen | grep ctors | wc -l

Here's the backtrace of the segfault:

Program received signal SIGSEGV, Segmentation fault.
__deregister_frame_info (begin=0x7ffff22d96e0) at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c:222
222 /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c: No such file or directory.
(gdb) bt
#0 __deregister_frame_info (begin=0x7ffff22d96e0) at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c:222
#1 0x00007ffff22c281e in __do_global_dtors_aux () from /lib/libbz2.so.1
#2 0x0000555555770da0 in ?? ()
#3 0x0000555555770da0 in ?? ()
#4 0x00007fffffffdde0 in ?? ()
#5 0x00007ffff22d8a2f in _fini () from /lib/libbz2.so.1
#6 0x00007fffffffdde0 in ?? ()
#7 0x00007ffff6f8018d in do_dlclose (vhandle=0x7ffff764a420 <__malloc_lock>, need_fini=32767) at ldso/libdl/libdl.c:860
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Line 860 of libdl.c is

  DL_CALL_FUNC_AT_ADDR (dl_elf_fini, tpnt->loadaddr, (int (*)(void)));

This suggests that dl_elf_fini, the address of the dtor function for
libbz2.so.1, is no longer a valid address at this point.  An independent
check that dlopens and dlcloses only libbz2.so.1 shows that the library
itself is okay.
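
For anyone who wants to repeat that kind of check, here is a minimal
sketch.  It is not the test I actually ran; cycle_libraries() is a
hypothetical helper name, and the paths to try are up to the caller.
It only uses dlopen(), dlclose() and dlerror() from <dlfcn.h>.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: dlopen() and then dlclose() each path in turn.
 * Returns the number of paths that failed to open or to close. */
int cycle_libraries(const char *const *paths, size_t n)
{
    int failures = 0;

    for (size_t i = 0; i < n; i++) {
        void *handle = dlopen(paths[i], RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "dlopen(%s): %s\n", paths[i], dlerror());
            failures++;
            continue;
        }
        /* dlclose() drops the reference; when it reaches zero the
         * loader runs the library's destructors before unloading it. */
        if (dlclose(handle) != 0) {
            fprintf(stderr, "dlclose(%s): %s\n", paths[i], dlerror());
            failures++;
        }
    }
    return failures;
}
```

Calling it with a one-element array holding "/lib/libbz2.so.1" exercises
a single library; passing the full list of loaders gets closer to what
gdk-pixbuf-query-loaders does.  Older C libraries need -ldl at link time.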

Nathan Copa suggested in
http://lists.uclibc.org/pipermail/uclibc/2012-October/047059.html that
it might be related to commit 9b42da7d0558884e2a3cc9a8674ccfc752369610.
So I wrote a little patch to toggle that commit at run time with an
environment variable, NEW_START.  What I found is that prior to 9b42da7,
every DL_LIB_UNMAP call is in fact broken: all the lengths are wrong, so
all the munmap syscalls fail.  Since nothing actually gets unmapped, my
segfault goes away.  When I turn on the fix in 9b42da7, the segfault
reappears:

# NEW_START=1 LD_DEBUG=1 /usr/bin/gdk-pixbuf-query-loaders 
libpixbufloader-svg.la
...
do_dlclose():859: running dtors for library /lib/libbz2.so.1 at 
0x7f5df192ba26
Segmentation fault

Note that the segfault occurs at DL_CALL_FUNC_AT_ADDR, which is before
the munmap() of libbz2.so, as it should be.  So it looks like some
*previous* munmap removed the mapping that contains the dl_elf_fini
address for libbz2.so, leading to the segfault.
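
One way to test that hypothesis would be to log every (start, length)
pair handed to munmap() and then ask which of them covers the fini
address.  The following is only a sketch of the lookup; the struct, the
function name and the idea of keeping such a log are my assumptions and
none of this exists in libdl.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one munmap(start, len) call. */
struct unmap_record {
    uint64_t start;
    uint64_t len;
};

/* Returns the index of the first logged unmap whose range
 * [start, start + len) contains addr, or -1 if none does. */
int find_covering_unmap(const struct unmap_record *records, size_t n,
                        uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = records[i].start;
        /* Compare offsets so that start + len cannot overflow. */
        if (addr >= start && addr - start < records[i].len)
            return (int)i;
    }
    return -1;
}
```

A hit for the address printed by LD_DEBUG (0x7f5df192ba26 in the run
above) on a record that belongs to a different library would point at
the accounting error.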

Now I'm stuck and don't know how to proceed.  I asked Rich Felker of
musl, and I may be misunderstanding him, but it seems that these
unmappings are intrinsically dangerous.  So, should I be tracking down
some accounting error in the libdl code which is messing up the
mapping addresses, or should we just define DL_LIB_UNMAP as a no-op?

Aside: There is an independent problem in commit 9b42da7.  It assumes
that the start and end addresses for the munmap fit in 32 bits, which
does not hold on 64-bit targets.  The fix is:

-       unsigned int end = 0, start = 0xffffffff;
+       ElfW(Addr) end = 0, start = (ElfW(Addr))(~0ULL);
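
To illustrate why the narrow type matters, here is a small standalone
sketch.  It is not the libdl code: the structs, the function names and
the min/max loop are my assumptions about how such bounds are typically
accumulated, uint64_t stands in for ElfW(Addr) on a 64-bit target, and
unsigned int is assumed to be 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

struct segment { uint64_t vaddr; uint64_t memsz; };
struct span    { uint64_t start; uint64_t end; };

/* Bounds accumulated in 32-bit variables, as in the line being removed. */
struct span span_narrow(const struct segment *seg, size_t n)
{
    unsigned int end = 0, start = 0xffffffff;

    for (size_t i = 0; i < n; i++) {
        if (seg[i].vaddr < start)
            start = (unsigned int)seg[i].vaddr;
        if (end < seg[i].vaddr + seg[i].memsz)
            end = (unsigned int)(seg[i].vaddr + seg[i].memsz); /* high bits lost */
    }
    return (struct span){ start, end };
}

/* Same loop with address-sized variables, as in the proposed fix. */
struct span span_wide(const struct segment *seg, size_t n)
{
    uint64_t end = 0, start = ~(uint64_t)0;

    for (size_t i = 0; i < n; i++) {
        if (seg[i].vaddr < start)
            start = seg[i].vaddr;
        if (end < seg[i].vaddr + seg[i].memsz)
            end = seg[i].vaddr + seg[i].memsz;
    }
    return (struct span){ start, end };
}
```

With segments that sit above 4 GiB, span_narrow() never lowers start
from its 0xffffffff sentinel and keeps only the low 32 bits of end, so
any length derived from the two is meaningless, while span_wide()
returns the real bounds.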


-- 
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197

