segfault in dlclose() when lots of shared objects are dlopened
Anthony G. Basile
basile at opensource.dyc.edu
Mon Dec 22 15:07:15 UTC 2014
Hi all,
I hit this stubborn segfault when building librsvg on uClibc.
Specifically the problem occurred when running gdk-pixbuf-query-loaders.
This tool reads a libtool .la file and generates a cache of loadable
shared objects to be consumed by gdk-pixbuf. In my particular case,
this meant dlopening and dlclosing some 47 shared objects as
reported by "LD_DEBUG=1 /usr/bin/gdk-pixbuf-query-loaders
./libpixbufloader-svg.la 2>&1 | grep ^do_dlopen | grep ctors | wc -l"
Here's the backtrace of the segfault:
Program received signal SIGSEGV, Segmentation fault.
__deregister_frame_info (begin=0x7ffff22d96e0) at
/var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c:222
222
/var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c:
No such file or directory.
(gdb) bt
#0 __deregister_frame_info (begin=0x7ffff22d96e0) at
/var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2-fde.c:222
#1 0x00007ffff22c281e in __do_global_dtors_aux () from /lib/libbz2.so.1
#2 0x0000555555770da0 in ?? ()
#3 0x0000555555770da0 in ?? ()
#4 0x00007fffffffdde0 in ?? ()
#5 0x00007ffff22d8a2f in _fini () from /lib/libbz2.so.1
#6 0x00007fffffffdde0 in ?? ()
#7 0x00007ffff6f8018d in do_dlclose (vhandle=0x7ffff764a420
<__malloc_lock>, need_fini=32767) at ldso/libdl/libdl.c:860
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Line 860 of libdl.c is "DL_CALL_FUNC_AT_ADDR (dl_elf_fini,
tpnt->loadaddr, (int (*)(void)));" This suggests that the dtor function
at dl_elf_fini for libbz2.so.1 is not a valid address at this point. An
independent check dlopening/dlclosing libbz2.so.1 shows that the library
is okay.
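
For reference, an independent check of this kind could look roughly like
the sketch below. It is not the exact program I ran: the helper name
open_close_cycles and the choice of RTLD_NOW are assumptions made for
illustration. It only uses the standard dlopen()/dlclose()/dlerror()
calls from <dlfcn.h>.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not part of uClibc): dlopen() and dlclose() one
 * library `cycles` times. Returns 0 if every cycle succeeded, -1 on the
 * first failure. RTLD_NOW is an assumption; the real test may differ. */
static int open_close_cycles(const char *path, int cycles)
{
    for (int i = 0; i < cycles; i++) {
        void *handle = dlopen(path, RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
            return -1;
        }
        /* When the reference count drops to zero, dlclose() runs the
         * library's destructors and then releases its mappings. */
        if (dlclose(handle) != 0) {
            fprintf(stderr, "dlclose(%s): %s\n", path, dlerror());
            return -1;
        }
    }
    return 0;
}
```

Calling open_close_cycles("/lib/libbz2.so.1", 1) from a small main() and
linking with -ldl where the C library requires it exercises the same
ctor/dtor path for a single library in isolation.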
Nathan Copa suggested in
http://lists.uclibc.org/pipermail/uclibc/2012-October/047059.html that
it might be related to commit 9b42da7d0558884e2a3cc9a8674ccfc752369610.
So I wrote a little patch to toggle that commit on an environment
variable NEW_START. What I found is that before 9b42da7, all
DL_LIB_UNMAP calls are in fact broken: the lengths are wrong, so every
munmap syscall fails. Since nothing actually gets unmapped, my segfault
goes away. When I turn on the fix in 9b42da7, the segfault reappears:
# NEW_START=1 LD_DEBUG=1 /usr/bin/gdk-pixbuf-query-loaders
libpixbufloader-svg.la
...
do_dlclose():859: running dtors for library /lib/libbz2.so.1 at
0x7f5df192ba26
Segmentation fault
Note that the segfault occurs at DL_CALL_FUNC_AT_ADDR which is before
the munmap() of libbz2.so as it should be. So it looks like some
*previous* munmapping unmapped the dl_elf_fini address for libbz2.so,
leading to the segfault.
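
To illustrate how a bad munmap() request can fail while leaving the
mapping in place, here is a minimal sketch. It does not reproduce the
exact bad arguments libdl passed before 9b42da7 (I have not shown those
here); it only demonstrates two ways Linux rejects a munmap() call with
EINVAL: a start address that is not page aligned, and a zero length. The
helper name try_munmap is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous page, then call
 * munmap(page + offset, len). Returns 0 if munmap() succeeded, the errno
 * value if it failed, or -1 if the initial mmap() itself failed. */
static int try_munmap(size_t offset, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (munmap(p + offset, len) == 0)
        return 0;

    int err = errno;
    /* The failed call left the page mapped, so release it properly. */
    munmap(p, page);
    return err;
}
```

With this helper, try_munmap(1, 4096) and try_munmap(0, 0) both report
EINVAL on Linux, and in both cases the page stays mapped until the
cleanup call. That matches the observation that failing unmaps hide the
crash: code in a library that was never really unmapped stays callable.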
Now I'm stuck and don't know how to proceed. I asked Rich Felker of
musl, and I may be misunderstanding him, but it seems that these
unmappings are intrinsically dangerous. So, should I be tracking down
some accounting error in all the libdl code which is messing up the
mapping addresses, or should we just define DL_LIB_UNMAP as a no-op?
Aside: There is an independent problem in commit 9b42da7. It assumes
that the start and end addresses for the munmapping fit in 32 bits,
which does not hold on 64-bit targets. The fix is:
- unsigned int end = 0, start = 0xffffffff;
+ ElfW(Addr) end = 0, start = (ElfW(Addr))(~0ULL);
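
The effect of the narrow type can be seen in a small sketch. This is not
the code from 9b42da7: the function names are hypothetical, uint64_t
stands in for ElfW(Addr) on a 64-bit target, and the loop is only an
assumed "track the lowest address" pattern that matches the
initializers in the diff above.

```c
#include <stdint.h>

/* Lowest address tracked in a 32-bit accumulator, as with
 * "unsigned int start = 0xffffffff". Any address above 4 GiB never
 * compares lower than the initial value, so start is never updated. */
static uint64_t lowest_addr_32(const uint64_t *addrs, int n)
{
    unsigned int start = 0xffffffff;
    for (int i = 0; i < n; i++)
        if (addrs[i] < start)
            start = (unsigned int)addrs[i];
    return start;
}

/* Same loop with an address-sized accumulator, mirroring the proposed
 * "ElfW(Addr) start = (ElfW(Addr))(~0ULL)". */
static uint64_t lowest_addr_wide(const uint64_t *addrs, int n)
{
    uint64_t start = ~0ULL;
    for (int i = 0; i < n; i++)
        if (addrs[i] < start)
            start = addrs[i];
    return start;
}
```

For segment addresses such as 0x7ffff22d8000 and 0x7ffff22c0000, the
32-bit version returns 0xffffffff, while the wide version returns
0x7ffff22c0000.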
--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197