libdl usage count wrapping

Phil Estes estesp at linux.vnet.ibm.com
Tue Sep 23 02:45:20 UTC 2008


Recently I was looking into an issue where someone claimed pam was
segfaulting after heavy use (many calls to authenticate users, usually
around 5K calls).  My investigation led me to realize that the reference
counting done by dlopen() and dlclose() is not balanced: the usage
counts of the most widely needed "DT_NEEDED" libraries keep getting
incremented until they overflow "unsigned short usage_count".  This
leads to a nasty situation where libc.so is munmapped (because
usage_count has wrapped to 0), and the next call to a C runtime function
traps, of course.
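
To show the wrap in isolation, here is a minimal sketch.  It is not
uClibc code: struct fake_lib and leaks_until_wrap() are made-up names,
and only the type of the counter is taken from the description above.

```c
#include <limits.h>

/* Hypothetical stand-in for the loader's per-library record;
 * only the counter's type matches the one discussed above. */
struct fake_lib {
	unsigned short usage_count;
};

/* Count how many unmatched increments it takes for usage_count
 * to wrap around to 0, starting from `start` live references. */
unsigned long leaks_until_wrap(unsigned short start)
{
	struct fake_lib lib = { start };
	unsigned long leaks = 0;

	do {
		lib.usage_count++;	/* one leaked reference */
		leaks++;
	} while (lib.usage_count != 0);

	return leaks;
}
```

Where unsigned short is 16 bits, a library that starts with one
reference reaches 0 after USHRT_MAX (65535) leaked increments, at which
point a loader that unmaps on a zero count would treat it as unused.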

Since I was working with a 0.9.29 snapshot from 2006, I decided to see if
anything in SVN has changed that might affect this.  I was interested to
find changeset 17530 and the accompanying discussion
( http://uclibc.org/lists/uclibc/2007-January/017165.html ).  While
testing it I noted a roughly 10x improvement, since more of the
dependent libs are now added to the list that do_dlclose walks when
decrementing usage_count.  However, it is still not exact: libc.so's
usage_count continues to rise even with matching dlopen() and dlclose()
calls on each iteration of the example program (attached below), and it
now takes about 65K iterations of loading/unloading a lib to get a
segfault.

One way to 'watch' this in gdb is to set a breakpoint in libdl.c around
line 247 or so and use the "commands" interface to print the count each
time the breakpoint is hit:
commands <brnum>
silent
printf "%d : %s\n",(*tpnt1)->usage_count, lpntstr
cont
end

Now continue and watch libc's usage_count climb, and if you are patient
enough, you will get a segfault somewhere after 65,500 iterations.

Obviously one potential fix is to "handle" the wrap of usage_count in
ldso/dl-elf.c by checking for 0 after the increment and resetting it to
"near max", so that the wrap to zero that causes the segfault can never
occur.
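
As a rough sketch of that idea (not a patch against ldso/dl-elf.c; the
helper name usage_count_inc() is invented for illustration), a
saturating increment could look like this.  It checks before the
increment rather than after it, which has the same effect of never
letting the count reach 0 by wrapping:

```c
#include <limits.h>

/* Hypothetical helper: bump a reference count, but once it reaches
 * USHRT_MAX leave it there instead of letting it wrap to 0. */
void usage_count_inc(unsigned short *count)
{
	if (*count < USHRT_MAX)
		(*count)++;
}
```

This only protects against the segfault; it does nothing to make the
count correct, so a library whose count leaks would simply never be
unloaded.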

However, it would be interesting to know whether the uClibc maintainers
think usage_count is important enough for some of the core libs to
either (a) be correct, or (b) be protected from the segfault condition,
which is less likely than it used to be given the aforementioned changes
but still possible for long-running apps (like pam running on a system
with a very long uptime).  I'm slightly interested in trying to fix it,
but given that I'm no expert on all the various lists and pointers
employed by dlopen(), it seems like someone with some skill in the area
should make sure it's done right.  My hunch is that it has to do with
the init_fini list creation and the filtering out of RTLD_GLOBAL libs,
but I can't be sure without more debugging; the init_fini list seems to
be what is walked at dlclose when decrementing usage_count.

Thanks for any thoughts/input,
Phil Estes
estesp at linux.vnet.ibm.com

Here's my example program for creating the segfault condition; it can
obviously be doctored to load different libs, more libs, fewer libs, etc.

---stress-dlopen.c----
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

#define NUMLIBS 4
#define ITERS 50000

void *handles[NUMLIBS];
/*
char *names[NUMLIBS] = { "/lib/security/pam_deny.so",
                         "/lib/security/pam_warn.so",
                         "/lib/libpam.so.0" };
*/
char *names[NUMLIBS] = { "/usr/lib/libxml2.so.2",
                         "/usr/lib/libpcap.so",
                         "/usr/lib/libpng.so.2",
                         "/usr/lib/libxslt.so.1" };

/* Open every library in names[]; return the number of dlopen() failures. */
int openlibs(void)
{
	int i, errors = 0;

	for (i = 0; i < NUMLIBS; i++) {
		handles[i] = dlopen(names[i], RTLD_NOW);
		if (handles[i] == NULL) {
			errors++;
			fprintf(stderr, "%s\n", dlerror());
		}
	}
	return errors;
}

/* Close every open handle; return the number of dlclose() failures. */
int closelibs(void)
{
	int i, errors = 0;

	for (i = 0; i < NUMLIBS; i++) {
		if (handles[i] != NULL) {
			if (dlclose(handles[i]) != 0) {
				errors++;
				fprintf(stderr, "%s\n", dlerror());
			}
			handles[i] = NULL;
		}
	}
	return errors;
}

int main(void)
{
	int errors;
	unsigned int i, cnt = ITERS;

	for (i = 0; i < cnt; i++) {
		printf("\n Call ... %u/%u\n", i, cnt);
		errors = openlibs();
		printf("open errors: %d\n", errors);
		errors = closelibs();
		printf("close errors: %d\n", errors);
	}
	return 0;
}






