resolver broken in NPTL build

Denys Vlasenko vda.linux at googlemail.com
Sun May 16 04:00:56 UTC 2010


On Wednesday 12 May 2010 09:32, Timo Teräs wrote:
> The combination of
> 
> commit aab4df0fb51660300559f5f29290709db2f7bfee
>   resolv.c: add support for per thread res_state
> 
> commit cca45baf8353d1e338d232f5bdb2d1d6b357f1da
>   /etc/resolv.conf: support "timeout:n" and "attempts:n" options
> 
> .. and NPTL results in broken resolver in very annoying ways.
> 
> Now, it seems that most of the uclibc code does not work well
> if res_state is TLS variable. Technically, this is the correct thing
> to do since this gives proper per-thread resolving behavior, and
> it also makes the config options overridable per thread. This
> probably what apps expect as glibc does it too.

Do you have an example where per-threadness is used?

> But alas, most places use _res to sync up static global variables
> which results in breakage. It gets more or less randomly selected
> which threads options get applied. Also in case of multiple servers
> it looks like the retry logic is shared between all threads, e.g.
> two concurrent resolutions can make other resolvers skip nameservers
> due to shared "last_ns_num".

I see.

> And finally the timeout/attempts commit breaks the accumulated stuff
> horribly. What happens is:
>  1. multithreaded application startups, initializes resolver,
>     resolves things just fine
>  2. resolv.conf gets changed, application calls res_init
>     after res_init uclibc will call res_sync on all resolver functions
>     to refresh globals from the TLS variable _res
>  3. res_init was called only in one thread, so other thread's
>     _res contains all zeroes (yes, this is correct app usage:
>     res_init should be called only from one thread)
>  4. threads not calling res_init get broken resolver due to timeout
>     being set to zero
> 
> Now, one proper solution would be to:
>  1. make __open_nameservers return the configuration options
>  2. pass the config options struct to res_sync_func so it can do
>     the proper overrides from per-thread _res
>  3. remove the related globals and use locally config options from
>     __open_nameservers
> 
> But technically, the correct thing (as in glibc does this) is:
>  - res_init increases global "res_init timestamp"
>  - use _res (or pointer within there) for resolver retries etc.
>    get proper per-thread behavior
>  - resolvers functions call "maybe_init" which reinitialize the
>    TLS'ed resolver state (e.g. reload resolv.conf) if their res_init
>    timestamp is out dated

It looks very complex. Can we just use a shared structure?
That will be simple and small.

Anyone objects to this?

> As an immediate emergency kludge fix, the following might do:
> 
> resolv: fix options handling for TLS _res
> 
> If _res is in TLS (NPTL), it might not get initialized. Assume
> zeroes mean default values.
> 
> Signed-off-by: Timo Teräs <timo.teras at iki.fi>
> 
> diff --git a/libc/inet/resolv.c b/libc/inet/resolv.c
> index 320aec4..f8066d2 100644
> --- a/libc/inet/resolv.c
> +++ b/libc/inet/resolv.c
> @@ -2916,8 +2916,8 @@ static void res_sync_func(void)
>                         __nameserver[n].sa4 = rp->nsaddr_list[n]; /*
> struct copy
>  #endif
>         }
> -       __resolv_timeout = rp->retrans;
> -       __resolv_attempts = rp->retry;
> +       __resolv_timeout = rp->retrans ?: RES_TIMEOUT;
> +       __resolv_attempts = rp->retry ?: RES_DFLRETRY;
>         /* Extend and comment what program is known
>          * to use which _res.XXX member(s).

Comment why this is needed might be good.
-- 
vda


More information about the uClibc mailing list