[PATCH] pselect: Use linux pselect syscall when available
Rich Felker
dalias at libc.org
Sat Dec 19 04:44:13 UTC 2015
On Fri, Dec 18, 2015 at 07:39:19PM -0800, Nicolas S. Dade wrote:
> This supersedes the previous pselect patch from 16 Dec 2015.
>
> Linux has had a pselect syscall (pselect6) since 2.6.16. Using it
> rather than emulating it with sigprocmask+select+sigprocmask
> is smaller code, and works properly. (The emulation has
> race conditions when unblocked signals arrive before or
> after the select)
>
> The tv.nsec >= 1E9 handling comes from uclibc's linux select()
> implementation, which itself uses pselect() internally if the
> pselect syscall exists. I thought it would be good to do the
> same here.
>
> Note that although the libc pselect() API has 6 arguments,
> the linux kernel syscall has 7 arguments. There is an extra,
> somewhat vestigial, sizeof the signal mask argument.
>
> Signed-off-by: Nicolas S. Dade <nic.dade at gmail.com>
> ---
> libc/sysdeps/linux/common/pselect.c | 52 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 52 insertions(+)
>
> diff --git a/libc/sysdeps/linux/common/pselect.c b/libc/sysdeps/linux/common/pselect.c
> index bf19ce3..3f1dd28 100644
> --- a/libc/sysdeps/linux/common/pselect.c
> +++ b/libc/sysdeps/linux/common/pselect.c
> @@ -30,6 +30,57 @@ static int __NC(pselect)(int nfds, fd_set *readfds, fd_set *writefds,
> fd_set *exceptfds, const struct timespec *timeout,
> const sigset_t *sigmask)
> {
> +#ifdef __NR_pselect6
> +#define NSEC_PER_SEC 1000000000L
> + struct timespec _ts, *ts = 0;
> + if (timeout) {
> + /* The Linux kernel can in some situations update the timeout value.
> + * We do not want that so use a local variable.
> + */
> + _ts = *timeout;
> +
> + /* GNU extension: allow for timespec values where the sub-sec
> + * field is equal to or more than 1 second. The kernel will
> + * reject this on us, so take care of the time shift ourself.
> + * Some applications (like readline and linphone) do this.
> + * See 'clarification on select() type calls and invalid timeouts'
> + * on the POSIX general list for more information.
> + */
> + if (_ts.tv_nsec >= NSEC_PER_SEC) {
> + _ts.tv_sec += _ts.tv_nsec / NSEC_PER_SEC;
> + _ts.tv_nsec %= NSEC_PER_SEC;
> + }
> +
> + ts = &_ts;
> + }
> +
> + /* The pselect6 syscall API is strange. It wants a 7th arg to be
> + * the sizeof(*sigmask). However syscalls with > 6 arguments aren't
> + * supported on linux. So arguments 6 and 7 are stuffed in a struct
> + * and a pointer to that struct is passed as the 6th argument to
> + * the syscall.
> + * Glibc stuffs arguments 6 and 7 in a ulong[2]. Linux reads
> + * them as if there were a struct { sigset_t*; size_t } in
> + * userspace. There would be trouble if userspace and the kernel are
> + * compiled differently enough that size_t isn't the same as ulong,
> + * but not enough to trigger the compat layer in linux. I can't
> + * think of such a case, so I'm using linux's struct.
> + * Furthermore Glibc sets the sigsetsize to _NSIG/8. However linux
> + * checks for sizeof(sigset_t), which internally is a ulong array.
> + * This means that if _NSIG isn't a multiple of BITS_PER_LONG then
> + * linux will refuse glibc's value. So I prefer sizeof(sigset_t) for
> + * the value of sigsetsize.
> + */
> + struct {
> + const sigset_t *sigmask;
> + size_t sigsetsize;
> + } args67 = {
> + sigmask,
> + sizeof(sigset_t),
> + };
This might work if uclibc defines sigset_t not to have the ridiculous
expansion space for 1024 signals like glibc does, but it still seems
fragile. I'd use _NSIG/8. That's what we do in musl. (If the size you
pass to the kernel is wrong you get EINVAL or something like that.)
Rich
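
For readers following the thread, here is a minimal sketch of the variant suggested above: the same args67 struct as in the patch, but with _NSIG/8 as the size instead of sizeof(sigset_t). It is not code from uClibc or musl. The names raw_pselect6, normalize_timespec and KERNEL_SIGSET_BYTES are hypothetical, and the sketch assumes a Linux target where SYS_pselect6 exists and struct timespec matches the kernel's layout (for example a 64-bit system).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define NSEC_PER_SEC 1000000000L

/* Size in bytes of the kernel's signal mask, derived from _NSIG rather
 * than from the libc's (possibly much larger) sigset_t. */
#define KERNEL_SIGSET_BYTES (_NSIG / 8)

/* Hypothetical helper: fold whole seconds out of tv_nsec, as the patch
 * does, because the kernel rejects tv_nsec >= 1e9. */
static void normalize_timespec(struct timespec *ts)
{
    if (ts->tv_nsec >= NSEC_PER_SEC) {
        ts->tv_sec += ts->tv_nsec / NSEC_PER_SEC;
        ts->tv_nsec %= NSEC_PER_SEC;
    }
}

/* Hypothetical wrapper around the raw pselect6 syscall. Arguments 6 and 7
 * are packed into a struct and passed by pointer as the 6th argument. */
static int raw_pselect6(int nfds, fd_set *readfds, fd_set *writefds,
                        fd_set *exceptfds, const struct timespec *timeout,
                        const sigset_t *sigmask)
{
    struct timespec ts_copy, *ts = NULL;
    struct {
        const sigset_t *sigmask;
        size_t sigsetsize;
    } args67 = { sigmask, KERNEL_SIGSET_BYTES };

    if (timeout) {
        /* Work on a copy: the kernel may write the remaining time back. */
        ts_copy = *timeout;
        normalize_timespec(&ts_copy);
        ts = &ts_copy;
    }

    return (int)syscall(SYS_pselect6, nfds, readfds, writefds, exceptfds,
                        ts, &args67);
}
```

Calling raw_pselect6(0, NULL, NULL, NULL, &ts, &mask) with a zero timeout and an empty mask should return 0 immediately. If the size passed in args67 does not match what the kernel expects, the call fails with EINVAL, as noted above.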