O_NONBLOCK on stdin left set by child (using ash shell)

Denys Vlasenko vda.linux at googlemail.com
Wed Nov 11 00:08:48 UTC 2009


On Tuesday 10 November 2009 21:00, Johns Daniel wrote:
> On Tue, Nov 10, 2009 at 1:43 PM, Cathey, Jim <jcathey at ciena.com> wrote:
> >>It seems to me that it would be bad design to have such an intent.
> >>IOW, one app is changing the *parent shell* in order to pass a
> >>non-standard state on to the next app.
> >
> > Things like stty(1) do this all the time.  Seems to me that
> > it is up to any given program (such as a shell) to force I/O
> > mode bits into any states that are required for it to work
> > correctly, and they generally do this while leaving the rest
> > of the bits alone (inherited).  Not sure how this (ancient)
> > practice bears on the current problem.
> 
> I think the key here is that the shell should do what it takes to make
> things work correctly.

But everything *is* working correctly. The program sets O_NONBLOCK
on stdin and exits. Shell, being a stupid program, has no way
of knowing whether this was intended or not, is better to leave it as is.

Before 1.10.x, shell had no code to work around EWOULDBLOCK
error on read, so it was doing this:

preadfd(void)
{
        int nr;
        char *buf =  parsefile->buf;
        parsenextc = buf;

 retry:
#if ENABLE_FEATURE_EDITING
...
#else
        nr = safe_read(parsefile->fd, buf, BUFSIZ - 1);
#endif

        if (nr < 0) {
                if (parsefile->fd == 0 && errno == EWOULDBLOCK) {
                        int flags = fcntl(0, F_GETFL);
                        if (flags >= 0 && (flags & O_NONBLOCK)) {
                                flags &= ~O_NONBLOCK;
                                if (fcntl(0, F_SETFL, flags) >= 0) {
                                        out2str("sh: turning off NDELAY mode\n");
                                        goto retry;
                                }
                        }
                }
        }

Note that it was not doing it automatically, it ONLY did it
when it had to - when it was reading from stdin. IOW,
if you run a script, it would not do it, and buggy program
could still screw up stdin for subsequent programs.

Now (after 1.10.x) it is using nonblock_safe_read()
instead of just safe_read(), and thus does not need
to switch off O_NONBLOCK.

> The current problem is that the shell stops working correctly (if the
> previous command leaves the shell in a bad state). Even if an app was
> good enough to clean up after itself, we still don't cover the case
> where the app crashes.
> 
> The alternative would be to fix nearly all apps that use stdin (cat, vi, etc)!

Yes. (However how cat could know that O_NONBLOCK wasn't set intentionally?)

There are two ways: (a) detect and switch off O_NONBLOCK
as shown above, or (2) use code which works correctly
even on O_NONBLOCKed fd:

ssize_t nonblock_safe_read(int fd, void *buf, size_t count)
{
        struct pollfd pfd[1];
        ssize_t n;

        while (1) {
                n = read(fd, buf, count);
                if (n >= 0 || errno != EAGAIN)
                        return n;
                /* fd is in O_NONBLOCK mode. Wait using poll and repeat */
                pfd[0].fd = fd;
                pfd[0].events = POLLIN;
                safe_poll(pfd, 1, -1);
        }
}

Or to bug kernel folks to finally do something about broken API.
They might provide a way to do nonblocking reads without
setting troublesome O_NONBLOCK bit. Then applications
can slowly migrate to this non-buggy API.

Yes, I know, this is boring, but unless we bug them enough,
nothing will happen.

--
vda


More information about the busybox mailing list