FEATURE_PREFER_APPLETS should fail gracefully.

Rob Landley rob at landley.net
Wed Feb 2 22:59:37 UTC 2011


On 02/02/2011 09:17 AM, Denys Vlasenko wrote:
> In the case of "rm FILE", fork+unlink syscalls are much heavier than
> just unlink syscall in the typical case of FILE's dentries and inodes
> being in caches already.

Hmmm...

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>

void die(char *error)
{
  fprintf(stderr,"%s\n", error);
  exit(1);
}

int main(int argc, char *argv[])
{
  int fd, i;

  for (i=0; i<100000; i++) {
    fd = open("walrus", O_CREAT|O_EXCL, 0777);
    if (fd<0) die("open");
    close(fd);

    if(unlink("walrus")) die("unlink");
  }
  return 0;
}

$ time ./a.out

real	0m1.073s
user	0m0.010s
sys	0m1.020s

Now add

  if (!fork()) {

before the open() and

    _exit(0);
  } else wait();

after he unlink(), and...

$ time ./a.out

real	0m20.199s
user	0m0.300s
sys	0m7.600s

Yeah, factor of 20 is fairly significant.  Of course mine didn't have
any command line option parsing overhead or similar, it was _just_
measuring the overhead of "fork plus wait" vs "open plus close plus
unlink", and doing a microbenchmark in a very tight loop seldom tells
you useful stuff about real world usage patterns.  (Then again faulting
in additional physical pages vs re-using existing heap...  I'd have to
benchmark it _in_busybox_, which is more effort than I want to go to
right now. :)

>> NOFORK means "shell builtin".
> 
> This is not a consensus right now.

I note that I have quite a history of "don't ask questions, post errors"
style influence on projects.  When I suggested a "patch penguin", Linus
decided to adopt a distributed source control system and eventually
write his own from scratch.  (And of course he was right.)

So at this point, I expect people to take "this is what I intended" with
a large grain of salt. :)

> NOFORK now is an optimization,
> and it is used by xargs, find, and a few more applets via
> spawn_and_wait() function,

Um, isn't xargs pretty much always on the right side of a pipeline?  How
would nofork work in that case?  (If it's not a process, how do you
redirect its stdin..?)

> not just by shells.
> And shell builtins are separate beasts right now. Every time
> I look at shell internals I see that merging those two aren't
> as easy as it might appear at the first glance.

I got shell builtins to work as part of the normal applet table in
toybox, but I designed my command infrastructure with that in mind in
the fist place.

In toys/toysh.c:

USE_TOYSH(NEWTOY(cd, NULL, TOYFLAG_NOFORK))
USE_TOYSH(NEWTOY(exit, NULL, TOYFLAG_NOFORK))

void cd_main(void)
{
        char *dest = *toys.optargs ? *toys.optargs : getenv("HOME");
        if (chdir(dest)) {
                perror_msg("chdir");
                toys.exitval++;
        }
}

void exit_main(void)
{
        exit(*toys.optargs ? atoi(*toys.optargs) : 0);
}

Generally the hard part of this sort of design is _not_ doing stuff that
would screw it up.  (The not-doing-things part is actually what I spent
most of my time on.)

> If we want to redefine meaning of NOFORK, we need to document
> it in docs/nofork_noexec.txt, to make sure at least developers
> are on the same page about them. We need a plan how to merge
> them with builtins.

Basic design work.  You need to know what your infrastructure's supposed
to do before implementing or deploying it.

If you prefer whatever your new meaning of NOFORK is, that's your call.
 It's your project now.  However, using NOFORK to mean "what a shell
builtin needs" worked pretty well for me.  Possibly "BUILTIN" is a
better name than NOFORK...

>>  NOEXEC means "I know for a fact we don't
>> need to reinitialize global state the previous applet may have stomped
>> on".  Note that we don't even have to be careful about freeing memory
>> and closing files, since the OS does that.  I'd need to look more
>> closely at the _exit() vs exit() thing, but if the child and parent
>> aren't doing a CLONE_FILES thing then as long as stdio got flushed
>> before the fork it's probably ok...?
> 
> Yes.
> 
> NOEXEC and NOFORK generally suffer from being clever optimizations
> which need accurate coding and much testing, but aren't getting it -
> people just disable FEATURE_PREFER_APPLETS when they hit
> the first bug, I suspect usually without even reporting it.
> Therefore these features tend to be somewhat broken.

"When in doubt, use brute force" - Ken Thompson.

I was initially attracted to busybox by the simplicity.  The _code_ was
small, and easy to read.  The size of the resulting binary was actually
fairly secondary to me, I was already booting knoppix from a 600
megabyte CD.

The cleverness we had (and the cleverness I tried to build into toybox)
was all about sharing code, meaning "we implement this once, and then
the _rest_ of the code can be simpler because stuff is magically done
for it, rather than having to reimplement things in multiple places".
It was Brian Kernighan's "Single Point of Truth":

  http://www.faqs.org/docs/artu/ch04s02.html

It was the magic of the automatic transmission: it's a very complicated
machine but it means you never have to shift or use the clutch while
driving which removes about 1/5 the complexity of the total process, and
is thus a net win.  (The command option parsing code is like that.  I
optimized mine to the point that commands didn't even _call_ it, it
happened before they were called and they just referred to global
variables containing the results.  Writing a new command involved coming
up with an option string and defining a structure to hold the results.
Plus flag #defines if you were feeling posh, and I kept meaning to
design something that would automatically create THOSE for you too. :)

I spent rather a lot of my tenure on busybox taking stuff that already
worked and figuring out how to do a simpler implementation of it.  I
didn't rewrite bunzip2 to make it smaller (although that was a nice side
effect), and my initial implementation _wasn't_ faster (it was actually
about 10% slower, the biggest chunk of which was because I put a modulus
in an inner loop).  But it was a whole lot simpler.

The reason I never extensively banged on the tar or dpkg or shell
implementations because they weren't simple, to fix them I'd have to go
reimplement the whole thing to be small and clean and straightforward,
and I was never able to set aside a large enough block of time for any
of those.  (I was working on the ext2 stuff when Bruce happened.)

A big part of the reason I haven't particularly wanted to get my hands
dirty on BusyBox again is that what attracted me to it in the first
place is no longer there.  It is now full of clever optimizations I
don't understand, aimed at doing things _other_ than reducing the
complexity of the rest of the program.  My primary goal of simplicity is
now in fourth place behind "size", "features", and "speed".  I
occasionally accepted larger binary size to keep the implementation
simple, and more than once I turned down features to keep the project
simple.  This NOFORK implementation was sacrificing simplicity for
speed, meaning simplicity _has_ to be (at best) in fourth place.

These days, the whole of BusyBox is a giant cross-connected hairball
where "main.c" vanished into libbb/ somewhere and every applet_main() is
listed _twice_ for no readily apparent reason with all sorts of magic
MAIN_EXTERNAL and UNUSED_PARAM annotations that can't possibly be
actually _necessary_.  And I'm sure there are reasons for them, just as
I'm sure the gnu project's code has reasons it put in all its' extra
complexity: I just don't care what those reasons are.

If I wanted that, there are plenty of other implementations providing it.

> 
>>> How does this look?
>>
>> Not really an improvement.  (And all caps function name is scar tissue.)
> 
> Since it can be a macro in some configurations, I'd rather
> let user know it that it is maybe a macro.

Is there any way to make it not be a macro?

And while some lower case names are macros, ALL_CAPS_NAME generally
doesn't have any "maybe" about its macro-ness.

> The previous code had a trivial macro and a lowercase-named
> function exactly to address your new complaint, but then you
> complained about needless level of indirection.
> Please pick one. You can't have both.

Why not?  What was wrong with having users call the xexecvp() wrapper
directly (like they used to), puting the appropriate if
(ENABLE_FEATURE_PREFER_APPLETS) test inline in the function, and having
the noexec text also be in there to make a function call instead of
execing as necessary?

Callers want "run this command" and maybe "don't return from failure".
I don't see why they should care that much about the implementation
details, but I haven't looked at all the call sites recently...

Oh well, I no longer know what I'm talking about with this codebase.

>> I remember when we had something like bb_exec(), and channeled
>> everything through it.  Having two bb_exec variants where one isn't a
>> convenience wrapper around the other would make me step back and go
>> "why?" even without the #ifdef.  However, I haven't really got enough
>> context to step back and go "why" anymore.
> 
> In a few places, execl is much easier to use than execv.
> And wrapping execl into execv portably is impossible IIRC
> (am I wrong?). That's why we have both.

I note that there HAS to be a way to do this since the system call takes
the execv arguments, not execl arguments, so the C library is going to
be converting them.  Let's see what uClibc does...

        n = 0;
        va_start(args, arg);
        do {
                ++n;
        } while (va_arg(args, char *));
        va_end(args);

        p = argv = (char **) EXEC_ALLOC((n+1) * sizeof(char *), size,
EXEC_FUNC_COMMON);

        p[0] = (char *)arg;

        va_start(args, arg);
        do {
                *++p = va_arg(args, char *);
        } while (--n);
        va_end(args);

        n = execve(path, (char *const *) argv, __environ);

That might be smaller rolled up in a for loop, but let's back up and
question our assumptions here:

In toybox I only implemented the one wrapper:

// Die unless we can exec argv[] (or run builtin command).  Note that
// anything with a path isn't a builtin, so /bin/sh won't match the
// builtin sh.
void xexec(char **argv)
{
        toy_exec(argv);
        execvp(argv[0], argv);
        error_exit("No %s", argv[0]);
}

When I wanted to call something inline, I did this:

  char *args[] = {"command", "one", "two", "three", NULL};
  xexec(args[0], args);

And when I didn't want the error_exit(), I just called toy_exec and
execvp() one after the other.

It's possible that the code could have been smaller if I'd been more
clever.  But it probably wouldn't have been simpler.  (My common code
was there to simplify things, not to hide what I was doing.)  I didn't
do an xexecl or a non-path version because I didn't want two code paths
doing approximately the same thing.  That was a bigger consideration for
me than whether the result was actually smaller and faster and so on.

But I was going for simple code.  You're going for small binary size
that runs fast and has lots of features.  If that's what you want,
bouncing off several different files with macros redefining things along
the way may serve your purposes better than anything I would be
interested in looking at.

Rob


More information about the busybox mailing list