The getopt long thing breaks the tree.

Rob Landley rob at landley.net
Fri Jun 2 15:54:08 UTC 2006


On Thursday 01 June 2006 12:39 am, Rich Felker wrote:
> On Wed, May 31, 2006 at 11:24:59PM -0400, Rob Landley wrote:
> > On Wednesday 31 May 2006 9:40 pm, Rich Felker wrote:
> > > On Wed, May 31, 2006 at 05:27:58PM -0400, Rob Landley wrote:
> > > > > What i, personally, care about is standard compliance; There simply
> > > > > is no such thing like getopt_long nor "struct option".  See
> > > > > http://www.opengroup.org/onlinepubs/009695399/functions/getopt.html
> > > >
> > > > There's no such thing as mount in the darn standard either.  You
> > > > should stop using it immediately.
> > >
> > > The standard omits mount for a very good reason: not that it should
> > > not exist, but that specifying such low-level, administrator-specific
> > > things is not the job of the standard but of the particular
> > > implementation.
> >
> > It is not a complete spec, and it doesn't say we can't rely on things
> > that aren't in the spec.
>
> It doesn't say you can't, but it says that programs that do are not
> strictly conforming applications.  Whether you care or not... 

"Don't care" implies a mere lack of caring, where what I actually have is a 
noticeable amount of negative caring here.  I care in the opposite direction.

Busybox is not and never has been a "strictly conforming" application in 
that sense.  We don't even plan to _provide_ a strictly conforming 
environment since we're not doing the internationalization stuff, so just 
about every applet has a set of environment variables we're supposed to care 
about, but don't.

However, we plan to provide, as one of the configuration options, a reasonably 
accurate SUSv3 environment with only documented deviations.  This doesn't 
mean we'll run in one, it means our apps behave according to the descriptions 
in the spec, with specific exceptions such as no internationalization 
support.

Don't confuse the two.

> > There's a difference between the environment we _provide_ and the
> > environments in which we _build_ and _run_.  Pointing to the spec for the
> > environment we provide, I'll look at.  Pointing to the spec about the
> > environment in which we build and run is something about which I
> > _DO_NOT_CARE_.
>
> .. I think we see. :)

A) If you ignore what I say long enough, I'll start ignoring what _you_ say by 
putting you on my spam filter.

B) You are aware that if I'm pushed enough, I'll delete platform.h and make 
_sure_ the project is Linux-only?  I don't want to do that and I think that 
would upset Bernhard, but if it persistently becomes more trouble than it's 
worth and attracts idiots who don't take a hint when it's delivered with a 
hammer to the cranium, I have _mountains_ of escalation I can do here.

> > The environment in which we build and run is Linux.
>
> Linux is not a single environment, it's a kernel.

Says Richard Stallman.  He is wrong.  There's a standard collection of 
packages supported by Red Hat, SuSE, Debian, Gentoo, Slackware, Knoppix, 
Ubuntu, Yellow Dog, and tons of other distros.  The common collection of 
applications that build and run on all of them is building on the Linux 
platform.

Richard's still upset because the GNU project died around 1988 when it 
abandoned its monolithic kernel design for a microkernel design it couldn't 
implement.  He can't let go.  Linux was the successor to a bunch of stalled 
or blocked efforts (the GNU project, comp.os.minix, the Berkeley CSRG), some 
of which later revived (Minix had a GPL release, several other follow-ups of 
the Berkeley CSRG are still viable or even successful, in the case of MacOS 
X).

Linux inherited developers and technology from many sources, but it's its own 
project.  The only GNU/Linux distribution is Debian because Debian was taken 
over by the FSF fairly early on and was the official FSF distribution of 
Linux until Debian forked away from the FSF very early because nobody can get 
along with Richard for long.

If you care: http://www.faifzilla.org/ch10.html

Which is less than half of the story, by the way.  (Follow libc5 to libc6, 
Werner Almesberger is a Cygnus guy who can't stand Richard Stallman.  The FSF 
anointed a Linux project as successor and handed over the name because the 
corresponding GNU project was _dead_.  Just like they did when egcs became 
gcc 2.95, because the fsf-developed gcc was so dead it was starting to 
_smell_ so they handed over maintainership to the fork everybody was using.)

Busybox 1.00 was a piece of Linux software.  It was built on gcc 2.95 or 
greater, with glibc 2.1 or uClibc 0.9.26 or greater under GNU Make 3.79.1 and 
the 2.2 or newer Linux kernel.  That's historical fact, that's what it 
NEEDED.  Since then we've moved towards C99 but are still GCC-centric (3.2 
now), and our C libraries are expected to be newer (glibc 2.2 or uClibc 
0.9.27), and linux-2.4, and there's a push to require Make 3.80 that I'm 
still resisting but it's only a matter of time.

And you'll notice, every single release since 1.00 has been put together by me.  
I didn't become maintainer because Erik suddenly decided I should be 
maintainer.  I'd been acting as maintainer for most of a year already, and I 
_did_ make some of the earlier decisions about portability since Erik wasn't 
around to do so.

Shaun Jackman was the first person I was aware of to start poking non-Linux 
changes into the tree (newlib with libgloss).  I've been making noises about 
better mmu-less support for a while, and slowly working on it.  There's been 
a BSD support patch in bugs.busybox.net (netbsd?) for over a year that I've 
totally ignored because it was too intrusive.

Trying to clean that up into a policy, I introduced platform.h and said it 
would be a good idea to run on things like MacOS X if it's easy to do so and 
we can hide the changes to things like platform.h.  I have no problem with 
running on Digital Unix or Cygwin either.  But NOT if it hurts our ability to 
run on Linux in the slightest.

But you don't attack the standard Linux environment as stupid and insist that 
we should PORT busybox to MacOS X and rebase with that as our new standard 
platform.  That's where it ENDS.  Busybox is first and foremost a Linux 
project, and it always will be.  We don't stop using glibc extensions just 
because they're glibc extensions.  That is not a reason.  If your platform 
hasn't got dprintf(), IMPLEMENT IT.  I put in the platform.h define because I 
know there's a name conflict, fine.
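
For illustration, here is a minimal sketch of what supplying dprintf() on a 
platform that lacks it could look like.  The name fallback_dprintf() is 
hypothetical (picked so it cannot clash with a libc symbol) and is not taken 
from the BusyBox tree; the sketch assumes a C99 vsnprintf() and a POSIX 
write(), and it does not retry on EINTR.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical fallback, not BusyBox code: format into a buffer, then
 * write() the result to the file descriptor. */
int fallback_dprintf(int fd, const char *fmt, ...)
{
    va_list ap;
    char stackbuf[256];
    char *buf = stackbuf;
    int len, done = 0;

    /* First pass: format into the stack buffer and learn the full length. */
    va_start(ap, fmt);
    len = vsnprintf(stackbuf, sizeof(stackbuf), fmt, ap);
    va_end(ap);
    if (len < 0)
        return -1;

    /* Output was truncated: allocate exactly enough and format again. */
    if ((size_t)len >= sizeof(stackbuf)) {
        buf = malloc((size_t)len + 1);
        if (buf == NULL)
            return -1;
        va_start(ap, fmt);
        vsnprintf(buf, (size_t)len + 1, fmt, ap);
        va_end(ap);
    }

    /* write() may be partial, so loop until everything is out. */
    while (done < len) {
        ssize_t n = write(fd, buf + done, (size_t)(len - done));
        if (n < 0) {
            len = -1;
            break;
        }
        done += (int)n;
    }

    if (buf != stackbuf)
        free(buf);
    return len;   /* bytes written, or -1 on error */
}
```

A platform header could then map dprintf onto a function like this only on 
systems whose libc does not already provide one.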

The current discussion never went through the stage "Since busybox uses 
getopt_long(), should we implement getopt_long() on non-linux platforms that 
want to run busybox?"  It never even came up.  And that's wrong.

The suggestion that we can make busybox smaller by making getopt_ulflags() 
implement its own getopt behavior is a good one.  It can make busybox smaller 
ON_LINUX, and it avoids side effects like argument reordering that are bad 
ON_LINUX.  And that's a convincing argument.
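
To make the reordering point concrete, here is a rough sketch of a short-option 
parser that only reads argv and stops at the first non-option argument.  
Everything in it (struct mini_opt, mini_opt_init(), mini_getopt()) is a 
hypothetical illustration, not the actual getopt_ulflags() code, and it 
handles only single-letter options with an optional ':' argument marker.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state; not BusyBox's getopt_ulflags(). */
struct mini_opt {
    int ind;          /* index of the next argv element to examine */
    int pos;          /* position inside a cluster such as "-xvf" */
    const char *arg;  /* argument of the last option that takes one */
};

/* Start scanning at argv[1]. */
void mini_opt_init(struct mini_opt *st)
{
    st->ind = 1;
    st->pos = 0;
    st->arg = NULL;
}

/*
 * Return the next option character, '?' for an unknown option or a
 * missing argument, or -1 at the first non-option argument or "--".
 * argv is only read, never permuted.
 */
int mini_getopt(struct mini_opt *st, int argc, char *const argv[],
                const char *spec)
{
    const char *cur, *hit;
    int c;

    st->arg = NULL;
    if (st->pos == 0) {
        if (st->ind >= argc)
            return -1;
        cur = argv[st->ind];
        if (cur[0] != '-' || cur[1] == '\0')
            return -1;               /* plain word or lone "-": stop */
        if (strcmp(cur, "--") == 0) {
            st->ind++;               /* skip the terminator itself */
            return -1;
        }
        st->pos = 1;
    }

    cur = argv[st->ind];
    c = (unsigned char)cur[st->pos++];
    hit = (c == ':') ? NULL : strchr(spec, c);

    if (hit != NULL && hit[1] == ':') {
        /* Option takes an argument: "-ofile" or "-o file". */
        if (cur[st->pos] != '\0') {
            st->arg = cur + st->pos;
            st->ind++;
        } else if (st->ind + 1 < argc) {
            st->arg = argv[st->ind + 1];
            st->ind += 2;
        } else {
            st->ind++;
            st->pos = 0;
            return '?';              /* argument is missing */
        }
        st->pos = 0;
        return c;
    }

    /* End of this cluster: move on to the next argv element. */
    if (cur[st->pos] == '\0') {
        st->ind++;
        st->pos = 0;
    }
    return hit != NULL ? c : '?';
}
```

Given "prog -a file -b", this returns 'a' and then -1 with st.ind pointing at 
"file", leaving "-b" where it was.  glibc's getopt() by default permutes argv 
so that "-b" would be parsed as an option too.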

Reducing the requirements of other platforms to implement Linux APIs if they 
want to run busybox is a fringe benefit.  Nothing more.

> The environment 
> you're talking about is GNU/Linux,

No, it's Linux.  If I build a system with uClibc, BusyBox, dropbear, X.org, 
and the Linux kernel, it hasn't got a single GNU package in the final 
deployed system.  How can that possibly be GNU/Linux?

I may have used GNU tools to build it (unless I grab icc or some such), but 
that's like saying the Windows version of The Gimp is a Microsoft product 
because it was built with Visual C.  Watcom didn't own OS/2 for the Power PC 
either.

> which I don't have any interest in 
> running anymore due to getting thoroughly sick of it after many years
> of dealing with all the hacks and bloat.

I'm trying to come up with better replacements for portions of it too.

This trend is normal.  When the mainline gets bloated enough, a slimmed-down 
fork emerges.  Dropbear vs openssh.  Galeon vs Mozilla (and Firefox vs 
Galeon, and it's still kinda bloated.)  xfce vs kde/gnome.  uClibc vs glibc, 
and BusyBox vs most of the GNU tools.

Sometimes the forks are internal, such as the modular x.org rewrite (based on 
what was "tinyx"), or Matt Mackall's -tiny work going into the Linux kernel.  
That's one of the advantages of a modular design, you can replace a component 
with a better implementation.

> > There are specs out there that say what
> > a Linux environment should look like.  The Linux Standard Base is one of
> > them.
>
> The LSB is among the most idiotic "specs" I've ever read.

Didn't say it was a good spec. :)

> It's a waste 
> of hundreds of pages saying essentially "you must have an ABI
> identical to glibc's".

Ok, the spec there then is the glibc documentation.  *shrug*.

> Any spec (such as LSB or SVID) that includes 
> ABI is inherently stupid (from my point of view in case that wasn't
> clear) since ABI is obviously an implementation issue and not
> something that needs to be standardized.

So TCP specifying what the on-the-wire protocol should look like (it's binary) 
was a bad thing?  Linux still runs (statically linked) binaries that ran 
under 0.01.  They disagree with you.

> Moreover if that's not enough LSB actually contradicts POSIX/SUSv3
> intentionally due to bugs in Linux. It changes the rules so that the
> buggy Linux behavior is not considered buggy. :) Unfortunately this
> actually prevents the bugs in Linux from being fixed. :(

"The nice thing about standards is that there are so many to choose from." - 
Andrew S. Tanenbaum, many years ago...

My lack of caring is deep and profound.  I'm not a fan of the LSB.  There are 
multiple conflicting standards.  Wheee.  The application section of SUSv3 is 
incomplete, but vaguely sane.  (Only vaguely: We're not implementing sccs.)  
And thus it is the frame of reference from which we can document how we diverge 
when we get around to auditing things.

> > Working around specific deficiencies in the Linux emulation of other
> > platforms are just that: workarounds.
>
> Other platforms have no reason to emulate "Linux" (GNU/Linux). Doing
> so will make them necessarily bloated and bogged down by legacy crap.

I have no reason to care about those other platforms that are neither Linux 
nor remotely compatible with it.

> > You're bitching about an upstream spec that
> > doesn't specify _init_, so by itself it won't even _boot_.
>
> It doesn't specify init because programs running on the system have no
> use in knowing what init is or whether it even exists.

I know WHY the specs are incomplete.  But the fact remains that no 
specification perfectly matches the real world on any system that isn't a toy 
in a laboratory or some overbred ISO9002 piece of crap along the lines of 
ADA.

> This is the 
> correct way to write a standard. You do not specify the things that
> people should not be relying on, and then the implementation is free
> to do them in the way it deems most reasonable.

And thus we use getopt, which is available on Linux.

> Because POSIX/SUSv3 are a proper standard, unlike LSB/SVID, I can make
> my own init program that just enables auto-reaping of children, runs a
> single script, and sleeps for all eternity, and still have a compliant
> system. It suits my needs and there's no legitimate reason for any
> software but the boot scripts to care about how init works, so why
> not?

Busybox can be configured to violate any conceivable standard eight ways from 
sunday.  You can configure busybox to provide only the applets that AREN'T 
mentioned in SUSv3, pretty easily, and remove required features from lots of 
the ones that are.  This has always been the case, and should continue.

> > > On the other hand, it's reasonable to assume that apps
> > > which have nothing to do with low-level, implementation-specific
> > > issues like mounting filesystems and configuring the network should be
> > > able to get by with only the standard interfaces, and work
> > > out-of-the-box on any (mostly-) posix compliant system.
> >
> > If you talk about "this test case breaks on MacOS X", that's real.  If
> > you say "this does not conform to the spec" I'll show you a spec it
> > conforms to, even if it's the gcc man page or Johnathan Corbet's Linux
> > device drivers book.
>
> I've said multiple times: what I talk about is "this test case breaks
> on my implementation of the C library".

Which inconveniences nobody in the world but you.

> My implementation is based 
> exclusively on a reading of SUSv3, and aside from the fact that I
> insist on never imposing arbitrary limits or unbounded time/memory use
> when it's not necessary (I consider these quality/security issues), it
> intentionally does not provide anything beyond what the standards
> require.

Good for you.  Why should I care?

> This is not pedantry for the sake of pedantry.

We disagree on this.

> It's a matter of 
> keeping the implementation small, fast, and simple, and of ensuring
> that implementation may later be replaced by better ones which work in
> a different manner yet which cannot have the same (extended) behavior
> as the original code.

I care about BusyBox working with glibc and uClibc because real world people, 
right now, want to deploy that.  I'm interested in possibly working with 
klibc in future because that's getting bundled into the Linux kernel (it's in 
-mm now) and thus will be seeing wider use.

I'm vaguely interested in working with MacOS X's C library for the simple 
reason that there are tens of millions of seats of that out in the wild, and 
it's not _that_ far from where we are now.

I'm not interested in working with The Hurd, which already has way more seats 
than your thing does.

> I can cite several major examples in glibc where 
> adding nonstandard functionality to various functions has made it
> difficult (coding-wise, or often memory/performance-wise!) to update
> the functions to conform to modern standards (see fnmatch for one,
> uhg!!). Overspecification is the bane of code because it essentially
> locks you into one implementation, and over time that implementation
> will eventually become undesirable.

When we deploy against uClibc or glibc, we use what's available to make our 
code smaller in real world systems.

You are not talking about real world systems that are out there right now.  
Come back when you are.

> > > I apologize that you feel like these issues are wasting your time.
> >
> > No, I feel they're introducing new goals to the project.  We try to
> > provide a SUSv3 environment, but A) that's just an config option, B) what
> > we provide it on is Linux and things that are more or less compatible
> > with Linux.  Changes to support non-linux platforms which require such
> > extensive changes to BusyBox that they can't be confined to something
> > like platform.h may not be worth doing.
>
> Overall I agree. I hate platform-based ifdefs and the like. The
> minimum I'd like to see (and which I have been submitting patches for)
> is to make all the applets that have no dependence on Linux-specific
> functionality so that they work on any SUSv3-compliant system.
>
> A slightly higher level that would also be nice is to make some of the
> Linux-specific applets less dependent on the particular libc used, so
> that they might work with dietlibc, newlib, my implementation, or
> other weird stuff out there that could be useful in embedded
> applications. I do NOT propose doing this with ifdefs, but instead
> with minor changes that do not break glibc/uClibc but allow the code
> to work with non-glibc-compat libs as well.

I'm interested in looking at these.  (This is not the same as automatic 
acceptance.)

Allowing uClibc to configure out extra stuff we don't need is a good thing if 
it can save space on Linux.  That is what I care about.

> > It's the API Linux has, therefore it's what busybox should be
> > primarily using.
>
> It's the api GNU libc has. Last I checked there's no __NR_getopt_long
> in asm/unistd.h. :)

I am not against rewriting our getopt_ulflags() to not rely on libc's 
getopt().  I've said this.  I was actually pondering doing that myself about 
a year ago (whenever I first noticed that tar xvjfC didn't work on uClibc), 
although at the time touching Vladimir's code would have triggered a 
hissy-fit from the prima donna, so I just didn't go there.  And since then, I 
haven't had time to do it personally.

It may be possible for you to achieve your goals, but when I say the reasons 
you're putting forward to do something aren't convincing to me, all you do by 
hammering on those same reasons harder is get me to dismiss your opinions 
completely.

I am _constantly_ asking "Do we really want to do that, why do we want to do 
that, what's the benefit of this, is there a better way, is there something 
else entirely we should do instead that renders this irrelevant, what else 
would this allow us to do..."  There are conflicting goals ALL THE TIME, and 
I'm trying to resolve things so multiple different goals get met.  Focusing 
in on a tiny area to the exclusion of all else is fine, but don't expect me 
to always agree with you.

I'll happily tell you what my priorities are, and suggest how you can get your 
priorities without screwing up mine.  Dismissing my priorities does not seem 
to me to be a useful move on your part.

Standards compliance is _a_ goal but not _the_ goal, and there are many things 
that can be called standards compliance.  We currently have the goal to 
_provide_ as close to an SUSv3 command line environment as we reasonably can 
without bloating the code, and to document the inevitable divergences where 
we decide not to go all the way.  This says NOTHING about the environment we 
run in, and this says nothing about other specs.  And it's also not a very 
high priority goal, we've been working towards it for a while and have made 
progress but we're not going to drop everything to focus on just that any 
time soon.

> Rich

Rob
-- 
Never bet against the cheap plastic solution.


