cross compile issues
Rob Landley
rob at landley.net
Sun Mar 7 03:59:55 UTC 2010
On Saturday 06 March 2010 13:53:44 Denys Vlasenko wrote:
> > > The only change here is that __s64 and __u64 are typedef'ed in some
> > > cases. I did it because a user reported it did not work for him until
> > > he added it.
> > >
> > > Do you think it's wrong?
> >
> > Yes, I think it's wrong. The toolchain that had a problem with that is
> > clearly broken, and cluttering up busybox's code for a brittle workaround
> > for a specific obviously broken toolchain isn't an improvement.
> >
> > The __s64 and __u64 types are kernel internal types. Either they should
> > be cleaned out of the kernel headers by whatever's replacing make
> > headers_install for your toolchain/distro, or they should be #defined by
> > those kernel headers ala this chunk from 2.6.32's loop.h:
> >
> > #include <asm/posix_types.h> /* for __kernel_old_dev_t */
> > #include <linux/types.h> /* for __u64 */
> >
> > /* Backwards compatibility version */
> > struct loop_info {
> > int lo_number; /* ioctl r/o */
> > __kernel_old_dev_t lo_device; /* ioctl r/o */
> >
> > Code that #includes linux/types.h shouldn't have to manually #define
> > __u64. If it does, its headers are broken.
>
> Everything is broken to some extent ("every nontrivial program has bugs").
And they should fix it. We are small and simple. That's the PURPOSE of
BusyBox. Introducing bugs and brittleness into our code when the problem is
easy to fix in _theirs_ defeats the entire purpose of BusyBox.
For example, there are 8 gazillion broken cross compiler build environments
out there. Every one of them is broken in a slightly different way. It's
IMPOSSIBLE to work around all of those, because you get to the point where
your workarounds start breaking other things.
Simplicity of implementation has always been at _least_ as big a design goal for
BusyBox as small size. Way back when, they used it to run the display at NORAD
not because they were short on space or on CPU but because they could audit
every line of it and understand what it was _doing_. That's a big advantage,
and not one to discard lightly.
This mess drew my attention because something broke, and I had to wade through
this unnecessary complexity to try to understand whether or not it was his
toolchain or our code that's causing the problem. To be honest, I still don't
know, because he hasn't gotten back to me yet and I don't have his toolchain.
The Free Software Foundation got huge unreadable crappy code in part by adding
#ifdefs with extensions and workarounds for every strange buggy non-posix
system in the world, rather than relying on standards and telling broken
systems to fix their stuff. The end result was such horrible bloat that from-
scratch reimplementations like busybox had enough appeal to get a significant
userbase. Repeating the FSF mistakes of allowing workarounds for other
people's bugs to metastasize through our code is not an improvement.
BusyBox is all about having clear limits and knowing what we DON'T do. That's
the only way to rip out crap and get to the small and simple. You're adding a
bug workaround for maybe three users ever in the entire history of the
project, and not only making everybody who ever reads that code work out why
it's there and what it does, but probably having the extra complexity break the
build of more users than will ever actually benefit from it.
> It makes sense to help some old toolchain to limp along.
Old? It claims to support a 2.6 kernel. We have an _old_ path, it's the 2.4
support. Tell FreeBSD to stop pretending to support 2.6 kernel APIs when it
can't competently do so.
Notice that we already _have_ a test. Look at the code:
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
blah blah blah
#else
blah blah blah
#endif
Note that it's not an #else if. There is no further test. It's a fallback
for "does not support the new 64-bit API".
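To make the shape of that test concrete, here is a minimal sketch. The DEMO_
names are made up for illustration and are not what loop.c uses; the real
macros come from <linux/version.h>. The packing shown is the traditional
(major << 16) + (minor << 8) + patch encoding, and the 2.6.32 default is just
an assumed example value.

```c
/* Hypothetical stand-ins for KERNEL_VERSION and LINUX_VERSION_CODE, so the
 * sketch builds without any kernel headers installed. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

#ifndef DEMO_VERSION_CODE
#define DEMO_VERSION_CODE DEMO_KERNEL_VERSION(2, 6, 32) /* assumed example */
#endif

/* One question, two answers: either the headers claim the 64-bit API or
 * they get the 2.4 fallback. There is no third branch. */
#if DEMO_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 6, 0)
#define DEMO_HAVE_LOOP64 1
#else
#define DEMO_HAVE_LOOP64 0
#endif

/* Report which branch the preprocessor picked. */
static int demo_have_loop64(void)
{
	return DEMO_HAVE_LOOP64;
}
```

Headers that report 2.6 or later land in the first branch unconditionally, so
anything that claims that version has to actually provide the API.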
Clearly, FreeBSD does not actually support the new 64 bit API. It CLAIMS to,
but if it _did_ we wouldn't need an #ifdef salad for it _within_ the (clean
and simple) 64 bit API block. FreeBSD is _lying_ about whether or not it
properly supports this API. You're adding code to accept this lie.
Show me an actual _Linux_ system that can't #include linux/loop.h. The API
has "linux" in the name, and it's in a test explicitly checking for the
existence of a 2.6 _LINUX_ kernel. If FreeBSD is pretending to have a Linux
kernel API then the onus is on FreeBSD to _actually_ have the Linux kernel
API. If they can't competently emulate Linux, then they can't use the Linux
functionality provided by that Linux API, and they either have to disable some
applets or stop lying about what they support. Why on earth is this _our_
problem?
Are you suggesting that reporting this bug to BSD would not help matters?
That FreeBSD is not maintained? That bugs cannot possibly be fixed in their
upstream because Apple hired all of their developers away in 1999? This may
be true, but I still don't see how it's our problem...
> > And if their linux/loop.h isn't #including
> > something to #define __u64, their linux/loop.h is broken. Variants of
> > the same conclusion.
> >
> > In any case, if a horrible workaround like that was worth doing (which
> > this one isn't; they should fix their toolchain) it would belong in
> > platform.h. Making loop.c aware of the existence of specific compiler
> > versions is kind of evil. (It's bad enough making it aware of kernel
> > versions, but that's really an API test. Do we have the 64-bit API or
> > not. It's possible that a cleaner way to do that would be a "support old
> > 2.4 kernel APIs" in menuconfig, but it seems silly to ask people to
> > manually select something we can autodetect at compile time, which is why
> > it was how it was, as the least ugly solution.)
>
> How about this?
Oh dear.
These types are defined in an existing kernel header file:
include/asm-generic/int-l64.h:
typedef signed int s32;
But that typedef is inside an #ifdef __KERNEL__ so if you ever wind up seeing
that from a userspace #include of a linux/* header, it means your kernel
headers were improperly sanitized and thus your toolchain is broken. That's
_why_ they have both s32 and __s32: users of the second may be exported into
userspace (where the double underscore prevents it from conflicting with the
namespace of normal userspace symbols). If the first type ever winds up being
used in userspace, it means your kernel headers were improperly sanitized and
you've just hit a bug.
I.E. the REASON for that markup is to CATCH THAT STUFF LEAKING INTO USERSPACE,
and force the improperly sanitized kernel headers to do a build break so they
catch and fix their mistake.
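A stripped-down sketch of that mechanism, using made-up demo names rather
than the real kernel identifiers (the real definitions live in the kernel's
own type headers and are more involved than this):

```c
/* Hypothetical miniature of a kernel type header. Names are illustrative
 * only. */

/* Exported spelling: the double underscore keeps it out of the namespace
 * that ordinary userspace code is allowed to use. */
typedef signed int __demo_s32;

#ifdef __KERNEL__
/* Kernel-internal spelling: only visible when building the kernel. */
typedef __demo_s32 demo_s32;
#endif

/* Userspace may use the short name for something of its own, because the
 * typedef above is hidden unless __KERNEL__ is defined. */
static __demo_s32 demo_s32 = -5;

/* Read back the userspace variable that shares the short name. */
static __demo_s32 demo_read(void)
{
	return demo_s32;
}
```

If the short typedef ever leaked into a userspace build, the variable
declaration would collide with it and the compile would fail. That failure is
the early warning, not something to paper over.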
It's a bit like going "#define #error //", if such a thing actually worked in
your compiler. You've essentially #ifdefed out an assert() because it was
triggering for you, rather than fixing the underlying problem. You've added an
entire header to _break_ the kernel's explicit checking to distinguish
internal from external symbols.
Of course this wouldn't have come up if you weren't adding unnecessary
complexity, bloating the code and making it more brittle, to work around clear
bugs in other people's build environments which are not our problem.
Kernel headers are a can of worms, and there's years of history behind this
can of worms, and as the movie Wargames said, "the only winning move is not to
play". There was a particularly large flamewar about this on linux-kernel in
November 2004, shortly before make headers_install went in.
The _first_ thing to know about kernel headers is that if there's a libc way to
do it, then you should use the libc wrapper and not the kernel #include
directly. The only time you should ever need to #include a Linux file is when
there _is_ no libc wrapper for it, because a portable way to access this
functionality does not exist, generally because Linux invented it (or came up
with its own unique way of doing it).
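A small illustration of the libc-first rule. Page size is an illustrative
choice here, not something from the loop.c discussion, and demo_page_size()
is a hypothetical helper name. The idea is to ask libc through the POSIX
sysconf() call instead of pulling a constant out of a kernel header.

```c
#include <unistd.h>

/* Ask libc for the page size. sysconf() and _SC_PAGESIZE are POSIX, so this
 * does not depend on what the kernel headers happen to export. */
static long demo_page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}
```

Nothing here touches a linux/* or asm/* header, so a broken or oddly
sanitized set of kernel headers cannot break it.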
For example, under Solaris (which Oracle seems to have killed off) there was no
"losetup" command. There was lofiadm instead. They had their own completely
unrelated mechanism using a different API. You couldn't use the Solaris API
under Linux, and you couldn't use the Linux API under Solaris.
Meaning _when_ you're using linux headers, you're using Linux-specific
functionality exported by the Linux kernel, and the standard is set by the
Linux kernel. Linus Torvalds was very clear on this, calling it a "one way
street":
http://lkml.indiana.edu/hypermail/linux/kernel/0411.3/1356.html
The __x## sizes are kernel internal types, which actually predate _both_ of
those standards. Here's Linus Torvalds personally explaining that:
http://lkml.indiana.edu/hypermail/linux/kernel/0411.3/1099.html
If your kernel headers are _not_ what the Linux kernel developers consider
properly sanitized kernel headers, then it's the same as trying to build with
a C compiler that doesn't properly implement C99, or build against a C library
that doesn't properly implement Posix. There is something specific broken
which is external to our project, it violates the clearly documented
expectations of busybox, and we can point at it and tell them to fix it.
Note that the "can of worms" you're re-opening used to be a real problem.
Back before "make headers_install", broken kernel headers were an epidemic.
Under the 2.4 kernel you could use the kernel headers directly, just chop out
the #ifdef __KERNEL__ bits and the rest was usable by userspace. But in 2.6
that didn't work anymore, so for a while it was the distro's job to come up
with properly sanitized kernel headers, but they all did it a slightly
different way and embedded developers had to grab obsolete headers from Linux
From Scratch or Gentoo or the versions Mariusz Mazur was doing. (Note: this
wasn't _raw_ kernel headers. You had to modify them extensively for use in
userspace at all. It's just that the modifications tended to be imperfect and
brittle.)
Accepting and trying to work around broken headers is one of the big reasons
headers got so crappy to begin with. For a few years there, people were block
copying structures out of the headers into their code. Insane as this sounds,
that was literally considered best practice. And then the distros and the
kernel guys got together and said "no, we will have kernel headers that work,
exported by the kernel's build infrastructure itself, and we'll rip out the
horrible workarounds in the 8 gazillion packages and actually REPORT bugs in
the kernel headers so they get FIXED". And they did "make headers_install"
five years ago so there was ONE CONSISTENT set of exported headers, and
everybody started using it.
Note: the layers of workarounds DID NOT WORK. BusyBox was not the only thing
that broke, and they all broke in _different_ways_. You CANNOT use improperly
sanitized kernel headers to build userspace packages reliably, the workarounds
never end. Thus the modern stance that if your kernel headers are not
properly sanitized, either your toolchain is _broken_ or it's NOT FOR LINUX.
For us to try to #define our own __u64 is just as silly as for us to try to
#define our own gcc or glibc internal symbols to work around something that
pretends to have gcc extensions or pretends to have glibc extensions. Either
they have them or they don't. We can test for them and not use them, but it's
NOT OUR JOB to try to fix broken attempts to provide these extensions.
The type "__u64" remains a kernel internal type that is supposed to be #defined
by the kernel headers themselves. If their kernel headers are broken it's not
our job to work around that bug, because that just encourages more breakage.
A new Linux kernel comes out every three months, we're allowed to blacklist
clearly buggy broken versions and report the bug upstream where it WILL be
rapidly fixed. There's a whole bugfix-only mechanism for the Linux kernel:
http://lwn.net/Articles/370236/
Your argument is that FreeBSD does _not_ have such a mechanism, that FreeBSD
is irretrievably buggy, and yet FreeBSD is explicitly trying (and explicitly
failing) to emulate a Linux kernel API that even has Linux in the header name.
They can't have it both ways.
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds