Crash after successfully mounting root filesystem over NFS and starting busybox
rob at landley.net
Sun Dec 18 20:47:45 UTC 2005
On Sunday 18 December 2005 08:56, Gil Madar wrote:
> Hi All,
> I dumped part of the output, kept in the log_buf.
> Please note the MPC866 receives an exception during access to the dual port
Translation: kernel problem.
> This, of course leads to another exception, and so on.
> The exception is during early stages of running busybox.
So it sounds like we're triggering it, but is it our bug?
Have you tried init=/bin/sh? (Using bash or some such?) See if the bug
happens without busybox even involved.
> If I run a
> while ( 1 ) schedule();
> instead of loading root filesystem over NFS, the target responds to pings
> without any problem.
The target no longer responding to pings means the kernel has curled up into a
ball and is no longer responding to interrupts.
> I'm clueless whether this is a setup problem, or a software problem.
It's a kernel problem.
> <4>Freeing unused kernel memory: 92k init
> <6>init() line 782 /bin/busybox
I'm not finding a busybox error message that looks anything like this in
busybox/init/*, is it some kind of debug info you compiled in? Line 782 of
init.c isn't in a function called "init()" (we haven't got a function called
that, we've got init_main() though). It's an "else" in halt_signal(), which
we'd only reach if you're trying to shut down already, so would not be the
first problem no matter what was wrong.
> <4>Oops: kernel access of bad area, sig: 11 [#1]
> <4>Oops: kernel access of bad area, sig: 11 [#2]
Kernel trap, sig 11. I believe that's an unhandled memory exception?
> <4>Call trace:  [c0011148] [c0011714] [c001153c] [c0011338]
> [c00033d4] [c0009f38] [c0002f20] [c018f964] [c019d52c] [c018f9b4]
> [c01a941c] [c01ab380] [c01c6b14] [c01c6f30]
You have no debug info so who knows what went wrong (you could try to look
that up in System.map if you're bored), but the 00000000 looks a bit like a
jump to a null pointer.
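
For what it's worth, the System.map lookup can be scripted. What follows is
only a rough sketch, assuming the usual "address type name" layout of
System.map lines; the function names (parse_system_map, resolve) and the
symbol names in the sample table are made up for illustration and are not
from your kernel. It maps each call trace address to the nearest symbol at
or below it:

```python
import bisect


def parse_system_map(text):
    """Return a sorted list of (address, name) pairs from System.map text.

    Assumes each line looks like "c0011000 T some_symbol".
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            symbols.append((int(parts[0], 16), parts[2]))
        except ValueError:
            continue  # first field was not a hex address
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return "name+0xoffset" for the nearest symbol at or below address."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below every known symbol
    base, name = symbols[i]
    return f"{name}+{address - base:#x}"


if __name__ == "__main__":
    # Hypothetical symbol table; substitute the System.map from your build.
    sample_map = """\
c0011000 T hypothetical_func_a
c0011500 T hypothetical_func_b
"""
    table = parse_system_map(sample_map)
    # First two addresses from the call trace quoted above.
    for addr in (0xC0011148, 0xC0011714):
        print(f"{addr:08x} -> {resolve(table, addr)}")
```

To use it for real, read the System.map that matches the running kernel
image (a map from a different build will give misleading names) and feed in
the bracketed addresses from the call trace.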
> <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
And it happened in an interrupt handler. Beautiful. That has _nothing_ to do
> <4> <0>Rebooting in 180 seconds..Oops: kernel access of bad area, sig: 11
Invalid memory access in kernel space.
We may be triggering it, but it would be kind of difficult for that one to be
our bug. (Maybe on a nommu system where kernel space is writeable from user
space busybox could have _corrupted_ the kernel and then the kernel would
have gone boing. But we'd see such an illegal access on other platforms,
where it was possible to debug it.)
Steve Ballmer: Innovation! Inigo Montoya: You keep using that word.
I do not think it means what you think it means.