Crash after successfully mounting root filesystem over NFS and starting busybox

Rob Landley rob at landley.net
Sun Dec 18 20:47:45 UTC 2005


On Sunday 18 December 2005 08:56, Gil Madar wrote:
> Hi All,
>
> I dumped part of the output, kept in the log_buf.
>
> Please note the MPC866 receives an exception during access to the dual port
> ram...

Translation: kernel problem.

> This, of course leads to another exception, and so on.
> The exception is during early stages of running busybox.

So it sounds like we're triggering it, but is it our bug? 

Have you tried init=/bin/sh?  (Using bash or some such?)  See if the bug 
happens without busybox even involved.

> If I run a
> while ( 1 ) schedule();
> instead of loading root filesystem over NFS, the target responds to pings
> without any problem.

The target no longer responding to pings means the kernel has curled up into a 
ball and is no longer responding to interrupts.

> I'm clueless whether this is a setup problem, or a software problem.

It's a kernel problem.

> <4>Freeing unused kernel memory: 92k init
> <6>init() line 782 /bin/busybox

I'm not finding a busybox error message that looks anything like this in 
busybox/init/*, is it some kind of debug info you compiled in?  Line 782 of 
init.c isn't in a function called "init()" (we haven't got a function called 
that, we've got init_main() though).  It's an "else" in halt_signal(), which 
we'd only reach if you're trying to shut down already, so would not be the 
first problem no matter what was wrong.

> <4>Oops: kernel access of bad area, sig: 11 [#1]
> <4>Oops: kernel access of bad area, sig: 11 [#2]

Kernel trap, sig 11 (SIGSEGV), i.e. an invalid memory access.

> <4>Call trace: [00000000]  [c0011148]  [c0011714]  [c001153c]  [c0011338]
> [c00033d4]  [c0009f38]  [c0002f20]  [c018f964]  [c019d52c]  [c018f9b4]
> [c01a941c]  [c01ab380]  [c01c6b14]  [c01c6f30]

You have no debug info so who knows what went wrong (you could try to look 
that up in System.map if you're bored), but the 00000000 looks a bit like a 
jump to a null pointer.
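
If you do want to try the System.map lookup, it's mechanical: for each address 
in the trace, find the nearest symbol at or below it.  A rough Python sketch, 
assuming you have the System.map from the exact kernel build that crashed.  
The function names (trace_addresses, load_system_map, resolve) are made up 
for illustration, not part of any existing tool:

```python
import bisect
import re


def trace_addresses(text):
    """Pull the bracketed 8-hex-digit addresses out of a "Call trace:" dump."""
    return [int(a, 16) for a in re.findall(r"\[([0-9a-fA-F]{8})\]", text)]


def load_system_map(lines):
    """Parse System.map lines ("address type name") into sorted (addr, name) pairs."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # first field was not a hex address
        symbols.append((addr, parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return "name+0xoffset" for the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address is below every known symbol (e.g. 00000000)
    base, name = symbols[i]
    return f"{name}+{addr - base:#x}"
```

Feed it open("System.map") and the trace text, then call resolve() on each 
address.  An address that comes back as None (like that 00000000) isn't 
inside any kernel symbol at all, and a large offset past the last symbol 
means the same thing.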

> <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

And it happened in an interrupt handler.  Beautiful.  That has _nothing_ to do 
with us.

> <4> <0>Rebooting in 180 seconds..Oops: kernel access of bad area, sig: 11
> [#3]

Invalid memory access in kernel space.

We may be triggering it, but it would be kind of difficult for that one to be 
our bug.  (Maybe on a nommu system, where kernel space is writeable from user 
space, busybox could have _corrupted_ the kernel and then the kernel would 
have gone boing.  But we'd see such an illegal access on other platforms, 
where it was possible to debug it.)

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.


