klogd problem: questions about busybox behavior (1.15.3)

Sat May 22 19:43:15 UTC 2010

On Friday 21 May 2010 00:03, Paul Smith wrote:
> Hi all.  I've just finished tracking down a weird problem on our system.
> I can fix the "top-level" issue, but I'm trying to determine whether
> there are any real bugs here (besides my error) or not.
> 
> We're using an intel-based embedded system running in a ramdisk with a
> 2.6.32.13 kernel, glibc, and busybox 1.15.3.  On this system we're using
> busybox syslogd and klogd.  We noticed that we were getting corrupted
> kernel messages in the syslog file output.  dmesg showed correct values
> but the kernel messages kept in /var/log/messages were all messed up.
> 
> Tracking this down it appears the cause of the problem is that there are
> two klogd processes running on my system.  So, my first question is, is
> this expected that if two klogd are running that they would "step on"
> each others' access to the kernel log ring buffer,

Unfortunately, yes. That's how it works.

> or something, and 
> generate corrupted output?  I would have naively expected each process
> to maintain its own pointer and I might get duplicated messages, but not
> corrupted messages.  If I run two "dmesg" commands at the same time I
> don't see this kind of corruption.

Because dmesg does not need to sit and wait for a new protion of data.
It just dumps entire buffer. This opration does not require per-process
pointers to buffer.

> Or, is this a bug in busybox klogd? 
> Or is it a bug in Linux 2.6.32.13?

Neither. It worked that way for years.

> Trying to determine why I have two klogd processes I see this: the
> bring-up of the system is managed by busybox init.  To start the system
> it runs:
> 
>         ::sysinit:/etc/init.d/rcS start
>         ::once:/etc/rootfs start
> 
> In the "rootfs" script is where klogd is started, then it does some
> things such as change the system time, timezone, hostname, etc.  The way
> things work we restart the logging system multiple times as these
> changes are implemented, so the log messages have the right timezone and
> hostname.
> 
> The first time I restart I kill klogd, but init (apparently) never reaps
> the child so it's running as a zombie.

Please show me the process *tree*. In this tree, is this zombie a child
of init, or a child of some other process? If former, then it's
a bug in init. Try newer busybox. If it doesn't help, let me know.

> Then I start a new klogd.  Then 
> more booting happens and I want to restart it again, but when I use
> pidof to find the pid of klogd I get back TWO pids: the zombie and the
> real klogd.  Due to an error in my script where I don't expect to get
> back >1 PID, the kill fails and I start another klogd.  Now I have one
> zombie and two real klogd's, so that's where this corruption comes from.

Use "killall klogd", it would kill *all* klogds.

> At some later point (once my "rootfs" script ends?), init will reap the
> zombie klogd and I'll be left with just the two klogd processes.
> 
> I can fix my scripting, of course, but that brings me to the next
> question: is it expected/desirable that init doesn't reap children while
> it's waiting for a command like this to complete?  This seems like
> sub-optimal behavior.

init must reap zombies. But it can only reap _its children_.
Example:

$ sh -c 'sleep 1 & exec sleep 999'

In other xterm:

$ top -bn1 | grep -e PID -e sleep
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
25808 25787 root     S      920   0%   2   0% grep -e PID -e sleep
25781  1224 root     S      916   0%   0   0% sleep 1260
 3599     1 root     S      916   0%   0   0% sleep 32000
25805 25784 root     S      916   0%   1   0% sleep 999
25806 25805 root     Z        0   0%   1   0% [sleep]

See? Process 25806 died, but its parent process 25805
did not reap it. So 25806 will stay in state Z until process
25805 reaps it, or until process 25805 exits. In latter
case, 25806 will be reparented to init, and only _THEN_
init can reap it.

You need to check PPID of zombies on your system.
-- 
vda