[PATCH] init: handle kexec clean reboot
Laurent Bercot
ska-dietlibc at skarnet.org
Thu May 23 13:02:49 UTC 2013
> "kill -STOP 1" then :) Yes, it works even on PID 1. :)
Yeah, it looks like it does on Linux. Mind if I still find this ugly?
>> necessary to disable supervision. If the supervision chain is rooted in
>> process 1 as it should be,
>
> Who said "it should be"? I disagree that it's best to run supervisor
> as PID 1. Wait till a bug in the supervisor kills it and hangs
> your box.
I said "rooted in process 1", I didn't say that all the supervision
logic needed to be in process 1. If your supervision chain isn't rooted
in process 1, you can still hose your system with random SIGKILLs - as
sent by the OOM killer, for instance.
That doesn't mean process 1 should be complex. It only needs to monitor
at least *one* other process - which is exactly what runit does.
busybox init and s6-svscan monitor several processes; they all qualify
as roots of the supervision chain too, and they're still very simple.
Don't worry, we're on the same side here.
>> this can only be done by signalling process 1 in some way.
> Yes. SIGSTOP is the signal you look for :)
So we have two approaches here.
The first one:
* stop process 1
* scan processes to kill everything except my own session, so my calling
shell survives.
- this requires stopping all processes during the scan, to avoid race
conditions
- but I want my calling shell to resume after I'm done, so I'll send
SIGCONT to everyone
- oops, I don't want to include process 1 in my SIGCONT, else it will
restart stuff I don't want restarted
- so I need to send SIGCONT to everyone except process 1, which
requires doing it by hand
- after a couple hundred system calls, I can exit and return control
to my calling shell
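
To make the bookkeeping of that first approach concrete, here is a rough
sketch in Python. It is only an illustration under assumptions: it is
Linux-specific (it walks /proc), the helper names plan, scan_proc,
signal_all and first_approach are made up for this example, and a single
scan is not truly race-free since processes can fork between the scan
and the SIGSTOP pass.

```python
import os
import signal

def plan(pid_to_sid, my_pid, my_sid):
    """Split pids into (to_kill, to_resume). Process 1 is in neither list."""
    to_kill = sorted(p for p, s in pid_to_sid.items()
                     if p != 1 and s != my_sid)
    to_resume = sorted(p for p, s in pid_to_sid.items()
                       if p != 1 and p != my_pid and s == my_sid)
    return to_kill, to_resume

def scan_proc():
    """Map every live pid to its session ID by walking /proc (Linux only)."""
    result = {}
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            result[int(name)] = os.getsid(int(name))
        except (ProcessLookupError, PermissionError):
            pass  # process vanished, or we may not look at it
    return result

def signal_all(pids, sig):
    """Send sig to each pid by hand, ignoring processes that already exited."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass

def first_approach():
    """DESTRUCTIVE. Only meaningful as root, at shutdown time."""
    me = os.getpid()
    os.kill(1, signal.SIGSTOP)                    # stop process 1
    to_kill, to_resume = plan(scan_proc(), me, os.getsid(0))
    signal_all(to_kill + to_resume, signal.SIGSTOP)  # freeze during the scan
    signal_all(to_kill, signal.SIGKILL)           # everything outside my session
    signal_all(to_resume, signal.SIGCONT)         # everyone left, except process 1
```

Nothing runs until first_approach() is called. Even in this small form it
costs one system call per process for each pass, which is where the
"couple hundred system calls" come from.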
The second one:
* have process 1 listen for a signal (or any kind of message) that tells
it "going to shut down".
* process 1 stops supervising services and executes into a shutdown
script, which can be provided by the admin.
* said shutdown script runs as pid 1, so it can kill -9 -1 without
any further questions. One system call -> clean slate.
Call me stubborn, but I still think the second approach is cleaner.
s6 has been designed to work like this and it has given me the fastest
shutdown/boot times of all the init systems I've tried - and I've tried
a few.
Additionally, this approach only requires that kill(-1, something)
doesn't hit process 1, which is the case with virtually every Unix
kernel out there. Why Linux chose to do things differently makes no
sense to me: process 1 is still special with Linux since killing it
makes your kernel panic. Why choose to protect *the sending process*
from signals sent to everyone, instead of protecting *process 1*
which is the special, vital one? This screams "half-thought design".
Ah well. If we disagree on this, we can still agree that systemd sucks. ;)
--
Laurent