[PATCH] getrandom: new applet

Etienne Champetier champetier.etienne at gmail.com
Fri Jul 8 15:25:08 UTC 2016


Hi Rob,

2016-07-07 17:49 GMT+02:00 Rob Landley <rob at landley.net>:
> On 07/06/2016 11:41 AM, Etienne Champetier wrote:
>> Now you really hate the fact that getrandom() is a syscall.
>
> I do not hate the fact getrandom is a syscall. I'm asking what the point
> is of a new applet to call this syscall. You have suggested it could
> block to show when /dev/urandom is producing a higher grade of
> randomness than it does before being properly seeded. That is, as far as
> I can tell, the only reason for your new applet to exist rather than
> upgrading $RANDOM in ash/hush.

I'm not suggesting anything; it's the API promise, it's written in the
man page (and that's why I quoted it).
Yes, that is the only reason I want to use it.
If someone uses the applet, it will be safe (API/man page promise): it
will block, but it will block for less time than /dev/random would
(you made the test, 2m40 vs 6m; my test was 15s vs 45s).
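For reference, a minimal sketch of what the proposed applet boils down
to. It assumes the glibc getrandom() wrapper from <sys/random.h>
(glibc >= 2.25); busybox at the time of this thread would call
syscall(SYS_getrandom, ...) directly. The function name is mine:

```c
#include <sys/random.h>   /* getrandom(); glibc >= 2.25 */

/* Block until the kernel's urandom pool is initialized.
 * With flags == 0, getrandom() blocks until the pool has been
 * seeded, then fills the buffer from the urandom source.
 * Returns 0 on success, -1 on error. */
int wait_for_urandom(void)
{
    unsigned char buf[1];

    if (getrandom(buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        return -1;
    return 0;
}
```

Once this returns 0, every later read from /dev/urandom (or
getrandom()) is safe by the man page's promise.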

>
>> getrandom() was introduced to fight file descriptor exhaustion and
>> also added a new behavior, blocking until /dev/urandom is initialized.
>> I'm only interested in the new behavior (which was added for the
>> embedded world).
>>
>> We could say that waiting 6 minutes (using /dev/random) instead of 3
>> minutes (using getrandom()) is acceptable, but now what if you have 10
>> soft wanting to do that, /dev/random might block again
>
> I was suggesting reading a byte from /dev/random (at boot time, from
> your init scripts) to see when /dev/urandom was initialized, and then
> reading from /dev/urandom after that. Serializing the start of things
> that need "high grade" random numbers (such as key generation) until
> after the system has collected some entropy from clock skew. I.E.
> presumably the same thing your blocking call to your new applet would
> accomplish. (Although if your system is tickless, I'm not sure how much
> entropy it's actually collecting when it goes quiescent, so it may block
> quite a while if you don't do the "write mac address into /dev/urandom"
> workaround to at least get this box to vary from adjacent boxes.)

Your solution will work, but for now it will block at least 2 times
longer (first /dev/urandom is seeded with X bits and getrandom()
unblocks; only after that is /dev/random seeded with another X bits
and starts to unblock).

Also, your solution needs to be implemented at the "distro" level: you
write a script that reads /dev/random, then tell all the other package
maintainers to somehow wait on your script.
It's not simpler, in my opinion.
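For comparison, the boot-script approach described above (read one
byte from /dev/random, then use /dev/urandom for everything after)
is roughly this sketch; the function name is mine:

```c
#include <fcntl.h>
#include <unistd.h>

/* Rob's workaround: block by reading a single byte from /dev/random,
 * then switch to the never-blocking /dev/urandom for everything else.
 * Returns 0 once the byte has been read, -1 on error. */
int wait_via_dev_random(void)
{
    unsigned char b;
    ssize_t n;
    int fd = open("/dev/random", O_RDONLY);

    if (fd < 0)
        return -1;
    n = read(fd, &b, 1);
    close(fd);
    return (n == 1) ? 0 : -1;
}
```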

>
>> I will quote man getrandom(2):
>> By default, getrandom() draws entropy from the /dev/urandom pool.
>> [...]
>> If the pool has not yet been initialized, then the call blocks.
>> [...]
>> When a sufficient number of random bits has been collected, the
>> /dev/urandom entropy pool is considered to be initialized.
>
> Did you miss the part where I read the man page and wrote an example
> program to test what it was doing last time around?

Quotes from your previous mails:
"How? /dev/urandom and /dev/random pull from the same pool."
"If the entropy pool contains 128 bits but /dev/random won't give it to
userspace, file a bug with the kernel guys."
"It blocked for 2 minutes and 40 seconds waiting for the random pool to
initialize"
"Which is why you'd read a byte from /dev/random first if you need to
wait for the pool to have entropy in it"

You are referring to THE pool, when the man page talks about a
/dev/random pool and a /dev/urandom pool
(the fact that there is a third pool, the input pool, is an
implementation detail).

So I've seen your code, but it's not obvious that you've read and
understood the man page, sorry.

>
>> It doesn't state that it's 128 bits because it can change in the
>> future, the API just block until it's safe, it's clearly defined.
>> getrandom() is not a revolution but it's better than /dev/random
>> Is it worth 207K (uncompressed), that's Denys call
>
> The bit about having two pools

yes

> and reading a byte from /dev/random
> potentially slowing the proper initialization of /dev/urandom

That is not what I explained: /dev/random starts its init after
/dev/urandom finishes its init
(you removed that part while responding; it was at the very beginning
of my previous mail).

So no, reading from /dev/random will not slow down /dev/urandom's init,

> (even if
> you write it back) sounds strange and potentially a kernel bug.

so there is no kernel bug here.

> Adding a
> syscall to work around a kernel bug seems odd,

The fact that you can read from /dev/urandom before it's fully
initialized is a problem,
but the kernel devs tried, and making /dev/urandom block broke too
many systems.
They could also have changed /dev/random to behave like getrandom(),
but many people would have screamed,
so to introduce this new behavior you need a new char device or a new syscall.

> especially since the
> stated justification was so code derived from OpenBSD could be more
> easily supported by Linux without being rewritten.

It was to fight fd exhaustion: you can abort() (data loss?),
you can return an error and hope the OpenSSL users will handle it well
(in many cases they don't),
or you can remove the problem entirely.

The OpenBSD guys pushed for it, the Linux maintainers accepted it; I
was not part of this debate, so stop bringing it up.

>
> I had arguments with the OpenBSD guys back in 2001 that openssh should
> not _hang_ when you do "sleep 999 & exit". The context is that when
> ifconfig brought up an rtl8139 device it launched a kernel thread, and
> that kernel thread had the same stdin/stdout/stderr as the kernel
> process, which meant the last user of the pty hadn't closed it yet and
> thus openssh wouldn't exit even though the shell it had launched had
> exited... Oh here, just read the thread:
>
> https://lkml.org/lkml/2001/5/1/30
>
> The point is that the official position of OpenBSD with regards to
> openssl and openssh has always been that any deviation in behavior
> between OpenBSD and Linux means, by definition, that Linux is wrong.
> That the openssl/openssh ports to Linux were very much second class
> citizens, and if Linux didn't adapt to what they did hell would freeze
> over before they changed their code.

I don't really care who is right or wrong.
Now that there is getrandom(), I don't think we are going to see a
/dev/getrandom one day.
I want the getrandom() behavior; for now it costs 207K (uncompressed).

>
>>> Ok, this is a new assertion. How does that work? How would /dev/urandom
>>> get random data to seed it without /dev/random having enough entropy for
>>> one byte of output?
>>
>> not really, see my mail from 29 june 2016 at 17:04
>> also answered at the begining of this email i think
>
> You keep pointing me back at what you've already said, despite the fact
> I read what you said.

From my mail, you have:
- /dev/urandom, which never blocks
- getrandom(), which blocks until /dev/urandom is properly initialized
(system-wide), then never blocks again and reads from /dev/urandom
- /dev/random, which only starts its initialization after getrandom()
unblocks, and blocks whenever it estimates entropy is low

Your sentence goes in the opposite direction of what I'm pointing you
at; that is why I'm pointing you back at it ...
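That distinction can also be probed without blocking: GRND_NONBLOCK
makes getrandom() fail with EAGAIN instead of blocking while the pool
is still unseeded. A sketch, assuming <sys/random.h> (glibc >= 2.25);
the function name is mine:

```c
#include <errno.h>
#include <sys/random.h>

/* Non-blocking probe of the urandom pool state.
 * Returns 1 if the pool is initialized, 0 if it is not yet,
 * -1 on any other error. */
int urandom_is_seeded(void)
{
    unsigned char b;
    ssize_t n = getrandom(&b, 1, GRND_NONBLOCK);

    if (n == 1)
        return 1;
    if (n < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```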

>
>>> How? /dev/urandom and /dev/random pull from the same pool. If it's
>>> ignoring the entropy accounting, your definition of "cryptographically
>>> secure" differs from the conventional one.
>>
>> As said at the begining it's not the same pool.
>
> So you're saying it should actually be:
>
>   dd bs=1 count=1 if=/dev/random of=/dev/urandom
>
> Or possibly that writing it back doesn't matter...

It's even a bit more complicated than that :)
For an in-depth explanation see https://lwn.net/Articles/525459/
(already posted).

Writes to /dev/urandom or /dev/random go into the input pool, so they
can add entropy to the system, but the kernel doesn't really trust the
user, so it will not increment the entropy count.

man random(4)
  Writing to /dev/random or /dev/urandom will update the entropy pool
  with the data written, but this will not result in a higher entropy
  count.  This means that it will impact the contents read from both
  files, but it will not make reads from /dev/random faster.
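So the "write the mac address into /dev/urandom" workaround mentioned
earlier is just a plain write: the bytes get mixed into the input pool
(so the output varies per machine), but the entropy count is not
credited. A sketch; the function name is mine:

```c
#include <fcntl.h>
#include <unistd.h>

/* Mix caller-supplied bytes (e.g. a MAC address) into the kernel's
 * input pool.  The data influences future /dev/random and /dev/urandom
 * output, but the kernel does not credit it toward the entropy count,
 * so it will not make /dev/random unblock any sooner.
 * Returns 0 on success, -1 on error. */
int mix_into_pool(const unsigned char *data, size_t len)
{
    int fd = open("/dev/urandom", O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, data, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

(Crediting entropy is possible too, but only for root, via the
RNDADDENTROPY ioctl on /dev/random.)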

>
>> Many cryptographers think that the blocking of /dev/random is useless
>> or bad, including DJB (Daniel J. Bernstein)

To be more precise, it's the "block again" behavior after the first
unblock that is considered useless or bad.

>
> You're proposing an app whose sole purpose is to block. The above is an
> alternate way to block using an existing app through an existing api. If
> it takes significantly longer for that API to give 8 bits of randomness
> than it takes another API to give 128 bits of randomness, somebody
> should ask the kernel guys why.

... (answered many times)

>
>> Still a quote from DJB
>
> So /dev/urandom doesn't need fresh entropy the same way qmail doesn't
> need a license (or the ability of anyone but him to distribute updated
> versions of it)?

It seems he changed his mind:
http://cr.yp.to/qmail/dist.html

It needs "fresh" entropy, just not every few KB; the CSPRNG period is
bigger than that.

>
>>> Pretty much by definition crypto exploits are something the developer
>>> didn't think of, so cryptographers tend to be paranoid.
>>
>> I'm just trying to use a well defined API here,
>
> A) So am I,
>
> B) To accomplish _what_? You seem to want to use the new API because it
> exists.

To unblock (2 times) faster, nothing more.

>
>> not to review kernel code
>> If the kernel is broken, it will be fixed,
>
> http://landley.net/notes-2013.html#04-03-2013 links to old 2006 lkml
> messages where I first raised an issue I was fixing at the time, because
> nobody else had in the 7 years in between.
>
> I periodically submitted approximately the same perl removal patches for
> 6 years (here's 2009, https://lwn.net/Articles/353830/ and here's 2012
> http://lkml.iu.edu/hypermail/linux/kernel/1201.2/02849.html). I believe
> they finally went in in 2013.
>
> Here's the squashfs guy explaining his 5 year struggle to get that
> upstream: https://lwn.net/Articles/563578/
>
> Linux-kernel development circled the wagons against outsiders quite some
> time ago. (My guess is about a year into the SCO trial.)
>
> Hands up everybody who thinks that if I don't bother to resubmit
> http://lkml.iu.edu/hypermail/linux/kernel/1606.2/05686.html (which I
> probably won't because my patch Works For Me) that the issue will be
> fixed this year.
>
>> but the API will still exists, and will then keep it initial promise
>
> Yes, /dev/random and /dev/urandom should still be there and still be
> usable for the forseeable future. Just like they have been for the past
> 20 years.

getrandom() was introduced 2 years ago, and it's not going away either.
The fact that /dev/urandom is 20 years old isn't a good thing: it's
broken if you use it too early (the bastian case again).
(That's why you now have a nice log line like "random: <prog> urandom
read with 4 bits of entropy available" in the kernel log.)

>
>>>>> which is what
>>>>> /dev/urandom is for. You've implied that your new function call is
>>>>> somehow superior to /dev/urandom because you're magically generating
>>>>> entropy when the entropy pool is empty, in userspace, somehow.
>>>>
>>>> no no and no, this syscall block until /dev/urandom is initialized
>>>> with 128bits of entropy
>>>
>>> dd if=/dev/random of=/dev/random bs=16 count=1
>>>
>>> There, you've blocked until there were 128 bits of entropy and you
>>> haven't actually consumed any entropy because you fed it right back.
>>>
>>> If the entropy pool contains 128 bits but /dev/random won't give it to
>>> userspace, file a bug with the kernel guys.
>>
>> As explained at the begining the first 128bits goes to /dev/urandom so
>> you wait at least for the first 256 bits (or even more, i don't know)
>> and writing to /dev/random doesn't increase the entropy estimation so
>> you don't consume entropy but the kernel entropy estimation goes down
>> by 128bits, and /dev/random will enventually block again
>
> Which is why further reads would be from the non-blocking /dev/urandom.
> In the example the point of the read from /dev/random is to block (at
> boot time) until you can read from an initialized /dev/urandom.
>
> ("Here is way you can block." "But if you do that again it might block
> again!" "Um... yeah?")

2m40 vs 6m, your test, not mine

>
>>> So to save a file descriptor, you're making an applet.
>>
>> no, i'm only interresed in the blocking behavior of getrandom()
> ...
>> again, i'm only interested in the blocking behavior, not the fd
>> exhaustion prevention
>> (which explain why it's a syscall and not a character device)
>
> So if urandom and random have different pools, and urandom gets the
> first 128 bits of entropy, reading one byte from /dev/random at init
> time to wait until the pool is initialized requires the system to
> collect 136 total bits of entropy.

I just wrote 256, but of course you've read me ...
So for you, the /dev/random CSPRNG will only wait for 8 bits to
consider itself seeded, while the /dev/urandom one waits for 128 bits
(again, the same CSPRNG function)?

> Your mechanism would return after it
> had collected 128 bits of entropy. And the difference between 128 and
> 136 justifies a new applet,
256...

> one which apparently renders userspace
> mitigation strategies for the problem category (ala the "write /proc
> stuff into the /dev node") irrelevant despite the remaining 3 minute
> wait in my test even with your new api.

As I've explained, writing to /dev/(u)random doesn't speed up the
unblocking, and I'm not convinced that /proc is full of unused sources
of entropy.

>
> I'm going to stop here, thanks. Good luck with your thing, I've lost
> interest in the outcome.

I hope you've at least learned a bit about how the Linux RNG works.

Regards
Etienne

>
> Rob


More information about the busybox mailing list