[PATCH] mkswap: generate UUID

Denys Vlasenko vda.linux at googlemail.com
Fri Jun 19 07:30:37 UTC 2009


On Friday 19 June 2009 04:10, Colin Watson wrote:
> While the metaphors above are certainly colourful, the code that Denys
> posted does not have randomness even close to 2^128. rand() is entirely
> deterministic based on the seed value you give to srand(). Denys' patch
> uses monotonic_us() plus getpid() as the seed. Plenty of systems, such
> as Linux, cap getpid() to 2^15 or thereabouts by default. Not exactly
> very random, especially in an installer kind of situation where you may
> have a fairly predictable sequence of events since boot, but let's
> generously grant a flat distribution over 2^15.
> 
> Then we have time - well, whatever time may be, it isn't random. Sure,
> you have microseconds, so it's a bit better than it might be, but that
> hardly gives you 113 bits of randomness. CLOCK_MONOTONIC, which Denys'
> patch used, is typically (and on Linux) implemented simply as a
> monotonically increasing counter that's initialised to zero at boot, so
> it has nothing to do with whether you're running at the same wall-clock
> time as somebody else, only to do with how long your computer has been
> running. mkswap is often used in that installer kind of situation I
> mentioned, and in that case you probably have a range of maybe fifteen
> minutes within which most of the runs are going to fall (obviously some
> people will take ages to set up partitions, some people will go off and
> have a coffee, or whatever, but if you're focusing on the task at hand
> or if you're doing mass automatic deployments then they'll be quite
> well-clustered). Converted to microseconds, that's 30 bits of
> randomness.
> 
> Now, you can't add those 30 bits onto the 15 above because Denys' patch
> just adds monotonic_us() and getpid() together; it doesn't XOR them or
> anything. 2^30 + 2^15 is near enough to 2^30 to make no odds.
> Furthermore ("among our weapons ..."), srand() takes an unsigned int as
> its argument anyway, so you can't possibly do any better than 2^32 even
> if you think the assumptions above are wrong.

You are right.

I didn't realize UUIDs need to be that unique.

> Please bear in mind that I was replying specifically to Denys' patch,
> not to an abstract concept of a patch that used proper random numbers.
> The reason I'm concerned about it is that I *do* understand the scale of
> these things and that patch (which is in master now) isn't in the right
> ballpark at all. I spent several days and nights of my life last year on
> very little sleep dealing with cleanup and mitigation of the notorious
> Debian openssl bug, in which an accidental mispatch caused a shortage of
> randomness such that only 2^15 distinct keys were possible for each key
> type/length combination. I know what powers of two do, and the kinds of
> thing that happen if you don't have enough of them.

How about this?

        i = open("/dev/urandom", O_RDONLY);
        if (i >= 0) {
                read(i, buf, 16);
                close(i);
        }
        /* Paranoia. /dev/urandom may be missing.
         * rand() is guaranteed to generate at least [0, 2^15) range,
         * but lowest bits in some libc are not so "random".  */
        srand(monotonic_us());
        pid = getpid();
        while (1) {
                for (i = 0; i < 16; i++)
                        buf[i] ^= rand() >> 5;
                if (pid == 0)
                        break;
                srand(pid);
                pid = 0;
        }

        buf[4 + 2    ] = (buf[4 + 2    ] & 0x0f) | 0x40; /* time_hi_and_version */
        buf[4 + 2 + 2] = (buf[4 + 2 + 2] & 0x3f) | 0x80; /* clk_seq_and_variant */

        bin2hex(uuid_string, (void*) buf, 16);
        /* f.e. UUID=dfd9c173-be52-4d27-99a5-c34c6c2ff55f */
        printf("UUID=%.8s"  "-%.4s-%.4s-%.4s-%.12s\n",
                uuid_string,
                uuid_string+8,
                uuid_string+8+4,
                uuid_string+8+4+4,
                uuid_string+8+4+4+4
        );

> Now, granted, it doesn't matter so much if swap UUIDs are really
> universally unique when you compare them to something like SSH keys. But
> let's consider the example of a large organisation deploying Linux
> automatically on lots of machines. We can probably expect those machines
> to be of roughly the same specification such that they'll reach similar
> stages of the installer at about the same time, plus or minus a few
> seconds, and we can certainly expect their deployment to be such that
> the installer takes an identical or nearly identical code path each
> time. I'd be surprised at this point if you had much more than 2^24
> random bits to play with. The probability of real collisions within the
> organisation is no longer negligible.

Wow, they would mix their swap partitions! Run for your lives.

> The generation function you (Rob) sent in
> http://lists.busybox.net/pipermail/busybox/2009-June/069690.html seems
> much better to me and largely fine, except for two mistakes:
> 
>  * RFC2518 appears to be in error when it says that you should set the
>    most significant bit of the first octet of the node ID to 1. RFC4122
>    instead says you should set the *least* significant bit, which
>    matches the description of IEEE 802 MAC addresses in
>    http://en.wikipedia.org/wiki/MAC_address.
> 
>  * Your code uses uuid[11] as if it were the first octet of the node ID.
>    I think this should in fact be uuid[10].
> 
> If I were writing it I think I'd follow libuuid's lead

That code is a good example of code explosion for no discernible reason
which is so prevalent nowadays. No wonder just about anything today
takes megabytes of code.

Getting a list of present network cards because I am about to mkswap?!
No thanks.

> and add some 
> stirring about with rand() and such as a bit of an insurance policy just
> in case /dev/urandom is broken (you'd still be screwed, of course, but
> at least less screwed than always getting all zeroes or whatever), but
> that may not be necessary in the context of busybox.
> 
> Can I recommend at least looking at libuuid and seeing what can be used,
> though? It could be made a lot smaller simply by stripping out all the
> time-based UUID stuff and assuming the existence of /dev/urandom, and as
> I said above this should all be configured off by default.

The code above does something like that. Is it ok with you?
--
vda


More information about the busybox mailing list