[PATCH 1/2] udhcpd: sanitize invalid hostnames to match rfcs
Isaac Dunham
ibid.ag at gmail.com
Tue Oct 20 03:47:07 UTC 2015
On Mon, Oct 19, 2015 at 05:41:16PM -0400, Rich Felker wrote:
> On Mon, Oct 19, 2015 at 10:52:27AM +0200, walter harms wrote:
> >
> >
> > Am 18.10.2015 23:26, schrieb Isaac Dunham:
> > > On Sun, Oct 18, 2015 at 07:55:38PM +0200, walter harms wrote:
> > >>
> > >>
> > >> Am 18.10.2015 07:54, schrieb Isaac Dunham:
> > >>> RFC952/RFC1123 limit the characters in a hostname for a node to
> > >>> [-a-zA-Z0-9], with '-' being legal only in the middle; we were
> > >>> accepting everything from ' ' to '~'.
[snip code]
> > >> since several tools check for hostnames,
> > >> maybe it is useful to make this a function ?
> > >
> > > What this does is not simply 'check for validity'; it *makes* a hostname
> > > valid, which is not what most tools need.
> > > It also is exclusively for leaf node names, rather than an FQDN (ie,
> > > '.' is not valid here).
> > >
> > > It would be possible to design a function that can check or fix the
> > > hostname depending how it's called, though I wonder if that's
> > > doing too much in a single call.
> > >
> > > It would probably have to be something like this:
> > >
> > > #define HOSTCHECK_LEAF 0x1 //leaf hostname-no '.' allowed
> > > #define HOSTCHECK_FIX 0x2 //fix-replace invalid chars with '-'/'X'
> > >
> > > //return NULL if valid, pointer to first invalid char otherwise
> > > char * validate_hostname(char *p, int flags);
> > >
> > > This does not handle transforming a URL via punycode, of course.
(This comment was intended as a parenthetical, though I forgot the
parentheses.)
> > > Would such an interface be desirable?
('such an interface' was intended to refer to validate_hostname(), as it
seems Walter took it.)
> > note: I did not make an inventory of whether this is needed by other
> > programs, but I can imagine that with 'hostname' it would be useful.
>
> I see no reason hostnames should be represented as punycode anywhere
> except DNS query packets, or in other protocols that require encoding
> as such. Everywhere else they should just be normal printable text.
As far as I'm aware, the restriction is 'ASCII alphanumerics, or
sometimes -', and the standards apply only to network protocols (so
basically DHCP and DNS, including all the networking programs that
accept hostnames). The discussion is about where we need to force
hostnames to comply with that.

Punycode is one common convention for mapping non-ASCII printable
text to the permitted characters, though we don't currently support it
and I would rather not implement it myself.
I suspect Walter assumed that the standards applied to all hostnames,
and was suggesting that some form of sanitization is needed there.

I'm inclined to think that:
- hostname should fail rather than silently mutilating the name, and it
  should impose no restrictions beyond the kernel's.
- DNS queries could be rejected, passed on as invalid hostnames, or
  converted to punycode; the notes in the RFCs referred to seem to imply
  that technically invalid hostnames *can* exist, but that
  configurations should avoid them because they render a host
  inaccessible to some clients.
  I would think that we should try to avoid publishing invalid
  hostnames, but recognize that someone might have managed to ignore
  the RFCs - after all, we might end up on a network that used
  busybox 1.23 dnsd, with one host named 'host_1' or similar...
  (I haven't checked dnsd, but it may have a similar bug.)
- udhcpc should probably sanitize invalid hostnames or reject them; it
  runs on the user end, so a bad config is fixable there.
- dnsd should emit/log an error on encountering them; I don't know
  whether sanitizing or ignoring bad hostnames is better, but aborting
  would be a potential DoS issue.
- udhcpd only deals with hostnames once a lease is granted, so we can only
  delete invalid hostnames or sanitize them.
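
For the sake of discussion, the validate_hostname() interface quoted
earlier could be sketched roughly as below. This is only an illustration,
not the patch and not existing busybox code. Several details are my
assumptions: with HOSTCHECK_FIX the return value points at the first
character that had to be changed; a bad character at the start or end
of a label becomes 'X' and one in the middle becomes '-'; without
HOSTCHECK_LEAF a '.' separates labels and a single trailing '.' is
tolerated; an empty name is reported as invalid and left alone. Length
limits and punycode are not handled.

```c
#include <stddef.h>

#define HOSTCHECK_LEAF 0x1 /* leaf hostname: no '.' allowed */
#define HOSTCHECK_FIX  0x2 /* replace invalid chars with '-'/'X' */

static int is_ascii_alnum(unsigned char c)
{
	return (c >= '0' && c <= '9')
	    || (c >= 'a' && c <= 'z')
	    || (c >= 'A' && c <= 'Z');
}

/* Return NULL if valid, pointer to the first invalid char otherwise.
 * With HOSTCHECK_FIX the string is repaired in place and the pointer
 * refers to the first char that was replaced (assumed semantics). */
char *validate_hostname(char *p, int flags)
{
	int dots_ok = !(flags & HOSTCHECK_LEAF);
	char *first_bad = NULL;
	char *s;

	if (*p == '\0')
		return p; /* empty name: invalid, and nothing to fix */

	for (s = p; *s != '\0'; s++) {
		unsigned char c = (unsigned char)*s;
		/* is this char the first or last one of its label? */
		int at_start = (s == p) || (dots_ok && s[-1] == '.');
		int at_end = (s[1] == '\0') || (dots_ok && s[1] == '.');
		int ok;

		/* a '.' after a non-empty label is a plain separator */
		if (dots_ok && c == '.' && !at_start)
			continue;

		if (is_ascii_alnum(c))
			ok = 1;
		else if (c == '-')
			ok = !at_start && !at_end; /* only legal mid-label */
		else
			ok = 0;
		if (ok)
			continue;

		if (first_bad == NULL)
			first_bad = s;
		if (!(flags & HOSTCHECK_FIX))
			return first_bad; /* check only: stop at first error */
		*s = (at_start || at_end) ? 'X' : '-';
	}
	return first_bad;
}
```

With HOSTCHECK_LEAF|HOSTCHECK_FIX, the 'host_1' example above would
become 'host-1', and '-host-' would become 'XhostX'. With HOSTCHECK_LEAF
alone, 'a.b' is reported as invalid at the '.' and left untouched.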
Thanks,
Isaac Dunham