zcip, link-local ARP responses

Sebastian Fett db_extern at gmx.de
Tue Aug 4 10:00:08 UTC 2015


> On Tue, Jul 21, 2015 at 2:03 PM, dbextern <db_extern at gmx.de> wrote:
>> Hello!
>>
>> For our link local functionality I'm using udhcpc together with zcip out of busybox on a Blackfin BF-537 CPU without MMU.
>>
>> The base functionality is there.
>> But when I connect two networks with stable IP addresses, and both networks had the same IP(s) in them, the conflict stays unresolved.
>> My setup is one master PC or Mac and several embedded boards.
>>
>> What I would expect (following RFC 3927, https://tools.ietf.org/html/rfc3927#page-10, sections 2.x and 4) is that after a while the ARP messages will resolve the problem, because the still-running zcip sees ARP responses for its own IP and reacts accordingly.
>> What really happens is:
>> * Windows only sends broadcast ARP requests as long as it never got an answer. After that there is only unicast. And the requested device answers with a unicast as well.
>> * The Mac always sends broadcasts, and both embedded devices with the conflicting IP answer, but with a unicast ARP reply to the Mac.
>>
>> As I understand the specification (last paragraph in chapter 2.5) then all ARP messages between devices with link local addresses should be link layer broadcasts.
>> Did I get that right? And if yes, why does zcip not follow that rule?
>
> zcip does use bcast for all packets it sends:
>
> # tcpdump -nliwlan0 -s0 -vv -e arp
> tcpdump: listening on wlan0, link-type EN10MB (Ethernet), capture size
> 65535 bytes
>           <zcip runs>
> 02:49:51.078369 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 tell 0.0.0.0
> 02:49:52.391711 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 tell 0.0.0.0
> 02:49:54.254628 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 tell 0.0.0.0
> 02:49:55.305731 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 (00:04:e2:64:23:c2)
> tell 169.254.194.171
> 02:49:57.307788 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 (00:04:e2:64:23:c2)
> tell 169.254.194.171
> 02:49:59.309844 00:04:e2:64:23:c2 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has 169.254.194.171 (00:04:e2:64:23:c2)
> tell 169.254.194.171

Thanks for looking into that.

These are the probing and announcement messages from zcip. They need to
be broadcasts, and they are.
What zcip does not do is answer normal ARP requests with a broadcast,
which RFC 3927 specifies as the right thing to do.
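
To make the distinction concrete, here is a minimal Python sketch of how the
two packet types in the capture above differ on the wire. A probe carries
sender IP 0.0.0.0; an announcement carries the claimed address as both sender
and target. The function names are made up for illustration and this is not
zcip's actual code.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def build_arp_request(src_mac, sender_ip, target_ip, target_mac=b"\x00" * 6):
    """Build a 42-byte Ethernet frame carrying a broadcast ARP request."""
    eth = BROADCAST_MAC + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, opcode
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REQUEST)
    arp += src_mac + socket.inet_aton(sender_ip)
    arp += target_mac + socket.inet_aton(target_ip)
    return eth + arp


def classify(frame):
    """Label an ARP frame as probe, announcement, request or other."""
    opcode = struct.unpack("!H", frame[20:22])[0]
    sender_ip = frame[28:32]
    target_ip = frame[38:42]
    if opcode != ARP_REQUEST:
        return "other"
    if sender_ip == b"\x00" * 4:
        return "probe"          # "who-has X tell 0.0.0.0"
    if sender_ip == target_ip:
        return "announcement"   # "who-has X tell X"
    return "request"
```

With the MAC and address from the capture, the first three packets classify as
probes and the last three as announcements.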

After some time with this problem, I think the main reason for this is
that we cannot keep the kernel from answering normal ARP requests, and
zcip does not want to duplicate the answer.
And from what has been said on the netdev mailing list, keeping
broadcasts out of your system is more important than following the spec.

I have a device (a Dante Brooklyn module for an audio stream) that does
exactly what I expected. But that module does not have any ARP code in
the kernel. I assume it uses the option of handling ARP completely in
userspace.

My current solution is to send an extra ARP response from zcip. That
means two replies to every request (one from zcip, one from the kernel).
But that is better for my use case than having no broadcast answer. And
I am not on a big network, only a small single-purpose one, so nobody
else gets hurt.
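
For illustration only, the extra reply looks roughly like the Python sketch
below. It is not the actual change to zcip (which is C), and the function
names, the requester's MAC and the requester's IP are placeholders. The only
difference from a normal ARP reply is the Ethernet destination, which is the
broadcast address instead of the requester's MAC. Sending assumes a Linux
AF_PACKET raw socket and sufficient privileges.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_REPLY = 2


def build_broadcast_arp_reply(own_mac, own_ip, requester_mac, requester_ip):
    """Build an ARP reply whose link-layer destination is broadcast."""
    # Ethernet header: broadcast destination instead of requester_mac
    eth = BROADCAST_MAC + own_mac + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, opcode=reply
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REPLY)
    arp += own_mac + socket.inet_aton(own_ip)
    # The ARP target fields still name the original requester
    arp += requester_mac + socket.inet_aton(requester_ip)
    return eth + arp


def send_frame(ifname, frame):
    """Send a raw Ethernet frame (Linux only, needs CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        s.send(frame)
```

Since the kernel still sends its own unicast reply, the requester sees two
answers, as described above.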

>
>
>> And a more general question: when I handle an ARP packet in zcip, how does the kernel know not to work on it as well?
>
> I don't know...
>

