NTPD bug?

Denys Vlasenko vda.linux at googlemail.com
Tue Sep 29 12:20:37 UTC 2020


Looks like this is the case, thanks!

Fixed in git.

On Thu, Sep 24, 2020 at 12:05 AM Stavros Filargyropoulos
<stavros at pensando.io> wrote:
>
> Hi,
>
> I am running busybox's ntpd and noticed that it sometimes exits.
> With debug output enabled, I see the following logs:
>
> ntpd: sending query to 10.255.254.13
> ntpd: sending query to 10.255.254.11
> ntpd: reply from 10.255.254.13: offset:-1.049563 delay:0.002000
> status:0x24 strat:4 refid:0x0bfeff0a rootdelay:0.020172 reach:0x03
> ntpd: setting time to 2020-09-23 06:59:38.341065 (offset -1.049563s)
> ntpd: recv(10.255.254.11) error: Bad file descriptor
>
> Looking at the code, what I think is happening is that we receive
> replies from .11 and .13 almost simultaneously, in the same poll()
> call.
>
> So we are in the loop that processes the fds returned by the poll() call here:
>         for (; nfds != 0 && j < i; j++) {
>             if (pfd[j].revents /* & (POLLIN|POLLERR)*/) {
>                 ...
>                 nfds--;
>                 recv_and_process_peer_pkt(idx2peer[j]);
>                 ...
>
> First we process the reply from .13, so we call
> recv_and_process_peer_pkt() which calls update_local_clock() and
> step_time(). We can verify that by the log "setting time to 2020-09-23
> 06:59:38.341065 (offset -1.049563s)".
>
> Then I think we enter the p_fd >= 0 condition inside this loop:
> for (item = G.ntp_peers; item != NULL; item = item->link) {
>         peer_t *pp = (peer_t *) item->data;
>         reset_peer_stats(pp, offset);
>         //bb_error_msg("offset:%+f pp->next_action_time:%f -> %f",
>         //    offset, pp->next_action_time, pp->next_action_time + offset);
>         pp->next_action_time += offset;
>         if (pp->p_fd >= 0) {
>             /* We wait for reply from this peer too.
>              * But due to step we are doing, reply's data is no longer
>              * useful (in fact, it'll be bogus). Stop waiting for it.
>              */
>             close(pp->p_fd);
>             pp->p_fd = -1;
>             set_next(pp, RETRY_INTERVAL);
>         }
>
> and hence close the socket of .11.
>
> Finally we return to the loop that processes the fds from the poll(),
> and we try to recv() from .11, whose fd has been closed and set to -1.
>
> Could that be the problem?
>
> Thanks,
> Stavros
> _______________________________________________
> busybox mailing list
> busybox at busybox.net
> http://lists.busybox.net/mailman/listinfo/busybox
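
The sequence described in the report can be reproduced outside ntpd with a
small sketch. This is not busybox code and not necessarily the fix that was
committed: struct peer, close_all_peer_sockets() and simulate_poll_round()
are hypothetical names, and AF_UNIX socketpairs stand in for the UDP sockets
of the two servers. Both "peers" reply before a single poll() call; handling
the first reply closes every peer socket, as the step does in the quoted
loop, so the second recv() runs on fd -1. The guarded variant skips a peer
whose p_fd is already negative.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define NPEERS 2

/* Hypothetical stand-in for ntpd's peer_t; only the socket matters here. */
struct peer {
    int p_fd;      /* -1 when no reply is awaited */
    int remote_fd; /* other end of the socketpair, plays the NTP server */
};

/* Mimics the side effect of the time step: stop waiting on every peer. */
static void close_all_peer_sockets(struct peer *peers, int n)
{
    for (int k = 0; k < n; k++) {
        if (peers[k].p_fd >= 0) {
            close(peers[k].p_fd);
            peers[k].p_fd = -1;
        }
    }
}

/*
 * Runs one poll() round in which both peers have already replied.
 * Returns the number of recv() calls that failed with EBADF.
 * With guarded != 0, peers whose socket was closed meanwhile are skipped.
 */
int simulate_poll_round(int guarded)
{
    struct peer peers[NPEERS];
    struct pollfd pfd[NPEERS];
    int ebadf_errors = 0;

    for (int k = 0; k < NPEERS; k++) {
        int sv[2];
        int rc = socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
        assert(rc == 0);
        peers[k].p_fd = sv[0];
        peers[k].remote_fd = sv[1];

        /* Both "servers" reply before we poll. */
        ssize_t w = write(sv[1], "x", 1);
        assert(w == 1);

        pfd[k].fd = sv[0];
        pfd[k].events = POLLIN;
        pfd[k].revents = 0;
    }

    int nfds = poll(pfd, NPEERS, 1000);
    assert(nfds == NPEERS);

    for (int j = 0; nfds != 0 && j < NPEERS; j++) {
        if (!pfd[j].revents)
            continue;
        nfds--;

        if (guarded && peers[j].p_fd < 0)
            continue; /* socket was closed while handling an earlier reply */

        char buf[8];
        if (recv(peers[j].p_fd, buf, sizeof(buf), MSG_DONTWAIT) < 0) {
            if (errno == EBADF)
                ebadf_errors++;
            continue;
        }

        /* Pretend this reply triggered a clock step. */
        close_all_peer_sockets(peers, NPEERS);
    }

    /* Release whatever is still open. */
    for (int k = 0; k < NPEERS; k++) {
        if (peers[k].p_fd >= 0)
            close(peers[k].p_fd);
        close(peers[k].remote_fd);
    }
    return ebadf_errors;
}
```

Calling simulate_poll_round(0) hits the "Bad file descriptor" case once, on
the second peer; simulate_poll_round(1) hits none. The guard relies on p_fd
being reset to -1 whenever the socket is closed, which the quoted loop
already does.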