"diff" behavior in 1.15.3 and 1.16.0

Matheus Izvekov mizvekov at gmail.com
Fri Mar 12 21:56:12 UTC 2010


On 15:18 Fri 12 Mar     , Chris Metcalf wrote:
> On 3/12/2010 2:53 PM, Matheus Izvekov wrote:
> > On 13:06 Fri 12 Mar     , Chris Metcalf wrote:
> >
> > I'm the author of the diff rewrite, but you forgot to cc me.
> > Could've seen this message sooner otherwise.
> >   
> 
> Sorry, good point.  I assumed all the maintainers were reading the list
> anyway, which of course may not be true for some maintainers.  I'll cc
> you going forward.
> 
> >> We noticed a seg fault in "diff" while building MySQL on a Tilera
> >> processor board under Linux.  I tracked it down to the fact that in one
> >> place a pointer is initialized with "ptr = start - 1" and then later
> >> tested with "start > ptr".  Unfortunately in the failing case "start" is
> >> NULL, so "ptr" ends up as a very large number (-16UL, basically), and so
> >> the ">" test doesn't work.  I can provide a more detailed changelog
> >> entry and patch if changes to 1.15.3 are still interesting, but I'm
> >> guessing they're not, due to the complete rewrite of "diff" in 1.16.0.
> >>     
> >
> > I can't find that line. There are no variables named start or ptr.
> >   
> 
> I only provided details at the "anecdotal" level since, as I said, I
> assume no one really cares about the 1.15.3 code any more.  The actual
> variable names are context_vec_start and context_vec_ptr, and the
> specific bug fix is as follows.  To reproduce the bug, just do "diff -w
> foo bar" where foo and bar are only different in whitespace.
> 
> @@ -770,3 +770,3 @@
> 
> -       if (context_vec_start > context_vec_ptr)
> +       if (context_vec_start >= context_vec_ptr + 1)
>                 return;

There is some misunderstanding here. That code above is from the old
version of diff.
I just tested here, and under those conditions you mentioned, the new diff
works as expected, and the old one segfaults, just like you said.

> >> However that version of "diff" doesn't work very well, at least for us. 
> >> I'd be curious to know if, for example, it passes the testsuite diff
> >> tests for other platforms.  On our platform the 1.16.0 diff gives bogus
> >> output like this:
> >>     
> >
> > It passes testsuite here on x86_64 and x86_32 machines, and it passes
> > testsuite for whetever platform Denys, the busybox maintainer, uses.
> >   
> 
> OK, that's good to know.  I tried building it with our SGI-based
> compiler; I can try with our gcc and see if I get the same lossage or
> not, and I can try running it on our 64-bit platform and see if it
> reproduces there.
> 
> >> # echo foo > /tmp/foo
> >> # diff -u /tmp/foo /etc/resolv.conf
> >> --- /tmp/foo
> >> +++ /etc/resolv.conf
> >> @@ -1 +1,3 @@
> >> -foo
> >>
> >> \ No newline at end of file
> >> ++domain internal.tiler+a.com tad.internal.ti#
> >>
> >>
> >> Notice the filenames are missing at the end of the "---" and "+++"
> >>     
> >
> > I am not seeing that. What am I missing here?
> >   
> 
> My mistake - I should have said "the date and time are missing".  Where
> diff 1.16.0 printed this:
> 
> --- /tmp/foo
> +++ /etc/resolv.conf
> 
> 
> you would normally see the following (and you do with 1.15.3):
> 
> --- /tmp/foo    2010-03-12 15:02:36.000000000 -0500
> +++ /etc/resolv.conf    2010-02-22 10:11:05.000000000 -0500

Yes, that was intentional. I asked here on the list if anyone would be
bothered if that feature was removed, and no one seemed to care. Are you
just noticing this, or are you complaining about it? Do you need this?
Even though old diff used ctime for that, and gnu diff seems to use
something else with another format?

> 
> >> lines, the "No newline at end of file" message is issued incorrectly,
> >>     
> >
> > Why is it issued incorrectly?
> >   
> 
> Both files end with newlines.

Yes, that's not correct then. But I'm still bothered that some tests in
the suite are failing for you. Can you provide more details on that?

> > I'd like to see that if you could provide a test case. But better wait
> > until we figure out why the testsuite fails for you.
> >   
> 
> Indeed.  The testsuite doesn't work for us, so that's the place to start.
> 
> >> (Well, the testsuite almost passes: I did need a small hack in
> >> coreutils/tail.c to pass "tail.test", which was reporting a "short
> >> write" error otherwise, and the "grep handles/matches NUL" tests in
> >> grep.tests don't seem to pass for us either.  If there's interest I can
> >> send some patches for that stuff as well.)
> >>     
> >
> > That seems pretty weird, if I may be so blunt.
> >   
> 
> The tail.c bug is pretty clearly wrong; I don't know why it isn't seen
> on other platforms.  The fix is to guard the call to
> "xwrite(STDOUT_FILENO, buf + nread - nwrite, nwrite)" at
> coreutils/tail.c:244 with an "if (nwrite > 0)", since otherwise the
> third parameter of xwrite() can be negative.  Consider count = 50, nread
> = 10 for the first read; the "nwrite -= (count - seen)" at line 232 will
> make nwrite (initialized to the value of nread) negative.

Can you provide a test case for that?

> I didn't investigate the grep-NUL issue further, since it seemed like a
> pretty minor corner case.  The symptom is that we're not seeing the NUL
> in the output, and grepping for a line with just a NUL is not finding
> it.  It sounds like this should work in 1.15.3, though, so I'll open a
> bug internally to investigate this one further.
> 
> > Are there any emulators, or any way someone could test stuff under that
> > board? Without having to pay for one ofcourse.
> >   
> 
> Good question.  I think there is some attempt being made to gear up for
> something like this, but it hasn't happened yet.
> 
> Thanks!
> 
> -- 
> Chris Metcalf, Tilera Corp.
> http://www.tilera.com
> 
> 


More information about the busybox mailing list