[PATCH] diff: rewrite V2. -1005 bytes
Matheus Izvekov
mizvekov at gmail.com
Fri Jan 15 00:47:15 UTC 2010
On 01:30 Fri 15 Jan , Denys Vlasenko wrote:
> On Friday 15 January 2010 00:17, Matheus Izvekov wrote:
> > > > > > Actually, you did notice EOF was a token, but then you reused the eof
> > > > > > flag in the 8th bit position to represent it.
> > > > > > Problem is, with flag -b set, the user does not see that token,
> > > > > > because it is considered a space, and it should treat all kinds
> > > > > > of spaces as if they were the same.
> > > > > >
> > > > > > To sum it up, all flags should be moved up 1 bit and 0x1ff should be
> > > > > > used instead of 0xff as the mask.
> > > > >
> > > > > Can you point out where exactly is the bug:
> > > > >
> > > > > typedef int token_t;
> > > > > enum {
> > > > > SHIFT_EOF = (sizeof(token_t)*8 - 8) - 1,
> > > > remove this line
> > > > > TOK_EOF = 1 << 8,
> > > ...
> > >
> > > This isn't answering my question.
> > > I meant: "explain where my code goes off the rails".
> > >
> > > You don't explain it. Instead, you are showing how you'd change
> > > the code so that it is "becoming correct".
> > > This does not convince me that old code was incorrect.
> > >
> > > Below is the diff between your code and mine.
> > > As far as I can see, it nearly literally translates
> > > usage of bool fields into the usage of bytes.
> > > So where's the bug?
> >
> > The bug would be that when flag -b is set, and an EOF comes, the caller
> > would see token 0xffffff20 instead of 0x20.
>
> It would see tok == TOK_PRINT + TOK_EOF + TOK_EOL + TOK_SPACE + 0x20,
> not 0xffffff20.
>
And then TOK2CHAR would convert that to 0xffffff20 instead of 0x20:

  #define TOK2CHAR(t) (((t) << SHIFT_EOF) >> SHIFT_EOF)

SHIFT_EOF = (sizeof(token_t)*8 - 8) - 1 = 23 for a 32-bit int, so
(int)0x120 << 23 >> 23 = 0xffffff20. The left shift moves the TOK_EOF
bit (bit 8) into the sign bit, and the right shift then sign-extends it.

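
To make the arithmetic concrete, here is a minimal standalone sketch, not the
actual patch. It assumes a 32-bit int and a compiler that does an arithmetic
right shift on negative values (as gcc and clang do). The helper names
tok2char_shift, tok2char_mask and demo_tok2char are made up for illustration.
The left shift is done on an unsigned value only to avoid signed-overflow
undefined behaviour; the result is the same as described above.

```c
#include <assert.h>
#include <stdint.h>

typedef int token_t;
enum {
    /* as in the quoted code: 23 when token_t is a 32-bit int */
    SHIFT_EOF = (sizeof(token_t) * 8 - 8) - 1,
    TOK_EOF   = 1 << 8,
};

/* Same shape as TOK2CHAR: shift the flags out at the top, shift back down.
 * Bit 8 (TOK_EOF) lands in the sign bit and is smeared over the high bits
 * by the arithmetic right shift. */
token_t tok2char_shift(token_t t)
{
    uint32_t up = (uint32_t)t << SHIFT_EOF;
    return (int32_t)up >> SHIFT_EOF;
}

/* Hypothetical alternative: just mask off the low 8 bits. */
token_t tok2char_mask(token_t t)
{
    return t & 0xff;
}

void demo_tok2char(void)
{
    token_t tok = TOK_EOF + ' ';                        /* 0x120 */

    assert((unsigned)tok2char_shift(tok) == 0xffffff20u); /* sign-extended */
    assert(tok2char_mask(tok) == 0x20);                 /* plain space */
    assert(tok2char_shift(' ') == 0x20);                /* fine without EOF */
}
```

Without TOK_EOF set the shift version returns 0x20 as expected; only the
combination of EOF plus a character in the low byte goes wrong.
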
> > The problem is reusing the same bit that represents that the file ended
> > to also represent information about the token itself.
> > For example, if you changed this part:
> >
> > + tok &= ~0xff;
> > + tok |= TOK_SPACE + ' ';
> >
> > to the below:
> >
> > + tok &= ~0x1ff;
> > + tok |= TOK_SPACE + ' ';
> >
> > Then the caller would never know the file ended, and would get stuck.
>
> Yes, *then* it would be wrong. But I do not do tok &= ~0x1ff,
> I do tok &= ~0xff, preserving TOK_EOF bit.
>
> --
> vda
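
For reference, the difference between the two masks discussed above can be
sketched as follows. This is only an illustration: the bit position chosen for
TOK_SPACE and the names collapse_space and demo_masks are assumptions, not
taken from the patch.

```c
#include <assert.h>

enum {
    TOK_EOF   = 1 << 8,
    TOK_SPACE = 1 << 10,   /* hypothetical position, for illustration only */
};

/* Replace the character part of tok with a single space, clearing the
 * bits selected by mask first (mirrors "tok &= ~mask; tok |= ..."). */
int collapse_space(int tok, int mask)
{
    tok &= ~mask;
    tok |= TOK_SPACE + ' ';
    return tok;
}

void demo_masks(void)
{
    int tok = TOK_EOF + '\t';   /* a tab seen right at end of file */

    /* ~0xff keeps bit 8, so the caller still learns that the file ended */
    assert(collapse_space(tok, 0xff) & TOK_EOF);

    /* ~0x1ff clears bit 8 as well, so the EOF information is lost */
    assert(!(collapse_space(tok, 0x1ff) & TOK_EOF));
}
```
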