[PATCH] diff: rewrite V2. -1005 bytes

Matheus Izvekov mizvekov at gmail.com
Fri Jan 15 00:47:15 UTC 2010


On 01:30 Fri 15 Jan     , Denys Vlasenko wrote:
> On Friday 15 January 2010 00:17, Matheus Izvekov wrote:
> > > > > > Actually, you did notice EOF was a token, but then you reused the eof
> > > > > > flag in the 8th bit position to represent it.
> > > > > > Problem is, with flag -b set, the user does not see that token,
> > > > > > because it is considered a space, and it should treat all kinds
> > > > > > of spaces as if they were the same.
> > > > > > 
> > > > > > To sum it up, all flags should be moved up 1 bit and 0x1ff should be
> > > > > > used instead of 0xff as the mask.
> > > > > 
> > > > > Can you point out where exactly is the bug:
> > > > > 
> > > > > typedef int token_t;
> > > > > enum {
> > > > >         SHIFT_EOF = (sizeof(token_t)*8 - 8) - 1,
> > > >           remove this line
> > > > >         TOK_EOF   = 1 << 8,
> > > ...
> > > 
> > > This isn't answering my question.
> > > I meant: "explain where my code goes off the rails".
> > > 
> > > You don't explain it. Instead, you are showing how you'd change
> > > the code so that it is "becoming correct".
> > > This does not convince me that old code was incorrect.
> > > 
> > > Below is the diff between your code and mine.
> > > As far as I can see, it nearly literally translates
> > > usage of bool fields into the usage of bytes.
> > > So where's the bug?
> > 
> > The bug would be that when flag -b is set, and an EOF comes, the caller
> > would see token 0xffffff20 instead of 0x20.
> 
> It would see tok == TOK_PRINT + TOK_EOF + TOK_EOL + TOK_SPACE + 0x20,
> not 0xffffff20.
> 

And then TOK2CHAR would convert that to 0xffffff20 instead of 0x20.

#define TOK2CHAR(t) (((t) << SHIFT_EOF) >> SHIFT_EOF)

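With a 32-bit token_t, SHIFT_EOF = (sizeof(token_t)*8 - 8) - 1 = 23.

The left shift by 23 discards every flag above bit 8 and moves bit 8
(TOK_EOF) into the sign bit, so the low 9 bits of your tok, 0x120, are
all that matter: (int)0x120 << 23 >> 23 = 0xffffff20, because the right
shift sign-extends.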
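
To make the arithmetic above concrete, here is a minimal sketch. It
assumes a 32-bit token_t and a compiler whose >> on negative ints is an
arithmetic shift (gcc and clang behave this way). tok2char_shift() follows
the TOK2CHAR macro, except that the left shift is done on an unsigned
value to stay clear of undefined behaviour. tok2char_mask() is a
hypothetical alternative that is not in the patch, shown only for
comparison.

```c
#include <stdint.h>

/* Assumption: token_t is 32 bits wide, as in the arithmetic above. */
typedef int32_t token_t;

enum {
	SHIFT_EOF = (sizeof(token_t) * 8 - 8) - 1, /* 23 for a 32-bit token_t */
	TOK_EOF   = 1 << 8,
};

/* Same idea as TOK2CHAR: keep the low 9 bits and sign-extend from bit 8.
 * The left shift is performed on an unsigned value; the right shift is
 * performed on the signed result, which replicates bit 8 upwards. */
static token_t tok2char_shift(token_t t)
{
	token_t shifted = (token_t)((uint32_t)t << SHIFT_EOF);
	return shifted >> SHIFT_EOF;
}

/* Hypothetical alternative (not in the patch): keep only the character. */
static token_t tok2char_mask(token_t t)
{
	return t & 0xff;
}
```

With tok = TOK_EOF | ' ', tok2char_shift(tok) gives -224, which is
0xffffff20 as a 32-bit pattern, while tok2char_mask(tok) gives 0x20.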

> > The problem is reusing the same bit that represents that the file ended
> > to also represent information about the token itself.
> > For example, if you changed this part:
> > 
> > +                               tok &= ~0xff;
> > +                               tok |= TOK_SPACE + ' ';
> > 
> > to the below:
> > 
> > +                               tok &= ~0x1ff;
> > +                               tok |= TOK_SPACE + ' ';
> > 
> > Then the caller would never know the file ended, and would get stuck.
> 
> Yes, *then* it would be wrong. But I do not do tok &= ~0x1ff,
> I do tok &= ~0xff, preserving TOK_EOF bit.
> 
> --
> vda
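
The two masks discussed above can be compared side by side. This is only
a sketch: TOK_EOF = 1 << 8 is taken from the patch, but the bit position
used for TOK_SPACE here is made up for illustration, and collapse_space()
is a hypothetical helper that wraps the two quoted lines.

```c
#include <stdint.h>

typedef int32_t token_t;

enum {
	TOK_EOF   = 1 << 8,  /* from the patch under discussion */
	TOK_SPACE = 1 << 10, /* hypothetical bit position, illustration only */
};

/* Replace the character part of a whitespace token with a plain space.
 * 'mask' selects which low bits are cleared before the space is OR-ed in:
 * 0xff keeps bit 8 (TOK_EOF), 0x1ff clears it. */
static token_t collapse_space(token_t tok, token_t mask)
{
	tok &= ~mask;
	tok |= TOK_SPACE + ' ';
	return tok;
}
```

With mask 0xff the TOK_EOF bit survives, so the caller can still tell the
file ended, but the low 9 bits of the result are 0x120, which is exactly
the value that goes into TOK2CHAR. With mask 0x1ff the low 9 bits are
0x20, but TOK_EOF is gone.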

