Testiness.

Rob Landley rob at landley.net
Thu Sep 1 00:21:16 UTC 2005


On Wednesday 31 August 2005 15:54, Wolfgang Denk wrote:

> > So its not a bug its a feature.

Isn't he supposed to put quotes around "feature" when he says that?

Or is he actually implying this fundamental behavior change, without warning 
or documentation, isn't completely broken?  Despite the fact that the central 
design idea behind UTF8 is the ability to maintain backwards compatibility 
with existing software that doesn't use it, and that bundling a change to the 
sequencing of existing characters with the ability to decode new additions to 
the character set is moronic at best?

> I hear the words, but I'm afraid I don't understand.
>
> Can please you explain how it is possible that '[a-z]*' matches "CVS"
> or "Makefile"? What is the flow of logic,  and  where  is  the  place
> where the folding of upper and lower case characters creeps in?

Especially since 'a' is still 97, and 'z' is still 122.  I know this because 
I'm feeding an ASCII document into "sort", and invoking sort via a shell 
script that also consists of ASCII characters...
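
To make the arithmetic concrete, here is a minimal Python sketch (not taken 
from glibc or busybox).  It tests the first character of a name, which is all 
the glob '[a-z]*' constrains, once in plain byte order and once in an 
interleaved "aAbB...zZ" order.  That interleaved order is only an assumption 
about how a dictionary-style locale might collate, and the names COLLATION, 
in_range_bytes, in_range_collated and glob_matches are made up for 
illustration.

```python
import string

# Hypothetical dictionary-style collation order: aAbBcC...zZ.
# Illustrative assumption only, not the table from any real locale.
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)
RANK = {ch: i for i, ch in enumerate(COLLATION)}


def in_range_bytes(ch, lo="a", hi="z"):
    """Range test by code point, as in the C/POSIX locale."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collated(ch, lo="a", hi="z"):
    """Range test by position in the hypothetical collation order."""
    return ch in RANK and RANK[lo] <= RANK[ch] <= RANK[hi]


def glob_matches(name, in_range):
    """Mimic the glob '[a-z]*': only the first character is constrained."""
    return bool(name) and in_range(name[0])


if __name__ == "__main__":
    print(ord("a"), ord("z"))
    for name in ("CVS", "Makefile", "readme"):
        print(name,
              glob_matches(name, in_range_bytes),
              glob_matches(name, in_range_collated))
```

Under byte order 'C' (67) and 'M' (77) fall outside 97..122, so neither name 
matches.  Under the interleaved order every letter except 'Z' sits between 
'a' and 'z', which would be one way for "CVS" and "Makefile" to slip in.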

The really _fun_ part is doing this on an ASCII text file that comes with or 
is maintained by the OS, such as /etc/passwd, man pages, anything 
under /usr/doc.  They aren't proposing a new storage mechanism for these text 
files which is incompatible with existing ones.  An mbox file is still an 
mbox file.

Now add in the fact that sort (which we were talking about) has the -f option 
to ignore case.  (It's in the SUSv3 spec.)  I did not supply the -f option.  
The _bug_ that I'm seeing is that with UTF8, the -f option is forced on.  
That is a bug.
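
For comparison, a rough Python sketch of the two orderings in question.  This 
is not how sort is implemented; it only assumes that byte-wise comparison 
stands in for the C locale and that comparing uppercased keys approximates 
what -f asks for.  The sample names are invented.

```python
# Invented sample input, all plain ASCII.
lines = ["root", "Zed", "adm", "Bin"]

# Byte order: uppercase ASCII letters (65-90) all precede lowercase (97-122).
byte_order = sorted(lines)

# Case folded, roughly what "sort -f" requests: compare as if uppercase.
folded = sorted(lines, key=str.upper)

if __name__ == "__main__":
    print(byte_order)
    print(folded)
```

Without -f the first result is what I expect; getting the second one without 
asking for it is the behavior being complained about.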

Rob


