Bug in wc.

Rob Landley rob at landley.net
Tue Mar 9 15:12:01 UTC 2010


On Monday 08 March 2010 19:56:07 Denys Vlasenko wrote:
> On Tuesday 09 March 2010 01:55, Harald Becker wrote:
> > Hallo Rob!
> >
> > > Why do we have unnecessary leading whitespace?  What happened to small
> > > and simple and doing no more than absolutely necessary?
> >
> > As far as I remember the original (K&R) behavior of wc was always to
> > produce leading whitespace (fixed format output). Only the newer
> > versions of GNU wc stripped off this leading whitespace. That led to
> > several shell script failures that had to be fixed over the last few years.
>
> ... and now we have script failures because _new_ scripts expect _new_
> output format  >>:(  "Progress" sometimes looks like pointless churn.

Endless lateral churn presented as progress is why I've more or less given up 
on Linux on the Desktop.  (As repeatedly ranted about in my blog.)

But at least for BusyBox, we can point at a standard and beat the hell out of 
it with a regression test suite.  And in this case, the standard (SUSv4) 
specifies no leading whitespace ever:

>STDOUT
>
>    By default, the standard output shall contain an entry for each input
> file of the form:
>
>    "%d %d %d %s\n", <newlines>, <words>, <bytes>, <file>
>
>    If the -m option is specified, the number of characters shall replace
> the <bytes> field in this format.
>
>    If any options are specified and the -l option is not specified, the
> number of <newline> characters shall not be written.
>
>    If any options are specified and the -w option is not specified, the
> number of words shall not be written.
>
>    If any options are specified and neither -c nor -m is specified, the
> number of bytes or characters shall not be written.
>
>    If no input file operands are specified, no name shall be written and no
> <blank> characters preceding the pathname shall be written.
>
>    If more than one input file operand is specified, an additional line
> shall be written, of the same format as the other lines, except that the
> word total (in the POSIX locale) shall be written instead of a pathname and
> the total of each column shall be written as appropriate. Such an
> additional line, if any, is written at the end of the output.
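
To make the quoted format concrete, here is a minimal sketch in Python (not BusyBox's C, and not BusyBox code). The helper names wc_counts and format_wc_line are made up for illustration, and the sketch only covers the default case where all three counts are printed. Fields are separated by single spaces, nothing precedes the first number, and the name is dropped when there is no file operand:

```python
def wc_counts(data: bytes) -> tuple:
    """Return (newlines, words, bytes) for a byte string.

    Words are runs of non-whitespace bytes, which is what
    bytes.split() with no argument yields.
    """
    return data.count(b"\n"), len(data.split()), len(data)


def format_wc_line(counts, name=None) -> str:
    """Format one output line as "%d %d %d %s\\n" with no leading blanks.

    When name is None (input came from stdin, no file operand),
    neither the name nor the blank before it is written.
    """
    fields = [str(n) for n in counts]
    if name is not None:
        fields.append(name)
    return " ".join(fields) + "\n"


if __name__ == "__main__":
    sample = b"hello world\nfoo\n"
    print(format_wc_line(wc_counts(sample), "sample.txt"), end="")
    print(format_wc_line(wc_counts(sample)), end="")
```

A regression test then only has to compare the output against the exact expected string, with no stripping of padding.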

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

