"tar t" and aesthetics -- the fundamental problem

Rob Landley rob at landley.net
Wed Apr 19 14:59:42 UTC 2006


On Tuesday 18 April 2006 12:37 pm, Robert P. J. Day wrote:
>   as i mentioned in an earlier post, there are (among other
> discrepancies) a couple moderately-annoying differences between the
> output forms from GNU tar to BB tar:
>
> GNU tar
> drwxr-xr-x rpjday/rpjday     0 2006-04-18 08:55:23 ./
> drwxr-xr-x rpjday/rpjday     0 2006-04-18 08:55:04 ./d2/
> -rw-r--r-- rpjday/rpjday     0 2006-04-18 08:55:04 ./d2/f2
> BB tar
> drwxr-xr-x 500/500         0 2006-04-18 08:55:23 .
> drwxr-xr-x 500/500         0 2006-04-18 08:55:04 ./d2
> -rw-r--r-- 500/500         0 2006-04-18 08:55:04 ./d2/f2
> drwxr-xr-x 500/500         0 2006-04-18 08:55:00 ./d1
> -rw-r--r-- 500/500         0 2006-04-18 08:55:00 ./d1/f1
>
>   note one of those differences

The big one being that we aren't looking up the user name in /etc/passwd?
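
For reference, a minimal sketch of what such a lookup could look like, assuming the standard POSIX getpwuid() and getgrgid() calls.  The helper name format_owner() and its buffer sizes are hypothetical; nothing like it exists in the busybox tree quoted here.

```c
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not busybox code: write "user/group" into buf,
 * falling back to the numeric id when the lookup finds no entry.
 * Returns 0 on success, -1 if the result did not fit in len bytes.
 */
int format_owner(char *buf, size_t len, uid_t uid, gid_t gid)
{
    char user[64], group[64];
    const struct passwd *pw;
    const struct group *gr;
    int n;

    assert(buf != NULL && len > 0);

    /* Copy each name out right away: the returned structs are static storage. */
    pw = getpwuid(uid);
    if (pw)
        snprintf(user, sizeof(user), "%s", pw->pw_name);
    else
        snprintf(user, sizeof(user), "%lu", (unsigned long)uid);

    gr = getgrgid(gid);
    if (gr)
        snprintf(group, sizeof(group), "%s", gr->gr_name);
    else
        snprintf(group, sizeof(group), "%lu", (unsigned long)gid);

    n = snprintf(buf, len, "%s/%s", user, group);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a helper along these lines, the "%d/%d" pair in the verbose listing would become a single "%s".  The cost is the extra lookup code, plus whatever the C library's passwd and group lookups drag in.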

>   -- how GNU tar adds a trailing "/" to 
> directories whereas BB tar does not.

Is this in the archive itself, or just in the v output?  (You didn't say what 
you did to produce that.)

> i call this "annoying" since my 
> tar compatibility test suite was going to involve running several tar
> operations and "diff"ing the outputs.  grrrrr ....

We don't test against the GNU version, we test against an expected output.  
(The GNU implementation is not a spec.)

>   i've traced the source of that difference to these files:
>
>   libunarchive/
> 	header_list.c
> 	header_verbose_list.c
>
> header_list.c:
> --------------
>
> void header_list(const file_header_t *file_header)
> {
>         puts(file_header->name);
> }
>
> header_verbose_list.c:
> ----------------------
>
> void header_verbose_list(const file_header_t *file_header)
> {
>         struct tm *mtime = localtime(&(file_header->mtime));
>
>         printf("%s %d/%d%10u %4u-%02u-%02u %02u:%02u:%02u %s",
>                 bb_mode_string(file_header->mode),
>                 file_header->uid,
>                 file_header->gid,
>                 (unsigned int) file_header->size,
>                 1900 + mtime->tm_year,
>                 1 + mtime->tm_mon,
>                 mtime->tm_mday,
>                 mtime->tm_hour,
>                 mtime->tm_min,
>                 mtime->tm_sec,
>                 file_header->name);

This isn't shared code with ls -l, I take it?

>         if (file_header->link_name) {
>                 printf(" -> %s", file_header->link_name);
>         }
>         /* putchar isn't used anywhere else, I don't think */
>         puts("");
> }
>
>
>   note the absolutely fundamental weakness here -- both of those
> routines simply print using a specific format, which means that, if
> you wanted to change the output format, you'd have to hack those
> routines, which would affect every other place from which they were
> invoked.

Such as ls -l.

>   it would seem that a much more flexible approach would be that, if
> you had a routine that was supposed to print some information, it
> should call a *formatting* routine whose job it was to generate the
> corresponding output string *to be printed*.
>
>   thus, header_list() wouldn't simply
>
> 	puts(file_header->name) ;
>
> it would call, perhaps, format_file_name(file_header->name), and would
> get in return the properly-formatted string ready for printing.  such
> a formatting routine might take an extra flag, specifying whether the
> caller wanted GNU-compatible behaviour or not.

Which is smaller code?
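
To make the comparison concrete, here is a rough sketch of the kind of routine being proposed.  Everything in it is an assumption for illustration: the name format_file_name() comes from the proposal above, but the signature, the is_dir parameter, and the FMT_* flags are hypothetical and not part of busybox.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags: FMT_GNU_COMPAT asks for the GNU-style trailing slash. */
enum { FMT_PLAIN = 0, FMT_GNU_COMPAT = 1 };

/*
 * Hypothetical formatting routine: copy name into buf and, in
 * GNU-compatible mode, append '/' to directory names that lack one.
 * Returns buf, or NULL if the result would not fit in len bytes.
 */
char *format_file_name(char *buf, size_t len, const char *name,
                       int is_dir, int flags)
{
    size_t n;
    int add_slash;

    assert(buf != NULL && name != NULL);

    n = strlen(name);
    add_slash = (flags & FMT_GNU_COMPAT) && is_dir
                && (n == 0 || name[n - 1] != '/');

    /* Room for the name, the optional slash, and the terminating NUL. */
    if (n + (size_t)add_slash + 1 > len)
        return NULL;

    memcpy(buf, name, n);
    if (add_slash)
        buf[n++] = '/';
    buf[n] = '\0';
    return buf;
}
```

header_list() would then call puts() on the returned buffer instead of on file_header->name directly.  Compare that to the existing one-line puts() to see what the extra layer costs in size.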

>   i think it's pretty much essential that, if there are going to be
> possible variations in output format, you're going to have to
> introduce this kind of intermediate formatting routine which, given a
> data structure of some kind, will return a string representing its
> printable format.  (essentially, you're defining a "stringify"
> operation for some data structures.)
>
>   thoughts?

Show a real-world use for this infrastructure.  Busybox is not full of 
infrastructure waiting to be used "just in case".  Our primary goal is 
"small".  Simple, fast, flexible...  All those jockey for position after 
"small".

Rob
-- 
Never bet against the cheap plastic solution.


