[PATCH] have tar with GNU extensions use base-256 encoding for large fields

Denys Vlasenko vda.linux at googlemail.com
Wed May 4 19:11:35 UTC 2011

On Tue, May 3, 2011 at 1:56 AM, Ian Wienand <ianw at vmware.com> wrote:
> Hi,
> Currently if busybox tar encounters a negative time_t on a file, it
> just puts the sign-extended value into the tar file (see [1] where it
> says "Portable file timestamps cannot be negative").  I think it's best
> to leave the standard tar POSIX-ish; so this change gives a warning
> when a negative timestamp is seen, and leaves the timestamp as zero.

All this trouble just for the single case where we store time?

#define PUT_OCTAL(a, b) putOctal((a), sizeof(a), (b))
	PUT_OCTAL(header.size, size);
	PUT_OCTAL(header.mode, statbuf->st_mode & 07777);
	PUT_OCTAL(header.uid, statbuf->st_uid);
	PUT_OCTAL(header.gid, statbuf->st_gid);
	PUT_OCTAL(header.mtime, statbuf->st_mtime);  <========HERE
		PUT_OCTAL(header.devmajor, major(statbuf->st_rdev));
		PUT_OCTAL(header.devminor, minor(statbuf->st_rdev));
		PUT_OCTAL(header.devmajor, major(statbuf->st_rdev));
		PUT_OCTAL(header.devminor, minor(statbuf->st_rdev));
		PUT_OCTAL(header.size, statbuf->st_size);

It's much easier to just pass statbuf->st_mtime >= 0 ? statbuf->st_mtime : 0 there.
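
A minimal sketch of that simpler fix, written as a standalone helper. The name clamp_mtime() is hypothetical and does not exist in busybox; the patch would more likely inline the expression at the PUT_OCTAL(header.mtime, ...) call.

```c
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper, not part of busybox: a negative timestamp
 * is stored as 0 so the octal field never sees a sign-extended value. */
static time_t clamp_mtime(time_t mtime)
{
	return mtime >= 0 ? mtime : 0;
}
```

With such a helper the call site would read PUT_OCTAL(header.mtime, clamp_mtime(statbuf->st_mtime)).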

> However, when GNU extensions are turned on, it seems the best thing to
> do is use base-256 encoding to represent the timestamp.

I don't think that there are people who really need reliable handling
of negative mtime. And positive overflow in the octal string stored in the
char mtime[12] field will occur in year 4149 or so. I tend to settle for
a simpler fix.
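
The overflow estimate can be checked with a little arithmetic: twelve octal digits hold values below 8^12 = 2^36 seconds, which is roughly 2177 years after 1970. The sketch below assumes all 12 bytes of the field carry digits (no trailing NUL) and uses the mean Gregorian year of 31556952 seconds; the function name is hypothetical.

```c
#include <stdint.h>

/* Rough estimate, illustrative only: the year in which a Unix
 * timestamp stops fitting into `digits` octal digits. */
static unsigned octal_overflow_year(unsigned digits)
{
	uint64_t limit = (uint64_t)1 << (3 * digits); /* 8^digits seconds */
	return 1970 + (unsigned)(limit / 31556952u);   /* mean Gregorian year */
}
```

For 12 digits this lands in the 4140s, in line with the figure above. With 11 digits plus a trailing NUL the limit would be far closer, around year 2242.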

> This also has
> the advantage that we can easily use this encoding for the file size
> too.  base-256 encoded fields are represented by having their top byte
> with the top bit set, and no trailing NULL, and then the actual value
> base-256 encoded obviously.

I know. We already handle base-256 encoded fields on unpacking.
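
For reference, a rough sketch of what decoding such a field involves. This is not the busybox unpacking code; decode_base256() is a hypothetical name, and the sketch assumes the layout described above: marker in the top bit of the first byte, sign in the next bit, big-endian two's complement after that.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder, not the busybox implementation.
 * Returns 0 and stores the value if the field is base-256 encoded
 * and fits in int64_t; returns -1 otherwise. */
static int decode_base256(const unsigned char *field, size_t len, int64_t *out)
{
	if (len == 0 || !(field[0] & 0x80))
		return -1; /* marker bit clear: field is ordinary octal text */

	int negative = (field[0] & 0x40) != 0;
	uint64_t expect_top = negative ? 0xff : 0x00;
	uint64_t v = negative ? ~(uint64_t)0 : 0;

	for (size_t i = 0; i < len; i++) {
		unsigned char b = field[i];
		if (i == 0) {
			/* Replace the marker bit with a copy of the sign bit */
			b &= 0x7f;
			if (negative)
				b |= 0x80;
		}
		/* The byte shifted out must be pure sign extension */
		if ((v >> 56) != expect_top)
			return -1;
		v = (v << 8) | b;
	}
	/* The final sign bit must agree with the encoded sign */
	if (((v >> 63) != 0) != negative)
		return -1;

	*out = (int64_t)v; /* assumes two's complement conversion */
	return 0;
}
```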

So, we are talking about this code:

                if (sizeof(statbuf->st_size) > 4
                 && statbuf->st_size > (off_t)0777777777777LL
                ) {
                        bb_error_msg_and_die("can't store file '%s' "
                                "of size %"OFF_FMT"u, aborting",
                                fileName, statbuf->st_size);
                }
                header.typeflag = REGTYPE;
                PUT_OCTAL(header.size, statbuf->st_size);

I propose the following code to use base-256 encoding there:

                /* header.size field is 12 bytes long */
                /* Does octal-encoded size fit? */
                uoff_t filesize = statbuf->st_size;
                if (sizeof(filesize) <= 4
                 || filesize <= (uoff_t)0777777777777LL
                ) {
                        PUT_OCTAL(header.size, filesize);
                }
                /* Does base256-encoded size fit?
                 * It always does unless off_t is wider than 64 bits.
                 */
                else if (ENABLE_FEATURE_TAR_GNU_EXTENSIONS
#if ULLONG_MAX > 0xffffffffffffffffLL /* 2^64-1 */
                 && (filesize <= 0x3fffffffffffffffffffffffLL)
#endif
                ) {
                        /* GNU tar uses "base-256 encoding" for very large numbers.
                         * Encoding is binary, with highest bit always set as a marker
                         * and sign in next-highest bit:
                         * 80 00 .. 00 - zero
                         * bf ff .. ff - largest positive number
                         * ff ff .. ff - minus 1
                         * c0 00 .. 00 - smallest negative number
                         */
                        char *p8 = header.size + sizeof(header.size);
                        do {
                                *--p8 = (uint8_t)filesize;
                                filesize >>= 8;
                        } while (p8 != header.size);
                        *p8 |= 0x80;
                } else {
                        bb_error_msg_and_die("can't store file '%s' "
                                "of size %"OFF_FMT"u, aborting",
                                fileName, statbuf->st_size);
                }
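
The encoding loop can also be tried in isolation. The sketch below pulls it out into a standalone function; encode_base256() is a hypothetical name, the value is assumed to be non-negative, and the field is assumed to be at least 9 bytes long (12 for header.size) so that a 64-bit value never collides with the marker bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Standalone sketch of the loop in the proposed patch; not busybox code.
 * Writes `value` big-endian into field[0..len-1] and sets the marker bit. */
static void encode_base256(unsigned char *field, size_t len, uint64_t value)
{
	unsigned char *p = field + len;
	do {
		*--p = (unsigned char)value; /* least significant byte goes last */
		value >>= 8;
	} while (p != field);
	*p |= 0x80; /* top bit of the first byte marks base-256 */
}
```

For example, encoding a size of 2^33 bytes (8 GiB, the first size that no longer fits in 11 octal digits) into a 12-byte field should give 80 00 00 00 00 00 00 02 00 00 00 00.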

function                                             old     new   delta
writeTarHeader                                       841     979    +138

Is it acceptable?

