[BusyBox] [patch] micro-bunzip version 3. :)

Rob Landley rob at landley.net
Fri Oct 17 05:19:03 UTC 2003


On Thursday 16 October 2003 22:01, Manuel Novoa III wrote:
> Hello Rob,
>
> Here's a patch to your bunzip-3.c file.  Nice work btw.

Thanks.

I looked over the patch and it all looks good.  I erred on the side of 
simplicity/clarity a few times while unraveling the code (after a while, I 
just wanted to see what it was doing, very explicitly. :)

The memset was leftover from earlier, and the memmove was me punting after 
getting frustrated by the "compiler bug".  I particularly like GET_A_BIT; 
making a special case out of single bit gets is a good solution to inlining 
that sucker.  (I'd thought about inlining it somehow, but hadn't come up with 
a solution.)
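
For readers without the patch at hand, here is a minimal sketch of the idea of
special-casing single-bit reads.  It is not the GET_A_BIT macro from the patch:
the struct, its fields, get_bits() and GET_ONE_BIT are all hypothetical names
made up for illustration.  The general routine stays out of line, and the
macro only falls back to it when the accumulator is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader; names are illustrative, not from bunzip-3.c. */
struct bit_reader {
    const uint8_t *in;  /* input bytes */
    size_t len;         /* number of input bytes */
    size_t pos;         /* index of the next byte to load */
    uint32_t bits;      /* accumulator; the low 'count' bits are valid */
    int count;          /* number of unread bits in the accumulator */
};

/* General path: return the next n bits (1..24), most significant first.
 * Reads past the end of input are treated as zero bits in this sketch. */
static uint32_t get_bits(struct bit_reader *br, int n)
{
    assert(n >= 1 && n <= 24);
    while (br->count < n) {
        uint32_t byte = br->pos < br->len ? br->in[br->pos++] : 0;
        br->bits = (br->bits << 8) | byte;
        br->count += 8;
    }
    br->count -= n;
    return (br->bits >> br->count) & ((1u << n) - 1);
}

/* Single-bit fast path: if a bit is already buffered, peel it off inline;
 * otherwise defer to the general routine to refill. */
#define GET_ONE_BIT(br) \
    ((br)->count > 0 \
        ? (((br)->bits >> --(br)->count) & 1u) \
        : get_bits((br), 1))
```

The point of the split is that the common case (a bit is already in the
accumulator) costs a compare, a decrement, a shift and a mask, with no
function call.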

I'm currently working on a version that moves outbuf handling into 
uncompressStream, and makes write_bunzip_data always use a supplied output 
buffer and length.  This removes an unnecessary copy, avoids allocating 4k of 
output buffer when we don't need it, and makes everything use the same code 
path to flush out any hidden bugs.  (For one thing, the CRC calculation has 
some funky corner cases depending on where the buffering interrupts it.  This 
doesn't show up with the file-handling buffer flush in the middle of the loop, 
because the calculation doesn't get interrupted there, but tar could easily 
care about this...)
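
A rough sketch of the shape described above, under stated assumptions: the
caller supplies the buffer and length, and the run position and running CRC
live in the state so a call can stop anywhere and pick up again.  None of this
is the actual bunzip code.  struct out_state, out_init(), write_data() and
drain_crc() are hypothetical, and the (byte, count) pairs merely stand in for
decoded runs.  The CRC is the MSB-first CRC-32 with polynomial 0x04c11db7,
computed bit by bit rather than with a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output state; names are illustrative, not from bunzip-3.c. */
struct out_state {
    const uint8_t *runs; /* (byte, repeat count) pairs standing in for decoded data */
    size_t nruns;        /* number of pairs */
    size_t cur;          /* index of the run being emitted */
    unsigned left;       /* copies of the current run still to emit */
    uint32_t crc;        /* running CRC, carried across calls */
};

/* MSB-first CRC-32 step, polynomial 0x04c11db7, one byte at a time. */
static uint32_t crc_update(uint32_t crc, uint8_t b)
{
    crc ^= (uint32_t)b << 24;
    for (int i = 0; i < 8; i++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04c11db7u : crc << 1;
    return crc;
}

static void out_init(struct out_state *s, const uint8_t *runs, size_t nruns)
{
    s->runs = runs;
    s->nruns = nruns;
    s->cur = 0;
    s->left = nruns ? runs[1] : 0;
    s->crc = 0xffffffffu;
}

/* Fill the caller's buffer with up to len bytes; returns bytes written.
 * A run may be cut off when the buffer fills; 'left' and 'crc' remember
 * where to resume on the next call. */
static size_t write_data(struct out_state *s, uint8_t *out, size_t len)
{
    size_t n = 0;
    assert(out != NULL || len == 0);
    while (n < len && s->cur < s->nruns) {
        if (s->left == 0) {            /* current run finished: advance */
            if (++s->cur < s->nruns)
                s->left = s->runs[2 * s->cur + 1];
            continue;
        }
        uint8_t b = s->runs[2 * s->cur];
        out[n++] = b;
        s->left--;
        s->crc = crc_update(s->crc, b);
    }
    return n;
}

/* Drain everything in pieces of 'chunk' bytes and return the final CRC. */
static uint32_t drain_crc(const uint8_t *runs, size_t nruns, size_t chunk)
{
    struct out_state s;
    uint8_t buf[64];
    out_init(&s, runs, nruns);
    if (chunk == 0 || chunk > sizeof buf)
        chunk = sizeof buf;
    while (write_data(&s, buf, chunk) > 0)
        ;
    return ~s.crc;
}
```

Because the CRC is updated per byte emitted and stored in the state, draining
one byte at a time or 64 at a time should give the same result, which is the
property a caller like tar would depend on.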

If I stay awake long enough tonight, I'll try to get it debugged and get your 
patch integrated with what I've been doing.  And then I'm stepping away from 
it for a while, getting some sleep, and going off to work on the compression 
side, which means I may not surface again for two weeks... :)

> Anyway, on my machine, decompressing linux-2.6.0-test7.tar.bz2
> to /dev/null gave the following times:
>
>         bunzip-3.c    bzcat (system)   bunzip-3.c (patched)
> real    0m24.420s     0m22.725s        0m20.701s
> user    0m23.930s     0m22.170s        0m20.180s
> sys     0m0.070s      0m0.080s         0m0.140s
>
> Size of the patched version is comparable (slightly larger or
> smaller depending on compiler flags).

Cool.  Speed was the one potential reason to still use the old code. :)

I say put this patch into CVS now, and when I get my next version ready I'll 
prepare a patch against the CVS version.

> Manuel

Rob

(P.S.  In the version I'm banging on now, I've simplified the license to just 
LGPL.  I read the OSL a bit more closely and the patent termination clause 
would have bitten IBM in their countersuit against SCO if the code in question 
had been OSL instead of GPL, and I've decided I just don't want to beta-test 
legal code right now.  Unless I'm terminally sleep deprived (which is a 
distinct possibility), you're the only other contributor to this so far 
(except sort of jseward, who just wants the attribution paragraph kept intact 
and otherwise doesn't much care about the license).  Just making sure that, 
as a contributor, you're okay with this... :)


