[BusyBox] [patch] Making micro-bunzip play nice with tar...

Rob Landley rob at landley.net
Thu Oct 16 04:07:21 UTC 2003

My bug from last night was a missing ! operator. :)

Okay, here's the patch against micro-bunzip to give it a "read the next x 
bytes" call.

Look at uncompressStream to see a normal decompression session.  First call 
start_bunzip(&bd,in_fd) to allocate/init bunzip_data and read the file 
header.  Then loop calling read_bunzip_data(bd) to fill the intermediate 
buffer and write_bunzip_data to get data out.  If you call 
write_bunzip_data(bd,out_fd,0,0) it'll flush the entire intermediate buffer 
to the filehandle.

If you call write_bunzip_data(bd,-1,bufptr,length) it'll write up to length 
bytes into bufptr, returning the number of bytes it gave you.  If that value 
is less than you asked for, you have to call read_bunzip_data(bd) to fill the 
buffer again, then call 
write_bunzip_data(bd,-1,bufptr+already,length-already) to ask for the 
remaining bytes.  read_bunzip_data will tell you when you hit the end of the 
file by returning RETVAL_LAST_BLOCK; it's up to you to remember to check the 
file's CRC.  (If write_bunzip_data returns RETVAL_LAST_BLOCK it means the 
block CRC failed, but it will have altered the file CRC to force a failure 
during that comparison, so you can just treat them both the same...)
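
For illustration only, here is a rough sketch of the "ask again for the 
remaining bytes" loop described above.  It does not use the real functions 
from the patch: mock_bunzip, mock_read_block and mock_write_data are 
hypothetical stand-ins (with a tiny 4-byte intermediate buffer) so the 
example compiles on its own, and MOCK_LAST_BLOCK is an assumed value.  The 
helper name read_exactly is also made up.  Negative returns are treated as 
errors.

```c
#include <assert.h>
#include <string.h>

#define MOCK_LAST_BLOCK (-1)  /* assumed stand-in for RETVAL_LAST_BLOCK */

/* Hypothetical stand-in for bunzip_data: hands out src in 4-byte "blocks". */
typedef struct {
    const char *src;      /* pretend decompressed stream */
    size_t src_len;
    size_t src_pos;       /* bytes already moved into blocks */
    char block[4];        /* tiny intermediate buffer */
    size_t block_len;
    size_t block_pos;
} mock_bunzip;

/* Stand-in for read_bunzip_data: refill the intermediate buffer. */
static int mock_read_block(mock_bunzip *bd)
{
    size_t n = bd->src_len - bd->src_pos;
    if (n == 0) return MOCK_LAST_BLOCK;
    if (n > sizeof bd->block) n = sizeof bd->block;
    memcpy(bd->block, bd->src + bd->src_pos, n);
    bd->src_pos += n;
    bd->block_len = n;
    bd->block_pos = 0;
    return 0;
}

/* Stand-in for write_bunzip_data(bd,-1,out,len): copy out up to len bytes. */
static int mock_write_data(mock_bunzip *bd, char *out, int len)
{
    size_t avail = bd->block_len - bd->block_pos;
    size_t n = (size_t)len < avail ? (size_t)len : avail;
    memcpy(out, bd->block + bd->block_pos, n);
    bd->block_pos += n;
    return (int)n;
}

/* Keep asking until len bytes arrive or the stream ends.  Returns the
   number of bytes delivered (short only at end of stream) or <0 on error. */
static int read_exactly(mock_bunzip *bd, char *buf, int len)
{
    int already = 0;

    assert(buf != NULL && len >= 0);
    while (already < len) {
        int got = mock_write_data(bd, buf + already, len - already);
        if (got < 0) return got;              /* error from the writer */
        already += got;
        if (already == len) break;            /* caller has what it wanted */
        int rc = mock_read_block(bd);         /* buffer ran dry: refill */
        if (rc == MOCK_LAST_BLOCK) break;     /* end of stream */
        if (rc < 0) return rc;                /* any other error */
    }
    return already;
}
```

A caller such as tar would presumably run a loop of this shape against the 
real read_bunzip_data/write_bunzip_data pair, and check the file CRC once 
the end of the stream is reported.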

Notice that all error return values are now negative, so if write_bunzip_data 
returns anything <0 that means it got an error.  Zero means read the next 
block.  Positive means you got data, although you may need to read the next 
block anyway if you didn't get ENOUGH data...
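
The return value convention above could be spelled out roughly like this.  
This is only a sketch of the three cases as described in this message; the 
enum and the after_write helper are hypothetical names, not part of the 
patch.

```c
#include <assert.h>

/* Hypothetical labels for what the caller should do next. */
enum bunzip_next {
    NEXT_FAIL,        /* negative return: an error occurred */
    NEXT_READ_BLOCK,  /* zero: nothing buffered, read the next block */
    NEXT_TOP_UP,      /* got some data, but fewer bytes than requested */
    NEXT_DONE         /* got everything that was asked for */
};

/* Classify a write_bunzip_data-style return value against the request. */
static enum bunzip_next after_write(int ret, int wanted)
{
    assert(wanted > 0);
    if (ret < 0) return NEXT_FAIL;
    if (ret == 0) return NEXT_READ_BLOCK;
    if (ret < wanted) return NEXT_TOP_UP;
    return NEXT_DONE;
}
```

NEXT_READ_BLOCK and NEXT_TOP_UP both lead to the same action (read another 
block, then ask again); they are kept separate here only to mirror the 
wording above.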

Glenn: is this what you wanted for tar?


P.S. There are a few other cleanups in this patch, including a namespace 
change of everything to have bunzip in the name since I made several things 
extern so tar could use them.  One of the cleanups eliminated "put_byte" and 
put it inline in the one place it was used.  Inlining put_byte sped up the 
code.  We're now within one second of bunzip decompressing the linux-kernel 
tarball on my laptop (and depending how the cache alignment works out we're 
sometimes a second or two faster).  The down side of all this is it added 320 
bytes to the resulting executable size, so the -Os stripped version is now 
6824 bytes.  But I suppose I can live with that...

P.P.S.  If you ever want to decompress stuff from memory, that's what the new 
arguments to start_bunzip are for.  Feed it a pointer to a buffer, and a 
length, and it'll decompress from there instead of from a file.  (Feed it -1 
for the file handle.)  The buffer must contain a complete bzip2 file; it 
can't ask for more when it hits the end.  (Well, it'll try to read IOBUF_SIZE 
bytes 
from the file handle you fed it to refill the buffer, which might actually 
work under the right circumstances... :)  The reason for this is I was 
thinking about the old "bzip compressed linux kernel" patch, and that this 
engine is much nicer for doing that... :)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bunzip-2.patch
Type: text/x-diff
Size: 9312 bytes
Desc: not available
Url : http://lists.busybox.net/pipermail/busybox/attachments/20031015/efdad3c4/attachment.bin 
