[BusyBox] Start of bunzip2 cleanup.

Rob Landley rob at landley.net
Sun Oct 5 08:40:29 UTC 2003

In preparation for adding bzip compression support, I've been reading through 
the bzip2 support, and decided cleaning it up would be a good thing.  (If 
nothing else, the gratuitous malloc and free of three different structures 
for every block of compressed data is just evil.)

My original goal was to avoid touching anything in BZ2_bzDecompress() and up, 
and just clean up the wrappers around them (as Glenn already started to do).  
But then I hit the tangle of incestuous pointers between DState, bz_stream, 
and bzFile.

if bzFile goes away and becomes local variables in uncompressStream (which 
it's sort of crying out for), then the strm member of DStream should 
logically be an inline bz_stream structure rather than a pointer to one, 
which also allows the state member of bz_stream to go away since that points 
to the DState it lives in.

Each DState has exactly one bz_stream associated with it, and each bz_stream 
has exactly one DState.  (Each bzFile has one of each, but that's an optional 
wrapper.)  The fact bz_stream and DState are seperately allocated and have 
pointers to each other is a bad sign.

This patch untangles them.  (It's not really intended to reduce code size in 
and of itself, just make further cleanups possible.)  I'd appreciate some 
additional eyeballs confirming I didn't screw up, since I'm basically 
touching high-voltage code here. :)

With this patch, bz_stream becomes the first member of DState (I.E. no longer 
allocated separately), several circular pointer groupings are turned into 
trees, and all the references that need to be adjusted because of this are 

All this is just basically paving the way for further cleanup.  I fixed up 
code in this patch I intend to rip out soon.  But this is the minimal patch 
that needs to touch things in BZ2_bzDecompress()  and above, and does just 
the adjustment there that's necessary and very little else.  (Okay, one 
whitespace cleanup, the removal of a duplicate #define, and adding the vi 
tabstop comment to the top of the file.  But that's it.)

The really annoying part of this is that the argument passed to 
BZ2_bzDecompress() becomes a DState pointer instead of a bz_stream pointer.  
(I could continue to pass the bz_stream pointer and just typecast it, since 
it's the first member, but that's evil.)  Since bz_stream already had to have 
a member intialized to point to a valid DState anyway, this isn't really 
anything more than a cosmetic change.  But it's still annoying, since this is 
part of the library api bzip2 documents.  (Admittedly, in our code it's a 
static function, but still.  It's the principle of the thing.)

This patch compiled and had no errors doing:

./busybox bzcat ../../newfirmware/base/linux-2.6.0-test6.tar.bz2 | tar tv

The rest of the cleanup I currently have planned shouldn't touch anything in 
bzDecompress or above, just in the wrapper around bzDecompress.  (I.E. 
bzReadOpen, bzReadClose, and read_bz2 can all be merged into uncompressStream 
and somewhere between 1/2 and 2/3 of the resulting mess can be just deleted.)

Note: the void * removed from bz_stream was actually a DState pointer, they 
just declared their structures in the wrong order and didn't have the type 
available yet.  I moved bzFile down past DState for a similar reason, 
although the only actual change to bzFile was swapping the DState * for a 
bz_stream pointer (and a name change to force the compiler to complain about 
any code that used it that I didn't see).

The entire rest of the patch is fallout from this.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bzip2.patch
Type: text/x-diff
Size: 5681 bytes
Desc: not available
Url : http://lists.busybox.net/pipermail/busybox/attachments/20031005/552c1770/attachment.bin 

More information about the busybox mailing list