[BusyBox] My cleaned up bunzip.

Rob Landley rob at landley.net
Tue Oct 14 20:00:54 UTC 2003


Okay, I need about one more day of polishing, but the attached bunzip.c file 
works if you do this:

gcc -Os bunzip.c -o bunzip
cat linux-2.6.0-test6.tar.bz2 | ./bunzip | tar tv

If anybody finds a bzipped file this doesn't work on, please tell me.  (If 
it's an "obsolete archive format" error I may not actually DO anything about 
it, but I still want to know. :)

It's about 15% slower than the original, but I think I know why: finding the 
current selector with a modulus (%) on every symbol is expensive, so I'll go 
back to incrementing counters.
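
For readers following along, here is a minimal sketch of the two approaches 
being compared.  It is not taken from the attached bunzip.c; the names 
GROUP_SIZE, selector_by_division, group_counter and selectors_agree are 
hypothetical, and the only outside fact assumed is that bzip2 switches Huffman 
tables every 50 symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: bzip2 picks a new Huffman table every 50 symbols. */
#define GROUP_SIZE 50

/* Approach 1: recompute the selector index from the symbol index.
 * This costs a division (or modulus) for every symbol decoded. */
static size_t selector_by_division(size_t symbol_index)
{
    return symbol_index / GROUP_SIZE;
}

/* Approach 2: keep a countdown and bump the selector when it hits zero. */
struct group_counter {
    size_t selector;  /* index into the selector list, valid after a step */
    int remaining;    /* symbols left before the next table switch */
};

static void group_counter_init(struct group_counter *gc)
{
    gc->selector = (size_t)-1;  /* unsigned wrap: the first step makes it 0 */
    gc->remaining = 0;
}

/* Call once per symbol; returns the selector index for that symbol. */
static size_t group_counter_step(struct group_counter *gc)
{
    if (gc->remaining == 0) {
        gc->selector++;
        gc->remaining = GROUP_SIZE;
    }
    gc->remaining--;
    assert(gc->remaining >= 0);
    return gc->selector;
}

/* Returns 1 if both approaches give the same selector for n symbols. */
static int selectors_agree(size_t n)
{
    struct group_counter gc;
    size_t i;

    group_counter_init(&gc);
    for (i = 0; i < n; i++) {
        if (group_counter_step(&gc) != selector_by_division(i))
            return 0;
    }
    return 1;
}
```

The counter version trades a division per symbol for a decrement and a 
compare.  Whether that recovers the whole 15% depends on the compiler and CPU, 
so treat it as something to measure rather than a guarantee.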

Ignore the Miss Piggy quote (i.e. the "I don't understand any of this" about 
undoing the Burrows-Wheeler transform).  I found an article explaining the 
Burrows-Wheeler transform (in Dr. Dobb's Journal, Sep 1996, article available 
at http://dogma.net/markn/articles/bwt/bwt.htm), and also the original DEC 
paper on the subject, which you can get as a PDF:
http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-124.html
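
As a rough illustration of what "undoing" the transform involves, here is a 
textbook-style sketch of the inverse.  It is not the code from the attached 
bunzip.c.  The function name inverse_bwt and its parameters are hypothetical, 
and it assumes the input is the last column of the sorted rotations plus a 
"primary" index giving the row where the original string landed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Undo a Burrows-Wheeler transform.
 *   in      - last column of the sorted rotation matrix, n bytes
 *   primary - row of the original string among the sorted rotations
 *   out     - receives the n original bytes
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
static int inverse_bwt(const unsigned char *in, size_t n, size_t primary,
                       unsigned char *out)
{
    size_t start[256];
    size_t *next;
    size_t i, sum, pos;

    if (n == 0)
        return 0;
    if (primary >= n)
        return -1;

    next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1;

    /* Count how often each byte value occurs. */
    memset(start, 0, sizeof start);
    for (i = 0; i < n; i++)
        start[in[i]]++;

    /* Turn the counts into the first sorted row for each byte value. */
    sum = 0;
    for (i = 0; i < 256; i++) {
        size_t count = start[i];
        start[i] = sum;
        sum += count;
    }

    /* Link each sorted row to the input position holding its first byte. */
    for (i = 0; i < n; i++)
        next[start[in[i]]++] = i;

    /* Walk the links from the primary row, emitting bytes in order. */
    pos = next[primary];
    for (i = 0; i < n; i++) {
        out[i] = in[pos];
        pos = next[pos];
    }

    free(next);
    return 0;
}
```

For example, the sorted rotations of "banana" have last column "nnbaaa" with 
the original string in row 3, so calling inverse_bwt() on "nnbaaa" with n = 6 
and primary = 3 gives back "banana".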

Right now, this toy just decompresses from stdin to stdout.  Compiled with -Os 
and stripped, the executable is 7496 bytes. :)

Now I've got to write up some documentation on how bunzip works, although I 
hope this source is readable.  All 560 lines of it. :)

Rob

P.S.  <evil>Bwahahahahahahahaha!</evil>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bunzip.c
Type: text/x-csrc
Size: 19345 bytes
Desc: not available
Url : http://lists.busybox.net/pipermail/busybox/attachments/20031014/8744de7b/attachment.c 