[BusyBox] Weird, long-term thought on message strings

Manuel Novoa III mnovoa3 at bellsouth.net
Wed Jan 31 21:17:31 UTC 2001


On Wed, 31 Jan 2001, Larry Doolittle wrote:
> Run the following perl program on busybox/*.c
>
(stuff snipped)
>
> $ perl $ABOVE *.c | wc
>     548    1960   12578
> $ perl $ABOVE *.c | gzip -c | wc
>      24     112    4419
> 
> This gives a rough picture of the _possible_ savings.
> Actually _implementing_ message text compression in
> a clean, maintainable way may not be practical.
> 
> OTOH, if the concept were kept in mind during an
> internationalization effort, someone might come up
> with a breakthrough.

This is related to a couple of things I've played around with.  One is the
patch I posted yesterday about changing how usage strings are dealt with.  The
next step with that is looking at compressing them.  I think I have a way of
hooking into unzip without forking or allocating a buffer for the entire usage
string set in order to extract the usage string needed.  It still requires
unzipping the entire set of usage strings though.  Unfortunately, I won't be
able to look at this again until late next week, as I'll be away from my
computer.
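
To illustrate the idea of pulling one usage string out of a compressed set
without buffering the whole set, here is a minimal sketch in C.  It is not the
actual patch: the byte_source callback is a hypothetical stand-in for whatever
hook into unzip yields decompressed bytes one at a time, and the assumed layout
(usage strings stored back to back, each NUL-terminated) is an assumption too.
The mem_source helper exists only so the sketch can be exercised without a real
decompressor.

```c
#include <stddef.h>

/* Hypothetical byte source: returns the next decompressed byte (0..255),
 * or -1 at end of stream.  A real version would wrap the unzip code. */
typedef int (*byte_source)(void *state);

/* Stand-in source that reads from memory, for demonstration only. */
struct mem_source {
    const char *p;
    size_t left;
};

static int mem_next(void *state)
{
    struct mem_source *m = state;
    if (m->left == 0)
        return -1;
    m->left--;
    return (unsigned char)*m->p++;
}

/* Copy the index-th NUL-terminated string of the stream into out, which
 * holds cap bytes.  Only the wanted string is stored; everything before it
 * is decompressed and thrown away.  Returns the number of characters
 * stored, or -1 if the stream ends before the string is reached. */
long nth_string(byte_source next, void *state, unsigned index,
                char *out, size_t cap)
{
    size_t n = 0;
    int c;

    if (cap == 0)
        return -1;

    /* Skip the strings in front of the one we want. */
    while (index > 0) {
        c = next(state);
        if (c < 0)
            return -1;
        if (c == 0)
            index--;
    }

    /* Copy up to the terminating NUL, truncating if out is too small. */
    while ((c = next(state)) > 0) {
        if (n + 1 < cap)
            out[n++] = (char)c;
    }
    out[n] = '\0';

    return (c < 0 && n == 0) ? -1 : (long)n;
}
```

The only buffer needed is the one for the requested string, but the
decompressor still has to run through everything stored ahead of it, which is
the cost mentioned above.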

The other thing is something I was looking at for uClibc;  I coded a trivial
text compressor that typically saves about 30% on normal text and for which the
decompressor plus data table is < 200 bytes.  (Size of the decompressor was
important since there wasn't much data to compress in that case.)  The savings
are less than gzip, but I could decompress each string individually (something
that is important here but not necessarily for the usage messages).  I had put
it on the shelf though in order to do some other work on uClibc.
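
The scheme itself is not described here, so the following is only a sketch of
one way a compressor with a tiny decompressor and per-string decompression
could look, not the uClibc code.  It assumes 7-bit ASCII input and replaces
common character pairs with single bytes >= 0x80; the PAIRS table and the
function names are made up for illustration, and a real table would be chosen
from the text being compressed.

```c
#include <stddef.h>

/* Hypothetical pair table: the byte 0x80 + i expands to PAIRS[i]. */
static const char PAIRS[][2] = {
    {'e', ' '}, {'t', 'h'}, {' ', 't'}, {'i', 'n'},
    {'e', 'r'}, {'s', ' '}, {'o', 'n'}, {'a', 'n'}
};
#define NPAIRS (sizeof(PAIRS) / sizeof(PAIRS[0]))

/* Compress a NUL-terminated 7-bit ASCII string.  out must hold at least
 * strlen(in) + 1 bytes.  Returns the compressed length, excluding the NUL. */
size_t pair_compress(const char *in, unsigned char *out)
{
    size_t n = 0;

    while (*in) {
        size_t i;

        /* Greedy match: look for a table entry at the current position.
         * in[1] may be the terminating NUL, which never matches. */
        for (i = 0; i < NPAIRS; i++) {
            if (in[0] == PAIRS[i][0] && in[1] == PAIRS[i][1])
                break;
        }
        if (i < NPAIRS) {
            out[n++] = (unsigned char)(0x80 + i);
            in += 2;
        } else {
            out[n++] = (unsigned char)*in++;
        }
    }
    out[n] = 0;
    return n;
}

/* Expand one compressed string.  out must hold twice the compressed length
 * plus one.  Each string is self-contained, so any one of them can be
 * expanded without touching the others. */
size_t pair_decompress(const unsigned char *in, char *out)
{
    size_t n = 0;

    for (; *in; in++) {
        size_t i = (size_t)(*in - 0x80);

        if (*in >= 0x80 && i < NPAIRS) {
            out[n++] = PAIRS[i][0];
            out[n++] = PAIRS[i][1];
        } else {
            out[n++] = (char)*in;   /* literal byte */
        }
    }
    out[n] = 0;
    return n;
}
```

With this table, "the thing" (9 bytes) compresses to 5 bytes and expands back
to the original.  The decompressor is a single loop plus the table, which is
the property that matters when there is little data to compress.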

Something like that (de)compressor could be applied in this case.  I actually
tried it on the usage message strings (in isolation) about two weeks ago, but
shelved it because while the size of the executable decreased, the size of the
gzip-compressed executable increased (because the internally used compressor was
much less efficient).  That would be an issue for people putting busybox on
compressed RAM disks and such.  I requested feedback about such issues when I
posted yesterday, but Erik is the only one who replied.  Anyway, this is
something I'll probably look into when I get back.

Manuel




