[PATCH 2/2] (g)unzip: Optimize inflate_codes()

Denys Vlasenko vda.linux at googlemail.com
Sat Feb 13 03:47:19 UTC 2010


On Friday 12 February 2010 00:46, Joakim Tjernlund wrote:
> Denys Vlasenko <vda.linux at googlemail.com> wrote on 2010/02/12 00:28:44:
> >
> > On Friday 12 February 2010 00:14, Joakim Tjernlund wrote:
> > >
> > > Denys Vlasenko <vda.linux at googlemail.com> wrote on 2010/02/11 23:52:30:
> > >
> > > > From: Denys Vlasenko <vda.linux at googlemail.com>
> > > > To: Joakim Tjernlund <joakim.tjernlund at transmode.se>
> > > > Cc: busybox at busybox.net, Rob Landley <rob at landley.net>
> > > > Date: 2010/02/11 23:52
> > > > Subject: Re: [PATCH 2/2] (g)unzip: Optimize inflate_codes()
> > > >
> > > > On Thursday 11 February 2010 08:42, Joakim Tjernlund wrote:
> > > > > >
> > > > > > Great. But looks like you forgot to send the patch without this debug.
> > > > > > The only patch I found has this:
> > > > > >
> > > > > > +                                       /* Align out addr */
> > > > > > +                                       if (e < 3)
> > > > > > +                                               fprintf(stderr, "error
> > > > len:%d\n", e);
> > > > >
> > > > > Hehe, here we go then. Looking at the gzip code I think it is crap though.
> > > > > The upstream gzip code is old and unoptimized. One should just scrap
> > > > > it and redo it with zlib instead.
> > > > >
> > >
> > > >
> > > > It segfaults on 277 Mb gz file (a source tree of old openoffice version):
> > >
> > > what arch?
> >
> > x86-32
> >
> > > >
> > > > # time ./busybox_old gunzip <OOo_2.0.2_src.tar.gz >/dev/null
> > > >
> > > > real    0m13.616s
> > > > user    0m13.493s
> > > > sys     0m0.116s
> > > > # time ./busybox gunzip <OOo_2.0.2_src.tar.gz >/dev/null
> > > > /bin/bash: line 1: 15635 Segmentation fault      ./busybox gunzip < OOo_2.0.
> > > > 2_src.tar.gz > /dev/null
> > > >
> > > > real    0m2.198s
> > > > user    0m2.186s
> > > > sys     0m0.011s
> > >
> > > Maybe you could stick that debug printout back?
> >
> > # ./busybox gunzip <OOo_2.0.2_src.tar.gz | md5sum
> > gunzip: error len:2
> > gunzip: error len:2
> > 85bff4b99d1516ace2e8d11bcc4718f5  -
> 
> I was afraid this could happen. Either this .gz is compressed with an old
> faulty gzip or my patch is buggy :(
> Could you try to recompress the archive with latest gzip?

Recompressed with gzip 1.3.12, which made archive bigger (!!?):

# ls -l OOo_2.0.2_src.tar.gz OOo_2.0.2_src.tar1.gz
-rw-r--r--    1 root     root     290748396 Feb 11 23:49 OOo_2.0.2_src.tar.gz
-rw-r--r--    1 root     root     290774340 Feb 13 04:28 OOo_2.0.2_src.tar1.gz

# ./busybox_old gunzip <../OOo_2.0.2_src.tar.gz | md5sum
a1a4e036091b51aa986e4745909e7fac  -
# ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz | md5sum
a1a4e036091b51aa986e4745909e7fac  -

Still segfaults:

# ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null
/bin/bash: line 1: 29005 Segmentation fault      ./busybox gunzip < ../OOo_2.0.2_src.tar1.gz > /dev/null


> I guess we must fix it either way, does this work for you?
> 
> From 39b4ff13430651bc000d542d0b9fc973449f84c8 Mon Sep 17 00:00:00 2001
> From: Joakim Tjernlund <Joakim.Tjernlund at transmode.se>
> Date: Fri, 12 Feb 2010 00:43:42 +0100
> Subject: [PATCH] unzip: need to guard against len < 3
> 
> ---
>  archival/libunarchive/decompress_unzip.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/archival/libunarchive/decompress_unzip.c b/archival/libunarchive/decompress_unzip.c
> index 111d7fc..e64b60d 100644
> --- a/archival/libunarchive/decompress_unzip.c
> +++ b/archival/libunarchive/decompress_unzip.c
> @@ -603,6 +603,8 @@ static NOINLINE int inflate_codes(STATE_PARAM_ONLY)
>  						gunzip_window[w++] = gunzip_window[dd++];
>  						--e;
>  					}
> +					if (e < 2)
> +						goto less_than_two;
>  					/* Use pre increment as it is faster on some arch's */
>  					sout = (unsigned short *) (gunzip_window + w - 2);
>  					if (delta > 2) {
> @@ -630,6 +632,7 @@ static NOINLINE int inflate_codes(STATE_PARAM_ONLY)
>  					}
>  					w += e & ~1;
>  					dd += e & ~1;
> +				less_than_two:
>  					if (e & 1)
>  						gunzip_window[w++] = gunzip_window[dd++];
>  				}

With this fix it works:

# ./busybox gunzip <../OOo_2.0.2_src.tar1.gz | md5sum
a1a4e036091b51aa986e4745909e7fac  -

I don't see any speed difference, though:

# (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.443s
# (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.380s
# (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.442s
# (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.477s
# (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.405s
# (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
real    0m13.379s

-- 
vda


More information about the busybox mailing list