[BusyBox] Re: [PATCH] Re: Sigh... this time i attached the file (bunzip-4.c)

Rob Landley rob at landley.net
Thu Oct 23 06:46:56 UTC 2003

On Thursday 23 October 2003 01:39, Erik Andersen wrote:
> On Thu Oct 23, 2003 at 12:34:06AM -0500, Rob Landley wrote:
> > When reading from a filehandle, we can easily figure out how much we
> > overshot and lseek backwards a bit.  The only question is should this be
> > in the gzip/bzip library code, or in the code that calls it?
> Of course, that doesn't work nearly so well when that descriptor
> was created with pipe(2)...
>  -Erik

Well, yeah.

Hence the "when reading from a filehandle" bit, by which I really meant "when 
reading from a file".

Since we don't have an arbitrary length unget, there's not much else to do.  
They can copy the unused data out of our input buffer, which is even more 
idiosyncratic than lseek, or we can read bytes from the file as get_bits 
needs them (which would positively KILL our throughput)...
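
For illustration, a minimal sketch of the "lseek backwards" idea, with the 
pipe case Erik mentions handled by checking for ESPIPE.  The helper name 
unread_overshoot() and its "filled"/"used" parameters are hypothetical and 
are not taken from bunzip-4.c; it assumes the caller knows how many bytes 
it read into the input buffer and how many the decoder actually consumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from bunzip-4.c).
 * filled: bytes read into the input buffer so far
 * used:   bytes the decoder actually consumed
 * Returns 0 if the descriptor now points just past the compressed data,
 * 1 if fd cannot seek (pipe, socket, FIFO), -1 on any other error. */
static int unread_overshoot(int fd, size_t filled, size_t used)
{
    assert(used <= filled);
    off_t overshoot = (off_t)(filled - used);

    if (overshoot == 0)
        return 0;                       /* nothing to give back */
    if (lseek(fd, -overshoot, SEEK_CUR) != (off_t)-1)
        return 0;                       /* regular file: rewound */
    return (errno == ESPIPE) ? 1 : -1;  /* pipe: caller must cope */
}
```

A return value of 1 is the case where the caller would have to copy the 
leftover bytes out of the input buffer instead.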

This isn't really a new problem.  Generally, when reading from a pipe, they'll 
send _just_ the bzip data, and EOF at the end of it.  It might be nice if we 
had the option of returning how many bytes of input we read, but we're 
talking about a problem that isn't really new.  (The previous version read in 
5000 byte chunks too...)
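
A rough sketch of what "returning how many bytes of input we read" could 
look like, assuming a chunked input buffer feeding a bit reader.  The 
struct, its field names, and bytes_consumed() are made up for the example, 
and this get_bits() is a simplified stand-in rather than the one in 
bunzip-4.c.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative input state; names are assumptions, not bunzip-4.c's. */
struct bit_reader {
    const unsigned char *buf;   /* chunk already read from the input */
    size_t filled;              /* valid bytes in buf */
    size_t pos;                 /* next byte to load into "bits" */
    unsigned int bits;          /* pending bits, newest byte lowest */
    int nbits;                  /* count of valid pending bits */
};

/* Return the next n bits (1..24), MSB first, or -1 if buf runs dry. */
static long get_bits(struct bit_reader *br, int n)
{
    assert(n >= 1 && n <= 24);
    while (br->nbits < n) {
        if (br->pos == br->filled)
            return -1;
        br->bits = (br->bits << 8) | br->buf[br->pos++];
        br->nbits += 8;
    }
    br->nbits -= n;
    return (long)((br->bits >> br->nbits) & ((1u << n) - 1));
}

/* Bytes of input touched so far.  get_bits() leaves at most 7 bits
 * pending, so a partly used byte counts as consumed. */
static size_t bytes_consumed(const struct bit_reader *br)
{
    return br->pos;
}
```

The caller can subtract bytes_consumed() from the number of bytes it read 
to find the overshoot, whether it then seeks back or copies the remainder.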


