[BusyBox] Re: Where's the bzip2 compressed linux-kernel patch?
Jörn Engel
joern at wohnheim.fh-wedel.de
Wed Dec 17 13:46:00 UTC 2003
On Wed, 22 October 2003 06:13:50 -0500, Rob Landley wrote:
> On Tuesday 21 October 2003 08:15, Jörn Engel wrote:
>
> > 3.1. "aaaaaaaaaaaaaaaaaaaa..." - all characters identical - simple to
> > catch, not a real problem.
> > 3.2. "aaaaaaaaaaaaaaaaaaaa...aaab" - a single character different.
> > This is much harder to catch and optimize. It is also a very
> > real problem, as many binaries have large ranges of 0x00, so this
> > needs to be fast.
> > 3.3. "abcdefghijklmnopq...abcdefghijklmnopq..." - harmless string
> > concatenated to itself. Also causes O(n^2) behaviour, but needs
> > a completely different solution.
> > 3.4. "a" -> "aba" -> "abacaba" -> "abacabadabacaba" - nasty
> > combinations of 3.2 and 3.3. Special treatment for either one
> > causes you to catch the other worst case here. Hard.
>
> Looking at the block sorting machinery, I've largely been trying to see if I
> could throw any of it away without losing too much speed, or at least
> refactor it to make the fallback order a bit more readily apparent. I don't
> expect to be able to fundamentally improve on the sorting theory end of
> things without spending several months in the library, which I haven't got
> time for just now.
You might be able to improve things very quickly. Someone found this
nice piece of code:
http://www.cs.dartmouth.edu/~doug/sarray/sarray.tar
Featuring, among other things, sorting the suffix array in O(n*log(n))
for *any* data. Haven't looked at the code closely yet, but I guess
it uses the Larsson-Sadakane Algorithm.
Alternatively, you could look at this paper:
http://citeseer.nj.nec.com/567924.html
Worst case performance should be about O(n*log(n)*n^x). You can tune
x down below .5, maybe close to .1 or so. Big feature is memory
consumption just slightly above 5n.
Jörn
--
There's nothing better for promoting creativity in a medium than
making an audience feel "Hmm I could do better than that!"
-- Douglas Adams in a slashdot interview
More information about the busybox
mailing list