[PATCH] Compress common substrings in applet names to save space

Jody Bruchon jody at jodybruchon.com
Wed Apr 20 12:58:39 UTC 2016


On 2016-04-20 08:03, Denys Vlasenko wrote:
 > On Wed, Apr 20, 2016 at 12:55 PM, Guillermo Rodriguez Garcia
 > <guille.rodriguez at gmail.com> wrote:
 >>>> When strings are abbreviated the new find_applet_by_name() is about
 >>>> 34% slower than currently.  (All comparisons are for the default
 >>>> configuration.)
 >>>
 >>> Crazy :D
 >>
 >> Is this really worth it? 34% slower may have a visible impact if 
applets are
 >> executed frequently. And then there's the added complexity. All of 
that for
 >> just some 500 bytes ?
 >
 > In busybox land, 500 bytes are not small.
 >
 > However, I also think that this idea is slightly too insane...

It is definitely interesting. The problem from my perspective is that 
the potential performance drop is way too high. On platforms where 500 
bytes matter, chances are that CPU power is also very limited. This 
increases the fork cost when any BusyBox applet is invoked, meaning 
shell scripts will be much slower.

Some thoughts:

How about integrating the binary search speed increase but leaving out 
the name compression for now?

What about hashing the names and binary searching a fixed-width hash 
table instead? If interested, I created a computationally "cheap" hash 
algorithm with a very low collision rate and may help (feel free to 
re-license under GPL as needed): https://github.com/jbruchon/jodyhash

Would it be useful to have a compile-time option to cache the last few 
applets executed so that shell scripts that call the same applets 
repetitively will go faster?


More information about the busybox mailing list