[BusyBox] Applet Hashing for Busybox

Mark Whitley markw at lineo.com
Fri Oct 20 23:16:05 UTC 2000


Fellow Busyboxers,

Erik has mentioned to me in the past that the current linear search approach
to finding applets in busybox.c leaves something to be desired: it's slow and
it gets slower as we add more applets. He mentioned in passing that it would
be great if I were to "do something about that". So here I go.

What I have added are some modifications and new code that will create a
pre-hashed list of applets such that lookup time at run-time is faster. The
cost is a little more memory used at run-time and a slightly longer
compilation phase as an auxillary program needs to be run to generate the
pre-hashed applet list. No new tool dependencies have been introduced; the
auxillary program uses shell and C to get its job done.

Points of note:

 - The applet listing in busybox.c (struc BB_applets = { ...) has been moved
   into it's own file 'mapplets.h' (short for 'Master Applets'). Why did I do
   this? Because...

 - The auxillary program 'hashapps.c' reads the mapplets.h file and generates
   applets.h from it, which contains a pre-hashed listing. At the top of this
   file is a #define for turning on some debug noise so you can see what it's
   doing.  

 - I used a double-hashing approach for building a fixed-size, open-addressed
   hash table (the contents of which are output to 'applets.h'). The alert
   tester will note that there are some "EMPTY" entries generated -- Do not
   lose any sleep over this.  To make the magic work, I need a hash table size
   that's a prime number. This means finding a prime number slightly larger
   than the number of applets #defined in Config.h, hence a little more memory
   used. A necessary evil, but I think the benefits are worth it (more
   explanation of the benefits below).

 - A new target for building 'hashapps' from 'hashapps.c' was added to the
   Makefile. The 'all' target depends on it. Also added is a Makefile option
   for turning on/off the hashing. I did this so that people can use either
   the  existing, linear search approah or the new hashing approach; this is
   not an all-or-nothing proposition. You will also find a Makefile option for
   turning on some debug noise while busybox is running so you can see how it
   works/performs at run time.

 - The ammount of code added to busybox.c is minimal; I tried to do the bulk
   of the work at compile-time so that run-time performance will be maximized
   and code bloat minimized.

 - The alert reader will note that there is some code duplication between
   hashapps.c and busybox.c, namely the hashing routines. Suggestions on how
   to easily/elegantly share this code between the two programs are welcome,
   but I didn't think it was a big enough deal to worry about so I
   cut-n-pasted. So sue me.


As with most any hashing approach, this one generates some collisions. Here
are some stats:

  #apps #collisions
     68 0
     20 1
     10 2
      3 3
      3 4
      1 5
      2 7
      3 8
      3 9
      2 10
      1 13
      1 16
      1 18

As we can see, nearly 2/3rds of the time we locate the desired app in the hash
table on the first try. Those apps that aren't found on the first try are very
likely found on the second or third try. The WORST case scenario looks at 18
entries in the table before finding the desired applet, which is still
_4_times_better_ than the AVERAGE case for the linear search (72).


All in all, I think it's a win. I also think it's best to keep it configurable
so people can configure busybox both ways. The Makefile probably could use a
teensy bit more massaging to make sure that all the dependencies are set up
just-so.

Patch is attached; it is against the latest CVS. Please look it over and give
comments / suggestions / feedback / frosty bananas / etc.

Cheers,

Mark Whitley
markw at lineo.com
-------------- next part --------------
diff -urN busybox.virgin/Makefile busybox/Makefile
--- busybox.virgin/Makefile	Mon Sep 25 15:45:57 2000
+++ busybox/Makefile	Fri Oct 20 16:32:47 2000
@@ -49,6 +49,10 @@
 # larger than 2GB!
 DOLFS = false
 
+# Turn this on if you want to use applet hashing
+CFLAGS+=-DUSE_APPLET_HASHING
+# Turn this on if you want some debug noise when using applet hashing
+CFLAGS+=-DDEBUG_APPLET_HASHING
 
 # If you are running a cross compiler, you may want to set this
 # to something more interesting...
@@ -121,7 +125,7 @@
 endif
 
 
-all: busybox busybox.links doc
+all: hashapps busybox busybox.links doc
 
 doc: olddoc
 
@@ -228,6 +232,12 @@
 						\
 	tar -cvzf busybox-$(VERSION).tar.gz busybox-$(VERSION)/;
 
+
+hashapps: hashapps.c mapplets.h
+	$(CC) -Wall -g hashapps.c -o hashapps
+	./hashapps
+
 .PHONY: tags
+
 tags:
 	ctags -R .
diff -urN busybox.virgin/busybox.c busybox/busybox.c
--- busybox.virgin/busybox.c	Thu Oct 19 16:28:06 2000
+++ busybox/busybox.c	Fri Oct 20 16:35:00 2000
@@ -7,368 +7,10 @@
 #define bb_need_full_version
 #define BB_DECLARE_EXTERN
 #include "messages.c"
+#include "applets.h"
 
 static int been_there_done_that = 0;
 
-const struct BB_applet applets[] = {
-
-#ifdef BB_AR
-	{"ar", ar_main, _BB_DIR_USR_BIN, ar_usage},
-#endif
-#ifdef BB_BASENAME
-	{"basename", basename_main, _BB_DIR_USR_BIN, basename_usage},
-#endif
-	{"busybox", busybox_main, _BB_DIR_BIN, NULL},
-#ifdef BB_CAT
-	{"cat", cat_main, _BB_DIR_BIN, cat_usage},
-#endif
-#ifdef BB_CHMOD_CHOWN_CHGRP
-	{"chgrp", chmod_chown_chgrp_main, _BB_DIR_BIN, chgrp_usage},
-#endif
-#ifdef BB_CHMOD_CHOWN_CHGRP
-	{"chmod", chmod_chown_chgrp_main, _BB_DIR_BIN, chmod_usage},
-#endif
-#ifdef BB_CHMOD_CHOWN_CHGRP
-	{"chown", chmod_chown_chgrp_main, _BB_DIR_BIN, chown_usage},
-#endif
-#ifdef BB_CHROOT
-	{"chroot", chroot_main, _BB_DIR_USR_SBIN, chroot_usage},
-#endif
-#ifdef BB_CLEAR
-	{"clear", clear_main, _BB_DIR_USR_BIN, clear_usage},
-#endif
-#ifdef BB_CHVT
-	{"chvt", chvt_main, _BB_DIR_USR_BIN, chvt_usage},
-#endif
-#ifdef BB_CMP
-	{"cmp", cmp_main, _BB_DIR_USR_BIN, cmp_usage},
-#endif
-#ifdef BB_CP_MV
-	{"cp", cp_mv_main, _BB_DIR_BIN, cp_usage},
-#endif
-#ifdef BB_CUT
-	{"cut", cut_main, _BB_DIR_USR_BIN, cut_usage},
-#endif
-#ifdef BB_DATE
-	{"date", date_main, _BB_DIR_BIN, date_usage},
-#endif
-#ifdef BB_DC
-	{"dc", dc_main, _BB_DIR_USR_BIN, dc_usage},
-#endif
-#ifdef BB_DD
-	{"dd", dd_main, _BB_DIR_BIN, dd_usage},
-#endif
-#ifdef BB_DF
-	{"df", df_main, _BB_DIR_BIN, df_usage},
-#endif
-#ifdef BB_DIRNAME
-	{"dirname", dirname_main, _BB_DIR_USR_BIN, dirname_usage},
-#endif
-#ifdef BB_DMESG
-	{"dmesg", dmesg_main, _BB_DIR_BIN, dmesg_usage},
-#endif
-#ifdef BB_DOS2UNIX
-	{"dos2unix", dos2unix_main, _BB_DIR_USR_BIN, dos2unix_usage},
-#endif
-#ifdef BB_DU
-	{"du", du_main, _BB_DIR_USR_BIN, du_usage},
-#endif
-#ifdef BB_DUMPKMAP
-	{"dumpkmap", dumpkmap_main, _BB_DIR_BIN, dumpkmap_usage},
-#endif
-#ifdef BB_DUTMP
-	{"dutmp", dutmp_main, _BB_DIR_USR_SBIN, dutmp_usage},
-#endif
-#ifdef BB_ECHO
-	{"echo", echo_main, _BB_DIR_BIN, echo_usage},
-#endif
-#ifdef BB_EXPR
-	{"expr", expr_main, _BB_DIR_USR_BIN, expr_usage},
-#endif
-#ifdef BB_TRUE_FALSE
-	{"false", false_main, _BB_DIR_BIN, false_usage},
-#endif
-#ifdef BB_FBSET
-	{"fbset", fbset_main, _BB_DIR_USR_SBIN, NULL},
-#endif
-#ifdef BB_FDFLUSH
-	{"fdflush", fdflush_main, _BB_DIR_BIN, fdflush_usage},
-#endif
-#ifdef BB_FIND
-	{"find", find_main, _BB_DIR_USR_BIN, find_usage},
-#endif
-#ifdef BB_FREE
-	{"free", free_main, _BB_DIR_USR_BIN, free_usage},
-#endif
-#ifdef BB_FREERAMDISK
-	{"freeramdisk", freeramdisk_main, _BB_DIR_SBIN, freeramdisk_usage},
-#endif
-#ifdef BB_DEALLOCVT
-	{"deallocvt", deallocvt_main, _BB_DIR_USR_BIN, deallocvt_usage},
-#endif
-#ifdef BB_FSCK_MINIX
-	{"fsck.minix", fsck_minix_main, _BB_DIR_SBIN, fsck_minix_usage},
-#endif
-#ifdef BB_GETOPT
-	{"getopt", getopt_main, _BB_DIR_BIN, getopt_usage},
-#endif
-#ifdef BB_GREP
-	{"grep", grep_main, _BB_DIR_BIN, grep_usage},
-#endif
-#ifdef BB_GUNZIP
-	{"gunzip", gunzip_main, _BB_DIR_BIN, gunzip_usage},
-#endif
-#ifdef BB_GZIP
-	{"gzip", gzip_main, _BB_DIR_BIN, gzip_usage},
-#endif
-#ifdef BB_HALT
-	{"halt", halt_main, _BB_DIR_SBIN, halt_usage},
-#endif
-#ifdef BB_HEAD
-	{"head", head_main, _BB_DIR_USR_BIN, head_usage},
-#endif
-#ifdef BB_HOSTID
-	{"hostid", hostid_main, _BB_DIR_USR_BIN, hostid_usage},
-#endif
-#ifdef BB_HOSTNAME
-	{"hostname", hostname_main, _BB_DIR_BIN, hostname_usage},
-#endif
-#ifdef BB_ID
-	{"id", id_main, _BB_DIR_USR_BIN, id_usage},
-#endif
-#ifdef BB_INIT
-	{"init", init_main, _BB_DIR_SBIN, NULL},
-#endif
-#ifdef BB_INSMOD
-	{"insmod", insmod_main, _BB_DIR_SBIN, insmod_usage},
-#endif
-#ifdef BB_KILL
-	{"kill", kill_main, _BB_DIR_BIN, kill_usage},
-#endif
-#ifdef BB_KILLALL
-	{"killall", kill_main, _BB_DIR_USR_BIN, kill_usage},
-#endif
-#ifdef BB_LENGTH
-	{"length", length_main, _BB_DIR_USR_BIN, length_usage},
-#endif
-#ifdef BB_LINUXRC
-	{"linuxrc", init_main, _BB_DIR_ROOT, NULL},
-#endif
-#ifdef BB_LN
-	{"ln", ln_main, _BB_DIR_BIN, ln_usage},
-#endif
-#ifdef BB_LOADACM
-	{"loadacm", loadacm_main, _BB_DIR_USR_BIN, loadacm_usage},
-#endif
-#ifdef BB_LOADFONT
-	{"loadfont", loadfont_main, _BB_DIR_USR_BIN, loadfont_usage},
-#endif
-#ifdef BB_LOADKMAP
-	{"loadkmap", loadkmap_main, _BB_DIR_SBIN, loadkmap_usage},
-#endif
-#ifdef BB_LOGGER
-	{"logger", logger_main, _BB_DIR_USR_BIN, logger_usage},
-#endif
-#ifdef BB_LOGNAME
-	{"logname", logname_main, _BB_DIR_USR_BIN, logname_usage},
-#endif
-#ifdef BB_LS
-	{"ls", ls_main, _BB_DIR_BIN, ls_usage},
-#endif
-#ifdef BB_LSMOD
-	{"lsmod", lsmod_main, _BB_DIR_SBIN, lsmod_usage},
-#endif
-#ifdef BB_MAKEDEVS
-	{"makedevs", makedevs_main, _BB_DIR_SBIN, makedevs_usage},
-#endif
-#ifdef BB_MD5SUM
-	{"md5sum", md5sum_main, _BB_DIR_USR_BIN, md5sum_usage},
-#endif
-#ifdef BB_MKDIR
-	{"mkdir", mkdir_main, _BB_DIR_BIN, mkdir_usage},
-#endif
-#ifdef BB_MKFIFO
-	{"mkfifo", mkfifo_main, _BB_DIR_USR_BIN, mkfifo_usage},
-#endif
-#ifdef BB_MKFS_MINIX
-	{"mkfs.minix", mkfs_minix_main, _BB_DIR_SBIN, mkfs_minix_usage},
-#endif
-#ifdef BB_MKNOD
-	{"mknod", mknod_main, _BB_DIR_BIN, mknod_usage},
-#endif
-#ifdef BB_MKSWAP
-	{"mkswap", mkswap_main, _BB_DIR_SBIN, mkswap_usage},
-#endif
-#ifdef BB_MKTEMP
-	{"mktemp", mktemp_main, _BB_DIR_BIN, mktemp_usage},
-#endif
-#ifdef BB_NC
-	{"nc", nc_main, _BB_DIR_USR_BIN, nc_usage},
-#endif
-#ifdef BB_MORE
-	{"more", more_main, _BB_DIR_BIN, more_usage},
-#endif
-#ifdef BB_MOUNT
-	{"mount", mount_main, _BB_DIR_BIN, mount_usage},
-#endif
-#ifdef BB_MT
-	{"mt", mt_main, _BB_DIR_BIN, mt_usage},
-#endif
-#ifdef BB_CP_MV
-	{"mv", cp_mv_main, _BB_DIR_BIN, mv_usage},
-#endif
-#ifdef BB_NSLOOKUP
-	{"nslookup", nslookup_main, _BB_DIR_USR_BIN, nslookup_usage},
-#endif
-#ifdef BB_PING
-	{"ping", ping_main, _BB_DIR_BIN, ping_usage},
-#endif
-#ifdef BB_POWEROFF
-	{"poweroff", poweroff_main, _BB_DIR_SBIN, poweroff_usage},
-#endif
-#ifdef BB_PRINTF
-	{"printf", printf_main, _BB_DIR_USR_BIN, printf_usage},
-#endif
-#ifdef BB_PS
-	{"ps", ps_main, _BB_DIR_BIN, ps_usage},
-#endif
-#ifdef BB_PWD
-	{"pwd", pwd_main, _BB_DIR_BIN, pwd_usage},
-#endif
-#ifdef BB_RDATE
-	{"rdate", rdate_main, _BB_DIR_USR_BIN, rdate_usage},
-#endif
-#ifdef BB_READLINK
-	{"readlink", readlink_main, _BB_DIR_USR_BIN, readlink_usage},
-#endif
-#ifdef BB_REBOOT
-	{"reboot", reboot_main, _BB_DIR_SBIN, reboot_usage},
-#endif
-#ifdef BB_RENICE
-	{"renice", renice_main, _BB_DIR_USR_BIN, renice_usage},
-#endif
-#ifdef BB_RESET
-	{"reset", reset_main, _BB_DIR_USR_BIN, reset_usage},
-#endif
-#ifdef BB_RM
-	{"rm", rm_main, _BB_DIR_BIN, rm_usage},
-#endif
-#ifdef BB_RMDIR
-	{"rmdir", rmdir_main, _BB_DIR_BIN, rmdir_usage},
-#endif
-#ifdef BB_RMMOD
-	{"rmmod", rmmod_main, _BB_DIR_SBIN, rmmod_usage},
-#endif
-#ifdef BB_SED
-	{"sed", sed_main, _BB_DIR_BIN, sed_usage},
-#endif
-#ifdef BB_SETKEYCODES
-	{"setkeycodes", setkeycodes_main, _BB_DIR_USR_BIN, setkeycodes_usage},
-#endif
-#ifdef BB_SH
-	{"sh", shell_main, _BB_DIR_BIN, shell_usage},
-#endif
-#ifdef BB_SLEEP
-	{"sleep", sleep_main, _BB_DIR_BIN, sleep_usage},
-#endif
-#ifdef BB_SORT
-	{"sort", sort_main, _BB_DIR_USR_BIN, sort_usage},
-#endif
-#ifdef BB_SYNC
-	{"sync", sync_main, _BB_DIR_BIN, sync_usage},
-#endif
-#ifdef BB_SYSLOGD
-	{"syslogd", syslogd_main, _BB_DIR_SBIN, syslogd_usage},
-#endif
-#ifdef BB_SWAPONOFF
-	{"swapon", swap_on_off_main, _BB_DIR_SBIN, swapon_usage},
-#endif
-#ifdef BB_SWAPONOFF
-	{"swapoff", swap_on_off_main, _BB_DIR_SBIN, swapoff_usage},
-#endif
-#ifdef BB_TAIL
-	{"tail", tail_main, _BB_DIR_USR_BIN, tail_usage},
-#endif
-#ifdef BB_TAR
-	{"tar", tar_main, _BB_DIR_BIN, tar_usage},
-#endif
-#ifdef BB_TELNET
-	{"telnet", telnet_main, _BB_DIR_USR_BIN, telnet_usage},
-#endif
-#ifdef BB_TEST
-	{"test", test_main, _BB_DIR_USR_BIN, test_usage},
-#endif
-#ifdef BB_TEE
-	{"tee", tee_main, _BB_DIR_USR_BIN, tee_usage},
-#endif
-#ifdef BB_TOUCH
-	{"touch", touch_main, _BB_DIR_BIN, touch_usage},
-#endif
-#ifdef BB_TR
-	{"tr", tr_main, _BB_DIR_USR_BIN, tr_usage},
-#endif
-#ifdef BB_TRUE_FALSE
-	{"true", true_main, _BB_DIR_BIN, true_usage},
-#endif
-#ifdef BB_TTY
-	{"tty", tty_main, _BB_DIR_USR_BIN, tty_usage},
-#endif
-#ifdef BB_UMOUNT
-	{"umount", umount_main, _BB_DIR_BIN, umount_usage},
-#endif
-#ifdef BB_UNAME
-	{"uname", uname_main, _BB_DIR_BIN, uname_usage},
-#endif
-#ifdef BB_UNIQ
-	{"uniq", uniq_main, _BB_DIR_USR_BIN, uniq_usage},
-#endif
-#ifdef BB_UNIX2DOS
-	{"unix2dos", unix2dos_main, _BB_DIR_USR_BIN, unix2dos_usage},
-#endif
-#ifdef BB_UNRPM
-	{"unrpm", unrpm_main, _BB_DIR_USR_BIN, unrpm_usage},
-#endif
-#ifdef BB_UPDATE
-	{"update", update_main, _BB_DIR_SBIN, update_usage},
-#endif
-#ifdef BB_UPTIME
-	{"uptime", uptime_main, _BB_DIR_USR_BIN, uptime_usage},
-#endif
-#ifdef BB_UUENCODE
-	{"uuencode", uuencode_main, _BB_DIR_USR_BIN, uuencode_usage},
-#endif
-#ifdef BB_UUDECODE
-	{"uudecode", uudecode_main, _BB_DIR_USR_BIN, uudecode_usage},
-#endif
-#ifdef BB_USLEEP
-	{"usleep", usleep_main, _BB_DIR_BIN, usleep_usage},
-#endif
-#ifdef BB_WC
-	{"wc", wc_main, _BB_DIR_USR_BIN, wc_usage},
-#endif
-#ifdef BB_WGET
-	{"wget", wget_main, _BB_DIR_USR_BIN, wget_usage},
-#endif
-#ifdef BB_WHICH
-	{"which", which_main, _BB_DIR_USR_BIN, which_usage},
-#endif
-#ifdef BB_WHOAMI
-	{"whoami", whoami_main, _BB_DIR_USR_BIN, whoami_usage},
-#endif
-#ifdef BB_XARGS
-	{"xargs", xargs_main, _BB_DIR_USR_BIN, xargs_usage},
-#endif
-#ifdef BB_YES
-	{"yes", yes_main, _BB_DIR_USR_BIN, yes_usage},
-#endif
-#ifdef BB_GUNZIP
-	{"zcat", gunzip_main, _BB_DIR_BIN, gunzip_usage},
-#endif
-#ifdef BB_TEST
-	{"[", test_main, _BB_DIR_USR_BIN, test_usage},
-#endif
-	{0,NULL,0,NULL}
-};
 
 const char *applet_name;
 
@@ -439,6 +81,17 @@
 
 #endif /* BB_FEATURE_INSTALLER */
 
+#ifdef USE_APPLET_HASHING
+static int hash1(const char *s, int m)
+{ 
+	return (((s[0] + s[strlen(s)-1]) * strlen(s)) % m);
+}
+
+static int hash2(const char *s, int m)
+{
+	return (1 + hash1(s, m) % (m - 2));
+}
+#endif
 
 int main(int argc, char **argv)
 {
@@ -491,6 +144,39 @@
 	}
 #endif
 
+#ifdef USE_APPLET_HASHING
+
+	{
+#ifdef DEBUG_APPLET_HASHING
+		int collisions = 0;
+#endif
+		int i, pos;
+
+		for (i = 0; i < htsize; i++) {
+			pos = (hash1(applet_name, htsize) + (i * hash2(applet_name, htsize))) % htsize;
+			if (strcmp(applet_name, applets[pos].name) == 0) {
+				/* found it */
+#ifdef DEBUG_APPLET_HASHING
+				fprintf(stderr, "found %s, collisions: %i\n", applet_name, collisions);
+#endif
+				if (applets[pos].usage && argv[1] && strcmp(argv[1], "--help") == 0)
+					usage(applets[pos].usage);
+				exit(((*(applets[pos].main)) (argc, argv)));
+			}
+			/* didn't find it, keep trying */
+#ifdef DEBUG_APPLET_HASHING
+			collisions++;
+#endif
+		}
+
+#ifdef DEBUG_APPLET_HASHING
+				fprintf(stderr, "Did not find applet in hash table.\n");
+#endif
+
+	}
+
+#else
+
 	while (a->name != 0) {
 		if (strcmp(applet_name, a->name) == 0) {
 			if (a->usage && argv[1] && strcmp(argv[1], "--help") == 0)
@@ -499,6 +185,9 @@
 		}
 		a++;
 	}
+
+#endif
+
 	return(busybox_main(argc, argv));
 }
 
diff -urN busybox.virgin/hashapps.c busybox/hashapps.c
--- busybox.virgin/hashapps.c	Wed Dec 31 17:00:00 1969
+++ busybox/hashapps.c	Fri Oct 20 16:31:42 2000
@@ -0,0 +1,240 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <limits.h> /* INT_MAX */
+#include <regex.h>
+
+#define DEBUG_HASHING /* uncomment to print some noise as we are hashing */
+
+static unsigned const int BUFFSIZE = 120;
+
+struct BB_applet_info {
+	char name[40];
+	char main[40];
+	char location[40];
+	char usage[40];
+};
+
+unsigned int nearest_prime(unsigned int num)
+{
+	unsigned int divisor;
+	short divisible;
+
+	while (num < INT_MAX) {
+		divisible = 0;
+		for (divisor = num-1; divisor > 1; divisor--) {
+			if (num % divisor == 0) {
+				divisible++;
+				break;
+			}
+		}
+		if (!divisible)
+			return num;
+		num++;
+	}
+
+	return -1;
+}
+
+int hash1(const char *s, int m)
+{
+	/* I experienced the fewest collisions with busybox strings with:
+	 * (first letter + last letter) * string length */
+
+	return (((s[0] + s[strlen(s)-1]) * strlen(s)) % m);
+}
+
+int hash2(const char *s, int m)
+{
+	/* Variations on a theme...  */
+	return (1 + hash1(s, m) % (m - 2));
+}
+
+int applet_hash_pos(struct BB_applet_info *applethash, int htsize, const char *string)
+{
+	int i, pos;
+#ifdef DEBUG_HASHING
+	int collisions = 0;
+#endif
+
+	for (i = 0; i < htsize; i++) {
+		pos = (hash1(string, htsize) + (i * hash2(string, htsize))) % htsize;
+		if (strlen(applethash[pos].name) == 0) {
+#ifdef DEBUG_HASHING
+			fprintf(stderr, "collisions for %s: %i\n", string, collisions);
+#endif
+			return pos;
+		}
+		/* if we didn't return just now, there was a collision, try again */
+#ifdef DEBUG_HASHING
+		collisions++;
+#endif
+	}
+
+	/* hash table is full */
+	return -1;
+}
+
+
+int main(void)
+{
+	FILE *pgrep;
+	char buf[BUFFSIZE];
+	unsigned int napplets = 0;
+	unsigned int htsize = 0;
+	struct BB_applet_info *applethash = NULL;
+	FILE *defappl;
+	FILE *outfile;
+	int hashpos;
+	char *pstr;
+	char *tok;
+	char app_name[BUFFSIZE];
+	int i;
+
+	fprintf(stderr, "Running applet hasher...");
+
+	/*
+	 * Figure out how many applets we are going to include.
+	 *
+	 * We do this  by going through Config.h and increment a variable any time
+	 * we match the regex "^#define[ \t]+BB_.*" but not "BB_FEATURE".
+	 *
+	 * grep -c "^#define BB_" Config.h | grep -v "BB_FEATURE"
+	 *
+	 */
+
+	pgrep = popen("grep -c '^#define BB_' Config.h | grep -v 'BB_FEATURE'", "r");
+	fgets(buf, BUFFSIZE, pgrep);
+	pclose(pgrep);
+	napplets = atoi(buf);
+
+	/*
+	 * We then find a prime number for the hash table size that is just
+	 * *slightly* larger than the number of applets.
+	 */
+
+	htsize = nearest_prime(napplets);
+	/* (if you want to reduce collisions, you could multiply this by 2, but then
+	 * that would bloat up the memory used at run time, so let's just keep the
+	 * hash table nice 'n small mmm'K?) */
+
+	/*
+	 * Make the hash table
+	 */
+
+	applethash = (struct BB_applet_info *)calloc(htsize, sizeof(struct BB_applet_info));
+	if (applethash == NULL) {
+		perror("Could not allocate space for hash table.\n");
+		exit(-1);
+	}
+
+	/*
+	 * Next, we go through the Master Applets file 'mapplets.h' (extracted from
+	 * busybox.c) and for every line that matches "#ifdef (BB_.*)" we test to
+	 * see if there is a match for "#define BB_*" in Config.h. If there is, we
+	 * add the next line (i.e. {"cat", cat_main, _BB_DIR_BIN, cat_usage}) to
+	 * the hash table.
+	 */
+
+	defappl = popen("\
+for def in `gcc -E -dM Config.h | sed 's/^#define //g' | grep -v BB_FEATURE`;\
+do\
+	sed -n '/^#ifdef \\<'$def'\\>/,/^#endif/p' mapplets.h | grep '{';\
+done", "r");
+
+	while (fgets(buf, BUFFSIZE, defappl)) {
+
+		/* advance past the \t{" to the app name */
+		pstr = strpbrk(buf, "\"");
+		pstr++;
+
+		tok = strtok(pstr, "\", ");
+		strcpy(app_name, tok);
+
+		/* get the hash code for the app name */
+		hashpos = applet_hash_pos(applethash, htsize, app_name);
+		/* sanity check */
+		if (hashpos == -1) {
+			perror("APPLET HASHING ERROR -- HORRIBLE INFECTIOUS DISEASE");
+			exit(-1);
+		}
+		/* else... */
+		strcpy(applethash[hashpos].name, app_name);
+
+		tok = strtok(NULL, ", ");
+		strcpy(applethash[hashpos].main, tok);
+
+		tok = strtok(NULL, ", ");
+		strcpy(applethash[hashpos].location, tok);
+
+		tok = strtok(NULL, "}, ");
+		strcpy(applethash[hashpos].usage, tok);
+
+		/* i++; */
+	}
+
+
+	/*
+	 * Now that the hash table is fully built, we write the contents out to a
+	 * file called applets_hash.h which includes all of the applets defined in
+	 * Config.h (this will look a little different than the master file.
+	 */
+
+	outfile = fopen("applets.h", "w");
+
+	/* print the file header */
+	fprintf(outfile, "\
+/*
+ * applets.h - A pre-hashed listing of configured applets generated from mapplets.h
+ */
+
+const int htsize = %i;
+
+const struct BB_applet applets[] = {
+", htsize);
+
+	for (i = 0; i < htsize; i++) {
+
+		/* now print out all the entries */
+
+		/* special case for nulls */
+		if (strlen(applethash[i].name) == 0) {
+#ifdef DEBUG_HASHING
+			/* print the array index next to each applet entry */
+			fprintf(outfile, " /* %i */ {\"EMPTY\", NULL, 0, NULL},\n", i);
+#else
+			fprintf(outfile, "\t{\"EMPTY\", NULL, 0, NULL},\n");
+#endif
+		} else {
+#ifdef DEBUG_HASHING
+			/* print the array index next to each applet entry */
+			fprintf(outfile, " /* %i */ {\"%s\", %s, %s, %s},\n", i,
+#else
+			fprintf(outfile, "\t{\"%s\", %s, %s, %s},\n",
+#endif
+				applethash[i].name,
+				applethash[i].main,
+				applethash[i].location,
+				applethash[i].usage);
+		}
+
+	}
+
+	/* last, print out the footer */
+	fprintf(outfile, "\
+	{0,NULL,0,NULL}
+};
+");
+
+	fclose(outfile);
+
+	/*
+	 * cleanup
+	 */
+
+	free(applethash);
+
+	fprintf(stderr, " Done\n");
+
+	return 0;
+}
diff -urN busybox.virgin/mapplets.h busybox/mapplets.h
--- busybox.virgin/mapplets.h	Wed Dec 31 17:00:00 1969
+++ busybox/mapplets.h	Fri Oct 20 16:30:11 2000
@@ -0,0 +1,365 @@
+/*
+ * mapplets.h - A master listing of applets in BusyBox
+ */
+
+const struct BB_applet applets[] = {
+
+#ifdef BB_AR
+	{"ar", ar_main, _BB_DIR_USR_BIN, ar_usage},
+#endif
+#ifdef BB_BASENAME
+	{"basename", basename_main, _BB_DIR_USR_BIN, basename_usage},
+#endif
+	{"busybox", busybox_main, _BB_DIR_BIN, NULL},
+#ifdef BB_CAT
+	{"cat", cat_main, _BB_DIR_BIN, cat_usage},
+#endif
+#ifdef BB_CHMOD_CHOWN_CHGRP
+	{"chgrp", chmod_chown_chgrp_main, _BB_DIR_BIN, chgrp_usage},
+#endif
+#ifdef BB_CHMOD_CHOWN_CHGRP
+	{"chmod", chmod_chown_chgrp_main, _BB_DIR_BIN, chmod_usage},
+#endif
+#ifdef BB_CHMOD_CHOWN_CHGRP
+	{"chown", chmod_chown_chgrp_main, _BB_DIR_BIN, chown_usage},
+#endif
+#ifdef BB_CHROOT
+	{"chroot", chroot_main, _BB_DIR_USR_SBIN, chroot_usage},
+#endif
+#ifdef BB_CLEAR
+	{"clear", clear_main, _BB_DIR_USR_BIN, clear_usage},
+#endif
+#ifdef BB_CHVT
+	{"chvt", chvt_main, _BB_DIR_USR_BIN, chvt_usage},
+#endif
+#ifdef BB_CMP
+	{"cmp", cmp_main, _BB_DIR_USR_BIN, cmp_usage},
+#endif
+#ifdef BB_CP_MV
+	{"cp", cp_mv_main, _BB_DIR_BIN, cp_usage},
+#endif
+#ifdef BB_CUT
+	{"cut", cut_main, _BB_DIR_USR_BIN, cut_usage},
+#endif
+#ifdef BB_DATE
+	{"date", date_main, _BB_DIR_BIN, date_usage},
+#endif
+#ifdef BB_DC
+	{"dc", dc_main, _BB_DIR_USR_BIN, dc_usage},
+#endif
+#ifdef BB_DD
+	{"dd", dd_main, _BB_DIR_BIN, dd_usage},
+#endif
+#ifdef BB_DF
+	{"df", df_main, _BB_DIR_BIN, df_usage},
+#endif
+#ifdef BB_DIRNAME
+	{"dirname", dirname_main, _BB_DIR_USR_BIN, dirname_usage},
+#endif
+#ifdef BB_DMESG
+	{"dmesg", dmesg_main, _BB_DIR_BIN, dmesg_usage},
+#endif
+#ifdef BB_DOS2UNIX
+	{"dos2unix", dos2unix_main, _BB_DIR_USR_BIN, dos2unix_usage},
+#endif
+#ifdef BB_DU
+	{"du", du_main, _BB_DIR_USR_BIN, du_usage},
+#endif
+#ifdef BB_DUMPKMAP
+	{"dumpkmap", dumpkmap_main, _BB_DIR_BIN, dumpkmap_usage},
+#endif
+#ifdef BB_DUTMP
+	{"dutmp", dutmp_main, _BB_DIR_USR_SBIN, dutmp_usage},
+#endif
+#ifdef BB_ECHO
+	{"echo", echo_main, _BB_DIR_BIN, echo_usage},
+#endif
+#ifdef BB_EXPR
+	{"expr", expr_main, _BB_DIR_USR_BIN, expr_usage},
+#endif
+#ifdef BB_TRUE_FALSE
+	{"false", false_main, _BB_DIR_BIN, false_usage},
+#endif
+#ifdef BB_FBSET
+	{"fbset", fbset_main, _BB_DIR_USR_SBIN, NULL},
+#endif
+#ifdef BB_FDFLUSH
+	{"fdflush", fdflush_main, _BB_DIR_BIN, fdflush_usage},
+#endif
+#ifdef BB_FIND
+	{"find", find_main, _BB_DIR_USR_BIN, find_usage},
+#endif
+#ifdef BB_FREE
+	{"free", free_main, _BB_DIR_USR_BIN, free_usage},
+#endif
+#ifdef BB_FREERAMDISK
+	{"freeramdisk", freeramdisk_main, _BB_DIR_SBIN, freeramdisk_usage},
+#endif
+#ifdef BB_DEALLOCVT
+	{"deallocvt", deallocvt_main, _BB_DIR_USR_BIN, deallocvt_usage},
+#endif
+#ifdef BB_FSCK_MINIX
+	{"fsck.minix", fsck_minix_main, _BB_DIR_SBIN, fsck_minix_usage},
+#endif
+#ifdef BB_GETOPT
+	{"getopt", getopt_main, _BB_DIR_BIN, getopt_usage},
+#endif
+#ifdef BB_GREP
+	{"grep", grep_main, _BB_DIR_BIN, grep_usage},
+#endif
+#ifdef BB_GUNZIP
+	{"gunzip", gunzip_main, _BB_DIR_BIN, gunzip_usage},
+#endif
+#ifdef BB_GZIP
+	{"gzip", gzip_main, _BB_DIR_BIN, gzip_usage},
+#endif
+#ifdef BB_HALT
+	{"halt", halt_main, _BB_DIR_SBIN, halt_usage},
+#endif
+#ifdef BB_HEAD
+	{"head", head_main, _BB_DIR_USR_BIN, head_usage},
+#endif
+#ifdef BB_HOSTID
+	{"hostid", hostid_main, _BB_DIR_USR_BIN, hostid_usage},
+#endif
+#ifdef BB_HOSTNAME
+	{"hostname", hostname_main, _BB_DIR_BIN, hostname_usage},
+#endif
+#ifdef BB_ID
+	{"id", id_main, _BB_DIR_USR_BIN, id_usage},
+#endif
+#ifdef BB_INIT
+	{"init", init_main, _BB_DIR_SBIN, NULL},
+#endif
+#ifdef BB_INSMOD
+	{"insmod", insmod_main, _BB_DIR_SBIN, insmod_usage},
+#endif
+#ifdef BB_KILL
+	{"kill", kill_main, _BB_DIR_BIN, kill_usage},
+#endif
+#ifdef BB_KILLALL
+	{"killall", kill_main, _BB_DIR_USR_BIN, kill_usage},
+#endif
+#ifdef BB_LENGTH
+	{"length", length_main, _BB_DIR_USR_BIN, length_usage},
+#endif
+#ifdef BB_LINUXRC
+	{"linuxrc", init_main, _BB_DIR_ROOT, NULL},
+#endif
+#ifdef BB_LN
+	{"ln", ln_main, _BB_DIR_BIN, ln_usage},
+#endif
+#ifdef BB_LOADACM
+	{"loadacm", loadacm_main, _BB_DIR_USR_BIN, loadacm_usage},
+#endif
+#ifdef BB_LOADFONT
+	{"loadfont", loadfont_main, _BB_DIR_USR_BIN, loadfont_usage},
+#endif
+#ifdef BB_LOADKMAP
+	{"loadkmap", loadkmap_main, _BB_DIR_SBIN, loadkmap_usage},
+#endif
+#ifdef BB_LOGGER
+	{"logger", logger_main, _BB_DIR_USR_BIN, logger_usage},
+#endif
+#ifdef BB_LOGNAME
+	{"logname", logname_main, _BB_DIR_USR_BIN, logname_usage},
+#endif
+#ifdef BB_LS
+	{"ls", ls_main, _BB_DIR_BIN, ls_usage},
+#endif
+#ifdef BB_LSMOD
+	{"lsmod", lsmod_main, _BB_DIR_SBIN, lsmod_usage},
+#endif
+#ifdef BB_MAKEDEVS
+	{"makedevs", makedevs_main, _BB_DIR_SBIN, makedevs_usage},
+#endif
+#ifdef BB_MD5SUM
+	{"md5sum", md5sum_main, _BB_DIR_USR_BIN, md5sum_usage},
+#endif
+#ifdef BB_MKDIR
+	{"mkdir", mkdir_main, _BB_DIR_BIN, mkdir_usage},
+#endif
+#ifdef BB_MKFIFO
+	{"mkfifo", mkfifo_main, _BB_DIR_USR_BIN, mkfifo_usage},
+#endif
+#ifdef BB_MKFS_MINIX
+	{"mkfs.minix", mkfs_minix_main, _BB_DIR_SBIN, mkfs_minix_usage},
+#endif
+#ifdef BB_MKNOD
+	{"mknod", mknod_main, _BB_DIR_BIN, mknod_usage},
+#endif
+#ifdef BB_MKSWAP
+	{"mkswap", mkswap_main, _BB_DIR_SBIN, mkswap_usage},
+#endif
+#ifdef BB_MKTEMP
+	{"mktemp", mktemp_main, _BB_DIR_BIN, mktemp_usage},
+#endif
+#ifdef BB_NC
+	{"nc", nc_main, _BB_DIR_USR_BIN, nc_usage},
+#endif
+#ifdef BB_MORE
+	{"more", more_main, _BB_DIR_BIN, more_usage},
+#endif
+#ifdef BB_MOUNT
+	{"mount", mount_main, _BB_DIR_BIN, mount_usage},
+#endif
+#ifdef BB_MT
+	{"mt", mt_main, _BB_DIR_BIN, mt_usage},
+#endif
+#ifdef BB_CP_MV
+	{"mv", cp_mv_main, _BB_DIR_BIN, mv_usage},
+#endif
+#ifdef BB_NSLOOKUP
+	{"nslookup", nslookup_main, _BB_DIR_USR_BIN, nslookup_usage},
+#endif
+#ifdef BB_PING
+	{"ping", ping_main, _BB_DIR_BIN, ping_usage},
+#endif
+#ifdef BB_POWEROFF
+	{"poweroff", poweroff_main, _BB_DIR_SBIN, poweroff_usage},
+#endif
+#ifdef BB_PRINTF
+	{"printf", printf_main, _BB_DIR_USR_BIN, printf_usage},
+#endif
+#ifdef BB_PS
+	{"ps", ps_main, _BB_DIR_BIN, ps_usage},
+#endif
+#ifdef BB_PWD
+	{"pwd", pwd_main, _BB_DIR_BIN, pwd_usage},
+#endif
+#ifdef BB_RDATE
+	{"rdate", rdate_main, _BB_DIR_USR_BIN, rdate_usage},
+#endif
+#ifdef BB_READLINK
+	{"readlink", readlink_main, _BB_DIR_USR_BIN, readlink_usage},
+#endif
+#ifdef BB_REBOOT
+	{"reboot", reboot_main, _BB_DIR_SBIN, reboot_usage},
+#endif
+#ifdef BB_RENICE
+	{"renice", renice_main, _BB_DIR_USR_BIN, renice_usage},
+#endif
+#ifdef BB_RESET
+	{"reset", reset_main, _BB_DIR_USR_BIN, reset_usage},
+#endif
+#ifdef BB_RM
+	{"rm", rm_main, _BB_DIR_BIN, rm_usage},
+#endif
+#ifdef BB_RMDIR
+	{"rmdir", rmdir_main, _BB_DIR_BIN, rmdir_usage},
+#endif
+#ifdef BB_RMMOD
+	{"rmmod", rmmod_main, _BB_DIR_SBIN, rmmod_usage},
+#endif
+#ifdef BB_SED
+	{"sed", sed_main, _BB_DIR_BIN, sed_usage},
+#endif
+#ifdef BB_SETKEYCODES
+	{"setkeycodes", setkeycodes_main, _BB_DIR_USR_BIN, setkeycodes_usage},
+#endif
+#ifdef BB_SH
+	{"sh", shell_main, _BB_DIR_BIN, shell_usage},
+#endif
+#ifdef BB_SLEEP
+	{"sleep", sleep_main, _BB_DIR_BIN, sleep_usage},
+#endif
+#ifdef BB_SORT
+	{"sort", sort_main, _BB_DIR_USR_BIN, sort_usage},
+#endif
+#ifdef BB_SYNC
+	{"sync", sync_main, _BB_DIR_BIN, sync_usage},
+#endif
+#ifdef BB_SYSLOGD
+	{"syslogd", syslogd_main, _BB_DIR_SBIN, syslogd_usage},
+#endif
+#ifdef BB_SWAPONOFF
+	{"swapon", swap_on_off_main, _BB_DIR_SBIN, swapon_usage},
+#endif
+#ifdef BB_SWAPONOFF
+	{"swapoff", swap_on_off_main, _BB_DIR_SBIN, swapoff_usage},
+#endif
+#ifdef BB_TAIL
+	{"tail", tail_main, _BB_DIR_USR_BIN, tail_usage},
+#endif
+#ifdef BB_TAR
+	{"tar", tar_main, _BB_DIR_BIN, tar_usage},
+#endif
+#ifdef BB_TELNET
+	{"telnet", telnet_main, _BB_DIR_USR_BIN, telnet_usage},
+#endif
+#ifdef BB_TEST
+	{"test", test_main, _BB_DIR_USR_BIN, test_usage},
+#endif
+#ifdef BB_TEE
+	{"tee", tee_main, _BB_DIR_USR_BIN, tee_usage},
+#endif
+#ifdef BB_TOUCH
+	{"touch", touch_main, _BB_DIR_BIN, touch_usage},
+#endif
+#ifdef BB_TR
+	{"tr", tr_main, _BB_DIR_USR_BIN, tr_usage},
+#endif
+#ifdef BB_TRUE_FALSE
+	{"true", true_main, _BB_DIR_BIN, true_usage},
+#endif
+#ifdef BB_TTY
+	{"tty", tty_main, _BB_DIR_USR_BIN, tty_usage},
+#endif
+#ifdef BB_UMOUNT
+	{"umount", umount_main, _BB_DIR_BIN, umount_usage},
+#endif
+#ifdef BB_UNAME
+	{"uname", uname_main, _BB_DIR_BIN, uname_usage},
+#endif
+#ifdef BB_UNIQ
+	{"uniq", uniq_main, _BB_DIR_USR_BIN, uniq_usage},
+#endif
+#ifdef BB_UNIX2DOS
+	{"unix2dos", unix2dos_main, _BB_DIR_USR_BIN, unix2dos_usage},
+#endif
+#ifdef BB_UNRPM
+	{"unrpm", unrpm_main, _BB_DIR_USR_BIN, unrpm_usage},
+#endif
+#ifdef BB_UPDATE
+	{"update", update_main, _BB_DIR_SBIN, update_usage},
+#endif
+#ifdef BB_UPTIME
+	{"uptime", uptime_main, _BB_DIR_USR_BIN, uptime_usage},
+#endif
+#ifdef BB_UUENCODE
+	{"uuencode", uuencode_main, _BB_DIR_USR_BIN, uuencode_usage},
+#endif
+#ifdef BB_UUDECODE
+	{"uudecode", uudecode_main, _BB_DIR_USR_BIN, uudecode_usage},
+#endif
+#ifdef BB_USLEEP
+	{"usleep", usleep_main, _BB_DIR_BIN, usleep_usage},
+#endif
+#ifdef BB_WC
+	{"wc", wc_main, _BB_DIR_USR_BIN, wc_usage},
+#endif
+#ifdef BB_WGET
+	{"wget", wget_main, _BB_DIR_USR_BIN, wget_usage},
+#endif
+#ifdef BB_WHICH
+	{"which", which_main, _BB_DIR_USR_BIN, which_usage},
+#endif
+#ifdef BB_WHOAMI
+	{"whoami", whoami_main, _BB_DIR_USR_BIN, whoami_usage},
+#endif
+#ifdef BB_XARGS
+	{"xargs", xargs_main, _BB_DIR_USR_BIN, xargs_usage},
+#endif
+#ifdef BB_YES
+	{"yes", yes_main, _BB_DIR_USR_BIN, yes_usage},
+#endif
+#ifdef BB_GUNZIP
+	{"zcat", gunzip_main, _BB_DIR_BIN, gunzip_usage},
+#endif
+#ifdef BB_TEST
+	{"[", test_main, _BB_DIR_USR_BIN, test_usage},
+#endif
+	{0,NULL,0,NULL}
+};
+


More information about the busybox mailing list