[PATCH] ash: improve speed of variable pattern substitution

Alin Mr almr.oss at outlook.com
Wed Jul 21 20:54:27 UTC 2021


Ron, thanks for your quick patch and the hints. I saw your message and was thinking it through: fixed-length patterns don't need backtracking. A '?' (without '*') also results in a fixed-length pattern, so maybe that'd be more complicated to detect.

To motivate this a bit: it started with quoters. Since we have neither bash's ${x@Q}, nor `printf %q`, it's harder to escape shell code for eval, or to produce quoted SQL values **efficiently**.
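
For illustration, here is a minimal sketch of the kind of quoter meant here, written with POSIX-only expansions (no ${x//}). The name shquote and the _sq_* variables are made up for this example; this is not the code from fakesh.sh. It loops once per embedded single quote, which is the sort of per-character work a fast ${x//pat/repl} would avoid.

```shell
# Hypothetical helper (not from fakesh.sh): print $1 wrapped in single
# quotes so that eval reads it back as one literal word.
# Each embedded ' is emitted as '\'' (close quote, escaped quote, reopen).
shquote() {
  _sq_rest=$1 _sq_out=
  while :; do
    case $_sq_rest in
      *\'*)
        # Append everything before the first ', then the escaped quote.
        _sq_out=$_sq_out${_sq_rest%%\'*}"'\\''"
        # Continue with everything after that first '.
        _sq_rest=${_sq_rest#*\'}
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "'$_sq_out$_sq_rest'"
}
```

Usage would look like `eval "copy=$(shquote "$orig")"`, after which $copy should equal $orig, provided $orig does not end in newlines (command substitution strips those).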

I'm working on a shell logging wrapper (fakesh.sh: https://gitlab.com/kstr0k/f8ksh) which logs all shell invocations, with start / end timestamps. This came about because Kakoune (the editor) uses POSIX shell as an extension language (and so does most of its plugin ecosystem). So, hundreds of tiny shell calls / second.

As I expected, switching from dash to "applet-biased" busybox (STANDALONE + PREFER_APPLETS) makes a big difference. So I invested a bit of effort into making busybox available for users (I have a setup script that can compile, or download and extract directly from distro packages: https://gitlab.com/kstr0k/f8ksh/-/blob/master/setup-bbox).

fakesh.sh produces '<<EOF'-escaped log files that are also shell code (fastest to produce, but verbose); these are later parsed to SQL. The log converter (itself running under BusyBox) was pretty slow, so I switched to ${x///}. But this made the converter even slower. This is how I got here.
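
To show why '<<EOF'-style escaping is cheap to produce, here is a rough sketch. The record layout and the name log_record are assumptions made for this example and are not taken from fakesh.sh. With a quoted heredoc delimiter, the payload needs no per-character escaping at all, at the cost of extra lines per value.

```shell
# Hypothetical record format (an assumption, not fakesh.sh's real one).
# log_record NAME VALUE: emit shell code that, when evaluated, assigns
# VALUE to NAME. The quoted 'EOF' delimiter disables all expansion in the
# body, so VALUE is copied through verbatim.
# Limitations of this sketch: VALUE must not contain a line that is
# exactly "EOF", and trailing newlines in VALUE are lost on read-back.
log_record() {
  printf "%s=\$(cat <<'EOF'\n%s\nEOF\n)\n" "$1" "$2"
}
```

Evaluating the output, e.g. `eval "$(log_record cmd "$some_command")"`, should then set $cmd to the original string, within the limitations noted in the comments.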

Busybox is an amazing tool. I've known it since my embedded days, but it turns out that, with builtin coreutils (no external calls), it can work as a very effective extension language. It would be nice to make even more applets nofork (e.g. sed is currently a "runner", and I understand why, but that's too bad for scripts that just use it to munge strings).

