[PATCH] ash: improve speed of variable pattern substitution
Alin Mr
almr.oss at outlook.com
Wed Jul 21 20:54:27 UTC 2021
Ron, thanks for your quick patch and the hints. I saw your message and was thinking it through: fixed-length patterns don't need backtracking. A pattern that contains '?' but no '*' is also fixed-length, though that case may be more complicated to detect.
To motivate this a bit: it started with quoters. Since we have neither bash's ${x@Q}, nor `printf %q`, it's harder to escape shell code for eval, or to produce quoted SQL values **efficiently**.
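For illustration, here is a minimal sketch of a quoter that sticks to POSIX parameter expansion. It is not the code from fakesh.sh; the function name sq_quote and its variables are made up for this example. It shows why the portable version is slow: it needs one loop iteration per embedded single quote.

```sh
# Hypothetical helper (not from fakesh.sh): wrap $1 in single quotes so
# that it survives eval. Uses only POSIX expansions, no ${x//} or printf %q.
sq_quote() {
  sq_in=$1 sq_out=
  while :; do
    case $sq_in in
      *\'*)
        # Copy the text before the first quote, then emit '\'' for the quote.
        sq_out=$sq_out${sq_in%%\'*}"'\\''"
        # Drop everything up to and including that quote.
        sq_in=${sq_in#*\'}
        ;;
      *) break ;;
    esac
  done
  printf "'%s'\n" "$sq_out$sq_in"
}

# Usage: round-trip a value through eval.
s="it's"
eval "copy=$(sq_quote "$s")"
[ "$copy" = "$s" ] && echo "round-trip ok"
```

In shells that support ${x//pattern/replacement}, the loop can collapse into a single expansion, which is the operation this patch speeds up.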
I'm working on a shell logging wrapper (fakesh.sh: https://gitlab.com/kstr0k/f8ksh) which logs all shell invocations, with start / end timestamps. This came about because Kakoune (the editor) uses POSIX shell as an extension language (and so does most of its plugin ecosystem). So, hundreds of tiny shell calls / second.
As I expected, switching from dash to "applet-biased" busybox (STANDALONE + PREFER_APPLETS) makes a big difference. So I invested a bit of effort into making busybox available for users (I have a setup script that can compile, or download and extract directly from distro packages: https://gitlab.com/kstr0k/f8ksh/-/blob/master/setup-bbox).
fakesh.sh produces '<<EOF'-escaped log files that are also shell code (fastest to produce, but verbose); these are later parsed to SQL. The log converter (itself running under BusyBox) was pretty slow, so I switched to ${x///}. But this made the converter even slower. This is how I got here.
Busybox is an amazing tool. I've known it since my embedded days, but it turns out that, with builtin coreutils (no external calls), it can work as a very effective extension language. It would be nice to make even more applets nofork (e.g. sed is currently a "runner", and I understand why, but that's too bad for scripts that just use it to munge strings).