[PATCH] awk: fix handling of literal backslashes in replacement

Yao Zi ziyao at disroot.org
Thu Nov 14 11:11:50 UTC 2024

According to the POSIX standard, a backslash in the replacement string of
sub() should be treated as a literal backslash unless it is followed by a
'&' or another backslash. But busybox awk skips it unconditionally,
regardless of the following character. For example,

  $ echo "abc" | busybox awk 'sub(/abc/, "\\d")'

where \d is expected as output. This is known to break rsync's documentation
(see help-from-md.awk, linked below).

Let's check the next character before skipping the backslash, following
the POSIX standard and the behavior of GNU awk.
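
The rule can be illustrated in isolation with a minimal sketch. This is
not the busybox code: expand_repl() is a hypothetical helper, it ignores
the \0-\9 backreference handling, and keeping a trailing lone backslash
literal is an assumption of the sketch rather than something this patch
addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of replacement expansion for sub() (hypothetical helper,
 * not taken from editors/awk.c):
 *   backslash + backslash  -> one literal backslash
 *   backslash + '&'        -> literal '&'
 *   bare '&'               -> the matched text
 *   backslash + other char -> both characters kept literally
 * Returns the number of bytes written (excluding the NUL), or (size_t)-1
 * if the output buffer is too small.
 */
size_t expand_repl(const char *repl, const char *match, char *out, size_t outsz)
{
	size_t mlen, n = 0;
	int bslash = 0;

	assert(repl && match && out);
	if (outsz == 0)
		return (size_t)-1;
	mlen = strlen(match);

	for (; *repl; repl++) {
		char c = *repl;

		if (c == '\\' && !bslash) {
			bslash = 1;	/* decide once the next character is known */
			continue;
		}
		if (c == '&' && !bslash) {
			/* unescaped '&' expands to the matched text */
			if (n + mlen >= outsz)
				return (size_t)-1;
			memcpy(out + n, match, mlen);
			n += mlen;
			continue;
		}
		/* a backslash before anything but '\\' or '&' stays literal */
		if (bslash && c != '\\' && c != '&') {
			if (n + 1 >= outsz)
				return (size_t)-1;
			out[n++] = '\\';
		}
		if (n + 1 >= outsz)
			return (size_t)-1;
		out[n++] = c;
		bslash = 0;
	}
	/* assumption of this sketch: a trailing lone backslash is kept */
	if (bslash) {
		if (n + 1 >= outsz)
			return (size_t)-1;
		out[n++] = '\\';
	}
	out[n] = '\0';
	return n;
}
```

With match "abc", the replacement \d (written "\\d" in C) expands to \d,
\& expands to a literal &, and a bare & expands to abc.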

Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
Link: https://github.com/RsyncProject/rsync/blob/62bb9bba022ce6a29f8c92307d5569c338b2f711/help-from-md.awk#L22
Fixes: 5f84c5633 ("awk: fix backslash handling in sub() builtins")
Signed-off-by: Yao Zi <ziyao at disroot.org>
---
 editors/awk.c       | 7 ++++++-
 testsuite/awk.tests | 5 +++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/editors/awk.c b/editors/awk.c
index 64e752f4b..40f5ba7f7 100644
--- a/editors/awk.c
+++ b/editors/awk.c
@@ -2636,8 +2636,13 @@ static int awk_sub(node *rn, const char *repl, int nm, var *src, var *dest /*,in
 					resbuf = qrealloc(resbuf, residx + replen + n, &resbufsize);
 					memcpy(resbuf + residx, sp + pmatch[j].rm_so - start_ofs, n);
 					residx += n;
-				} else
+				} else {
+/* '\\' and '&' following a backslash keep their original meaning; any other
+ * occurrence of a '\\' should be treated as literal */
+					if (bslash && c != '\\' && c != '&')
+						resbuf[residx++] = '\\';
 					resbuf[residx++] = c;
+				}
 				bslash = 0;
diff --git a/testsuite/awk.tests b/testsuite/awk.tests
index be25f6696..61b3bc7d6 100755
--- a/testsuite/awk.tests
+++ b/testsuite/awk.tests
@@ -617,4 +617,9 @@ testing 'awk gsub erroneous word start match' \
 	'abc\n' \
 	'' ''
+testing 'awk sub literal backslash in replacement' \
+	'awk '$sq'sub(/abc/, "\\\d")'$sq \
+	'\d\n' \
+	'' 'abc\n'
