[PATCH] awk: fix handling of literal backslashes in replacement

Yao Zi ziyao at disroot.org
Thu Nov 14 11:11:50 UTC 2024


According to the POSIX standard, a backslash in the replacement string
of sub() should be treated as a literal backslash unless it is followed
by a '&' or another backslash. But busybox awk drops it unconditionally,
regardless of the following character. For example,

  $ echo "abc" | busybox awk 'sub(/abc/, "\\d")'
  d

where \d is expected. This is known to break rsync's documentation
converter.

Let's check the next character before dropping the backslash, following
the POSIX standard and the behavior of GNU awk.
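
The rule can be illustrated outside of busybox with a small standalone
sketch. expand_repl() below is a hypothetical helper written for this
note, not code from editors/awk.c. It assumes the replacement string has
already gone through awk string-literal processing (so the awk source
"\\d" arrives as the two characters \d), and it handles one match only.
Keeping a trailing lone backslash as a literal is a choice made for this
sketch; the patch itself does not address that case.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not busybox code): expand the replacement string
 * `repl` for one match `match` of length `mlen` into `out`.
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if `out` is too small.
 */
static size_t expand_repl(const char *repl, const char *match, size_t mlen,
			  char *out, size_t outsz)
{
	size_t o = 0;
	int bslash = 0;

	if (outsz == 0)
		return (size_t)-1;

	for (; *repl; repl++) {
		char c = *repl;

		if (!bslash && c == '\\') {
			/* Defer the decision until the next character is seen */
			bslash = 1;
			continue;
		}
		if (c == '&' && !bslash) {
			/* Unescaped '&' stands for the matched text */
			if (o + mlen >= outsz)
				return (size_t)-1;
			memcpy(out + o, match, mlen);
			o += mlen;
		} else {
			/* "\\" and "\&" lose the escaping backslash;
			 * before any other character it stays literal */
			if (bslash && c != '\\' && c != '&') {
				if (o + 1 >= outsz)
					return (size_t)-1;
				out[o++] = '\\';
			}
			if (o + 1 >= outsz)
				return (size_t)-1;
			out[o++] = c;
		}
		bslash = 0;
	}
	if (bslash) {
		/* Assumption of this sketch: a trailing backslash is literal */
		if (o + 1 >= outsz)
			return (size_t)-1;
		out[o++] = '\\';
	}
	out[o] = '\0';
	return o;
}
```

With match "abc", the replacement \d expands to \d, \& expands to a
literal &, two backslashes collapse into one, and [&] expands to [abc].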

Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
Link: https://github.com/RsyncProject/rsync/blob/62bb9bba022ce6a29f8c92307d5569c338b2f711/help-from-md.awk#L22
Fixes: 5f84c5633 ("awk: fix backslash handling in sub() builtins")
Signed-off-by: Yao Zi <ziyao at disroot.org>
---
 editors/awk.c       | 7 ++++++-
 testsuite/awk.tests | 5 +++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/editors/awk.c b/editors/awk.c
index 64e752f4b..40f5ba7f7 100644
--- a/editors/awk.c
+++ b/editors/awk.c
@@ -2636,8 +2636,13 @@ static int awk_sub(node *rn, const char *repl, int nm, var *src, var *dest /*,in
 					resbuf = qrealloc(resbuf, residx + replen + n, &resbufsize);
 					memcpy(resbuf + residx, sp + pmatch[j].rm_so - start_ofs, n);
 					residx += n;
-				} else
+				} else {
+/* '\\' and '&' following a backslash keep their original meaning; any other
+ * occurrence of a '\\' should be treated as literal */
+					if (bslash && c != '\\' && c != '&')
+						resbuf[residx++] = '\\';
 					resbuf[residx++] = c;
+				}
 				bslash = 0;
 			}
 		}
diff --git a/testsuite/awk.tests b/testsuite/awk.tests
index be25f6696..61b3bc7d6 100755
--- a/testsuite/awk.tests
+++ b/testsuite/awk.tests
@@ -617,4 +617,9 @@ testing 'awk gsub erroneous word start match' \
 	'abc\n' \
 	'' ''
 
+testing 'awk sub literal backslash in replacement' \
+	'awk '$sq'sub(/abc/, "\\\d")'$sq \
+	'\d\n' \
+	'' 'abc\n'
+
 exit $FAILCOUNT
-- 
2.47.0


