[Bug 5090] New: sed and awk mishandle \b \< \B
bugzilla at busybox.net
Thu Apr 12 04:22:03 UTC 2012
https://bugs.busybox.net/show_bug.cgi?id=5090
Summary: sed and awk mishandle \b \< \B
Product: Busybox
Version: 1.19.x
Platform: PC
OS/Version: Linux
Status: NEW
Severity: minor
Priority: P5
Component: Standard Compliance
AssignedTo: unassigned at busybox.net
ReportedBy: dubiousjim at gmail.com
CC: busybox-cvs at busybox.net
Estimated Hours: 0.0
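Environment: BusyBox 1.19.3, built against uClibc 0.9.32, on i686 Linux.
When every match on a line is replaced, the start-of-word anchors \b and \< match before every letter, and a leading \B skips every other match.
Because the problem affects both sed and awk, it may be an issue in uClibc rather than in the applets.
However, BusyBox egrep is not affected.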
$ printf 'abcd efgh\n' | sed -n 's/\b[a-z]/<&>/pg'
Expected result: <a>bcd <e>fgh
Actual result: <a><b><c><d> <e><f><g><h>
$ printf 'abcd efgh\n' | sed -n 's/\<[a-z]/<&>/pg'
Expected result: <a>bcd <e>fgh
Actual result: <a><b><c><d> <e><f><g><h>
$ printf 'abcd efgh\n' | sed -n 's/\B[a-z]/<&>/pg'
Expected result: a<b><c><d> e<f><g><h>
Actual result: a<b>c<d> e<f>g<h> # misses the c and g
$ printf 'abcd efgh\n' | awk '{gsub(/\<[a-z]/,"<&>"); print $0}'
Expected result: <a>bcd <e>fgh
Actual result: <a><b><c><d> <e><f><g><h>
$ printf 'abcd efgh\n' | awk '{gsub(/\B[a-z]/,"<&>"); print $0}'
Expected result: a<b><c><d> e<f><g><h>
Actual result: a<b>c<d> e<f>g<h> # misses the c and g
With the anchor at the end of the pattern, \b, \> and \B all give the expected results:
$ printf 'abcd efgh\n' | sed -n 's/[a-z]\b/<&>/pg'
abc<d> efg<h>
$ printf 'abcd efgh\n' | sed -n 's/[a-z]\>/<&>/pg'
abc<d> efg<h>
$ printf 'abcd efgh\n' | sed -n 's/[a-z]\B/<&>/pg'
<a><b><c>d <e><f><g>h
$ printf 'abcd efgh\n' | awk '{gsub(/[a-z]\>/,"<&>"); print $0}'
abc<d> efg<h>
$ printf 'abcd efgh\n' | awk '{gsub(/[a-z]\B/,"<&>"); print $0}'
<a><b><c>d <e><f><g>h
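The cases above can be collected into a small regression script. This is only a sketch: the SED and AWK override variables and the check, sed_case and awk_case helpers are hypothetical names introduced here, not part of BusyBox. The patterns and expected strings are copied from the commands in this report.

```shell
#!/bin/sh
# Regression sketch for the word-boundary cases in this report.
# SED and AWK are assumed override variables, e.g. SED='busybox sed'.
SED=${SED:-sed}
AWK=${AWK:-awk}
INPUT='abcd efgh'

# check LABEL EXPECTED ACTUAL: print one PASS or FAIL line per case.
check() {
    if [ "$2" = "$3" ]; then
        printf 'PASS: %s\n' "$1"
    else
        printf 'FAIL: %s (expected "%s", got "%s")\n' "$1" "$2" "$3"
    fi
}

# sed_case REGEX EXPECTED: wrap every match of REGEX in <> using s///pg.
# $SED is left unquoted on purpose so that 'busybox sed' splits into two words.
sed_case() {
    actual=$(printf '%s\n' "$INPUT" | $SED -n "s/$1/<&>/pg")
    check "sed $1" "$2" "$actual"
}

# awk_case REGEX EXPECTED: the same substitution through gsub().
awk_case() {
    actual=$(printf '%s\n' "$INPUT" | $AWK "{gsub(/$1/,\"<&>\"); print \$0}")
    check "awk $1" "$2" "$actual"
}

# Start-of-pattern anchors: the cases reported as failing.
sed_case '\b[a-z]' '<a>bcd <e>fgh'
sed_case '\<[a-z]' '<a>bcd <e>fgh'
sed_case '\B[a-z]' 'a<b><c><d> e<f><g><h>'
awk_case '\<[a-z]' '<a>bcd <e>fgh'
awk_case '\B[a-z]' 'a<b><c><d> e<f><g><h>'

# End-of-pattern anchors: the cases reported as working.
sed_case '[a-z]\b' 'abc<d> efg<h>'
sed_case '[a-z]\>' 'abc<d> efg<h>'
sed_case '[a-z]\B' '<a><b><c>d <e><f><g>h'
awk_case '[a-z]\>' 'abc<d> efg<h>'
awk_case '[a-z]\B' '<a><b><c>d <e><f><g>h'
```

Run it as, for example, SED='busybox sed' AWK='busybox awk' sh script.sh (the file name is arbitrary). On an affected build, the first five cases should print FAIL lines showing the actual results listed above.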