[BusyBox 0000615]: sed convert 0x00 to 0x0a
bugs at busybox.net
Sat Dec 2 20:14:17 UTC 2006
A NOTE has been added to this issue.
======================================================================
http://busybox.net/bugs/view.php?id=615
======================================================================
Reported By: robang74
Assigned To: BusyBox
======================================================================
Project: BusyBox
Issue ID: 615
Category: Other
Reproducibility: always
Severity: major
Priority: normal
Status: assigned
======================================================================
Date Submitted: 12-28-2005 01:26 PST
Last Modified: 12-02-2006 12:14 PST
======================================================================
Summary: sed convert 0x00 to 0x0a
Description:
bash-3.00# dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
bash-3.00# echo ciao>>/tmp/test
bash-3.00# dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
bash-3.00# cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
bash-3.00# wc -l /tmp/test
1 /tmp/test
bash-3.00# wc -l /tmp/test2
2048 /tmp/test2
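The line counts above can be cross-checked by counting bytes directly. The following is a minimal sketch, not part of the original report: the helper names count_byte and make_sample are hypothetical, and the sample is a shortened stand-in for the test file (4 NULs on each side instead of 1k).

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the original report.

# count_byte ESC FILE: print how many bytes of FILE match the tr escape ESC.
count_byte() {
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

# make_sample FILE: write 4 NULs, "ciao\n", 4 NULs (13 bytes) to FILE.
make_sample() {
    printf '\000\000\000\000ciao\n\000\000\000\000' > "$1"
}

tmp=$(mktemp) || exit 1
make_sample "$tmp"
echo "NUL bytes: $(count_byte '\000' "$tmp")"
echo "newlines:  $(count_byte '\n' "$tmp")"
rm -f "$tmp"
```

Running the same two counts on the output of the sed under test should give the same numbers if NUL bytes pass through unchanged. A sed that converts 0x00 to 0x0a would instead report 0 NUL bytes and a larger newline count.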
======================================================================
----------------------------------------------------------------------
robang74 - 12-28-05 03:18
----------------------------------------------------------------------
sed loses the last 0x00 if it exists:
bash-3.00# ls -al /tmp/test*
-rw-r--r-- 1 0 0 2053 Dec 28 11:16 /tmp/test
-rw-r--r-- 1 0 0 2052 Dec 28 11:16 /tmp/test2
AFTER PATCH:
/ # dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
/ # echo ciao>>/tmp/test
/ # dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
/ # cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
/ # wc -l /tmp/test2
1 /tmp/test2
/ # wc -l /tmp/test
1 /tmp/test
/ # ls -al /tmp/test*
-rw-r--r-- 1 0 0 2053 Dec 28 11:17 /tmp/test
-rw-r--r-- 1 0 0 2053 Dec 28 11:17 /tmp/test2
----------------------------------------------------------------------
landley - 01-01-06 22:47
----------------------------------------------------------------------
Hmmm... Tricky dealing with embedded nulls when C usually considers null
to be an end of string indicator.
Your fix causes problems in other contexts. It was outputting a newline
because there are times when that is appropriate, and this would output a
null then. Something like:
echo -n thingy > one
echo -n again > two
sed "s/i/z/" one two > three
The output should be "thzngy\nagazn" and I think it would be
"thzngy\0agazn". Not that I've tested it just now.
I'm trying to get 1.1.0 out this friday. Not sure I'll get to this before
then...
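The expectation above can be written down as a byte-exact check. This is only a sketch: check_join is a hypothetical helper, and the expected bytes are taken from the "thzngy\nagazn" stated above, not from a verified run of any particular sed.

```shell
#!/bin/sh
# check_join CMD: hypothetical helper. Runs CMD (for example "./busybox sed")
# on two files that lack a trailing newline and compares the result, byte for
# byte, with "thzngy\nagazn". Returns 0 on a match, non-zero otherwise.
check_join() {
    cj_dir=$(mktemp -d) || return 2
    printf 'thingy' > "$cj_dir/one"
    printf 'again' > "$cj_dir/two"
    printf 'thzngy\nagazn' > "$cj_dir/expected"
    # $1 is deliberately unquoted so that "./busybox sed" splits into two words.
    $1 's/i/z/' "$cj_dir/one" "$cj_dir/two" > "$cj_dir/actual"
    cmp -s "$cj_dir/expected" "$cj_dir/actual"
    cj_rc=$?
    rm -rf "$cj_dir"
    return $cj_rc
}

# SED is an assumed environment variable naming the sed to test.
if check_join "${SED:-sed}"; then
    echo "join matches"
else
    echo "join differs"
fi
```

Using cmp rather than cat makes a NUL in place of the newline visible, since both render the same way on most terminals.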
----------------------------------------------------------------------
robang74 - 01-02-06 01:25
----------------------------------------------------------------------
Patch n.2 fixes the newline problem when processing multiple files:
[roberto at wsraf big]$ patch -p0 < sed_2.patch
patching file busybox-1.01/editors/sed.c
[roberto at wsraf busybox-1.01]$ make menuconfig && make
[roberto at wsraf busybox-1.01]$ echo -n thingy > one
[roberto at wsraf busybox-1.01]$ echo -n again > two
[roberto at wsraf busybox-1.01]$ sed "s/i/z/" one two > three
[roberto at wsraf busybox-1.01]$ ./busybox sed "s/i/z/" one two > four
[roberto at wsraf busybox-1.01_sed2]$ cat three
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$ cat four
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$
----------------------------------------------------------------------
landley - 01-08-06 11:34
----------------------------------------------------------------------
I tried a slightly cleaned up version of the second patch and it caused two
more busybox tests to fail.
To see the specific failures, cd testsuite and "./runtest -v sed". I
might get around to looking at it some more today, if so I'll apply the
result and close out the bug.
Rob
----------------------------------------------------------------------
robang74 - 01-09-06 03:30
----------------------------------------------------------------------
AFTER PATCH n.3 sed fails the same tests as before
[roberto at wsraf testsuite]$ ./runtest sed | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
BUT passes the new one
[roberto at wsraf testsuite]$ ./runtest sed | grep binary
PASS: sed s onto a binary input (with zeros)
----------------------------------------------------------------------
robang74 - 01-09-06 10:15
----------------------------------------------------------------------
svn 13198 with sed_3 patch applied, defconfig
[roberto at wsraf busybox_sed3]$ size busybox
text data bss dec hex filename
236511 2220 28484 267215 413cf busybox
svn 13201 original, defconfig
[roberto at wsraf busybox]$ size busybox
text data bss dec hex filename
236527 2156 28548 267231 413df busybox
Unpatched svn 13201 is 16 bytes larger than svn 13198 with the sed_3 patch
applied (267231 vs. 267215 in the dec column), at least on my workstation,
where I use gcc 4.0 (for better or worse).
----------------------------------------------------------------------
robang74 - 02-28-06 03:09
----------------------------------------------------------------------
Patch n.4 applies to svn 14360. It comes in two flavors:
norename, which does not include the variable rename s/lastchar/no_newline/g,
and the patch itself, which includes the variable rename for improved code
readability.
Switching from the patch to the norename one is pretty simple
(automatic):
svn co ...
cp -af busybox busybox.orig
cat sed_4-svn14360.patch | \
sed -e "s/lastchar/no_newline/g" >sed_4-svn14360_tmp.patch
patch -p0 <sed_4-svn14360_tmp.patch
mv busybox busybox_sed4
cp -a busybox.orig busybox
diff -pru busybox busybox_sed4 >sed_4-svn14360_norename.patch
----------------------------------------------------------------------
landley - 03-01-06 07:49
----------------------------------------------------------------------
Did you notice the change to get_line_from_file.c you reverted, and the
change to sed.c (and for that matter, sort.c) that matched that? The way
you were appending the length at an arbitrary alignment would confuse
platforms that don't like unaligned access, plus you hard-wired in an
assumption that a pointer is 4 bytes long (not true for 64 bit platforms).
----------------------------------------------------------------------
landley - 03-01-06 07:59
----------------------------------------------------------------------
I applied the .4-norename patch, reverted the get_line_from_file part, and
made the two line fix to have get_chunk_from_file() set len. It built
fine, and then I ran the sed regression test suite.
Failed spectacularly. It's appending an extra blank line to most things,
and in at least one case a garbage character...
----------------------------------------------------------------------
robang74 - 03-02-06 00:50
----------------------------------------------------------------------
> Did you notice the change to get_line_from_file.c you reverted, and the
> change to sed.c (and for that matter, sort.c) that matched that? The way
> you were appending the length at an arbitrary alignment would confuse
> platforms that don't like unaligned access, plus you hard-wired in an
> assumption that a pointer is 4 bytes long (not true for 64 bit platforms).
Not pointers but integers. Anyway, to fix 64-bit platforms:
s/-6/-2-sizeof(int)/
Every platform ignores what is written after \0, so appending information
after the null char is OK for me. If I am wrong, it is necessary to report
lastchar back to the caller, and so to change the API:
get_line_from_file( ..., &lastchar);
> I applied the .4-norename patch, reverted the get_line_from_file part, and
> made the two line fix to have get_chunk_from_file() set len. It built
> fine, and then I ran the sed regression test suite.
>
> Failed spectacularly. It's appending an extra blank line to most things,
> and in at least one case a garbage character...
I know. That is because you reverted the get_chunk_from_file() part,
which is exactly why that part is needed.
----------------------------------------------------------------------
robang74 - 03-02-06 01:44
----------------------------------------------------------------------
PATCH n.5 resolves the get_line_from_file.c revert issue: it keeps
get_line_from_file.c as close as possible to the original one from svn
14360
----------------------------------------------------------------------
robang74 - 03-03-06 00:44
----------------------------------------------------------------------
PATCH n.6 as suggested by Rob does not need to modify get_line_from_file.c
----------------------------------------------------------------------
robang74 - 03-04-06 07:11
----------------------------------------------------------------------
patch -p0 <busybox.14360_sed7_testsuite.patch
make allnoconfig
make menuconfig editors/sed -->YES
cd busybox/testsuite
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | tail -n1
FAIL: sed zero support
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
7 <--- same 6 + 1 (my test s/ciao/miao/ on 000ciao\n000)
cd ../..
patch -p0 <busybox.14360_sed7.patch
cd busybox/testsuite
make -C ..
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed embedded NUL
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
6
About FAIL: sed s//g (exhaustive) I have gone a step further:
old expected is \nbang\nbang
new expected is 0bang0bang0
correct one would be 0bang0woo0
Now sed supports zeros, but it treats them as \n in substitutions
----------------------------------------------------------------------
robang74 - 03-04-06 09:09
----------------------------------------------------------------------
PATCH n.8 solves the sed NUL regression test failure
[roberto at nbraf testsuite]$ ./sed.tests | wc -l
41
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
5
[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
----------------------------------------------------------------------
robang74 - 03-05-06 18:58
----------------------------------------------------------------------
PATCH n.9: currently \0 is not allowed in the command line
PASS: sed embedded NUL
PASS: sed embedded NUL g
sed: Unsupported command o
FAIL: sed NUL in command
Please consider that \0 in the command line is a completely different thing
from \0 in the strings. I would consider this the last patch about \0 in the
strings.
----------------------------------------------------------------------
robang74 - 11-30-06 01:05
----------------------------------------------------------------------
Before applying patch:
[roberto at GEDX0327 testsuite]$ pwd; ./sed.tests 2>/dev/null | grep FAIL
/home/roberto/busybox/busybox-1.2.2.1/testsuite
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed embedded NUL
FAIL: sed embedded NUL g
FAIL: sed NUL in command
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
FAIL: sed nonexistent label
PASS: sed -i with no arg [GNUFAIL]
After applying patch:
[roberto at GEDX0327 testsuite]$ pwd; ./sed.tests 2>/dev/null | grep FAIL
/home/roberto/busybox/busybox-1.2.2.1_sed/testsuite
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed NUL in command
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
FAIL: sed nonexistent label
PASS: sed -i with no arg [GNUFAIL]
Manual test after applying patch:
dd if=/dev/zero bs=1k count=1 >/tmp/test; echo ciao>>/tmp/test
dd if=/dev/zero bs=1k count=1 >>/tmp/test
cat /tmp/test | _install/bin/busybox sed -e "s/ciao/miao/g" >/tmp/test2
wc -l /tmp/test; wc -l /tmp/test2; hexdump /tmp/test; hexdump /tmp/test2
1+0 records in
1+0 records out
1+0 records in
1+0 records out
1 /tmp/test
1 /tmp/test2
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 6963 6f61 000a 0000 0000 0000 0000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 0000
0000805
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 696d 6f61 000a 0000 0000 0000 0000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 0000
0000805
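A note on reading the dumps above: hexdump without options prints 16-bit words in host byte order, so on a little-endian machine the bytes 'c' (0x63) and 'i' (0x69) show up swapped as 6963. A byte-wise dump avoids that. A minimal sketch, where hexbytes is a hypothetical helper name:

```shell
#!/bin/sh
# hexbytes: hypothetical helper; dumps stdin one byte at a time, in file order.
hexbytes() {
    od -An -v -tx1 | tr -d ' \n'
    echo
}

printf 'ciao\n' | hexbytes   # prints 6369616f0a
```

The -v option keeps od from collapsing repeated lines into "*", which matters for inputs that are mostly zeros.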
----------------------------------------------------------------------
robang74 - 11-30-06 04:13
----------------------------------------------------------------------
busybox-20061130_sed.patch: apply to 20061130 snapshot version
----------------------------------------------------------------------
vda - 12-02-06 09:59
----------------------------------------------------------------------
#!/bin/sh
{
dd if=/dev/zero bs=16 count=1 2>/dev/null
echo ciao
dd if=/dev/zero bs=16 count=1 2>/dev/null
} | ./busybox sed -e "s/ciao/miao/g" | hexdump -vC
{
dd if=/dev/zero bs=16 count=1 2>/dev/null
echo ciao
dd if=/dev/zero bs=16 count=1 2>/dev/null
} | /usr/bin/sed -e "s/ciao/miao/g" | hexdump -vC
echo ============
echo -n thingy >z1
echo -n again >z2
>znull
./busybox sed "s/i/z/" z1 z2 znull | hexdump -vC
/usr/bin/sed "s/i/z/" z1 z2 znull | hexdump -vC
echo ============
echo -ne "\0bang\0woo\0" | ./busybox sed -e 's/woo/bang/' | hexdump -vC
echo -ne "\0bang\0woo\0" | /usr/bin/sed -e 's/woo/bang/' | hexdump -vC
unpatched busybox:
00000000 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
00000010 6d 69 61 6f 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |miao............|
00000020 0a 0a 0a 0a |....|
00000024
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000010 6d 69 61 6f 0a 00 00 00 00 00 00 00 00 00 00 00 |miao............|
00000020 00 00 00 00 00 |.....|
00000025
============
00000000 74 68 7a 6e 67 79 0a 61 67 61 7a 6e |thzngy.agazn|
0000000c
00000000 74 68 7a 6e 67 79 0a 61 67 61 7a 6e |thzngy.agazn|
0000000c
============
00000000 0a 62 61 6e 67 0a 62 61 6e 67 |.bang.bang|
0000000a
00000000 00 62 61 6e 67 00 62 61 6e 67 00 |.bang.bang.|
0000000b
Patched per revision 16754:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000010 6d 69 61 6f 0a 00 00 00 00 00 00 00 00 00 00 00 |miao............|
00000020 00 00 00 00 00 |.....|
00000025
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000010 6d 69 61 6f 0a 00 00 00 00 00 00 00 00 00 00 00 |miao............|
00000020 00 00 00 00 00 |.....|
00000025
============
00000000 74 68 7a 6e 67 79 61 67 61 7a 6e |thzngyagazn|
0000000b
00000000 74 68 7a 6e 67 79 0a 61 67 61 7a 6e |thzngy.agazn|
0000000c
============
00000000 00 62 61 6e 67 00 62 61 6e 67 00 |.bang.bang.|
0000000b
00000000 00 62 61 6e 67 00 62 61 6e 67 00 |.bang.bang.|
0000000b
----------------------------------------------------------------------
vda - 12-02-06 12:14
----------------------------------------------------------------------
Revision 16770 fixes that issue too. Tested with:
#!/bin/sh
function tst() {
{
dd if=/dev/zero bs=16 count=1 2>/dev/null
echo ciao
dd if=/dev/zero bs=16 count=1 2>/dev/null
} | $1 -e "s/ciao/miao/g" | hexdump -vC
echo ============
echo -n thingy >z1
echo -n again >z2
echo -ne "thingy\0" >z3
echo -ne "\0" >zzero
>znull
$1 "s/i/z/" z1 z2 z3 z1 znull z1 zzero zzero znull znull z1 | hexdump -vC
echo ============
echo -ne "\0bang\0woo\0" | $1 -e 's/woo/bang/' | hexdump -vC
}
tst "./busybox sed" >x1
tst "/usr/bin/sed" >x2
diff -u x1 x2 >x.diff || { echo Different!; sleep 2; }
Can I close ticket now?
Issue History
Date Modified Username Field Change
======================================================================
12-28-05 01:26 robang74 New Issue
12-28-05 01:26 robang74 Status new => assigned
12-28-05 01:26 robang74 Assigned To => BusyBox
12-28-05 01:26 robang74 Issue Monitored: robang74
12-28-05 03:18 robang74 Note Added: 0000821
12-28-05 03:18 robang74 File Added: sed.diff
01-01-06 22:47 landley Note Added: 0000835
01-02-06 01:25 robang74 Note Added: 0000836
01-02-06 01:25 robang74 File Added: sed_2.patch
01-08-06 11:34 landley Note Added: 0000873
01-09-06 03:29 robang74 Note Added: 0000886
01-09-06 03:30 robang74 Note Edited: 0000886
01-09-06 03:54 robang74 File Added: sed_3.patch
01-09-06 10:15 robang74 Note Added: 0000890
02-28-06 03:05 robang74 File Added: sed_4-svn14360.patch
02-28-06 03:05 robang74 File Added: sed_4-svn14360_norename.patch
02-28-06 03:09 robang74 Note Added: 0001148
03-01-06 07:49 landley Note Added: 0001154
03-01-06 07:59 landley Note Added: 0001155
03-02-06 00:33 robang74 Note Added: 0001156
03-02-06 00:50 robang74 Note Edited: 0001156
03-02-06 01:40 robang74 File Added: sed_5-svn14360.patch
03-02-06 01:41 robang74 File Added: sed_5-svn14360_norename.patch
03-02-06 01:43 robang74 Note Added: 0001157
03-02-06 01:44 robang74 Note Edited: 0001157
03-03-06 00:43 robang74 File Added: sed_6-svn14360.patch
03-03-06 00:43 robang74 File Added: sed_6-svn14360_norename.patch
03-03-06 00:44 robang74 Note Added: 0001160
03-04-06 07:11 robang74 Note Added: 0001161
03-04-06 07:12 robang74 File Added: busybox.14360_sed7_testsuite.patch
03-04-06 07:12 robang74 File Added: busybox.14360_sed7.patch
03-04-06 09:09 robang74 Note Added: 0001162
03-04-06 09:10 robang74 File Added: busybox.14360_sed8_testsuite.patch
03-04-06 09:10 robang74 File Added: busybox.14360_sed8.patch
03-05-06 18:57 robang74 File Added: busybox.14453_sed9.patch
03-05-06 18:58 robang74 Note Added: 0001164
03-14-06 07:39 robang74 File Added: busybox.14536_sed9.patch
11-30-06 01:05 robang74 Note Added: 0001841
11-30-06 01:07 robang74 File Added: busybox-1.2.2.1_sed.patch
11-30-06 04:12 robang74 File Added: busybox-20061130_sed.patch
11-30-06 04:13 robang74 Note Added: 0001843
12-02-06 09:59 vda Note Added: 0001853
12-02-06 10:00 vda File Added: sed_rev16754.patch
12-02-06 12:14 vda Note Added: 0001854
======================================================================