[BusyBox 0000615]: sed convert 0x00 to 0x0a

bugs at busybox.net bugs at busybox.net
Sat Dec 2 17:59:02 UTC 2006


A NOTE has been added to this issue. 
====================================================================== 
http://busybox.net/bugs/view.php?id=615 
====================================================================== 
Reported By:                robang74
Assigned To:                BusyBox
====================================================================== 
Project:                    BusyBox
Issue ID:                   615
Category:                   Other
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             12-28-2005 01:26 PST
Last Modified:              12-02-2006 09:59 PST
====================================================================== 
Summary:                    sed convert 0x00 to 0x0a
Description: 
bash-3.00# dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
bash-3.00# echo ciao>>/tmp/test
bash-3.00# dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
bash-3.00# cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
bash-3.00# wc -l /tmp/test
      1 /tmp/test
bash-3.00# wc -l /tmp/test2
   2048 /tmp/test2
====================================================================== 

---------------------------------------------------------------------- 
 robang74 - 12-28-05 03:18  
---------------------------------------------------------------------- 
sed loose the last 0x00 if it exist:

bash-3.00# ls -al /tmp/test*
-rw-r--r--    1 0        0            2053 Dec 28 11:16 /tmp/test
-rw-r--r--    1 0        0            2052 Dec 28 11:16 /tmp/test2


AFTER PATCH:

/ # dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
/ # echo ciao>>/tmp/test
/ # dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
/ # cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
/ # wc -l /tmp/test2
      1 /tmp/test2
/ # wc -l /tmp/test
      1 /tmp/test
/ # ls -al /tmp/test*
-rw-r--r--    1 0        0            2053 Dec 28 11:17 /tmp/test
-rw-r--r--    1 0        0            2053 Dec 28 11:17 /tmp/test2 

---------------------------------------------------------------------- 
 landley - 01-01-06 22:47  
---------------------------------------------------------------------- 
Hmmm...  Tricky dealing with embedded nulls when C usually considers null
to be an end of string indicator.

Your fix causes problems in other contexts.  It was outputting a newline
because there are times when that is appropriate, and this would output a
null then.  Something like:

echo -n thingy > one
echo -n again > two
sed "s/i/z/" one two > three

The output should be "thzngy\nagazn" and I think it would be
"thzngy\0agazn".  Not that I've tested it just now.

I'm trying to get 1.1.0 out this friday.  Not sure I'll get to this before
then... 

---------------------------------------------------------------------- 
 robang74 - 01-02-06 01:25  
---------------------------------------------------------------------- 
patch n.2 fix the problem of newline in more files


[roberto at wsraf big]$ patch -p0 < sed_2.patch
patching file busybox-1.01/editors/sed.c
[roberto at wsraf busybox-1.01]$ make menuconfig && make
[roberto at wsraf busybox-1.01]$ echo -n thingy > one
[roberto at wsraf busybox-1.01]$ echo -n again > two
[roberto at wsraf busybox-1.01]$ sed "s/i/z/" one two > three
[roberto at wsraf busybox-1.01]$ ./busybox sed "s/i/z/" one two > four
[roberto at wsraf busybox-1.01_sed2]$ cat three
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$ cat four
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$ 

---------------------------------------------------------------------- 
 landley - 01-08-06 11:34  
---------------------------------------------------------------------- 
I tried a slightly cleaned up version of the second patch and it caused two
more busybox tests to fail.

To see the specific failures, cd testsuite and "./runtest -v sed".  I
might get around to looking at it some more today, if so I'll apply the
result and close out the bug.

Rob 

---------------------------------------------------------------------- 
 robang74 - 01-09-06 03:30  
---------------------------------------------------------------------- 
AFTER PATCH n.3 sed fails the same testa as before

[roberto at wsraf testsuite]$ ./runtest sed | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)

BUT passes the new one

[roberto at wsraf testsuite]$ ./runtest sed | grep binary
PASS: sed s onto a binary input (with zeros)

 

---------------------------------------------------------------------- 
 robang74 - 01-09-06 10:15  
---------------------------------------------------------------------- 
svn 13198 with sed_3 patch applied, defconfig

[roberto at wsraf busybox_sed3]$ size busybox
   text    data     bss     dec     hex filename
 236511    2220   28484  267215   413cf busybox

 svn 13201 original, defconfig

[roberto at wsraf busybox]$ size busybox
   text    data     bss     dec     hex filename
 236527    2156   28548  267231   413df busybox


 Between 13198 and 13201 size grows 15 bytes more tham sed_3 patch, 
at least on my WS but I have gcc 4.0 (good or bad, so it is). 

---------------------------------------------------------------------- 
 robang74 - 02-28-06 03:09  
---------------------------------------------------------------------- 
Patch n.4 applies to svn 14360, it comes in two flowers:

 norename: which is not include a variable rename s/lastchar/no_newline/g
 
 and the patch itself with the variable rename for improved code
readibility

Switching from the patch to the norename one is pretty simple
(automatic):

svn co ...
cp -af busybox busybox.orig

cat sed_4-svn14360.patch | \
sed -e "s/lastchar/no_newline/g" >sed_4-svn14360_tmp.patch

patch -p0 <sed_3-svn14360_tmp.patch
mv busybox busybox_sed4
cp -a busybox.orig busybox
diff -pru busybox busybox_sed4 >sed_4-svn14360_norename.patch 

---------------------------------------------------------------------- 
 landley - 03-01-06 07:49  
---------------------------------------------------------------------- 
Did you notice the change to get_line_from_file.c you reverted, and the
change to sed.c (and for that matter, sort.c) that matched that?  The way
you were appending the length at an arbitrary aligment would confuse
platforms that don't like unaligned access, plus you hard-wired in an
assumption that a pointer is 4 bytes long (not true for 64 bit platforms). 

---------------------------------------------------------------------- 
 landley - 03-01-06 07:59  
---------------------------------------------------------------------- 
I applied the .4-norename patch, reverted the get_line_from_file part, and
made the two line fix to have get_chunk_from_file() set len.  It built
fine, and then I ran the sed regression test suite.

Failed spectacularly.  It's appending an extra blank line to most things,
and in at least one case a garbage character... 

---------------------------------------------------------------------- 
 robang74 - 03-02-06 00:50  
---------------------------------------------------------------------- 
> Did you notice the change to get_line_from_file.c you reverted, and the
> change to sed.c (and for that matter, sort.c) that matched that?  The
way
> you were appending the length at an arbitrary aligment would confuse
> platforms that don't like unaligned access, plus you hard-wired in an
> assumption that a pointer is 4 bytes long (not true for 64 bit
platforms).


 not pointers but integer, anyway to fix 64 bit platform

 s/-6/-2-sizeof(int)/

 every platforms ignore what is wroten after \0 so appending information
after the null chars is ok for me. If am wrong it necessary to report
lastchar back to the caller... so change the API

 get_line_from_file( ..., &lastchar);

 


> I applied the .4-norename patch, reverted the get_line_from_file part,
and
> made the two line fix to have get_chunk_from_file() set len.  It built
> fine, and then I ran the sed regression test suite.
>
> Failed spectacularly.  It's appending an extra blank line to most
things,
> and in at least one case a garbage character... 


 I know. Because you have reverted the get_chunk_from_file() part.
 That is the reason because that part is needed.

 

---------------------------------------------------------------------- 
 robang74 - 03-02-06 01:44  
---------------------------------------------------------------------- 
PATCH n.5 resolve the get_line_from_file.c reverted issue: it takes
get_line_from_file.c as near as possible to the original one from svn
14360

 

---------------------------------------------------------------------- 
 robang74 - 03-03-06 00:44  
---------------------------------------------------------------------- 
PATCH n.6 as suggested by Rob does not need to modify get_line_from_file.c 

---------------------------------------------------------------------- 
 robang74 - 03-04-06 07:11  
---------------------------------------------------------------------- 
patch -p0 <busybox.14360_sed7_testsuite.patch
make allnoconfig
make menuconfig editors/sed -->YES
cd busybox/testsuite

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | tail -n1
FAIL: sed zero support

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
7   <--- same 6 + 1 (my test s/ciao/miao/ on 000ciao\n000)

cd ../..
patch -p0 <busybox.14360_sed7.patch
cd busybox/testsuite
make -C ..

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed embedded NUL
FAIL: sed append autoinserts newline
FAIL: sed clusternewline

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
6


About FAIL: sed s//g (exhaustive) I have gone a step further:

old expected is           \nbang\nbang

new expected is           0bang0bang0

correct one would be      0bang0woo0

Now, sed supports zeros but it consider it as \n in substitutions 

---------------------------------------------------------------------- 
 robang74 - 03-04-06 09:09  
---------------------------------------------------------------------- 
PATCH n.8 solve the sed nul regresssion test failure

[roberto at nbraf testsuite]$ ./sed.tests | wc -l
41

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL | wc -l
5

[roberto at nbraf testsuite]$ ./sed.tests | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed append autoinserts newline
FAIL: sed clusternewline 

---------------------------------------------------------------------- 
 robang74 - 03-05-06 18:58  
---------------------------------------------------------------------- 
PATCH n9: currently \0 are not allowed in the command line

PASS: sed embedded NUL
PASS: sed embedded NUL g
sed: Unsupported command o
FAIL: sed NUL in command

 Please consider that \0 in command line is completely another thing than
\0 in the strings. I would consider this patch the last about \0 in the
strings. 

---------------------------------------------------------------------- 
 robang74 - 11-30-06 01:05  
---------------------------------------------------------------------- 
Before applaying patch:

[roberto at GEDX0327 testsuite]$ pwd; ./sed.tests 2>/dev/null | grep FAIL
/home/roberto/busybox/busybox-1.2.2.1/testsuite
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed embedded NUL
FAIL: sed embedded NUL g
FAIL: sed NUL in command
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
FAIL: sed nonexistent label
PASS: sed -i with no arg [GNUFAIL]

After applaying patch:

[roberto at GEDX0327 testsuite]$ pwd; ./sed.tests 2>/dev/null | grep FAIL
/home/roberto/busybox/busybox-1.2.2.1_sed/testsuite
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)
FAIL: sed NUL in command
FAIL: sed append autoinserts newline
FAIL: sed clusternewline
FAIL: sed nonexistent label
PASS: sed -i with no arg [GNUFAIL]

Manual test after applaying patch

dd if=/dev/zero bs=1k count=1 >/tmp/test; echo ciao>>/tmp/test; dd
if=/dev/zero bs=1k count=1 >>/tmp/test; cat /tmp/test |
_install/bin/busybox sed -e "s/ciao/miao/g" >/tmp/test2; wc -l /tmp/test;
wc -l /tmp/test2; hexdump /tmp/test; hexdump /tmp/test2
entrati 1+0 record
usciti 1+0 record
entrati 1+0 record
usciti 1+0 record
1 /tmp/test
1 /tmp/test2
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 6963 6f61 000a 0000 0000 0000 0000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 0000
0000805
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 696d 6f61 000a 0000 0000 0000 0000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 0000
0000805 

---------------------------------------------------------------------- 
 robang74 - 11-30-06 04:13  
---------------------------------------------------------------------- 
busybox-20061130_sed.patch: apply to 20061130 snapshot version 

---------------------------------------------------------------------- 
 vda - 12-02-06 09:59  
---------------------------------------------------------------------- 
#!/bin/sh
{
dd if=/dev/zero bs=16 count=1 2>/dev/null
echo ciao
dd if=/dev/zero bs=16 count=1 2>/dev/null
} | ./busybox sed -e "s/ciao/miao/g" | hexdump -vC
{
dd if=/dev/zero bs=16 count=1 2>/dev/null
echo ciao
dd if=/dev/zero bs=16 count=1 2>/dev/null
} | /usr/bin/sed -e "s/ciao/miao/g" | hexdump -vC
echo ============
echo -n thingy >z1
echo -n again >z2
>znull
./busybox sed "s/i/z/" z1 z2 znull | hexdump -vC
/usr/bin/sed "s/i/z/" z1 z2 znull | hexdump -vC
echo ============
echo -ne "\0bang\0woo\0" | ./busybox sed -e 's/woo/bang/' | hexdump -vC
echo -ne "\0bang\0woo\0" | /usr/bin/sed -e 's/woo/bang/' | hexdump -vC

unpatched busybox:
00000000  0a 0a 0a 0a 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a 
|................|
00000010  6d 69 61 6f 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a 
|miao............|
00000020  0a 0a 0a 0a                                       |....|
00000024
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 
|................|
00000010  6d 69 61 6f 0a 00 00 00  00 00 00 00 00 00 00 00 
|miao............|
00000020  00 00 00 00 00                                    |.....|
00000025
============
00000000  74 68 7a 6e 67 79 0a 61  67 61 7a 6e             
|thzngy.agazn|
0000000c
00000000  74 68 7a 6e 67 79 0a 61  67 61 7a 6e             
|thzngy.agazn|
0000000c
============
00000000  0a 62 61 6e 67 0a 62 61  6e 67                    |.bang.bang|
0000000a
00000000  00 62 61 6e 67 00 62 61  6e 67 00                 |.bang.bang.|
0000000b

Patched per revision 16754:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 
|................|
00000010  6d 69 61 6f 0a 00 00 00  00 00 00 00 00 00 00 00 
|miao............|
00000020  00 00 00 00 00                                    |.....|
00000025
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 
|................|
00000010  6d 69 61 6f 0a 00 00 00  00 00 00 00 00 00 00 00 
|miao............|
00000020  00 00 00 00 00                                    |.....|
00000025
============
00000000  74 68 7a 6e 67 79 61 67  61 7a 6e                 |thzngyagazn|
0000000b
00000000  74 68 7a 6e 67 79 0a 61  67 61 7a 6e             
|thzngy.agazn|
0000000c
============
00000000  00 62 61 6e 67 00 62 61  6e 67 00                 |.bang.bang.|
0000000b
00000000  00 62 61 6e 67 00 62 61  6e 67 00                 |.bang.bang.|
0000000b 

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
12-28-05 01:26  robang74       New Issue                                    
12-28-05 01:26  robang74       Status                   new => assigned     
12-28-05 01:26  robang74       Assigned To               => BusyBox         
12-28-05 01:26  robang74       Issue Monitored: robang74                    
12-28-05 03:18  robang74       Note Added: 0000821                          
12-28-05 03:18  robang74       File Added: sed.diff                         
01-01-06 22:47  landley        Note Added: 0000835                          
01-02-06 01:25  robang74       Note Added: 0000836                          
01-02-06 01:25  robang74       File Added: sed_2.patch                      
01-08-06 11:34  landley        Note Added: 0000873                          
01-09-06 03:29  robang74       Note Added: 0000886                          
01-09-06 03:30  robang74       Note Edited: 0000886                         
01-09-06 03:54  robang74       File Added: sed_3.patch                      
01-09-06 10:15  robang74       Note Added: 0000890                          
02-28-06 03:05  robang74       File Added: sed_4-svn14360.patch                 
  
02-28-06 03:05  robang74       File Added: sed_4-svn14360_norename.patch        
           
02-28-06 03:09  robang74       Note Added: 0001148                          
03-01-06 07:49  landley        Note Added: 0001154                          
03-01-06 07:59  landley        Note Added: 0001155                          
03-02-06 00:33  robang74       Note Added: 0001156                          
03-02-06 00:50  robang74       Note Edited: 0001156                         
03-02-06 01:40  robang74       File Added: sed_5-svn14360.patch                 
  
03-02-06 01:41  robang74       File Added: sed_5-svn14360_norename.patch        
           
03-02-06 01:43  robang74       Note Added: 0001157                          
03-02-06 01:44  robang74       Note Edited: 0001157                         
03-03-06 00:43  robang74       File Added: sed_6-svn14360.patch                 
  
03-03-06 00:43  robang74       File Added: sed_6-svn14360_norename.patch        
           
03-03-06 00:44  robang74       Note Added: 0001160                          
03-04-06 07:11  robang74       Note Added: 0001161                          
03-04-06 07:12  robang74       File Added: busybox.14360_sed7_testsuite.patch   
                
03-04-06 07:12  robang74       File Added: busybox.14360_sed7.patch             
      
03-04-06 09:09  robang74       Note Added: 0001162                          
03-04-06 09:10  robang74       File Added: busybox.14360_sed8_testsuite.patch   
                
03-04-06 09:10  robang74       File Added: busybox.14360_sed8.patch             
      
03-05-06 18:57  robang74       File Added: busybox.14453_sed9.patch             
      
03-05-06 18:58  robang74       Note Added: 0001164                          
03-14-06 07:39  robang74       File Added: busybox.14536_sed9.patch             
      
11-30-06 01:05  robang74       Note Added: 0001841                          
11-30-06 01:07  robang74       File Added: busybox-1.2.2.1_sed.patch            
       
11-30-06 04:12  robang74       File Added: busybox-20061130_sed.patch           
        
11-30-06 04:13  robang74       Note Added: 0001843                          
12-02-06 09:59  vda            Note Added: 0001853                          
======================================================================




More information about the busybox-cvs mailing list