[BusyBox 0000615]: sed convert 0x00 to 0x0a

bugs at busybox.net
Thu Mar 2 09:43:41 UTC 2006


A NOTE has been added to this issue. 
====================================================================== 
http://busybox.net/bugs/view.php?id=615 
====================================================================== 
Reported By:                robang74
Assigned To:                BusyBox
====================================================================== 
Project:                    BusyBox
Issue ID:                   615
Category:                   Other
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             12-28-2005 01:26 PST
Last Modified:              03-02-2006 01:43 PST
====================================================================== 
Summary:                    sed convert 0x00 to 0x0a
Description: 
bash-3.00# dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
bash-3.00# echo ciao>>/tmp/test
bash-3.00# dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
bash-3.00# cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
bash-3.00# wc -l /tmp/test
      1 /tmp/test
bash-3.00# wc -l /tmp/test2
   2048 /tmp/test2
====================================================================== 
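
The line counts in the description can be cross-checked by counting NUL and newline bytes directly. This is a minimal sketch, assuming a POSIX shell with tr, wc, dd and mktemp available; count_nul, count_nl and make_input are hypothetical helper names, not part of BusyBox or of the report.

```shell
# Hypothetical helpers: count NUL bytes and newline bytes in a file.
count_nul() { tr -cd '\000' < "$1" | wc -c | tr -d ' '; }
count_nl()  { tr -cd '\n'   < "$1" | wc -c | tr -d ' '; }

# Build the same input as in the report: 1k of zeros, "ciao\n", 1k of zeros.
make_input() {
    { dd if=/dev/zero bs=1k count=1 2>/dev/null
      echo ciao
      dd if=/dev/zero bs=1k count=1 2>/dev/null; } > "$1"
}

src=$(mktemp) && make_input "$src"
echo "input:  $(count_nul "$src") NULs, $(count_nl "$src") newlines"

# Run whichever sed is first in PATH and count again on its output.
sed -e 's/ciao/miao/g' "$src" > "$src.out"
echo "output: $(count_nul "$src.out") NULs, $(count_nl "$src.out") newlines"
rm -f "$src" "$src.out"
```

A sed that passes binary data through unchanged should report the same two counts for input and output. The behaviour reported here corresponds to the newline count rising to 2048, as the wc -l output above shows.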

---------------------------------------------------------------------- 
 robang74 - 12-28-05 03:18  
---------------------------------------------------------------------- 
sed also loses the last 0x00, if there is one (test2 is one byte shorter):

bash-3.00# ls -al /tmp/test*
-rw-r--r--    1 0        0            2053 Dec 28 11:16 /tmp/test
-rw-r--r--    1 0        0            2052 Dec 28 11:16 /tmp/test2


AFTER PATCH:

/ # dd if=/dev/zero bs=1k count=1 >/tmp/test
1+0 records in
1+0 records out
/ # echo ciao>>/tmp/test
/ # dd if=/dev/zero bs=1k count=1 >>/tmp/test
1+0 records in
1+0 records out
/ # cat /tmp/test | sed -e "s/ciao/miao/g" >/tmp/test2
/ # wc -l /tmp/test2
      1 /tmp/test2
/ # wc -l /tmp/test
      1 /tmp/test
/ # ls -al /tmp/test*
-rw-r--r--    1 0        0            2053 Dec 28 11:17 /tmp/test
-rw-r--r--    1 0        0            2053 Dec 28 11:17 /tmp/test2 
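
The one-byte difference before the patch (2053 vs 2052) and the matching sizes after it can be checked by looking at the size and the final byte of each file. A minimal sketch, assuming POSIX tail, od and wc; file_size and last_byte are hypothetical helper names.

```shell
# Hypothetical helpers for comparing sed input and output at the byte level.
file_size() { wc -c < "$1" | tr -d ' '; }
# Prints the final byte of the file as two hex digits, e.g. 00 or 0a.
last_byte() { tail -c 1 "$1" | od -An -tx1 | tr -d ' \n'; }

# Report on the files from the transcript, if they exist on this machine.
for f in /tmp/test /tmp/test2; do
    if [ -f "$f" ]; then
        echo "$f: $(file_size "$f") bytes, last byte 0x$(last_byte "$f")"
    fi
done
```

If the trailing 0x00 survives, both files should show the same size and a last byte of 0x00.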

---------------------------------------------------------------------- 
 landley - 01-01-06 22:47  
---------------------------------------------------------------------- 
Hmmm...  Tricky dealing with embedded nulls when C usually considers null
to be an end of string indicator.

Your fix causes problems in other contexts.  It was outputting a newline
because there are times when that is appropriate, and this would output a
null then.  Something like:

echo -n thingy > one
echo -n again > two
sed "s/i/z/" one two > three

The output should be "thzngy\nagazn" and I think it would be
"thzngy\0agazn".  Not that I've tested it just now.

I'm trying to get 1.1.0 out this friday.  Not sure I'll get to this before
then... 
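
The multi-file case described above can be checked byte by byte. A minimal sketch, assuming POSIX printf, sed, od and mktemp; join_demo is a hypothetical helper name, and printf stands in for echo -n, whose behaviour varies between shells.

```shell
# Hypothetical helper: run sed over two files that both lack a trailing
# newline and dump the result one character at a time.
join_demo() {
    dir=$(mktemp -d) || return 1
    printf 'thingy' > "$dir/one"    # no trailing newline
    printf 'again'  > "$dir/two"    # no trailing newline
    sed 's/i/z/' "$dir/one" "$dir/two" | od -An -c
    rm -rf "$dir"
}

join_demo
```

If the two files are joined with a newline, the od dump shows \n between y and a; a \0 in that position would indicate the problem described above.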

---------------------------------------------------------------------- 
 robang74 - 01-02-06 01:25  
---------------------------------------------------------------------- 
Patch n.2 fixes the newline problem with multiple input files.


[roberto at wsraf big]$ patch -p0 < sed_2.patch
patching file busybox-1.01/editors/sed.c
[roberto at wsraf busybox-1.01]$ make menuconfig && make
[roberto at wsraf busybox-1.01]$ echo -n thingy > one
[roberto at wsraf busybox-1.01]$ echo -n again > two
[roberto at wsraf busybox-1.01]$ sed "s/i/z/" one two > three
[roberto at wsraf busybox-1.01]$ ./busybox sed "s/i/z/" one two > four
[roberto at wsraf busybox-1.01_sed2]$ cat three
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$ cat four
thzngy
agazn[roberto at wsraf busybox-1.01_sed2]$ 

---------------------------------------------------------------------- 
 landley - 01-08-06 11:34  
---------------------------------------------------------------------- 
I tried a slightly cleaned up version of the second patch and it caused two
more busybox tests to fail.

To see the specific failures, cd testsuite and "./runtest -v sed".  I
might get around to looking at it some more today, if so I'll apply the
result and close out the bug.

Rob 

---------------------------------------------------------------------- 
 robang74 - 01-09-06 03:30  
---------------------------------------------------------------------- 
AFTER PATCH n.3 sed fails the same tests as before

[roberto at wsraf testsuite]$ ./runtest sed | grep FAIL
FAIL: sed s//g (exhaustive)
FAIL: sed n (flushes pattern space, terminates early)
FAIL: sed N (doesn't flush pattern space when terminating)

BUT passes the new one

[roberto at wsraf testsuite]$ ./runtest sed | grep binary
PASS: sed s onto a binary input (with zeros)

 

---------------------------------------------------------------------- 
 robang74 - 01-09-06 10:15  
---------------------------------------------------------------------- 
svn 13198 with sed_3 patch applied, defconfig

[roberto at wsraf busybox_sed3]$ size busybox
   text    data     bss     dec     hex filename
 236511    2220   28484  267215   413cf busybox

 svn 13201 original, defconfig

[roberto at wsraf busybox]$ size busybox
   text    data     bss     dec     hex filename
 236527    2156   28548  267231   413df busybox


 Unpatched svn 13201 is 16 bytes larger than svn 13198 with the sed_3 patch
applied, at least on my workstation, but I have gcc 4.0 (for better or worse).
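
The difference between the two builds can be worked out per section from the size output above. A small sketch using shell arithmetic; size_delta is a hypothetical helper name and the figures are copied from the two listings.

```shell
# Hypothetical helper: second figure minus first figure.
size_delta() { echo $(( $2 - $1 )); }

# First argument: svn 13198 with sed_3 applied; second: original svn 13201.
echo "text:  $(size_delta 236511 236527)"
echo "data:  $(size_delta 2220 2156)"
echo "bss:   $(size_delta 28484 28548)"
echo "total: $(size_delta 267215 267231)"
```

The totals differ by 16 bytes, with data 64 bytes smaller and bss 64 bytes larger in the original svn 13201 build.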

---------------------------------------------------------------------- 
 robang74 - 02-28-06 03:09  
---------------------------------------------------------------------- 
Patch n.4 applies to svn 14360; it comes in two flavors:

 norename: which does not include the variable rename s/lastchar/no_newline/g
 
 and the patch itself, with the variable rename for improved code
readability

Switching from the patch to the norename one is pretty simple
(automatic):

svn co ...
cp -af busybox busybox.orig

cat sed_4-svn14360.patch | \
sed -e "s/lastchar/no_newline/g" >sed_4-svn14360_tmp.patch

patch -p0 <sed_4-svn14360_tmp.patch
mv busybox busybox_sed4
cp -a busybox.orig busybox
diff -pru busybox busybox_sed4 >sed_4-svn14360_norename.patch 

---------------------------------------------------------------------- 
 landley - 03-01-06 07:49  
---------------------------------------------------------------------- 
Did you notice the change to get_line_from_file.c you reverted, and the
change to sed.c (and for that matter, sort.c) that matched that?  The way
you were appending the length at an arbitrary alignment would confuse
platforms that don't like unaligned access, plus you hard-wired in an
assumption that a pointer is 4 bytes long (not true for 64 bit platforms). 

---------------------------------------------------------------------- 
 landley - 03-01-06 07:59  
---------------------------------------------------------------------- 
I applied the .4-norename patch, reverted the get_line_from_file part, and
made the two line fix to have get_chunk_from_file() set len.  It built
fine, and then I ran the sed regression test suite.

Failed spectacularly.  It's appending an extra blank line to most things,
and in at least one case a garbage character... 

---------------------------------------------------------------------- 
 robang74 - 03-02-06 00:50  
---------------------------------------------------------------------- 
> Did you notice the change to get_line_from_file.c you reverted, and the
> change to sed.c (and for that matter, sort.c) that matched that?  The way
> you were appending the length at an arbitrary alignment would confuse
> platforms that don't like unaligned access, plus you hard-wired in an
> assumption that a pointer is 4 bytes long (not true for 64 bit platforms).


 Not a pointer but an integer. Anyway, to fix it for 64 bit platforms:

 s/-6/-2-sizeof(int)/

 Every platform ignores what is written after the \0, so appending information
after the null character is OK for me. If I am wrong, it is necessary to report
lastchar back to the caller, which means changing the API:

 get_line_from_file( ..., &lastchar);

 


> I applied the .4-norename patch, reverted the get_line_from_file part, and
> made the two line fix to have get_chunk_from_file() set len.  It built
> fine, and then I ran the sed regression test suite.
>
> Failed spectacularly.  It's appending an extra blank line to most things,
> and in at least one case a garbage character...


 I know. That happens because you reverted the get_chunk_from_file() part.
 That is why that part is needed.

 

---------------------------------------------------------------------- 
 robang74 - 03-02-06 01:43  
---------------------------------------------------------------------- 
PATCH n.5 resolves the issue with the reverted get_line_from_file.c change: it
keeps get_line_from_file.c as close as possible to the original from svn 14360

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
12-28-05 01:26  robang74       New Issue                                    
12-28-05 01:26  robang74       Status                   new => assigned     
12-28-05 01:26  robang74       Assigned To               => BusyBox         
12-28-05 01:26  robang74       Issue Monitored: robang74                    
12-28-05 03:18  robang74       Note Added: 0000821                          
12-28-05 03:18  robang74       File Added: sed.diff                         
01-01-06 22:47  landley        Note Added: 0000835                          
01-02-06 01:25  robang74       Note Added: 0000836                          
01-02-06 01:25  robang74       File Added: sed_2.patch                      
01-08-06 11:34  landley        Note Added: 0000873                          
01-09-06 03:29  robang74       Note Added: 0000886                          
01-09-06 03:30  robang74       Note Edited: 0000886                         
01-09-06 03:54  robang74       File Added: sed_3.patch                      
01-09-06 10:15  robang74       Note Added: 0000890                          
02-28-06 03:05  robang74       File Added: sed_4-svn14360.patch
02-28-06 03:05  robang74       File Added: sed_4-svn14360_norename.patch
02-28-06 03:09  robang74       Note Added: 0001148                          
03-01-06 07:49  landley        Note Added: 0001154                          
03-01-06 07:59  landley        Note Added: 0001155                          
03-02-06 00:33  robang74       Note Added: 0001156                          
03-02-06 00:50  robang74       Note Edited: 0001156                         
03-02-06 01:40  robang74       File Added: sed_5-svn14360.patch
03-02-06 01:41  robang74       File Added: sed_5-svn14360_norename.patch
03-02-06 01:43  robang74       Note Added: 0001157                          
======================================================================



