[BusyBox] busybox, sed, and zlib's configure script

Rob Landley rob at landley.net
Mon Sep 22 12:09:52 UTC 2003


On Monday 22 September 2003 06:34, Glenn McGrath wrote:
> On Mon, 22 Sep 2003 07:06:04 -0400
>
> Rob Landley <rob at landley.net> wrote:
> > On Monday 22 September 2003 06:57, Glenn McGrath wrote:
> > > On Mon, 22 Sep 2003 04:16:31 -0400
> > >
> > > I put the newline warning in because I expected it to break
> > > something, but I couldn't think of any other way around the problem
> > > without rewriting regex handling.  (The problem being that libcs don't
> > > support pattern matching with a newline character as part of the
> > > pattern.)
> >
> > I was thinking of going with an escape sequence more like \0xFF\0x01,
> > and then when adjusting the buffer replacing any actual 0xff with
> > \0xff\0x02.  (Ugly and inefficient, but it wouldn't clash with any
> > incoming datastream and 0xff is extremely unlikely to occur in the
> > datastream anyway.)  But I wanted to read through the rest of the code
> > first to make sure I understood the problem, and just didn't have time
> > this weekend...
>
> Hmm, I had not thought of using 0xFF, but I'm not sure about the character
> set allowed in regular expressions.

Well, it doesn't matter what the character is.  It's just that if you use 
in-band signaling, whatever your escape character is must itself be escaped 
if it occurs in the text.  I just chose 0xFF as unlikely to occur normally in 
the text, so the amount of gratuitous escaping can be kept to a minimum.  But 
if it does occur (or if somebody searches for it), then it needs to be 
translated to the appropriate escape sequence and back.
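
A minimal sketch of that escaping scheme, to make the round trip concrete.  
This is not the busybox code: the names esc_encode and esc_decode and the 
three constants are hypothetical, and the only things taken from the 
discussion above are the byte values (0xFF 0x01 for a newline, 0xFF 0x02 for 
a literal 0xFF).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; only the byte values come from the thread. */
#define ESC      0xFF  /* escape byte, assumed rare in real input */
#define ESC_NL   0x01  /* ESC ESC_NL   stands for a newline       */
#define ESC_SELF 0x02  /* ESC ESC_SELF stands for a literal 0xFF  */

/* Encode src[0..n) into dst, which must hold at least 2*n bytes.
 * Returns the number of bytes written.  The output never contains
 * a raw newline. */
size_t esc_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] == '\n') {
            dst[o++] = ESC;
            dst[o++] = ESC_NL;
        } else if (src[i] == ESC) {
            /* The escape byte itself must be escaped. */
            dst[o++] = ESC;
            dst[o++] = ESC_SELF;
        } else {
            dst[o++] = src[i];
        }
    }
    return o;
}

/* Decode src[0..n) into dst, which must hold at least n bytes.
 * Returns the number of bytes written. */
size_t esc_decode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] == ESC && i + 1 < n) {
            unsigned char c = src[++i];

            if (c == ESC_NL) {
                dst[o++] = '\n';
            } else if (c == ESC_SELF) {
                dst[o++] = ESC;
            } else {
                /* Unknown sequence: pass both bytes through unchanged. */
                dst[o++] = ESC;
                dst[o++] = c;
            }
        } else {
            dst[o++] = src[i];
        }
    }
    return o;
}
```

The cost is that every newline doubles in size, while 0xFF only costs extra 
when it actually shows up in the text or in a search pattern.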

Still, all this is working around a broken regex implementation, and if the 
point of busybox is to stay small, lean, and simple...

If it can be made to work with uClibc, I say file a bug with glibc and move 
on...


> > > > What's a test case that would fail because of this hack being
> > > > enabled?
> > >
> > > This works
> > >
> > > $ echo "\n" | sed 's/\\n/foo/'
> > > foo
> > > $ echo "\n" | ./busybox sed 's/\\n/foo/'
> > > \foo
> > >
> > >
> > > These dont
> > > $ echo "\n" | sed 's/\n/foo/'
> > > \n
> > > $ echo "\n" | ./busybox sed 's/\n/foo/'
> > > \foo
> > >
> > >
> > > $ echo "\n" | sed 's/\\\n/foo/'
> > > \n
> > > $ echo "\n" | ./busybox sed 's/\\\n/foo/'
> > > foo
> >
> > Seems like the behavior of all three differs between gnu and busybox
> > sed...?
>
> My bad, i didnt have the option enabled !

Now I'm confused: echo "\n" produces a backslash followed by the letter n.  
You'd need -e to get it to interpret the sequence as a newline...

Why does this need special treatment?  How would, say, \q be different?

The behavior of GNU sed in this regard is just ODD...

$ echo "\n" | sed 's/\n/foo/'
\n
$ echo "\n" | sed 's/\\n/foo/'
foo
$ echo "\q" | sed 's/\q/foo/'
\foo
$ echo "\q" | sed 's/\\q/foo/'
foo
$ echo -e "this\nis\na\ntest\n" | sed 's/\n/food/'
this
is
a
test
$ echo -e "this\nis\na\ntest\n" | sed 's/\\n/food/'
this
is
a
test

What are we attempting to accomplish, exactly?

> $ echo "\n" | ./busybox sed 's/\n/foo/'
> foo
>
> $ echo "\n" | ./busybox sed 's/\\n/foo/'
> foo
>
> $ echo "\n" | ./busybox sed 's/\\\n/foo/'
>
> (2 blank lines)
> $
>
>
> So it's only the last one that's broken.

The first one doesn't match either.  Didn't GNU sed produce "\n"...?

Rob


