[BusyBox] busybox, sed, and zlib's configure script
Rob Landley
rob at landley.net
Mon Sep 22 12:09:52 UTC 2003
On Monday 22 September 2003 06:34, Glenn McGrath wrote:
> On Mon, 22 Sep 2003 07:06:04 -0400
>
> Rob Landley <rob at landley.net> wrote:
> > On Monday 22 September 2003 06:57, Glenn McGrath wrote:
> > > On Mon, 22 Sep 2003 04:16:31 -0400
> > >
> > > I put the newline warning in because I expected it to break
> > > something, but I couldn't think of any other way around the problem
> > > without rewriting regex handling. (The problem being that libcs don't
> > > support pattern matching with a newline character as part of the
> > > pattern.)
> >
> > I was thinking of going with an escape sequence more like \0xFF\0x01,
> > and then when adjusting the buffer replacing any actual 0xff with
> > \0xff\0x02. (Ugly and inefficient, but it wouldn't clash with any
> > incoming datastream and 0xff is extremely unlikely to occur in the
> > datastream anyway.) But I wanted to read through the rest of the code
> > first to make sure I understood the problem, and just didn't have time
> > this weekend...
>
> Hmm, I had not thought of using 0xFF, but I'm not sure about the character
> set allowed in regular expressions.
Well, it doesn't matter what the character is. It's just that if you use
in-band signaling, whatever your escape character is must itself be escaped
if it occurs in the text. I just chose 0xFF as unlikely to occur normally in
the text, so the amount of gratuitous escaping can be kept to a minimum. But
if it does occur (or if somebody searches for it), then it needs to be
translated to the appropriate escape sequence and back.
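
A minimal sketch of the escaping idea described above, not the actual
busybox code. As an assumption made purely for readability, '%' stands in
for the 0xFF escape byte, '1' marks a newline and '2' marks a literal
escape character. The function names encode and decode are hypothetical.

```shell
#!/bin/sh
# Sketch of in-band escaping. ASSUMPTION: '%' stands in for the 0xFF
# escape byte so the result stays printable; '1' marks a newline and
# '2' marks a literal escape character.

# encode: turn each literal '%' into '%2', then replace every newline
# with '%1', so the output is a single line with no newline in it.
encode() {
    awk '{ gsub(/%/, "%2"); printf "%s%%1", $0 }'
}

# decode: undo the two steps. '%1' must be expanded before '%2',
# otherwise an encoded literal "%1" (stored as "%21") would decay
# into "%1" and then wrongly become a newline.
decode() {
    awk '{ gsub(/%1/, "\n"); gsub(/%2/, "%"); printf "%s", $0 }'
}

# Prints: 100%2%1line two%1
printf '100%%\nline two\n' | encode; echo

# Round trip: prints the original two lines unchanged.
printf '100%%\nline two\n' | encode | decode
```

The cost is visible here: every occurrence of the escape character in the
data grows by one byte, which is why a rarely used byte is the cheaper
choice.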
Still, all this is working around a broken regex implementation, and if the
point of busybox is to stay small, lean, and simple...
If it can be made to work with uClibc, I say file a bug with glibc and move
on...
> > > > What's a test case that would fail because of this hack being
> > > > enabled?
> > >
> > > This works
> > >
> > > $ echo "\n" | sed 's/\\n/foo/'
> > > foo
> > > $ echo "\n" | ./busybox sed 's/\\n/foo/'
> > > \foo
> > >
> > >
> > > These don't
> > > $ echo "\n" | sed 's/\n/foo/'
> > > \n
> > > $ echo "\n" | ./busybox sed 's/\n/foo/'
> > > \foo
> > >
> > >
> > > $ echo "\n" | sed 's/\\\n/foo/'
> > > \n
> > > $ echo "\n" | ./busybox sed 's/\\\n/foo/'
> > > foo
> >
> > Seems like the behavior of all three differs between gnu and busybox
> > sed...?
>
> My bad, I didn't have the option enabled!
Now I'm confused: echo "\n" produces a backslash followed by the letter n.
You'd need -e to get it to interpret the sequence as a newline...
Why does this need special treatment? How would, say, \q be different?
The behavior of GNU sed in this regard is just ODD...
$ echo "\n" | sed 's/\n/foo/'
\n
$ echo "\n" | sed 's/\\n/foo/'
foo
$ echo "\q" | sed 's/\q/foo/'
\foo
$ echo "\q" | sed 's/\\q/foo/'
foo
$ echo -e "this\nis\na\ntest\n" | sed 's/\n/food/'
this
is
a
test
$ echo -e "this\nis\na\ntest\n" | sed 's/\\n/food/'
this
is
a
test
What are we attempting to accomplish, exactly?
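
One likely explanation for the -e transcripts above, offered as a sketch
rather than a statement about either implementation's internals: sed reads
its input one line at a time and removes the trailing newline before running
the script, so the pattern space only contains a newline after a command
such as N has joined two lines. printf is used here instead of echo because
echo's handling of backslashes differs between shells.

```shell
#!/bin/sh
# printf '%s\n' prints its argument verbatim, so this input is a
# backslash followed by the letter n in any POSIX shell.
printf '%s\n' '\n' | sed 's/\\n/foo/'

# Here printf's format string turns \n into real newlines. Each line
# reaches the s command without its trailing newline, so nothing matches
# and the four lines come out unchanged.
printf 'this\nis\na\ntest\n' | sed 's/\n/food/'

# N appends the next input line to the pattern space, separated by a
# newline, which \n in the regex can then match.
printf 'this\nis\na\ntest\n' | sed 'N;s/\n/food/'
```

The first command prints foo, and the last one prints thisfoodis followed
by afoodtest.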
> $ echo "\n" | ./busybox sed 's/\n/foo/'
> foo
>
> $ echo "\n" | ./busybox sed 's/\\n/foo/'
> foo
>
> $ echo "\n" | ./busybox sed 's/\\\n/foo/'
>
> (2 blank lines)
> $
>
>
> So it's only the last one that's broken.
The first one doesn't match either. Didn't GNU sed produce "\n"...?
Rob