[Bug 12981] awk: seems to try to falsely expand strings

Bernhard Reutner-Fischer rep.dot.nop at gmail.com
Sat Jun 6 00:17:09 UTC 2020


On Fri, 05 Jun 2020 19:50:07 +0000
bugzilla at busybox.net wrote:

> https://bugs.busybox.net/show_bug.cgi?id=12981
> 
> --- Comment #3 from Steffen Nurpmeso <steffen at sdaoden.eu> ---
> The nawk bug is already fixed, heh. Luckily i saved away the script, another
> version did not reveal the crash there.  The final version is also entirely
> different.  I mean hey, i came to bed at half past five in the morning, ok.

eh ;)

> But yes i indeed can.  Here mawk, mawk-debian, nawk, gawk:

That's great, thanks a lot for this simplification!

> 
>  #?0|kent:tmp$ </dev/null awk -v i=1 'BEGIN {print "hey fish " ++i "."}'
>  hey fish 2.
> 
> Compared to
> 
>  #?0|sdaoden:steffen$ </dev/null awk -v i=1 'BEGIN {print "hey fish " ++i "."}'
>  01.
> 
> Any why has it to be minimal, i never found a test series for busybox (i looked

I didn't say minimal. I said "small, self-contained".

In your bug-report, you referenced a
IN="${SRCDIR}"su/gen-errors.h
which you did not attach. Although i could probably construct an input
that exposes the alleged misbehaviour (or maybe i would not be able to
if it was another kind of bug than the one we're dealing with here) i
would have to guess. You did not provide a readily testable reproducer
for us to debug. We'd have to guess, or suspect the problem to be
obscured by an overly complex, incomplete testcase. This is a very
discouraging situation if you just have a couple of minutes to read
through a report while, let's say, you are on the bus on your way home
after eight or thirteen hours of work.

> out for one years ago)?
> Just asking..

We indeed have a testsuite. See testsuite/
as well as
$ make help
$ make check
and, specifically
testsuite/awk.tests

We've had a testsuite for a long time now. An internal one. We
usually test some prominent cases with external testsuites. We used to
(and some of us still do) build our setups with busybox alone so the
most visible stuff supposedly works fine. In order not to regress we
strive to flesh out the internal testsuite. Except for network related
stuff, which would usually be rather complicated to regtest in our
current testsuite due to its simplicity.
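
A regression check for a report like this one could look roughly like
the following. It is a self-contained sketch and not an excerpt from
testsuite/awk.tests: the helper name check_awk and the AWK variable are
made up for illustration, and plain awk on PATH is assumed unless AWK
is set (e.g. AWK="./busybox awk").

```shell
# Hypothetical helper, not part of the busybox testsuite: run an awk
# program with no input and compare its stdout to the expected text.
AWK="${AWK:-awk}"   # assumption: override with AWK="./busybox awk"

check_awk() {
    desc=$1
    prog=$2
    expected=$3
    # $AWK is deliberately unquoted so "./busybox awk" splits into two words
    actual=$($AWK "$prog" </dev/null)
    if [ "$actual" = "$expected" ]; then
        echo "PASS: $desc"
    else
        echo "FAIL: $desc (got '$actual', expected '$expected')"
        return 1
    fi
}

# The parenthesized form of the reproducer from the report
check_awk "concat with parenthesized pre-increment" \
    'BEGIN { i = 1; print "hey fish " (++i) "." }' "hey fish 2."
```

Keeping the program and its expected output side by side in one call
makes it cheap to add one more line per reported corner case.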

> 
> I have asked Aharon Robbins for that "++i should be (++i)" of yours.

Name dropping usually doesn't work for me for i know nothing, but sure,
if he knows awk land i'm all ears -- whoever that may be.. Please
excuse my ignorance.

> He kindly proved me wrong regarding my boolean use of getline() not getline()>0
> already, i had to fix that in several places.  If he responds to support my
> prefix/postfix in/decrement has highest priority .. i'll be back :-)

looking forward to that.
> 
> A nice weekend everybody i (who hates web interfaces like this, i wished i
> could simply respond via email) wish from Germany,

web interfaces are great to not forget stuff, but sure,
let's just take this to the list for a wider audience and easier
interaction for both of us.

Folks, recap.
So, in this bug report what we have is:
- there is no awk standard (that i'm aware of at least; SuS is
  silent)
- awk has no (well) defined string-concatenation precedence
- some (well) documented awk implementations explicitly warn
  script authors not to rely on precedence around string concatenation,
  like e.g.:
  https://www.gnu.org/software/gawk/manual/html_node/Concatenation.html
---8<---
Because string concatenation does not have an explicit operator, it is often necessary to ensure that it happens at the right time by using parentheses to enclose the items to concatenate.
---8<---
- new users of awk usually expect a well resp. clearly defined language
  but IMHO we do not have the luxury of dealing with one in this context.
  Rule of least surprise for the user applies, as usual. Undefined
  corners of underspecified tools/behaviour are usually clarified
  centrally (by a standard of some sort to back up claims or document
  consensus among implementors). No such luxury here AFAIK.
- If you write your scripts defensively (i.e. know your stuff, to
  phrase it politely) all is well even with current busybox awk (see
  below)

Bad:
$ ./busybox awk -v i=1 'BEGIN {print "fourty " ++i "."}'
01.

Expected:
$ ./busybox awk -v i=1 'BEGIN {print "fourty " (++i) "."}'
fourty 2.
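
Two defensive spellings of the same statement, wrapped as small shell
functions so they are easy to rerun against different awks. This is
only a sketch: the function names concat_paren and concat_split are
made up here, and plain awk on PATH is assumed (substitute
./busybox awk to try busybox).

```shell
# Parenthesize the increment so the grouping is explicit and does not
# depend on how a given awk parses "string ++i".
concat_paren() {
    awk -v i="$1" 'BEGIN { print "fourty " (++i) "." }'
}

# Alternative: do the increment in its own statement, then concatenate
# plain operands only.
concat_split() {
    awk -v i="$1" 'BEGIN { i = i + 1; print "fourty " i "." }'
}

concat_paren 1   # prints "fourty 2."
concat_split 1   # prints "fourty 2."
```

Neither form relies on the relative precedence of ++ and
concatenation, so both should behave the same across implementations.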

Proposed fixes?
Follow-up improvements for awk? Likewise, robustness improvements to
deal with possible similar cases you might make up while testing or
debugging stuff? Or maybe you spot some areas that are written way too
elaborately and you can save some bytes after having fixed this
unspecified but seemingly widely supported precedence convention?

> Ciao,

cheers and have phun,