[PATCH] Fix nextafterf

Sun Feb 8 01:47:30 UTC 2009

On Friday 06 February 2009 03:39, Jie Zhang wrote:
> >> A patch is attached. This patch copies nextafterf from glibc. Tested on 
> >> Blackfin with the above test case. If it's OK, please apply. Thanks.
> > 
> > Why the patch touches libm/s_nextafter.c (the double version)?
> > Your comment says nothing about it.
> > 
> Sorry for not explain that.
> 
> math_opt_barrier and math_force_eval were added in glibc to fix glibc bug:
> 
> http://sources.redhat.com/bugzilla/show_bug.cgi?id=3306
> 
> When I copied nextafterf, I noticed this. So I modified nextafter, too.

Ok, I took a look.

There is optimized version of them for i386 and x86_64.
You didn't use that.

Also, glibc didn't bother to explain what these macros do.
I would like to do better than that, I want people
to be able to hack on uclibc not in a "blind" mode.
Code readability needs to be improved.

I added this:

/*
 * math_opt_barrier(x): force expression x to be evaluated and put into
 * a floating point register or memory. This macro returns the value.
 *
 * math_force_eval(x): force expression x to be evaluated and put into
 * a floating point register or memory *of the appropriate size*.
 * This forces floating point flags to be set correctly
 * (for example, when float value is overflowing, but FPU registers
 * are wide enough to "hide" this).
 */

Is it correct?

Now, let's see how they are used in nextafterf.

        if (ix == 0) { /* x == 0? */
                float u;
                /* return +-minsubnormal */
                SET_FLOAT_WORD(x, (hy & 0x80000000) | 1);
                u = math_opt_barrier(x);
                u = u * u;
                math_force_eval(u); /* raise underflow flag */
                return x;

Ok, we constructed smallest possible subnormal in x.
Why do we load it through math_opt_barrier()?
Why we multiply it by itself?
Can't we just math_force_eval(x)?
If no, we cannot, WE NEED TO EXPLAIN WHY.

Btw, asm output is:

                SET_FLOAT_WORD(x, (hy & 0x80000000) | 1);
                u = math_opt_barrier(x);
                u = u * u;
                math_force_eval(u); /* raise underflow flag */
                return x;

        andl    $-2147483648, %edx
        orl     $1, %edx
        movl    %edx, 4(%esp)
        flds    4(%esp)
        fmul    %st(0), %st
        fstps   8(%esp)
        flds    4(%esp)
        jmp     .L4

                SET_FLOAT_WORD(x, (hy & 0x80000000) | 1);
                u = x * x;
                math_force_eval(u); /* raise underflow flag */
                return x;

        andl    $-2147483648, %edx
        orl     $1, %edx
        movl    %edx, (%esp)
        flds    (%esp)
        fld     %st(0)
        fmul    %st(1), %st
        fstps   4(%esp)
        jmp     .L4

                SET_FLOAT_WORD(x, (hy & 0x80000000) | 1);
                math_force_eval(x); /* raise underflow flag */
                return x;

        andl    $-2147483648, %edx
        orl     $1, %edx
        movl    %edx, (%esp)
        flds    (%esp)
        fsts    16(%esp)
        jmp     .L4

which tells me that  /* raise underflow flag */ comment
is on the wrong line! _multiplication_ is the operation which sets
underflow flag, FST instruction never sets it (I just checked
the manual). In all cases, FST instruction generated here
is useless, it does no useful work whatsoever.

> > You also did not remove the wrapper from float_wrappers.c
> > 
> I indeed removed it in my patch.

Ah, now I see. sorry.

> > I think math_opt_barrier macro should be either eliminated
> > or moved to its sole user, libm/s_nextafterf.c
> > 
> nextafter also needs them. But if uClibc does not care of the spec on 
> this corner, they can be removed from both.

Btw, can you give an example C code which demonstrates
how math_opt_barrier() and math_force_eval() are resulting
in correct implemetation. So far it sounds somewhat nebulous.

IOW: do you have a testcase which checks that nextafter[f]
sets these exception flags and whatnot?
--
vda