Channel: Intel® C++ Compiler

Parentheses not honored when using FMA

Hello,

I have doubts about the assembly generated by the Intel compiler for the following code, which performs an orientation test for a point and a segment:

double orient_test_2d(const double m[2], const double a[2], const double b[2])
{
  double am1 = a[0]-m[0];
  double bm1 = b[1]-m[1];
  double am2 = a[1]-m[1];
  double bm2 = b[0]-m[0];

  return ((am1)*(bm1)) - ((am2)*(bm2));
}

In the return statement, each product is enclosed in parentheses. The Intel compiler nevertheless optimizes the statement and introduces an FMA instruction. I think this is wrong: an FMA performs the multiplication and the subtraction as a single operation with a single rounding, so one of the products is never rounded to double on its own, while the source specifies that both multiplications should be performed (and rounded) before the subtraction.

This is the assembly generated by 'icc-15.0.3 -O3 -march=core-avx2 test.c -c -S':

        vmovsd    (%rdi), %xmm5                                 #3.22
        vmovsd    8(%rdi), %xmm3                                #5.22
        vmovsd    8(%rsi), %xmm2                                #5.17
        vmovsd    (%rdx), %xmm4                                 #6.17
        vsubsd    %xmm3, %xmm2, %xmm6                           #5.22
        vsubsd    %xmm5, %xmm4, %xmm7                           #6.22
        vmovsd    8(%rdx), %xmm1                                #4.17
        vmovsd    (%rsi), %xmm0                                 #3.17
        vsubsd    %xmm3, %xmm1, %xmm8                           #4.22
        vmulsd    %xmm7, %xmm6, %xmm9                           #8.33
        vsubsd    %xmm5, %xmm0, %xmm0                           #3.22
        vfmsub213sd %xmm9, %xmm8, %xmm0                         #8.33
        ret                                                     #8.33

I believe that, in order to honor the parentheses, icc should NOT generate the FMA, but stick to the programmed order of operations. If I wanted an FMA, I would not put parentheses around the multiplications. The current behavior of icc leads to inconsistent results in my code because the two products are rounded differently.

As a side note, the GNU compiler does exactly the same thing.

Can this be considered a bug, or is it the expected behavior? If it is expected, how can I ensure, in a portable way, that both products are rounded before the subtraction?
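For reference, the kind of workaround I have in mind is sketched below. It rests on an assumption I have not confirmed for either compiler: that storing each product in a volatile double forces it to be rounded and reloaded, so that no FMA can be formed. The name orient_test_2d_nofma is made up for this sketch. Other candidates I am aware of are the C99 '#pragma STDC FP_CONTRACT OFF' and compiler options such as '-ffp-contract=off' for gcc or '-no-fma' for icc, but I do not know whether any of these can be relied on across compilers.

```c
/* Variant of orient_test_2d in which each product is stored in a
   volatile double before the subtraction. Assumption: the store
   forces each product to be rounded to double separately, so the
   compiler cannot contract the expression into an FMA. */
double orient_test_2d_nofma(const double m[2], const double a[2], const double b[2])
{
  double am1 = a[0]-m[0];
  double bm1 = b[1]-m[1];
  double am2 = a[1]-m[1];
  double bm2 = b[0]-m[0];

  volatile double p1 = am1 * bm1;
  volatile double p2 = am2 * bm2;

  return p1 - p2;
}
```

The obvious drawback is that volatile forces extra stores and loads, so this probably costs some performance compared to plain temporaries.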

Thanks!

Marcin Krotkiewski

