Channel: Intel® C++ Compiler

VADDSSL instruction?


Dear Intel developers,

I'm using Intel compiler 15 on an E5-2670 processor. Profiling my code with VTune, I found a hotspot on one particular line, where I unpack an __m128 value in order to sum its elements into a single float (a horizontal sum), like this:

 

_mm_store_ps(denom_arr_tmp, denom_tmp);

 semblance[m_local] += denom_arr_tmp[0]+denom_arr_tmp[1]+denom_arr_tmp[2]+denom_arr_tmp[3];

 

The assembly generated is:

vunpckhps %xmm2, %xmm2, %xmm3
movq  -0x80(%rbp), %rax
vaddssl  -0x9c(%rbp), %xmm2, %xmm4
vaddss %xmm3, %xmm4, %xmm5
vaddssl  -0x94(%rbp), %xmm5, %xmm6
vaddssl  (%rax,%r14,4), %xmm6, %xmm7
vmovssl  %xmm7, (%rax,%r14,4) 

 

My questions are: what is the VADDSSL instruction? How does it differ from VADDSS? And how can I optimize this piece of code? It is actually a bottleneck.
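For comparison, here is a minimal sketch of the same horizontal sum kept entirely in registers, so that the `_mm_store_ps` to a temporary array and the scalar reloads are not needed. The helper name `hsum_ps` is hypothetical and not part of any library; only SSE intrinsics from `<xmmintrin.h>` are used. Note that it adds the elements pairwise, as (v0+v2)+(v1+v3), so the rounding can differ slightly from the left-to-right scalar sum above. Whether it actually removes the bottleneck is untested here. As a side observation, in the listing the `l`-suffixed mnemonics appear only on forms with a memory operand, which suggests an AT&T-style 32-bit operand-size suffix added by the disassembler rather than a separate instruction; that reading is an assumption.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_movehl_ps, _mm_shuffle_ps, ... */

/* Hypothetical helper: returns v0 + v1 + v2 + v3 for v = (v0, v1, v2, v3),
   computed as (v0 + v2) + (v1 + v3) without going through memory. */
static inline float hsum_ps(__m128 v)
{
    /* Copy the upper two lanes down: hi = (v2, v3, v2, v3). */
    __m128 hi = _mm_movehl_ps(v, v);

    /* Lane 0 = v0 + v2, lane 1 = v1 + v3 (upper lanes are not used). */
    __m128 pair = _mm_add_ps(v, hi);

    /* Broadcast lane 1 so that lane 0 of odd holds v1 + v3. */
    __m128 odd = _mm_shuffle_ps(pair, pair, _MM_SHUFFLE(1, 1, 1, 1));

    /* Scalar add in lane 0: (v0 + v2) + (v1 + v3). */
    __m128 total = _mm_add_ss(pair, odd);

    /* Extract lane 0 as a plain float. */
    return _mm_cvtss_f32(total);
}
```

With this helper, the two lines from my code would become `semblance[m_local] += hsum_ps(denom_tmp);`. If the accumulation happens inside a loop, it may also be worth keeping a vector accumulator and calling the helper only once after the loop, but that depends on code not shown here.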

Thanks.

 

